A NEW TOOL FOR DISCOVERING TRANSCRIPTIONAL REGULATORS OF CO-EXPRESSED GENES PREDICTS GENE REGULATORY NETWORKS THAT MEDIATE ETHYLENE-CONTROLLED ROOT DEVELOPMENT

Gene regulatory networks (GRNs) are defined by a cascade of transcriptional events by which signals, such as hormones or environmental cues, change development. To understand these networks, it is necessary to link specific transcription factors (TFs) to the downstream gene targets whose expression they regulate. Although multiple methods provide information on the targets of a single TF, moving from groups of co-expressed genes to the TF that controls them is more difficult. To facilitate this bottom-up approach, we have developed a web application named TF DEACoN. This application uses a publicly available Arabidopsis thaliana DNA Affinity Purification (DAP-Seq) dataset to search for TFs that show enriched binding to groups of co-regulated genes. We used TF DEACoN to examine groups of transcripts regulated by treatment with the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC), using a transcriptional dataset performed with high temporal resolution. We demonstrate the utility of this application when co-regulated genes are divided by timing of response or cell-type specific information, which provides more information on TF/target relationships than when less defined and larger groups of co-regulated genes are used. This approach predicted TFs that may participate in ethylene-modulated root development including the TF NAM (NO APICAL MERISTEM). We used a genetic approach to show that a mutation in NAM reduces the negative regulation of lateral root development by ACC. The combination of filtering and TF DEACoN used here can be applied to any group of co-regulated genes to predict GRNs that control coordinated transcriptional responses.


Introduction
Transcription factors (TFs) are important regulators of gene expression across all kingdoms of life. In Arabidopsis thaliana, nearly 7% of all protein-coding genes are predicted to encode TFs (1,851 out of 27,655;Yilmaz et al. 2011;Cheng et al. 2017). In plants, the beststudied TFs include those that control developmental patterning and responses to hormones and environmental cues. Yet only a small fraction of all TFs have been well-characterized; 1,152 out of 1,851 predicted TFs in AGRIS have no gene name and/or no description, and another 375 gene descriptions contain the word -putative‖ (Yilmaz et al. 2011). The TFs whose functions are known are neither sufficient to control the diversity of responses that are seen in the transcriptome across different developmental stages and experimental conditions, nor the myriad of growth and developmental responses achieved under these conditions. Given the number of uncharacterized TFs and the variety of plant signals and responses, matching TFs with biological function remains a considerable task.
In Arabidopsis, traditional mutant screening has been a powerful tool in the discovery of key gene products that control hormone and stress response pathways, including important TFs (Page and Grossniklaus 2002). Yet, because this technique relies on striking phenotypes that result from mutations in individual genes, there is a point at which a particular pathway becomes saturated; that is, all genes that can be discovered by a particular screen have already been identified (Pollock and Larkin 2004;Tokunaga et al. 2014). Additionally, transcription factors often form complexes with other TFs, show redundancy across TF families, or are part of larger signaling networks, all of which may result in subtle or absent phenotypes when single TF genes are mutated (e.g. Xu et al. 2020). Although creating double, triple, and higher order mutants can A c c e p t e d M a n u s c r i p t reveal roles for TF families, it is often a long process without guaranteed results. New tools are needed for identifying TFs linked to particular responses.
The rise of affordable technologies to monitor genome-wide changes in transcript abundance has given researchers a new suite of tools for understanding TF function. Microarray and RNA-Seq can reveal transcriptional responses with high spatial (Brady et al. 2007;Tang et al. 2009;Bargmann et al. 2013) and temporal resolution (Lewis et al. 2013;Harkey et al. 2018).
TFs of interest can be further characterized by performing transcriptomics with TF mutants or overexpression lines. However, these techniques won't find all TFs involved in a process because these TFs are not necessarily transcriptionally responsive themselves, and so will not always be identified in a transcriptomic dataset; for example, the abundance of transcripts encoding the TF EIN3 (ETHYLENE INSENSITIVE3), which plays a central role in many ethylene responses, does not change in abundance in response to ethylene (Chao et al. 1997). Coregulated genes can also be analyzed for common promoter elements to suggest candidate TFs.
However, untargeted motif searching methods are limited by the background noise inherent in looking for a short motif (typically around ten base pairs) among promoter sequences that are one thousand or more base pairs long (Bailey and Elkan 1994;Bailey et al. 2009). If a particular TF's motif is found, its presence in a promoter region does not guarantee regulation of that gene by that TF. For example, one recent paper found that 91% of all genes across 45 different species contained an auxin response element (AuxRE) (Lieberman-Lazarovich et al. 2019), but it is unlikely that all of these genes are auxin responsive based on the numbers of auxin responsive transcripts (Vanneste et al. 2005;Stepanova et al. 2007;Lewis et al. 2013), and it is not possible to predict which auxin response factors (ARFs) might bind to a given promoter based on this data alone. Downloaded from https://academic.oup.com/insilicoplants/advance-article/doi/10.1093/insilicoplants/diaa006/5885168 by guest on 19 August 2020 A c c e p t e d M a n u s c r i p t Perhaps the best method for identification of a TF's downstream targets is examination of direct TF-target interactions through techniques such as yeast one-hybrid assays (Ouwerkerk and Meijer 2001) or ChIP-Seq (Robertson et al. 2007), the latter of which can identify the binding motif of a particular TF. Typically, ChIP-Seq is used after a TF has already been identified as important and is performed using chromatin samples isolated under specific growth and developmental conditions, leading to identification of only a subset of binding sites for a given TF (Bartlett et al. 2017). Although it is possible to perform ChIP-Seq across many TFs to create a new type of screen for transcriptional regulators, this approach would be both costly and timeconsuming to repeat for multiple signaling pathways, responses, and tissue types.
Another new method has been developed called DNA Affinity Purification (DAP)-Seq, which can be performed in vitro, allowing binding sites for a large number of TFs to be identified (Bartlett et al. 2017 providing a wealth of new information on targets of TFs (O'Malley et al. 2016). This DAP-Seq data creates an opportunity to narrow down the list of potential regulators in an unbiased way.
We have used the ethylene response in roots to demonstrate the utility of combining DAP-Seq data with co-expression data to provide new insights into transcriptional regulators that control plant response to the hormone ethylene (Harkey et al. 2018(Harkey et al. , 2019. Ethylene regulates many aspects of plant development, including hypocotyl elongation (Bleecker et al. 1988;Guzman and Ecker 1990;Chang et al. 1993), responses to stress (Wang et al. 2013;Dey and Vlot 2015), fruit ripening (Cara and Giovannoni 2008;Chen et al. 2018;Fabi and Ramos do A c c e p t e d M a n u s c r i p t Prado 2019), and formation of lateral roots (Ivanchenko et al. 2008;Negi et al. 2008) and root hairs (Tanimoto et al. 1995).
One of the first Arabidopsis signaling pathways identified by mutant screens was the ethylene signaling pathway (Merchante et al. 2013). Researchers used the ethylene -triple response‖ during which dark-grown seedlings treated with ethylene, or its biosynthetic precursor 1-aminocyclopropane-1-carboxylic acid (ACC), have shorter, thicker hypocotyls and roots, and an exaggerated apical hook, compared to their air-grown counterparts (Bleecker et al. 1988;Guzman and Ecker 1990;Chang et al. 1993). These screens identified ethylene-insensitive (EIN) or Ethylene-Response (ETR) mutants as tall seedlings among a lawn of short ethylene responders. Ethylene response genes identified include receptors such as ETR1 (Chang et al. 1993) and essential signaling proteins such as Ethylene Insensitive 2 (EIN2) (Alonso et al. 1999) and Constitutive Triple Response 1 (CTR1), a kinase that in the mutant has constitutive ethylene response (Kieber et al. 1993). These studies also revealed the basics of an ethylene-regulated transcriptional network. The EIN3 TF is necessary for most ethylene responses in dark-grown seedlings, and EIN3-LIKE (EIL) family members play a smaller role (Chao et al. 1997;Solano et al. 1998 In further support of EIN3-independent transcriptional responses in roots of light-grown seedlings, ein3 and ein3/eil1 mutants still exhibited root developmental changes in light-grown seedlings in response to ACC (Harkey et al. 2018 However, DAP-Seq data may be able to provide insight into other transcriptional mediators of ethylene response by asking if clusters of genes encoding transcripts that respond to ethylene or ACC are enriched in genes targeted by specific TFs.
We have developed a tool for analyzing all DAP-Seq data across groups of co-regulated genes to -screen‖ for transcription factors that may control their concurrent changes in expression and have applied this tool to ethylene-regulated gene clusters. This tool enables the user to identify TFs which may be important for the co-regulation they observe. We demonstrate the utility of clustering co-regulated genes based on the similarity of their transcript abundance across time and space to find potential regulators that are evident only when clusters of co-A c c e p t e d M a n u s c r i p t expressed genes are examined. We have used this tool to analyze transcriptional data from roots treated with ACC to increase ethylene levels, revealing TFs that may act downstream of this hormone signaling to control changes in root development. Several TFs that are tied to root development showed enrichment of their targets within this group of ACC-regulated transcripts.
The NAM (No Apical Meristem) TF shows reduced transcript abundance after ACC treatment (Harkey et al. 2018) and a mutant in this TF has enhanced lateral root formation and reduced response to ACC, consistent with the loss of a transcriptional regulator. A T-DNA insertion mutant for this TF had altered lateral root development and ACC developmental responses suggesting a role for this TF in ACC-mediated changes to lateral root development. This finding supports the feasibility of identifying important TF target interactions using this method and approach.

TF DEACoN: A new tool for predicting TF regulation of co-expressed genes
To look for groups of genes that are enriched for targets of specific transcription factors, we developed a new tool called TF Discovery by Enrichment Analysis of Co-expression Networks (TF DEACoN). This tool can be used to generate hypotheses about transcriptional regulators of groups of co-expressed genes. This app was written in R Shiny (R Core Team 2014; Chang et al. 2020) and is available to use at aharkey.shinyapps.io/tfdeacon; the code is available under an open source GPL-3.0 license at https://github.com/aharkey/TFDEACoN. TF DEACoN works by searching for enrichment of transcription factor targets in groups of co-expressed genes. The input for TF DEACoN is a list of Arabidopsis locus identifiers (called the query) which the user has identified as co-expressed in a specific tissue or A c c e p t e d M a n u s c r i p t developmental context, in response to a treatment, or in a mutant. The statistical basis for the analysis is a comparison of two ratios: the ratio of TF targets in the query to the total number of genes in the query, and the ratio of all TF targets in the genome to the number of genes in the whole genome. A TF that is important for driving transcription of a cluster of co-expressed genes would be expected to have a higher ratio of targets in the query than in the entire genome. TF DEACoN compares these two ratios for each transcription factor for which DAP-Seq data is available using Fisher's exact test. It also calculates a log2 fold change (logFC) in which the query ratio is divided by the genome ratio, such that a positive logFC means there is a higher proportion of TF targets in the query than in the entire genome, and a negative logFC means there is a lower proportion in the query. If the logFC is positive and Fisher's exact test gives a significant p value, we can consider this TF to show target enrichment in the query genes. After calculating these values for every TF, TF DEACoN uses a Benjamini-Hochberg p-value correction for multiple comparisons. Similarly, a maximum p-value removes any results which are not statistically significant and allows the user to choose between three levels of stringency.
A c c e p t e d M a n u s c r i p t Selecting -Submit‖ will display a list of TFs in the Results panel; if no filters are selected, all TFs will be displayed, but if filters are selected, only those TFs which meet the cutoff for substantial and/or significant enrichment will be displayed. If the user is unsure of which cutoff to apply before running an analysis, they may apply a cutoff after results are displayed. The results include each TF's gene identifier (TF ID), the TF family (TF family) and TF name (if known) (Gene name(s)), the number of genes in the genome as well as the query that have been found to be targets by DAP-Seq (Genome/Query target count), and a ratio of targets to total number of genes for both the whole genome and for the query (Genome/Query ratio), the logFC of target enrichment to the genome (logFC), the p value, and the adjusted p-value from the Fisher's exact test comparing the genome and query ratios. By default, results are sorted by adjusted p-value, but the user can select different categories to sort by; a search bar also enables users to find a specific TF or TF family. A CSV (comma-separated values) file of results can be downloaded using the button below the results table. Any logFC or p-value filters that are active at the time of download will also be applied to the downloaded file, such that only genes which pass the filters will be included in the download. To download all results, both filters should be unchecked before downloading. There is also an additional Query panel that allows the user to confirm that all the genes in the input were recognized.

TF-DEACoN finds enriched targets in clusters of genes with similar kinetics of response to a hormone
Previously we generated a transcriptomic dataset that examined the kinetics of ACC response in roots of light-grown seedlings across eight treatment times (Harkey et al. 2018). This dataset was filtered to identify genes that were expressed above background and that showed However, when we ran the 10 individual clusters, targets of 97 TFs were enriched in at least one cluster with a logFC ≥ 1, and 31 of those were enriched in two or more clusters. TFs with enrichment in two or more clusters are included in Table 1, which reports the p-values for this enrichment for each TF in each of 10 ACC response clusters. Additional information about the percentage of TF targets in each cluster can be found in Supplemental Figure 2, and TF DEACoN output for all TFs in these 10 clusters can be found in Supplemental File 1. This A c c e p t e d M a n u s c r i p t demonstrates the utility of clustering based on the temporal response before searching for TF enrichment.
The EIN3 TFs was enriched in targets in cluster 1 and 4, with the greatest enrichment in cluster 4, matching what we previously calculated manually (Harkey et al. 2018). In that study, we found that cluster 4 was also enriched in gene ontology terms related to ethylene response genes, and it included many ethylene-related genes which are known to be regulated by ethylene itself (Harkey et al. 2018). This suggested a role for EIN3 in controlling a subset of ethyleneinduced transcripts in multiple tissue types but suggested that other TFs were controlling the remaining transcriptional responses.

ACC response clusters identify candidate TFs for ACC-regulated root development
Several TFs identified as being enriched in binding targets within the ACC temporal clusters by TF DEACoN are linked to root developmental processes that are regulated by ethylene. One TF that was enriched in multiple clusters (2, 3, 4, and 6) was NAC4/ANAC080/ANAC07, which is a member of the NAC (NAM, ATAF1/2, and CUC2) family of TFs. Previously NAC4 transcript abundance was reported to be nitrogen-responsive in roots (Vidal et al. 2014). To determine whether the transcript abundance of these NAC4 targets identified by DAP-Seq are regulated by NAC4, we searched for overlap between the TF DEACoN predicted targets and an RNA-Seq dataset in which this TF was overexpressed in root protoplasts (Brooks et al. 2019).
To illustrate relationships between these datasets we generated an UpSet plot, which is shown in Figure 2. An alternative to a complex Venn diagrams, UpSet plots show overlap between sets of data as a bar graph. Each combination of datasets, or overlap group, is

Genes which respond to ACC or ethylene under many different conditions are enriched for targets of EIN3, an important ethylene response regulator
While the 449 ACC-responsive genes from Harkey et al. 2018 represent ACC response in roots of light-grown plants, we were also interested in the subset of those genes which respond to ACC or ethylene under multiple growth conditions. We previously identified a group of 139 ethylene-responsive genes in a meta-analysis of the ACC time-course experiment described above and two other independent experiments that looked at root-specific ethylene or ACC responses (Harkey et al. 2019). When we ran this group, which we termed the -gold standard‖ of root ethylene responses, through TF DEACoN, we found many TFs enriched, including 15 A c c e p t e d M a n u s c r i p t which had a significant p value (p < 0.05) and a logFC greater than 1 (Table 2). We also separated these -gold standard‖ genes by direction of response to ethylene or ACC; 2 TFs whose targets were not enriched in the whole group were enriched in the subset of up-regulated genes, and 10 TFs were similarly only enriched in the subset of down-regulated genes, demonstrating again how groups of genes with more similar responses can reveal TFs not found in general groups of genes. The TF DEACoN results for this -gold standard‖ set and its subsets can be found in Supplemental File 1.
Interestingly, EIN3 was the most highly enriched TF in the entire -gold standard‖ set (Table 2, logFC 2.34, p < 0.001). It was also highly enriched in the subset of these genes which show ethylene-dependent increases in transcript abundance (logFC 2.98, p < 0.001). In contrast, the gold standard transcripts which are down-regulated after ACC and ethylene treatment are not enriched for EIN3 targets (logFC 0.67, p = 0.46), consistent with a report that genes downregulated by ethylene may be regulated by chromatin remodeling (F Zhang et al. 2018). It is important to note that this group of genes whose transcript decreases does show substantial enrichment in TFs other than EIN3, suggesting this group may be regulated by transcriptional repressors. This analysis shows that groups of genes identified by meta-analyses are also suitable for mining via TF DEACoN. The high enrichment of EIN3, a known ethylene regulator, in this group of genes which have consistent response to ethylene in roots under different growth conditions also serves as a proof of concept, showing that key regulators can be identified by this bioinformatic approach. This suggests that previously unknown TFs that coordinate expression of groups of genes can also be identified. A c c e p t e d M a n u s c r i p t

ACC-responsive genes show cell-type specific patterns of expression in the root
We also tested the hypothesis that we could identify TFs that control ethylene response by grouping ACC-responsive genes based on cell-type expression patterns. There is currently no root cell-type specific transcriptomic dataset performed with ethylene or ACC treatment, so we utilized a dataset generated under growth conditions matched to the untreated ACC controls (Brady et al. 2007). While this comparison has limitations, it can still provide information on where ACC-responsive genes are expressed in untreated roots. The 449 ACC-responsive transcripts were clustered based on their expression across this root cell-type specific microarray dataset (Brady et al. 2007). This data was normalized across each gene on a 0 to 1 scale so that relative levels of expression could be more directly compared. The normalized data was used to perform k-means clustering of the 449 transcripts, resulting in 12 clusters. These clusters of ACC-responsive genes with similar cell type expression patterns were represented in a heatmap in Figure 3. The corresponding ACC response of the genes in each cell type cluster is shown below the cell type expression pattern of these genes.
Many of these clusters showing cell-type localization of ACC-responsive genes contain transcripts with high levels of abundance in one or a few cell types. For example, clusters 2 and 11 contain genes that are highly expressed in lateral root cells and pericycle cells, from which lateral roots form, respectively, suggesting that genes in these clusters may be important for the ACC-inhibition of lateral root initiation (Ivanchenko et al. 2008;Negi et al. 2008;Lewis et al. 2011). Cluster 3 genes are expressed in the columella, suggesting a role in ethylene-inhibited root elongation and/or gravitropism (Buer et al. 2006;Swarup et al. 2007). Clusters 4, 10, and 12 show more complicated expression patterns, but when clusters 4 and 10 are compared it is evident that transcripts in these two clusters show expression patterns that are opposite across the A c c e p t e d M a n u s c r i p t alternating epidermal hair and non-hair cell files, suggesting these might contain genes that allow ethylene stimulated root hair initiation and elongation (Tanimoto et al. 1995;Dolan 2001;Feng et al. 2017).
To determine how cell-type accumulation of transcripts overlays upon the ACC transcriptional response, we generated a heatmap that included both the cell-type clusters of the

Cell type ACC-responsive clusters have very different TF enrichment patterns
Using TF DEACoN, we analyzed 12 root cell-type clusters of ACC-responsive transcripts to identify candidate TFs. TFs with targets enriched with logFC > 0 in at least one cluster are reported in Table 3, with the p-values for each TF shown in each cluster to indicate the statistical significance of this enrichment relative to the genome. To provide additional information the number of targets in each cluster is reported as a percentage of the total targets for each TF in Supplemental Figure 3. Supplemental File 1 includes the complete TF DEACoN output for each cluster. There were no TFs with target enrichment in clusters 8 to 12, likely because of their small size. An interesting pattern emerged when looking at all TFs enriched A c c e p t e d M a n u s c r i p t across all clusters. Out of 123 TFs whose targets are enriched in at least one cluster, 96 TFs have enriched targets in cluster 3, which contains transcripts that show a pattern of highest expression in the columella. This led us to ask whether this pattern of many TF targets being enriched in columella-expressed genes is due to an uneven distribution of TF representation in the DAP-Seq data, or a unique feature of our dataset.
We used the same methods as above to cluster the cell-type data from the entire genome (Supplemental Figure 4). We analyzed these genome-wide cell-type clusters using TF DEACoN and report the distribution of TF families within clusters of the whole genome compared to the ACC-responsive cell-type clusters ( Figure 4). Overall, some TF families are more highly represented in the DAP-Seq data than others; for example, 49 NAC TFs are in the DAP-Seq data, but only 3 MADs TFs are, despite them having a similar number of family members (96 and 109, respectively). This skews the distribution of the whole genome cell-type clusters toward families with data for more TFs, but with notable exemptions. The AP2-EREBP family has the most DAP-Seq data, with 60 out of 138 family members represented, and high numbers of these TFs have targets enriched in some of the whole genome clusters (2, 4, 5, 6, 10, and 12), while others have no AP2-EREBP family members (1, 3, 8, 9, and 11). This suggests some TF families may be important in only some root cell types.
We compared patterns of enriched TF binding to genome-wide and ACC-response celltype clusters. In genome-wide cell-type clusters AP2-EREBP targets were prevalent, the ACCresponsive cell type clusters had few of these same targets. For example, genome columella cluster 10 has enrichment of targets of 35 of these AP2-EREBP TFs, while the ACC-responsive cell type columella cluster 3 was enriched in binding of only 4 members of this TF family ( Figure 4). A similar pattern is seen for the maturing xylem, with 24 TFs represented in the Downloaded from https://academic.oup.com/insilicoplants/advance-article/doi/10.1093/insilicoplants/diaa006/5885168 by guest on 19 August 2020 A c c e p t e d M a n u s c r i p t genome cluster (12), but none in the ACC-responsive cluster (1). In contrast, both cell types have the highest number of NAC family members in their ACC-responsive clusters. This suggests that perhaps NACs are of particular importance for ACC response in these cell types, where AP2-EREBP TFs are not. For comparison, the clusters that include -all protophloem‖ and phloem companion cells are similarly distributed in the ACC-responsive cluster (7) as in the whole genome (11), with most of the TFs belonging to the NAC and C2C2-Dof families, suggesting that the TFs which are important in this cell type may be less dependent on ACC/ethylene. These patterns show how partitioning target genes by cell-type expression may be of use for revealing transcription factor networks, in combination with TF DEACoN.

TF DEACoN analysis leads to identification of a role for NAM in ethylene response
One TF-encoding gene became interesting since it was present in several of our comparisons. This TF, called NAM (No Apical Meristem), because of its sequence similarity to a petunia TF that is required for shoot development, showed reduced transcript abundance upon treatment with ACC (Harkey et al. 2018). Targets of this TF were enriched in the columella celltype cluster 3, and NAM itself is expressed most highly in the columella, phloem, and phloem pole pericycle (Brady et al. 2007), suggesting a role in root development. Lastly, when we looked for information about other TFs whose targets were enriched in the ACC time course clusters, we identified JKD, a TF that was shown to bind to the NAM regulatory region using ChIP-Seq of root samples (Moreno-Risueno et al. 2015). This led us to examine ethylene responses in roots of an insertion mutant in the NAM gene. Both in the absence and presence of exogenous ACC, nam mutants had significantly more lateral roots and longer primary roots, than wild type, as shown in Figure 5. However, when lateral root density was calculated by dividing A c c e p t e d M a n u s c r i p t the lateral root number by primary root length, nam had higher lateral root density than wild-type plants, and the ACC-induced reduction in lateral root density was lost, suggesting a role for NAM in regulating ACC-inhibited lateral root development. Figure 6 contains a model that integrates the NAM regulatory circuit within the ethylene responsive gene regulatory network.

Discussion
Identification of transcription factors that control signaling and/or developmental responses is challenging because of the large number of putative TFs that still have no known function, and the fact that TFs which have already been characterized in one context may have additional roles in yet unknown contexts. Additionally, previously characterized TFs likely work in cooperation with other TFs whose function has not been previously described. For example, in ethylene signaling, EIN3 appears to primarily control up-regulated genes that respond to ethylene under all conditions, but many genes are down-regulated by treatments with ethylene or its precursor ACC, especially in roots of light-grown seedlings (Chang et al. 1993;Harkey et al. 2018Harkey et al. , 2019. Techniques are needed to narrow down the list of potential regulators in a given context. We have developed a tool that utilizes existing data to make predictions about which We have demonstrated that using genes clustered by kinetics of hormone response or cell type expression, rather than using large groups of genes which respond to a treatment with disparate kinetics or tissue specific expression patterns, yields many more candidates for TF In both ACC kinetic clusters and ACC cell type clusters, we also identified BIRD transcription factors (Long et al. 2015). The BIRD TF JKD has binding targets in ACC response clusters 1 and 2. One of its targets is NAM, which is also bound by EIN3, suggesting a mechanism by which ACC response and developmental responses may be tied. This network is highlighted in Figure 6. In ACC cell type cluster 3, enriched in genes with columella expression, target enrichment was found for RAVEN/IDD5, another BIRD TF involved in root tissue We found additional interesting candidate genes to mediate root ethylene responses in the cell-type clusters of ACC-regulated genes. Many TFs were significantly enriched in targets in at least one cluster, with a smaller number enriched in multiple clusters. Only two TFs were enriched in three clusters: ANAC043 and ANAC096. ANAC043, also known as NAC SECONDARY WALL THICKENING PROMOTING FACTOR1 (NST1) has been implicated in thickening of secondary cell wall in flower tissues, suggesting it could also play a role in this A c c e p t e d M a n u s c r i p t process in the roots (Mitsuda et al. 2007;Zhong and Ye 2015;Q Zhang et al. 2018;Zhang et al. 2020). ANAC096 knockout mutants and overexpression lines have both shown that this TF mediates abscisic acid-inhibited root elongation. This makes ANAC096 an interesting candidate as a mediator of ethylene's effect on root elongation.
One drawback of the method used here for identifying cell-type specific expression of ACC-responsive genes is that the cell-type expression was measured under standard growth conditions, without the addition of hormone. Some transcripts may not be detected in certain cell types until the addition of hormones; others may show may be strongly reduced in abundance in some cell types with subtle changes in other cells, so this is an imperfect method of separating hormone-responsive genes by cell type. Ideally, cell type data would be collected under the same hormone response conditions. No such dataset currently exists for root ethylene/ACC response, but the growing ease of single-cell sequencing makes this type of experiment increasingly accessible, and we look forward to applying TF DEACoN to additional datasets defined by both spatial and hormone response. Yet, this approach of combining two data sets is the hallmark of systems biology in which overlap between data sets suggests testable hypotheses about function.
For example ACC cell-type clusters 1 and 8 include transcripts with high abundance in maturing xylem and endodermis, respectively, and because most of the genes in these clusters go down with ACC treatment, the concerns described above are not an issue. These groups may inform future experiments to look at ethylene responses in these cell types.
We have demonstrated how TF DEACoN may be used to form hypotheses about which TFs may be involved in a process of interest. When we examined the TFs suggested by TF DEACoN using data from multiple approaches, NAM became particularly interesting because there is evidence that places it downstream of ACC signaling, and upstream of ACC-induced the Binding Factor Enrichment tool, which uses a similar strategy to TF DEACoN. It differs in that it employs both the DAP-Seq dataset, but also a variety of ChIP-Seq datasets. When we ran A c c e p t e d M a n u s c r i p t clusters discussed here through Plant Regulomics, we found significant enrichment of the same TFs as those given by TF DEACoN, though the adjusted p-value differed as they used a different statistical test (Plant Regulomics employs a modified Fisher's exact test) and performed a higher number of comparisons since they included ChIP-Seq data. In our experience, TF DEACoN returns results significantly more quickly. We believe these are complementary tools and highlight the simplicity of TF DEACoN, enabling rapid analysis of multiple clusters as demonstrated here, in comparison to the multifunctionality and complexity of Plant Regulomics.
Here we reveal the combined benefits of clustering genes based on the temporal responses to perturbations and developmental patterns to obtain additional information about regulators that cannot be seen when a larger group of genes is searched on either interface. This suggests an area of future improvement for TF DEACoN to incorporate these additional layers of information, which could be implemented in two ways. First, datasets that are likely to have broad appeal, such as those involving circadian rhythm (Michael et al. 2008), response to stresses, or cell type expression patterns could be built into TF DEACoN. Users could choose a dataset to use for clustering their genes, and the tool would automatically sort their gene list into groups based on this data and then run the TF target enrichment analysis on these groups individually. Second, the option to provide cluster numbers for each gene in the query list would enable users to pre-cluster their list of genes on whatever criteria or data they may find interesting, and then feed these clusters into TF DEACoN so that it performs the TF target enrichment analysis on each user identified cluster separately, but simultaneously.
Other directions for further development of TF DEACoN include the addition of datasets beyond DAP-Seq, such as ChIP-Seq and RNA-Seq. While these datasets do have their limitations for broad application to many experimental questions, they could prove useful for A c c e p t e d M a n u s c r i p t other researchers with similar interests. For example, in the case of NAC4 described above, RNA-Seq of samples isolated from root protoplasts overexpressing NAC4 revealed increased transcripts which are either direct or downstream of NAC4 targets (Brooks et al. 2019). Even though these transcripts did not align perfectly with NAC4 targets revealed by DAP-Seq, they did reveal enrichment of NAC4 targets in the same clusters of genes as the DAP-Seq data.
Confirming results by multiple datasets can further narrow down TFs of interest. These additions could extend the usefulness of TF DEACoN to a wide variety of researchers in the plant community.
In summary, we have produced a tool which enables researchers to make use of public data to generate hypotheses about transcriptional regulators of their process of interest and shown how refining lists of target genes based on patterns of expression can strengthen these predictions. We have applied these methods to investigate transcriptional networks downstream of ethylene signaling in Arabidopsis roots, and demonstrated their utility by identifying multiple candidate TFs from groups of genes clustered based on response to ACC and based on cell type expression. We report the confirmation of one of these predicted TFs, NAM as, a previously unknown regulator of root development.

TF DEACoN Development
The tool -TF Discovery by Enrichment Analysis of Co-expression Networks‖ (TF DEACoN, a nod to the Wake Forest University mascot) was developed using R Shiny

Datasets used in analysis
The list of DAP-Seq target genes for each TF (as determined by O'Malley et al. 2016) was downloaded from http://neomorph.salk.edu/dap_web/pages/browse_table_aj.php. These text files listing TF-target relationships for each TF were grouped into one master file to be used by TF DEACoN. The ACC time course data is available as a supplemental file (Harkey et al. 2018), as well as at GEO, accession number GSE84446. The cell type specific dataset was downloaded from http://www-plb.ucdavis.edu/labs/brady/software/BradySpatiotemporalData/, and is also available on GEO, accession number GSE8934.

UpSet plot
The UpSet plot in Figure 2

Phenotypic analysis
Arabidopsis (Arabidopsis thaliana) Col-0 seeds were purchased from Lehle Seeds. The nam mutant was obtained from the Arabidopsis Biological Resource Center, stock number WiscDsLox364F11 / CS852962, donated by Patrick Krysan, Michael Sussman, and Rick Amasino. The mutant was genotyped using two PCR reactions, one with wild-type gene-specific primers, and another with one gene-specific primer and one T-DNA primer to detect the insertion. Only plants homozygous for the insertion in the NAM gene (AT1G52880) were used for experiments. Additionally, RNA isolated from roots of wild-type (Col-0) and nam mutant A c c e p t e d M a n u s c r i p t plants were used in qPCR reactions using primers for the NAM transcript, which showed that nam has lower NAM transcript abundance than wild-type. M a n u s c r i p t      (60) in red, and the lowest (0) in white. Figure 5. The nam mutant has altered lateral root development. Arabidopsis wild-type (Col-0) and nam mutants were grown on MS media for 5 days, then transferred to either control media or media containing 1µM ACC and grown for an additional 5 days. All lateral roots and the entire primary root were quantified and used to calculate lateral root density as (number of lateral roots / primary root length). * p < 0.05 treatment compared to control within genotype; + p < 0.05 nam compared to Col-0, within treatment.