Exploiting natural variation in crown root traits via genome-wide association studies in maize

Root system architecture (RSA), which is determined by the crown root angle (CRA), crown root diameter (CRD), and crown root number (CRN), is an important factor affecting the ability of plants to obtain nutrients and water from the soil. However, the genetic mechanisms regulating crown root traits in the field remain unclear. In this study, the CRA, CRD, and CRN of 316 diverse maize inbred lines were analysed in three field trials. Substantial phenotypic variations were observed for the three crown root traits in all environments. A genome-wide association study was conducted using two single-locus methods (GLM and MLM) and three multi-locus methods (FarmCPU, FASTmrMLM, and FASTmrEMMA) with 140,421 SNP. A total of 38 QTL including 126 SNPs were detected for CRA, CRD, and CRN. Additionally, 113 candidate genes within 50 kb of the significant SNPs were identified. Combining the gene annotation information and the expression profiles, 3 genes including GRMZM2G141205 (IAA), GRMZM2G138511 (HSP) and GRMZM2G175910 (cytokinin-O-glucosyltransferase) were selected as potentially candidate genes related to crown root development. Moreover, GRMZM2G141205, encoding an AUX/IAA transcriptional regulator, was resequenced in all tested lines. Five variants were identified as significantly associated with CRN in different environments. Four haplotypes were detected based on these significant variants, and Hap1 has more CRN. These findings may be useful for clarifying the genetic basis of maize root system architecture. Furthermore, the identified candidate genes and variants may be relevant for breeding new maize varieties with root traits suitable for diverse environmental conditions.

and the dynamic soil environment, have crucial functions affecting plant productivity and tolerance to environmental stresses [3]. Research regarding plant roots has been limited by the complexity of phenotyping the underground plant parts and because there is relatively little relevant genetic information available to breeders. However, a recent study confirmed that modifying the root architecture can increase resource use efficiency and yields [4], which has prompted plant scientists to focus more of their attention on plant roots. Clarifying the genetic mechanism regulating root development is critical for enhancing crop performance and increasing food security.
Root system architecture (RSA) is the spatial deployment of roots. Crown roots form the majority of the maize root system's backbone [5]. To characterise the effects of RSA on crop performance, several phenotyping methods involving artificial conditions (e.g., on germination paper) and natural conditions (e.g., field trials) have been developed to evaluate genotypic variations and the utility of root traits [6]. Moreover, shovelomicsbased experiments have been widely used to investigate soil resource acquisition by maize crown roots [7]. Furthermore, maize crown root traits, including crown root number [8], diameter [9], and growth angle [4], are associated with the above-ground performance of plants as well as the nutrient content and grain yield. Decreasing the number of roots in maize inbred lines may lead to deeper root growth and enhanced acquisition of water and nitrogen in dry and low-nitrogen soils [8,10]. In low crown root number phenotypes, roots may be too spatially dispersed to sufficiently acquire soil resources, low crown root number supports a deep root system, so the shallow portion of deep roots can acquire shallow N resources while the deep portion can explore deep soil for water resources [10]. Since water and nitrate enter deeper soil strata over time, root systems with rapid exploitation of deep soil would optimize water and N capture in most maize production environments [11]. The root angle is a crucial factor influencing how deep roots can grow in the soil to obtain water and nutrients. RSA that are narrow and vertically oriented are typically associated with increased drought tolerance [12]. In maize, rice and wheat, root growth depth is positively associated with yield under drought conditions [4,13,14]. Relatively steep root angles increase rooting depth and drought tolerance in rice [4] and common bean [15]. The root angle also affects nitrogen capture, with steep maize root angles under low-nitrogen conditions potentially increasing nitrogen acquisition by 15%-50% as the soil nitrogen content decreases [16,17]. Compared with thin roots, thick roots have greater mechanical strength, leading to better anchorage and lodging resistance, protection against herbivory, and enhanced soil penetration [9]. Inbred lines exhibiting high nitrogen use efficiency (NUE) reportedly have larger root diameters than lines with a lower NUE under low-nitrogen conditions, and most genotypes decreased root diameter under stress when averaged across nodes [9]. An exposure to nitrogen stress induces several maize genotypes to increase their root diameter. Thus, the root diameter can influence adaptive stress responses. Accordingly, breeding maize lines with modified crown roots may lead to the development of new cultivars suitable for various stress conditions.
Exploring the natural variations in crown root traits may lead to new insights into root development, while also revealing elite allelic variations that can enhance root performance. Many recent quantitative trait locus (QTL) mapping studies regarding the number of maize crown roots have been conducted [18]. For example, QTL analyses in single and multiple environments have been performed for the crown root angle (CRA), the crown root diameter (CRD), and the crown root number (CRN) in a recombinant inbred line population. A total of 46 QTL were detected in a single environment and 25 QTL were identified in multiple environments, with most loci confirmed as minor-effect QTL for crown root traits [19]. Two major QTL for the total brace root number were identified by Ku et al. [20], and the largest additive effect was 16.4%-17.9%. Cai et al. detected five QTL for CRN at three maize growth stages, including one consensus QTL in chromosome bin 10.4 [21]. There have been relatively few studies on the maize CRA. Only 10 root angle-related QTL were identified in two maize-teosinte F 2 populations [22]. In rice, six major QTL for root angle (DRO1, DRO2, DRO3, DRO4, DRO5, and qSOR1) have been identified, and DRO1 and qSOR1 have been cloned [23,24]. Additionally, DRO1 is the first cloned gene associated with the root growth angle. Its expression is negatively regulated by auxin and the encoded protein helps mediate root tip cell elongations related to asymmetrical root growth and the downward bending of the root in response to gravity [4]. A previous study confirmed that qSOR1, which is a DRO1 homologue, is also negatively regulated by auxin, and is predominantly expressed in root columella cells to influence root gravitropic responses [24]. All of these genes are potential targets for root system architecture (RSA)-related breeding. Regarding the root diameter, only one QTL associated with CRD has been detected under well-watered and water stress conditions [25]. Moreover, 21 root architecture traits were evaluated in three recombinant inbred populations, but only one root diameter-related QTL was identified [26]. Six QTL were identified for four root anatomical traits which is directly related to root diameter [27].
To date, only a few maize studies have identified QTL and allelic variations associated with RSA development in the field. As an alternative to traditional QTL mapping and map-based cloning, a genome-wide association study (GWAS) can identify the genes and allelic variations responsible for natural phenotypic diversity. In this study, the CRN, CRA, and CRD of 316 maize inbred lines were evaluated in three field trials. The objectives of this study were (i) to study phenotypic variation of crown root traits within a maize association panel, (ii) to identify significant SNPs associated with CRA, CRD and CRN, and (iii) to detected potential candidate genes and natural variations for crown root development.

Phenotypic variations
A total of 316 inbred lines from a diverse maize panel were evaluated regarding their CRA, CRD, and CRN at maturity in field trials across three environments. Phenotypic analyses confirmed the three crown root traits differed substantially in the examined environments ( Table 1, Fig. 1). The coefficient of variation ranged from 4.72% to 9.87% for the BLUP values for the three environments. The frequency distributions of the analyzed crown root traits revealed a relatively normal distribution in each environment (Fig. 1). A two-way ANOVA indicated the effects of the genotype, the environment, and the genotype × environment interaction was significant for the three root traits (Table 1). On the basis of a correlation analysis of the crown root traits, a weak negative correlation between CRA, CRD and CRN was observed ( Fig. 1).

Genome-wide association study of crown root traits
To clarify the genetic basis of crown root traits, association analyses involving 140,421 SNP markers for the CRA, CRD, and CRN of the 316 genotypes were conducted using the following two single-locus methods: GLM and MLM, and three multi-locus methods: Farm-CPU, FASTmrMLM, and FASTmrEMMA. A total of 482 marker-trait associations were identified ( Table 2, Fig. 2, Table S1). More specifically, FASTmrMLM identified the most significant SNPs (216), followed by GLM (174), FASTmrEMMA (61), FarmCPU (33) and MLM (4). Additionally, 185, 127, and 170 significant SNPs were detected for CRA, CRD, and CRN, respectively, by all different methods. A total of 48, 209, and 59 significant SNPs were identified in 2015SY, 2016SY, and 2017SY, respectively, whereas 166 significant SNPs were detected based on the BLUP value for the three environments. Twenty-nine SNPs were identified by at least two methods, among these 17 loci were identified by single locus and multi-locus methods simultaneously (Table S1). Such as, SNP_5_158203765 and SNP_4_235330942, were detected by four GWAS methods except MLM. After clumped, a total of 38 QTL including 126 SNPs were detected for three root traits. Q7 including 7 significant SNP were detected in 2016SY and BLUP value for CRA. Q8 were detected in 2016SY and BLUP value for CRA by both GLM and MLM. Q17 associated with CRD could detected by GLM, FASTm-rEMMA and FASTmrMLM. SNP_4_235330942 in Q29 were identified by GLM, FarmCPU, FASTmrEMMA Table 1 Descriptive statistics, ANOVA for crown root traits in different environments a the number indicated F value by Two-way ANOVA, and a significant effect by F test was indicated by "**" (P < 0.01)  (Table 3). We compared the positions of the QTL identified in this study with the positions of the QTL reported from previous biparental studies [19]. Of the 38 loci we detected, three overlapped with QTL previously identified from bi-parental mapping studies (Table 3). Q19 associated with CRD was located within 15 CRD6 and av CRD6 identified by a previous QTL mapping. Another QTL associated with CRD, Q21, was located within confidence interval of av CRD8. A CRN QTL-Q23 was located within confidence interval of 17 CRD4.

Determination of candidate genes
A total of 113 potential candidate genes within 50 kb of the 126 significant SNPs were identified based on the GWAS results and the filtered predicted gene set from the annotated B73 reference maize genome (version 5b.60) (Table S2). To examine the expression profiles of these candidate genes, we compared the gene expression levels in different B73 maize tissues. The 25 genes that were not expressed in the crown roots (i.e., FPKM = 0) were eliminated and 88 genes were obtained. After combining the gene annotation information and the expression profiles, we selected 3 candidate genes potentially related to crown root development (Table S2). GRMZM2G141205, which encodes an Aux/IAA-transcription factor, was related with a CRN QTL-Q33. This gene was exclusively expressed in the crown roots. GRMZM2G175910 encoded a cytokinin-O-glucosyltransferase was around SNP_7_103946580 in Q10, which was associated with CRA. This gene was mainly expressed in the roots, stem, internode, cob and seed (Table S2). GRMZM2G138511 encoded a heat shock protein, which was associated with CRN.

Resequencing and a haplotype analysis
To investigate the associations between the allelic and phenotypic variations in the association panel, one gene was selected for resequencing. Our GWAS results identified SNP_6_146692801 (Table S2) (Table 4; Fig. 3A, B). On the basis of these five variants, four major haplotypes, which were present in more than 15 lines, were revealed in all tested inbred lines (Fig. 3C). Hap1, 2 3 and 4 including 31, 80, 186 and 15 lines, respectively. The ANOVA revealed that a significant phenotypic difference in the CRN was observed among the haplotypes, with Hap1 associated with the most crown roots (Fig. 3D).

Plant materials and phenotyping
The panel used in this study was composed 316 inbred lines which was mainly collected in China (Table S3), all the accessions were stock in Key Laboratory of Plant Functional Genomics of the Ministry of Education, Yangzhou, China [28]. The panel was mainly clustered into three groups referred to as Lancaster, Reid, and Tang SPT according to their known pedigrees and germplasm [29]. The panel were planted in three field trials in Sanya from 2015 to 2017. The field trial was conducted in a completely randomized block design of one-row plots with two replications. Each row was 3 m long, 0.5 cm wide and contained 11 plants. Shovelomics was conducted to evaluate three crown root traits at maturation stage [30]. Six healthy plants were randomly selected for excavation using standard shovels in each line. Shoot-borne roots formed in soil are designated crown roots, while shootborne roots initiated above soil level are termed brace roots. The crown root should emerge on the node situated flush with the soil surface. Here, the excavated plant was shaken briefly to remove most of the soil, the brace roots were first cut with a cutter knife, and the crown root on top layer could then be easily distinguished. The roots on top layer of crown root (first layer crown root) were selected to measure crown root angle (CRA), number (CRN), and diameter (CRD, mm) [19]. Root angles were measured with a protractor as degrees from horizontal: horizontal roots were classified as 0 • and vertical roots as 90 • . Selected crown roots were cut off at 1 cm from the top of the root, and root diameter was measured with where μ represents the population mean, G i represents the genotype effect, E j represents the environment effect, GE ij represents genotype-by-environment interaction, B k (E j ) represents the block effect, and e ilk represents the random error. The best unbiased linear predictive value (BLUP) of each trait was also calculated using a mixed linear model in R package "lme4". The "lme4" package was used to estimate genotypic ( σ 2 g ), G-E interaction ( σ 2 ge ) and error variances ( σ 2 e ). The broadsense heritability h 2 = σ 2 g /(σ 2 g + σ 2 ge /e + σ 2 e /re) , e and r represent the number of environments and blocks in each environment [31].

Genotypic data and genome-wide association analysis
The panel of 316 inbred lines was genotyped using the genotyping-by-sequencing strategy [28]. After quality control (missing rate ≤ 20%; minor allele frequency ≥ 0.05) 140,421 SNPs remained for GWAS. The number of SNPs on each chromosome ranged from 10,344 (chromosome 9) to 19,513 (chromosome 1). Most SNPs (50.31%) were located outside of genic regions, of those in genic regions, 14.51% SNP were located within introns, which is more than that in exons (3.50%). The principal component analysis (PCA) was also performed with TASSEL, and the top five principal components were used to create a population structure matrix to control the population structure. Kinship matrices were calculated using the centred IBS method in TASSEL to estimate the genetic relatedness among individuals. A GWAS was conducted with two single-locus methods: GLM (with PC) and MLM (with PC and kinship) in TAS-SEL. FarmCPU [32], FASTmrEMMA [33] and FAST-mrMLM [34] and were used for multi-locus GWAS. FarmCPU was implemented in the R package GAPIT3 with FarmCPU model (with PC). FASTmrEMMA and FASTmrMLM were conducted with the software package mrMLM.GUI V3.2 with five PCs and kinship to control structure. For GLM, MLM and FarmCPU, the P value threshold was 7.12 × 10 −6 (1/n, where n is the number of SNPs). A suggestive LOD threshold value of 3.0 was used for FASTmrEMMA and FASTmrMLM [33,34]. The LD decay was determined by software PopLDdecay [35], and an average LD across all chromosomes decayed to r 2 = 0.20 in approximately 50 kb [36]. The significant SNPs were grouped into one region if the LD between the two neighboring SNPs was r 2 > 0.6. The grouped region was detected in at least two environments or by at least two models was considered as a QTL, and SNPs with the minimum P value in a QTL were considered as the lead SNPs. The positions of the QTL identified in this study were compared the confidence interval of the QTL reported from previous biparental studies to identify overlapped QTL by different populations [19].

Candidate gene analysis and gene-based association mapping
All the potential candidate genes within 50 kb of the detected loci were identified. The expression data and gene annotation information were collected from maizeGDB database (http:// www. maize gdb. org) [37,38]. The physical locations of the genes and SNPs were based on the maize B73 RefGen_V3 genome (version 5b.60).
Genomic DNA was extracted from the fresh young leaves of the 316 inbred lines. The candidate genes from the tested inbred lines were sequenced with the targeted sequence capture technology of the NimbleGen platform [39] by BGI Life Tech Co., Ltd. A multiple Levene's test (P = 0.93) indicated that the variance is homogeneous, so one-way ANOVA was used for comparisons between different haplotypes. The protected LSD post hoc test (a = 0.05) was used for multiple comparison tests. Different letters indicated that significant difference between haplotypes were observed (P < 0.05) sequence alignment was analysed with the MAFFT software (v7.313). Gene-based polymorphisms were identified with TASSEL 5.2, with a minor allele frequency (MAF) ≥ 0.05. The significance of the association between maize SNPs and target traits was conducted with the MLM model (MLM + PCA + Kinship) in TAS-SEL 5.2. The P value threshold to control the genomewide type 1 error rate was 1/n (where n is the number of markers for the candidate gene) [40]. To test the effect of unequal sample sizes between groups, Levene's test was used to test if different groups have equal variances. The result indicated that the variance is homogeneous (P = 0.93), so one-way ANOVA was used for comparisons between different haplotypes. The protected LSD post hoc test (a = 0.05) was used for multiple comparison tests.

Discussion
Because the RSA is a key determinant of plant anchorage and the efficient uptake of nutrients and water, it directly influences crop yield potential [3,12]. Breeding crops with an optimal RSA for capturing soil resources is a promising strategy for increasing yields under stress conditions [41]. However, the limited available information regarding the genetic mechanism underlying RSA development has been an obstacle for molecular markerassisted breeding. Earlier research proved that CRN, CRA, and CRD are critical factors that define the maize RSA [11,16,42]. The challenges associated with evaluating root traits in the field have hindered the characterisation of the genetic mechanisms regulating root traits. Several new methods have recently been developed for relatively high-throughput and accurate analyses of roots under field conditions [43]. In this study, shovelomicsbased experiments were conducted to evaluate the CRA, CRD, and CRN of 316 inbred lines at maturity in three field trials. Moreover, a GWAS was performed to elucidate the natural variations in the crown root traits.
Genome-wide association studies have identified genes responsible for complex plant traits [44]. Analyses involving MLM-based (mixed linear model) single-marker associations are commonly used to detect important loci related to complex plant traits, but their stringent significance thresholds resulting from large-scale multiple testing may prevent the identification of many critical loci [44]. And most quantitative traits are controlled by a few genes with large effects and numerous polygenes with minor effects. However, the current single-marker associations do not match the true genetic model for these traits [33]. In multi-locus GWAS, the Bonferroni correction is replaced by a less stringent selection criterion. Multi-locus approach such as FASTmrMLM and FASTmrEMMA shows high statistical power in QTN detection, model fit and low false positive rate [33,45]. In this study, a GWAS was conducted using two singlelocus methods and three multi-locus methods. A total of 174, 4, 61, 210, and 33 SNPs were identified with GLM, MLM, FASTmrEMMA, FASTmrMLM, and FarmCPU, respectively. Twenty-nine SNPs were identified by at least two methods, including SNP_5_158203765 and SNP_4_235330942, which were detected by four GWAS methods except MLM. We also detected 23 variants in at least two environments. After clumped, a total of 12, 10 and 16 QTL were detected for CRA, CRD and CRN. Linkage and GWAS analyses are two complementary approaches commonly used to clarify the genetic basis of complex traits. Additionally, these methods may be used for cross-validations [46]. We compared our GWAS results with the QTL identified in the recombinant inbred line population, which was included in an earlier evaluation of the same root traits in three field trails [19]. A total of three QTL were located in previous QTL intervals. Four candidate genes located in these QTL. Q6 was overlapped with QTL for seminal number, and two candidate genes were located in the confidence interval [26]. The consistency in the findings confirmed the accuracy of the GWAS results.
Auxin is a common regulator of plant root development. The auxin-TIR1/ABF2-AUX/IAA-ARF-LBD pathway is involved in post-embryonic root development in cereals and Arabidopsis [47]. The Aux/IAA proteins are key regulators of the nuclear auxin response pathway, and many of them can modulate root development in Arabidopsis, rice, and maize [48][49][50]. In maize, mutations to ROOTLESS WITH UNDETECTABLE MER-ISTEMS 1 (RUM1; an Aux/IAA protein) that prevent auxin-mediated degradation lead to lateral root defects [50]. Another Aux/IAA-related gene, rul1, is a duplicated homologue of rum1, which controls SR and LR initiation in maize [51]. In the current study, we identified an Aux/IAA protein-encoding genes, GRMZM2G141205, which were associated with CRN. The natural variations in GRMZM2G141205 were validated by resequencing and a haplotype analysis, and five variants were significantly associated with CRN. Here, two variants were in the upstream region of GRMZM2G141205. A collection of studies has shown that variations in the genes upstream may affect the gene expression and further lead to plant phenotypic alterations. For example, a CACTAlike transposon insertion in the upstream promoter region of ZmCCT10 in maize disrupts gene expression and attenuates photoperiod sensitivity under long-day environments [52]. An SNP in the promoter region of ZEA CENTRORADIALIS 8 (ZCN8) affected flowering time by altering the level of gene expression [53]. SNP-1300 and InDel-481 were the potential variants that may