A Two-Stage Method for Improving the Prediction Accuracy of Complex Traits by Incorporating Genotype by Environment Interactions in Brassica napus

Improving the prediction accuracy of a complex trait of interest is key to performing genomic selection (GS) for crop breeding. For the complex trait measured in multiple environments, this paper proposes a two-stage method to solve a linear model that jointly models the genetic effects and the genotype× environment interaction (G×E) effects. In the first stage, the least absolute shrinkage and selection operator (LASSO) penalized method was utilized to identify quantitative trait loci (QTL). )en, the ordinary least squares (OLS) approach was used in the second stage to reestimate the QTL effects. As a case study, this approach was used to improve the prediction accuracies of flowering time (FT), oil content (OC), and seed yield per plant (SY) in Brassica napus (B. napus). )e results showed that the G×E effects reduced the mean squared error (MSE) significantly. Numerous QTL were environment-specific and presented minor effects. On average, the two-stage method, named OLS post-LASSO, offers the highest prediction accuracies (correlations are 0.8789, 0.9045, and 0.5507 for FT, OC, and SY, respectively). It was followed by the marker× environment interaction (M× E) genomic best linear unbiased prediction (GBLUP) model (correlations are 0.8347, 0.8205, and 0.4005 for FT, OC, and SY, respectively), the LASSO method (correlations are 0.7583, 0.7755, and 0.2718 for FT, OC, and SY, respectively), and the stratified GBLUP model (correlations are 0.6789, 0.6361, and 0.2860 for FT, OC, and SY, respectively). )e two-stage method showed an obvious improvement in the prediction accuracy, and this study will provide methods and reference to improve GS of breeding.


Introduction
In the last three decades, the development of molecular marker technology has provided numerous molecular markers for the most important species [1]. Regarding the use of molecular markers in the selection of a genetic trait, marker-assisted selection (MAS) [2] became a valuable tool in animal and plant breeding in the 1990s and works well for traits with a simple genetic architecture. However, MAS is not suitable for complex traits controlled by multiple genes, many of which have minor effects. erefore, genomic selection (GS) as an advanced form of MAS was first propounded by Meuwissen et al. [3]. Instead of using a subset of significant markers for selection in MAS, GS uses whole genome-wide markers to predict the genome-estimated breeding values (GEBVs) of selected individuals and thus avoids biased marker effects estimates due to the detection process in MAS. However, with high-density molecular markers, the number of markers (p) can vastly exceed the sample size (n), which is referred to as a "large p small n" problem. us, it is impossible to obtain the estimates of markers effects via a linear model by OLS [4].
To deal with the "large p small n" problem, one can impose some constraints on the linear model, resulting in penalized estimation methods, such as ridge regression (RR) [5] and LASSO [6]. RR performs parameter shrinkage only, while LASSO offers both parameter shrinkage and variable selection simultaneously. Both RR and LASSO can generate parsimonious models in the presence of a large number of predictors. e predictor size selected by LASSO is generally less than the sample size (n) [7], so applying OLS to the model selected by LASSO is feasible. e post-model-selection estimator was called "OLS post-LASSO" in the study of Belloni and Chernozhukov [8] and it possesses the advantage of a smaller bias than LASSO. e debiasing in OLS post-LASSO often improves the prediction error of the model [9], and this two-stage process is also known as the relaxed LASSO [10].
Bayesian methods are also applied to fit this "large p small n" problem in GS [11]. In Bayesian inference, the marker effects are considered as random instead of fixed, and the mixed-effects model is often adopted to describe the phenotypic variation. By specifying different priors for the random marker effects, many different models, including BLUP (best linear unbiased prediction) [12], BayesA, BayesB [3], BayesC [13], and Bayesian LASSO [14], have been proposed in GS (see de los Campos et al. [11] for an overview). BLUP is a statistical procedure used to estimate the random effects and is easily obtained by solving the wellknown Henderson's mixed model equations (MME) [15].
us, BLUP and its extensions, including rrBLUP (ridge regression BLUP) [16,17] and GBLUP [18], have become the most widely used methods in GS. Many software packages for those methods, for example, rrBLUP [17] and BGLR (Bayesian Generalized Linear Regression) [19], are freely available online. Recently, the GBLUP model has been extended to multienvironment data. For example, Lopez-Cruz et al. [20] proposed a M × E GBLUP model to accommodate the G × E, and they also compared the M × E GBLUP model with the stratified (within the environment) GBLUP model. e results showed that the prediction accuracy of the M × E GBLUP model was substantially greater than the stratified GBLUP model. e significant increase in the prediction accuracy of using multienvironment models compared with single-environment analysis has been confirmed in many crops, such as maize [21] and rice [22].
B. napus is one of the most important oil crops worldwide. To better understand the genetic control of important agronomic traits in B. napus, a doubled haploid (DH) population, named TNDH population, which was derived from the F1 cross between European cultivar Tapidor and Chinese cultivar Ningyou7, was developed [23]. After several years and trial locations, the phenotypic data were collected from a multienvironment, and the TNDH population has been adopted as reference resources by the OREGIN (Oilseed Rape Genetic Improvement Network) management team. Based on the TNDH population, many QTL for the complex traits have been detected (see Shi et al. [24], etc.). Recently, a high-density genetic map of the TNDH population with a total of 2041 molecular markers was constructed [25]. Using this high-density genetic map, the genomic prediction accuracy of the FT trait in B. napus was evaluated via eight existing models by Li et al. [26]. However, the authors did not incorporate the G × E effects into their study. As previously noted, the G × E effects play a very important role in explaining the variation of the complex traits. Accumulating studies showed that incorporating G × E effects into the GS model could substantially increase the prediction accuracy of the complex trait. erefore, in this study, based on the representative TNDH population, we will evaluate the performance of a two-stage approach via a linear model that jointly models the genetic effects and G × E effects. e objective of the present study is to improve the prediction accuracy of complex traits for B. napus. In contrast to the most commonly used G × E GS models, such as the M × E GBLUP model of Lopez-Cruz et al. [20], we treat the marker effects as fixed instead of random. We assume that the LASSO method can be used to identify the main effect and environment-specific effect QTL. Based on the identified QTL, a parsimonious linear model can be established and the OLS method is used to reestimate the QTL's effects. e performances of this twostage approach named OLS post-LASSO and other comparison methods, including LASSO, M × E GBLUP model, and the stratified GBLUP model of Lopez-Cruz et al. [20], are evaluated in terms of prediction accuracy for FT, OC, and SY.

Genotypic and Phenotypic Data.
A published dataset for the TNDH population was used in this study (see Luo et al. [27] for details). e TNDH population was derived from an F 1 progeny of a cross between a European winter cultivar "Tapidor" and a Chinese semiwinter cultivar "Ningyou7" [23]. e population comprises 182 DH lines grown at five different sites (Wuhan, Jiangling, Daye, Hangzhou, and Dali) in China for over five years (2002)(2003)(2004)(2005)(2006)(2007). Combining the harvest year and the location, a total number of ten environments (year-location combinations) are available and are coded as "S3," "S4," "S5," "S6," "S7," "E7," "N3," "N4," "N6," and "N7," separately. ere are a total of 2041 molecular markers for each DH line genotyped ("A" and "B" were denoted for "Tapidor" and "Ningyou7", respectively) by the Brassica 60K Illumina Infinium SNP array, and a total of 22 traits, including SY, OC, and FT, are collected from all the ten environments. Details of phenotypic and genotypic data and how the TNDH population was developed can be found in Luo et al. [27]. ese 182 TNDH lines, the 2041 markers, and the phenotypic data for three complex traits (SY, OC, and FT) across all the ten environments were used in the present study.

Two-Stage Approach.
Suppose that there are several n population lines (individuals) cultivated in a total of m environments, y ki is individual i's trait value collected from environment k (i � 1, . . ., n, k � 1, . . ., m). Since those n individuals cultivated in different environments share the same genotypes, we use x ij to denote the genotype of individual i at locus j (j � 1, . . ., p). x ij � 1 and 0, respectively, represent A and B genotypes, where p is the number of markers. For jointly modeling the genetic effect and the G × E effect, a G × E linear model by regressing phenotypes on markers across all the multiple environments can be described as follows: 2 Discrete Dynamics in Nature and Society where μ is the overall mean (the intercept term), which is stable across the environments, and μ k is the environment effect (E) that may vary by environment, β j is the main effect across all the environments (Q), α kj is the environmentspecific effect or equivalently the interaction effect between the j th locus and the k th environment (Q × E), and ϵ ki ∼ N(0, σ 2 ) is the residual error. If some β j s or α kj s are not equal to zero, we considered that there exist the main effects or the Q × E effects. Model (1) is very similar to the "M × E GBLUP model" of Lopez-Cruz et al. [20]. In M × E GBLUP model, both the main effects and the environment-specific effects are treated as random effects. Also, the M × E GBLUP model does not include the overall mean and can be expressed as follows: where α kj s are called M × E effects by Lopez-Cruz et al. [20]. Furthermore, ignoring the M × E effect and analyzing data separately in each environment, model (1) was reduced to as the "stratified GBLUP model" by Lopez-Cruz et al. [20]. e stratified GBLUP model can be expressed as follows: where β kj is the effect of the j th marker on the k th environment.
In the study, marker effects, including the main effect and the environment-specific effect, are considered as fixed instead of random. Considering that the number of parameters in model (1) is often larger than the sample size in GS, a two-stage approach, that is, OLS post-LASSO, was used to solve this problem, and the approach can be described as follows.
, θ � (β T , α T ) T , and K � (m+1) p is the number of columns in matrix X; then Z is an mn × m matrix, X is an mn × K matrix, and θ is a Kdimensional column vector. Matrices Z and X can be partitioned by columns, such that Z � (Z 1 , . . . , Z m ) and X � (X 1 , . . . , X K ), and vector θ can be written as θ � (θ 1 , · · · , θ K ) T . en, model (4) can be expressed as Equation (5) is the standard form of a linear model. Considering that the number of predictors (K + m) in model (5) is often much larger than the sample size (mn) in the context of G × E, we use the LASSO method to solve the model first.
e LASSO estimator of model (5) can be obtained by minimizing where ‖ · ‖ denotes the l 2 -norm and λ ≥ 0is a tuning parameter. e tuning parameter can be selected by k-fold cross-validation, for example, using 10-fold cross-validation. As noted by many studies, such as Zou and Hastie [7], the number of nonzero effects selected by LASSO is generally less than the sample size, that is, mn for model (6). erefore, after LASSO, OLS regression using the selected QTL is feasible and possesses some advantages, especially reducing the shrinkage bias of LASSO [8].
us, in the second stage, we adopt the OLS to reestimate the QTL effects.
is two-stage method was referred to as OLS post-LASSO by Belloni and Chernozhukov [8].

Full Data Analysis.
To estimate the marker effects and show G × E relevance and structure in the dataset, model (5) was first fitted to the full dataset using OLS post-LASSO. As noted previously, the markers with nonzero main effects and environment-specific effects by LASSO in the first stage were reported as QTL. Based on the selected QTL, the OLS method was used to reestimate its values. e reestimated values from OLS are reported as the final estimated effects of QTL. We use the t-test to test whether the corresponding reestimated effect of each QTL is equal to zero or not. If the p value of the t-test is less than 0.05, the corresponding QTL is reported as significant QTL. Otherwise, it is reported as a nonsignificant QTL. For the nonsignificant QTL, the corresponding effects are not significantly different from zero or, equivalently, the corresponding effects are ignorable. Meanwhile, the OLS approach can produce the corresponding standard errors (S.E.) of the estimate for the parameters. Based on the standard errors of the estimate for the QTL's effect, we can construct the 95% confidence intervals for the estimate of QTL's effect, including the main effects and the environment-specific Discrete Dynamics in Nature and Society effects. e 95% confidence intervals are calculated by the estimated effects plus or minus 1.96 times the standard errors.
For the linear regression, R-squared is often reported as the measure that represents the proportion of the variation in the dependent variable explained by the model. However, R-squared always does not decrease as more predictors are added to the model. us, R-squared cannot be used to measure the contributions of each predictor. e adjusted R-squared does not increase as more predictors are added; thus it is chosen as the measure to evaluate the contribution of each component of the model. However, we cannot interpret the adjusted R-squared the same as R-squared. Note that the adjusted R-squared is equal to the percentage of the decrement of the MSE from the null model that only contains the intercept term to the alternative model that contains both the intercept term and other components of the model. erefore, we calculate the MSE of both the null model and the alternative model. Meanwhile, we calculate the decrement and the percentage of decrement between them. To better understand the contributions of the three components of the model, that is, E, Q, and Q × E, five alternative models incorporating the three components of the model and its combinations, E + Q and E + Q + Q×E, are evaluated in the article.
ose alternative models with higher values of the adjusted R-squared, that is, higher percentages of the decrement of the MSE from the null model to the corresponding alternative model, are the better models. e corresponding components included in the better models would play an important role in prediction.

Randomly Splitting the Data for Assessing Prediction
Accuracy. For comparison, the existing "M × E GBLUP model" and "stratified GBLUP model" mentioned above are chosen as comparison methods. Meanwhile, the n � 182 TNDH lines across all the ten environments (i.e., m � 10) are chosen as a working example for assessing the prediction accuracy of the two-stage method and the comparison methods. Based on the 182 TNDH lines, for each complex trait, that is, SY, OC, and FT, we merge the phenotypic datasets as a long vector just as described in the methodology from all the ten environments into one dataset. After merging, the sample size is enlarged to 1820 (�nm � 182 × 10).
en, for each merged dataset, we randomly partition it into training and testing datasets at a proportion of 2 : 1.
is random partition is repeated 100 times, resulting in a total of 100 random training datasets and the corresponding 100 random testing datasets.
e marker effects are estimated on each training dataset across environments using LASSO, OLS post-LASSO, and M × E GBLUP model and within each environment using stratified GBLUP model. e GEBVs are computed in the corresponding testing dataset across environments using estimated LASSO, OLS post-LASSO, and M × E GBLUP model and within each environment using the estimated stratified GBLUP model. en, we calculate the correlation between the GEBVs and the observed phenotypes for each trait within each environment. Taking the average across 100 replicated partitions, we obtain the average correlation and report it as prediction accuracy within each environment. Meanwhile, the standard deviation (SD) of the sampling distribution of the prediction accuracy among 100 replicated partitions is also reported to indicate the deviation of the accuracy.

Software.
e minimizing problem of equation (6) can be efficiently solved by Least Angle Regression [29] in R software [30] using "lars" package or Alternating Direction Method of Multipliers (ADMM) [31] algorithms in MAT-LAB software using "lasso" function, which is used in the present study. e M × E GBLUP model (2) and stratified GBLUP model (3) are implemented using the R package BGLR [19].

Marker Effects.
e number of detected QTL and the frequency analysis of significant or nonsignificant QTL are reported in Table 1. From Table 1, we can see that the total number of QTL with main effects varied across traits. ere are a total of 46, 77, and 26 QTL with main effects for FT, OC, and SY, respectively, and there are also a total of 231, 237, and 146 QTL with environment-specific effects for FT, OC, and SY, respectively. For main marker effects, fewer of them have significantly nonzero effects, and the percentages of significant QTL are 39.13%, 32.47%, and 42.31% for FT, OC, and SY, respectively. Equivalently, most identified main effect QTL by LASSO have small or ignorable effects, and the percentages of nonsignificant QTL are 60.87%, 67.53%, and 57.69% for FT, OC, and SY, respectively. For environment-specific marker effects, 15.58%, 16.03%, and 6.85% for FT, OC, and SY, respectively, have effects significantly different from zero. us, most identified environment-specific effects QTL by LASSO have small or ignorable effects. Figures 1-3 show the point estimates and 95% confidence interval (95% CI) of marker main and environmentspecific effects along the chromosomes. e vertical green confidence intervals that overlap the horizontal line of zero contain the value of zero; thus, the corresponding marker effects are nonsignificant. e vertical blue confidence intervals that represent the corresponding marker effects are statistically significant under the significance level of 0.05. As noted from these figures, most of the main and environment-specific marker effects are small and not significantly different from zero. Figures 1(c)-3(c) show the standard error (SE) of environment-specific marker effects for the same detected environment-specific QTL. e positive values (blue lines) of S.E. indicate that the corresponding QTL have environment-specific effects in multiple environments. Figures 1-3 show that few environment-specific effects QTL interact with multiple environments, and most of them display their effects in only one environment. 4 Discrete Dynamics in Nature and Society

Decrement of MSE.
e decrements of the MSE from the null model that only includes the intercept term to the alternative model that includes both the intercept term and one of the components of E, Q, Q × E, E + Q, and E + Q + Q × E are reported in Table 2.
From Table 2, we can see that the MSE of the null model is 198.4734 for FT, 6.2661 for OC, and 0.3503 for SY, respectively. After adding the component of the Q × E into the model, the decrement of the MSE is 189.3257 for FT, 4.6635 for OC, and 0.2143 for SY, respectively, and the corresponding percentage of the decrement, that is, the adjusted R-squared, is 95.3910% for FT, 74.4235% for OC, and 61.1769% for SY, respectively. at means Q × E plays a key role in the model. If adding the combination of E and Q into the null model, the corresponding percentage of the decrement of the MSE is 98.2912% for FT, 74.6738% for OC, and 62.0056% for SY, respectively, which is slightly larger than that of the alternative model that contains both intercept and Q × E. e percentage of the decrement of MSE for the full model that contains all of the three components, E + Q + Q × E, has the highest values of 98.6731%, 88.0751%, and 66.6784% for FT, OC, and SY, respectively. us, the full model is most appropriate to use to predict the complex traits.
Another interesting finding is that the percentage of the decrement of MSE for the model with combined components, such as E + Q and E + Q + Q × E, is not equal to the sum of the percentages of decrement of the MSE for all separated models that contain only one of them.
is is because the main effects QTL and the environment-specific effects QTL are highly correlated. Correlations among them can change the percentages of the decrement of the MSE dramatically from what they would be in a separate model.

Prediction Accuracy.
Respectively, the prediction accuracies of FT, OC, and SY are evaluated (Tables 3-5). From Tables 3-5, we can see that the highest average prediction accuracy among ten environments was obtained using the OLS post-LASSO method (average correlations are 0.8789, 0.9045, and 0.5507 for FT, OC, and SY, respectively). is two-stage method is followed by the M × E GBLUP model (average correlations are 0.8347, 0.8205, and 0.4005 for FT, OC, and SY, respectively). For FT and OC, the third performing method is the LASSO approach (average correlations are 0.7583 and 0.7755 for FT and OC, respectively). However, for SY, the stratified GBLUP method is the third performing method (average correlation is 0.2860).
us, on average, the two-stage method always performs the best in prediction accuracy for all the three complex traits.
Also, although the performances of various methods vary in different environments, the OLS post-LASSO method is advantageous in all the 10 environments except for "N6" for FT and "S5" for SY. e OLS post-LASSO achieves its best accuracy in "S4" for FT (correlation is 0.9333), in "S7" for OC (correlation is 0.9188), and "N6" for SY (correlation is 0.7039). For FT, in the environment "N6," the M × E GBLUP model has higher prediction accuracy than the OLS post-LASSO method (correlations of 0.  Table 3). In other words, the probability of the difference between the average prediction accuracies of the OLS post-LASSO and M × E GBLUP model is less than 0.05. us, the improvement is significant for FT, and this finding is also true for OC and SY as shown in Tables 4 and 5, respectively.

Discussion
Since GS was proposed by Meuwissen et al. [3] in 2001, numerous studies have been performed to increase the prediction accuracy of the trait of interest, and numerous approaches have been proposed for GS in different situations, especially BLUP type methods. As one of the derivations of the BLUP method, the GBLUP method has become a commonly used GS method and has shown success in many situations, such as in the presence of the G × E. In this study, we established a general G × E linear model to simultaneously model the genetic effects and the G×E effects. By treating the marker effects, including the main marker effects across all environments and the environment-specific marker effects, as fixed instead of random, a two-stage method named OLS post-LASSO was used to solve the model and obtain a genomic prediction.
e OLS method was also used by Meuwissen et al. [3] but not in the context of G × E. For using the OLS method in GS, the biggest effects selected by some procedure, such as single segment regression analysis performed by Meuwissen et al. [3], were included. However, this stepwise procedure  Figure 1: e point estimates and 95% confidence intervals (95% CI) for marker main and environment-specific effects and the standard error (SE) of environment-specific effects for FT. (a) Point estimates (red dots) and 95% CI for main effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (b) Point estimates (red dots) and 95% CI for environment-specific effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (c) SE for environmentspecific effects (the blue and green stems represent environment-specific effects found in multiple environments or only one environment, respectively). 6 Discrete Dynamics in Nature and Society Using the TNDH population as a working example, prediction accuracy was compared for three complex traits (FT, OC, and SY) using four methods, namely, OLS post-LASSO, LASSO, M × E GBLUP model, and stratified GBLUP model. Generally, the two-stage method, that is, OLS post-LASSO, achieved the highest prediction accuracies on average across environments and also achieved the highest prediction accuracies within most of the ten environments. e M × E GBLUP model performed worse than the twostage method but better than the other two approaches, namely, the LASSO and the stratified GBLUP model. Although LASSO performed worse than the M × E GBLUP model, the OLS after LASSO, that is, OLS post-LASSO, outperformed the M × E GBLUP model, and the improvement in prediction accuracy is significant. e results show that the OLS post-LASSO could always outperform the    Discrete Dynamics in Nature and Society  LASSO, whether for FT, OC, or SY. Although the advantage of the two-stage method has been reported by Belloni and Chernozhukov [8], the application in GS and its benefit in increasing predictive ability are first studied in this study.
From the computational aspects of the model, the twostage approach took around 45 minutes to compute (Windows10 Pro with a 1.6 GHz Intel Core i5-8250U processor and 8 GB of memory) in the first stage and around 0.1 seconds to compute in the second stage. e computation time is higher than that of the stratified and the G × E GBLUP models. It took around 30 seconds and 10 minutes for the stratified and the G × E GBLUP models, respectively. However, in the case of G × E, the two-stage method tends to fit easily compared to traditional methods, such as the factor-analytic method. e factor-analytic method tries to simplify a complex covariance structure and, in some cases, for example, in the case of G × E, difficulty in reaching convergence [32].
As a penalized regression method, LASSO was first implemented in GS by Usai et al. [33], and its prediction performance was evaluated by many studies, such as Ogutu et al. [34] and Xu et al. [35]. LASSO as well as GBLUP, the most commonly used method in GS, always outperformed other methods, such as rr-BLUP [34] and support vector machine (SVM) [35]. Based on the FT trait dataset of TNDH population, the study of Li et al. [26] indicated that the average prediction accuracies across the ten environments varied from 0.593 to 0.651 using the existing eight models: rr-BLUP, reproducing kernel Hilbert spaces (RKHS), Bayesian LASSO, BayesA, BayesB, random forest (RF), and SVM (linear kernel and Gaussian kernel). e average prediction accuracies obtained by Li et al. [26] for the FT trait were lower than those in the four methods evaluated in the present study (average correlations are 0.8789, 0.8347, 0.7583, and 0.6789 for OLS post-LASSO, M × E GBLUP model, LASSO, and the stratified GBLUP model, respectively). e stratified GBLUP model performed similarly bad to the eight models evaluated by Li et al. [26] because those methods ignore the G × E effects in the analysis. e results of our study confirmed that incorporating G × E effects into the GS model increased prediction accuracy, which has been noted by many studies, such as Lopez-Cruz et al. [20]. In particular, the two-stage method performed the best for complex traits FT, OC, and SY. e percentage of decrement of MSE by our model (1) (corresponding to the alternative model with E + Q + Q × E) is very close to 100% for the FT trait (98.6731%). is finding indicates that our proposed model fits the FT trait dataset perfectly. However, the performance is reduced when applying the same model (1) to other traits, for example, to OC (the percentage of decrement of MSE is 88.0751%) and SY (the percentage of decrement of MSE is 66.6784%). As noted by Luo et al. [27], the FT shows very high heritability; however, the SY shows low heritability. us, it is reasonable that our proposed model accounts for more variation of FT but less variation of SY. However, even in the case of more complex traits, that is, the OC and SY, the prediction performance of our proposed method remains superior compared with previous approaches, like the M × E GBLUP model and other models.
Although the FT is not complex as SY, the identified number of QTL for FT is larger than that for SY (Table 1). If we only focus on the number of identified QTL, there seems to be some irrationality. We can explain the issue from at least the following two aspects. First, the identified QTL by LASSO is suggestive, and further experimental identification is required. at means the detected QTL may not be the true QTL. Second, from the first subplot (a) of Figure 1, we can see that there exists a major QTL on chromosome C5 for FT, which has the largest main marker effect, and the other significant QTL (the blue lines) have smaller main marker effects than the major QTL. However, we cannot find the  Figure 2: e point estimates and 95% confidence intervals (95% CI) for marker main and environment-specific effects and the standard error (SE) of environment-specific effects for OC. (a) Point estimates (red dots) and 95% CI for main effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (b) Point estimates (red dots) and 95% CI for environment-specific effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (c) SE for environmentspecific effects (the blue and green stems represent environment-specific effects found in multiple environments or only one environment, respectively).
major QTL for OC and SY (please see subplots (a) in Figures 2 and 3). Meanwhile, from the magnitude of the absolute value of the main marker effects, we can see that it decreases from around 4 for FT to around 1.5 for OC and around 0.3 for SY. Similar patterns can be found for the environment-specific marker effects (please see the subplots (b) of Figures 1-3 for details). us, the finding of our study also supports that the FT trait is not as complex as OC and SY as we expected.

Data Availability
Phenotypic and marker data used in the article can be found in Supplemental file S1.  Figure 3: e point estimates and 95% confidence intervals (95% CI) for marker main and environment-specific effects and the standard error (SE) of environment-specific effects for SY. (a) Point estimates (red dots) and 95% CI for main effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (b) Point estimates (red dots) and 95% CI for environment-specific effects (the vertical blue and green CIs represent statistically significant or nonsignificant values, respectively). (c) SE for environmentspecific effects (the blue and green stems represent environment-specific effects found in multiple environments or only one environment, respectively). Supplementary Materials e compressed supplementary file named "S1_TN182 Phenotypic and marker data.zip" includes an Excel file named "TN182 Phenotypic and marker data.xlsx." e Excel file includes four sheets. e "Environment" sheet shows the information about the ten environments, including the name of macro environments, the code of the experiment, and so on. e "Trait name" sheet shows the names, the abbreviation, and the measurement of the three traits which are studied in the paper. e "Phenotype" sheet shows all the phenotypic values of the three traits collected from all the ten environments.

Conflicts of Interest
e "Genotype" sheet shows the genotype matrix of all the individuals. (Supplementary Materials)