Modeling Genotype × Environment Interaction Using a Factor Analytic Model of On-Farm Wheat Trials in the Yaqui Valley of Mexico

2647 The Yaqui Valley in Sonora, Mexico, is agro-climatically representative of a major ecological zone (megaenvironment) comprised of irrigated wheat areas with very low rainfall. In the Yaqui Valley, there are approximately 225,000 ha of land under an irrigation system where canals are used to deliver water from a series of reservoirs for use by farmers as flood or furrow irrigation. Other major representative irrigated wheat areas with very low rainfall are the Indus Valley in Pakistan, the Gangetic Valley in India, and the Nile Valley in Egypt. Around 32 million hectares of wheat (producing 42.7% of the total wheat production in the world) are grown in this type of irrigated environment in the developing world. The wheat-growing periods in these areas are associated with temperate climates; however, sudden rises in temperature before and after flowering are common and salinity problems may occur where drainage is poor (Meisner et al., 1992). The yield expansion during the Green Revolution era of the 1970s was the result of sowing varieties with improved disease resistance and increased yield potential, together with greater use of agricultural inputs. The Yaqui Valley has been the site of the wheat breeding operations of the International Maize and Wheat Improvement Center (CIMMYT) for more than 50 yr. The bread and durum wheat yields in the Valley have progressed at a remarkable rate, but the rate of increase has slowed in the last decades. However, the gap between the high bread wheat (BW) and durum wheat (DW) yields on research stations, and the low yields in farmers’ fields in the Yaqui Valley has been, on average and over the years, about 3 t ha–1 or more. It should be pointed out that farm size in the Yaqui Valley ranges from medium to large, from less than 10 ha to more than 150 ha (average farm size: about 20 ha); the most common farm size is about 10 ha. The grain yield gap between CIMMYT’s main wheat breeding experiment station in the Yaqui Valley and farmers’ fields in the Yaqui Valley stimulates the practice of evaluating advanced bread and durum wheat lines from CIMMYT’s Global Wheat Modeling Genotype × Environment Interaction Using a Factor Analytic Model of On-Farm Wheat Trials in the Yaqui Valley of Mexico

T he Yaqui Valley in Sonora, Mexico, is agro-climatically representative of a major ecological zone (megaenvironment) comprised of irrigated wheat areas with very low rainfall.In the Yaqui Valley, there are approximately 225,000 ha of land under an irrigation system where canals are used to deliver water from a series of reservoirs for use by farmers as flood or furrow irrigation.Other major representative irrigated wheat areas with very low rainfall are the Indus Valley in Pakistan, the Gangetic Valley in India, and the Nile Valley in Egypt.Around 32 million hectares of wheat (producing 42.7% of the total wheat production in the world) are grown in this type of irrigated environment in the developing world.The wheat-growing periods in these areas are associated with temperate climates; however, sudden rises in temperature before and after flowering are common and salinity problems may occur where drainage is poor (Meisner et al., 1992).
The yield expansion during the Green Revolution era of the 1970s was the result of sowing varieties with improved disease resistance and increased yield potential, together with greater use of agricultural inputs.The Yaqui Valley has been the site of the wheat breeding operations of the International Maize and Wheat Improvement Center (CIMMYT) for more than 50 yr.The bread and durum wheat yields in the Valley have progressed at a remarkable rate, but the rate of increase has slowed in the last decades.However, the gap between the high bread wheat (BW) and durum wheat (DW) yields on research stations, and the low yields in farmers' fields in the Yaqui Valley has been, on average and over the years, about 3 t ha -1 or more.It should be pointed out that farm size in the Yaqui Valley ranges from medium to large, from less than 10 ha to more than 150 ha (average farm size: about 20 ha); the most common farm size is about 10 ha.
The grain yield gap between CIMMYT's main wheat breeding experiment station in the Yaqui Valley and farmers' fields in the Yaqui Valley stimulates the practice of evaluating advanced bread and durum wheat lines from CIMMYT's Global Wheat Program directly in on-farm trials in the Yaqui Valley under the agronomic practices in farmers' fields in environments with optimum and reduced irrigation.Data on the performance of durum and wheat lines evaluated during several years in multi-environment trials (METs) in farmers' fields are very valuable for studying and assessing genotype × environment interaction (G×E) and the grain yield stability of durum and bread wheat lines.
The differential response of the genotypes across environments generated G×E variability, which could be due to crossover interaction (COI) or non-COI, that is, the change in scale of a trait measured in different environments characterized by the heterogeneity of genetic variances within different environments.Since phenotype is the result of specific genotype responses a(α) to specific environmental effects b(γ), Gregorious and Namkoong (1986) proposed a joint multiplicative operator between these two effects to reflect their non-additivity (i.e., non-separability) or their additivity (i.e., separability).From Gregorious and Namkoong's perspective, complete separability of effects takes place when each genotype × environment cell is defined only by the sum of the corresponding genotypic and environmental main effects without including the joint multiplicative operator of a(α)b(γ).In multi-environment trials (METs), there is non-separability of these effects (i.e., important COI), and a joint multiplicative operator of the a(α)b(γ) type needs to be defined to study the non-separability (COI) of effects.Within the fixed effects linear model perspective, the joint multiplicative operator a(α)b(γ) could be considered a special case of the use of a linear-bilinear model for analyzing METs, such as the sites regression model (SREG) (Crossa and Cornelius, 1997).The SREG provides a series of multiplicative operators computed from a reduced-rank model matrix of deviations of the parametric cell mean of the ith genotype in the jth environment from the mean of the jth environment (i.e., the effects of the genotypes plus the effect of the G×E).Singular value decomposition based on genotypic and environmental factors is a method used to estimate the parameters of the reducedrank model.It provides a low-dimension representation of the variability in the original data (Gabriel, 1978).Model SREG 2 has two multiplicative operators (first and second singular vectors for genotypes and environments), each multiplied by a constant consisting of the singular value for the first and second bilinear component.
The linear-bilinear models mentioned above can be adjusted under the fixed effects model perspective; this assumption is intrinsically restrictive, since it does not allow making inferences of results to other situations of locations, years, and individuals.However, linear-bilinear models can be stated as linear mixed models assuming certain variance-covariance relationships for the G×E.For example, the SREG model is related to the factor analytic (FA) model (for describing the combined estimates of the main effects of genotype plus G×E interaction) among the s environments by their relationship to unobservable latent variables (Lawley and Maxwell, 1971) through a multiplicative operator of genotypic and environmental factors.Crossa et al. (2004) used the FA model within the linear mixed model framework to show that there is no statistically significant departure from positive proportionality in disjointed subsets of environments or genotypes revealed by the SREG (and the Shifted Multiplicative Model) clustering procedures.
The FA multiplicative operator used to describe the effect of the ith genotype in the jth environment consists of the sum of k latent factors.For the first two latent factors, the FA(2) multiplicative operators comprise the loadings for the jth environment, and the scores of the ith genotype.The matrices of loadings and scores are rotated to obtain a principal component solution (Smith et al., 2002) and produce the same graph as the one developed from SREG 2 (for completely balanced data sets), with the same interpretation in terms of (i) directions and projections of genotypes and environments; and (ii) subsets of environments and genotypes with COI and non-COI responses.Using the FA variance-covariance structure to model the bilinear terms of a mixed model version of the regression of the genotypic mean on the site means was first proposed by Piepho (1997).The FA variance-covariance structure for modeling genotype plus G×E in METs is used in plant breeding programs for selecting genotypes in each environment based on Best Linear Unbiased Predictions (BLUPs) or empirical BLUPs (Smith et al., 2002(Smith et al., , 2005;;Kelly et al., 2007).Furthermore, FA offers a powerful solution for simultaneously modeling G×E with heterogeneity of variances and covariances between environments together with the numerical relationship matrix (A) obtained from the coefficients of parentage among genotypes to study additive genetic × environment and additive × additive genetic × environment interactions (Crossa et al., 2006;Oakey et al., 2006;Burgueño et al., 2007).
The use of FA in plant breeding for studying G×E has been extensive (Meyer, 2009).Recently the use of FA has been extended to genomic prediction for assessing G×E interaction; for example, Burgueño et al. (2012) compared the predictive power of genomic prediction of linear mixed models with a multiplicative operator of the type mentioned above with (i) single-environment pedigree or/and marker-based models and (ii) multi-environment pedigree or/and marker-based models.Burgueño et al. (2012) used a FA statistical framework for multi-environments with G×E that is flexible enough to incorporate pedigree and genomic relationships, and found increasing prediction accuracy when using simultaneously genomic and pedigree information while modeling the G×E interaction.Furthermore, two recent studies aimed at quantifying the genetic gains in two historical data sets from CIMMYT's Global Wheat Breeding Program employed the FA for modeling the complex G×E phenomenon (Crespo-Herrera et al., 2017, 2018).Crespo-Herrera et al. (2018) analyzed 740 locations in 66 countries classified into low-yielding and medium-yielding locations; modeling the complex G×E with the FA produced genetic gains of 1.8% in low yielding locations and 1.41% genetic gains in medium-yielding locations.
Despite the extensive use of FA for assessing and studying G×E in plant breeding, the use of linear-bilinear models for METs in agronomy and on-farm trials for studying and modeling G×E of agronomic practices × environments has decreased.Based on these considerations, the main objective of this study was to apply the FA model to multi-environment trials established by CIMMYT's Global Wheat Program in farmers' fields in the Yaqui Valley during three cropping seasons (2012-2013, 2013-2014, and 2015-2016) and model the G×E of several durum and bread wheat advanced lines based on grain yield (although several traits were measured).Four farmers were employed; each planted his own trials in different parts of the field using his own agronomic practices including optimal irrigation and reduced irrigation.Different numbers of bread and durum wheat lines were planted in the different cropping seasons (ranging from 12 to 17 for each crop), making the data across cropping seasons very unbalanced.Analyses for modeling G×E using the FA model were performed within each cropping season and combined across cropping seasons.
The main objectives of this study were to model the G×E between the bread and durum wheat lines and the environments employed in the 3 yr of on-farm trials, and to evaluate the efficiency of using a linear mixed model with FA covariance structure to identify (i) wheat lines that are highly productive and stable in the environments, and (ii) to detect environments that cause large (or not) G×E.

Phenotypic data
Bread and durum wheat lines were evaluated in on-farm trials for grain yield and other traits during three cropping seasons (2012-2013, 2013-2014, and 2015-2016) in the Yaqui Valley of southern Sonora, Mexico.In the first cropping season (2012-2013) 33 genotypes were evaluated, 17 bread and 16 durum wheats; in the second cropping season (2013-2014) 33 genotypes were also evaluated, only this time there were 14 bread and 19 durum wheats; finally, in the third cropping season (2015-2016) only 24 genotypes were evaluated, 12 of which were bread wheat and 12 were durum wheat (Table 1).The above situation generated an imbalance in the complete data, since the genotypes evaluated were different in all three cropping seasons; in total there were 53 genotypes including both bread and durum wheats.In addition, the number of on-farm trials was different in each cropping season: in the 2012-2013 and 2015-2016 cropping seasons, six on-farm trials were established, whereas in the 2013-2014 cropping season, only five were performed.The frequency with which each genotype was evaluated in each on-farm trial (environment) within each cropping season and the imbalance pattern are shown in Table 1.In each farmer's field, the wheat and durum lines were planted in a randomized complete block design with three replicates.
The notations of the individual environments were as follows: the letter F followed by a number stands for a different farmer (F1 for farmer 1, F2 for farmer 2, and so on); the next letter represents the irrigation scheme (F for full irrigation, or R for reduced irrigation), and finally, lower-case letters a or b indicate a different place where a farmer's trial was established under the same irrigation scheme (another environment for that same farmer).In the combined analysis, we simply added two digits to indicate the cropping season (12 for 2012-2013; 13 for 2013-2014; and 15 for 2015-2016).For example, 12F1Fa stands for 'a' plot (i.e., a place on the farm of farmer 1 in the 2012-2013 cropping season under full irrigation).Therefore, environment is defined as the combination of farmers (and the place where the trials was located within that farmer), the irrigation (full and reduced), and the year (or cropping season, 2012-2013, 2013-2014, and 2015-2016) The trials (environments) were planted by CIMMYT using a Wintersteiger experimental plot machine, on the same planting date as the one used by the farmer in his field.The rest of the crop management practices (fertilization, irrigation, weed control, etc.,) were implemented by the farmers.The trials were combine-harvested by CIMMYT using a Wintersteiger plot combine.Each experimental unit was six beds wide, and the bed size ranged 75-80 cm wide × 8 m long.Some farmers planted three rows of wheat on top of the beds and others planted a fourth row in the furrow.The harvested area was the three central meters of the two central beds.All the trials were arranged as complete block designs with three replicates.Different traits such as thousand-grain weight, grain moisture, grain yield, "black point" disease scores, and grain quality scores of yellow berry, among others, were measured.

stAtIstIcAL ModeL
We will describe the baseline fixed effect model, the general fixed effect linear-bilinear SREG and the linear mixed model similar to SREG with the FA model for modeling the main effect of the genotypes plus the G×E interaction.Finally, we will explain the connection between the SREG and the FA.
the Linear-bilinear site regression Fixed effect Model The baseline fixed effect linear model for response of genotypes in environments is y ( ) , where ij y is the empirical mean response of the ith genotype (i = 1,2,…,g) in the jth environment ( j = 1,2,…,e) with r replications in each of the g×e cells, μ is the grand mean across all genotypes and environments,τ i is the effect of the ith genotype, δ j is the effect of the jth environment, (τ δ) ij is the interaction of the ith genotype in the jth environment, and ij e is the (average) error assumed to be normally and independent distributed (0,σ 2 /r) (where σ 2 is the within-site error variance, assumed to be the same for all sites).Note that the term environments may refer to individual locations (or sites) in one specific year or several locations across several years, such that environment is condensed and commonly refers to the combination of location (farmer, irrigation level, and farmer's plot) and year.
When the interaction bilinear term,( ) ij td , of the above linear model is subjected to singular value decomposition, and decomposed as Another class of fixed effects linear-bilinear models, described by Crossa and Cornelius (1997), is the SREG , where δ j is the mean of the jth environments and the parameters m l , im a , and jm g are the same as those defined for AMMI.However, in the SREG model, the main effects of genotypes and G×E are combined and expressed as the sum of bilinear effects.The SREG model has been used for grouping environments without genotypic rank change (Crossa and Cornelius, 1997;Crossa et al., 2004).The interaction parameters im a and jm g of these linear-bilinear models, i.e., AMMI and SREG, define the behavior of genotypes and environments, and when 1 2 and 1 2 , j j g g are plotted together in  (2012-2013, 2013-2014, 2015-2016).Each trial between cropping seasons corresponds to a different farmer and irrigation regime.Empty cells represent imbalanced data.a biplot (Gabriel, 1978), useful interpretations of the relationships among genotypes, environments, and G×E are obtained.Usually 1 2 , j j g g are called primary and secondary effects of environments, respectively.

Linear Mixed Model for the Variance-covariance structure of the Genotype × environment
Assuming g genotypes, e environments, and r replicates in each environment, the response variable can be modeled as follows: where X is the incidence matrix for the fixed effects of environments, and Z r and Z g are the incidence matrices for the random effects of replicates within sites and genotypes within sites that combine the main effects of genotypes and G×E.Vector b is the fixed effect of environments and vectors r, g, and e are the random effects of replicates within environments, genotypes within environments, and residuals within environments, respectively, and are assumed to be random and normally distributed with zero mean vectors and variance-covariance matrices R, G, E, respectively, such that the joint distribution of these three terms is multivariate normal, that is: Crossa et al. (2004Crossa et al. ( , 2006)), a representation of these matrices is as follows: where y j is the vector of the response variable in the jth environment ( j = 1,2,…,e), 1 is a vector of ones, μ j is the population mean of the jth environment, and

Modeling Matrix G by the Factor Analytic Model
Various variance-covariance structures can be proposed to model matrix G; the most restrictive variance-covariance structure would be to assume that variances within all environments are equal, and that all pairwise correlations between genotypes are zero, whereas the most liberal structure is the completely unstructured model that assumes the matrix contains e(e+1)/2 parameters.In this study, we used the factor analytic structure for modeling G in terms of a few hypothetical (unobservable) factors.The unobservable factor effect of the ith genotype in the jth environment can be expressed as t , where ik d is the kth random regression coefficient of the ith genotype (loading or genotypic sensitivity) to the kth unobserved (latent) variable related to the jth environment (environmental potentiality), x jk , and d ij is the residual interaction term.The score of the ith genotype in the kth factor component is estimated as a covariance parameter; thus the model has a random regression coefficient form where x 1k , x 2k ,…, x sk are covariates for the kth factor component.In matrix notation, the vector of genotypic effects is represented by g = Δx +d so that the variance-covariance of g is V(g) = ΔV(x) Δ´ + D and, since V(x) = I, V(g) = Δ Δ´ + D. The factor analytic model implies that the variance of the effect of the ith genotype is t 2 1 +d k ik i

S d
= and the covariance of the effects of genotypes i and i´ is t Therefore, the factor-analytic structure with q ≤ e factors or components [FA(q)] is of the form ΔΔ´ + D, where Δ is an e × q matrix of δ´s and D is an e × e diagonal matrix with e nonnegative parameters on the diagonal.Each column of Δ contains the genotype scores for one of the multiplicative terms.For q = 1, the model denoted as FA(1) has one multiplicative term and 2e parameters to be estimated, for q = 2, i.e., model FA(2), the model has 3e parameters to be estimated, and so on for FA(3), etc.

connection between sites regression Model And Factor Analytic
The FA can be interpreted as the linear regression of genotype and G×E on latent environmental covariates (environmental loadings, x jk ), with each genotype having a separate slope (genotypic scores, δ ik ) but a common intercept (if main effects of genotypes are not distinguished from G×E).The slopes of genotypes measure the sensitivity of the genotypes to hypothetical environmental factors represented by the loadings of each environment.The interpretation of the loadings (environments) and scores (lines) of the first two components of FA(2) is the same as the interpretation obtained by the SREG fixed effect linear model with two components (SREG 2 ).Under this rotation, the directions and projections of the vectors of FA(2) and SREG 2 in the biplot are the same.Therefore, the properties that determine that the first component of SREG 2 accounts for non-COI and the second component of SREG 2, is due to COI variability should be the same as for FA(2) and as shown by Burgueño et al. (2008).Therefore, the FA can be considered a member of the general class of linear-bilinear SREG models in which a random effects assumption for the multiplicative part of the linear-bilinear model (i.e., SREG) is justified.
Therefore, based on the above considerations, for a completely balanced data set, the biplots originated by the SREG 2 and the FA(2) are equivalent and display similar patterns in terms of directions and projections of the environments and the genotypes.However, in the SREG model, environment, lines and G×E are fixed effects and therefore inferences are based solely on these effects, whereas in the linear mixed model with FA, the inferences on these effects are based on the variance-covariance structures.Thus, the main advantage of the FA over the linear-bilinear fixed effect SREG is that by modeling the G×E, the standard error of the line means their differences are more precise than in the SREG model.In this study we show only biplots generated from FA(2) that are very similar to those obtained by the SREG in each year.

software used
The mixed models using the FA covariance structure were fitted through the ASReml-R package (Butler et al., 2009) within the R Statistical software (R Core Team, 2017).The variance parameters in the mixed models were estimated using the Average Information (AI) algorithm.

resuLts results of cropping season 2012-2013
The individual analyses of variance for each cropping season show the significance of G×E in all three cropping seasons (Table 2).Particularly in the first cropping season (2012-2013), the G×E was only significant at a 2.66% level (p = 0.0266), while in the other two cropping seasons, it was highly significant (p < 0.0001).The contrasts between BW versus DW in each of the individual analyses, as well as in the combined analysis across all trials, are depicted in Table 2.
The genetic correlations of the environments obtained after fitting FA(2) indicated all correlations were positive and relatively high between environments 12F1Fa and 12F1Ra and all the others (Table 3), except environment 12F1Rb, which had low correlations with all the other environments except 12F1Fa.For cropping season 2012-2013, when fitting the FA(2) model, the two factors explained, on average across the six environments, 77.63% [FA(1)] and 15.53% [FA(2)] of G×E variability, and together explained 93.16% of total G×E variability (Table 4); the environment with highest production was the fully irrigated 12F2F environment (8.881 ton ha -1 ).For most environments, FA(1) explained a relatively large proportion of the variance, except for environment 12F1Rb, where FA(1) explained only 23.93%, indicating the difference between this environment and the other environments; therefore 12F1Rb is the environment that caused most of the G×E.
Figure 1 shows that the durum wheats (DW, in red) generally had higher yields than the bread wheats (BW, in blue).In particular, line DW_C (in bold) is durum wheat Cirno, which had the highest yield across all six trials.The plots (farmers) were grouped according to their irrigation scheme.In the top right quadrant are four trials, mostly under full irrigation: farmer 1, full irrigation, second plot 'b' (12F1Fb); farmer 2, full irrigation (12F2F); and farmer 3, full irrigation (12F3F), generating a very compact and homogeneous cluster (small angles between their vectors), except for the trial of farmer 1, reduced irrigation and plot 'a' (12F1Ra).In the bottom right quadrant of the biplot, the grouping tendencies are not as clear as before; it contains both irrigation regimes: farmer 1, reduced irrigation, second plot (12F1Rb), and farmer 1, full irrigation, first plot (12F1Fa).Environment 12F1Rb seems to be the one causing most of the G×E interaction, as already indicated by the low percentage of variance explained by FA(1) (Table 4) and the low correlation with the other environments (Table 3).
The Supplementary Material show the means of the wheat lines in each of the 6 environments for cropping season 2012-2013, the FA(1) and FA(2) loadings for the wheat lines, and the phenotypic correlations between the loadings of the wheat lines and the grain yield of the lines for each environment and across  environments.Except for environment 12F1Rb (r = 0.588), all the other environments had a correlation higher than 0.8 with the loadings of FA(1).The most productive line across all the environments was durum wheat Cirno (DW_C).

results of cropping season 2013-2014
In the analysis of variance for the 2013-2014 cropping season, the G×E was highly significant (p < 0.0001), indicating the differential response of bread and durum wheats across the five environments.The grain yield was similar for both groups, as indicated by the nonsignificant contrast in the combined analyses (p = 0.2880).Similarly, in the individual contrasts there were two environments that were also nonsignificant: farmer 1 full irrigation regime in 'b' plot (F1Fb, p = 0.8029) and farmer 1 reduced irrigation regime 'b' plot (F1Ra, p = 0.2723), whereas in the plots of farmer 1 full irrigation 'a' plot (F1Fa), and of farmer 1 under reduced irrigation regime in 'a' and 'b' plots (F1Rb), the contrast was highly significant (p = 0.0012, p < 0.0001 and p < 0.0001, respectively) (Table 2).
The genetic correlations between the environments in cropping season 2013-2014 were intermediate to high for most pairs of environments, except environment 13F2F, which had a negative correlation with environments 13F1Fa (-0.551), 13F1Ra (-0.557), and 13F1Rb (-0.034) (Table 3).In the FA(2) model, the two factors explained, on average across all environments, 62.38% and 25.66% of the variance for FA(1) and FA(2), respectively, and together explained 88.04% of total G×E variability (Table 5).This value was similar to that obtained for the environments in the 2012-2013 cropping season, except that now the first factor has less associated variability and the second factor has higher associated variability, which can be interpreted as lower stability for both genotypes and environments.Factor analytic component FA(1) explained a sizeable proportion of G×E variance, except for environment 13F2F (11.57%;Table 5), which is the one causing most of the G×E interaction.The reduced irrigation environment 13F1Ra had the highest production (7.614 ton ha -1 ).
The biplot (Fig. 2) shows that DW_C was consistently the genotype with the highest yield across all trials [greatest value in FA(1), and a very low value in FA( 2)].However, in general, the durum (DW, in red) and bread wheats (BW, in blue) were not as clearly differentiated as they were in the 2012-2013 cropping season, which was reflected in the significance of contrasts (Table 2).The environments were grouped based more on plot similarities than on the irrigation scheme; the top right quadrant shows the plot owned by farmer 1 in the 'b'plot for both irrigation regimes (13F1Fb and 13F1Rb), while the bottom right quadrant contains the plot owned by farmer 1 in the 'a' plot, once again for both irrigation schemes (13F1Fa and 13F1Ra).Finally, the plot owned by farmer 2 under full irrigation (13F2F) is isolated from the others in the upper left quadrant.
The biplot in Fig. 2 shows environment 13F2F as being in the opposite direction of the rest of the environments, indicating its negative correlation with the other environments (Table 3) and that it is the environment where FA(1) explained less of the G×E interaction and thus the one causing the COI (Table 5).
The Supplementary Material show the means of the wheat lines in each of the five environments for cropping season 2013-2014, and the phenotypic correlation between the loadings of the wheat lines and the grain yield of the lines in each environment and across environments.Except for environment 13F2F (r = -0.54),all the other environments had a mean correlation higher than 0.6 with the loadings of FA(1).Durum wheat line DW_C had the highest mean yield across all the environments.

results of cropping season 2015-2016
In the on-farm trials during the 2015-2016 cropping season, only 12 durum wheat genotypes and 12 bread wheat genotypes   were tested.In the combined analysis of variance (Table 2), the genotype × environment interaction was highly significant (p < 0.0001), indicating variability in the performance of both durum and bread wheats across the six environments included in this cropping season.In the combined analysis, the contrast was highly significant, as was the case in the four individual trials of farmer 1 with both full and reduced irrigation schemes (plots F1Fa, F1Fb, F1Ra, and F1Rb), whereas in the plots of farmers 2 and 4 with full irrigation (plots F2F and F4F), it was not significant (Table 2).
The genetic correlations among environments (Table 3) show a varying set of correlation values.Environment 15F1Fa had zero correlation with environments 15F1Fb (-0.003) and 15F1Rb (-0.032), whereas for this cropping season, environment 15F2F had a high genetic correlation with 15F4F (0.982) and 15F1Ra (0.891).In agreement with the genetic correlation, the variance percentage explained by the FA(1) component of environment 15F1Fa was only 1.38% of total G×E variance, indicating the complex interaction caused by environment 15F1Fa (Table 6).For the FA(2) model, the two factors explained, on average across all environments, 61.54% and 20.48% of the total variance for FA(1) and FA(2), respectively, and together they explained, on average, 82.02% of total G×E variability.The full irrigated environment had the highest grain yield production (8.002 ton ha -1 ).
Once again, the reference genotype DW_C had the highest yield and stability across trials; likewise, in the durum wheats (DW, in red) in general, 7 of 11 lines showed higher yields than the bread wheats (BW, in blue).Figure 3 shows that environments were more likely grouped by the similarities in the plots than by the irrigation scheme.In the bottom right quadrant are the plots owned by farmer 1 in the first plot under both irrigation schemes (15F1Fa and 15F1Ra).The plots owned by farmer 2 and farmer 4 under full irrigation (15F2F and 15F4F) formed another very compact group because the angle between their vectors is practically zero, yet the yield in the plot owned by farmer 2 was higher.Finally, in the top right quadrant are the plots "b" of farmer 1 under both irrigation schemes (15F1Fb and 15F1Rb) with similar yield (Fig. 3).
The mean of the wheat lines in each of the six environments in cropping season 2015-2016 and the phenotypic correlation between the loadings of the wheat lines and the grain yield of the lines in each environment and across environments are shown in the Supplementary Material.Except for environment 15F1Fa (r = 0.167), all the other environments had a correlation higher than 0.7 with the loadings of FA(1).As in previous cropping seasons, durum wheat line DW_C had the highest mean yield across all the environments included in this cropping season.

results of the combined Analysis across environments in the different cropping seasons
Table 7 shows the genetic correlations for all 17 environments across the three cropping seasons.Almost all the correlations among environments were positive, except for 13F2F, 15F1Fa, and 15F1Ra, the same as in the individual analyses for the 2013-2014 and 2015-2016 cropping seasons, which were the only ones    oriented to the left in the biplot.Results in Table 8 indicate that across years, FA(1) explained a lower percentage of variance of the total G×E variance for environments 13F2F, 15F1Fa, and 15F1Ra (41.08%, 0.66%, and 0.31%, respectively).Also, environments 13F1Fb and 12F1Rb explained a lower percentage of the total G×E variance than other environments (20.84% and 37.27%, respectively).These environments are the ones that caused the complexity of the G×E when combining all the environments across years.
When fitting the FA(2) model, the two factor components explained, on average across all environments, 55.33% and 26.02% of G×E variability for FA(1) and FA(2), respectively, and on average 81.35% of total G×E variability (Table 8).The most productive environments are in cropping season 2012-2013, with fully irrigated environment 12F2F being the highest one (8.877ton ha -1 ).The biplot in Fig. 4 shows that, similar to the individual analysis biplots for each cropping season, the durum wheats (in red) had higher yields than the bread wheats (in cyan), while the reference durum wheat line DW_C (in bold) had the highest yield and was relatively stable across cropping seasons.The biplot also shows that, in general, the trials belonging to the same cropping season tended to be grouped together, particularly the trials from the 2012-2013 cropping season (in green), and were differentiated from the trials of the other cropping seasons.The above was probably due to the specific weather conditions in each cropping season, since the farmers' plots were mostly the same during the three cropping seasons.Clearly, the environments causing most of the G×E complexity (i.e., 15F1Ra, 15F1Fa, and 13F2F) appeared in the opposite direction of the other environments, indicating they are the ones causing G×E complexity.
The mean of the wheat lines in each of the 17 environments for combined cropping seasons 2012-2013, 2013-2014, 2015-2016, and the phenotypic correlation between the loadings of the wheat lines and the grain yield of the lines in each environment and across environments are shown in the Supplementary Material.The correlations between the FA(1) loadings and the line mean yield for each environment and across environments were all higher than 0.5, except for environments 13F2F (-0.714), 15F1Fa (-0.154), and 15F1Ra (-0.09).Durum wheat line DW_C had the highest mean yield across all the environments included these three cropping seasons.

dIscussIon
The complex G×E phenomenon present in the 3 yr of data analyzed in this study was dissected by means of the FA model.The variability due to G×E can be due to crossover interaction (COI) or to non-COI (change in scale or heterogeneity of variances within environments).The SREG and FA models can assess these two components of the total variability due to G×E.Note that the objective of this article was not to quantify the number of significant or no significant COI versus the number of non-COI in all possible 2 ×2 way tables of environments and lines as done it by Burgueño et al. (2008).The objective of this article was to use the linear mixed model together with FA variance-covariance structure for (i) studying G×E, (ii) for detecting key environments (productive agronomic treatments comprising combinations of farmers, irrigations levels and cropping seasons), and for (iii) identifying stable and highly productive bread and durum wheat lines that explain interactions in the first factor analytic components to infer the presence of COI and/or COI.

General Assessment of crossover Interaction and non-crossover Interaction
Genotype × experiment Interaction Crossa et al. (2002Crossa et al. ( , 2004) ) proposed detecting COI and non-COI interaction by fitting a SREG model with two components; they noted that the ideal lines from the SREG 2 biplot should give a high estimate of the primary effects ( 1 ˆi a ), that is, high mean of the measured trait (e.g., grain yield) and near zero secondary effects ( 2 ˆi a ).The ideal environment will have large 1 ˆj g (discriminate lines) and small 2 ˆj g ; the authors pointed out that these properties will occur if 1 ˆi a is highly correlated with the line mean of the trait being studied.These properties of the SREG biplot will occur if the primary effects of line ( 1 ˆi a ) are highly correlated with line means.Crossa et al. (2002Crossa et al. ( , 2004) also show that if all 1 ˆj g scores have the same sign, then the first component of the SREG 2 biplot represents the G×E variability due to non-COI, whereas the second component indicates the G×E variability due to COI.Note that the first component (non-COI) indicates the proportionality of line response in environments, whereas the second component (COI) quantifies the disproportionality of line response in environments.As pointed out by Burgueño et al. (2008), the matrices of loadings and scores from FA are rotated to obtain a principal component solution that produces biplots similar to those developed from SREG 2 .Under this rotation, the direction and projections of the vectors of FA(2) and SREG 2 are the same; thus the properties of the first component of SREG 2 that accounts for non-COI and the second component of SREG that explains COI variability also hold for FA(2).As already mentioned, SREG is related to FA (for the combined assessment of the main effect of lines + G×E), which represents the correlation among the environments based on their relationship to unobservable latent variables.

Assessing crossover Interaction and non-crossover Interaction for each Year and combined Across Years
From a practical agricultural perspective, clustering environments with non-COI and identifying environments that are causing most of the G×E variability is important for selecting key environments and stable lines.In this study, there were negative or near zero correlations only between a few pairs of environments in cropping seasons 2013-2014 and 2015-2016, and a large number of negative and/or near zero correlations when combining all three cropping seasons.For cropping season 2012-2013, all correlations are positive; however, environment 12F1Rb had relatively low correlations with all the other environments except 12F1Fa.For cropping season 2013-2014, environment 13F2F had negative correlations with the rest of the environments except environment 13F1Fb.For cropping season 2015-2016, environment 15F1Fa had zero (negative) correlation with 15F1Fb and 15F1Rb.Finally, the combined analysis across years indicated environments 15F1Fa, 15F1Ra, and 13F2F had negative correlations with most of the other environments.
In general, environments with negative or near zero correlations with other environments had the smallest loading values for FA(1) and the smallest total percentage of variance of G×E explained by FA(1).For cropping season 2012-2013, environment 12F1Rb had an FA(1) loading of 0.3 that explained only 23.93% of the total G×E; for cropping season 2013-2014, environment 13F2F had an FA(1) loading of -0.249 that explained only 11.57% of the total G×E.For cropping season 2015-2016, environments 15F1Fa and 15F1Ra had FA(1) loadings of 0.108 and 0.34, respectively, explaining 1.38% and 37.93% of the total G×E.Finally, for the combined analysis across years, environments 15F1Fa, 15F1Ra, and 13F2F had FA(1) values of -0.075, -0.029, and -0.339, respectively, with the FA(1) from those environments in cropping season 2015-2016 explaining only 0.31% and 0.66% of the total G×E.
Results show that for cropping seasons 2012-2013 and 2015-2016, all 1 ˆj g are non-negative; thus SREG 1 and FA(1) predict non-COI, whereas for cropping season 2013-2014, environment 13F2F has a negative 1 ˆj g (-0.249).Therefore, COI is predicted by FA(1).Also, when combining all cropping seasons, environments 13F2F, 15F1Fa, and 15F1Ra had negative 1 ˆj g values and thus FA(1) predicts COI.Furthermore, environments with negative 1 ˆj g values and where the variance explained by FA(1) is small are the environments most distant from the rest of the environments in the corresponding biplots and therefore the ones causing most of the G×E interaction.On the other hand, environments where FA(1) explained a large proportion of total G×E are not causing a great deal of G×E variability.Furthermore, the phenotypic correlation between line performance in environments and FA(1) loadings were high most of the time (see Supplementary Material), except for those environments causing COI.These results indicate that the properties of the SREG biplots, and thus the properties of the FA biplots, can be applied.For example, in cropping season 2012-2013, the most stable and high yielding wheat line is DW_C and the ideal environment is 12F3F; in cropping seasons 2013-2014 and 2015-2016, the most stable and high yielding line is   2012-2013, 2013-2014, 2015-2016.also DW_C.However, when combining all the cropping seasons, DW_C is not the most stable line, although it is still high yielding.

concLusIons
Results of this study indicated differences between bread and durum wheat lines across all the environments (cropping seasons).For example, the durum wheats consistently showed higher yields than the bread wheats in both the individual analyses for each cropping season (year) and the combined analysis across all three cropping seasons.Also, the reference durum wheat, DW_C, always had the highest yield and was very stable both in the individual cropping seasons and combined across cropping seasons.
The factor analytic covariance structure of order two (FA2) used in the linear mixed model offers a useful alternative for modeling interactions in multi-environment breeding trials with highly unbalanced data, to dissect and explain the complex interactions that are common in agronomy experiments.The FA for modeling the variance-covariance structure of a line plus the G×E detected in particular environments (farmer type) in each cropping season and across cropping seasons causes G×E of the COI type.For example, farmer 2 under full irrigation in 2013-2014 (13F2F) caused COI as well as farmer 1 under full and reduced irrigation in 2015-2016 (15F1Fa, 15F1Ra) in the combined analysis.Although environments 12F1Rb and 15F1Fa did not cause COI in their respective years, they were the environments causing most of the non-COI G×E.
In summary, in this study we show that on-farm trials established for testing bread and durum wheat breeding lines prior to release provide valuable information for farmers, agronomists and breeders in terms of their performance in farmers' fields.Appropriate statistical models dissect G×E in a way that helps research identify the most unstable environments causing most of the interaction.Linear mixed models that use FA variance-covariance provide a valuable tool for dissecting COI and non-COI G×E.
m l is the singular value of the mth multiplicative component, which is ordered 1 ; im a corresponds to the left singular vector of the mth component and represents genotypic sensitivities to hypothetical environmental factors represented by the right singular vector of the mth component, jm g .The im a and jm g satisfy the ortho-normalization constraints 0 fixed effect linear-bilinear model is called the Additive Main Effects and Multiplicative Interaction (AMMI) model.

Z
are the design matrices of the random effects of replicates and genotypes within the jth environment, respectively.The variance-covariance matrices R and E are R I r and I rg are the identity matrices of orders r and r × g, respectively, the e × e replicate and error variance-covariance matrices among pairs of e environments, respectively; 2 j r s , 2 e j s are the replicate and residual variances within the jth environment, respectively, and ⊗ is the Kronecker (or direct) product of the two matrices.The structure of E assumes that the residuals of the field plots at each site (i.e., elements of vector e) are not spatially correlated,

Table 1 .
Number of bread wheat (BW) and durum wheat (DW) lines evaluated in the different trials in each of three cropping seasons

Table 2 .
Combined analyses of variance across trials (farmers and irrigation regimes) in each cropping season, using a mixed model, showing the genotype × environment (farmers' plots) interaction.Contrasts from combined analyses of bread wheats (BW) versus durum wheats (DW).Num DF, numerator degrees of freedom; Den DF, denominator degrees of freedom; Pr > F = p-value.

Table 3 .
Genetic correlations among the trials in each cropping season.

Table 4 .
The BLUPs, environment loadings and percent of variance explained by each factor (FA1 and FA2) in each environment for cropping season 2012-2013.

Table 5 .
The BLUPs, environment loadings and percent of variance explained by each factor (FA1 and FA2) in each environment for cropping season 2013-2014.

Table 6 .
The BLUPs, environment loadings and percent of variance explained by each factor (FA1 and FA2) in each environment for cropping season 2015-2016.

Table 7 .
Genetic correlations among the trials' combined analysis across all cropping seasons.