Multivariate Analyses of Phenotypic Diversity of Bread Wheat (Triticum aestivum L.) in the Highlands of Northern Ethiopia

An assessment of genetic variation within diverse germplasm is needed to allow more efficient genetic improvement. Forty-nine bread wheat genotypes were evaluated for 11 traits in a simple lattice design at two locations to determine the extent of genetic diversity among the genotypes for grain yield and other agronomic traits. Mean squares of the traits studied showed statistically significant differences among the genotypes (P<0.01), indicating the presence of adequate variability. In the principal component analysis, five principal component axes (PCAs) explained 80.4% of the total variability residing in the bread wheat genotypes. The first principal component, followed by the second, had the largest variance and consequently explained much of that variability. The traits that contributed most to these two PCAs (plant height, grain yield, number of productive tillers, days to heading, spike length and number of spikelets per spike) are the important traits in differentiating the genotypes. Average linkage cluster analysis classified the 49 genotypes into six clusters. The highest inter-cluster distance was between clusters I and III (D2=25.79**), followed by clusters II and IV (D2=22.82) and clusters II and III (D2=22.75), indicating wider genetic diversity among these clusters. Thus, a future crossing program between members of cluster I and cluster III, and between members of cluster II and clusters III and IV, could possibly result in heterosis in the F1 and a great deal of variability in the F2 generation.


Introduction
Wheat, one of the globally important cereal crops, is cultivated on approximately 220 million ha in the world and 9.9 million ha in Africa [1]. In Ethiopia, both bread and durum wheat are produced by 4.6 million people on 1.66 million ha of land. Wheat is the fourth most important crop in area coverage after tef, maize and sorghum, and the third in productivity after maize and rice [2].
Leading wheat producers such as Germany and France achieve average yields of 7.4 and 7.2 t/ha, respectively [3], while the world average is about 3.25 t/ha [1]. In Ethiopia, however, the average yield on production fields is about 2.54 t/ha [2], much lower than the yield of about 5 t/ha obtained on experimental fields [4]. Thus, there is a need to further improve the yield of bread wheat.
Genetic diversity is indispensable for species adaptation to variable environmental conditions and valuable for germplasm collection and conservation; information on diversity also helps in the study of heterosis [5][6][7]. Plant breeding depends on the correct combination of specific alleles at the genetic loci present in a plant's genome [8]. In order to access the alleles responsible for traits of agronomic importance, an assessment of genetic variation within diverse germplasm is needed to allow more efficient genetic improvement [9].
Therefore, this study was conducted to assess the extent of genetic diversity among bread wheat genotypes and to identify the traits most important in distinguishing the genotypes.

Description of the study area
The experiment was conducted in 2015 at the Jamma and Geregera research fields of Sirinka Agricultural Research Center. Jamma lies between 10°23' and 10°27' N latitude and between 39°7' and 39°24' E longitude, at an altitude of 2600 m.a.s.l. The dominant soil type is Vertisol with a pH of 6.0, and total rainfall is 720.5 mm. Geregera is located at 11°46' N latitude and 38°45' E longitude, at an altitude of 2650 m.a.s.l., with annual rainfall of 1105 mm. The soil type is characterized as Lithosol and has a pH of 5.6.

Planting materials
The planting materials comprised forty-nine bread wheat genotypes, consisting of released varieties and elite materials obtained from Sinana and Kulumsa Agricultural Research Centers (Table 1).

Experimental design and trial management
The treatments were arranged in a 7 × 7 simple lattice design replicated twice. The individual plot area was 1.2 m × 2.5 m (3 m²) with six rows for each genotype. The spacing was 0.4 m between plots and 0.2 m between rows. A 1.5 m space was left between blocks. Planting was done at a seed rate of 150 kg/ha. Both diammonium phosphate (DAP) and urea fertilizers were applied at the rate of 100 kg/ha. DAP was applied at planting, while 1/3 of the urea was applied at planting and 2/3 at the mid-tillering stage.
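As a rough check on the field layout, the per-hectare rates above can be converted to per-plot quantities. The sketch below assumes that the rates apply to the 3 m² plot area and that DAP and urea were each applied at 100 kg/ha; the function name is illustrative and not part of the original protocol.

```python
# Convert per-hectare application rates to per-plot amounts.
# Assumption: rates apply to the plot area of 1.2 m x 2.5 m.

M2_PER_HA = 10_000  # square metres in one hectare


def per_plot_grams(rate_kg_per_ha: float, plot_area_m2: float) -> float:
    """Return the amount in grams needed for one plot."""
    return rate_kg_per_ha * 1000 * plot_area_m2 / M2_PER_HA


plot_area = 1.2 * 2.5                      # 3 m^2
seed_g = per_plot_grams(150, plot_area)    # seed per plot
dap_g = per_plot_grams(100, plot_area)     # DAP, all at planting
urea_g = per_plot_grams(100, plot_area)    # urea, split into two doses
urea_planting_g = urea_g / 3               # 1/3 at planting
urea_tillering_g = urea_g * 2 / 3          # 2/3 at mid-tillering

print(f"seed: {seed_g:.0f} g, DAP: {dap_g:.0f} g, "
      f"urea: {urea_planting_g:.0f} g + {urea_tillering_g:.0f} g")
```

Under these assumptions each plot receives about 45 g of seed, 30 g of DAP, and 30 g of urea split into roughly 10 g and 20 g.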

Data collection and statistical analyses
Eleven quantitative traits were recorded. Ten plants were randomly chosen from the central four rows for recording data on plant height (cm), number of productive tillers per plant, spike length (cm), number of spikelets per spike and number of grains per spike. Data for days to heading, days to maturity, thousand-seed weight (g), biological yield (kg/m²), grain yield (qt/ha) and harvest index (%) were collected on a plot basis from the central four rows.
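The yield traits are recorded in different units (grain yield in qt/ha, biological yield in kg/m²), so a conversion is needed before they can be combined. The sketch below assumes 1 quintal = 100 kg and the conventional definition of harvest index as grain yield divided by biological yield; the paper does not state its formula, and the biological yield value used here is hypothetical.

```python
KG_PER_QUINTAL = 100
M2_PER_HA = 10_000


def qt_ha_to_kg_m2(yield_qt_ha: float) -> float:
    """Convert a yield from quintals per hectare to kg per square metre."""
    return yield_qt_ha * KG_PER_QUINTAL / M2_PER_HA


def harvest_index(grain_kg_m2: float, biological_kg_m2: float) -> float:
    """Harvest index (%) under the conventional definition (assumed here)."""
    return 100 * grain_kg_m2 / biological_kg_m2


grain = qt_ha_to_kg_m2(34.6)   # 34.6 qt/ha is the mean grain yield in the paper
biological = 1.0               # hypothetical biological yield, kg/m^2
print(f"grain yield: {grain:.3f} kg/m2, "
      f"harvest index: {harvest_index(grain, biological):.1f}%")
```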
The data collected for each quantitative trait were subjected to analysis of variance (ANOVA) for simple lattice design using the PROC LATTICE and PROC GLM procedures of SAS version 9.2 [10]. Homogeneity of error variances for each character was tested as per Hartley [11] before combining the data over the two locations. Principal component analysis was performed on the correlation matrix using PAST 1.93 [12] to evaluate the contribution of each quantitative character to the total variation among genotypes. Factors (principal component axes) with eigenvalues >1.0 were retained. Mahalanobis [13] D2 statistics were used to estimate the genetic divergence among the bread wheat genotypes. The D2 statistic is defined as $D^2_{ij} = (X_i - X_j)^{\top} S^{-1} (X_i - X_j)$, where $D^2_{ij}$ is the squared distance between any two genotypes i and j; $X_i$ and $X_j$ are the vectors of trait values for the ith and jth genotypes; and $S^{-1}$ is the inverse of the pooled variance-covariance matrix within groups.
The significance of the squared-distance value for each pair of clusters was tested by treating it as a calculated χ² (chi-square) value and comparing it with the tabulated χ² values at p-1 degrees of freedom at the 1% and 5% probability levels, where p is the number of traits used for clustering the genotypes.
PROC CLUSTER of SAS 9.2, with the average linkage method of clustering, was used to group the genotypes into clusters and to construct the dendrogram. The cubic clustering criterion (CCC), pseudo F and pseudo t2 statistics were used in determining the number of clusters.
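The squared distance defined above, D2 = (Xi - Xj)' S^-1 (Xi - Xj), can be sketched in a few lines. This is a minimal illustration using only the standard library, not the SAS routine used in the study; the two-trait vectors and covariance matrix are hypothetical, and the linear system is solved directly rather than forming S^-1.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_d2(xi, xj, s):
    """Squared Mahalanobis distance between trait vectors xi and xj."""
    d = [a - b for a, b in zip(xi, xj)]
    y = solve(s, d)  # y = S^-1 d
    return sum(di * yi for di, yi in zip(d, y))


# Hypothetical two-trait example (not data from the study).
pooled_cov = [[4.0, 0.0], [0.0, 1.0]]
print(mahalanobis_d2([10.0, 5.0], [8.0, 4.0], pooled_cov))  # 2.0
```

With an identity covariance matrix the statistic reduces to the squared Euclidean distance; in practice a linear-algebra library would replace the hand-written solver.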
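The idea behind average linkage clustering can be illustrated with a small agglomerative routine. This is a simplified sketch and not PROC CLUSTER's implementation: the number of clusters is passed in directly instead of being chosen by CCC, pseudo F and pseudo t2, and the distance matrix is hypothetical.

```python
def average_linkage(dist, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    dist is a symmetric matrix of pairwise distances (for example D2 values).
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def between(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        pairs = [(between(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        _, p, q = min(pairs)
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical distances among four genotypes.
example = [
    [0.0, 1.0, 8.0, 9.0],
    [1.0, 0.0, 7.0, 8.0],
    [8.0, 7.0, 0.0, 1.5],
    [9.0, 8.0, 1.5, 0.0],
]
print(average_linkage(example, 2))  # [[0, 1], [2, 3]]
```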

Analysis of variance
In this study, the randomized complete block design (RCBD) was more efficient than the simple lattice design for most characters, and the sums of squares for blocks within replications were non-significant. As a result, analysis of variance was done using the RCBD model. The error variances were homogeneous for all characters investigated. Consequently, ANOVA was run on the combined data over the two locations.
The location × genotype interaction was significant for all traits except number of spikelets per spike, signifying differential response of the bread wheat genotypes across locations. In addition, significant (P<0.05) treatment differences were observed for all the characters studied, indicating the existence of genetic variability among the genotypes (data not shown). These results are in conformity with earlier findings in bread wheat [14][15][16].
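The homogeneity check that precedes the combined analysis can be sketched as Hartley's F-max ratio: the largest error mean square divided by the smallest, compared with a tabulated critical value. The error mean squares and the critical value below are hypothetical placeholders; the real critical value must be read from Hartley's table for the number of locations and the error degrees of freedom.

```python
def hartley_fmax(error_mean_squares):
    """Ratio of the largest to the smallest error mean square."""
    return max(error_mean_squares) / min(error_mean_squares)


def variances_homogeneous(error_mean_squares, critical_value):
    """True when F-max does not exceed the tabulated critical value."""
    return hartley_fmax(error_mean_squares) <= critical_value


# Hypothetical error mean squares for one trait at two locations.
ems = [4.2, 6.3]
placeholder_critical = 3.0  # placeholder only, not a tabulated value
print(round(hartley_fmax(ems), 2),
      variances_homogeneous(ems, placeholder_critical))
```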

Performance of the bread wheat genotypes
The success of a breeding program depends largely upon the amount of genetic variability present in the population and the extent to which the desired traits are heritable. Based on the combined ANOVA over locations, the genotypes showed a wide range of variation for the studied traits (Table 2). Days to maturity ranged from 124 to 136 with a mean of 127.6 days, plant height from 68 to 93.75 cm with a mean of 78.3 cm, number of grains per spike from 29.45-7 with a mean of 37.8, and thousand-seed weight from 34.8 to 48 g with a mean of 40.8 g. Grain yield ranged from 26.5 to 43.8 qt/ha with a mean of 34.6 qt/ha. The sizable ranges of values for these traits indicate a good opportunity for bread wheat improvement. A similarly wide range of variation among bread wheat genotypes in yield and yield-related traits was reported by Kumar et al. [14] and Tesfaye et al. [15].

Principal components analysis
The main advantages of principal component analysis are that it extracts the most important information from the data table, compresses the size of the data set by keeping only this important information [17], and reduces the number of dimensions without much loss of information [18].
Out of the 11 principal component axes (PCAs) extracted, five with eigenvalues greater than one were retained. The first principal component should have the largest possible variance and therefore explain the largest part of the variability [17]. Accordingly, the first PCA explained 25.48% of the variation, the highest of all the PCAs. Together, the five PCAs explained 80.4% of the total variability residing in the bread wheat genotypes.
The factor loadings are the correlations between principal components and variables. A high correlation between a PCA and a variable indicates that the variable is associated with the direction of the maximum amount of variation in the data set [17]. As shown in Table 3, plant height and grain yield contributed most to the first PCA, whereas number of productive tillers, days to heading, spike length and number of spikelets per spike contributed more to the second PCA. Likewise, days to maturity and thousand-seed weight were the major contributing characters in the third PCA, biological yield and harvest index in the fourth PCA, and number of grains per spike in the fifth PCA. Similar findings of grouping bread wheat genotypes by principal component analysis have been reported [19][20][21].
The PCA plot (Figure 1) also shows that genotypes in the top part of the plot had high values of days to heading, spike length and number of spikelets per spike, and that those on the right side of the plane had higher values of grain yield and plant height. The bread wheat genotypes at the top right of the plane are therefore expected to be late-heading, tall plants with a higher number of spikelets and high yield. These genotypes could be used to develop superior varieties with desired traits.
The fact that 80.4% of the variance is explained by five PCAs reveals the presence of wide variability among the genotypes for the characters studied, and it suggests ample opportunities for genetic improvement of bread wheat through direct selection and through use of the genotypes in future hybridization programs.
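The variance shares reported above can be related back to eigenvalues. Because the analysis used the correlation matrix, each of the 11 standardized traits contributes one unit of variance, so the eigenvalues sum to 11 and a component's share is its eigenvalue divided by 11. The sketch below shows this relationship together with the eigenvalue-greater-than-one retention rule; the eigenvalue list is hypothetical and only illustrates the rule.

```python
N_TRAITS = 11  # total variance of a correlation matrix equals the trait count


def eigenvalue_from_percent(percent, n_traits=N_TRAITS):
    """Eigenvalue (or eigenvalue sum) implied by a percentage of variance."""
    return percent / 100 * n_traits


def retained(eigenvalues, threshold=1.0):
    """Indices of components kept under the eigenvalue > 1 rule."""
    return [k for k, ev in enumerate(eigenvalues) if ev > threshold]


pc1_eigenvalue = eigenvalue_from_percent(25.48)    # about 2.80
first_five_total = eigenvalue_from_percent(80.4)   # about 8.84
print(round(pc1_eigenvalue, 2), round(first_five_total, 2))

# Hypothetical eigenvalues, used only to demonstrate the retention rule.
hypothetical = [2.8, 2.1, 1.6, 1.3, 1.05, 0.8, 0.5]
print(retained(hypothetical))  # [0, 1, 2, 3, 4]
```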

Genetic divergence and cluster analyses
Divergence analysis was performed using the Mahalanobis [13] D2 distance to classify the diverse genotypes for hybridization purposes (Tables 4 and 5). Genetic improvement through hybridization and selection depends on the extent of genetic distance between parents. The chi-square test was used to declare the significance of distance values using p-1 degrees of freedom, where p is the number of characters used in the study [22].
Table 5: Distribution and grouping of 49 bread wheat genotypes into different diversity classes based on D2 analysis.
Inter-cluster distance values (D2) between the six clusters are presented in Table 4. The highest inter-cluster distance was between clusters I and III (D2=25.79**), followed by clusters II and IV (D2=22.82*), clusters II and III (D2=22.75*) and clusters II and VI (D2=19.78*), indicating wider genetic divergence among these clusters. Thus, crossing members of cluster I with members of cluster III, and members of cluster II with members of clusters III, IV and VI, may produce a high amount of heterotic expression in the F1s and a broad spectrum of variability in segregating (F2) populations. Genetic divergence in bread wheat genotypes was also reported by earlier workers [19][20][21][23].
The dendrogram obtained from the average linkage cluster analysis grouped the 49 genotypes into six clusters at about the 47% similarity level, based on D2 values computed from pooled mean trait data (Table 5 and Figure 2), which makes them moderately divergent. Related findings were reported by earlier workers [19][20][21]. Under average linkage, the distance between two clusters is the average distance between all objects in one cluster and all objects in the other, so that individuals within any cluster are more closely related than individuals in different clusters.
Figure 2: Tree diagram of genetic relationships among 49 bread wheat genotypes constructed by using 11 traits.
The genotypes were grouped such that cluster I was the largest, with 27 (55%) genotypes, followed by cluster III with 15 (30.6%) and cluster IV with 3 (6.12%) genotypes. In contrast, clusters V and VI were the smallest, consisting of one (2.04%) genotype each.
The mean values of the 11 quantitative traits for each cluster are presented in Table 6. Cluster I was characterized by early heading, high grain yield per hectare and a moderately high harvest index. Cluster II was characterized by early maturity, tall plants, a high number of spikelets per spike, high grain yield and a relatively low harvest index compared with the other clusters. Cluster III showed earliness in days to heading and days to maturity as well as moderate grain yield. Cluster IV was characterized by relatively low biological yield and grain yield per hectare and moderate values for the other characters studied. Cluster V was characterized by late heading and maturity and by high values of number of grains per spike, thousand-seed weight and harvest index; on the other hand, it had low values of spike length and biological yield. Cluster VI was characterized by late heading, short plants, high thousand-seed weight, and high harvest index and grain yield per hectare.
As presented in Table 6, the lowest and highest cluster means, respectively, were recorded in clusters VI and II for plant height, clusters VI and II for number of spikelets per spike, clusters VI and V for number of grains per spike, clusters IV and VI for thousand-seed weight, clusters V and II for biological yield, and clusters II and VI for harvest index. In addition, the highest grain yields were obtained from clusters I, II and VI, whereas comparatively low grain yields were obtained from clusters III, IV and V.
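The significance marks on these distances can be checked against the chi-square distribution with p-1 = 10 degrees of freedom (11 traits). The sketch below uses the closed-form upper-tail probability that holds for even degrees of freedom, so it needs only the standard library; it is an illustrative check, not the procedure run in SAS, and the function names are assumptions.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def significance(d2, n_traits=11):
    """Return '**' (1%), '*' (5%) or 'ns' for a D2 value at n_traits - 1 df."""
    p = chi2_sf_even_df(d2, n_traits - 1)
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


for pair, d2 in [("I-III", 25.79), ("II-IV", 22.82),
                 ("II-III", 22.75), ("II-VI", 19.78)]:
    print(pair, d2, significance(d2))
```

The result agrees with the asterisks in the text: only the distance between clusters I and III exceeds the 1% threshold, while the other three fall between the 5% and 1% thresholds.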

Summary and Conclusion
Overall variability within a crop is due to heritable and non-heritable components. In the present study, 49 bread wheat genotypes were evaluated at Jamma and Geregera to study their genetic diversity. The analysis of variance revealed significant differences (P<0.01 and P<0.05) among the genotypes for all traits except grain filling period at Geregera, indicating the existence of variation among the tested genotypes. In addition, the ranges of mean values revealed that the bread wheat genotypes possessed a great deal of genetic variability, which signifies the possibility of improvement through selection.
The principal component analysis revealed that five PCAs, with eigenvalues greater than unity, explained 80.4% of the total variability. The first principal component, followed by the second, had the largest variance and consequently explained much of the variability in the bread wheat genotypes. The traits that contributed most to these two PCAs (plant height, grain yield, number of productive tillers, days to heading, spike length and number of spikelets per spike) are the important traits in differentiating the genotypes.
The highest inter-cluster distance was between clusters I and III (D2=25.79**), followed by clusters II and IV (D2=22.82) and clusters II and III (D2=22.75), indicating wider genetic diversity among these clusters. Therefore, initiating a crossing program between members of cluster I and members of cluster III, and between members of cluster II and members of clusters III and IV, may produce a high amount of heterotic expression in the F1s and a broad spectrum of variability in segregating (F2) populations.