Principal Component Analysis With Quantitative Traits in Extant Cotton Varieties (Gossypium hirsutum L.) and Parental Lines for Diversity

The experimental material consisted of 101 extant varieties and parental lines characterized for morphological traits under Distinctiveness, Uniformity and Stability (DUS) testing at CICR, Regional Station, Coimbatore, India. Twenty-one quantitative traits were recorded. The data were used to estimate the variation and relationships among the extant varieties and to identify the best-performing genotypes. Analysis of variance for the quantitative traits showed a sizable amount of variability among the diverse lines. The highest variation was found for vigour index, plant height, germination per cent, fibre maturity, yield per plant, plant stand, fibre uniformity and ginning per cent when the mean performance of the genotypes was considered. Seed cotton yield showed a significant positive correlation with boll number per plant (0.95), boll weight (0.53), lint weight (0.50), fibre length (0.27), plant growth habit (0.26), plant height (0.23) and seed index (0.21). Principal component analysis showed the extent of variation captured by components 1 to 8, each of which had an eigenvalue >1. Cluster analysis based on the morphological traits sorted the 101 extant varieties of cotton into three main clusters. The dendrogram obtained by hierarchical clustering grouped genotypes by their morphological traits rather than by geography of origin.
Article History Received: 07 May 2017 Accepted: 28 January 2018


Introduction
In India, enactment of the PPVFR Act, 2001 1 not only facilitated the protection of plant varieties and breeders' rights but also ensured the availability of high-quality seeds and planting materials to Indian farmers. It encourages plant breeders and stimulates investment in research and development, in both the public and private sectors, for the development of new plant varieties.
A successful breeding program needs complete knowledge and understanding of the genetic diversity available within its resources. This enables breeders to choose parents for generating diverse populations and for selection. Morphological traits are the main criteria considered for assessing genetic diversity and identifying genetic distances among upland cotton varieties. Such an assessment sets a path for the development of superior parents and of hybrid combinations with heterotic potential.
Cultivated cotton genotypes have been reported to have a narrow genetic base 2,3. To broaden the genetic base through a hybrid breeding program, studies on the genetic divergence within the genetic stock are a prerequisite. Genetic variation within and between cotton genotypes for morphological and fibre quality traits has been studied with a view to their improvement 4. Researchers have employed principal component analysis (PCA) and linkage cluster analysis to find similarities among genotypes for these traits and to place the genotypes into different clusters 5,6.
Characterisation and documentation of the qualitative and quantitative traits of extant cotton varieties and parental lines are a mandatory requirement of the Distinctiveness, Uniformity and Stability (DUS) testing programme towards the implementation of the PPV & FR Act, 2001 1. As per this act, a new plant variety is considered for protection (IP protection) if it is clearly distinguishable by at least one essential characteristic from other varieties whose existence is a matter of common knowledge in any country at the time of filing the application, if its characteristic features are sufficiently uniform, subject to the variation expected from propagation, and if its essential characteristics remain unchanged after subsequent propagation. The extant varieties were used as reference varieties (check varieties) in the DUS testing of new plant varieties. Grouping extant varieties by their qualitative and quantitative traits is essential, as it supports the selection of reference varieties for DUS testing of a new plant variety. With this in mind, the DUS data, along with ancillary data generated under this programme, were subjected to principal component analysis and cluster analysis to exploit the diversity available among the varieties and parental lines. The information generated in this study would be useful for DUS testing of new plant varieties and also for inter-varietal breeding.

Materials and Methods
The extant varieties of cotton were characterised for morphological traits under DUS testing as reference varieties at the Central Institute for Cotton Research, Regional Station, Coimbatore, during 2012-14. The experimental material comprised 101 extant cotton varieties (Table 4) for use as reference varieties under the DUS programme, received from varied agro-ecological cotton-growing zones of India. The varieties were sown in a Randomised Complete Block Design with three replications. Each variety was sown in 10 rows with a spacing of 90 cm between rows and 60 cm within rows. Recommended agronomic, cultural and plant protection practices for tetraploid cotton were adopted until harvest.
The genotypes were evaluated for plant height (cm), measured on ten individual plants on the 140th day after sowing, and for growth habit (cm), recorded as the length of the longest sympodium in ten randomly selected plants; the averages were taken for analysis. The number of sympodia and the number of bolls per plant were counted in ten randomly selected plants of each variety and averaged. Plant stand was assessed in each plot on the 140th day. Ten individual bolls harvested from each plot were weighed (g) and the mean values were taken for statistical analysis. The harvested bolls were ginned in the laboratory, after which lint weight and seed weight per boll (g), seed index and ginning per cent were recorded. Seed cotton yield per plant was obtained by multiplying the number of bolls per plant by boll weight.
The harvested seeds were subjected to a standard germination test by sowing 100 seeds in four replications in sand medium, and the mean germination per cent was calculated 7. To calculate the speed of germination, one hundred seeds were germinated in sterilized sand medium in four replications, and the number of germinated seeds was counted daily up to the final counting day. The speed of germination was derived from the mean per cent germination on each counting day and expressed as a number 8. Ten normal seedlings were taken at random from the standard germination test to measure the root and shoot length of individual seedlings. Root length was measured from the collar region to the tip of the primary root, and shoot length from the collar region to the tip of the plumule; the mean of each was expressed in cm. Vigour index (VI) 9 was computed by multiplying seedling length by germination per cent. Dry matter of seedlings (DMS) was estimated using ten normal seedlings dried in a hot-air oven (85 °C for 24 h) and cooled in a desiccator.
The fibre parameters, namely fibre length (FL), fibre strength (FS), fibre fineness (FF), fibre uniformity (FU) and fibre maturity (FM), were measured in a PRIMIER ART2 fully automated cotton testing instrument (ICC mode). Standard characteristic states were used to measure the different morphological traits of cotton at the appropriate growth stages.
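The two derived traits described above, seed cotton yield per plant and vigour index, can be sketched in Python as follows. This is only an illustration: it assumes that seedling length is the sum of root and shoot length, and the function names and input values are hypothetical rather than taken from the study.

```python
def seed_cotton_yield_per_plant(bolls_per_plant, boll_weight_g):
    """Seed cotton yield per plant (g) = number of bolls per plant x boll weight (g)."""
    return bolls_per_plant * boll_weight_g


def vigour_index(germination_percent, root_length_cm, shoot_length_cm):
    """Vigour index = germination per cent x seedling length (cm).

    Assumption: seedling length is taken as root length + shoot length.
    """
    seedling_length_cm = root_length_cm + shoot_length_cm
    return germination_percent * seedling_length_cm


# Hypothetical genotype means, for illustration only.
example_yield = seed_cotton_yield_per_plant(bolls_per_plant=20, boll_weight_g=4.5)
example_vigour = vigour_index(germination_percent=80, root_length_cm=12.0, shoot_length_cm=10.0)

print(f"Seed cotton yield per plant: {example_yield} g")
print(f"Vigour index: {example_vigour}")
```

Because yield per plant is computed directly from boll number and boll weight, a strong correlation between yield and those two traits is expected by construction.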
The mean value of each trait in every genotype was computed for the analysis of variance. Pearson's linear correlation coefficient was computed for the 21 quantitative traits 10,11,12 (Table 1), and a correlation matrix was constructed to compare the traits. Principal component analysis based on the 21 quantitative traits was carried out to determine the relative importance of the different traits in capturing the genetic variation. Standardized values were used to perform PCA in PAST3 software 13. The score plot was used to assess the components or factors that could explain the major variability in the data. The factors corresponding to the 21 PCs were subjected to cluster analysis based on Euclidean distances and Ward's minimum variance method, using agglomerative hierarchical clustering in XLSTAT software version 2016.05. A hierarchical cluster analysis was performed on the pooled data using the scores of the dissimilarity matrix 14.
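As a minimal sketch of the first two steps, the Python below standardizes trait means and builds a Pearson correlation matrix. It is not the PAST3 or XLSTAT procedure; the use of the sample standard deviation (n - 1) is an assumption, and the trait names and values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    # Sample standard deviation (divisor n - 1); an assumption of this sketch.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def standardize(values):
    """Convert raw trait means to z-scores (mean 0, standard deviation 1)."""
    m, sd = mean(values), sample_sd(values)
    return [(v - m) / sd for v in values]


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length series."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def correlation_matrix(traits):
    """Pairwise Pearson correlations for a dict of trait name -> genotype means."""
    names = list(traits)
    return {a: {b: pearson_r(traits[a], traits[b]) for b in names} for a in names}


# Hypothetical genotype means for three traits across five genotypes.
toy_traits = {
    "bolls_per_plant": [18.0, 22.0, 25.0, 30.0, 35.0],
    "boll_weight_g": [3.8, 4.1, 4.0, 4.6, 4.9],
    "fibre_strength": [24.0, 23.5, 22.0, 21.5, 20.0],
}
matrix = correlation_matrix(toy_traits)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

PCA on standardized values is equivalent to an eigen-decomposition of this correlation matrix, which is why each trait contributes one unit of variance to the total.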

Results and Discussion
Twenty-one quantitative traits were observed for the 101 extant varieties of cotton under field and laboratory conditions. Analysis of variance revealed a considerable level of variability among accessions for the majority of the traits. Basic descriptive statistics for the 21 characters are presented in Table 1. The highest variation was found for vigour index, plant height, germination per cent, fibre maturity, yield per plant, plant stand, fibre uniformity and ginning per cent. Relatively low variation was noticed for dry matter of seedlings, boll weight, fibre fineness and seed index. The variability observed among the extant cotton genotypes can probably be attributed to inherent genetic differences and to the environment in which they were grown. The correlation coefficients of the characters contributing to seed cotton yield per plant were estimated, and the results are presented in Table 2. Seed cotton yield showed a significant positive correlation with boll number per plant (0.95), boll weight (0.53), lint weight (0.50), fibre length (0.27), plant growth habit (0.26), plant height (0.23) and seed index (0.21). This result indicates that an increase in seed cotton yield might be due to an increase in one or more of these traits. Associations of seed cotton yield with boll weight 15, with number of bolls and boll weight 16,17,18, and with number of bolls 19 were reported in earlier studies.
The close relationship between yield and yield-attributing traits can be exploited in selection programmes, which might help in developing high-yielding genotypes. Among the fibre quality traits, fibre length alone showed a positive association with seed cotton yield; a similar trend has been reported 20,21. Fibre strength and uniformity ratio exhibited a negative association with seed cotton yield, in accordance with Erande et al. 21.
Regarding inter-correlations, germination per cent had a significant positive correlation with vigour index, speed of germination, shoot length and lint weight. Speed of germination was positively correlated with shoot length, vigour index, dry matter of seedlings, lint weight and fibre length. Plant stand had a significant inter-correlation with boll weight and lint weight. Root length and shoot length each showed a significant positive correlation with vigour index. Shoot length and vigour index showed a significant positive association with dry matter of seedlings, fibre length and plant growth habit. Dry matter of seedlings had a significant inter-correlation with fibre length, lint weight, seed index and boll weight. Plant height had a significant positive inter-correlation with plant growth habit, number of sympodia per plant, fibre maturity and boll number per plant. Plant growth habit had a positive inter-correlation with number of sympodia per plant, boll number per plant and fibre maturity. Boll number per plant had a significant positive inter-correlation with boll weight and lint weight. Boll weight had a positive inter-correlation with lint weight, seed index and fibre length. Further inter-correlations were observed for number of sympodia with fibre maturity; lint weight with seed index, ginning per cent and fibre length; seed index with fibre length and fibre strength; fibre strength with fibre uniformity; fibre fineness with fibre uniformity and fibre maturity; and fibre uniformity with fibre maturity. These results indicate that selection for any one of these traits should lead to concurrent improvement of the other traits as well as of seed cotton yield. The significant positive association of fibre length with boll weight, lint weight and seed index indicates that these important yield-contributing traits are good indicators of fibre length improvement. The other fibre quality traits showed a negative association with yield and yield-attributing traits. Linkage has been reported to be the primary cause of the negative correlation between yield and fibre quality traits, and inter-mating may be recommended to break this association 22.
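To illustrate why even the smaller coefficients reported for seed cotton yield can be statistically significant, the sketch below applies the usual t-test for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²). It assumes the correlations were computed on the 101 genotype means (n = 101) and uses a two-sided 5% critical value of about 1.984 for 99 degrees of freedom from standard t tables; the source does not state the test that was actually used.

```python
import math


def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, given Pearson's r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


N_GENOTYPES = 101          # assumption: one mean value per genotype
T_CRITICAL_5PCT = 1.984    # two-sided, 99 degrees of freedom (standard t tables)

# Correlations of seed cotton yield with other traits, as reported in the text.
reported_r = {
    "boll number per plant": 0.95,
    "boll weight": 0.53,
    "lint weight": 0.50,
    "fibre length": 0.27,
    "plant growth habit": 0.26,
    "plant height": 0.23,
    "seed index": 0.21,
}

t_values = {trait: t_statistic(r, N_GENOTYPES) for trait, r in reported_r.items()}
for trait, t in t_values.items():
    verdict = "significant" if abs(t) > T_CRITICAL_5PCT else "not significant"
    print(f"{trait}: r = {reported_r[trait]:.2f}, t = {t:.2f} ({verdict} at 5%)")
```

Under these assumptions the smallest coefficient, 0.21 for seed index, gives t of roughly 2.14, which is just above the critical value.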

Principal Components
Principal component analysis is a multivariate statistical technique used to extract the important information from a data table and to simplify the description of the data set. To discern patterns of variation, PCA was performed on all variables simultaneously. The eigenvalues, variability (%) and cumulative variability (%) are presented in Table 3. Out of 21 principal components, eight had an eigenvalue >1. Together these accounted for 83.11% of the variation among the extant cotton varieties. Principal component 1 contributed 21.99% of the total variability. The variation on principal component 1 was mainly attributed to lint weight, shoot length, fibre length, boll weight and dry matter of seedlings. These results are in accordance with an earlier report 23. PC2 contributed 15.62% of the total variability and was explained mainly by plant height, followed by growth habit, fibre maturity, number of sympodia per plant and fibre fineness. PC3 contributed 12.71% of the total variability and was mainly attributed to vigour index, shoot length and speed of germination. Principal component 4 contributed 9.97% of the total variability and was mainly ascribed to ginning per cent and fibre fineness, with positive loadings, and to plant height and number of sympodia per plant, with negative loadings. PC5 contributed 6.40% of the total variability and primarily illustrated the divergence in plant stand and boll weight, with positive loadings, and in boll number per plant, with a negative loading. The first three PCs exhibited high variation for the traits under study, and therefore good cotton improvement may be accomplished through inter-varietal development.
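The link between the reported percentages and the eigenvalue >1 criterion can be checked with a short calculation. When PCA is run on standardized values, the total variance equals the number of traits (21 here), so an eigenvalue can be recovered as percentage / 100 × 21. The sketch below is an illustration under that assumption and uses only the percentages quoted in the text.

```python
N_TRAITS = 21  # standardized traits, so total variance = 21

# Percentage of total variability reported in the text for the first five PCs.
reported_percent = {"PC1": 21.99, "PC2": 15.62, "PC3": 12.71, "PC4": 9.97, "PC5": 6.40}
CUMULATIVE_PC1_TO_PC8 = 83.11  # reported for the eight retained components


def eigenvalue_from_percent(percent, n_traits=N_TRAITS):
    """Eigenvalue implied by a variance percentage in a correlation-matrix PCA."""
    return percent / 100 * n_traits


eigenvalues = {pc: eigenvalue_from_percent(p) for pc, p in reported_percent.items()}
for pc, value in eigenvalues.items():
    print(f"{pc}: {reported_percent[pc]:.2f}% -> eigenvalue of about {value:.2f}")

# A component needs more than 100 / 21 per cent of the variance to exceed eigenvalue 1.
threshold_percent = 100 / N_TRAITS
first_five = sum(reported_percent.values())
remaining_pc6_to_pc8 = CUMULATIVE_PC1_TO_PC8 - first_five
print(f"Eigenvalue > 1 corresponds to more than {threshold_percent:.2f}% of variance")
print(f"PC1 to PC5 together: {first_five:.2f}%; PC6 to PC8 together: {remaining_pc6_to_pc8:.2f}%")
```

Under this assumption PC1 corresponds to an eigenvalue of about 4.62, and any retained component must explain more than about 4.76% of the total variability.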

Score Plot
The principal component scatter plot showed that accessions lying close together were similar when rated on the variables. Thus the accession pairs RST-9 and F 505, CPD 428 and ACP 71, LRA 5166 (SB) and VIKRAM, and PG 5 and RHC 005 were close to each other on both PC1 and PC2. The accessions SRT GMS-1, Surat Dwarf, LRA 5166, GSAV 1056, GSB 39 and DHY 286-1 were separated from the other accessions. The accession SRT GMS-1 was opposite to GMS 39 because SRT GMS-1 lay in the negative region and GMS 39 lay in the positive region. The genotypes with positive ordination may be utilized in heterosis breeding programmes (Fig. 1, Table 4).

Biplot
The variables were superimposed on the plot as vectors in the biplot; the relative length of each vector represents the relative proportion of variability in that variable. Extant varieties distant from the origin showed more variation and less similarity with the other varieties. Traits such as fibre fineness, fibre maturity, number of sympodia per plant, plant growth habit, plant height, boll number per plant, yield per plant, germination per cent, speed of germination, shoot length, vigour index, lint weight per plant, boll weight per plant, dry matter of seedlings, fibre length and seed index were well represented, with a high amount of variability, while root length and plant stand showed the lowest variability. The quality traits fibre uniformity and fibre strength were not in the desirable direction (Fig. 2). A similar result of least variability in fibre uniformity in cotton has been reported 23. The range of variability in each trait among the varieties under study showed considerable divergence, which may be useful for an effective cotton breeding program.

Clustering
The factors corresponding to the 21 PCs were subjected to agglomerative hierarchical clustering performed on the Euclidean distance matrix using Ward's linkage method, and the resulting dendrogram showed three distinct clusters (Fig. 3). Cluster II was the largest, comprising 47 genotypes, followed by cluster III with 35 genotypes and cluster I with 19 genotypes (Table 5). Cluster analysis identifies groups of cotton cultivars that are more closely related, as reported earlier 24. These results are in agreement with earlier studies 25,26. The geographical distribution of genotypes is not the only factor that causes morphological and genetic diversity. Diversity may also be the outcome of several other factors, such as natural or artificial selection, exchange of breeding materials, genetic drift and environmental variation. Therefore, the selection of parents should be based on genetic rather than geographical diversity. Cluster analysis using Ward's method of minimum variance exhibited a distinct pattern of group formation (Table 6). The genotypes in cluster I showed higher values for germination per cent, rate of germination, root and shoot length, vigour index, dry matter of seedlings, plant growth habit, boll weight, ginning (%), fibre length, fibre maturity and lint weight/10 bolls. Similarly, cluster II comprised genotypes with the highest values for plant height, seed index, fibre strength, boll number per plant and yield per plant. The members of cluster III were characterized by higher values for plant stand (%), fibre fineness, fibre uniformity and number of sympodia per plant.
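The clustering step can be sketched in plain Python as a naive agglomerative procedure that, at each step, merges the two clusters whose union gives the smallest increase in within-cluster sum of squares (Ward's criterion on Euclidean distances). This is an illustration only, not the XLSTAT implementation; the function names and the two-dimensional toy scores are hypothetical.

```python
def centroid(points):
    """Mean vector of a list of equal-length coordinate lists."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]


def ward_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the two clusters are merged."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * squared_distance


def ward_clusters(points, k):
    """Merge clusters by Ward's criterion until k remain; returns lists of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [points[m] for m in clusters[i]],
                    [points[m] for m in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical PC scores for six genotypes (two components shown for brevity).
toy_scores = [
    [0.0, 0.0], [0.1, 0.0],
    [5.0, 5.0], [5.1, 5.0],
    [10.0, 0.0], [10.1, 0.1],
]
groups = ward_clusters(toy_scores, k=3)
print(groups)
```

In the study the same idea is applied to the scores of the 101 genotypes, and cutting the dendrogram at three clusters yields the groups summarized in Table 5.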

Fig. 3: Dendrogram based on Ward's linkage method of 101 extant varieties of upland cotton

Table 6: Mean values of clusters for various traits in extant varieties of upland cotton
The highest variation was found for vigour index, plant height, germination per cent, fibre maturity, yield per plant, plant stand, fibre uniformity and ginning per cent. Seed cotton yield exhibited a significant positive correlation with boll number per plant, boll weight, lint weight, fibre length, plant growth habit and plant height. Principal component analysis summarized the maximum diversity present among the genotypes in eight components. In the cluster analysis, genotypes such as Surat Dwarf, GSAV 1039, DHY 39 and SRT GMS-1 fell into different clusters, and in the biplot they were also located far from the rest of the genotypes. These varieties may therefore be useful in further breeding programmes to develop high-yielding cultivars. The classification of varieties into different clusters aids the selection of genotypes for planned future breeding programmes. Varieties that are diverse in nature may be useful for transferring desired genes for cotton yield improvement suited to different environments.