Identification of hydrated and dehydrated lipids and protein secondary structures in seeds of cotton (Gossypium hirsutum) lines

Cottonseeds from two parents (TM-1 and 3-79) and their 17 progeny (chromosomal substitution) lines were analyzed, separately in hulls and kernels, for protein secondary structures and the moisture content of lipids. Fourier transform infrared (FTIR) spectroscopy was applied to mature seeds of the Upland cotton (G. hirsutum) progeny lines and the parents. The cottonseeds differed in their protein secondary structures and lipid hydration levels. Two progeny lines, CS-B12sh and CS-B22sh, retained lipid moisture content and protein secondary structures similar to both parents, while CS-B06, CS-B15sh and CS-B16 remained distinct from either parent. CS-B05sh, CS-B07 and CS-B26lo resembled the TM-1 parent in lipid and protein profiles, whereas CS-B02 and CS-B04 were comparable with the 3-79 parent. The capacity to detect hydrated and dehydrated lipids and different protein secondary structures in these cottonseeds using FTIR is the novel finding of this project, and it can support the improvement of seed nutritional traits in cotton for cooking-oil and protein-feed uses.


Introduction
Upland cotton (Gossypium hirsutum) and Pima cotton (G. barbadense) are the two most commonly cultivated cotton species, accounting for 98% and 2% of the cotton acreage in the United States, respectively. Current cotton yield in the United States is 1 1/3 bales of fiber and about 1,078 pounds of seed per acre [1], and the crop generates more than $120 billion in business revenue annually. Grown on less than 3% of the world's total agricultural land, cotton supplies 30% of the global need for textile fiber [2]. The largest use of cotton is for its fiber, which is used to produce clothes, towels, shoe strings, sheets, high quality papers and cushions [3].
Besides fiber, cottonseeds have been crushed to obtain oil and cottonseed meal for more than 100 years. Cottonseed kernels are pressed to extract oil that can be used directly as cooking oil for human consumption, as it contains a low level of saturated fatty acids and a high level of the natural antioxidants tocopherols [4]. Kernels are the most nutritious part of the seed, containing 28.24 to 44.05% oil and 27.83 to 45.6% protein along with 17 different kinds of amino acids [5]. Whole cottonseed (WCS) feed increases milk production and fat test in high-producing dairy cows and does not interfere with forage digestion when fed at a reasonable level. When dried, WCS contains 23% protein, 25% crude fiber and 20% fat, and so is termed a cost-effective "triple nutrient" [6]. Cottonseed meal contains around 41% protein and thus can also be used as a protein concentrate in the form of cake or pellets [7].
Previous gene introgression approaches in cotton were carried out only at the whole-genome level, which caused accumulation of unwanted DNA and resulted in negative effects such as infertility, cytological abnormalities and distorted segregation [8]. This led to the development of backcrossed chromosome or chromosome arm substitution (CS) lines by replacing chromosomes or chromosomal arms of G. hirsutum with the corresponding chromosomes or chromosomal arms from G. barbadense (B) lines. The Upland parent (TM-1) was derived as an inbred from Deltapine 14, a commercial cultivar, and maintained for 40 generations through self-pollination. The Pima parent (3-79) was developed as a doubled haploid from Pima germplasm. All the progeny aneuploid CS-B lines used in this study are nearly isogenic to the TM-1 parent for 25 chromosomal pairs and to each other for 24 chromosomal pairs [9].
Infrared (IR) spectroscopy is a long-established, non-destructive experimental method for evaluating organic compounds that can be used in a wide variety of environments [10]. During the IR procedure, samples absorb electromagnetic radiation through molecular vibrational stretching and bending of chemical bonds, which can be used to determine the membrane-linked alignment and conformation of functional groups. The absorbance of electromagnetic radiation in IR is directly proportional to the concentration of the target molecules and the path length of the measuring cell, which allows measurement with high precision and high throughput at low cost [11]. Sun et al. [12] studied the chemical composition of transgenic and non-transgenic cottonseeds using Fourier transform (FT)-IR in four absorption regions: 1800 to 1720, 1720 to 1580, 1580 to 1480 and 1200 to 1130 cm⁻¹. They compared the protein secondary structures of the cottonseeds, and the profiles showed large amounts of α-helices and β-sheets along with some random coils and turns.
The cottonseeds used in this study were from two parents, TM-1 and 3-79, as well as their 17 progeny lines with either substituted chromosomes or short (sh)/long (lo) chromosomal segments from G. barbadense. The progeny CS-B lines assayed were: CS-B01, CS-B02, CS-B04, CS-B05sh, CS-B06, CS-B07, CS-B11sh, CS-B12sh, CS-B14sh, CS-B15sh, CS-B16, CS-B17, CS-B18, CS-B22sh, CS-B22lo, CS-B25 and CS-B26lo. Total lipid contents of TM-1 and 3-79 are reported [5] to be 22.14% and 25.79%, respectively. Horn et al. [13] also estimated the total protein levels of TM-1 and 3-79 to be 15.14% and 21.69%, respectively. Thus, the 17 CS-B progeny lines used in this study were not expected to differ widely from either parent or from each other in total seed lipid and protein contents. This research therefore compared the CS-B lines for their protein secondary structures (Table 1) as well as for differences in the moisture content of lipids, which reflect nutritional quality, separately for hulls and kernels.
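The proportionality between IR absorbance, concentration and path length noted in this Introduction is the Beer-Lambert relation, A = ε·c·l. The short Python sketch below only illustrates that relation; the function names and the numerical values for ε, c and l are hypothetical and are not taken from this study.

```python
def absorbance(molar_absorptivity, concentration, path_length_cm):
    """Beer-Lambert relation: A = epsilon * c * l."""
    return molar_absorptivity * concentration * path_length_cm


def concentration_from_absorbance(a, molar_absorptivity, path_length_cm):
    """Invert A = epsilon * c * l to recover the concentration c."""
    if molar_absorptivity <= 0 or path_length_cm <= 0:
        raise ValueError("molar absorptivity and path length must be positive")
    return a / (molar_absorptivity * path_length_cm)


# Hypothetical values, chosen only to show the proportionality.
a1 = absorbance(molar_absorptivity=200.0, concentration=0.001, path_length_cm=1.0)
a2 = absorbance(molar_absorptivity=200.0, concentration=0.002, path_length_cm=1.0)
```

Doubling the concentration at a fixed path length doubles the absorbance, which is why band areas can be compared between samples measured under the same conditions.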

Results and discussion
Cotton is grown not only as one of the most important cash crops for its fiber but also for seed nutritional components used in vegetable oil and protein feed supplements. Both parents, TM-1 and 3-79, had detectable lipid and protein levels; however, based on lipid moisture contents and protein secondary structures, 3-79 appears to have the more desirable profiles for the two nutritional components (Tables 2 and 3). Comparison of the 17 progeny CS-B lines with their two parents and among themselves for seed nutritional traits (lipids and proteins) revealed that hulls of CS-B07 and CS-B25 were closely related to TM-1, while hulls of CS-B01, CS-B04, CS-B05sh and CS-B22sh reflected the 3-79 profile (Figure 1). Kernels of CS-B01, CS-B11sh, CS-B12sh, CS-B22sh and CS-B26lo were closely related to TM-1, while kernels of CS-B02 and CS-B04 were like those of 3-79 (Figure 2).
Table 3 note: wavenumbers (ṽ) used to identify β-turns (1665 to 1680 cm⁻¹), α-helices (1646 to 1660 cm⁻¹), random coils (1638 to 1645 cm⁻¹) and β-sheets (1610 to 1637 and 1685 to 1699 cm⁻¹) in hulls (H) and kernels (K) of CS-B lines; reported values are areas under the curve of the Fourier transform infrared spectra of protein secondary structures.
For the 19 cotton lines assayed, both hydrated and dehydrated types of lipids were evident in cottonseed hulls as well as kernels. Wavenumbers from 1731 to 1750 cm⁻¹ were selected to detect hydrated lipids, and 1700 to 1730 cm⁻¹ to detect dehydrated lipids. The Pima parent (3-79) had a relatively higher lipid area than the Upland parent (TM-1) in the hull but a lower one in the kernel (Table 2). Both parents had the same kind of lipid in their kernels but differed in their hulls: the TM-1 hull showed evidence of dehydrated lipid, while the 3-79 hull had the hydrated type. FTIR analysis allowed hydrated and dehydrated lipids to be distinguished nondestructively in cottonseed samples, which warrants further research into the mechanisms and roles of lipid structures, especially in seed physiology.
Cottonseed hulls from 3-79 and 11 progeny lines showed evidence of hydrated lipid, while only seven lines (the Upland parent TM-1, CS-B05sh, CS-B07, CS-B12sh, CS-B17, CS-B22sh and CS-B26lo) showed the dehydrated type (Table 2). Four lines, CS-B12sh, CS-B17, CS-B22sh and CS-B26lo, showed evidence of both kinds of lipids in their hulls, of which CS-B12sh had the highest lipid areas. These four progeny lines showed traits of both parental hull types, while all other progeny lines were like one of the parents (TM-1 or 3-79). Among the 12 cottonseed hulls showing evidence of hydrated lipids (3-79 and 11 progeny lines), CS-B12sh had the highest relative amount (7.34) while CS-B04 had the lowest (1.20). CS-B22sh and CS-B25 were closely related to 3-79 in terms of lipid curve area (Table 2). Among the seven cottonseed hulls that showed evidence of dehydrated lipid (TM-1 and six progeny lines), CS-B05sh had the highest area (4.61) whereas CS-B22sh had the lowest (1.06). CS-B12sh was like TM-1, while CS-B05sh, CS-B07 and CS-B17 had higher lipid areas than this parental line (Table 2).
When cottonseed kernels were analyzed, all samples except CS-B16 showed evidence of hydrated lipids, of which CS-B01 had the highest lipid area (12.52) (Table 2). Hydrated lipids can be advantageous in seed germination because they form stronger hydrogen bonds and act like "water bridges", whereby water molecules in the lipid bilayer tightly connect the neighboring phosphate and carbonyl headgroups [15]. Lipids are the most energy-dense storage compounds in dormant cottonseed and are utilized for growth of the plant after germination. Thus, the presence of lipid in the seed kernels, as observed in these CS-B lines, supplies energy required by the growing seedling [16]. CS-B04 and CS-B22lo showed structures similar to those of the Pima parent (3-79), while CS-B05sh, CS-B06, CS-B12sh, CS-B14sh, CS-B22sh and CS-B26lo were closely related to the Upland parent (TM-1) in terms of kernel lipid areas (Table 2). The discrimination of lipid profiles in hulls and kernels by FTIR indicates the value of this tool in selecting breeding lines for seed traits.
Protein secondary structures' comparison
Hinze et al. [17] have reported protein contents above 21% for both G. hirsutum and G. barbadense; thus both parental lines have high potential as protein sources, even for human consumption, although the seeds are primarily fed to livestock. The lack of detailed information about the structure of functional seed storage proteins has hampered genetic transformation approaches for improving their nutritional traits [18]. Proteins stored in seed organelles [13] are degraded by endogenous protease enzymes upon germination to provide nutrition to the growing seedlings [19]. Storage proteins in seed tissues are highly expressed during later growth of the plants and determine the seed's nutritional value when used as food or feed [20]. All analyzed hull and kernel samples showed evidence of several protein secondary structures (Table 3). Most CS-B lines possessed multiple secondary structures in their hulls, while kernels contained only a single kind of structure. α-Helices and β-sheets were the major secondary structures found in these cottonseeds, and both are largely responsible for the three-dimensional organization of proteins [21]. The two cotton parents showed completely different secondary structure profiles from each other in both hulls and kernels. In hulls, TM-1 showed evidence of turns while α-helix was evident in 3-79 (Table 3), suggesting that the latter had a stable protein structure. In kernels, β-sheets were detected in TM-1 while random coils were evident in 3-79 (Table 3), suggesting that the former had more stable proteins. Hulls of CS-B18 and CS-B22sh showed evidence of all four protein secondary structures, with three β-sheet curves in CS-B18 and two in CS-B22sh (Table 3).
Turns and random coils
Hulls of TM-1, CS-B05sh, CS-B07, CS-B11sh, CS-B12sh, CS-B15sh, CS-B17, CS-B18, CS-B22sh, CS-B25 and CS-B26lo showed evidence of turns (Table 3), of which TM-1 had the highest area (36.79) and CS-B05sh had the lowest (0.125). For kernels, only CS-B06 had this protein secondary structure, with an area of 21.94 (Table 3). Hulls of only three progeny lines, CS-B16, CS-B17 and CS-B26lo, showed evidence of random coils (Table 3), of which CS-B16 had the highest area (70.77) and CS-B17 had the lowest (1.27). Kernels of two progeny lines, CS-B02 and CS-B04, as well as the 3-79 parent showed evidence of random coils (Table 3), of which CS-B02 had the highest area (73.46) and CS-B04 had the lowest (56.99). Due to their hydrophilic nature, these unordered protein structures serve as water-binding proteins, interact with macromolecules as a water matrix, may act as hydration buffers that regulate water in cells, and help to resist protein denaturation when tissues are dehydrated [22]. Thus, the cotton lines above that show evidence of these unordered protein structures may have better seed germination rates.
α-Helices
Hulls of 3-79, CS-B01, CS-B02, CS-B04, CS-B05sh, CS-B11sh, CS-B12sh, CS-B14sh, CS-B18, CS-B22sh and CS-B22lo showed evidence of α-helices (Table 3). Among these lines, CS-B01 had the highest α-helix area (65.56) while CS-B22lo had the lowest (7.37). The areas under the helix curves for CS-B01 and CS-B04 hulls were higher than that of the 3-79 parent, while those of the other eight progeny lines were lower. Kernels of CS-B15sh, CS-B16, CS-B17 and CS-B18 showed evidence of α-helices (Table 3), among which CS-B16 had the highest area (66.36) while CS-B18 had the lowest (13.50). Of the above CS-B lines, only CS-B18 had α-helical structure in both its hull and its kernel (Table 3). α-Helix-rich proteins may provide fundamental mechanical support in cells, define the cell's stretchiness, enable binding with other signaling proteins, and help in cell motility as well as biochemical signaling [23].
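The values compared in Tables 2 and 3 are areas under spectral curves. How those areas were computed (for example, by peak fitting) is not described in this section, so the following Python sketch is only an assumed, simplified stand-in: it integrates absorbance over a wavenumber window with the trapezoidal rule. The function name `band_area` and the synthetic spectrum are hypothetical.

```python
def band_area(wavenumbers, absorbances, low, high):
    """Trapezoidal area under the spectrum for low <= wavenumber <= high.

    Simplification: only sampled points inside the window are used; the
    spectrum is not interpolated at the window edges.
    """
    if len(wavenumbers) != len(absorbances):
        raise ValueError("wavenumbers and absorbances must have equal length")
    points = sorted(
        (w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high
    )
    area = 0.0
    for (w0, a0), (w1, a1) in zip(points, points[1:]):
        area += 0.5 * (a0 + a1) * (w1 - w0)
    return area


# Synthetic triangular band inside the alpha-helix window (1646 to 1660 cm^-1).
# These points are made up for illustration and are not measured data.
wn = [1640.0, 1646.0, 1653.0, 1660.0, 1665.0]
ab = [0.0, 0.0, 1.0, 0.0, 0.0]
helix_area = band_area(wn, ab, 1646.0, 1660.0)
```

For this synthetic triangle (base 14 cm⁻¹, height 1) the function returns 7.0. Real spectra would normally need baseline correction and resolution of overlapping bands before areas are comparable.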

β-Sheets
Hulls of CS-B02, CS-B04, CS-B06, CS-B11sh, CS-B12sh, CS-B15sh, CS-B18 and CS-B22sh showed evidence of β-sheets (Table 3). Among these eight lines, CS-B18 had three β-sheet curves at 1696.07, 1635.5 and 1615.22 cm⁻¹ with areas of 0.18, 12.26 and 13.04, respectively. CS-B22sh showed two β-sheet curves at 1685.54 and 1627.10 cm⁻¹ with areas of 4.24 and 2.74, respectively (Table 3). Among the eight hulls, CS-B15sh had the highest β-sheet area (52.51) while CS-B18 had the lowest (0.18). Kernels of 11 cottonseeds showed evidence of β-sheets (Table 3), of which CS-B05sh had the highest area (61.75) while CS-B17 had the lowest (8.82). Kernels of CS-B05sh, CS-B07, CS-B11sh, CS-B12sh, CS-B14sh, CS-B22sh and CS-B22lo had higher β-sheet areas than the TM-1 parent (32.12), while those of CS-B01, CS-B25 and CS-B26lo were lower (Table 3). β-Sheets are formed by the simultaneous uncoiling of α-helices, a phenomenon termed the α-β transition [23]. Activities such as heating and roasting have been reported [14] to increase the percentage of β-sheets at the cellular level. This protein structure is undegradable and indigestible, which lowers the feed value and limits access by gastrointestinal digestive enzymes in ruminants [24]. Thus, CS-B progeny lines higher in β-sheets may not be desirable choices for feedstock development.
In this study, FTIR spectroscopy was found to be reliable and convenient for analyzing lipid hydration states and protein secondary structures non-destructively in cottonseeds of all these CS-B lines, which may also be used as sources of food and feed [25,26]. The CS-B lines used in this study have a TM-1 background; therefore, as expected, most of the progeny lines (CS-B01, CS-B05sh, CS-B07, CS-B11sh, CS-B12sh, CS-B14sh, CS-B22lo, CS-B22sh, CS-B25 and CS-B26lo) clustered around the G. hirsutum parent based on their lipid and protein secondary structures (Figure 3). The remaining seven progeny lines (CS-B02, CS-B04, CS-B06, CS-B15sh, CS-B16, CS-B17 and CS-B18) had lipid and protein FTIR profiles like the 3-79 parent, probably reflecting their G. barbadense foreground (Figure 3). Further understanding of the mechanisms of variation in these nutritional components between the lines could be used to select effective candidates in crop improvement programs.
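The band assignments used in this section (the protein windows from the Table 3 note and the two lipid windows) can be written as a simple lookup. The Python sketch below is an illustration only: the name `assign_band` is hypothetical, the window boundaries are treated as inclusive by assumption, and wavenumbers that fall in the gaps between the published windows are left unassigned.

```python
# Windows in cm^-1 as stated in the text; inclusive bounds are an assumption.
BANDS = [
    ("hydrated lipid", 1731.0, 1750.0),
    ("dehydrated lipid", 1700.0, 1730.0),
    ("beta-sheet", 1685.0, 1699.0),
    ("beta-turn", 1665.0, 1680.0),
    ("alpha-helix", 1646.0, 1660.0),
    ("random coil", 1638.0, 1645.0),
    ("beta-sheet", 1610.0, 1637.0),
]


def assign_band(wavenumber):
    """Return the band label for a peak position, or None if it is in a gap."""
    for label, low, high in BANDS:
        if low <= wavenumber <= high:
            return label
    return None


# Beta-sheet peak positions reported above for CS-B18 and CS-B22sh hulls.
cs_b18_hull_peaks = [1696.07, 1635.5, 1615.22]
cs_b22sh_hull_peaks = [1685.54, 1627.10]
cs_b18_labels = [assign_band(p) for p in cs_b18_hull_peaks]
cs_b22sh_labels = [assign_band(p) for p in cs_b22sh_hull_peaks]
```

All five reported peak positions fall inside the two β-sheet windows, consistent with how they are described in the text. A position such as 1662 cm⁻¹ lies between the α-helix and β-turn windows and returns None.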

Figure 1. Fourier Transform Infrared Spectrometry based evidence of secondary structures of protein and lipid profiles in hulls of seeds from 17 G. hirsutum cotton chromosomal substitution (CS) lines with G. barbadense (B) chromosome or segments as well as both that of TM-1 (G. hirsutum) and 3-79 (G. barbadense) parents

Figure 3. Dendrogram based on Fourier Transform Infrared Spectrometry detection of protein secondary structures and lipid moisture contents for seeds of the G. hirsutum and G. barbadense (B) parents along with their 17 chromosomal substitution (CS) progeny lines with TM-1 background and G. barbadense chromosome or segments as foreground
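The distance measure and clustering method behind the dendrogram are not stated in this section. As a rough illustration of how presence or absence of FTIR features can relate a progeny line to a parent, the Python sketch below uses Jaccard similarity, which is an assumed choice and not necessarily the authors' method. The hull feature sets are transcribed from the Results text for two parents and two progeny lines only, peak areas are ignored, and the names `jaccard` and `nearest_parent` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hull features as described in the Results text (presence/absence only).
PARENTS = {
    "TM-1": {"turn", "dehydrated lipid"},
    "3-79": {"alpha-helix", "hydrated lipid"},
}
PROGENY = {
    "CS-B07": {"turn", "dehydrated lipid"},
    "CS-B04": {"alpha-helix", "beta-sheet", "hydrated lipid"},
}


def nearest_parent(features):
    """Name of the parent with the highest Jaccard similarity.

    If the similarities tie, the parent listed first in PARENTS is returned.
    """
    return max(PARENTS, key=lambda name: jaccard(features, PARENTS[name]))


assignments = {line: nearest_parent(f) for line, f in PROGENY.items()}
```

With these inputs CS-B07 hulls are assigned to TM-1 and CS-B04 hulls to 3-79, in line with the hull comparison reported for Figure 1. A full dendrogram would need features from hulls and kernels of all 19 lines and a stated linkage rule.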