Genome Wide Identification and Characterization of Light-Harvesting Chloro a/b Binding Genes Reveals their Potential Role in Enhancing Drought Tolerance in Gossypium hirsutum

Background Cotton is an important commercial crop and a valuable source of natural fiber. Its production has declined sharply because of abiotic stresses, notably drought. Plants have evolved self-defense mechanisms against abiotic stress factors such as drought, salt, and cold. Stress-responsive transcription factors and proteins such as the trihelix, the nodule-inception-like protein (NLP), and the late embryogenesis abundant (LEA) proteins, among others, have shown a positive role in improving resistance to several forms of abiotic stress. Results Genome-wide identification and characterization of the Light-Harvesting Chloro a/b binding (LHC) genes was carried out in cotton under drought stress conditions. A hundred and nine proteins encoded by the LHC genes were found in the cotton genomes, with 55, 27, and 27 genes in Gossypium hirsutum, G. arboreum, and G. raimondii, respectively. The genes were unevenly distributed across the chromosomes. The Ka/Ks values were less than one, an indication of negative selection acting on the gene family. A differential expression pattern was observed, with the majority of the genes being highly upregulated in root tissues relative to leaf and stem tissues. Moreover, more genes were induced in M85, a relatively drought-tolerant germplasm. Conclusion: The results provide evidence of the possible role of the LHC genes in improving drought stress tolerance, and can be explored by cotton breeders in releasing more drought-tolerant cotton germplasm.


Introduction
Over the course of the 21st century, food production has to match the increasing population (Beddington et al., 2012). However, rising temperatures and climate change have increased the incidence and severity of abiotic stresses that critically disturb the growth and development of crops (Nouri, Moumeni & Komatsu, 2015). Abiotic stress remains one of the key causes of yield loss in plant production (Sasi et al., 2018). Moreover, because plants are immobile, abiotic stress has a greater impact on their growth and development than on other living organisms (He, He & Ding, 2018; Magwanga et al., 2018, 2019; Xu et al., 2019). Among the various abiotic stress factors, drought, heat, toxicity, and salinity cause over-reduction of the electron transport chain (ETC), resulting in photooxidation (Nishiyama & Murata, 2014). Furthermore, in the chloroplasts, drought, high light, salinity, or extreme temperatures trigger a decline in CO2 assimilation rates, which in turn induces an upsurge in reactive oxygen species production and eventually leads to yield loss (Pintó-Marijuan & Munné-Bosch, 2014). It has been reported that abiotic stresses account for over 50% of losses in crop production (Nath et al., 2013). Moreover, a decrease in photosynthesis results in a marked reduction in yield and yield quality in crops (Nouri, Moumeni & Komatsu, 2015).
Drought exposure alters the photosynthetic apparatus in plants, and plants have therefore evolved numerous coping mechanisms, one of which is the evolution of various plant transcription factors (Hussain et al., 2018). The plant genes known to affect the photosynthetic process include ribulose bisphosphate carboxylase large chain (rbcL) (Berry & Yerramsetty, 2013), cytochrome f (petA) (Xing et al., 2016), light-harvesting chlorophyll a/b-binding (LHC) (Zhao et al., 2020a), and cytochrome P450 genes (Magwanga, Lu & Kirungu, 2019), among others. In crops, the LHC gene family includes the LHCA and LHCB sub-families, which encode the proteins forming the light-harvesting complexes of photosystems I and II (Fanna et al., 2016). The LHC proteins are the apoproteins of the light-harvesting complex of photosystem II (PSII), the outer antenna complex, and are perhaps the most abundant membrane proteins in nature (Kró et al., 1995; Horton & Ruban, 2005; Xu et al., 2012). Moreover, studies have shown that down-regulation of LHCB1, LHCB2, LHCB3, LHCB4, LHCB5, or LHCB6 affects stomatal responsiveness to abscisic acid (ABA) influx and therefore lowers the plant's tolerance to drought stress (Xu et al., 2012). Furthermore, downregulation of the LHCB genes causes ABA-insensitive phenotypes in seed germination and post-germination growth (Liu et al., 2013a). Among the proteins encoded by the LHCB genes, 28 have been identified in Carica papaya (Zou et al., 2020a).

Plant material and Hydroponics
The experiment was laid out in a completely randomized design (CRD) with three biological replications in a greenhouse. Seeds of Marie-galante 85 (M85), a race developed from Gossypium hirsutum that is comparatively tolerant to abiotic stress, were used (Chen et al., 2018). The seeds were soaked in water overnight and sown on absorbent paper for germination; after a week, the seedlings were transplanted to a hydroponic setup containing Hoagland nutrient solution (Gene & Consortium, 2000) in the greenhouse, with a 16 h/8 h light/dark cycle and temperatures of 28 °C day/25 °C night (Zhao et al., 2020b). At the three-leaf stage, drought stress was imposed by supplementing the nutrient solution with 17% PEG-6000 (Liu et al., 2013b). Leaf, stem, and root tissues were then collected for RNA extraction at 0 h, 3 h, 6 h, 9 h, 12 h, and 24 h after stress exposure.

Identification of the Light-Harvesting Chloro a/b-bind Proteins in Cotton Species
The Light-Harvesting Chloro a/b-bind domain (Pfam accession PF00504) was used to search for and identify the cotton proteins encoded by the LHC genes. The LHC proteins for G. hirsutum, G. raimondii, and G. arboreum were downloaded from the cotton functional genomics database (www.cottonfgd.org), while those for Arabidopsis thaliana and Theobroma cacao were downloaded from Phytozome (https://phytozome.jgi.doe.gov). The HMM profile of the LHC functional domain PF00504 was retrieved from the Pfam database (http://pfam.xfam.org) and used to identify the putative LHC proteins with a best-domain E-value cutoff of < 1 × 10⁻⁴ (El-Gebali et al., 2019). The physicochemical traits of the gene family, namely protein length (PL), molecular weight (MW), molecular charge, isoelectric point (pI), and GRAVY value, were obtained from the CottonFGD website (www.cottonfgd.org).
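The E-value screen described above can be sketched in a few lines of Python. This is only an illustration: it assumes the profile search was run with HMMER's hmmsearch and written with its --tblout option, which the text does not state, and the function name, protein IDs, and scores in the sample are hypothetical.

```python
def filter_hits(tblout_text, max_evalue=1e-4):
    """Return target names whose best-domain E-value is below max_evalue."""
    kept = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' header/footer lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]
        # Column 8 (index 7) of --tblout is the best single-domain E-value.
        best_domain_evalue = float(fields[7])
        if best_domain_evalue < max_evalue and target not in kept:
            kept.append(target)
    return kept


# Made-up rows in the --tblout layout; IDs and scores are hypothetical.
SAMPLE = """\
# target  acc  query  acc  E-value  score  bias  E-value  score  bias  exp reg clu ov env dom rep inc description
prot_A - Chloroa_b-bind PF00504 1.2e-50 170.1 0.1 1.5e-50 169.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical
prot_B - Chloroa_b-bind PF00504 3.0e-03 12.0 0.0 5.0e-03 11.2 0.0 1.0 1 0 0 1 1 1 1 hypothetical
"""

print(filter_hits(SAMPLE))
```

Only prot_A passes the 1 × 10⁻⁴ cutoff in this toy table; loosening max_evalue shows how sensitive the candidate list is to the threshold.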

Phylogenetic tree and collinearity analysis
Protein sequences of the three cotton species, together with those of Arabidopsis thaliana and Theobroma cacao, were aligned with ClustalX in MEGA 7.0 for phylogenetic tree construction. The neighbor-joining (NJ) method was used to estimate evolutionary distances, with the Jones-Taylor-Thornton (JTT) substitution model and 1000 bootstrap replications (Tamura et al., 2011). To identify the homologous genes among the cotton species, the protein sequences of G. hirsutum were subjected to a BLASTP search against the protein databases of G. arboreum and G. raimondii; hits with E-values ≤ 1 × 10⁻⁵ and ≥ 90% similarity were considered significant.
The GFF3 file, linked file, and gene IDs were used for the collinearity analysis in TBtools software (Chen et al., 2018). Homologous genes of G. hirsutum, G. raimondii, and G. arboreum were identified from CottonFGD using BLASTp with a threshold of > 80% match and at least an 80% alignment ratio based on the protein length.
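The hit-filtering criteria above can be illustrated with a short Python sketch. It assumes BLAST results in the standard 12-column tabular layout (-outfmt 6) and uses the percent-identity column as a stand-in for "similarity"; the function name, sequence IDs, and protein lengths are hypothetical. The defaults pair the E-value and 90% cutoffs of the first search with the 80% alignment ratio of the second, which is an illustrative choice rather than the authors' stated pipeline.

```python
def filter_blast(lines, query_len, max_evalue=1e-5,
                 min_identity=90.0, min_ratio=0.8):
    """Keep (query, subject) pairs that pass all three thresholds.

    lines     : BLAST tabular rows (outfmt 6, tab-separated)
    query_len : dict mapping query ID to protein length in residues
    """
    kept = []
    for line in lines:
        f = line.rstrip("\n").split("\t")
        qseqid, sseqid = f[0], f[1]
        pident = float(f[2])      # percent identity
        aln_len = int(f[3])       # alignment length
        evalue = float(f[10])     # E-value
        # Alignment ratio relative to the query protein length.
        ratio = aln_len / query_len[qseqid]
        if evalue <= max_evalue and pident >= min_identity and ratio >= min_ratio:
            kept.append((qseqid, sseqid))
    return kept


# Hypothetical example rows and query lengths.
ROWS = [
    "q1\ts1\t95.0\t180\t9\t0\t1\t180\t1\t180\t1e-80\t300",
    "q1\ts2\t85.0\t190\t28\t0\t1\t190\t1\t190\t1e-60\t250",
    "q2\ts3\t98.0\t50\t1\t0\t1\t50\t1\t50\t1e-20\t100",
]
LENGTHS = {"q1": 200, "q2": 250}

print(filter_blast(ROWS, LENGTHS))
```

Here the second row fails on identity and the third on alignment ratio, so only the q1/s1 pair is retained.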

Chromosome mapping, Gene Ontology, and Cis-regulatory elements analysis
To determine the distribution of the Light-Harvesting Chloro a/b-bind genes across the chromosomes of the A, D, and AD cotton genomes, we used the GFF3 file from CottonFGD (www.cottonfgd.org) and the gene IDs. TBtools software was then used to display the genes on the chromosomes via its Amazing Gene Location from Gene Transfer Format/General Feature Format (GTF/GFF) function.
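As a rough illustration of what the mapping step tallies, the sketch below counts selected genes per chromosome from GFF3 text. It is not the TBtools procedure itself; it assumes standard nine-column GFF3 with an ID attribute on gene features, and the function name, chromosome names, and gene IDs are hypothetical.

```python
from collections import Counter


def genes_per_chromosome(gff3_text, gene_ids):
    """Count how many of the given gene IDs fall on each chromosome."""
    wanted = set(gene_ids)
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # GFF3 rows have nine columns; only 'gene' features are counted.
        if len(cols) != 9 or cols[2] != "gene":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if attrs.get("ID") in wanted:
            counts[cols[0]] += 1
    return counts


# Hypothetical GFF3 rows, built with explicit tabs.
GFF3_SAMPLE = "\n".join([
    "##gff-version 3",
    "\t".join(["A01", "src", "gene", "100", "900", ".", "+", ".", "ID=gene1;Name=x"]),
    "\t".join(["A01", "src", "mRNA", "100", "900", ".", "+", ".", "ID=gene1.1;Parent=gene1"]),
    "\t".join(["A01", "src", "gene", "2000", "2900", ".", "-", ".", "ID=gene2"]),
    "\t".join(["D07", "src", "gene", "50", "700", ".", "+", ".", "ID=gene3"]),
    "\t".join(["D07", "src", "gene", "900", "1500", ".", "+", ".", "ID=gene4"]),
])

print(dict(genes_per_chromosome(GFF3_SAMPLE, ["gene1", "gene2", "gene3"])))
```

Chromosomes that carry none of the listed genes simply do not appear in the result, which corresponds to the chromosomes reported as harboring no LHC genes.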
Gene Ontology functional classification of the genes into cellular component (CC), biological process (BP), and molecular function (MF) terms was carried out with the online tool AgriGO (www.bioinfo.cau.edu.cn/agriGO) (Gene & Consortium, 2000). The gene structure of the Light-Harvesting Chloro a/b-bind genes in G. hirsutum, G. arboreum, and G. raimondii was analyzed with the online Gene Structure Display Server, GSDS 2.0 (http://gsds.cbi.pku.edu.cn), while motifs were identified with the online tool MEME (http://meme-suite.org/). The 2000-bp upstream sequences of the CAB genes of the cotton species were downloaded from CottonFGD (http://www.cottonfgd.org/) to identify cis-regulatory elements in the putative promoter regions. The FASTA file of the upstream sequences was submitted to the PlantCARE search (https://pubmed.ncbi.nlm.nih.gov/11752327) to identify the putative cis-regulatory elements in the promoter sequences (Lescot et al., 2002). The structure was visualized with TBtools.
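The upstream sequences were downloaded ready-made from CottonFGD, so no coordinate arithmetic is described in the text. Purely to make "2000 bp upstream" concrete, the following sketch shows how such a window depends on the gene's strand; the function name and the coordinates are hypothetical, and 1-based inclusive coordinates are assumed.

```python
def upstream_region(start, end, strand, length=2000, chrom_length=None):
    """Return (from, to), 1-based inclusive, for the region upstream of a gene.

    For '+' genes the promoter lies before the start coordinate;
    for '-' genes it lies after the end coordinate.
    """
    if strand == "+":
        # Clip at the first base of the chromosome.
        return max(1, start - length), start - 1
    if strand == "-":
        region_end = end + length
        # Clip at the chromosome end when its length is known.
        if chrom_length is not None:
            region_end = min(region_end, chrom_length)
        return end + 1, region_end
    raise ValueError("strand must be '+' or '-'")


print(upstream_region(5000, 6000, "+"))
print(upstream_region(5000, 6000, "-"))
```

A gene close to a chromosome end yields a window shorter than 2000 bp, which is worth keeping in mind when comparing element counts between promoters.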

RNA extraction and RT-qPCR analysis
At the three-leaf stage, drought stress was imposed by supplementing the nutrient solution with 17% PEG-6000, as previously adopted by Magwanga et al. [36,37]. Samples were then collected for RNA extraction at 0 h, 3 h, 6 h, 9 h, 12 h, and 24 h post stress treatment. Total RNA was extracted using the TIANGEN RNAprep Pure Plant Plus kit (www.tiangen.com) according to the manufacturer's guidelines. A NanoDrop 2000 was used to check the quality and concentration of the extracted RNA, with a required 260/280 ratio between 1.80 and 2.1 (Joshi et al., 2016). The RNA was then converted to cDNA using a TransGen Biotech kit (Beijing, China; www.Transgen.com.cn), following the kit instructions. From the LHC gene family, we selected 27 genes for RT-qPCR and designed the primers (Table S1) using the NCBI website (www). The RT-qPCR analysis was run on a 7500 Fast Real-Time PCR system with 2 µL of cDNA, 2 µL of forward and reverse primers, 6 µL of RNase-free water, and 10 µL of SYBR solution per reaction. Three biological and technical replications were used throughout the analysis, with Ghactin7 as the control. Gene expression was calculated using the formula E = 2^(−ΔΔCt) (Schmittgen & Livak, 2008).
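The 2^(−ΔΔCt) calculation can be written out as a small Python sketch. It assumes that technical replicates are averaged before the ΔCt step and that the 0 h sample serves as the calibrator; both are common practice but are not spelled out in the text, and the Ct values below are invented for illustration.

```python
from statistics import mean


def relative_expression(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates:
    target gene and reference gene, in the treated and control samples.
    """
    # dCt: target Ct normalized to the reference gene within each sample.
    d_treated = mean(target_treated) - mean(ref_treated)
    d_control = mean(target_control) - mean(ref_control)
    # ddCt: treated sample relative to the control (calibrator) sample.
    dd = d_treated - d_control
    return 2 ** (-dd)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# after stress, relative to the reference gene.
fold = relative_expression(
    target_treated=[22.0, 22.5, 21.5],
    ref_treated=[18.0, 18.0, 18.0],
    target_control=[25.0, 25.0, 25.0],
    ref_control=[18.0, 18.0, 18.0],
)
print(fold)
```

A value above 1 indicates upregulation relative to the control sample and a value below 1 indicates downregulation.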

Results
Identification of the Cotton LHC proteins
A total of 109 proteins encoded by the LHC genes were identified in the three sequenced cotton genomes, with 55, 27, and 27 proteins in G. hirsutum (AD), G. raimondii (D), and G. arboreum (A), respectively (Table S2). The combined number of LHC proteins in the two diploid species, G. raimondii and G. arboreum, was one fewer than the number in G. hirsutum, possibly because the AD genome emerged from whole-genome duplication involving the A and D genomes.
In the evaluation of the physicochemical properties of the G. hirsutum Chloro a/b binding proteins, protein lengths ranged from 62 aa to 644 aa and molecular weights from 6.88 kDa to 72.66 kDa, recorded in Gh_Sca017783G01 and Gh_A02G1068, respectively. The charge ranged from −8.5 (Gh_A01G0519) to 7 (Gh_A02G1068), the isoelectric point (pI) from 4.701 (Gh_D06G2350) to 10.228 (Gh_D04G1505), and the grand average of hydropathy (GRAVY) from −0.529 (Gh_A01G0519) to 0.233 (Gh_D06G2350) (Table 1).
On the other hand, the pI and GRAVY values were almost the same in the two diploid species: pI ranged from 4.87 to 9.897 and from 4.701 to 9.296, and GRAVY from −0.377 to 0.167 and from −0.249 to 0.244, in G. arboreum and G. raimondii, respectively. In all the cotton species, the GRAVY values were low (both positive and negative), which suggests that the proteins are likely to interact readily with water and are hydrophilic in nature.
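For readers unfamiliar with GRAVY, the sketch below shows how the score is usually computed: the mean Kyte-Doolittle hydropathy of all residues in a sequence. The values in this study were taken from CottonFGD, so this is only an illustration under the assumption that the standard Kyte-Doolittle scale applies; the test sequences are toy examples, not cotton proteins.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy over all residues.

    Raises KeyError for non-standard residue letters and
    ValueError for an empty sequence.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


print(round(gravy("AILV"), 3))  # hydrophobic toy peptide, positive score
print(round(gravy("RK"), 3))    # charged toy peptide, negative score
```

Negative scores point to a more hydrophilic protein and positive scores to a more hydrophobic one, which is the basis for the interpretation given above.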

Phylogenetic Tree and Synteny block Analysis of the Cotton LHC Proteins
The phylogenetic tree grouped the cotton Light-Harvesting Chloro a/b binding proteins, together with those of the other plants, into 12 clades. Numerous homologous gene pairs were formed among the proteins encoded by the cotton Light-Harvesting Chloro a/b binding genes (Fig. 1A).
The collinearity among the three cotton species was analyzed, with the Circle gene viewer in TBtools software used to identify the collinear gene pairs (Chen et al., 2018). The collinearity between the At and Dt subgenomes of G. hirsutum and the genomes of G. arboreum and G. raimondii was examined for the A vs D, A vs At, and D vs Dt relationships. We found good collinearity, with 23 genes between A and D, 20 genes between A and At, and 23 genes between D and Dt (Fig. 1B).

Gene Ontology Analysis
Gene Ontology (GO) has a structure that allows powerful comparisons and inferences about gene functions at the biological, cellular, and molecular levels (Gene & Consortium, 2000). The presumed functions of the 109 genes in the Gossypium Light-Harvesting Chloro a/b-bind gene family, including biological processes (BP), molecular functions (MF), and cellular components (CC), were identified using agriGO online analysis.
In G. hirsutum, the biological process (GO: 0008150) functions included cellular and metabolic processes, and the cellular component (GO: 0005575) functions were noted in the cell and cell part. Similarly, in G. arboreum, the biological process (GO: 0008150) functions included response to stimuli and cellular and metabolic processes. In cellular component (GO: 0005575), the functions were concentrated in the cell, macromolecular complex (protein), and membrane, whereas those in molecular function (GO: 0003674) were related to binding. In G. raimondii, the biological process (GO: 0008150) functions involved cellular and metabolic processes, as in G. hirsutum, whereas in cellular component (GO: 0005575) the function was related to the membrane. In both G. hirsutum and G. raimondii, there was no significant GO term in molecular function (Fig. 2).

Gene Structure and Motif Identification of Chloro a/b-bind Proteins
Gene structural diversity is regarded as a possible indicator of the evolution of multigene families. To gain further insight into the structural diversity of the cotton Light-Harvesting Chloro a/b-bind genes, the exon/intron organization of the full-length cDNAs was compared with the corresponding genomic DNA sequences of individual genes in G. hirsutum; a large proportion of the Light-Harvesting Chloro a/b-bind genes and their exons were highly conserved within the group.
In the study of the gene structures, some of the Light-Harvesting Chloro a/b-bind genes were interrupted by introns. The maximum number of introns was 11 (Gh_A02G1068), 11 (Ga02G0756), and 5 (Gorai.003G092700) for G. hirsutum, G. arboreum, and G. raimondii, respectively. The Light-Harvesting Chloro a/b-bind genes most commonly had two exons and one intron. The highest numbers of exons and introns were found in Gh_A02G1068 (12 exons, 11 introns) and Gh_A01G0519 (10 exons, 9 introns). Notably, the exons and introns of the different Light-Harvesting Chloro a/b-bind genes differed in length. For example, 18 genes had two exons and one intron, seven genes had three exons and two introns, and seven genes had one exon and no intron (Fig. 3).
On the other hand, in the diploid species, the highest numbers of exons and introns were 12 exons, 11 introns (Ga02G0756) and 11 exons, 10 introns (Ga01G0731) in G. arboreum, and 6 exons, 5 introns (Gorai.003G092700 and Gorai.009G262000) in G. raimondii. Similarly, seven genes in G. arboreum and ten in G. raimondii had two exons and one intron. Genes with three exons and two introns, as well as genes with a single exon and no intron, numbered five and three, respectively, in both species. To explore the structural evolution of the LHC proteins, the motif patterns were analyzed. A total of 20 different motifs were detected by the MEME analysis (http://meme-suite.org/) in the three Gossypium species (Fig. 4). Motifs 3, 4, and 12 were the conserved motifs in G. hirsutum, motifs 2 and 8 in G. arboreum, and motifs 11 and 4 in G. raimondii.

Chromosomal Mapping Analysis of the Light-Harvesting Chloro a/b binding Genes
The LHC genes were unevenly distributed across the chromosomes of the A₂, D₅, and (AD)₁ cotton genomes. In the At subgenome of the tetraploid (AD)₁ genome, the highest numbers of gene loci were found on chromosomes At01, At05, and At10, with 3 genes each, while chromosomes At03, At08, and At09 harbored none.
Similarly, in the Dt subgenome of (AD)₁, the highest numbers of gene loci were found on Dt07, Dt01, and Dt05, with 5, 4, and 4 genes, respectively, whereas At03, At08, and At09 had zero genes. The rest of the chromosomes harbored between 1 and 3 genes (Fig. 5A and B). In the two diploid cotton species, with the A₂ and D₅ genomes, the gene distribution pattern was different. In G. arboreum, the highest numbers of gene loci were observed on chromosomes A₂05 and A₂07, with 4 genes each, while in G. raimondii, chromosomes D₅01, D₅09, and D₅10 carried the highest numbers of gene loci, with 4 genes each; chromosomes A₂04 and D₅06 harbored none (Fig. 5C and D).

Identification of Cis-regulatory elements
Cis-acting regulatory elements are important molecular switches involved in the transcriptional regulation of a dynamic network of gene activities controlling various biological processes, including abiotic stress responses, hormone responses, and developmental processes. They encode the genomic blueprints for coordinating the spatiotemporal gene expression programs underlying highly specialized cell functions (Mao et al., 2020). In the PlantCARE analysis of cis-regulatory elements, ABRE, ARE, MRE, MYB, AT-rich elements, DRE, MBS, Box-4, and ACE were found to be related to drought stress in the three cotton species (Fig. 6). The major cis-acting elements, such as the ABA-responsive element (ABRE) and the dehydration-responsive element/C-repeat (DRE/CRT), are a vital part of ABA-dependent and ABA-independent gene expression in osmotic and cold stress responses (Yamaguchi-Shinozaki & Shinozaki, 2005).

Evolution of LHC genes in Gossypium species
In gene evolution, the Ks value is generally not affected by natural selection, whereas Ka is. Ka/Ks > 1, Ka/Ks = 1, and Ka/Ks < 1 indicate positive, neutral, and negative selection, respectively (Zhao et al., 2020b). The distributions of Ka, Ks, and Ka/Ks among homologous pairs of the Gossypium species were similar (Fig. 7, Table S3). The Ka/Ks of GhAt-Ga ranged from 0 to 0.949034416, while that of GhDt-Gr ranged from 0 to 0.838286204. The Ka/Ks of GhAt-GhDt ranged from 0 to 0.523637063, whereas that of Ga-Gr ranged from 0 to 0.755930549. In all the pairs, the Ka/Ks value was < 1, which indicated that the gene family was subjected to negative selection. The result suggested that the LHC genes of G. hirsutum, derived from G. raimondii and G. arboreum, experienced negative selection pressure throughout their evolution.
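The decision rule for Ka/Ks stated above is simple enough to express directly. The sketch below is a minimal illustration: the function name and the Ka and Ks values are hypothetical, and pairs with Ks = 0 are reported as undefined because the ratio cannot be computed for them.

```python
import math


def selection_type(ka, ks, tol=1e-9):
    """Classify a homologous gene pair by its Ka/Ks ratio."""
    if ks == 0:
        # No synonymous substitutions: the ratio is not defined.
        return "undefined"
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return "neutral"
    return "positive" if ratio > 1 else "negative"


# Hypothetical (Ka, Ks) values for three gene pairs.
pairs = {"pair_1": (0.01, 0.05), "pair_2": (0.20, 0.10), "pair_3": (0.10, 0.10)}
for name, (ka, ks) in pairs.items():
    print(name, selection_type(ka, ks))
```

Applied to the ranges reported here, every pair with a defined ratio falls below 1 and is therefore classified as under negative (purifying) selection.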

RT-qPCR Validation of Light-Harvesting Chloro a/b genes under Water Deficit Conditions
The expression profiles of twenty-seven genes were analyzed under drought stress conditions in different tissues and at varying time intervals. The genes showed differential expression patterns across the tissues analyzed. In root tissues, the highly upregulated genes were Gh_D10G2385, Gh_A13G0222, Gh_A05G0725, Gh_D05G0860, Gh_D07G0661, Gh_D01G1508, Gh_D12G1495, Gh_A07G2182, and Gh_A10G2108, while in leaf tissues, Gh_A07G2184, Gh_D10G2385, Gh_D05G0860, Gh_D02G1996, Gh_A13G0222, and Gh_A05G0725 showed higher upregulation after 12 h of stress exposure. Similarly, Gh_A13G0222, Gh_D06G1791, and Gh_A06G1447 were upregulated in stem tissues from 6 h up to 24 h (Fig. 8).
Downregulation occurred mainly in leaf tissue, followed by stem. Genes such as Gh_A10G0361, Gh_D10G0369, Gh_A03G2154, and Gh_D03G0610 were downregulated in all three tissues at almost all time points. In general, many genes were upregulated in the root tissue. Gh_A13G0222 (CAB6A) was upregulated in all tissue samples, and Gh_D10G2385 (LHCB4), Gh_D05G0860 (CAB6A), and Gh_A05G0725 (CAB6A) were also upregulated in leaf and root tissues under drought stress. A detailed exploration of these genes will offer useful information for understanding the LHC genes in cotton (Gossypium) and their part in drought stress tolerance. The effect of drought is first felt at the root zone, and the higher upregulation of various genes in the root tissues is in line with earlier results in which most of the LEA genes were upregulated in root tissues relative to leaf and stem tissues under drought stress (Magwanga et al., 2018).

Discussion
Drought is one of the key abiotic stresses affecting crop production worldwide. It also severely affects the physiology and growth of many crops (Joshi et al., 2016). It is the main risk for significant loss of cotton yield, owing to the ever-increasing shortage of water around the world (Hou et al., 2018). Drought stress damages photosynthetic pigments; the damage usually begins with mainly stomatal effects at moderate drought intensity and ends in metabolic and structural alterations under severe drought stress. Photosynthesis is one of the most vital photochemical reactions in plants. Sunlight is transformed into chemical energy, which is used to convert carbon dioxide, water, and minerals into oxygen and energy-rich organic compounds that are then used as an energy source by heterotrophs (Gururani, Venkatesh & Tran, 2015).
Photosynthesis is the outcome of many steps and complex processes that employ numerous biological pathways, such as the photosynthetic electron transport system (PETs), which transforms sunlight into ATP and NADPH; in addition, CO2 is fixed into carbohydrates by the Calvin-Benson cycle, and the photoassimilates, the organic products of photosynthesis, are assimilated, transported, and consumed (Eberhard, Finazzi & Wollman, 2008; Foyer et al., 2012). Disruption of any of these photosynthetic mechanisms is the primary impact of abiotic stress on photosynthetic activity (Nouri, Moumeni & Komatsu, 2015). The photosynthetic responses of mature crops and young seedlings to drought stress differ considerably. In mature crops, efficient photosynthetic complexes are already formed, and water stress causes the production of ROS through surplus light absorption, which stresses the photosynthetic apparatus. In water-stressed young seedlings, however, it is possible to downregulate Chl biosynthesis and reduce the production and accumulation of the light-harvesting complexes of PSI and PSII, so that the crops acclimatize and do not absorb surplus light, which is damaging (Dalal & Tripathy, 2018). The chloroplast has been a main research area in biology because it is the site of photosynthesis. It is also, however, very sensitive to biotic and abiotic stresses and indicates the real status of a crop's response to stress (Liu et al., 2013b; Li et al., 2020).
Light-harvesting chlorophyll a/b-binding (LHC) proteins constitute a plant-specific superfamily involved in photosynthesis and stress responses. Identifying the genes of this family would help in studying their function and role in different crop species (Qin et al., 2017; Zou et al., 2020b). However, little information on this family is available for cotton. Previous studies in crops suggested an important link between photosynthesis and final yield. Light-harvesting complex II (LHCII) is a central component of photosynthesis, with fundamental roles in light harvesting and acclimation to changing light (Longoni et al., 2015; Qin et al., 2017).
In our results, many genes were upregulated in the root tissue. Gh_A13G0222 (CAB6A) was upregulated in all tissue samples, while Gh_D10G2385 (LHCB4), Gh_D05G0860 (CAB6A), and Gh_A05G0725 (CAB6A) were upregulated in leaf and root tissues under drought stress. A study in tea plants showed that two genes, CsCP1 and CsCP2, affect phosphorylation/dephosphorylation and GTP in the physiological regulation of PSII. The regulation of LHC protein levels allows chloroplasts to respond flexibly and quickly to abiotic stresses (Li et al., 2020). Similarly, in papaya plants treated with mannitol to induce drought stress, three genes were upregulated after 10 days (CpELIP, CpLhcb7, and CpPsbS) and five genes after 15 days (CpELIP, CpSEP2, CpOHP2, CpLhcb7, and CpPsbS); after 20 days, 12 genes were significantly regulated, with five genes upregulated (CpELIP, CpSEP2, CpOHP2, CpLhcb7, and CpPsbS) (Zou et al., 2020a).
The evolutionary analysis of the LHC genes in the Gossypium species indicated that the distributions of Ka, Ks, and Ka/Ks were similar among homologous pairs. The Ka/Ks of GhAt-Ga ranged from 0 to 0.949034416, while that of GhDt-Gr ranged from 0 to 0.838286204. The Ka/Ks of GhAt-GhDt ranged from 0 to 0.523637063, whereas that of Ga-Gr ranged from 0 to 0.755930549. The result suggested that the LHC genes of G. hirsutum, derived from G. raimondii and G. arboreum, experienced negative selection pressure throughout their evolution. In agreement with this finding, the Ka/Ks values of cassava light-harvesting chlorophyll a/b-binding (LHC) genes range from 0.0010 to 0.2507 (Zou & Yang, 2019).
LHCB family members positively regulate abiotic stress tolerance in crops through ABA signaling in stomatal closure, from germination to final growth (Xu et al., 2012; Liu et al., 2013b). It is well established that ABA induces stomatal closure under water shortage conditions, which hinders photosynthesis. Genetic evidence shows that members of the LHCB family are indeed involved in guard cell signalling in response to ABA, and LHCB members have thus been identified as new actors in ABA signalling in stomatal movement (Xu et al., 2012). The LHCB members were shown to be targets of an ABA-responsive WRKY-domain transcription factor; ABA is an inducer that modifies LHCB expression, at least through suppressing the WRKY transcription repressor, under stressful conditions in cooperation with light, which allows crops to adjust to environmental challenges (Liu et al., 2013b). Functional genomics experiments will be desirable and helpful to demonstrate the biological and molecular functions of the LHC genes and to make use of them in cotton improvement.

Conclusions
A hundred and nine proteins encoded by the LHC genes were found in the cotton genomes, with 55, 27, and 27 genes in Gossypium hirsutum, G. arboreum, and G. raimondii, respectively. The majority of the LHC genes showed highly conserved exon-intron organization. Collinearity analysis and chromosomal mapping showed that the LHC genes were dispersed across the chromosomes of the three Gossypium species, with most genes clustering on the upper and lower arms of the chromosomes. In the three cotton species, the GRAVY values were low (both positive and negative), which indicated that the proteins were hydrophilic in nature. In the RT-qPCR analysis, many genes were upregulated in the root tissue. Gh_A13G0222 (CAB6A) was upregulated in all tissue samples, and Gh_D10G2385 (LHCB4), Gh_D05G0860 (CAB6A), and Gh_A05G0725 (CAB6A) were also upregulated in leaf and root tissues under drought stress. The Ka/Ks values showed that the LHC genes of G. hirsutum, derived from G. raimondii and G. arboreum, experienced negative selection pressure throughout their evolution. Thus, a detailed investigation of these genes will offer useful information for understanding the LHC genes in cotton (Gossypium) and their part in drought stress tolerance.
In other species, 17 LHC genes have been identified in Hordeum vulgare L. (Qin et al., 2017), 25 in Camellia sinensis (Li et al., 2020), and 35 in Manihot esculenta (Zou & Yang, 2019). However, the role of this important plant gene family in relation to abiotic stress factors in cotton has not been studied. The complete sequencing of Gossypium hirsutum (Hu et al., 2019), Gossypium arboreum (Huang et al., 2020), and Gossypium raimondii (Wang et al., 2012; Agricultural et al., 2019) provided the information needed to carry out functional analysis of the proteins encoded by the LHC genes in the three cotton genomes.

Figures

Figure 1

Figure 2

Figure 3

Table 1
Physiochemical properties of LHC proteins in G. hirsutum, G. arboreum and G. raimondii.

In the two diploid cotton species, G. arboreum and G. raimondii, the physiochemical properties of the LHC proteins exhibited slight differences in molecular weights, protein lengths, pI, molecular charge, and GRAVY values. The protein lengths ranged from 114 aa to 610 aa and from 151 aa to 349 aa, and the molecular weights from 12.823 to 68.741 kDa and from 16.55 to 38.267 kDa, with charge ranges of −6 to 9 and −4.5 to 7.5, in G. arboreum and G. raimondii, respectively (Table 1).