Exploring the plasmodesmata callose-binding protein gene family in upland cotton: unraveling insights for enhancing fiber length

Plasmodesmata are transmembrane channels embedded within the cell wall that facilitate intercellular communication in plants. Plasmodesmata callose-binding (PDCB) proteins, which associate with plasmodesmata, contribute to cell wall extension. Given that the elongation of cotton fiber cells correlates with cell wall dynamics, these proteins may be related to cotton fiber elongation. This study sought to identify PDCB family members within the Gossypium hirsutum genome and to elucidate their expression profiles. A total of 45 distinct family members were identified through the identification and screening processes. Analysis of their physicochemical properties revealed similar amino acid composition and molecular weight across most members. Phylogenetic analysis facilitated the construction of an evolutionary tree, categorizing these members into five groups, with the genes mainly distributed on 20 chromosomes. Guided by the fine mapping results, a tissue-specific examination of group V revealed that the expression level of GhPDCB9 peaked five days after flowering. VIGS experiments resulted in a marked decrease in the gene expression level and a significant reduction in mature fiber length, with an average shortening of 1.43–4.77 mm. The results indicate that GhPDCB9 plays a pivotal role in cotton fiber development and is a candidate gene for enhancing cotton yield.


INTRODUCTION
Cell walls are essential for plant development and growth. They are a crucial and dynamic structural component that provides vital mechanical support throughout plant growth, development, and adaptation to varying environmental conditions. The cell wall consists of a complex yet well-organized extracellular matrix that encompasses polysaccharides, glycoproteins, and aromatic compounds such as lignin. Notable components of the cell wall matrix include callose, cellulose, hemicellulose, and pectin (Jamet & Dunand, 2020).

Identification of PDCB genes in upland cotton
The amino acid and genome sequences of the Arabidopsis PDCB gene family were obtained from TAIR (http://www.arabidopsis.org/), whereas the amino acid and CDS sequences of the TM-1 genome (G. hirsutum L., second-generation genome) were obtained from the College of Agriculture and Biotechnology, Zhejiang University (http://cotton.zju.edu.cn/). TBtools was employed to analyze the PDCB gene family of Arabidopsis thaliana and upland cotton (G. hirsutum). The A. thaliana sequences were used as queries for a BLAST search of the cotton genome database (Chen et al., 2020). The web-based ProtParam tool (https://web.expasy.org/protparam/) was used to predict the physicochemical properties of the PDCB proteins, including the instability index, isoelectric point (pI), molecular weight (MW), and amino acid count. The subcellular localization of the PDCBs was predicted using CELLO v.2.5 (http://cello.life.nctu.edu.tw/) and WoLF PSORT (https://wolfpsort.hgc.jp/) (Yu et al., 2006).

Evolutionary analysis of PDCB gene family in cotton
The amino acid sequences of the PDCB gene family members in Arabidopsis thaliana and upland cotton were analyzed and compared, leading to the construction of a phylogenetic tree. MEGA-X was employed to construct a neighbor-joining tree for PDCB genes, with 1,000 bootstrap replications (Hall, 2013). The resulting phylogenetic tree was visualized using the EvolView software (He et al., 2016).

Gene structure and conserved motifs analyses
Cotton PDCB members were analyzed using the MEME online software (http://meme-suite.org/). Amino acid sequences were provided as input, and the software detected the number and types of motifs. The maximum number of motifs was set to 10, and all other settings were left at their default values. Gene structures (exons and introns) and conserved motifs were visualized using TBtools.

Chromosome location analysis of PDCB gene family
The chromosomal positions of the PDCB gene family members were extracted and organized using TBtools, based on the gene annotations in the gff3 file. Subsequently, the physical distribution of these PDCB members was visualized using the MapInspect software.
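The position-extraction step can be sketched in plain Python. This is only an illustration of reading gene coordinates from GFF3 records, not the TBtools workflow used in the study; the function name `gene_positions`, the gene IDs, and the coordinates in the sample records are hypothetical.

```python
from collections import Counter

def gene_positions(gff3_lines, wanted_ids):
    """Return {gene_id: (chromosome, start, end, strand)} for the requested genes.

    Only rows whose feature type (column 3) is "gene" are considered, and the
    gene identifier is read from the ID attribute in column 9.
    """
    positions = {}
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        gene_id = attrs.get("ID")
        if gene_id in wanted_ids:
            positions[gene_id] = (cols[0], int(cols[3]), int(cols[4]), cols[6])
    return positions

# Hypothetical GFF3 records, for illustration only.
EXAMPLE_GFF3 = [
    "##gff-version 3",
    "A12\tdemo\tgene\t1000\t2500\t.\t+\t.\tID=geneA;Name=demoA",
    "A12\tdemo\tmRNA\t1000\t2500\t.\t+\t.\tID=geneA.1;Parent=geneA",
    "D05\tdemo\tgene\t5000\t6200\t.\t-\t.\tID=geneB",
    "D05\tdemo\tgene\t9000\t9900\t.\t+\t.\tID=geneC",
]

positions = gene_positions(EXAMPLE_GFF3, {"geneA", "geneB"})
# Count how many of the selected genes fall on each chromosome.
genes_per_chromosome = Counter(chrom for chrom, *_ in positions.values())
print(positions)
print(genes_per_chromosome)
```

The per-chromosome counts produced this way are the kind of summary that a chromosome distribution map displays graphically.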

Analysis of the PDCB cis-regulatory element
To investigate the PDCB promoters in G. hirsutum and to predict the function of the PDCB genes, the analysis focused on regions predominantly 1,500 bp upstream of each gene, with a few exceptions. For thoroughness, an additional 500 bp further upstream was included. Therefore, a 2,000 bp sequence upstream of the start codon of each gene was extracted for detailed examination. This sequence was then analyzed using PlantCARE (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/).
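Extracting an upstream region depends on the strand of the gene, which is easy to get wrong. The sketch below shows one way to do it in plain Python, assuming 1-based inclusive coordinates (as in GFF3) for the feature whose 5′ end marks the start codon. The function names and the toy sequence are hypothetical, and this is not the tool used in the study.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def upstream_region(chrom_seq, start, end, strand, length=2000):
    """Return up to `length` bases upstream of a feature's 5' end.

    `start` and `end` are 1-based inclusive coordinates on the forward strand.
    The result is written 5'->3' on the feature's own strand and is truncated
    if the feature lies closer than `length` bases to a chromosome end.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), end + length)
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError(f"unknown strand: {strand!r}")

# Toy 16 bp "chromosome" with a feature at positions 9-12 (illustrative only).
toy = "ACGTACGTTTGGCCAA"
plus_promoter = upstream_region(toy, 9, 12, "+", length=4)
minus_promoter = upstream_region(toy, 9, 12, "-", length=4)
print(plus_promoter, minus_promoter)
```

For a forward-strand gene the upstream bases precede `start`; for a reverse-strand gene they follow `end` on the forward strand and must be reverse-complemented, which is why the two branches differ.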

Quantitative real-time polymerase chain reaction analysis
At the full flowering stage, fiber samples from field-grown G. hirsutum L. (cv. CCRI45) were collected at 5, 10, 15, and 20 days post-anthesis (DPA), along with tissue samples from stems (tender stems), leaves (fresh leaves the size of a fingernail), and flowers. Three biological replicates were taken for each sample, and three technical replicates were performed for each. These samples were immediately frozen in liquid nitrogen and preserved at −80 °C until total RNA extraction.
The RNA Prep Pure Plant Kit (DP441; Tiangen, Beijing, China) was used to extract total RNA from each sample, and RNA quality was assessed using a NanoDrop 2000 nucleic acid analyzer and gel electrophoresis. Subsequently, TranScript All-in-One First-Strand cDNA Synthesis SuperMix for qPCR (TransGen Biotech, Beijing, China) was used to synthesize cDNA. RT-qPCR was performed on an ABI 7500 Fast Real-Time PCR system (Applied Biosystems, Foster City, CA, USA) following the protocol of the TransStart Top Green qPCR SuperMix kit (TransGen Biotech, Beijing, China). Primer-BLAST, from the online NCBI database, was used to design specific primers for the differentially expressed genes (DEGs), with details provided in Table S1. The housekeeping β-actin gene (F: 5′-ATCCTCCGTCTTGACCTTG-3′; R: 5′-TGTCCGTCAGGCAACTCAT-3′) was used as a reference to normalize relative expression levels. Relative gene expression levels were quantified using the 2^(−ΔΔCt) method (Livak & Schmittgen, 2001).
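The 2^(−ΔΔCt) calculation can be illustrated with a short Python sketch. Here ΔCt = Ct(target) − Ct(reference) is computed for the sample and for a calibrator, ΔΔCt is their difference, and the fold change is 2 raised to −ΔΔCt, which assumes roughly 100% amplification efficiency for both primer pairs. The function name and all Ct values below are hypothetical placeholders, not data from this study.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates,
    which are averaged before the deltas are taken.
    """
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values (three technical replicates each).
fold_change = relative_expression(
    ct_target_sample=[24.0, 24.2, 23.8],      # target gene, tissue of interest
    ct_ref_sample=[18.0, 18.1, 17.9],         # reference gene, same tissue
    ct_target_calibrator=[26.0, 26.1, 25.9],  # target gene, calibrator tissue
    ct_ref_calibrator=[18.0, 18.0, 18.0],     # reference gene, calibrator tissue
)
print(f"fold change relative to calibrator: {fold_change:.2f}")
```

With these placeholder values, ΔCt is 6 in the sample and 8 in the calibrator, so ΔΔCt is −2 and the target is about four-fold more abundant in the sample than in the calibrator.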

Virus induced gene silencing of the candidate genes
The procedure began by selecting a 300 bp fragment that exhibited the best alignment with Gh_A12G1651 (https://vigs.solgenomics.net/). The fragment was then amplified using the specific primers VIGS: Gh_A12G1651-F and VIGS: Gh_A12G1651-R for virus-induced gene silencing (VIGS). The amplified fragment was ligated into the CLCrV vector after double digestion with SpeI and AscI. After sequence verification, the correct plasmid was transformed into Agrobacterium (LBA4404) and then introduced into TM-1 by injection into the abaxial (lower) leaf surface. Once albino plants had emerged, DNA was extracted to verify positive seedlings. Subsequently, RNA was reverse-transcribed for RT-qPCR analysis to assess expression levels. Phenotypes were assessed once the fibers reached maturity.

Identification of PDCB gene family members of G. hirsutum and analysis of their basic physical and chemical properties
The identified genes were screened for specific domains. A total of 45 members of the PDCB gene family were successfully identified in G. hirsutum (TM-1). Following the standard protein naming convention, these 45 members of the TM-1 genome were designated GhPDCB1–GhPDCB45. Their fundamental physicochemical characteristics were predicted and analyzed (Table 1). The analysis indicated that the length of the amino acid sequences within the gene family varied, ranging from 112 amino acids in GhPDCB25 to 424 amino acids in GhPDCB2, with the exception of GhPDCB44 and GhPDCB45, which were shorter than 100 amino acids (97 and 56 amino acids, respectively). The isoelectric points of these proteins ranged from 4.47 (GhPDCB18) to 8.99 (GhPDCB26). The instability index was used to predict protein stability in the test tube (≤ 40, probably stable; > 40, probably unstable). Furthermore, subcellular localization predictions indicated that 24 proteins were localized extracellularly, nine within the chloroplast, and four within the nucleus.

Phylogenetic tree analysis of the cotton PDCB gene family
This study involved multiple sequence alignments of Arabidopsis and cotton members, followed by the construction of a phylogenetic tree from the PDCB amino acid sequences. The results showed a classification pattern similar to that of A. thaliana, in which the cotton PDCB gene family members were categorized into five distinct groups (I–V) (Fig. 1). Specifically, in Group I, cotton had seven members and A. thaliana had three. In Group II, cotton had 11 members and A. thaliana had 15. Group III had the fewest members, with six in cotton and one in A. thaliana. In Group IV, cotton had eight members and A. thaliana had two. In Group V, cotton had 13 members and A. thaliana had eight.

Analysis of PDCB gene structure and protein conserved motifs in G. hirsutum
Analysis of protein motifs using MEME identified 20 potential motifs (Fig. 2A). Motif 1 was present in all GhPDCB proteins, indicating that it is conserved within the GhPDCB family. Notably, GhPDCB6, GhPDCB7, GhPDCB8, GhPDCB9, GhPDCB10, GhPDCB11, GhPDCB18, GhPDCB19, GhPDCB20, GhPDCB22, GhPDCB33, GhPDCB34, GhPDCB36, GhPDCB38, and GhPDCB39 shared motifs 4–7, with motif 15 appearing first. In the third group, GhPDCB1 and GhPDCB2 shared motif 8 at the same position. Similarly, within the fifth group, GhPDCB9, GhPDCB10, GhPDCB11, GhPDCB38, and GhPDCB39 displayed an exact match for motif 7, in a consistent order. The distribution of protein domains within the PDCB family (Fig. 2B) revealed that eight GhPDCBs possessed domains beyond the common X8 and Glyco_hydro superfamily domains. Further analysis of gene structure (Fig. 2C) highlighted similarities in the number of introns and exons among most family members, indicating their close evolutionary relationships. Notably, the majority of the PDCB genes contained 3–4 introns, whereas a few had 1–2 introns.

Analysis of chromosome mapping of PDCB family genes
Physical mapping of the PDCB gene family revealed that 42 of the 45 genes were distributed across 20 chromosomes, and the remaining three genes were situated on scaffolds (Fig. 3). Notably, the gene count and chromosomal locations differed between the At and Dt subgenomes. For instance, chromosomes A04 and A11 lacked PDCB genes, whereas chromosomes D04 and D11 each possessed one. Conversely, chromosomes A05 and A12 each contained three genes, whereas chromosome D05 had two genes and D12 had four.

Collinearity analysis of the cotton PDCB gene family
To further investigate the PDCB gene family in upland cotton, a collinearity analysis was conducted within G. hirsutum and between G. hirsutum and G. arboreum, G. barbadense, and A. thaliana (Fig. 4). The analysis revealed homologous regions across the At and Dt subgenomes of upland cotton, demonstrating significant collinearity within the species (Fig. 4A). Interspecific comparisons identified 31 instances of collinearity between G. hirsutum and G. arboreum (Fig. 4B), 57 instances between G. barbadense and G. hirsutum (Fig. 4C), and 19 instances between G. hirsutum and A. thaliana (Fig. 4D). This analysis shows strong collinearity within G. hirsutum and between G. hirsutum and G. arboreum, G. barbadense, and A. thaliana, indicating that the PDCB genes are evolutionarily conserved.

GhPDCB gene promoter analysis
Analysis of the 2,000 bp upstream promoter regions of the PDCB genes, conducted via PlantCARE, revealed the presence of various cis-regulatory elements, as depicted in Fig. 5. Hormonal regulation plays a pivotal role in the development of cotton fibers. This analysis highlighted more cis-regulatory elements associated with hormone

Analysis of expression patterns
Based on the results of fine mapping (PDCB-3, GH_A12G2014, GhPDCB9, qFL-A12-5) and the constructed evolutionary tree, genes homologous to GhPDCB9 were selected for RT-qPCR analysis across three distinct tissue types (stem, leaf, and flower) and four developmental stages of fiber (5, 10, 15, and 20 DPA) (Figs. 6A and 6B).
RT-qPCR analysis of these 12 genes revealed that they shared similar expression patterns. These genes exhibited broad tissue expression profiles. GhPDCB9, GhPDCB22, GhPDCB10, GhPDCB11, and GhPDCB38 displayed notably higher expression levels in the stem than in other tissues. This increased expression suggests their potential involvement in cell elongation. Conversely, GhPDCB36 was predominantly expressed in flowers, suggesting a role in flower development.
GhPDCB9 and GhPDCB22 expression levels decreased consistently from 5 to 15 DPA, suggesting a potential association with fiber elongation. In this gene group, GhPDCB10, GhPDCB11, GhPDCB34, GhPDCB38, and GhPDCB12 displayed a progressive increase in expression levels from 5 to 20 DPA, primarily during late fiber development, suggesting their involvement in secondary wall biosynthesis. These findings indicate a specific relationship between these genes and fiber development.

VIGS of GhPDCB9 in TM-1
The RT-qPCR results indicated that GhPDCB9 could affect the development of cotton fiber at 5 DPA, which was consistent with findings from previous QTL research (Lu et al., 2021). Consequently, the role of GhPDCB9 was examined in detail using virus-induced gene silencing. In this experimental setup, the PDS gene, which is required for chlorophyll accumulation, served as a marker. When PDS was silenced, cotton leaves failed to accumulate chlorophyll, and new leaves bleached at later stages. This bleaching served as a positive control for successful silencing. The albino phenotype was observed following infection of the cotton plants, confirming effective gene silencing (Fig. 7A). Subsequently, DNA extracted from the leaves of the silenced cotton plants was tested with specific primers, enabling the selection of correctly silenced plants for RT-qPCR quantification. The expression levels of GhPDCB9 were verified using RT-qPCR. Notably, the silenced plants exhibited a significant decrease in GhPDCB9 expression (Fig. 7B), indicating efficient silencing in the cotton fibers. A descriptive statistical method was used to analyze the length of mature fibers. Fiber length was notably reduced after virus-induced silencing compared with that of the control group, with an average shortening ranging from 1.43 to 4.77 mm. These differences were statistically significant (Figs. 7C and 7D).
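A descriptive comparison of fiber lengths can be sketched as follows. The source does not specify which statistics or tests were computed, so this is only one plausible summary (group means, sample standard deviations, and the mean shortening); the function name and the measurements are hypothetical placeholders, not values from this study.

```python
from statistics import mean, stdev

def summarize_fiber_lengths(control_mm, silenced_mm):
    """Summarize mature fiber length (mm) for control and silenced plants."""
    control_mean = mean(control_mm)
    silenced_mean = mean(silenced_mm)
    return {
        "control_mean_mm": control_mean,
        "control_sd_mm": stdev(control_mm),
        "silenced_mean_mm": silenced_mean,
        "silenced_sd_mm": stdev(silenced_mm),
        # Positive values mean the silenced plants have shorter fibers.
        "mean_shortening_mm": control_mean - silenced_mean,
    }

# Placeholder measurements for illustration only.
summary = summarize_fiber_lengths(
    control_mm=[30.0, 31.0, 29.0],
    silenced_mm=[27.0, 26.0, 28.0],
)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A significance test (for example a two-sample t-test) would be a separate step on top of this summary.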

DISCUSSION
Cotton fiber, a vital natural resource in the textile industry, is derived from the epidermal cells of the ovule. Among cotton species, upland cotton is the most widely cultivated because of its superior fiber yield and robust adaptability, accounting for approximately 95% of the total cotton cultivation area (Yoo & Wendel, 2014). Cotton fiber development comprises four sequential phases: initiation, cell elongation, secondary wall thickening, and maturation, during which the length of the epidermal cells extends from 10–20 µm to 3–6 cm. The first two phases predominantly affect fiber number and length, whereas the latter two are associated with fiber strength and thickness (Patel et al., 2020). The development and growth of cotton fibers are influenced by several factors, including transcription factors, pH levels, and plant hormones such as IAA, GA, and brassinolide (Beasley, 1973; Raghavendra, Ye & Kinoshita, 2023; Sun et al., 2019; Suo et al., 2003). Because A. thaliana trichomes have a developmental pattern resembling that of cotton fibers, the genes promoting trichome development in A. thaliana are considered pivotal for initiating cotton fiber development (Li et al., 2023).
The plasmodesmata callose-binding protein family is extensive, featuring the X8 domain with a signal sequence that enables the attachment of glycosylphosphatidylinositol to the external surface of the plasma membrane. The X8 domain is distinguished by its conserved arrangement of one Phe residue and six Cys residues (Fig. 2), indicating its significance in carbohydrate binding (Barral et al., 2005). In Arabidopsis, a significant connection between intercellular communication and PDCB-mediated callose deposition has been established. Although this gene family has been extensively investigated in Arabidopsis, research on cotton is rare. Recent studies have identified a noteworthy association between SSR markers HAU0734 and qFL-12-5 and the physical position of G. hirsutum L., which is closely linked to GhUGT103. Gene expression patterns also indicate a relationship with fiber development (Xiao et al., 2019). Using high-density genetic maps, two QTL associated with fiber strength and six QTL related to fiber length were identified across four chromosomes. By integrating transcriptome data from the parental lines with qPCR analysis, four genes linked to the QTL were identified. Among these, plasmodesmata callose-binding protein 3 (PDCB-3, GH_A12G2014, GhPDCB9, of qFL-A12-5) has emerged as a promising candidate gene with implications for fiber length (Lu et al., 2021).
A total of 45 PDCB genes were identified in upland cotton based on the unique X8 domain associated with PDCB. They share similar conserved protein motifs, and typically consist of two exons. Further investigation revealed that each pair of collinear homologous genes exhibited identical or similar subcellular localization and sequences, suggesting that these homologs evolved in parallel and have comparable gene functions. Analysis of the 2,000 bp promoter sequences upstream of the genes revealed both hormone-related and abiotic stress-related cis-regulatory elements. Given that hormones regulate fiber development, GhPDCB9 may regulate fiber development through its involvement in hormone-related processes. Furthermore, the prevalence of cis-regulatory elements associated with abiotic stress suggests a potential relationship with plant stress resistance, which requires further research. Notably, QTL mapping identified qFL-A12-5, which was linked to fiber length, prompting the selection of 12 genes from the GhPDCB9 group for expression pattern analysis. RT-qPCR analysis revealed that the 12 genes displayed similar expression patterns. Among these, GhPDCB9, GhPDCB22, GhPDCB10, GhPDCB11, and GhPDCB38 exhibited notably higher expression levels in the stem than in the other tissues. GhPDCB36 was predominantly expressed in flowers, suggesting a role in the regulation of flower development. Subsequently, silencing of GhPDCB9 by VIGS led to a significant reduction in fiber length, with an average shortening ranging from 1.43 to 4.77 mm. These findings suggest that GhPDCB9 is associated with fiber elongation and development. Overall, these results indicate the significance of the GhPDCB family in cotton, with GhPDCB9 (GH_A12G2014) serving as a valuable genetic resource for enhancing cotton fiber yield.
• Youwu Wang conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Figure 2 Exon-intron structure and conserved motifs of PDCB genes in G. hirsutum. (A) Analysis of conserved motifs of GhPDCB protein sequences. Different motifs are shown in specific colors. (B) GhPDCB protein domain prediction. (C) Intron and exon analysis of GhPDCB genes. DOI: 10.7717/peerj.17625/fig-2

Figure 3 Location of the PDCB genes on the chromosomes. Chromosome names are shown above and gene names are shown on the right. DOI: 10.7717/peerj.17625/fig-3

Figure 6 Analysis of the expression patterns in different tissues. (A) Relative transcript abundance of each gene in the different tissues. (B) Transcript abundance of different genes during different flowering periods. DOI: 10.7717/peerj.17625/fig-6