Genome-wide identification and characterization of non-specific lipid transfer proteins in cabbage

Plant non-specific lipid transfer proteins (nsLTPs) are a group of small, secreted proteins that can reversibly bind and transport hydrophobic molecules. NsLTPs play an important role in plant development and resistance to stress. To date, little is known about the nsLTP family in cabbage. In this study, a total of 89 nsLTP genes were identified through a genome-wide search of the cabbage genome. These cabbage nsLTPs were classified into six types (1, 2, C, D, E and G). The gene structure, physical and chemical characteristics, homology, conserved motifs, subcellular localization, tertiary structure and phylogeny of the cabbage nsLTPs were comprehensively investigated. Spatial expression analysis revealed that most of the identified nsLTP genes were expressed in cabbage, and many of them exhibited differential and tissue-specific expression patterns. The expression patterns of the nsLTP genes in response to biotic and abiotic stresses were also investigated. Numerous nsLTP genes in cabbage were found to be associated with stress resistance. Moreover, the expression patterns of some nsLTP paralogs in cabbage showed evident divergence. This study advances the understanding of nsLTP characteristics in cabbage and lays the foundation for further functional studies of cabbage nsLTPs.


INTRODUCTION
Non-specific lipid transfer proteins (nsLTPs), which bind and transport various lipids, are widespread in the plant kingdom (Edstam et al., 2011). All known plant nsLTP precursors include an N-terminal signal peptide, indicating that nsLTPs are secreted proteins (Carvalho & Gomes, 2007). The mature nsLTPs are small proteins characterized by an eight-cysteine motif (8CM) with the basic form C-Xn-C-Xn-CC-Xn-CXC-Xn-C-Xn-C (José-Estanyol, Gomis-Rüth & Puigdomènech, 2004). These eight cysteines form four disulfide bonds that stabilize the three-dimensional structure of the hydrophobic cavity, which allows the binding of different lipids and hydrophobic compounds (Douliez et al., 2000; Salminen, Blomqvist & Edqvist, 2016).
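The 8CM pattern described above can be expressed as a regular expression. The following is a minimal sketch, assuming that each Xn spacer consists of one or more non-cysteine residues; the function name and the toy sequences are hypothetical and are not taken from the cabbage dataset.

```python
import re

# Eight-cysteine motif: C-Xn-C-Xn-CC-Xn-CXC-Xn-C-Xn-C.
# Assumption: every Xn spacer is at least one residue and contains no cysteine.
EIGHT_CM = re.compile(r"C[^C]+C[^C]+CC[^C]+C[^C]C[^C]+C[^C]+C")


def has_8cm(mature_seq: str) -> bool:
    """Return True if the sequence contains an 8CM-like cysteine pattern."""
    return EIGHT_CM.search(mature_seq.upper()) is not None


# Hypothetical toy sequences (not real nsLTPs).
toy_with_motif = "AACAAAACAAAACCAAAACACAAAACAAAACAA"
toy_without_cc = "AACAAAACAAAACAAAACACAAAACAAAACAA"  # adjacent CC pair missing

print(has_8cm(toy_with_motif))
print(has_8cm(toy_without_cc))
```

A pattern of this kind only screens for the cysteine backbone; it does not distinguish nsLTPs from other proteins that share the 8CM, such as protease inhibitors or storage proteins.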

Identification of cabbage nsLTP genes
The entire cabbage proteome was downloaded from the B. oleracea database (Bolbase, http://www.ocri-genomics.org/bolbase). Proteins with the HMMPfam domain PF00234 (protease inhibitor/seed storage/LTP family) in the cabbage proteome were identified using HMMER3 (Eddy, 2011). BLASTP was also used to identify the cabbage nsLTPs. All known A. thaliana nsLTP sequences were obtained from The Arabidopsis Information Resource (http://www.arabidopsis.org). These Arabidopsis sequences were then used as queries in a BLASTP search against the cabbage protein database in Bolbase with an e-value threshold of 1e-3. The newly identified cabbage nsLTP amino acid sequences were assessed by analyzing the 8CM backbone. Proteins lacking essential cysteines were discarded.

Amino-acid sequence analysis
The signal peptide cleavage sites in the candidate nsLTP precursors were analyzed using the SignalP 4.1 program (Petersen et al., 2011). Proteins without N-terminal signal sequences (NSSs) were removed. The Arabidopsis protease inhibitor and storage protein sequences were compared with the remaining candidate nsLTPs to identify and discard potential protease inhibitors and storage proteins. The C-terminal glycosylphosphatidylinositol-anchored (GPI-anchored) signals in the candidate nsLTP proteins were analyzed with the big-PI Plant Predictor program (Eisenhaber et al., 2004). The subcellular localization of the nsLTPs was predicted with TargetP 1.1 and the Plant-PLoc server. After removing the signal peptide, the theoretical isoelectric point (pI) and molecular weight (Mw) of the mature nsLTP proteins were calculated with the Compute pI/Mw tool provided by ExPASy (http://web.expasy.org/compute_pi/). MEME software (v4.12.0) was used to search for motifs in all 89 cabbage nsLTP proteins, with a motif width of 6 to 50 amino acids (Bailey et al., 2009). The three-dimensional structures of the cabbage nsLTPs were constructed and analyzed with Phyre2 and PyMOL (Kelley et al., 2015).
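As an illustration of the Mw step, the sketch below removes a signal peptide of known length and sums average residue masses plus one water molecule. It is not the ExPASy implementation; the function name and the toy sequences are hypothetical, and disulfide bonds and other modifications are ignored.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water molecule for the free N- and C-termini


def mature_mw(precursor: str, signal_len: int) -> float:
    """Average molecular weight (Da) of the precursor minus its signal peptide.

    Assumption: signal_len comes from a separate prediction step and the
    sequence contains only the 20 standard residues.
    """
    mature = precursor.upper()[signal_len:]
    return sum(RESIDUE_MASS[aa] for aa in mature) + WATER


# Hypothetical toy precursor: two-residue "signal peptide" followed by AAC.
print(round(mature_mw("MKAAC", 2), 2))
```

Real precursors would carry signal peptides of 17-34 residues, so `signal_len` would be set per protein from the predicted cleavage site.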

Orthologous analysis and chromosome localization
The Arabidopsis nsLTP gene and protein sequences were used as queries in a BLAST search against the cabbage genome and proteome database, with a coverage of ≥0.75 and an e-value of 1e-10. The syntenic orthologous genes in cabbage and Arabidopsis were identified based on gene collinearity and sequence similarity (e-value 1e-20). All nsLTP genes were mapped to the cabbage chromosomes based on the genome information obtained from Bolbase. The chromosome localization map was made using MapInspect software (http://mapinspect.software.informer.com/).
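The coverage and e-value thresholds can be applied to BLAST tabular output as sketched below. The sketch assumes the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and defines coverage as the aligned fraction of the query, which is an assumption because the text does not state how coverage was computed. The identifiers in the example are hypothetical.

```python
def filter_hits(lines, query_lengths, min_cov=0.75, max_evalue=1e-10):
    """Keep (query, subject) pairs passing the coverage and e-value cut-offs.

    lines: rows of 12-column BLAST tabular output (tab-separated).
    query_lengths: dict mapping each query ID to its sequence length.
    Assumption: coverage = aligned query span / query length.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if coverage >= min_cov and evalue <= max_evalue:
            kept.append((qseqid, sseqid))
    return kept


# Hypothetical hits for one query of length 110.
toy_hits = [
    "query1\thitA\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t200",  # passes
    "query1\thitB\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-30\t100",     # low coverage
    "query1\thitC\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-5\t50",    # weak e-value
]
print(filter_hits(toy_hits, {"query1": 110}))
```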

RNA-seq data and bioinformatic analysis
The expression levels of the nsLTP genes in seven different cabbage organs (root, callus, leaf, stem, bud, flower and silique) were investigated using RNA-seq data under the accession number GSE42891 in the Gene Expression Omnibus database. The expression patterns of the nsLTP genes in response to low and high temperatures were analyzed based on the RNA-seq data (ID: NN-0259-000003, NN-0259-000004, NN-0252-000006 and NN-0252-000003) from the National Agricultural Biotechnology Information Center (http://nabic.rda.go.kr/). An analysis of nsLTP gene expression in response to black rot disease in disease-resistant and disease-susceptible cabbage was carried out based on RNA-seq data (SRA098802) from the NCBI Sequence Read Archive (SRA) database. RNA-seq data (SRP091687) from male sterile and fertile buds were downloaded from the SRA database.
After processing of the raw sequences, clean reads were aligned against the B. oleracea genome (http://www.ocri-genomics.org/bolbase/) using Tophat (v2.0.12). The transcript abundance of nsLTP genes was calculated as fragments per kilobase of exon model per million mapped reads (FPKM) or reads per kilobase per million mapped reads (RPKM). The differential gene expression analysis was performed using DESeq (v1.16). Genes with an FDR-adjusted P-value < 0.05 and a fold change >2 were identified as differentially expressed genes (DEGs).
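The FPKM definition and the DEG criteria can be written out as follows. This is a minimal sketch rather than the DESeq workflow itself: the function names are hypothetical, and the fold-change cut-off is assumed to apply in both directions (a more than two-fold increase or decrease).

```python
import math


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon model per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def is_deg(fold_change: float, padj: float,
           fc_cutoff: float = 2.0, alpha: float = 0.05) -> bool:
    """Apply the thresholds from the text: adjusted P < 0.05 and fold change > 2.

    Assumption: the cut-off is symmetric, so a fold change below 1/2 also counts.
    """
    return padj < alpha and abs(math.log2(fold_change)) > math.log2(fc_cutoff)


# Hypothetical numbers: 500 fragments on a 1,000 bp transcript, 20 million mapped.
print(fpkm(500, 1000, 20_000_000))
print(is_deg(4.0, 0.01), is_deg(1.5, 0.01), is_deg(4.0, 0.2))
```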

RNA extraction and subcellular localization
Total RNA from cabbage (line 02-12) leaves was isolated using the RNAprep Pure Plant Kit (TIANGEN, Beijing, China) according to the manufacturer's instructions. Total RNA was reverse-transcribed using the PrimeScript™ RT reagent kit (TaKaRa, Kyoto, Japan). The full coding sequences of Bol014756 (BoLTP1.7) and Bol021902 (BoLTP2.3) were PCR-amplified with the primers PF1, PR1, PF2 and PR2 (Table S1) and inserted into a pBWA(V)HS-GFP vector, resulting in an N-terminal fusion with GFP under the control of the constitutive CaMV35S promoter. The fusion constructs were introduced into tobacco leaf epidermis as previously described (Sparkes et al., 2006). The fluorescence signals were detected using a confocal laser-scanning microscope (Nikon C1, Tokyo, Japan).

Identification of the putative nsLTP genes in cabbage
The nsLTP genes were identified using the HMM search program in HMMER3 and a BLAST search against the cabbage proteome. Initially, a total of 135 proteins with the conserved HMMPfam domain PF00234 were retrieved. The cysteine residue patterns of these protein sequences were then analyzed. Fifteen proteins lacking the essential cysteine residues were omitted. After that, 10 proteins lacking NSSs were also excluded (Table S2). Moreover, 12 proteins similar to protease inhibitors or storage proteins and nine hybrid proline-rich proteins were also discarded (Table S2). As a result, a total of 89 nsLTPs, designated BoLTPs in this study, were identified in the cabbage genome (Table 1). Based on the orthology analysis, 74 BoLTP genes were found to have orthologous relationships with 42 A. thaliana nsLTP genes (Table 1). Among these orthologous genes, 49 BoLTP genes were syntenic orthologues of 27 A. thaliana nsLTP genes (Fig. 1A).
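The successive filtering steps can be tallied to confirm that they lead from the 135 initial candidates to the 89 BoLTPs. The sketch below uses only the counts reported above; the variable names are illustrative.

```python
initial_candidates = 135  # proteins with the PF00234 domain

# Number of proteins removed at each filtering step, in the order reported.
removed = [
    ("lacking essential cysteine residues", 15),
    ("lacking an N-terminal signal sequence", 10),
    ("similar to protease inhibitors or storage proteins", 12),
    ("hybrid proline-rich proteins", 9),
]

remaining = initial_candidates
for reason, count in removed:
    remaining -= count
    print(f"after removing {count} proteins {reason}: {remaining} remain")

final_bo_ltps = remaining
```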

Classification of nsLTP genes and their distribution in chromosomes
The plant nsLTPs can be divided into four major and several minor types according to intron position, sequence identity and spacing between the Cys residues in the 8CM, as well as post-translational modifications (Edstam et al., 2011; Salminen, Blomqvist & Edqvist, 2016). The classification began by assigning the BoLTPs with a GPI modification site to type G. In the second round of classification, the remaining BoLTPs were sorted based on the identity matrix calculated from the multiple sequence alignments. When compared with the classification proposed by Edstam et al. (2011) and Salminen, Blomqvist & Edqvist (2016), we found that 80 out of the 89 BoLTPs could be categorized into six types (1, 2, C, D, E and G). Type 1, type 2, type D and type G nsLTPs, which encompassed 19, 12, 18 and 28 nsLTP genes, respectively, represented a large proportion of the BoLTPs. Moreover, nine cabbage proteins displayed less than 30% identity with all other studied BoLTPs, so they were listed individually and named BoLTPx1 to BoLTPx9 (Table 1). The chromosomal location of each BoLTP gene was confirmed based on the cabbage genome information in Bolbase. A total of 68 (76.4%) BoLTP genes were distributed across the nine chromosomes, and the rest were located on unanchored scaffolds (Fig. 1B). Seven genes each were located on chromosomes 1 and 4, nine each on chromosomes 2 and 8, 15 on chromosome 3, five each on chromosomes 5, 6 and 9, and six on chromosome 7. In cabbage, seven tandem arrays of direct repeats comprising 16 nsLTP genes were identified (Fig. 1B). A tandem of two duplicated nsLTP genes was present on each of chromosome 1 (BoLTP1.6 and BoLTP1.7), chromosome 3 (BoLTPd13 and BoLTPd14), chromosome 4 (BoLTP1.13 and BoLTP1.14), chromosome 9 (BoLTPd17 and BoLTPd18) and Scaffold000118_P2 (BoLTPx4 and BoLTPx5). A tandem of three duplicated nsLTP genes was present on each of chromosome 2 (BoLTPd7, BoLTPd8 and BoLTPd9) and chromosome 8 (BoLTPg7, BoLTPg8 and BoLTPg9).
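The chromosome distribution and the tandem counts can be cross-checked with a short calculation. The sketch below uses only the numbers stated in the text, reading the per-chromosome counts as applying to each chromosome named; the variable names are illustrative.

```python
total_bo_ltps = 89

# Genes per chromosome, reading each stated count as "per chromosome".
genes_per_chromosome = {1: 7, 2: 9, 3: 15, 4: 7, 5: 5, 6: 5, 7: 6, 8: 9, 9: 5}

anchored = sum(genes_per_chromosome.values())
anchored_pct = round(100 * anchored / total_bo_ltps, 1)
on_scaffolds = total_bo_ltps - anchored

# Five tandems of two genes and two tandems of three genes.
tandem_sizes = [2, 2, 2, 2, 2, 3, 3]

print(anchored, anchored_pct, on_scaffolds)
print(len(tandem_sizes), sum(tandem_sizes))
```

Under this reading the totals agree with the reported 68 anchored genes (76.4%) and with seven tandems comprising 16 genes.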

Characteristics of cabbage nsLTPs
The characteristics of the 89 BoLTPs are summarized in Table 1. All the BoLTP protein precursors possess a signal peptide of 17-34 amino acids. The putative subcellular localization of BoLTPs was analyzed. As expected, most of the proteins are predicted to be secreted; the exceptions are BoLTPe1, BoLTPg3 and BoLTPg26, for which a different localization was predicted (Table 1). Except for type G nsLTPs, the Mws of the mature BoLTPs usually range from 6,729 to 11,906 Da, indicating that the cabbage nsLTP genes encode small proteins. Because the mature type G nsLTPs have more amino-acid residues at the C-terminus than the other mature BoLTP proteins, they have much higher Mws (ranging from 12,333.19 to 29,369.39 Da). Among the 89 BoLTPs, 57 display a basic pI (7.11-11.42) and the rest show an acidic pI (3.63-6.99). The gene ontology categories of BoLTPs are shown in Fig. 1C.
To further analyze the characteristics of the BoLTPs, a multiple sequence alignment of mature BoLTPs was conducted using MAFFT. The 8CMs of the 89 BoLTPs are conserved (Fig. S1). Moreover, the amino acid sequence alignment of the 8CMs of BoLTPs reveals a variable number of inter-cysteine amino acid residues (Table 2). To better understand the ligand binding and tertiary structure of BoLTPs, BoLTP1.5, BoLTP2.1 and BoLTPd15 were selected as representative sequences of types 1, 2 and D for structural modelling. The structural analysis showed that BoLTP1.5 includes two conserved pentapeptides, T-P-V-D-R (positions 60-64) and P-Y-S-I-S (positions 100-104). It has been reported that these two consensus pentapeptides (T/S-X-X-D-R/K and P-Y-X-I-S) play an important role in catalysis or binding (Douliez et al., 2000). As shown in Fig. 2, the 3D structures of these three BoLTPs were predicted and analyzed by Phyre2 and PyMOL. Each BoLTP possesses a compact α-helical domain consisting of four or five α-helices connected by short loops (Fig. 2). Four disulfide bonds formed by the eight cysteine residues stabilize the three-dimensional structure of the hydrophobic cavity. In BoLTP1.5, the Cys residues 1-6, 2-3, 4-7 and 5-8 are paired (Fig. 2A), whereas the Cys residues 1-5, 2-3, 4-7 and 6-8 are paired in BoLTP2.1 and BoLTPd15 (Figs. 2B and 2C). This difference indicates that the central residue of the CXC motif may influence the Cys pairing and the overall fold of the protein. As shown in Fig. S1, the X position of the CXC motif is hydrophilic in type 1 BoLTPs, whereas X is a hydrophobic residue in the CXC motif of type 2 and type D BoLTPs. In addition to the disulfide bonds, many H-bonds contribute to the stabilization of the three-dimensional structures of BoLTPs. The particular folding structure forms an internal tunnel-like cavity that can bind different hydrophobic molecules.

Table 2 Some characteristics of the different types of non-specific lipid transfer proteins found in cabbage.
Type    GPI-anchored    Spacing pattern
1       No              C X9 C X13,14,16 CC X19 CXC X19,21,22,23,24 C X13 C
2       No              C X7 C X13 CC X8 CXC X23 C X5,6 C
C       No              C X9 C X16 CC X9 CXC X12 C X6 C
D       No              C X9,10,14 C X14,15,16,17,19 CC X9,11,12 CXC X19,22,24 C X6,7,8,9,10 C
E       No              C X13 C X15 CC X9 CXC X22 C X6 C
G       Yes             C X6,9,10 C X11,13,14,16,17,18 CC X12 CXC X23,24,25,26,29 C X5,7,8,9 C
Note: Character "X" represents any amino acid, and the Arabic numeral following "X" gives the number of amino acid residues.
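Spacing patterns of the kind listed in Table 2 can be derived from a mature sequence by counting the residues between consecutive cysteines. The following is a minimal sketch, assuming the sequence contains exactly the eight cysteines of the 8CM; the function name and the toy sequence are hypothetical.

```python
def cys_spacings(mature_seq: str) -> list:
    """Number of residues between each pair of consecutive cysteines.

    For an 8CM protein this yields seven values: the third is 0 (the CC pair)
    and the fifth is 1 (the CXC motif).
    """
    positions = [i for i, aa in enumerate(mature_seq.upper()) if aa == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


# Hypothetical toy sequence built to follow one type 1 pattern from Table 2:
# C X9 C X13 CC X19 CXC X19 C X13 C
toy_type1 = ("C" + "A" * 9 + "C" + "A" * 13 + "CC" + "A" * 19
             + "CAC" + "A" * 19 + "C" + "A" * 13 + "C")
print(cys_spacings(toy_type1))
```

Sequences with additional cysteines outside the 8CM would need those positions excluded before the spacings are compared with the table.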

Phylogenetic analysis of the nsLTP family
To analyze the phylogenetic relationships of the nsLTPs among M. polymorpha, Physcomitrella patens, S. moellendorffii, Adiantum capillus-veneris, Pinus taeda, O. sativa, Arabidopsis thaliana and B. oleracea, 187 nsLTPs from these eight species were analyzed (Table S3). These nsLTP sequences were aligned using MAFFT (v7.037). Subsequently, an approximately-maximum-likelihood phylogenetic tree was constructed from the multiple sequence alignment based on the WAG+CAT model. Previously, the plant nsLTPs have been divided into 10 types (Edstam et al., 2011). Based on comparison with the previous dataset, the six groups defined in this study were in agreement with types 1, 2, C, D, E and G of nsLTPs. As shown in the phylogenetic tree, members of types 1, 2, C and E formed specific clades, suggesting that the nsLTPs in each of these types share a common ancestor (Fig. 3; Fig. S2). Although types D and G can be well distinguished from other types, they did not form separate clades (Fig. 3; Fig. S2). In particular, the nsLTPs of type G were divided into four subtypes (named G1-G4) (Fig. 3). It is also worth noting that no type E nsLTP was found in monocotyledonous plants, which may have lost these genes during the evolutionary divergence between monocots and dicots. Generally, the A. thaliana and B. oleracea nsLTPs are closer to each other than to those of the other species in each group of the phylogenetic tree, consistent with the close evolutionary relationship between A. thaliana and B. oleracea.

Expression analysis of nsLTP genes
To explore the spatial expression patterns of the BoLTP genes, RNA-seq data from seven different organs (root, stem, leaf, callus, bud, flower and silique) were used for an expression analysis of BoLTPs. The expression level of each BoLTP gene was estimated by the FPKM value, and the genes with FPKM ≥1 were identified as truly expressed genes (Yao et al., 2015). In this study, 72 (81%) BoLTP genes were expressed in at least one of the seven organs (Fig. 4A; Table S4). Interestingly, several BoLTP genes, such as BoLTP1.7, BoLTP1.10, BoLTP1.18, BoLTP2.2, BoLTP2.12 and BoLTPx2, were specifically expressed in the buds. However, these genes were significantly downregulated in the buds of 83121A, which is a male sterile mutant with a defective exine (Fig. 4E; Table S5).
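The expression criterion can be sketched as a simple filter over a gene-by-organ FPKM table. The gene names and values below are hypothetical, and the definition of organ-specific expression used here (FPKM ≥1 in one organ only) is an assumption, since the text does not define it.

```python
def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Genes with FPKM >= threshold in at least one organ."""
    return sorted(
        gene for gene, organs in fpkm_by_gene.items()
        if max(organs.values()) >= threshold
    )


def organ_specific(fpkm_by_gene, organ, threshold=1.0):
    """Genes expressed in `organ` and in no other organ (assumed definition)."""
    hits = []
    for gene, organs in fpkm_by_gene.items():
        expressed_in = [name for name, value in organs.items() if value >= threshold]
        if expressed_in == [organ]:
            hits.append(gene)
    return sorted(hits)


# Hypothetical FPKM values for three genes in three organs.
toy_fpkm = {
    "geneA": {"root": 0.2, "leaf": 0.0, "bud": 35.1},
    "geneB": {"root": 12.0, "leaf": 8.4, "bud": 3.3},
    "geneC": {"root": 0.1, "leaf": 0.5, "bud": 0.9},
}
print(expressed_genes(toy_fpkm))
print(organ_specific(toy_fpkm, "bud"))

# Share of expressed BoLTP genes reported in the text: 72 of 89.
expressed_pct = round(100 * 72 / 89)
```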
To analyze the relationship between BoLTPs and cabbage resistance to biotic and abiotic stress, the expression of BoLTPs was compared between resistant and susceptible cabbage materials. Based on an analysis of RNA-seq data from cabbage treated with cold stress for 2 h, eight BoLTPs showed significant expression differences between the cold-tolerant BN10600 and the cold-susceptible BN10700. Of these eight DEGs, six were downregulated and two were upregulated in BN10700 (Fig. 4B; Table S5). Moreover, RNA-seq data from two cabbage lines, the heat-tolerant BN1HS and the heat-susceptible BN2HS, that had been treated by heat shock for 1 h, were used to explore the responses of BoLTP genes to high temperature. As shown in Fig. 4C, seven DEGs were identified between BN1HS and BN2HS (Table S5). Of these DEGs, five were downregulated and two were upregulated in BN2HS. Black rot caused by Xanthomonas campestris pv. campestris is a major disease of cabbage. To explore the black rot resistance genes in cabbage, a differential expression analysis between two cabbage parental lines, the black rot-resistant C1234 and the susceptible C1184, was also carried out. According to Fig. 4D, seven DEGs were identified between C1234 and C1184 (Table S5). Of these DEGs, three were downregulated and four were upregulated in C1184. Notably, four of these DEGs (BoLTP1.6, BoLTP1.8, BoLTPd2 and BoLTPd11) may be related to resistance to at least two types of stress.
The alteration of expression patterns is considered an important indicator of functional divergence between duplicated genes (Makova & Li, 2003; Ganko, Meyers & Vision, 2007; Hellsten et al., 2007). The differential expression analysis of the duplicated BoLTPs indicated significant expression differentiation among the paralogous BoLTPs (Fig. 5). For example, BoLTP1.1, BoLTP1.6 and BoLTP1.7 are orthologous to AT3G51590, which is known to be involved in sexual reproduction in A. thaliana (Ariizumi et al., 2002; Chae et al., 2009). BoLTP1.1 and BoLTP1.7 were mainly expressed in the floral organs, while BoLTP1.6 was mainly expressed in the leaves and significantly upregulated in the black rot disease-resistant C1234 (Figs. 4D and 5). As another example, BoLTPd7, BoLTPd8, BoLTPd9, BoLTPd12 and BoLTPd13 are orthologous to AT5G55450, which is involved in disease resistance in A. thaliana (McLaughlin et al., 2015). BoLTPd7 was expressed in every investigated organ, BoLTPd8, BoLTPd9 and BoLTPd13 were not expressed in any of the investigated organs, and BoLTPd12 was expressed in every investigated organ except the root. Moreover, BoLTPd7 and BoLTPd8 responded to cold and heat stress, but their responses to the stresses were opposite (Figs. 4B and 4C). BoLTPd9 responded to cold stress and was downregulated in the cold-susceptible BN10700 (Fig. 4B). These results indicate that duplicated genes with different expression patterns may possess different physiological functions.

DISCUSSION
In this study, 89 genes putatively encoding nsLTPs in cabbage were identified. Based on the classification system described by Edstam et al. (2011) and Salminen, Blomqvist & Edqvist (2016), these BoLTPs could be classified into six types (1, 2, C, D, E and G). Bioinformatic analysis predicted that most BoLTPs are secreted proteins. To verify these predictions, the bud-specific BoLTP1.7 and the black rot-responsive BoLTP2.3 were selected as representative proteins for subcellular localization. The fluorescence signals of BoLTP1.7 and BoLTP2.3 fused with GFP were detected in the extracellular environment, indicating that they are secreted proteins (Fig. S3). Orthology analysis showed that most of the BoLTP genes have orthologous nsLTP genes in A. thaliana, indicating that the cabbage nsLTP genes were derived from a common ancestor shared with A. thaliana. Compared with the Arabidopsis genes, the numbers of cabbage nsLTPs in types 1 and D were expanded, while the gene numbers of the other types were reduced or unchanged. It is worth mentioning that 39 Arabidopsis nsLTP genes did not have a BoLTP ortholog. These results indicate that not only gene duplications and triplications but also gene loss or mutation occurred during the evolution of cabbage.
Many studies have suggested that nsLTPs participate in sexual reproduction processes, such as pollen development, pollen exine formation and fertilization (Ariizumi et al., 2002; Chae et al., 2009; Huang, Chen & Huang, 2013; Edstam & Edqvist, 2014). Promoter activity analysis has suggested that AT3G51590 (LTP12) is specifically expressed in the tapetum at the uninucleate microspore stage and the bicellular pollen stage (Ariizumi et al., 2002). The type C nsLTP with exclusive expression in the tapetum, AT5G52160, has been shown to contribute to pollen exine formation in Arabidopsis (Huang, Chen & Huang, 2013). In this investigation, a large number of BoLTP genes were found to be specifically or highly expressed in buds or flowers (Fig. 4A). Among these genes, BoLTP1.1 and BoLTP1.7 are orthologous to AT3G51590, and BoLTPc1 is orthologous to AT5G52160. Moreover, BoLTP1.7, BoLTP1.10, BoLTP1.18, BoLTP2.2, BoLTP2.12 and BoLTPx2 were significantly downregulated in the buds of a male sterile mutant with defective exine. Notably, these nsLTP proteins were also significantly less abundant in the male sterile mutant buds (Ji et al., 2018). These results suggest that the BoLTPs specifically or highly expressed in floral organs may play an important role in the sexual reproduction process in cabbage.
There is much evidence that nsLTPs are related to resistance to various types of stress, including phytopathogens, freezing, drought and salt (Molina & García-Olmedo, 1997; Hincha, 2002; Jung, Kim & Hwang, 2003, 2005; McLaughlin et al., 2015). Some Arabidopsis nsLTPs have been classified as pathogenesis-related PR-14 proteins, and most of them belong to the type 1 nsLTP group (Sels et al., 2008). In this study, we found that most of these pathogenesis-related Arabidopsis genes have BoLTP orthologs in cabbage. According to recent studies, the paralogous genes AT5G59310 (AtLTP1.11) and AT5G59320 (AtLTP1.12) could negatively regulate plant immunity in Arabidopsis. The overexpression of AT5G59320 (AtLTP1.12) enhances the susceptibility of Arabidopsis to virulent bacteria and reduces the resistance of Arabidopsis to avirulent bacteria (Gao et al., 2015). In contrast, the double mutant AtLTP1.11/AtLTP1.12 showed an increased resistance to Pseudomonas (Gao et al., 2015). Similar to the orthologous genes AtLTP1.11 and AtLTP1.12, BoLTP1.8 and BoLTP1.9 showed high expression levels in the black rot-susceptible C1184 and undetectable expression levels in the black rot-resistant C1234, indicating that the high expression of these two genes may contribute to black rot disease susceptibility in cabbage. Furthermore, an analysis of stress-relevant cis-elements in the promoter regions of the BoLTP genes was conducted. The results show that the promoters of the BoLTP genes each possess at least one stress-related cis-element (Fig. 6), suggesting that the BoLTPs are involved in the stress response. Taken together, the BoLTP genes identified in this study could be used in the further search for stress-resistance genes in cabbage and other Brassica crops.
Moreover, the expression analyses among paralogs indicated that some duplicated BoLTP genes showed significantly different expression patterns (Fig. 5). Previous studies have shown that the alteration of expression patterns is an important indicator of functional divergence between duplicated genes (Makova & Li, 2003;Ganko, Meyers & Vision, 2007;Hellsten et al., 2007). In other words, functional divergence could eliminate the redundancy of these duplicated BoLTP genes, which may be beneficial in multiple biological processes, such as cabbage growth, sexual reproduction and resistance to stress. However, in-depth studies should be carried out to reveal the biological functions of the BoLTPs in the development of different organs and in resistance to stress.

CONCLUSIONS
A total of 89 BoLTPs were identified through a genome-wide search. These genes were classified into six different types (1, 2, C, D, E and G). The tertiary structures, phylogenetic relationships and gene expression of the BoLTPs were also characterized. The expression analysis points to the functional importance of BoLTPs in sexual reproduction and stress response. It is important to continue to reveal the functions of these BoLTPs through experiments such as overexpression or knock-down, followed by detailed phenotypic investigations. Moreover, the BoLTPs identified in this