RNA-seq and phytohormone analysis reveals the culm color variation of Bambusa oldhamii Munro

Background The clumping bamboo Bambusa oldhamii Munro, known as “green bamboo”, is famous for its edible bamboo shoots and fast-growing timber. The green and yellow striped-culm B. oldhamii variety, named B. oldhamii f. revoluta W.T. Lin & J. Y. Lin, is an attractive system for researching the culm color variation of B. oldhamii. Methods Millions of clean reads were generated and assembled into 604,900 transcripts, and 383,278 unigenes were acquired with RNA-seq technology. The quantification of ABA, IAA, JA, GA1, GA3, GA4, and GA7 was performed using HPLC–MS/MS platforms. Results Differential expression analysis showed that 449 unigenes were differentially expressed genes (DEGs), among which 190 DEGs were downregulated and 259 DEGs were upregulated in B. oldhamii f. revoluta. Phytohormone contents, especially GA1 and GA7, were higher in B. oldhamii. Approximately 21 transcription factors (TFs) were differentially expressed between the two groups: the bZIP, MYB, and NF-YA transcription factor families had the most DEGs, indicating that those TFs play important roles in B. oldhamii culm color variation. RNA-seq data were confirmed by quantitative RT-PCR analysis of the selected genes; moreover, phytohormone contents, especially those of ABA, GA1 and GA7, were differentially accumulated between the groups. Our study provides a basal gene expression and phytohormone analysis of B. oldhamii culm color variation, which could provide a solid fundamental theory for investigating bamboo culm color variation.


INTRODUCTION
Bamboo belongs to the Poaceae subfamily Bambusoideae, comprises over 1,600 bamboo species, and is extensively distributed in tropical and subtropical regions, such as Africa, South America, and South Asia (Vorontsova et al., 2016). Owing to its rapid growth, bamboo is widely used as a material in the biofuels, charcoal, timber, craft, furniture, house building, and paper-making industries (Ramakrishnan et al., 2020). Based on the rhizome (Allan, Hellens & Laing, 2008). In apples, a cold-induced bHLH transcription factor, MdHLH3, can interact with its MYB partner to regulate the expression of anthocyanin biosynthesis genes MdDFR and MdUFGT and influence fruit coloration (Xie et al., 2012). MBW (MYB-bHLH-WDR) complexes can control flavonoid biosynthesis by regulating late biosynthetic gene expression (Xu, Dubos & Lepiniec, 2015).
The genome size of B. oldhamii was measured (Zhou et al., 2017), and the chloroplast genome size was found to be 139,350 bp (Wu et al., 2009). Phenylalanine ammonia-lyase (PAL) is a key enzyme in phenylalanine metabolism, phenylpropanoid biosynthesis, metabolic pathways, and the biosynthesis of secondary metabolites, and PAL influences the biosynthesis of lignins, alkaloids, flavonoids, and anthocyanins. The BoPAL gene was isolated from B. oldhamii and it has similar biochemical properties to those of PALs from other plants (Hsieh et al., 2010a;Hsieh et al., 2010b;Hsieh et al., 2011). RNA-seq technology changed the method of studying the transcriptome and exploring gene structure and expression (Shendure, 2008;McIntyre et al., 2011). In this paper, RNA-seq technology was applied to investigate the culm color variation of B. oldhamii. Key differentially expressed genes and transcription factors were discovered by comparing culm skin samples between B. oldhamii and B. oldhamii f. revoluta. Our results provide scientific and theoretical implications for understanding bamboo culm color variations.

Plant materials
The middle and lower culm internode epidermis samples that were removed from B. oldhamii (LZ) were labeled LZ_1, LZ_2, and LZ_3, and those from B. oldhamii f. revoluta W.T. Lin & J. Y. Lin (HLZ) were labeled HLZ_1, HLZ_2, and HLZ_3, representing three biological replicates of each type of bamboo. The culm skin samples were frozen in liquid nitrogen immediately for further phytohormone detection, RNA-seq, and relative gene expression. Total RNA was isolated using plant RNA isolation kits (Tiangen Biotech, Beijing, China).

Library construction and sequencing
Library construction and sequencing steps were performed based on the Illumina HiSeq platform for RNA-seq protocols (https://www.illumina.com) at Wuhan Metware Biotechnology Co., Ltd. (Wuhan, China). The output data contained raw reads in fastq format, that were then processed for quality control, including filtering and trimming of low confidence bases, biased nucleotide composition, adapters, duplicates and low-quality reads to acquire clean reads. Trinity (2.6.6) (Grabherr et al., 2011) and Corset (1.07) (Davidson & Oshlack, 2014) were used to assemble the clean data and process the transcript cluster analysis, and the longest transcripts in each cluster were filtered out as unigenes. All of the following analyses were performed using unigene sequences.

Gene annotation
Nr (NCBI nonredundant protein sequences), Pfam (Protein family), KOG (Protein family), Swiss-Prot, Trembl, KEGG (Kyoto Encyclopedia of Genes and Genomes), and GO (Gene Ontology) were used for gene annotation. The Nr, KOG, Swiss-Prot, Trembl, KEGG, and GO annotations were performed using BLAST (v2.7.1) with an e-value = 1e−5, and the Pfam annotation was performed using the hmmscan command of the HMMER 3.2 package with e-value = 0.01. Transcription factor annotation was performed with iTAK (1.7a) (Zheng et al., 2016) with default parameters.

Gene expression analysis
We used the assembled transcriptome of Trinity as a reference and then mapped the clean reads of each sample to the reference with RSEM software. FPKM (Fragments Per Kilobase of transcript per Million fragments mapped) values were calculated to estimate gene expression and abundance after normalization of the mapped reads and transcript lengths. The R package Pheatmap was used to draw a heatmap with normalized log2(FPKM+1) data and clusters of expression patterns with kmeans_k = 10. The color from red to blue indicates gene expression from high to low.

Differential expression analysis
After acquiring the abundance information and performing normalization, gene expression between the groups was compared. DESeq2 (1.22.2) was used to calculate the differentially expressed genes between the LZ and HLZ groups, which were corrected with FDR (False Discovery Rate) by Benjamini-Hochberg methods. Differentially expressed genes were filtered with the condition of |log2(Fold Change)| ≥ 1 and FDR < 0.05.

Validation of RNA-seq analysis via qRT-PCR
The RNA-seq results were validated for selected genes using qRT-PCR assays. cDNA was synthesized with HiScript R II Q RT SuperMix for qPCR (Vazyme, China). Quantitative Real-Time PCR (qRT-PCR) was performed on a LightCycler R 480 II Real-Time PCR system (Roche International Diagnostics system, Switzerland) using the Unique Aptamer TM qPCR SYBR R Green Master Mix. The components of the qRT-PCR were as follows: SYBR Premix Ex Taq (2x) (10 µl), forward primers (0.5 µM), reverse primers (0.5 µM), cDNA template (2 µl), and ddH 2 O to 20 µl. Then, qRT-PCR was performed as follows: initial denaturation at 95 • C for 5 min; 40 cycles of denaturation at 95 • C for 10 s and annealing at 72 • C; and finally, steps for melt-curve analysis (95 • C for 15 s, 60 • C for 60 s, 95 • C for 15 s). Actin was used as the internal control (Zeng et al., 2015; Table S1), and relative expression was calculated with the 2 − CT method (Livak & Schmittgen, 2001).

Statistical analysis
The enrichment of up-and downregulated genes was determined using GOseq and KOBAS (Mao et al., 2005). The cor.test function was used to calculate the correlation between phytohormone contents and gene expression within the corresponding periods. Bar chart data of the phytohormone contents are reported as the mean ± SEM (n = 9) with a significant difference (p < 0.05) according to unpaired t -tests.

Plant materials
B. oldhamii (LZ) is a species of clumping bamboo ( Fig. 1) and it has entirely green culms. The B. oldhamii variety referred to as B. oldhamii f. reboluta W.T. Lin & J. Y. Lin (HLZ) has green culms with yellow stripes of random widths. The culm skin was removed from LZ and HLZ to research the correlation of culm color variation on the phytohormone contents and gene expression levels.

Transcriptome sequences and data output
Total RNA was isolated from the culm epidermis samples of B. oldhamii with three biological replicates marked LZ_1 -LZ_3 and from B. oldhamii f. revoluta with three biological replicates marked HLZ_1 -HLZ_3. After the cDNA library was constructed and sequenced, approximately 282 million raw sequence reads were obtained from the RNA-seq experiment, and 267 million clean sequence reads remained after filtering with a Q20 above 98% after quality control was performed. The error correction and GC content are shown as follows (Table 1).

De novo assembly
The clean data were used for de novo assembly with Trinity (Grabherr et al., 2011), and overall, 604,900 transcripts and 383,278 unigenes were generated ( Table 2). The average length of the transcripts was 670 bp, with an N50 of 1,033 bp and an N90 of 264 bp. Most (80%) of the transcripts were between 200 and 1,000 bp, while the remaining 20% of the transcripts had a length longer than 1,000 bp. The average length of the unigenes was 906 bp, with an N50 of 1,221 bp and an N90 of 433 bp. About 70% of the unigenes were shorter than 1,000 bp, while the remaining 30% of the unigenes were longer than 1,000 bp.

Gene annotation and functional classification
All of the unigenes were annotated using seven databases ( The unigenes annotated with GO functions were assigned to three main ontologies: molecular function (MF), cellular component (CC), and biological process (BP). The terms cellular process (GO:0009987), metabolic process (GO:0008152), biological regulation (GO:0065007), and response to stimulus (GO:0050896) were the most common BP ontologies; the terms cell (GO:0005623), cell part (GO:0044464), and organelle (GO:0043226) were the most common CC ontologies; and binding (GO:0005488), catalytic activity (GO: 0003824), transporter activity (GO:0005215), and transcription regulator activity (GO:0140110) were the most common MF ontologies ( Fig. 2A; Table S2). The unigenes with KOG annotation were categorized as posttranslational modification, protein turnover, and chaperones; signal transduction mechanisms; translation, ribosomal structure and biogenesis; energy production and conversion; transcription; intracellular trafficking, secretion, and vesicular transport ( Fig. 2B; Table S3). The majority of the annotated genes were characterized as ribosome pathway (ko03010)

Expression patterns of differentially expressed genes (DEGs)
Differential expression analysis between the LZ and HLZ groups revealed 449 differentially expressed genes; approximately 190 DEGs were downregulated in the HLZ group, and 259 DEGs were upregulated in HLZ group. The downregulated genes in HLZ were classified into 10 clusters ( Fig. 3A; Table S5). The results showed that cluster 2 included 1 gene (unigene-21666.123476 ) annotated with metallothionein that was highly expressed in all six samples, and cluster 1 included 8 genes that were more highly expressed in LZ samples. unigene-21666.69211 (mitogen-activated protein kinase kinase kinase ANP1), unigene-21666.134631 (SAUR family protein), unigene-21666.128797 (EREBF-like factor), and others without a specific annotation were included. The upregulated genes in HLZ were classified into 10 clusters ( Fig. 3B; , and others without specific annotations (Fig. 3). Random DEGs were chosen for qRT-PCR analysis to validate the accuracy of the RNA-Seq data results. The relative expression results showed a strong correlation between the RNA-Seq and qRT-PCR data ( Fig. 4; Table S7).

Functional enrichment of up-and downregulated DEGs
Differential expression analysis between the LZ and HLZ groups showed that 190 and 259 DEGs were down-and upregulated in the HLZ culm skin samples. The GO functional enrichment revealed that the downregulated DEGs were more enriched in monocarboxylic acid transport ( Table S10).

Phytohormones control culm color variation
To reveal how phytohormones control culm color variation in B. oldhamii, the content of endogenous abscisic acid (ABA), auxin (indole-3-acetic acid, IAA), jasmonic acid (JA), and gibberellic acid (GA 1 , GA 3 , GA 4 , and GA 7 ) was detected (Fig. 6A). The majority of phytohormones were highly accumulated in LZ, especially GA 1 and GA 7 , which were significantly accumulated in LZ; the contents of ABA were relatively highly accumulated in HLZ ( Fig. 6A; Table S12). The phytohormones of GA 1 , GA 7 and ABA were choosen for further study, and the relationship between phytohormones (ABA, GA 1 , and GA 7 ) and gene expression was analyzed with corresponding samples. There were 18 genes whose expression was significantly related to ABA accumulation patterns (Table S13), while 35 and 91 genes were significantly related to GA 1 (Table S14) and GA 7 (Table S15) accumulation patterns respectively. The genes significantly related to ABA, GA 1 and GA 7 were more enriched in the GO terms of chloride channel complex (GO:0034707), ion channel complex (GO:0034702),    Table S16). All the genes significantly related to ABA, GA 1 and GA 7 were also enriched in the circadian rhythm (plant) pathway (ko04712) (Fig. 6C; Table S17). In the circadian rhythm (plant) pathway, the bZIP transcription factor HY5 (unigene-21666.126192 and unigene-21666.239597 ) and protein FLOWERING LOCUS T (FT ) (unigene-54902.0) were up-regulated. Moreover, the expression of genes significantly related to ABA, GA 1 and GA 7 showed that unigene-21666.90795 (phenylalanine/tyrosine ammonia-lyase, PTAL) was relatively highly expressed in HLZ (Fig. 6D). The gene of PTAL is a crucial gene in phenylalanine metabolism and phenylpropanoid biosynthesis pathway which might conduct bamboo culm color variation.

Differentially expressed transcription factors
Differentially expressed transcription factors were filtered out of the annotated unigenes. A total of 21 differentially expressed transcription factors were identified ( Fig. 7; Table S18). The genes unigene- 21666.57044 (bHLH), unigene-21666.218036 (MYB-related), and unigene-21666.41135 (SBP) were not detected in the HLZ samples and were only detected in the LZ samples. bHLH transcription factors can interact with MYB transcription factors to regulate anthocyanin biosynthesis (Hichri et al., 2010). However, unigene-21666.194872 (FAR1), unigene-21666.98709 (MADS-M-type), unigene-21666.139384 (MYB), and unigene-21666.212035 (TRAF) were not detected in the LZ samples and were only detected in the HLZ samples. MADS-box transcription factors are involved in flower promotion and development, and simultaneous death usually follows after mass production of bamboo flowers (Abe, Miguchi & Nakashizuka, 2001). Four genes belonged to the bZIP transcription factor family, among which three genes (unigene-21666.126192, unigene-21666.239597, and unigene-43769.0) were annotated as the transcription factor HY5. The target genes of HY5 participate in many biological signaling processes, such as light signaling, circadian clock, anthocyanin biosynthesis, and chlorophyll biosynthesis (Gangappa & Botto, 2016). HY5 could induce the expression of the structural genes CHS (chalcone synthase), CHI (chalcone isomerase), and FLS (flavonol synthase) to regulate anthocyanin biosynthesis (Gangappa & Botto, 2016). There were three genes (unigene-21666.139348, unigene-21666.15994, and unigene-61591.0) that were members of the MYB transcription factor family. MYB transcription factors could regulate anthocyanin biosynthesis (Wang et al., 2019) and combine with other transcription factors to form MYB-bHLH-WDR complexes to regulate flavonoid biosynthesis (Xu, Dubos & Lepiniec, 2015).

DISCUSSION
Chlorophyll is a natural green pigment, and during green plant senescence, chlorophyll breakdown leads to a decrease in green color (Hörtensteiner, 2009). In the ripening phase of many fruits, such as tomato and banana, the color variation is caused by the massive degradation of chlorophyll. The unigene-21666.231610 gene is upregulated in the photosynthesis pathway and was annotated as psbP (photosystem II oxygen-evolving enhancer protein 2). The psbP protein is required for the photosystem II complex and normal thylakoid architecture in Arabidopsis thaliana (Yi et al., 2007;Yi et al., 2009). The unigene-54902.0 (FT ) gene is upregulated in the circadian rhythm (plant) pathway. The FT gene (FLOWERING LOCUS T), a mobile stimulus expressed in leaves and then translocated to the shoot apex, is essential for floral induction in Arabidopsis (Liu, Zhang & Yu, 2020). The MADS-box transcription factor unigene-21666.98709 (MADS-M-type) is relatively highly expressed in HLZ. MADS-box genes are essential for flower induction, promotion, and maturation (Theißen, 2001). The reproductive cycle of bamboo varies from 3 to 120 years (Janzen, 1976). Mass flowering in some bamboos is usually followed by simultaneous death at some levels (Abe, Miguchi & Nakashizuka, 2001;Miyazaki et al., 2009). Flowering is a hallmark event in the bamboo life cycle, followed by senescence (Marchesini, Sala & Austin, 2009). The decreased chlorophyll contents and increased FT and MADS-box gene expression levels revealed that the culm color variation of B. oldhamii might be related to the bamboo flowering trends. ABA contents were higher in HLZ with green and yellow striped culms in our study. Exogenous ABA could promote the accumulation of anthocyanins in Lycium fruits, and the structural genes involved in the flavonoid biosynthetic pathway were upregulated by ABA treatment (Li et al., 2019). The application of ABA could influence the expression of R2R3 MYB and the bHLH family (Li et al., 2019). The transcription of structural anthocyanin biosynthesis genes is regulated by MYB-bHLH-WD40 complexes (Jaakola, 2013). In Prunus avium L., ABA treatment could influence the expression of PacMYBA and induce anthocyanin accumulation (Shen et al., 2014). The GA 1 and GA 7 contents were significantly higher in LZ with green culms. Gibberellins delayed both chlorophyll depletion and total carotenoid accumulation (Alós et al., 2006). GA 1 and GA 7 delayed fruit coloration in the flavedo of 'Washington' navel sweet orange when girding the fruit peduncle (Gambetta et al., 2012). GA 3 delayed flavedo chlorophyll degradation and delayed fruit color break by reducing β-cryptoxanthin and β-citraurin biosynthesis (Gambetta et al., 2014).

CONCLUSIONS
This paper focused on investigating the culm color variation of the clumping bamboo B. oldhamii via combined RNA-seq and endogenous phytohormone content variation analyses. The results showed that bZIP, MYB, HY5, and other differentially expressed transcription factors play a role in B. oldhamii culm color variation. Moreover, phytohormone contents, especially GA 1 and GA 7, were more highly accumulated in LZ, but many flower-regulated genes were more highly expressed in HLZ, which indicates that HLZ may flower more rapidly than LZ and that the senescence pathways may be involved in bamboo culm color variation. The transcription factors HY5, MYB, and bHLH participate in culm color variation by regulating pigment biosynthesis pathways to cause bamboo culm color variation, but how the regulatory pathways between transcription factors and phytohormones influence culm color variation still needs to be deeply investigated.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
This work was financially supported by the Zhejiang Academy of Agricultural Sciences Youth Training Project (No. 2020R26R08E01), the Zhejiang Science and the Technology Major Program on Agricultural New Variety Breeding (No. 2021C02070-4), and the Wenzhou Agricultural New Variety Breeding Cooperation Project (No. 2019ZX004-2). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.