The effect of red light and far-red light conditions on secondary metabolism in Agarwood

Agarwood, a heartwood derived from Aquilaria trees, is a valuable commodity that has seen prevalent use among many cultures. In particular, it is widely used in herbal medicine and many compounds in agarwood are known to exhibit medicinal properties. Although there is much research into medicinal herbs and the extraction of high-value compounds, few studies have focused on increasing the quantity of target compounds by stimulating the related pathways in this species. In this study, we observed that cucurbitacin yield can be increased by using different light conditions to stimulate related pathways, and we conducted three types of high-throughput sequencing experiments to study the effect of light conditions on secondary metabolism in agarwood. We constructed genome-wide profiles of RNA expression, small RNA, and DNA methylation under red light and far-red light conditions. With these profiles, we identified a set of small RNAs that potentially regulate gene expression via the RNA-directed DNA methylation pathway. We demonstrate that light conditions can be used to stimulate pathways related to secondary metabolism, increasing the yield of cucurbitacins. The genome-wide expression and methylation profiles from our study provide insight into the effect of light on gene expression for secondary metabolism in agarwood and provide compelling new candidates for the study of functional secondary metabolic components.


Background
Agarwood is resinous heartwood derived from Aquilaria and Gyrinops trees. Due to the high economic value of these trees and the extensive deforestation, agarwood producing tree species have become endangered. The use of agarwood is prevalent in many cultures for religious ceremonies, perfumes, and especially in Chinese herbal medicine, where plant materials are commonly utilized [1,2]. Agarwood is one of the most used plant materials in Chinese medicine, second only to ginseng. The value of agarwood lies not only in its aromatic compounds [3], but also in its non-volatile compounds, which potentially have beneficial properties with regards to human medicine [4,5].
In our previous study, we presented a draft genome and a putative pathway for cucurbitacins E and I, compounds with known medicinal value, in Aquilaria agallocha [6], one of the largest producers of agarwood. Briefly, gene expression changes for in vitro samples treated with methyl jasmonate (MJ) were shown to be consistent with known responses of A. agallocha to biotic stress, and a set of genes homologous to cucurbitacin biosynthesis genes in Arabidopsis thaliana was identified. However, MJ treatment is perhaps not the most efficient protocol. Although there is much research into Chinese medicinal herbs and the extraction of high-value compounds, few studies have focused on increasing the quantity of target compounds by stimulating the related pathways in this species.
In this study, we demonstrate that the quantity of cucurbitacins can be controlled by utilizing different types of light. Red light (R) and far-red light (FR) are components of the solar spectrum that strongly affect plant tissues. Many studies have reported an interaction between plant defenses and R/FR responses [7,8]. Under low R/FR conditions, there is a dramatic decrease not only in the number of root nodules but also in the expression of jasmonic acid (JA) response genes. In a study on phytochrome B (phyB) mutants, the expression of JA-related genes, which are known to participate in secondary metabolic pathways [10], was also observed to be downregulated [9].
In order to better understand the effect of light conditions on cucurbitacin secondary metabolic pathways in A. agallocha, we performed high-throughput sequencing experiments under two different light conditions: red light, a factor activating phyB, and far-red light, a factor inhibiting phyB [11]. Three types of sequencing experiments were performed: RNA sequencing (RNA-seq) to study gene expression, whole-genome bisulfite sequencing to study DNA methylation, and small RNA (sRNA) sequencing to determine sRNAs that play a role in methylation. As epigenetic modifications may also play a role in the regulation of gene expression, studies on DNA methylation are becoming increasingly important.
In higher organisms, DNA methylation plays an important and widespread role in epigenetic modification, mediated by DNA methyltransferases (DMTs). DNA methylation in the genome is known to provide protection from transposons and/or RNA viruses, where it plays a role in regulating splicing. DNA methylation is also associated with major developmental reprogramming [12]. Small RNAs are also an essential factor in plants, where they play a role in regulating the activation of functional genes and transposons [3].
The results of our analysis show that R/FR conditions have a large effect on gene expression levels in agarwood. RNA-seq data revealed an array of gene clusters with distinctive expression patterns, where individual gene clusters responded primarily to red light or far-red light. Differentially methylated regions (DMRs) discovered from whole-genome bisulfite sequencing data showed that there is also a large difference in methylation levels between R/FR conditions. We observed that sRNAs may potentially play a role in influencing the methylation levels of genes important to secondary metabolism and subsequently play a role in gene expression regulation.
These genome-wide profiles provide insight into the regulatory interaction between red light and far-red light conditions in A. agallocha and identify compelling new candidates for secondary metabolic functional components. The data used in this study are freely available at our webserver (http://molas.iis.sinica.edu.tw/agarwood) and at NCBI (Bioproject ID: PRJNA240626).

Red light conditions increase cucurbitacin E and I content
In our previous study, we showed that agarwood contained high cucurbitacin content and that MJ treatment increased content levels [6]. Here, we instead used red light conditions to stimulate cucurbitacin biosynthesis (Fig. 1). LC-ESI-MS quantification showed that cucurbitacin I content increased as red light exposure increased, up to 356 μg/g at day 2, and decreased as far-red light exposure increased, down to 96 μg/g at day 2. Similarly, cucurbitacin E content increased up to 972 μg/g under red light conditions at day 1 and decreased down to 567 μg/g under far-red light conditions at day 5. Under red light conditions, at peak levels, cucurbitacin content was significantly increased compared to normal light conditions, with p-values of 1.09E-5 and 4.57E-6 for cucurbitacin I and E respectively in a two-sample t-test. Similarly, under far-red light conditions, at the lowest levels, cucurbitacin content was significantly decreased compared to normal light conditions, with p-values of 3.44E-2 and 1.32E-4 for cucurbitacin I and E respectively. Different types of light affect various biological pathways in plants. There are five classes of phytochromes, which typically absorb red light and far-red light [13]. Previous studies on phyA and phyB photosensory functions show that red light-activated phyB interacts with transcription factors to induce a phytochrome-dependent signaling cascade [7,8] and that vascular plant one-zinc-finger (VOZ) transcription factors interact with phyB [14]. VOZs are active transcription factors that promote salicylic acid (SA)- and JA-mediated defense responses under biotic stress [14,15]. Far-red light is known to inhibit phyB and plays an antagonistic role in most pathways [11,14].
Previous studies have demonstrated that the yield of target compounds can be increased by stimulating biosynthetic pathways [6,16] and that light can be used as a stimulus for increasing compound yield [17]. With plant factories becoming increasingly common, the use of light as a stimulus instead of chemical treatment may be preferable because the protocol is simpler.

Red light and far-red light gene expression patterns in agarwood
In order to study the effects of different light on gene expression in agarwood, we performed high-throughput RNA sequencing under red light and far-red light conditions. The time-course RNA-seq data (Table 1) were obtained from samples under red light and far-red light conditions at 1, 2, and 5 days, as well as normal conditions (white light control). Two biological replicates were sequenced.
We utilized the RNA-seq data and the previously constructed A. agallocha genome [6] for gene expression quantification, resulting in an average correlation coefficient of 0.9404 for gene expression levels between biological replicates. Genes were grouped into 16 clusters based on their expression patterns, requiring a two-fold change in expression and a p-value cut-off of 0.001 for differential expression (Fig. 2). In total, 8882 genes were determined to be differentially expressed and clustered into distinct expression patterns (Additional file 1: Table S1). Gene ontology (GO) classification was performed to identify each cluster's most significant biological process (Table 2). Clusters 3 and 11 were observed to exhibit a pattern of up-regulation under red light conditions and repression under far-red light conditions, consistent with the observed changes in cucurbitacin content levels. The GO classifications show that 253 out of 495 genes, in clusters 3 and 11 combined, are classified as belonging to metabolic processes (Additional file 2: Figure S1). Furthermore, these clusters contain 3 genes classified as belonging to terpene biosynthesis, the main class of compounds related to the medicinal properties of agarwood [18-20]. Terpenoid content is induced under biotic stress as an immune response to resist various pathogens [6,21], and terpenoid derivatives have been shown to exhibit antimicrobial, anti-tumour, and other pharmacological effects that are beneficial to human medicine [4,5]. In addition to terpene biosynthesis, clusters 3 and 11 contained 26 genes related to defense response. Previous studies have shown that far-red light down-regulates the expression of defense response genes by reducing a plant's sensitivity to jasmonate (or methyl jasmonate) in Arabidopsis [7,8]. From the RNA-seq data, it was seen that some defense response genes were up-regulated under red light conditions and down-regulated under far-red light conditions.
These results are consistent with our expectations and suggest that controlled light conditions can be used in place of plant hormones to induce defense response genes in agarwood.

Red light and far-red light DNA methylation patterns in agarwood
Fig. 1 Endogenous cucurbitacin content of in vitro agarwood. Content was measured after red and far-red light treatment over the course of 5 days. Data is represented as mean ± standard deviation (n = 5). At peak levels under red light conditions, cucurbitacin content was significantly increased compared to normal light conditions (paired t-test p-values 1.09E-5 and 4.57E-6 for cucurbitacin I and E respectively). At the lowest levels under far-red light conditions, cucurbitacin content was significantly decreased compared to normal light conditions (paired t-test p-values 3.44E-2 and 1.32E-4 for cucurbitacin I and E respectively).
In order to study the effect of different light on methylation patterns in agarwood, we performed whole-genome bisulfite sequencing with two biological replicates for red light day 2, far-red light day 2, and normal samples (Additional file 2: Table S2). The methylation levels for each sample were used to discover differentially methylated regions (DMRs) between different light conditions. A characterization of DMRs (Fig. 3a) shows that DMR proportions in transposons and intergenic regions were not significantly changed by R or FR conditions. In genic regions, there was a slight increase (~6.4 %) in DMR proportions at promoter regions under FR conditions. The number of DMRs for each light condition (Fig. 3b) indicates that there is a large change in methylation levels between red light and far-red light conditions. We focused on hypo-DMRs under red light conditions, using the consensus hypo-DMRs between R/normal and R/FR data, resulting in 621 regions for analysis. The average methylation levels in red light hypo-DMRs (Fig. 4a) show that CHH methylation (where H represents A, T, or C) exhibits the most significant differences under red light conditions. This remains the trend for average weighted methylation levels [22] in genic regions (Fig. 4b), where the most significant differences in methylation levels were observed in promoter regions for CHH methylation. CHG methylation levels were also observed to be affected by red light, while CG methylation levels were relatively unchanged. These results suggest that red light may regulate gene expression in agarwood by changing CHH and CHG methylation, primarily in promoter regions.
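As an illustration of the quantities discussed above, the following minimal Python sketch classifies a cytosine by sequence context (CG, CHG, CHH) and computes a weighted methylation level for a region. It assumes the commonly used definition of the weighted level (methylated reads summed over all cytosines in the region, divided by total reads summed over the same cytosines); the function names and the read counts are hypothetical and are not taken from the study's pipeline.

```python
def cytosine_context(seq, i):
    """Return 'CG', 'CHG', or 'CHH' for the cytosine at index i (forward strand).

    Returns None if seq[i] is not a C or the context runs off the sequence end.
    """
    tri = seq[i:i + 3].upper()
    if len(tri) < 2 or tri[0] != "C":
        return None
    if tri[1] == "G":
        return "CG"
    if len(tri) < 3:
        return None
    return "CHG" if tri[2] == "G" else "CHH"


def weighted_methylation(sites):
    """Weighted level: sum of methylated reads / sum of total reads.

    sites is a list of (methylated_reads, total_reads), one entry per cytosine.
    """
    sites = list(sites)
    total = sum(t for _, t in sites)
    return sum(m for m, _ in sites) / total if total else None


def mean_methylation(sites):
    """Unweighted mean of per-cytosine methylation ratios, for comparison."""
    ratios = [m / t for m, t in sites if t > 0]
    return sum(ratios) / len(ratios) if ratios else None


# Hypothetical region with one well-covered and one poorly covered cytosine.
region = [(9, 10), (0, 2)]
print(weighted_methylation(region), mean_methylation(region))
```

The weighted form lets deeply sequenced cytosines contribute more than poorly covered ones, which is why the two summaries can differ for the same region.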

sRNAome of red light and far-red light conditions in agarwood
Fig. 2 Cluster analysis of gene expression patterns in agarwood. Sixteen clusters were identified by k-means clustering. The samples are represented on the x-axis, from left to right: FR day 5, FR day 2, FR day 1, normal, R day 1, R day 2, R day 5. The centered log2 fold-change is represented on the y-axis.
In order to identify sRNAs that play a role in changes to methylation under different light conditions, we performed sRNA sequencing with two biological replicates for red light day 2, far-red light day 2, and normal samples (Table S2). Overall, approximately 6 million distinct sRNAs could be mapped perfectly and uniquely to the genome. A characterization of mapped sRNAs (Additional file 2: Figure S2) revealed that the majority (56.28 %) of sRNAs were mapped to genic regions, within which a large majority (61.11 %) were mapped to promoter regions. We also characterized the mapped sRNAs in terms of their length (Table 3) and observed that 71.93 % of the sRNAs were 24 nt long overall, and 73.37 % in promoter regions. These results support the idea that, under different light conditions, sRNAs may play a role in DNA methylation via AGO4 and the RNA-directed DNA methylation (RdDM) pathway in agarwood.

Regulation of secondary metabolic gene expression by RdDM pathway
Although DNA methylation in promoter regions and intergenic transposable elements generally inhibits gene expression [42], the role of DNA methylation in A. agallocha is still unclear. To further our understanding of DNA methylation in A. agallocha, we identified sRNAs that inhibit gene expression through the RdDM pathway, selected from the set of metabolic process genes containing hypo-methylated regions (Additional file 2: Figure S3). As mentioned previously, different AGOs have different preferences. Here, we focused on sRNA sequences that suited AGO4 preferences and mapped to hypo-DMRs. We identified 61 genes in agarwood related to secondary metabolism that fit our criteria. Three candidate genes were selected for further analysis (Fig. 5): a sterol methyltransferase (g16251), a hydroxysteroid dehydrogenase (g23648), and a cytochrome P450 (g29032). For the selected genes, sRNAs were mapped to red light hypo-DMRs with a corresponding increase in mRNA expression under red light conditions. The expression levels were also verified using qRT-PCR (Additional file 2: Figure S4).
In the three candidate genes, we detected three specific sRNAs that mapped perfectly to promoter regions under far-red light conditions. It was seen that these sRNAs had a positive relationship with DNA methylation levels and a negative relationship with gene expression levels. In contrast, for both the sRNA sequencing and qRT-PCR validation, these sRNAs were not able to be detected under red light conditions. This suggests that the effects of red light and far-red light on secondary metabolism gene expression in agarwood are antagonistic to each other and that these sRNAs potentially play a role in gene expression regulation through the RdDM pathway in cucurbitacin biosynthesis.
Sterols (steroid alcohols) belong to the steroids and are ubiquitous in eukaryotic organisms, playing pivotal roles in membrane structure and as precursors of vitamins and steroid hormones [43]. Sterol methyltransferases are known to catalyze a single methyl addition, an important step in phytosterol synthesis [43] and in the biosynthesis of secondary metabolites such as cucurbitacin. Hydroxysteroid dehydrogenases belong to the alcohol oxidoreductases, which catalyze the dehydrogenation of hydroxysteroids in steroidogenesis using NADP(H) or NAD as a cofactor and may affect the activity of compounds [44]. Cytochrome P450s (CYP450s) are also ubiquitous in many organisms. In plants, one or more CYP450s participate in compound modification and affect compound activity in secondary metabolism [45]. Some CYP450s also play an important role in steroidogenesis [46,47].
Although these three candidate genes belong to rather large gene families, the gene expression, sRNA, and methylation patterns under red light and far-red light conditions indicate that these genes are potentially important for cucurbitacin metabolism in agarwood.

Conclusion
In this study, we performed three types of sequencing experiments in order to study the effect of light conditions on cucurbitacin biosynthesis and secondary metabolism in agarwood. This resulted in a number of new insights regarding the global regulation of genes by red light and far-red light. From the RNA sequencing results, gene expression patterns were grouped into distinct clusters, many of which can be characterized as responding primarily to light conditions. In particular, two clusters clearly exhibited gene expression patterns in response to red light and far-red light. Significantly, the two clusters included genes related to terpene biosynthesis and defense response. In addition to gene expression, small RNA and DNA methylation were observed to be affected by different light conditions, which in turn affect cucurbitacin metabolism in agarwood. We identified a set of small RNAs that potentially regulate gene expression through the RdDM pathway.
The results from this study provide genome-wide profiles of RNA expression, small RNA, and DNA methylation with regards to light conditions. These profiles provide insight into the effect of light on gene expression for cucurbitacin biosynthesis in agarwood as well as provide compelling new candidates for functional secondary metabolic components, highlighting new questions to be addressed in future studies.
We also demonstrate that light conditions can be used in lieu of methyl jasmonate treatment to stimulate pathways related to secondary metabolism, increasing the yield of cucurbitacins. This has important implications for the increasing use of plant factories for the synthesis of high value compounds.

Methods
Plant materials for DNA and RNA extraction
A plant regeneration system from shoot tips into in vitro plants was created using a tissue culture process similar to the processes described by He et al. [48]. LED light sources (Daina Electronics) were used to provide different light conditions (Table S3). Normal (white light, ~55 μmol m⁻² s⁻¹) in vitro plant materials were grown under long-day conditions (16 h of light, 8 h of darkness) at 25 °C. Red light samples (~15 μmol m⁻² s⁻¹, 680 nm) and far-red light samples (~15 μmol m⁻² s⁻¹, 730 nm) were continuously exposed to their respective light conditions at 25 °C and the materials used for sequencing were collected after 1, 2, and 5 days.
DNA was extracted from 1 g of in vitro materials using the Plant Genomic DNA MiniKit (Maestrogen) following the manufacturer's instructions. RNA was extracted from 1 g of in vitro materials using RNeasy Plant MiniKit following the protocol prescribed by the manufacturer. Normal light samples were collected from material grown under long-day conditions in white light. The DNA and RNA samples were sent to BGI for poly(A) RNA sequencing, whole-genome bisulfite sequencing, and small RNA sequencing.

LC-ESI-MS
In vitro materials were ground with liquid nitrogen and mixed with 1 mL of methanol. Supernatant was collected by centrifugation (12000 rpm, 1 min). The LC-ESI-MS system consisted of an ultra-performance liquid chromatography system (Ultimate 3000 RSLC, Dionex) and a quadrupole time-of-flight mass spectrometer with an electrospray ionization source (maXis UHR-QToF system, Bruker Daltonics). The autosampler was set at 4 °C. Separation was performed with reversed-phase liquid chromatography on a BEH C8 column (2.1 × 100 mm, Waters). The elution started from 99 % mobile phase A (0.1 % formic acid in ultrapure water) and 1 % mobile phase B (0.1 % formic acid in ACN), held at 1 % B for 1.5 min, raised to 60 % B in 6 min, further raised to 90 % B in 0.5 min, and then lowered to 1 % B in 0.5 min. The column was equilibrated by pumping 1 % B for 4 min. The flow rate was set to 0.4 mL/min with an injection volume of 5 μL. LC-ESI-MS chromatograms were acquired under the following conditions: capillary voltage of 4500 V in positive ion mode, dry temperature of 190 °C, dry gas flow maintained at 8 L/min, nebulizer gas at 1.4 bar, and acquisition range of m/z 100-1000. Five samples for each condition were independently measured for cucurbitacin content levels.

RNA sequencing analysis
The RNA-seq data for all samples (Table 1) were trimmed for low quality bases at the 3' terminal and then individually aligned to the set of annotated A. agallocha transcripts using BWA [49]. For each dataset, expression quantification was performed using eXpress [50]. R/FR pair-wise differential gene expression analysis was performed using edgeR [51], incorporating all replicates. Genes which exhibited at least a two-fold change in expression with a p-value threshold of 0.001 between any red light and far-red light sample were retained for clustering analysis. Clustering analysis was performed on the expression profiles of differentially expressed genes using k-means clustering. Gene ontology classification for each cluster was performed using BiNGO [52].
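The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (a dictionary of per-comparison log2 fold-changes and p-values, as might be exported from a differential expression tool), the pseudocount, the function names, and the example values are all hypothetical, and the centering shown is one plausible way to obtain centered log2 profiles for clustering.

```python
import math


def select_de_genes(results, min_fold=2.0, max_p=0.001):
    """Keep genes with at least a min_fold change and p < max_p in any comparison.

    results maps gene id -> list of (log2_fold_change, p_value), one tuple per
    red light vs far-red light comparison.
    """
    min_abs_lfc = math.log2(min_fold)
    return {
        gene
        for gene, comparisons in results.items()
        if any(abs(lfc) >= min_abs_lfc and p < max_p for lfc, p in comparisons)
    }


def centered_log2(profile, pseudocount=1.0):
    """Log2-transform an expression profile and centre it on its own mean."""
    logs = [math.log2(x + pseudocount) for x in profile]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


# Hypothetical results for three genes (values are illustrative only).
results = {
    "geneA": [(1.5, 1e-5), (0.2, 0.5)],   # passes in the first comparison
    "geneB": [(0.8, 1e-6)],               # significant but below two-fold
    "geneC": [(-2.0, 0.01)],              # four-fold but p too large
}
print(select_de_genes(results))           # only geneA passes both thresholds
print(centered_log2([1, 3, 7]))
```

Centering each profile removes differences in absolute expression level, so that a clustering method such as k-means groups genes by the shape of their response across samples.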

Whole-genome bisulfite sequencing analysis
The whole-genome bisulfite sequencing data for red light day 2, far-red light day 2, and normal samples were trimmed for low quality bases at the 3' terminal. MOABS [53] was utilized to perform alignment to the A. agallocha genome, methylated cytosine calling, discovery of differentially methylated cytosines (DMCs), and discovery of differentially methylated regions (DMRs). DMCs were discovered using Fisher's exact test, with a p-value threshold of 0.05, a minimum depth of 3, and a minimum of 33 % nominal difference in methylation ratios between conditions. DMRs were discovered using Fisher's exact test, with a p-value threshold of 0.05, a minimum of 3 DMCs in a region, and a maximum distance of 300 bp between DMCs.
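To make the thresholds above concrete, the following Python sketch applies them to per-cytosine read counts. It is not MOABS's implementation: it is a simplified illustration that uses a plain two-sided Fisher's exact test per cytosine and a distance-based grouping of DMC positions on a single sequence, and it omits the region-level test. All function names and example numbers are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def is_dmc(meth1, total1, meth2, total2, min_depth=3, min_diff=0.33, max_p=0.05):
    """Apply the depth, methylation-difference, and p-value thresholds to one cytosine."""
    if total1 < min_depth or total2 < min_depth:
        return False
    if abs(meth1 / total1 - meth2 / total2) < min_diff:
        return False
    p = fisher_exact_two_sided(meth1, total1 - meth1, meth2, total2 - meth2)
    return p < max_p


def group_dmcs(positions, max_gap=300, min_dmcs=3):
    """Group DMC positions on one sequence into (start, end, n_dmcs) regions."""
    regions, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_dmcs:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_dmcs:
        regions.append((current[0], current[-1], len(current)))
    return regions


# Hypothetical DMC positions: only the first three lie within 300 bp of each other.
print(group_dmcs([100, 250, 500, 1200, 1300, 5000]))
```

In this sketch the 300 bp limit is applied between neighbouring DMCs, so a region can extend beyond 300 bp in total as long as no gap between consecutive DMCs exceeds the limit.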

sRNA sequencing analysis
The sRNA sequencing reads for red light day 2, far-red light day 2, and normal samples were aligned to the A. agallocha genome using BWA [49]. Only sequences that mapped perfectly (no mismatches, no gaps) and uniquely (to one genome location only) were retained for analysis.
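The retention rule above can be expressed as a simple filter. The Python sketch below assumes that alignments have already been parsed into dictionaries with hypothetical keys (`read`, `chrom`, `pos`, `mismatches`, `gaps`); it does not parse aligner output, and it interprets "unique" as a read having exactly one reported alignment, which is an assumption.

```python
from collections import defaultdict


def perfect_unique_reads(alignments):
    """Return {read id: (chrom, pos)} for reads with a single, perfect alignment.

    alignments is a list of dicts with the hypothetical keys
    'read', 'chrom', 'pos', 'mismatches', and 'gaps'.
    """
    by_read = defaultdict(list)
    for aln in alignments:
        by_read[aln["read"]].append(aln)

    kept = {}
    for read, hits in by_read.items():
        # Unique: exactly one location. Perfect: no mismatches and no gaps.
        if len(hits) == 1 and hits[0]["mismatches"] == 0 and hits[0]["gaps"] == 0:
            kept[read] = (hits[0]["chrom"], hits[0]["pos"])
    return kept


# Hypothetical alignments: s1 is kept, s2 has a mismatch, s3 maps to two places.
alignments = [
    {"read": "s1", "chrom": "scaf1", "pos": 100, "mismatches": 0, "gaps": 0},
    {"read": "s2", "chrom": "scaf1", "pos": 200, "mismatches": 1, "gaps": 0},
    {"read": "s3", "chrom": "scaf2", "pos": 50, "mismatches": 0, "gaps": 0},
    {"read": "s3", "chrom": "scaf3", "pos": 70, "mismatches": 0, "gaps": 0},
]
print(perfect_unique_reads(alignments))
```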

qRT-PCR analysis
Validation of RNA expression on three candidate genes was performed using qRT-PCR analysis. The RNA samples for each light condition were extracted from 1 g of in vitro A. agallocha shoots using RNeasy Plant MiniKit following the protocol prescribed by the manufacturer. Primer pairs were designed for each transcript (Table S4) with the ABI Prism 7500 sequence detection system (Applied Biosystems). Each primer pair was used to amplify the respective cDNA fragments using a cycling profile consisting of 58 °C for 2 min, 95 °C for 10 min, and 40 cycles of 95 °C for 15 s and 60 °C for 1 min. The relative gene expression was determined by the comparative $C_T$ method, $2^{-\Delta C_T}$ with $\Delta C_T = C_{T,\,\text{gene of interest}} - C_{T,\,\text{control gene}}$, using AcHistone as the internal control [54]. Four independent biological repeats were performed for each assay, where the final expression value is the mean expression of the repeats.
Validation of sRNA used the same plant materials as described above. An endogenous sRNA (CGGTGGAAGAAATAATAGGGCCTG) was chosen as the internal control because its expression levels are stable under different light conditions (mean TPM of 237.00 ± 39.44) and because it maps uniquely to an intergenic region and is therefore not expected to affect genes. For detecting sRNAs of g16251, g23648, and g29032, miScript Primer Assays (Qiagen) #MSC0074731, #MSC0074729, and #MSC0074727, respectively, as well as the miScript Universal primer were used. Five independent biological repeats were performed for each assay, where the final expression value is the mean expression of the repeats.
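The comparative CT calculation described above amounts to a one-line formula. The short Python sketch below shows it for a set of biological repeats; the function names and the CT values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_gene, ct_control):
    """Relative expression by the comparative CT method: 2 ** -(CT_gene - CT_control)."""
    delta_ct = ct_gene - ct_control
    return 2.0 ** (-delta_ct)


def mean_relative_expression(ct_pairs):
    """Mean relative expression over repeats, given (CT_gene, CT_control) pairs."""
    values = [relative_expression(gene, control) for gene, control in ct_pairs]
    return sum(values) / len(values)


# Hypothetical CT values for two repeats of one assay.
print(relative_expression(22.0, 20.0))                          # delta CT = 2
print(mean_relative_expression([(22.0, 20.0), (21.0, 20.0)]))   # mean over repeats
```

Because CT is on a log2 scale, a gene whose CT is two cycles later than the control is estimated at one quarter of the control's abundance, assuming roughly 100 % amplification efficiency.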

Additional files
Additional file 1: Table S1. The set of genes in each gene expression cluster.
Additional file 2: Table S2. (a) Whole-genome bisulfite sequencing DNA libraries and (b) sRNA sequencing libraries. Table S3. Spectral data of lamps used for different light conditions in this study. Table S4. Gene specific primers for real-time PCR analysis of gene expression. Figure S1. Gene Ontology classifications of the set of transcripts in cluster 3 and cluster 11. Relative gene proportions were calculated separately for Biological Process and Molecular Function. Figure S2. The composition of sRNAs that mapped to the A. agallocha genome. Only sRNAs which mapped perfectly and uniquely to one genome location were retained for analysis. Figure S3. Gene Ontology classifications of hyper and hypo differentially methylated regions. Relative gene proportions were calculated separately for Biological Process and Molecular Function. The set of metabolic process genes containing hypo-methylated regions were curated for secondary metabolic function and sRNA which mapped to hypo-DMR regions. Figure S4. qRT-PCR validation of mRNA expression and sRNA expression. Expression quantification from sequencing data as FPKM and TPM of the mRNA and sRNA expression are also shown, respectively.