Biosynthetic Insights into p-Hydroxybenzoic Acid-Derived Benzopyrans in Piper gaudichaudianum

Piper gaudichaudianum Kunth (Piperaceae) accumulates gaudichaudianic acid, a prenylated benzopyran, as its major component. Interestingly, this trypanocidal compound occurs as a racemic mixture. Herein, transcriptomic investigations of Piper gaudichaudianum using the RNA-seq approach are reported; analysis of the expressed transcripts made it possible to propose a complete biosynthetic pathway for the production of gaudichaudianic acid, including the steps that give rise to its precursor, p-hydroxybenzoic acid. Peperomia obtusifolia (L.) A. Dietr. (Piperaceae) also accumulates racemic benzopyrans; however, its chromanes originate from the polyketide pathway, whereas the chromenes from Piper derive from the shikimate pathway. Recent transcriptomic and proteomic studies of the former species did not identify polyketide synthases involved in the production of the benzopyran moiety, but revealed the expression of tocopherol cyclase, which may be responsible for the cyclization of the 3,4-dihydro-2H-pyran ring. The analysis of the enzymes involved in the secondary metabolism of Piper gaudichaudianum and the comparison with the data previously obtained from Peperomia obtusifolia can provide valuable information on how these compounds are biosynthesized.


Introduction
The benzopyran nucleus is considered a privileged structure that is very common in bioactive natural products, such as coumarins, flavones, tocopherols (vitamin E) and tetrahydrocannabinoids. 4,5 The benzopyran moieties, 2H-chromenes and chromanes, isolated from Piper gaudichaudianum and Peperomia obtusifolia, respectively, have been shown to be potent trypanocidal compounds. 6,7 Curiously, both classes of compounds occur as racemic mixtures in these species, though their formation follows two distinct biosynthetic routes. 6,8,9 In Piper gaudichaudianum, chromenes originate from the shikimate pathway and use p-hydroxybenzoic acid (p-HBA) as a precursor. 10 In the case of Peperomia obtusifolia, chromanes are formed through the polyketide pathway and use orsellinic acid as a precursor. 1,11 Additionally, the formation of both classes of metabolites involves the condensation of an aromatic unit (p-HBA or orsellinic acid) and an isoprene unit (dimethylallyl pyrophosphate, isopentenyl pyrophosphate or geranyl pyrophosphate), followed by cyclization that gives rise to the benzopyran ring. During cyclization, the carbon atom at C-2 becomes a stereogenic center and, unlike in other benzopyrans such as vitamin E, both enantiomers are biosynthesized (Figure 1). Thus, the study of the proteins and genes possibly responsible for the biosynthesis of benzopyrans in these species, as well as the comparison between them, may provide new insights into how these compounds are produced.
Recent studies in Peperomia obtusifolia using a combination of shotgun proteomics and transcriptome analysis did not identify an orsellinic acid synthase.
However, the transcriptome analysis revealed the expression of tocopherol cyclase, which may be responsible for the cyclization of the prenylated orsellinic acid precursor that yields the 3,4-dihydro-2H-pyran ring. Because orsellinic acid is commonly found in sordariomycetes and eurotiomycetes, these results suggest that orsellinic acid-derived benzopyrans may be formed by the combined biosynthetic efforts of the host plant and endophytic fungi. 12
Some studies have already been performed in Piper gaudichaudianum aimed at elucidating the biosynthesis of gaudichaudianic acid (1) (Figure 2), 10 which is the major prenylated chromene isolated from this species and is described as a potent trypanocidal compound against the Y-strain of Trypanosoma cruzi. 13 Interestingly, trypanocidal assays indicated that the (+)-enantiomer was more active than its antipode and that the enantiomer mixtures showed a synergistic effect, with the racemic mixture being the most active. 6 Regarding its biosynthesis, the origin of the terpene moieties was shown to involve both the mevalonate and methylerythritol phosphate pathways. 10 Thus, to complement the data in the literature and elucidate the complete biosynthesis of chromenes from Piper gaudichaudianum, a transcriptome study was performed on leaves from this plant using the RNA-seq approach.

Transcriptome sequencing, de novo assembly and annotation
The racemic prenylated chromene gaudichaudianic acid (1) is the major constituent in all organs of Piper gaudichaudianum adult plants, though it is found exclusively in the roots of seedlings. 6 The occurrence of this compound has been reported in leaves over 12 months of age, and it becomes the major compound by the 15th month of growth. 14 Thus, to obtain the Piper gaudichaudianum transcriptome, three RNA-seq libraries were constructed from a pool of leaves from young plants (older than 15 months) and sequenced using the HiSeq 2500 Illumina paired-end sequencing system. RNA-seq is an approach to transcriptome profiling that uses deep-sequencing technologies in which RNA is sequenced in a high-throughput manner, enabling robust assessments of eukaryotic transcriptomes. 15,16 Furthermore, paired-end sequencing produces twice the number of reads in the same amount of time, and sequences aligned as read pairs enable more accurate read alignment. 17 This approach has been particularly useful in non-model species. 16
In this study, an average of 34,846,653 raw reads with a length of 100 bp each, and with more than 87.5% of bases having a Q (Phred quality score) ≥ 30, were generated from the three samples submitted to Illumina sequencing. These values are adequate to guarantee high-quality data. When the sequencing quality reaches Q30, virtually all the reads will be perfect and have zero errors and ambiguities. After removing the adaptors and low-quality reads, 30,449,535 clean reads were retained, which reflected a loss of ca. 12.6%. These clean data were de novo assembled into a total of 216,786 transcripts with an average length of 770 bases and an N50 length (weighted metric that represents the minimum assembly size in which ≥ 50% of the assembled bases are found) 16 of 1,362 bases using the Trinity software, 18 which generates transcript contigs based on overlapping information. The transcript size distribution revealed that there were 165,192 contigs (76.2%) ranging from 100 to 1,000 bases, 32,880 contigs (15.2%) ranging from 1,001 to 2,000 bases, and 18,701 contigs (8.6%) longer than 2,000 bases (Figure S1, Supplementary Information). The details of the transcript assembly are summarized in Table 1. The present results are in accordance with those recently published for the transcriptomes of the Piperaceae species Piper nigrum and Peperomia obtusifolia, which were assembled using Illumina HiSeq 2000 and 2500 platforms, respectively. 12,19
Sequence homology searches were conducted for the functional annotation of the transcripts. All transcripts were submitted to BLASTX 20 searches against the NCBI non-redundant (Nr) protein database, and 148,113 (68%) transcripts had significant similarity with an E (expect) value less than 1e−5 (Table 2). The sequences were also searched against a custom plant protein database, resulting in 113,374 (52%) annotated transcripts (E values lower than 1e−5) (Table 2). Even after these two different searches, a significant number of transcripts (68,401, 32%) remained unannotated, which can be a great resource for the discovery of novel genes. This is an expected situation given the absence of genomic information for Piper gaudichaudianum.
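The assembly statistics quoted above (N50 and the fraction of reads lost on cleaning) can be illustrated with a minimal Python sketch. It is not the procedure used by Trinity or by the authors; the function names and the toy contig lengths are hypothetical, and only the two read counts are taken from the text.

```python
def n50(lengths):
    """Return the N50: the contig length at which the running sum of
    lengths, taken from longest to shortest, first reaches half of the
    total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


def percent_lost(raw_reads, clean_reads):
    """Percentage of reads discarded between raw and clean data."""
    return 100 * (raw_reads - clean_reads) / raw_reads


# Hypothetical toy assembly, used only to show how N50 behaves.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))

# Read counts reported in the text (average of the three libraries).
print(round(percent_lost(34_846_653, 30_449_535), 1))
```

Because N50 is weighted by length, a few long contigs raise it well above the mean contig length, which is consistent with an N50 of 1,362 bases alongside an average of 770 bases.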
To classify the functions of the predicted transcripts, functional annotations were also created by using the Blast2GO 21 software with the Gene Ontology (GO) database, which is an internationally standardized gene functional classification system. A total of 65,579 transcripts were assigned at least one GO term distributed among the three main ontologies comprising biological process, molecular function, and cellular component. The biological process ontology distribution (level 2 of detail) contained mainly proteins involved in metabolic and cellular processes (Figure 3). At the same level of detail, catalytic activity and binding represented the major subcategories in the molecular function ontology (Figure 3). For cellular components, the assignments were mostly given to cell and cell parts (Figure 3). This pattern of GO annotation distribution was similar to those of other species and is typically seen in the transcriptome of samples undergoing developmental processes. 19,22

KEGG pathway mapping assignment
The annotated transcripts were also analyzed by searching them against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database for KEGG (K) number assignments and subsequent reconstruction of the biosynthetic pathways active in Piper gaudichaudianum leaves. KEGG is an integrated database resource for the biological interpretation of genome sequences and other high-throughput data. 23 As a result, 35,192 transcripts were identified with a K number corresponding to a total of 3,594 distinct genes. Among them, 961 (26.7%) are involved in general metabolic pathways that include energy metabolism, carbohydrate and lipid metabolism, nucleotide and amino acid metabolism, and secondary metabolism subcategories.
Within the secondary metabolism subcategory, 423 genes were identified. However, only the transcripts with an FPKM (fragments per kilobase of transcript per million fragments mapped) ≥ 1 were considered, resulting in 393 genes encoding enzymes involved in some manner in secondary metabolism (Table S1, Supplementary Information). This result allowed for the identification of the active secondary metabolism pathways in Piper gaudichaudianum; terpenoid and steroid biosynthesis (61 genes) represented the largest group, followed by phenylpropanoid biosynthesis (18 genes), flavonoid biosynthesis (12 genes), and isoquinoline alkaloid biosynthesis (7 genes). The terpenoid and steroid group includes terpenoid backbone, monoterpenoid, sesquiterpenoid, diterpenoid, triterpenoid, steroid, and carotenoid biosynthesis. Although there are reports of phenylpropanoids only in the leaves of seedlings, 14,27 the transcriptome indicates their occurrence in the young plant. These data may suggest that these compounds are produced and then immediately used as precursors of lignins, since the complete biosynthesis pathways of p-hydroxyphenyl, guaiacyl, 5-hydroxyguaiacyl, and syringyl lignins have been identified (Figure S2, Supplementary Information). These phenylpropanoids may also act as precursors for the chromenes, flavokawains and flavonoids reported in this species. 26,28,29 With respect to isoquinoline alkaloids, the genes identified are involved in the early steps of their biosynthesis involving dopamine production (Figure S3, Supplementary Information). Although there are no reports of the occurrence of these compounds in Piper gaudichaudianum, two aporphinoid alkaloids, cepharadione A and piperolactam E, were isolated from Piper caninum and Piper taiwanense, respectively. 30
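The FPKM ≥ 1 cut-off used above can be made concrete with a short Python sketch of the standard FPKM definition. It is only an illustration under stated assumptions: the transcript names and counts are hypothetical, and the authors' own values were derived from their read alignments rather than from this code.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped.

    fragments / (length / 1e3) / (total / 1e6) simplifies to the
    expression below.
    """
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def expressed(fpkm_by_transcript, threshold=1.0):
    """Keep only the transcripts whose FPKM reaches the threshold."""
    return {name: value
            for name, value in fpkm_by_transcript.items()
            if value >= threshold}


# Hypothetical counts: (fragments mapped to the transcript, length in bp).
counts = {"tx_high": (50, 1000), "tx_low": (10, 2000)}
total_mapped = 25_000_000  # hypothetical library size

values = {name: fpkm(frags, length, total_mapped)
          for name, (frags, length) in counts.items()}
print(values)
print(sorted(expressed(values)))
```

Normalizing by both transcript length and library size is what allows a single threshold to be applied across transcripts of different sizes.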

Candidate genes involved in benzopyran biosynthesis
In Piper gaudichaudianum, chromenes originate from the precursor p-HBA via the shikimate pathway. Despite the simple structure of p-HBA and its widespread distribution in plants, the enzymatic steps for its biosynthesis are not clearly understood. 31 Hydroxybenzoates have been reported to originate directly from shikimate via chorismic acid or from phenylalanine (Figure 4). 31,32 In the former case, chorismate-pyruvate lyase (CPL) converts chorismic acid into p-HBA. 32 Although some reports mention that this enzyme is restricted to bacterial lineages, 33,34 a chorismate-pyruvate lyase-catalyzed reaction, similar to that observed in Escherichia coli, seems to also occur in eukaryotic microorganisms. 32,35 As expected, however, CPL was not identified in the Piper gaudichaudianum leaf transcriptome, suggesting that p-HBA is not biosynthesized by this pathway. This result is in agreement with literature data indicating that microorganisms are the only organisms known to transform chorismic acid into p-HBA.
In the latter case, the biosynthesis of p-HBA from phenylalanine may follow two different routes, either by means of a β-oxidative or a non-β-oxidative pathway. 36 The first route proceeds via p-coumaroyl-CoA and p-hydroxybenzoyl-CoA for the formation of p-HBA from p-coumaric acid. This pathway involves the activation of p-coumaric acid by the enzyme 4-coumarate-CoA ligase (4CL), leading to its thioester, with subsequent chain-shortening into p-hydroxybenzoyl-CoA in a reaction mechanism analogous to that of NAD-dependent β-oxidation of fatty acids. 37,38 The operation of this pathway is supported by in vitro enzymatic activity assays using cell-free extracts from Lithospermum erythrorhizon cell cultures. 37 However, this conversion requires steps not fully elucidated yet, i.e., those corresponding to the hydration, dehydrogenation, and thiolation of the β-oxidative cycle followed by the final hydrolysis of the 4-hydroxybenzoyl-CoA thioester. 33 A Petunia gene encoding the bifunctional peroxisomal enzyme cinnamoyl-CoA hydratase-dehydrogenase (CHD), which is responsible for the initial two-step conversion of cinnamoyl-CoA into benzoic acid, was identified by a functional genomics approach and has been shown to be active with p-coumaroyl-CoA (Figure S4, Supplementary Information). 39 The next steps are expected to be catalyzed by thiolases and CoA thioesterases. A 3-ketoacyl-CoA thiolase (PhKAT1) was confirmed to be involved in the production of benzoyl-CoA from cinnamoyl-CoA in Petunia hybrida. 36 The specific 4-hydroxybenzoyl-CoA thioesterase (4HBT), which is part of the bacterial 2,4-dichlorobenzoate degradation pathways, appears to occur only in microorganisms. However, CoA thioesterases of the 4HBT family (1,4-dihydroxy-2-naphthoyl-CoA thioesterase 1 and 2) capable of hydrolyzing aromatic acyl-CoA substrates, including benzoyl-CoA, have been identified in Arabidopsis. 40
The second route involves a non-oxidative pathway for the conversion of p-coumaric acid or p-coumaroyl-CoA to p-hydroxybenzaldehyde in a retro-aldol reaction with no co-factor requirement. 37,38,41 4-Hydroxybenzaldehyde synthase (HBS) and 4-hydroxycinnamoyl-CoA hydratase/lyase (HCHL) catalyze the penultimate step of p-HBA biosynthesis by performing the phenylpropanoid side-chain cleavage of p-coumaric acid and p-coumaroyl-CoA, respectively. 31,41 Then, the biosynthesis of p-HBA proceeds via p-hydroxybenzaldehyde in a reaction catalyzed by the enzyme 4-hydroxybenzaldehyde dehydrogenase. 31 Studies with Daucus carota and Vanilla planifolia support the operation of this pathway. 31,38
Analysis of the transcriptome of Piper gaudichaudianum leaves against the NCBI non-redundant and custom protein databases allowed for the identification of all the enzymes involved in the biosynthesis of p-coumaric acid and p-coumaroyl-CoA, starting from the photosynthesis, glycolysis, and pentose phosphate pathways (Table 3). However, the next steps in the formation of p-HBA have not been clearly determined. Many transcripts appear to encode 3-ketoacyl-CoA thiolases, and one transcript presented homology with the gene encoding 4-hydroxybenzoyl-CoA thioesterase from Sphingomonas sp., but the FPKM for this transcript was < 1. Furthermore, the enzymes from the non-β-oxidative pathway were not identified. Although these data suggest that p-HBA biosynthesis proceeds via the β-oxidative pathway, they do not allow the route actually operating in Piper gaudichaudianum to be established with certainty.
Thus, to confirm the operant pathway in this species, the transcripts were re-analyzed by comparing them with a second custom database containing the sequences of enzymes from the different pathways involved in benzoate biosynthesis (from plants, fungi, and bacteria). By using this more specific approach, 774 transcripts were annotated (E value ≤ 1e−5). Of these transcripts, 34 showed significant homology (E value ≤ 1e−50) and similarity over 80% (Table S2, Supplementary Information). Interestingly, all 34 annotated transcripts are related to p-HBA biosynthesis via the β-oxidative pathway, which corroborates this as the main operant pathway in Piper gaudichaudianum for p-HBA production.
With respect to the terpenoid portion, all the genes encoding the enzymes from both the mevalonate and methylerythritol phosphate pathways were identified in this species, along with those genes encoding isopentenyl-diphosphate isomerase and geranyl-diphosphate synthase (Table 3). This result confirms the simultaneous operation of these two pathways in Piper gaudichaudianum, which is in accordance with previous studies. 10 Many transcripts were identified as genes encoding prenyltransferases.
Considering only E values ≤ 1e−50 and FPKM ≥ 1, five transcripts were initially aligned with 4-hydroxybenzoate polyprenyltransferase genes, which are involved in ubiquinone biosynthesis. However, these transcripts also present significant homology with 4-hydroxybenzoate geranyltransferase 2 genes (Table S3, Supplementary Information). These enzymes may be related to the prenylation of p-HBA that yields the 2H-pyran moiety after cyclization.
Finally, similar to what was recently observed for Peperomia obtusifolia, the transcriptome analysis of the Piper gaudichaudianum leaves revealed the presence of tocopherol cyclase. 12 By using the BLASTN 20 web tool from NCBI, it was possible to verify that the transcripts from both species present a high degree of similarity (Figure S5, Supplementary Information). Thus, apart from catalyzing the formation of (S)-tocopherols, these enzymes may also be responsible for the non-stereoselective cyclization that yields the racemic chromane and chromene moieties. All the enzymes identified that appear to participate in gaudichaudianic acid biosynthesis in Piper gaudichaudianum are presented in Table 3, and a proposed scheme for this pathway is shown in Figure 5.
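The candidate-gene selection described in this section relies on simple thresholds: E value ≤ 1e−50 combined with either similarity over 80% or FPKM ≥ 1. A minimal Python sketch of such a filter follows; the `Hit` record, the function name and the example hits are hypothetical and do not reproduce the authors' scripts or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    transcript: str    # assembled transcript identifier (hypothetical)
    subject: str       # matched database entry (hypothetical)
    evalue: float      # BLAST expect value
    similarity: float  # percent similarity of the alignment
    fpkm: float        # expression level of the transcript


def select_candidates(hits, max_evalue=1e-50,
                      min_similarity=None, min_fpkm=None):
    """Return the hits that pass every threshold that was supplied.

    Similarity is treated as a strict bound ("over 80%"), whereas the
    E value and FPKM bounds are inclusive, following the text.
    """
    selected = []
    for hit in hits:
        if hit.evalue > max_evalue:
            continue
        if min_similarity is not None and hit.similarity <= min_similarity:
            continue
        if min_fpkm is not None and hit.fpkm < min_fpkm:
            continue
        selected.append(hit)
    return selected


hits = [
    Hit("tx_a", "thiolase_like", 1e-80, 91.0, 12.4),
    Hit("tx_b", "thioesterase_like", 1e-60, 85.0, 0.4),
    Hit("tx_c", "prenyltransferase_like", 1e-20, 95.0, 30.0),
]

# Homology filter: E value <= 1e-50 and similarity over 80%.
print([h.transcript for h in select_candidates(hits, min_similarity=80.0)])
# Expression filter: E value <= 1e-50 and FPKM >= 1.
print([h.transcript for h in select_candidates(hits, min_fpkm=1.0)])
```

Keeping the similarity and FPKM bounds optional mirrors the text, where the two criteria are applied to different sets of transcripts.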

Conclusions
This study presents for the first time the transcriptome analysis of Piper gaudichaudianum using a next-generation sequencing approach. Approximately 10% of the transcribed genes identified are involved in some manner in secondary metabolism. This approach allowed for the identification of the main active biosynthetic pathways in this species, such as those involved in the formation of terpenoids, phenylpropanoids, flavonoids, and isoquinoline alkaloids. These data corroborate and complement previous studies performed on Piper gaudichaudianum. However, the main contribution of this work concerns the biosynthesis of p-hydroxybenzoic acid-derived benzopyrans. Despite the advent of sequencing technologies that facilitate the elucidation of plant secondary metabolism, many genes in the benzoic acid derivative biosynthetic network remain to be discovered. 42 With this study, it was possible to propose that p-HBA is produced via the β-oxidative pathway, providing the first insights into its biosynthesis in Piper species. Moreover, the transcriptome analysis revealed the presence of prenyltransferase and tocopherol cyclase genes, whose products may be responsible for the prenylation and cyclization that yield the 2H-pyran moiety. These findings are in agreement with those found in other Piperaceae species, such as Peperomia obtusifolia.

General experimental procedures
Total RNA was extracted using the RNeasy® Plant Mini Kit (Qiagen, Hilden, Germany). A NanoDrop 2000 UV-Vis Spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) was used to determine the concentration and quality of each RNA sample. The quality of the isolated RNA was checked on a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The RNA libraries were prepared using a TruSeq RNA Library Preparation Kit (Illumina, San Diego, CA, USA) and were sequenced using a HiSeq 2500 Sequencing System (Illumina, San Diego, CA, USA).

Plant materials
Leaves from young Piper gaudichaudianum Kunth specimens, which were cultivated under greenhouse conditions at the Instituto de Química (UNESP, Araraquara, SP, Brazil), were collected. The specimens were identified by Dr Elsie F. Guimarães, and a voucher specimen (Kato-0093) was deposited at the Herbarium of the Instituto de Botânica (São Paulo, SP, Brazil).

RNA extraction, library preparation, and RNA sequencing
Total RNA was extracted in triplicate from a pool of Piper gaudichaudianum leaves using the RNeasy® Plant Mini Kit following the manufacturer's instructions. The concentration and quality of the three isolated RNA samples were checked by using the A260/280 and A260/230 ratios from a NanoDrop spectrophotometer. The quality of the samples was also checked in a Bioanalyzer for the presence of intact 28S and 18S bands. Paired-end libraries were prepared using a TruSeq RNA Library Preparation Kit according to the manufacturer's protocol. After that, the resulting libraries were sequenced using an Illumina HiSeq 2500 device.

De novo transcriptome assembly and annotation
The raw reads obtained after sequencing were quality-filtered using the Trimmomatic software (version 0.33) 43 with default parameters to remove the Illumina adapters and low-quality bases. The filtered reads were subjected to a digital normalization algorithm to decrease the sampling variation, discard the redundant data, and remove most of the errors. For links to the digital normalization software, see the Supplementary Information section. De novo assembly of the filtered clean reads was conducted with the Trinity software, 18 version r20140717, using default parameters (for the assembled sequences, see the Supplementary Information section). Fragments per kilobase of transcript per million fragments mapped (FPKM) values were calculated using the Bowtie2 program. 44
For annotation, all the assembled transcripts were searched using the BLASTX 20 tool with an E value cut-off of 1e−5 against the following databases: (i) the non-redundant NCBI protein database; (ii) a custom protein database with a total of 948,000 sequences from plant proteins, derived from the RefSeq and UniProt/Swiss-Prot public banks; and (iii) a custom database containing 13,675 enzymes from the different pathways involved in benzoate biosynthesis (from plants, fungi, and bacteria). This benzoate biosynthesis database included the sequences for chorismate pyruvate lyase, cinnamoyl-CoA hydratase-dehydrogenase, 3-ketoacyl-CoA thiolase, 4-hydroxybenzoyl-CoA thioesterase, 1,4-dihydroxy-2-naphthoyl-CoA thioesterase, hydroxycinnamoyl-CoA hydratase/lyase, and hydroxybenzaldehyde dehydrogenase, which were derived from the RefSeq and UniProt/Swiss-Prot public banks. The annotation assigned to each transcript was based on the best hit (highest score). The Blast2GO 21 program was used to assign Gene Ontology (GO) terms to the annotated transcripts according to biological process, molecular function, and cellular component ontologies.
The comparison of the nucleotide sequences to verify their degree of similarity was performed using the BLASTN 20 web tool from NCBI.
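The best-hit rule described above (E value cut-off of 1e−5, highest score wins) can be sketched in Python as follows. The sketch assumes that the search results are available in the standard 12-column BLAST tabular format (`-outfmt 6`), which the text does not state; the transcript and protein identifiers in the example are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Map each query to the subject of its highest-scoring hit,
    ignoring hits whose E value exceeds max_evalue."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        record = dict(zip(COLUMNS, row))
        evalue = float(record["evalue"])
        bitscore = float(record["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(record["qseqid"])
        if current is None or bitscore > current[1]:
            best[record["qseqid"]] = (record["sseqid"], bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical tabular results for two transcripts.
example = "\n".join([
    "tx1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-60\t250",
    "tx1\tprotB\t70.0\t180\t54\t0\t1\t540\t1\t180\t1e-20\t120",
    "tx2\tprotC\t40.0\t90\t54\t0\t1\t270\t1\t90\t0.5\t30",
])
print(best_hits(io.StringIO(example)))
```

In this example the second transcript receives no annotation because its only hit fails the E value cut-off, which is how transcripts end up in the unannotated fraction.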

KEGG pathway mapping assignment
Using the KEGG BlastKOALA 23 annotation web tool, the transcripts with significant hits against the custom plant protein database were also searched against a non-redundant set of KEGG genes from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to assign K numbers. The transcripts with assigned K numbers were mapped using the Search&Color Pathway tool offered by the KEGG database. 45 This analysis was focused on transcripts with functions that were assigned to a given secondary metabolism biosynthetic pathway.

Figure 1. The proposed biosynthetic pathway of benzopyrans from Peperomia obtusifolia and Piper gaudichaudianum.

Figure 2. Chemical structure of gaudichaudianic acid.

Figure 4. Schematic representation of the possible pathways for the biosynthesis of p-hydroxybenzoic acid.

Figure 5. Biosynthesis of gaudichaudianic acid in Piper gaudichaudianum including all proteins identified from the transcriptomic study.

Table 1. Overview of the sequencing and de novo assembly
a Average of the triplicate. Q30: Phred quality score of 30; GC: guanine-cytosine content; N50: minimum contig length needed to cover 50% of assembled bases.

Table 2. Summary of sequence annotation for Piper gaudichaudianum

Table 3. Enzymes involved in gaudichaudianic acid biosynthesis based on the transcriptome data of Piper gaudichaudianum

(Table S2, Supplementary Information). Interestingly, all 34 annotated transcripts are related to p-HBA biosynthesis via the β-oxidative pathway, which corroborates this as the main operant pathway in Piper gaudichaudianum for p-HBA production.
