Genome and transcriptome analysis to understand the role diversification of cytochrome P450 gene under excess nitrogen treatment

Panax notoginseng (Burk.) F. H. Chen (P. notoginseng) is a medicinal plant. Cytochrome P450 (CYP450) monooxygenase superfamily is involved in the synthesis of a variety of plant hormones. Studies have shown that CYP450 is involved in the synthesis of saponins, which are the main medicinal component of P. notoginseng. To date, the P. notoginseng CYP450 family has not been systematically studied, and its gene functions remain unclear. In this study, a total of 188 PnCYP genes were identified, these genes were divided into 41 subfamilies and clustered into 9 clans. Moreover, we identified 40 paralogous pairs, of which only two had Ka/Ks ratio greater than 1, demonstrating that most PnCYPs underwent purification selection during evolution. In chromosome mapping and gene replication analysis, 8 tandem duplication and 11 segmental duplication events demonstrated that PnCYP genes were continuously replicating during their evolution. Gene ontology (GO) analysis annotated the functions of 188 PnCYPs into 21 functional subclasses, suggesting the functional diversity of these gene families. Functional divergence analyzed the members of the three primitive branches of CYP51, CYP74 and CYP97 at the amino acid level, and found some critical amino acid sites. The expression pattern of PnCYP450 related to nitrogen treatment was studied using transcriptome sequencing data, 10 genes were significantly up-regulated and 37 genes were significantly down-regulated. Combined with transcriptome sequencing analysis, five potential functional genes were screened. Quantitative real-time PCR (qRT-PCR) indicated that these five genes were responded to methyl jasmonate (MEJA) and abscisic acid (ABA) treatment. These results provide a valuable basis for comprehending the classification and biological functions of PnCYPs, and offer clues to study their biological functions in response to nitrogen treatment.

Apart from the function on metabolites synthesis, some CYP450 genes also play roles in resistance to stresses, such as drought, salt, pest infestation and bacterial pathogens. For instance, OsDSS1, which is a member of the cytochrome CY450 gene family, and its mutants (dwarf and small seed 1, dss1) enhance Oryza sativa drought tolerance due to the accumulation of ABA and metabolites [24]. The AtCYP709B3 T-DNA insertion mutants (cyp709b3) show a salt intolerance phenotype [25]. The transient expression of saponin synthesis pathway genes in tobacco leaves indicates that BvCYP72A552 catalyzed the formation of hederageninbased saponins [26]. The reaction products of hydroperoxide lyase (HPL, CYP74B) in Arabidopsis thaliana, can be involved in the deterrence of insect pests [27]. Oscyp71Z2 overexpression plants show obvious resistance to bacterial fusarium caused by rice-XOO [28]. Some CYP450 genes have a metabolic detoxification function, and are involved in the decomposition of environment toxins [29]. Under treatment with basic macronutrients such as nitrogen, the expression level of some genes of CYP450 can also change. Decreased transcript levels of PtCYP711A1 have been detected in roots with excess N treatment, and the transcript level of PtCYP707A1 is up-regulated in N-starved roots [30]. In addition to its functions on compounds synthesis and biological or abiotic stress resistance, CYP450 has also been reported to be a growth factor. The CYP78A mutation in Arabidopsis and rice indicate that CYP78A is involved in regulating organ size and cell proliferation [7].
In recent years, many articles focused on the CYP gene family systematical studies were published, including A. thaliana [31], O. sativa [32], Vitis vinifera [33] and Panax ginseng [34]. Whereas, identification, structural and evolutionary analysis of the PnCYP gene family have not been reported, and there are few studies on the function of PnCYP genes. In this study, we performedgenome-wide bioinformatics analysis of the PnCYP gene family, including phylogenetic analysis, gene structure, conserved motifs, chromosomal locations, gene duplications, cis-regulatory elements and functional divergence. Meanwhile, due to the high medicinal value of P. notoginseng, abusing nitrogen (N) fertilizer is very common in P. notoginseng planting [35,36]. Previous studies have shown that plant hormones combined with N signals regulate root growth and development [30,37,38], thereby the PnCYPs expression pattern under nitrogen treatment have been studied. Genes related to the synthesis of plant hormones (ABA, MEJA, GA) and saponins were screened by transcriptomic data analysis, the tissue expression pattern and change trend of gene expression levels under hormone treatment were analyzed by quantitative realtime PCR (qRT-PCR). Furthermore, some studies have predicted that CYP genes may be the miRNA target [32,39], so we utilized this prediction in our paper. All of these were helpful to study the CYPs function of P. notoginseng, and optimize the plant environment of P. notoginseng to cultivate excellent varieties.

Classification of identified PnCYP450s
The 188 putative CYP450 genes were obtained from P. notoginseng. Here, we designated the PnCYP genes according to the classification Ofp. ginseng CYP genes by sequence similarity. The genes were classified into 9 clans with 40 subfamilies. They were grouped into A-type (CYP71 clan) with 17 subfamilies comprising 102 genes and non-A type (CYP51, 72, 74, 85, 86, 97, 710, 711 clans) with 18 subfamilies including 86 PnCYPs. The CYP92, CYP703A, CYP710A, CYP711A family contained only one gene, while the CYP71 family was the largest family containing 27 memebers. The sequence length of 188 CYP450 genes ranged from 618 to 6279 bp and the molecular weights (Mw) of these proteins ranged from 22.5 to 239 kDa. The isoelectric points ranged from 5.17 to 9.51 (Table S1).

Phylogenetic analyses and selective pressure analyses of PnCYP genes
An unrooted neighbor-joining (NJ) phylogenetic tree of 188 P. notoginseng genes was built to explore the relationships among PnCYP genes. Those genes were classified into nine clans, some clans (CYP51, CYP74, CYP97, CYP710, CYP711) contained only one gene family, while the others (CYP71, CYP72, CYP85, CYP86) were multifamily clans (Fig. 1). CYP71 was the largest CYP450 clan with functional diversity. In P. notoginseng, the members of this branch accounted for 54.3% of the total. CYP71 clan members in other plants have been reported to play roles in plant secondary metabolism, such as amino acid derivatives, fatty acids, alkaloids, terpenoids and precursors of hormones [9].
To investigate the evolutionary relationships of plant CYP450s, an unrooted NJ tree among P. notoginseng, A. thaliana, P. ginseng, O. sativa, P. trichocarpa and P. patens was constructed (Fig. S1). A total of 47 subfamilies were identified in Arabidopsis, of which seven subfamilies (CYP73, CYP83, CYP93, CYP702, CYP705, CYP708, CYP709) were not found in P. notoginseng. Forty-one subfamilies were identified in P. ginseng, two subfamilies (CYP720 and CYP724) were not found in P. notoginseng, while CYP710 was not found in ginseng. Some families were only found in moss, including CYP752, CYP753, CYP751, CYP762 and CYP764 families. Meanwhile, CYP99 family, CYP723 family and CYP729 family were only present in rice. Moreover, O. sativa contained the CYP51H and CYP51G subfamilies, and other species only contained the CYP51G subfamily. A. thaliana, P. notoginseng, P. ginseng contained only the CYP74A and CYP74B subfamilies, but the other three plants contained at least three CYP74 subfamilies (e.g. CYP74C, CYP74D et al.).
To further explore the evolutionary patterns of the CYP450 gene family in plants, paralogs of P. notoginseng, orthologs betweeen P. notoginseng and A. thaliana, P. notoginseng and P. ginseng were identified and listed in Table S2. Forty paralog pairs in Pn-Pn, 26 ortholog pairs in Pn-At, 70 ortholog pairs in Pn-Pg were identified (Table S2). To investigate the selection pressure of the PnCYP450 genes, the non-synonymous substitution rate/ synonymous substitution rate (Ka/Ks) value of these homologous pairs was calculated (Table S3 and  Table S4). In total, only two paralog pairs (PnCYP736A2/ PnCYP736A12, PnCYP716A2/PnCYP716A7) in Pn-Pn had Ka/Ks ratios greater than 1. Six orthologs pairs (PnCYP76A1/PgCYP76A2v2, PnCYP81B2/PgCY-P81D3p, PnCYP72A4/PgCYP72A8, PnCYP74B1/ PgCYP74B1, PnCYP707A1/PgCYP707A3, PnCYP704C2/ PgCYP704C4) had Ka/Ks ratios larger than 1 (Fig. S2). All Ka/Ks values of Pn-At orthologs pairs were less than 1. This indicated that most genes have undergone purification selection.
We analyzed the gene structure of the PnCYP genes (Fig. S3). Some subfamilies consisted of intronless genes, such as CYP77 and CYP710. In 71 clan, most members (83 genes, 81.4%) contained only one intron, while all CYP701 subfamily members contained six introns. The characteristic of 72 clan was that most genes had 4 introns. In 86 clan, almost all subfamilies (CYP86, CYP94 and CYP96) members had only one intron except the CYP704 subfamily. In contrast, CYP97 clan and CYP85 clan had very complex gene structures, PnCYP97A1 contained the maximum number of 15 introns.

Cis-regulatory elements analyses and miRNA target prediction
The 136 gene promoters contained two kinds of cis-regulatory elements (Fig. 4, Table S5), one of which was related to biological stress. Among these cis-regulatory regulatory elements, ABA response element (ABRE) accounted for the maximum ratio, 87 gene promoters (64%) contained this element. The cis-regulatory elements associated with MEJA (CGTCA-motif and TGACG-motif ) were respectively found in 72 and 73 gene promoters. There were another two cis-regulatory elements with a higher proportion: TGA-element (IAA-related, 53 genes) and TCA-element (salicylic acid (SA)-related, 58 genes).  Table S1 The promoters related to biological stress also included the GARE-motif and P-box elements which are involved in GA response. The other cis-regulatory element was related to abiotic stress, including MBS (41 genes) and G-box (94 genes) elements related to drought, low temperature stress response (LTR, 43 genes) and TC-rich (31 genes) repeats related to defense.
Three pairs of miRNA-target interactions were found, containing 2 miRNAs and 3 CYP450, (Table S6). MiR164 mediated transcriptional regulation of PnCYP736A9 and PnCYP736A10, while PnCYP86A3 was possibly a target of miR171. Both MiR164 and miR171 are associated with stress resistance in other species [40], while PnCYP736A9 and PnCYP86A3 were responded to nitrogen treatment in this study.

GO, Kyoto encyclopedia of genomes (KEGG) annotation and subcellular localization prediction of the PnCYP450 gene superfamily
A total of 187 PnCYP450s were annotated on Blast2GO software except PnCYP715A1. Those CYP450 protein were assigned to 21 GO terms ( Fig. 5a and Table S7). The terms were classified into three major functional categories, including 11 biological process subcategories, 7 cellular component subcategories and 3 molecular function subcategories. A total of 174, 176, and 186 proteins were annotated on the terms of biological process, cellular component, and biological process, respectively. Most genes were associated with metabolic reactions under biological process (GO:0008152), indicating that CYP450 genes function in the synthesis of various primary and secondary metabolites. Only three genes (PnCYP703A1, PnCYP90A2, PnCYP90A1) were predicted to be involved in cellular component organization or biogenesis (GO:0071840, GO:0022414, GO:0000003). For cellular component, most genes were related to membrane part (GO:0016020, GO:0044425) and only three genes (PnCYP94A1, PnCYP94A2, PnCYP86B1) were related to macromolecular complex (GO:0032991). On the terms of molecular function, most genes were shown to play roles in catalytic process (GO:0003824) but only one gene (PnCYP94A3) was related to the molecular function regulator.
A total of 77 PnCYP450s were annotated in the KEGG database. These genes were annotated on 17 pathways which were most related to compound synthesis (Fig. 5b). PnCYP73A1 and PnCYP73A2 were annotated into degradation of aromatic compounds (ko01220). PnCYP74 subfamily was related to alpha-Linolenic acid metabolism (ko00592). PnCYP82, PnCYP88, PnCYP701 subfamilies were shown to play roles in diterpenoid biosynthesis (ko00904). And PnCYP97, PnCYP707 subfamilies were predicted to have a function on carotenoid biosynthesis (ko00906). This indicates that CYP450 genes from different subfamilies may play roles in the same metabolic pathway. The subcellular localization analysis of the 188 PebHLH proteins predicted they were mostly located in the cytoplasmic (114, 60.6%), 65 members were located in innermembrance (34.6%), six were located in outermembrance (3.2%), only three were located in periplasmic (1.6%) (Table S7).

Expression analyses of PnCYP450 genes treated with different forms of nitrogen fertilizers
Excessive use of ammonium salt as the sole source of nitrogen hindered plant growth. To analyze expression levels of CYP450s in response to nitrogen fertilizers treatment, 188 PnCYP genes expression data (Per Kilobase of exon model per Million mapped reads, FPKM) were counted (Table S9). One hundred seven PnCYP genes whose expression data > 1 in one or more treatment were selected for further analysis (Fig. S4, Table S10). A total of 47 differentially expressed genes were identified, among them, 10 genes were significantly up-regulated and 37 genes were significantly down-regulated (fold change (FC) > 1.5 and p-values < 0.05) (Fig. 6

Quantitative real-time PCR (qRT-PCR) expression profiles of PnCYP genes
Nitrogen treatment can affect the synthesis of some hormones (ABA, MEJA, GA) in plants. In the present study, a MEJA synthesis related gene (PnCYP94B2), one gene related to GA synthesis (PnCYP701A1) and one gene related to ABA synthesis (PnCYP707A1) were screened for the further study. The main medicinal components in P. notoginseng were saponins. Therefore, two saponin synthesis related genes (PnCYP716A1, PnCYP716A6) were screened. Primers used for the qRT-PCR analysis of PnCYP gene expression were listed in Table S13. Tissue expression pattern analysis showed that the expression levels of PnCYP94B2, PnCYP701A1 and PnCYP707A1 in leaves were higher than these in the other three tissues, while the expression levels of PnCYP716A1 and PnCYP716A6 in the roots were higher than these in the other three tissues (Fig. 7). We also analyzed the expression trends of these five candidate genes under ABA and MEJA treatment. The expression level of the PnCYP707A1 gene was significantly decreased under treatment with ABA and MEJA (> 2), while the expression levels of the remaining four genes were all increased under treatment with these two hormones. In the MEJA treatment, PnCYP94B2, PnCYP701A1, PnCYP716A1 and PnCYP716A6 showed similar changes, that is, the expression levels of PnCYP94B2, PnCYP701A1, PnCYP716A6 were significantly increased at 1 h, 6 h and 12 h (> 2). The expression patterns of PnCYP94B2 and PnCYP701A1 were similar in the ABA treatment, and the expression levels of PnCYP716A1 and PnCYP716A6 were significantly increased at 1 h, 3 h, 6 h, 12 h and 24 h (Fig. 7).

Discussion
To investigate the evolutionary relationships of plant CYP450s, an unrooted NJ tree among P. notoginseng, P. ginseng, Physcomitrella patens (moss), O. sativa (rice), A. thaliana (Arabidopsis), Populus trichocarpa (poplar) was constructed (Fig. S1). The number of PnCYPs was less than that of Arabidopsis (245), poplar (310), rice (326), and more than algae (98) and ginseng (138) [31][32][33][34], indicating that the CYP family has a significant degree of evolutionary extension among plant species. The molecular weight (MW) and isoelectric point (PI) partly determined the gene molecular structure and biochemical function [41], and the large range of PnCYP MW and PI may be due to their role in different synthetic pathways, which also shows the functional diversity of PnCYP. Protein divergence analysis showed that even in the conservative subfamilies, functional divergence still existed. In 51, 74, and 97 subfamilies, the type I functional divergence coefficients (θ I ) among all branches were greater than 0, indicating that functional divergence occurred between members of each subfamily. In addition, in the 97 subfamily, the θ I coefficient was greater than that in the other two subfamilies, which proved that the functional divergence possibility in this subfamily was greater than that of 51, 74 subfamily. Identifying potential critical amino acid positions in CYP51, CYP74, and CYP97 proteins can provide more information for analyzing the evolution of the PnCYP450 family. Some subfamilies and clans showed closer evolutionary relationship. For example, CYP83 and CYP99 family were inside the CYP71 family on the phylogenetic tree. The monocot specific CYP723 family was obviously closest to CYP89 family, CYP729 family was obviously closest to CYP88 family. CYP51 clan, CYP710 clan and CYP85 clan were gathered on the same branch. Some members of the three clans were reported to have function on sterol synthesis and processing. CYP51G is reported to be involved in the synthesis of sterols, some CYP710 family members are known as sterol C22-desaturases, CYP85 clan members can process plant sterols [42]. In summary, CYP710 clan and CYP85 clan may be evolved from CYP51. Meanwhile, different subfamilies in the CYP85 clan (CYP90 and CYP724B subfamilies) are involved in the synthesis of brassinosteroids [43,44], and phylogenetic tree analysis showed that the two subfamilies were clustered together. CYP74, CYP727 and CYP751 were also clustered on the same branch. CYP751 was only presented in moss and was clustered on the clan of CYP727, so we predicted that 751 and 727 may come from the same ancestor. CYP97, CYP72 and CYP86 clustering together on the phylogenetic tree indicated that they may be evolved from a common ancestor. CYP97 contained three distinct subfamilies, and CYP97C was scattered in the CYP97B branch. This proved that in the evolutionary process, the genetic relationship between CYP97B and CYP97C was closer, compared with CYP97A. The members of CYP72 clan are relevant to the synthesis of plant hormones (GA), cytokinins, isoprenoids and fatty acids [11,44]. Some members of CYP86 clan (CYP86 and CYP94 subfamilies) encode fatty acid hydroxylases or alkane hydroxylase [45,46], and the two subfamilies were clustered together. Based on the above analysis, we speculated that the CYPS superfamily acquired new functions through duplicate divergence.
Motif and gene structure analysis strongly supported the results of subfamily classification. For some subfamilies in the same branch, motif compositions and gene structure were similar, such as the CYP71, CYP736 and CYP82 subfamilies in 71clan, indicating that these subfamilies might have similar functions. In this study, we found five cis-acting regulatory elements (ABA, MeJA, Auxin, SA and GA) associated with plant hormones were. The promoters of PnCYP82C1, PnCYP701A1, PnCYP736A3 and PnCYP714A1 contained all of these five cis-acting elements. At the same time, three cis-acting elements (Drought, Low Tem, Defense) associated with abiotic stress were found in the promoters. PnCYP76A5, PnCYP76A6, PnCYP78A5 and PnCYP72A2 contained all of these three cis-acting elements. This suggests that these genes may play an important role in the development of P. notoginseng. The expression patterns under nitrogen treatment revealed that only one paralog pair (CYP76A4/CYP76A6) had the similar expression pattern, suggesting functional divergence among PnCYP. Structural analysis showed that the frequency of segment duplication in P. notoginseng was greater than that of tandem duplication, and the Ka/Ks ratio of two paralogous pairs (Pn-Pn) and six orthologous pairs (Pn-Pg) was greater than 1.0. All of these demonstrated that in the process of evolution, some genes undergo positive selection, and the phenomenon of functional divergence did existed.
In plants, nitrogen treatment affected some phytohormones' synthesis, such as ABA, GA, JA and SA. Transcriptome data indicated that some genes related to hormone synthesis were responsive under the three conditions. CYP707A, which is also known as ABA8Ox, was incrementally down-regulated with the three kinds of treatments, PnCYP707A1 showed significantly reduced expression levels with at least 2.3fold. In the qRT-PCR experiment, the PnCYP707A1 expression level was significantly decreased under ABA treatment (> 2) (Fig. 7). And PnCYP707A1 promoter contained ABA relevant cis-element (ABRE). The 9-cisepoxycarotenoid dioxygenases (NCEDs) which were also associated with ABA synthesis were identified (Table S12). And PnNCED3, PnNCED4 and PnNCED5 showed sharp decrease of transcripts in at least one treatment. PnNCED3 was down-regulated 4.4-fold and 6.2-fold respectively under 15A and 15AN treatment. While PnNCED4 was significantly up-regulated in 15 N. Furthermore, PnCYP707A1, PnNCED3 and PnNCED4 showed a significant correlation of expression (Fig. 8a). In the JA synthesis pathway, two lipoxygenases (PnLOXs) (PnLOX1 and PnLOX8) were down regulated under 15A and 15 N treatment (Fig. 8b). Eight 12-oxophytodienoate reductases (PnOPRs) were identified, however, only one of them (PnOPR8) responded to nitrogen treatment. In Arabidopsis, CYP94B1, CYP94B3 and CYP94C1 were found to be associated with the partial deactivation of JA-Ile hormone [47]. PnCYP94B2, PnCYP94C2 and PnCYP94C3 whose promoters contain the CGTCA-motif and TGACG-motif (Table S5) were all differentially expressed genes in this study. In the qRT-PCR experiment, the PnCYP94B2 expression level was significantly increased under ABA treatment (> 2) (Fig. 7). In co-expression network, PnCYP94B2, PnCYP94C2 showed a connection to PnLOX1 and PnOPR8. Moreover, GA synthesis is also affected by nitrogen in the soil [30]. Only one entcopalyl diphosphate synthase gene (CPS) was identified and it (PnCPS1) had a more than 3-fold increase in which was also involved in gibberellin synthesis and contained the GARE-motif in its promoter, showed significant downregulation in all treatments. Likewise, the co-expression analysis indicated that these three genes were closely related to one another (Fig. 8a).
Saponins are the main medicinal ingredients of P. notoginseng [2], and CYP genes participate in saponin synthesis [19][20][21][22][23], so the genes related to the saponin synthesis pathway were identified and their transcripts levels were counted. Squalene synthase (SS) was responsible for the synthesis of triterpene essential substrate (C30 isoprenoid squalene), which showed decreased expression levels under 15A (Fig. 8b). Triterpenoid compounds oxidation genes (squalene epoxidase [SE]) PnSE2 showed at least 2.9-fold down regulation under all three conditions, while PnSE3 was up-regulated at least 2.0-fold under all three conditions. Only one dammarenediol-II synthase (DS) gene, which was a kind of oxidosqualene cyclase, was found in P. notoginseng. It was down-regulated in response to 15A and 15 N. CYP716A, which plays roles in processing protopanoxadiol [23],also responded to nitrogen treatment. PnCYP716A1 and PnCYP716A6 showed significantly down-regulated expression levels in three treatments. According to the tissue expression pattern analysis, the expression levels of PnCYP716A1 and PnCYP716A6 were highest in the roots, which are the main medicinal tissue of P. notoginseng [2]. These two genes both responded to MEJA and ABA treatment. PnSE2, PnSE3, PnDS1, PnCYP716A1 and PnCYP716A6 showed a significant correlation of expression (Fig. 8a). PnCYP genes participating in the synthesis of hormones and saponin pathways were showed in Fig. 8b.

Conclusions
We identified 188 PnCYP genes, which were divided into 40 subfamilies and clustered into 9 clans. We have performed phylogeny, gene structure, chromosome location, duplicated event, GO and KEGG annotation, and functional divergence analysis on this supergene family, which showed that PnCYP genes have undergone frequent evolution and functional diversity. Through the analysis of the transcriptome under different nitrogen treatments, some genes involved in the synthesis of plant hormones such as ABA, JA and GA were identified, and some genes involved in saponin synthesis were also identified. These five genes were screened for tissue expression pattern analysis and the expression level change analysis under hormone treatment. Our findings can provide a reference for studying the functional diversification, the supergene family members correlation, and researched the PnCYP genes involved in the biosynthesis of notoginseng saponins.

Phylogenetic tree construction and structure analysis of PnCYP450 gene family members
To investigate the phylogenetic relationships among the CYPs, the CYP450 protein sequences of Physcomitrella patens (moss), O. sativa (rice), A. thaliana (Arabidopsis), P. trichocarpa (poplar), P. notoginseng and P. ginseng (P. ginseng) were aligned by Clustalw. Neighbor-Joining method was employed to construct an un-rooted phylogenetic tree using the MEGA 5.0 software, and the replications were set as 1000 [51]. Both the phylogenetic trees of six species and individual P. notoginseng phylogenetic trees were constructed using the same method.
The Multiple Expectation Maximization for Motif Elicitation (MEME) online tool (http:// meme. sdsc. edu/ meme/ intro. html) [52] was used to identify PnCYP conserved motifs with the following parameters: the number of different motifs was set as as 20, and the range of motif width were set as 6 to 50.
Chromosome sequence information was downloaded from NCBI website with the accession number JACBWS000000000 [53]. And chromosome location map was utilized by TBtools [54].

Selective pressure analysis
Phylogeny-based and bidirectional best-hit methods were used to identify the homologous pair of putative paralogous pairs in P. notoginseng, orthologous pairs between P. notoginseng and A. thaliana, P. notoginseng and P. ginseng [58]. Nucleotide sequences were used in this analysis. The length of each gene in a homologous pair was longer than 300 bp, and the sequence similarity between two genes in each homologous pair was greater than 40% [59]. MEGA 5.0 was used to perform genes pair alignment, and Dnasp 5.10.1 software [60] was used to calculate the synonymous substitution rate (Ks) and non-synonymous substitution rate (Ka) substitutions per site between duplicated gene pairs [61]. Values of (Ka/Ks) < 1, = 1 and > 1 indicate negative selection, neutral evolution and positive selection, respectively [62].

GO, KEGG annotation analysis and subcellular localization prediction
The 188 PnCYP450 protein sequences were annotated using the Blast2GO software [63,64]. The annotation results were obtained with following parameters: e-value was 1.0E-6, annotation cutoff was 55, Go weight was 5. KAAS website (https:// www. genome. jp/ kaas-bin/ kaas_ main) and BlastKOALA website (https:// www. kegg. jp/ blast koala/) were used to perform KEGG annotation [65]. On the KAAS website, the BBH (bi-directional best hit) assignment method was used. And A.thaliana, P. trichocarpa and Vitis vinifera were selected as GENES data set. The family_eukaryotes genus_prokaryotes database was used on the BlastKOALA website. The subcellular localization of the PnCYP proteins was predicted using CELLO v.2.5 (http:// cello. life. nctu. edu. tw/) [66].

Estimation of functional divergence
The DIVERGE 3.0 software was used to study whether functional divergence occurred among the CYPs subfamily members [67]. There were two main types of functional divergence: Type I represented the evolutionary rate change between replicated genes, where a site that was conserved in one gene but changed in another gene, and Type II represented the evolutionary nature change of replicated genes [68,69]. If the correlation coefficients (θ I and θ II ) were significantly greater than 0, it means that functional divergence has occurred. We then calculated posterior probability value (Q k ), and when a site with a Q k value greater than 0.67, it can be defined as the functional divergence-related site. SPSS software was used for a t-value test to analyze the significant difference of the posterior probability value, and the significance threshold set as *P < 0.05.

Expression and co-expression network and analysis of PnCYP450 genes treated with different forms of nitrogen fertilizers
To study the expression levels of PnCYP450 genes when P. notoginseng roots were treated with different forms of nitrogen fertilizers ( [15AN]), the RNAseq data were downloaded from NCBI GEO DataSet (https:// www. ncbi. nlm. nih. gov/ gds/) with accession number GSE112437 and NCBI Sequence Read Archive (SRA) database (https:// www. ncbi. nlm. nih. gov/ sra) with accession number SRP136626. Genes with fold change (FC) greater than 1.5 and significant p-values < 0.05 were defined as differentially expressed genes [70]. The FPKM values of PnCYPs and other genes which participated in hormone synthesis were used for heat map generation by TBtools software. And scale methods were set as a row scale in the heat map drawing. Co-expression networks were performed using the OmicStudio tools (https:// www. omics tudio. cn/ index), and p-values < 0.05 were considered statistically significant.

Plant material and real-time PCR analysis
Growing two-year-old P. notoginseng was collected from the Wenshan County, Yunnan Province, China. For plant hormone treatment, P. notoginseng leaves were treated with 10 μM ABA and 100 μM MEJA. These treated P. notoginseng plants were grown in greenhouses under 14/10 h of light/dark conditions at a continuous temperature of 23 ± 1 °C. Total RNA of untreated plant roots, stems, leaves flowers and hormone-treated leaves were extracted using Trizol (Sangon Biotech, Shanghai, China) according to the manufacturer's instructions. First-strand cDNA was synthesized by a Prime-Script ™ RT Reagent Kit (TaKaRa, Tokyo, Japan) according to the manufacturer's instructions. Primer Premier 5.0 was used to design the specific primers (Table S13) required in real-time PCR experiments, 26S RNA was used as an internal reference [71]. In qRT-PCR, 20 μL reaction volume contained 10 μL 2 × ApexHF FS PCR Master Mix, 0.4 μL forward primer, 0.4 μL reverse primer, 1 μL cDNA template and 8.2 μL ddH2O. qPCR reaction program was set as follows: 98 °C for 10 s; 35 cycles of 55 °C for 10 s and 72 °C for 10 s. GraphPad 8.0 software was used for data analysis.