Transcriptome analysis of diploid and triploid Populus tomentosa

Triploid Chinese white poplar (Populus tomentosa Carr., Salicaceae) has stronger advantages in growth and better stress resistance and wood quality than diploid P. tomentosa. Using transcriptome sequencing technology to identify candidate transcriptome-based markers for growth vigor in young tree tissue is of great significance for the breeding of P. tomentosa varieties in the future. In this study, the cuttings of diploid and triploid P. tomentosa were used as plant materials, transcriptome sequencing was carried out, and their tissue culture materials were used for RT-qPCR verification of the expression of genes. The results showed that 12,240 differentially expressed genes in diploid and triploid P. tomentosa transcripts were annotated and enriched into 135 metabolic pathways. The top six pathways that enriched the most significantly different genes were plant-pathogen interaction, phenylpropanoid biosynthesis, MAPK signalling pathway-plant, ascorbate and aldarate metabolism, diterpenoid biosynthesis, and the betalain biosynthesis pathway. Ten growth-related genes were selected from pathways of plant hormone signal transduction and carbon fixation in photosynthetic organisms for RT-qPCR verification. The expression levels of MDH and CYCD3 in tissue-cultured and greenhouse planted triploid P. tomentosa were higher than those in tissue-cultured diploid P. tomentosa, which was consist ent with the TMM values calculated by transcriptome.


INTRODUCTION
Populus tomentosa is an important timber and ecological tree species in northern China, and it is one of the most important fast-growing native tree species in farmland shelterbelt, timber forest, and landscape tree from North China to Northwest China (Fan et al., 2005;Li et al., 2020). In terms of wood, it has good wood quality (with long fiber and white wood), high-fiber content, good paint, and cementation properties. In terms of growth characteristics, it has fast growth, long life (compared to other trees in northern China), drought and salinity resistance, strong smoke and pollution resistance, and a crown shape (Ci et al., 2019). Based on its excellent characteristics and the needs of ecological and economic construction, researchers have paid more and more attention to the field of forest breeding in China (Zong et al., 2019).
At present, due to the problems of high genetic heterozygosity and the long breeding cycle of forest trees, the genetic improvement process of conventional cross-breeding is seriously hindered, and the productivity of varieties used in forestry production is difficult to meet the needs of industrial development for wood, biomass, paper, fuel and biomaterials (Harfouche et al., 2012;Zong et al., 2019). An in-depth study of the biological basis of wood formation and analysis of the genetic regulation mechanism of wood formations was essential to accelerate the process of improved forest varieties (Lin et al., 2017). The study of tree transcriptomes enables us to analyze the genetic basis of the formation of tree traits (Hao et al., 2011;Zhang et al., 2019;Sun et al., 2019). Because of long cycles, breeding of new varieties with expected traits cannot be rapidly achieved through traditional crossing methods alone (Harfouche et al., 2012). The combination of conventional breeding and modern breeding techniques represented by genomics is the main trend in the field of forest breeding in the future (Harfouche et al., 2012). Combining conventional breeding methods with transcriptome sequencing technology to screen tree growth-related genes is not only a breakthrough in variety selection but also a new field of exploration, and to identify candidate transcriptome-based markers for growth vigor in young tree tissue is very important (Hao et al., 2011;Sun et al., 2019).
Polyploidy is important for the evolution of plants (Sattler, Carvalho & Clarindo, 2016;Liu & Sun, 2019). RNA interference and dosage compensation in a polyploid cell often leads to epigenetic changes as it alters the gene expression levels (Osborn et al., 2003;Soltis, Soltis & Tate, 2004). Many studies have found significant differences between the expression levels of genes in diploid and polyploid plants (Osborn et al., 2003;Gutierrez-Gonzalez & Garvin, 2017;Li et al., 2019). Polyploid plants often exhibit commercially beneficial qualities, e.g., increased vigour, improved product quality, enlarged organs, enhanced tolerance to both biotic and abiotic stresses and increased heterozygosity and heterosis, in contrast to their diploid relatives (Sattler, Carvalho & Clarindo, 2016;Liu & Sun, 2019). Polyploidy often results in downregulated fertility because the expression of fertility-related genes is lower than that of compared to their diploid relatives (Li et al., 2019). Developing polyploidy in plants has been the focus of many plant breeders for some time (Huang, Li & Cong, 1990;Bancroft et al., 2011;Rambani, Page & Udall, 2014;Li et al., 2019;Shenton et al., 2020).
There are many advantages of triploid P. tomentosa plants compared with diploid ones (Zhu et al., 1995;Chen et al., 2017). Studies have shown that triploid P. tomentosa is superior to diploid P. tomentosa in both volumes per plant and papermaking (Chen et al., 2017;Li & Zhang, 2000). P. tomentosa has a great advantage in the construction of shelterbelt and fast-growing and high-yield forest (Li et al., 2020). However, the related studies on the differences in growth between diploid and triploid P. tomentosa are mostly focused on phenotypic studies, while there are few studies on the genes that control the differences in their growth characteristics. In this study, the transcriptome analysis was subjected to an analysis of the differences in gene expression between diploid and triploid P. tomentosa, which derived from genome doubling and hybridization with different genotypes (Zhang et al., 2008), and RT-qPCR analysis was used for verification of the results of the transcriptome analysis. It would lay a foundation to screen growth-related genes for providing high-quality, fast-growing industrial timber and speeding up the genetic improvement of forest trees.

Materials
Diploid and triploid P. tomentosa root sprouts were collected in Kunming World Horticultural Expo Park and on the campus of Southwest Forestry University in May 2019. These sites were adjacent to each other. The triploid poplar [(P. tomentosa × P. bolleana) × P. tomentosa] was produced from Beijing Forestry University (Zhang et al., 2008), whereas the diploid plants were native to the area. To obtain samples of the same physiological age, about one-year-old (variability in their ages about one month) well-grown root sprouts were selected and sampled at 9 a.m. The leaves were removed, and about one gram of stem segments under the fifth leaf down from the tip of the branch was sent to Wuhan Huada Gene Company for transcriptome sequencing (Ye et al., 2020). Among them, diploid P. tomentosa stem segments were used as the control group (D1, D2, D3), and triploid P. tomentosa stem segments were considered as the treatment group (T1, T2, T3). D1, D2, D3, T1, T2 and T3 were used to represent biological replicates.

Sequencing data filtering
RNA was extracted using the Tiangen kit (TIANGEN Biotech Beijing Co., Ltd.), which was produced by the Wuhan Huada Gene Company. The library was generated according to the operation manual of the NEBNext R Ultra TM RNA Library Prep Kit for Illumina R (NEB, USA). The samples were sequenced using the Bgiseq-500 platform. The data were pairedend, and fastq formatted files were used at the beginning of the analysis. The data was filtered using Trimmomatic v0.36 software (Bolger, Lohse & Usadel, 2014), and Soapnuke v1.4.0 (Chen et al., 2018) was used for statistics. A trimming of the polyA tails was conducted, and a cut-off threshold of Q score 30 was chosen and bases with Q-score less than 30 were trimmed.

Gene annotation
After obtaining clean reads, Bowtie2 v2.2.5 (https://sourceforge.net/projects/bowtiebio/files/bowtie2/2.2.5/) was used to compare clean reads to P. trichocarpa reference genome (Tuskan et al., 2006) (http://popgenie.org/start?genelist=close). Then RSEM was used to calculate the expression levels of genes and transcripts (Li & Dewey, 2011). Seven major functional database annotations-KEGG Ortholog, KO; Gene Ontology, GO; NCBI nonredundant protein sequences, NR; NCBI nucleotide sequences, NT; A manually annotated and reviewed protein sequence database, SwissProt; Protein family, PFAM; and Clusters of Orthologous Groups of proteins, KOG-were performed on the assembled UniGenes. BLAST was used for the functional annotation of NT, Diamond was used for NR, KOG, SwissProt and KEGG, Hmmscan for PFAM and Blast2GO for GO, with the e-value of 1e −10 (Feng et al., 2019).

Detection of differentially expressed genes (DEGs)
Using the transcriptome data of D1, D2, D3, T1, T2 and T3, DESeq2 software (Anders & Huber, 2010) was employed to screen DEGs with parameters of Fold Change 2 and adjusted P-value ≤ 0.001. The genes with a fold of more than twice in triploid P. tomentosa compared to diploid P. tomentosa and Q-value ≤ 0.001 were screened as differentially expressed genes. QUANT (https://www.quantsoftware.com/) was used to normalise and reduce the variance and help detect DEGs. The transcripts were filtered to show the minimum count before assessing the differential expression. TMM values were calculated by using the edgeR software (Maza, 2016).

Analysis of gene ontology (GO) function and KEGG function of DEGs
GO annotations were performed using nucleotide data. According to the GO annotation and KEGG annotation results, as well as the functional annotation, the differential genes were functionally classified. The detection of transcription factor families was based on these annotations. Meanwhile, the phyper function (Evans, Hastings & Peacock, 2000) in R software was used for enrichment analysis, the p-values were calculated and then corrected by False Discovery Rate (FDR) (Burger, 2018). Finally, the function with Q-value ≤0.05 was deemed significant enrichment. The test conducted was one tailed for enrichment.
GetORF was used to detect the ORF of UniGene, while HMMER 3.0 hmmsearch was used to compare the ORF. The characteristics of the transcription factor family were then analysed by PlantTFDB (Sanseverino et al., 2010). The gene alignments were annotated into the plant-disease resistance gene database (PRGdb) by using DIAMOND (https://github.com/bbuchfink/diamond) v0.8.31 software (Buchfink, Xie & Huson, 2015).
The candidate coding region in UniGenes was identified by TransDecoder v2.0.1 software, and then the PFAM protein homologous sequence was examined by BLAST alignment SwissProt database and Hmmscan 3.0 (http://hmmer.org), thus predicting the coding region CDSs. MISA (http://pgrc.ipk-gatersleben.de/misa) was used to detect simple sequence repeats (SSR) in UniGene with the parameters set as default (Sen et al., 2018).

Tissue culture of diploid and triploid P. tomentosa
The leaves of diploid and triploid P. tomentosa were inoculated on a callus induction medium (MS+1.2 mg/l 6-BA+0.6 mg/l NAA) for 15 days. After which, they were transferred to an aseptic differentiation medium (MS+1.0 mg/l 6-BA+0.4 mg/l NAA) for 28 days. Once the adventitious buds had grown by two to three cm, they were moved to a rooting medium (1/2 MS + 0.4mg/L IBA). After the adventitious roots had grown by two to three cm, the seedlings were then moved to a greenhouse.

Real-time quantitative polymerase chain reaction (RT-qPCR) validation
To verify the reliability of transcriptome sequencing results and the expression of key genes, diploid and triploid Chinese white poplar tissue culture seedlings of the same age and growth conditions, which were derived from the same trees used for transcriptome, tissue culture plants and planted in the greenhouse, at 1 month, 4 months, 7 months, 10 months and 13 months old plants, were used for RT-qPCR validation. As in transcriptome sequencing, stem segments under the fifth leaf down from the tip of a branch were selected for validation. The method of tissue culture was carried out according to the method reported by Hu et al. (2005). The stem cuttings were sampled at 9 o'clock in the morning. Ten growth-related genes were selected from the pathway of plant hormone signal transduction (Ko04075), carbon fixation in photosynthetic organisms (Ko00710), nitrogen metabolism (Ko00910) and tryptophan metabolism (Ko00380) for RT-qPCR validation. These genes were up-regulated in triploid P. tomentosa compared to diploid P. tomentosa, and taken as candidate markers for plant breeders in the future. Using the EF1a as an internal reference gene, and using the gene sequences of the transcriptome, the primers were designed using Primer primer 5 (Lalitha, 2000), as listed in Table 1. The RT-qPCR processes were described according to Li et al. (2019) on a Bio-Rad CFX96TM Real-time PCR Detection System (Bio-Rad, California, USA), with a final volume of 25.0 µl, containing 2.5 units of Taq DNA polymerase, 0.4 mM deoxyribonucleotides (dNTPs), 20 µl of ddH 2 O, 1 µl of cDNA, and 500 nM of each primer. Each sample was carried out in triplicate for the RT-qPCR reactions. The data were normalised with the housekeeping gene EF1α, and the method was employed in line with the previous study (Liu et al., 2018). It could be seen that the annotations obtained in NT (NBCI nucleic acid sequence database) were the most, 93.04%; the overall annotation rate was 93.81%; and the lowest was 65.12% in the Swissprot database. A total of 86.43% of the annotations were obtained in the NR (NCBI protein database). According to the specific species distribution chart of the annotations, a total of five species were matched. The highest of the first three were P. trichocarpa, with 53.95% of the annotations, followed by P. euphratica, with 35.68% of the annotations, and finally, P. tomentosa, with 3.26% annotation.

Transcription factor (TF) prediction
The TF prediction results showed that the genes belonged to a total of 55 transcription factor families, of which, with the largest number of genes, was the MYB gene family, with a total of 438 genes involved in the expression, followed by the AP2-EREBP gene family with 290 genes, and finally the bHLH gene family with 239 genes (see Fig. 2). Among these transcription factor families, both the MYB family and the WRKY family (including 208 genes) are involved in plant growth and development processes, which can provide relevant information for our subsequent screening of growth-related genes.

Cluster analysis and GO classification of DEGs
The number of up-regulated genes in diploid P. tomentosa compared with triploid P. tomentosa was 15,690 and the down-regulated gene was 16,971. The scatter plot of DEGs showed that the difference of transcriptional profiles between triploid and diploid samples was obvious (see Fig. 3). The GO function was divided into three branches: molecular function, cellular component, and biological process. Figure 4 shows the functional classification based on differential gene detection.
A total of 22,375 differentially expressed genes that had GO annotations were obtained in GO classification entries by using the classification of 32,661 common differential genes. There were 13,720 DEGs in biological processes, including 6,481 up-regulated genes and 6,879 down-regulated genes; 15,963 DEGs in cell composition, including 7,687 up-regulated genes and 8,276 down-regulated genes; and 1,092 DEGs in molecular function, including 8,553 up-regulated genes and 9,539 down-regulated genes.
Among the three branches of GO function entries, the number of DEGs of binding was the largest in molecular function, the number of DEGs of the cell and cell part was the largest in the cellular component, and the number of DEGs of the cellular process was the largest in the biological process. There are 44 DEGs that belonged to lignin production-related transcripts (see Table 2), which are important for tree breeders.
A total of 38 entries were enriched in GO function (Q ≤ 0.05) (see Table 3). Biological processes accounted for 36.8% of the total, of which catalytic activity accounted for the largest proportion of biological processes, at 42.86%; cell components accounted for 18.5% of the total, of which cell accounted for 42.85%; and molecular functions accounted for 44.7% of the total, of which catalytic activity accounted for 47.05%.

Analysis of pathway function of DEGs
Using Q-value ≤ 0.05 as the standard, 32,661 differential genes were separately analysed by pathway enrichment using the KEGG database. There were 16 significantly enriched KEGG metabolic pathways. Among them, the most frequently occupied pathways were the metabolism branch, with 13, and a total of 19,469 DEGs involved; the second was the biological system branch, with two, and a total of 1,447 DEGs involved; and finally, the environmental information process branch, with one, with 868 DEGs involved.
The results showed that there were 12,240 significantly different genes annotated in the KEGG pathways of diploid and triploid samples. The top differentially expressed genes significantly enriched six pathways were as follows: 1,115 differential gene expression in the plant-pathogen interaction (KO: ko04626) pathway; 396 differential genes in the phenylpropane biosynthesis (KO: ko00940) pathway; 868 differential genes in the mitogen-activated protein kinase (MAPK) plant signalling pathway (KO: ko04016); 170 differential genes expression in the ascorbate and aldarate metabolism pathway (KO: ko00053); 64 differential genes in the diterpenoid biosynthesis pathway (KO: ko00904); and 43 differential genes in the beets red pigment biosynthesis pathway (KO: ko00965). All differentially expressed genes significantly enriched pathways were shown in Table 4. The results showed that there were 496 DEGs upregulated and 619 downregulated in the plant-pathogen interaction pathway, and that there were 864 DEGs enriched in the organismal systems category of GO annotations, 217 DEGs in environmental information processing, 25 DEGs in genetic information processing and nine DEGs in metabolism. The differential genes in the plant-pathogen interaction (KO: ko04626) pathway could be enriched in 46 GO function entries. These GO function entries are integral components to fungus and ribosome biogenesis in the following components: the membrane, signal transduction, defence response, DNA-templated transcription, calcium ion binding, ADP binding, regulation of membrane potential, cell surface receptor signalling pathway, abscisic acid-activated signalling pathway, protein phosphorylation, peptidyl-serine phosphorylation, lipid metabolic process, intracellular signal transduction, proteolysis, protein autophosphorylation, ATP binding, plasma membrane, mRNA transcription, protein folding, 2-alkenal reductase [NAD(P)] activity, activation of protein kinase activity, kinase activity, regulation of mitotic cell cycle, calcium ion homeostasis, signal transduction by protein phosphorylation, intracellular, response to stress, membrane, extracellular region, metal ion transport, primary miRNA processing, stress-activated protein kinase signalling cascade, cellular transition metal ion homeostasis, response to temperature stimulus, calmodulin-dependent protein kinase activity, mitochondrion, endosome, protein kinase activity, developmental process involved in reproduction, DNA-templated regulation of transcription, plant-type hypersensitive response, defence response to bacterium, incompatible interaction, lipid catabolic process, pre-miRNA processing and defence response. Among them, the top five entries were 184 differential genes in the integral component of membrane, 174 in the signal transduction, 145 in defence response, 128 in DNA-templated transcription and 53 in calcium ion binding. In addition, there were 104 differential genes in the plant-pathogen interaction pathway that had zero expression in the triploid poplar.
In the process of analysing the growth-related pathways, it was found that the growthrelated genes were up-regulated and down-regulated, so it was difficult to explain the difference between diploid and triploid samples. The NCBI database was used for gene information annotation; the details are shown in Table 5.

RT-qPCR validation
Genes related to growth which up-regulated in the triploid P. tomentosa compared to diploid P. tomentosa, as candidate markers for plant breeders in the future, were selected for RT-qPCR validation. There was no peak in the dissolution curves of RT-qPCR  products of GH3, CA, and A-ARR gene. And the other seven genes showed a single peak curve, indicating that their amplification products did not contain primer dimers or nonspecific amplification products, and each primer PCR reaction was specific. The results of comparison of the transcriptome analysis and the RT-qPCR analysis are shown in Fig. 5A. In diploid plants, the expression levels of the SAUR, FDP, ALDH, AUX1 and ABF in the RT-qPCR analysis were significantly higher than those in the transcriptome analysis. In triploid plants, the expression levels of the ALDH and AUX1 in the RT-qPCR analysis were significantly lower than those in the transcriptome analysis, whereas the expression levels of the malate dehydrogenase (MDH ) and CYCD3 in the RT-qPCR analysis were significantly higher than those in the transcriptome analysis. The expression levels of AUX1, CYCD3, and MDH in tissue-cultured triploid poplar were higher than those of tissue-cultured diploid samples, which were consistent with the changes of TMM values calculated by transcriptome. Among them, the expression levels of the AUX1 gene in triploid samples were significantly higher than those in diploid samples (about 6.88 times, logFC value). However, the expression of the ABF gene in diploid samples was higher than that of triploid samples, which was not in accordance with the TMM values. In further experiments, the expression levels of MDH and CYCD3 in tissue-cultured and greenhouse planted triploid poplar were significantly higher than those of tissue-cultured and greenhouse planted diploid samples. And the expression levels of MDH and CYCD3 increased with the age gradually (Figs. 5B, and 5C).

DISCUSSION
The transcriptome analysis of the new rooting stem segments of diploid and triploid P. tomentosa showed that most of the significantly different genes were concentrated in plant-pathogen interaction, phenylpropane biosynthesis pathway, and MAPK signalling pathway-plant. Under the condition of comprehensive screening of the GO function enrichment and KEGG function analysis of transcriptome data, it is difficult to determine the expression of specific genes when the genes associated with plant growth appear up-regulated and down-regulated. Huang, Li & Cong (1990) shown there to be larger branches, leaves and fruits in the triploid variety of pear when compared to the diploid variety. And the production of tetraploid radish is 20% higher than that of ordinary diploids (Liu, Wang & Yan, 2003). Zhu et al. (1995) reported that allotriploidy of P. tomentosa had  greater values than diploidy under the same growth conditions in tree height, diameter at breast height, and single plant volume at the age of eight years. It is interesting that although the transcriptome of the diploidy and triploidy of P. tomentosa was analysed here, according to the annotations of the specific species distribution chart, only 3.26% was annotated by P. tomentosa. This may partly be due to the fact that the triploid materials used in this study were obtained by the hybridised P. bolleana, and some of the mRNAs did not belong in P. tomentosa. Of course, it may also partly be due to the fact that the plant stems were taken from annual plants in this study, which is different to previous studies on P. tomentosa. Previous studies on the transcriptome analysis of P. tomentosa were mostly based on aseptic seedlings, root one-month-old stem and leaf sample materials Wang, Wu & Bo, 2018), and the expression of mRNA was organ-specific (Ohtsuki et al., 2005).
The phytohormone signal transduction pathway controls plant cell division, cell elongation, cell enlargement, and stem elongation, which is closely related to plant growth and development (Guo et al., 2020). Compared with previous data of transcriptome analysis on phytohormone signal transduction, we selected AUX1, GH3, A-ARR, CYCD3, ABF and five other genes for the RT-qPCR validation. Huge differences were found between the expression levels of the selected genes in the RT-qPCR analysis and those of the transcriptome analysis, and we inferred that it might be due to the different growth stages of the experiment's materials. Photosynthesis is an important metabolic process in plants, and its strength has an important effect on plant growth, development, and stress resistance. Li & Zhang (2000) found that the lower leaves of the fast-growing triploid P. tomentosa clones could maintain a higher photosynthetic rate when measuring the leaf net photosynthetic rate of diploid and triploid P. tomentosa. Photosynthesis includes a series of complex reactions in which carbon fixation is a central link in the regulation of photosynthesis (Feng et al., 2006). We compared the data of carbon fixation pathways in previous photosynthetic organism kiwifruit (Li et al., 2019) and selected the up-regulated malate dehydrogenase (MDH ) for the RT-qPCR validation, and the expression levels of MDH gene were up-regulated in triploid P. tomentosa plants, compared to the diploid ones. With the possible exception of the stomatal dimension, the response to polyploidy can be very variable and complex. It has been proved in oilseed rape (Bancroft et al., 2011), sugarcane (Manners & Casu, 2011), cotton (Rambani, Page & Udall, 2014), wheat  (Leach et al., 2014), kiwifruit (Li et al., 2019), rice (Shenton et al., 2020) and numerous studies in Arabidopsis species, some of which specifically consider triploidy (Fort et al., 2016;Fort et al., 2017;Pacey, Maherali & Husband, 2019). Related to the phenomenon, the gene expression levels were not upregulated in any polyploid plant, and over-, underor mixed-expression of genes were found in the polyploid plant (Osborn et al., 2003;Gutierrez-Gonzalez & Garvin, 2017;Li et al., 2019). Some genes in polyploid plants were upregulated while others were downregulated compared to diploid plants, which was also found in this study. It illustrated that the allopolyploidy -in particular in P. tomentosa -also alter the gene expression profile and levels as well as those in autopolyploid plants, compared to their diploid relatives. In this study, the expression levels that related to the growth genes such as MDH and CYCD3 in triploid P. tomentosa were higher than those of diploid P. tomentosa. MDH is mainly involved in the metabolism of plant photosynthesis (Sawada et al., 2002). The main function of the protein encoded by the MDH gene is to control the carbon dioxide levels during photosynthesis (Sawada et al., 2002). Higher expression levels of the MDH gene was proved to be related to higher rates of photosynthesis (Kandoi, Mohanty & Tripathy, 2018), which in turn contributes to higher timber yield. CYCLIN D3 (CYCD3) is a cell-cycle gene, and overexpression of CYCLIN D3;1 (CYCD3;1) in transgenic plants can increase mitotic cycles and reduce endocycles (Menges et al., 2006). CYCD3 was found to regulate cambial cell proliferation and secondary growth, and the protein encoded by the CYCD3 gene is required for normal vascular development in Arabidopsis (Collins, Maruthi & Jahn, 2015). Previous studies have indicated a relationship between faster 1, 1 month old; 2, 4 months old; 3, 7 months old; 4, 10 months old; 5, 13 months old. **, the difference is very significant (P-value < 0.01). RT-qPCR was performed on 3 diploid and 3 triploid plants, which were driven from the same tree used for transcriptome analysis, normalized with housekeeping gene EF1α, repeated 3 times. The 2 (− Ct) method was utilized to process the data (Liu et al., 2018). Because the difference of the values in (A) was too large, the y-axis of (A) was changed to a log scale.
Full-size DOI: 10.7717/peerj.10204/ fig-5 growth and the increased expression of the CYCD3 gene. Hence, the upregulation of MDH and CYCD3 in the triploids is biologically meaningful. There were many DEGs between the diploid and triploid poplar plants that were enriched for a plant-pathogen interaction pathway, stress resistance and several growthrelated transcripts too. Genes in the plant-pathogen interaction pathway are known to be diverse and large in plants and are known to be involved in reproduction isolation across Arabidopsis and Poplars too (Liao et al., 2014;Qian et al., 2018). In this study, one differential gene of the developmental process involved in reproduction was also found in the plant-pathogen interaction pathway. Further, the enrichment of differential expression transcripts in this category could provide useful information to tree breeders who intend to generate heterotic F1s. The pathway enrichment approach also shows several other interesting candidates that would be worth elaborating on in the future relating to studies demonstrating higher stress resistance in the triploid poplar.

CONCLUSION
A total of 32,661 DEGs were identified in triploid and diploid Chinese white poplar, of which 15,690 were up-regulated and 16,971 were down-regulated in triploidy compared to diploidy. Through the comprehensive analysis of GO functional enrichment analysis and the pathway functional annotation of transcriptome data of diploid and triploid P. tomentosa, no significantly enriched entries and pathways related to growth were found. Compared to diploidy, the growth-related genes were found to be up-regulated and downregulated in the natural diploid and triploid P. tomentosa trees. Although the expression levels of genes were unstable in the different environments and different growth stages, the expression levels of MDH and CYCD3 in triploid P. tomentosa were higher than those of diploid P. tomentosa in young tree tissue, which was consistent with the values calculated using the transcriptome data.