Abstract
This study describes the construction of a database of transposable elements (TEs) from Rosaceae, the third most economically important plant family in temperate regions, and its application to transcriptomics. The evolutionary effects of TEs on gene regulation have been explored, and TE insertions can be the molecular basis of changes in gene structure and function. However, a dedicated Rosaceae plant TE database (RPTEdb) has been lacking. The genomes of several Rosaceae plants have been sequenced, providing the opportunity to mine TE data at the whole-genome level. We therefore constructed the RPTEdb, a collective and comprehensive database of 19,596 annotated TEs in the genomes of Rosaceae plants, using previously described identification and annotation methods and published genome sequences. The user-friendly web-based database provides access to research tools through hyperlinks, including the Browse, TE tree, tools, JBrowse, and search sections, and through sequence input on the main webpage. As an advanced application, we identified TEs near predicted long non-coding RNA (lncRNA) and mRNA domains within white and red petal-tissue transcriptomes of Prunus mume ‘Fuban Tiaozhi’. This analysis revealed 16 TEs that overlapped or were near 16 differentially expressed lncRNA domains and 54 TEs that overlapped or were near 54 differentially expressed mRNA domains; the possible functions of these TEs are also discussed. We believe that the RPTEdb will contribute to the understanding of TE roles in the structural, functional and evolutionary dynamics of Rosaceae plant genomes.
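The step of finding TEs that overlap or lie near transcript domains can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state how "near" was defined, so the 2,000 bp window, the `Interval` record, and the function names below are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic feature with 1-based inclusive coordinates (hypothetical record)."""
    name: str
    chrom: str
    start: int
    end: int


def coordinate_gap(a: Interval, b: Interval) -> int:
    """Return 0 if the intervals share at least one base; otherwise the
    difference between the nearer start and end coordinates."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def tes_near_transcripts(tes, transcripts, window=2000):
    """Pair each transcript with TEs on the same chromosome that overlap it
    or lie within `window` bp. The default window is an assumed value,
    not one taken from the study."""
    hits = []
    for tx in transcripts:
        for te in tes:
            if te.chrom != tx.chrom:
                continue
            d = coordinate_gap(te, tx)
            if d <= window:
                relation = "overlap" if d == 0 else "near"
                hits.append((te.name, tx.name, relation, d))
    return hits


if __name__ == "__main__":
    # Toy coordinates, invented for the example.
    tes = [
        Interval("TE1", "chr1", 100, 200),
        Interval("TE2", "chr1", 5000, 5100),
    ]
    transcripts = [
        Interval("lncRNA1", "chr1", 150, 400),
        Interval("mRNA1", "chr1", 3500, 4000),
    ]
    for hit in tes_near_transcripts(tes, transcripts):
        print(hit)
```

The nested loop is quadratic and is only meant to make the overlap-or-near criterion explicit; for whole-genome annotations a sorted sweep or an interval tree would be the more practical choice.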
Abbreviations
- TE: Transposable element
- lncRNA: Long non-coding RNA
- mRNA: Messenger RNA
- LTR: Long terminal repeat
- MITE: Miniature inverted repeat transposable element
- RPTEdb: Rosaceae plant transposable elements database
- HMM: Hidden Markov model
- WT: White petal tissues
- RT: Red petal tissues
- FPKM: Fragments per kilobase of transcript per million fragments
Acknowledgements
This study was funded by the National Natural Science Foundation of China (No. 31501787), the Fundamental Research Funds for the Central Universities (No. 2016ZCQ02), and the Special Fund for Beijing Common Construction Project.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by S. Hohmann.
Cite this article
Ma, K., Zhang, Q., Cheng, T. et al. Identification of transposons near predicted lncRNA and mRNA pools of Prunus mume using an integrative transposable element database constructed from Rosaceae plant genomes. Mol Genet Genomics 293, 1301–1316 (2018). https://doi.org/10.1007/s00438-018-1449-y
DOI: https://doi.org/10.1007/s00438-018-1449-y