Integrated Modeling of Structural Genes Using MCuNovo

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 1858)

Abstract

Correct modeling of protein-coding genes based on genome and cDNA data is a prerequisite for functional studies. Various programs such as MAKER, Cufflinks, Oases, and Trinity have been developed, each with advantages and drawbacks. Manually integrating different models for a single gene is cumbersome, and it becomes a daunting task for the 14,000–18,000 genes of a typical holometabolous insect. We developed methods to evaluate the output of MAKER, Cufflinks, Oases, and Trinity and to select the best models to constitute the MCOT1.0 set for Manduca sexta, a biochemical model insect. To apply these methods to other organisms, we improved the algorithm (designated MCuNovo Gene Selector) and automated the data processing. In this chapter, we describe the background of the algorithm's development and how to prepare and run the program.


Acknowledgments

This study was supported by NIH grants GM58634 and AI112662. This work was approved for publication by the Director of Oklahoma Agricultural Experimental Station and supported in part under project OKLO2450.

Corresponding author

Correspondence to Haobo Jiang.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Cao, X., Jiang, H. (2019). Integrated Modeling of Structural Genes Using MCuNovo. In: Brown, S., Pfrender, M. (eds) Insect Genomics. Methods in Molecular Biology, vol 1858. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8775-7_5

  • DOI: https://doi.org/10.1007/978-1-4939-8775-7_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8774-0

  • Online ISBN: 978-1-4939-8775-7

  • eBook Packages: Springer Protocols
