RNA-seq analysis of the C. briggsae transcriptome

  Nansheng Chen 1,2,8

  1 Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada;
  2 CIHR/MSFHR Bioinformatics Training Program, Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia V5Z 1G1, Canada
  3 These authors contributed equally to this work.

    • Present addresses: 4 European Molecular Biology Laboratory, Heidelberg 06221-387-0, Germany;

    • 5 Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada;

    • 6 GenomeDX Biosciences Inc., Vancouver V6J 1J8, Canada;

    • 7 Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada.

    Abstract

    Curation of a high-quality gene set is the critical first step in genome research, enabling subsequent analyses such as ortholog assignment, cis-regulatory element finding, and synteny detection. In this project, we have reannotated the genome of Caenorhabditis briggsae, the best-studied sister species of the model organism Caenorhabditis elegans. First, we applied the homology-based gene predictor genBlastG to annotate the C. briggsae genome. We then validated and further improved the C. briggsae gene annotation through RNA-seq analysis of the C. briggsae transcriptome, which resulted in the first validated C. briggsae gene set (23,159 genes), among which 7347 genes (33.9% of all genes with introns) have all of their introns confirmed. Most genes (14,812, or 68.3%) have at least one intron validated, compared with only 3.9% in the most recent WormBase release (WS228). Of all introns in the revised gene set (103,083), 61,503 (59.7%) have been confirmed. Additionally, we have identified numerous trans-splicing leaders (SL1 and SL2 variants) in C. briggsae, leading to the first genome-wide annotation of operons in C. briggsae (1105 operons). The majority of the annotated operons (564, or 51.0%) are perfectly conserved in C. elegans, with an additional 345 operons (or 31.2%) somewhat divergent. In addition, RNA-seq analysis revealed more than 10,000 small assembly errors in the current C. briggsae reference genome that can be readily corrected. The revised C. briggsae genome annotation represents a solid platform for comparative genomics analysis and evolutionary studies of Caenorhabditis species.
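
The intron-validation figures quoted in the abstract (genes with all introns confirmed, genes with at least one intron confirmed, and the fraction of all introns confirmed) could be tallied along the following lines. This is a minimal illustrative sketch in Python, not the authors' pipeline. The `Gene` class, the `(contig, start, end, strand)` intron key, and the function name `summarize_intron_support` are all hypothetical. The sketch assumes that the set of RNA-seq-supported introns has already been extracted from spliced read alignments, and that gene-level percentages are taken over genes that have introns.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Hypothetical intron key: (contig, start, end, strand).
# This representation is an assumption made for illustration only.
Intron = Tuple[str, int, int, str]


@dataclass
class Gene:
    name: str
    introns: List[Intron]  # annotated introns; empty for single-exon genes


def summarize_intron_support(genes: List[Gene],
                             confirmed: Set[Intron]) -> Dict[str, float]:
    """Tally how well annotated introns are supported by RNA-seq evidence.

    `confirmed` holds introns observed in spliced read alignments.
    Gene-level percentages use genes with at least one intron as the
    denominator (an assumption of this sketch).
    """
    with_introns = [g for g in genes if g.introns]

    # Genes for which every annotated intron is confirmed.
    fully = sum(all(i in confirmed for i in g.introns) for g in with_introns)
    # Genes for which at least one annotated intron is confirmed.
    partly = sum(any(i in confirmed for i in g.introns) for g in with_introns)

    # Count each distinct annotated intron once, even if genes share it.
    annotated = {i for g in with_introns for i in g.introns}
    n_confirmed = len(annotated & confirmed)

    def pct(part: int, whole: int) -> float:
        return round(100.0 * part / whole, 1) if whole else 0.0

    return {
        "genes_with_introns": len(with_introns),
        "genes_all_introns_confirmed": fully,
        "genes_any_intron_confirmed": partly,
        "pct_genes_all_confirmed": pct(fully, len(with_introns)),
        "pct_genes_any_confirmed": pct(partly, len(with_introns)),
        "introns_total": len(annotated),
        "introns_confirmed": n_confirmed,
        "pct_introns_confirmed": pct(n_confirmed, len(annotated)),
    }
```

Calling `summarize_intron_support(genes, confirmed)` on a list of `Gene` records and a set of confirmed intron keys returns the counts and percentages in one dictionary. Keying introns by exact coordinates and strand means that only exact boundary matches count as confirmation, which is one plausible reading of "confirmed" but is not stated in the abstract.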

    Footnotes

    • Received November 9, 2011.
    • Accepted April 30, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
