Agostinho Antunes, Joan Pontius, Maria João Ramos, Stephen J. O'Brien, Warren E. Johnson, Mitochondrial Introgressions into the Nuclear Genome of the Domestic Cat, Journal of Heredity, Volume 98, Issue 5, July/August 2007, Pages 414–420, https://doi.org/10.1093/jhered/esm062
Abstract
Translocation of mtDNA into the nuclear genome, also referred to as numt, was first reported in the domestic cat (Felis catus) by Lopez et al. (1994). The Lopez-numt consisted of a translocation of 7.9 kbp of mtDNA that inserted into the domestic cat chromosome D2 around 1.8 million years ago. More than a decade later, the release of the domestic cat whole-genome shotgun sequences (1.9× coverage) provides the resource to obtain more comprehensive insight into the extent of mtDNA transfer over time in the domestic cat genome. MegaBLAST searches revealed that the cat genome harbors a wide variety of numts (298 320 bp), one-third of which likely correspond to the Lopez-numt tandem repeat, whereas the remaining numts are probably derived from multiple independent insertions, which in some cases were followed by segmental duplication after insertion in the nucleus. Numts were detected across most cat chromosomes, but the number of numts assigned to chromosomes is underestimated due to the relatively high number of numt sequences with insufficient flanking sequence to map. The catalog of cat numts provides a valuable resource for future studies in Felidae species, including its use as a tool to avoid numt contaminations that may confound population genetics and phylogenetic studies.
Cytoplasmic mitochondrial (cymt) DNA sequences may integrate into an organismal nuclear genome giving rise to nuclear DNA sequences of mitochondrial origin, also known as numts (Lopez et al. 1994). Mammalian numt integrations seem to be dead on arrival (pseudogenes) in part due to the differences between the nuclear and mitochondrial genetic codes (Gellissen and Michaelis 1987; Perna and Kocher 1996). Numts have been reported in dozens of animal and plant species (reviewed in Bensasson et al. 2001), but the broad view of the extent of mtDNA transfer among eukaryotes has been achieved only recently with the public release of several whole-genome sequence projects (e.g., Mourier et al. 2001; Pereira and Baker 2004; Richly and Leister 2004).
The human genome-sequencing project has facilitated the evaluation of mtDNA transfer and the evolutionary dynamics of numts (Mourier et al. 2001; Tourmen et al. 2002; Woischnik and Moraes 2002; Mishmar et al. 2004). Several studies have shown that hundreds of numts have been integrated in the human nuclear genome over a continuous evolutionary process (Tourmen et al. 2002; Woischnik and Moraes 2002) and that additional numt copies also arise from segmental duplication after insertion into the nucleus (Tourmen et al. 2002; Bensasson et al. 2003; Hazkani-Covo et al. 2003). Some of these numts are variable in human populations with respect to presence/absence, suggesting that they have only arisen recently (Ricchetti et al. 2004). Moreover, numts have occasionally been integrated into genes, leading to human genetic diseases (Willett-Brozick et al. 2001; Turner et al. 2003; Goldin et al. 2004).
Within the Felidae family, there have been 2 well-documented cases of numt integrations. The first consisted of the translocation of 7.9 kbp of the mitochondrial genome into the nuclear genome of an ancestor of the domestic cat (Felis catus) around 1.8 million years ago (Lopez et al. 1994). This large segment is tandemly repeated 38–76 times on cat chromosome D2. The second case, described in species of the genus Panthera, is an independent insertion of a 12.5-kbp mtDNA segment that integrated into chromosome F2 around 3.5 million years ago (Kim et al. 2006). More than a decade after the initial characterization of the Lopez-numt (Lopez et al. 1994), the release of the domestic cat nuclear genome sequences (1.9× coverage) provides the resource to obtain a more comprehensive insight into the extent of mtDNA transfer in F. catus and its significance for the study of genome evolution. In this study, we attempt to reconstruct the evolutionary dynamics of numt integration over time in the domestic cat nuclear genome.
Material and Methods
Numts Search in the Domestic Cat Genome
The “light” coverage genome sequence (1.9× coverage) of an Abyssinian cat (named Cinnamon), completed in late 2005 by Agencourt Bioscience Corp, Beverly, MA, has recently been assembled (including its gene annotation and chromosome mapping) and released in the feline genome web browser (http://lgd.abcc.ncifcrf.gov; Pontius et al. forthcoming 2007). We performed MegaBLAST searches (Zhang et al. 2000) using the domestic cat mtDNA genome sequence (GenBank accession NC_001700; Lopez et al. 1996) as query against the Abyssinian cat nuclear genome. The MegaBLAST arguments that were used (-D 3 -m 8 -s 50 -r 1 -q -1 -X 40 -W 16) require an exact match between the 2 genomes of at least 16 nucleotides (or 8 nucleotides [-W 8] in less stringent searches to discern further stretches of homology). Because the whole-genome shotgun (WGS) approach used for sequencing the Abyssinian cat genome can also retrieve shotgun sequences of mtDNA origin, we were able to reconstruct the Abyssinian cat mtDNA genome (cymt) from the high-similarity hits (mostly >99% ID) to the reference domestic cat mtDNA genome.
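As an illustration of how the tabular output requested with -m 8 could be summarized into hit counts and base-pair totals, the following is a minimal Python sketch. It assumes the standard 12-column tabular BLAST layout (query, subject, % identity, alignment length, mismatches, gap openings, query start/end, subject start/end, e-value, bit score). The function names, the example rows, and the choice to sum alignment lengths are hypothetical and are not taken from the original analysis pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject: str
    pct_identity: float
    length: int        # alignment length in columns
    bit_score: float


def parse_tabular(lines):
    """Parse 12-column tabular BLAST lines, skipping blanks and '#' comments."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(
            Hit(
                query=fields[0],
                subject=fields[1],
                pct_identity=float(fields[2]),
                length=int(fields[3]),
                bit_score=float(fields[11]),
            )
        )
    return hits


def summarize(hits, min_bits=0.0):
    """Return (number of hits, summed alignment length) above a bit-score cutoff."""
    kept = [h for h in hits if h.bit_score >= min_bits]
    return len(kept), sum(h.length for h in kept)


# Made-up rows for illustration only; subject names are placeholders.
example_rows = [
    "NC_001700\tscaffold_A\t99.50\t800\t4\t0\t1\t800\t5000\t5799\t0.0\t1500",
    "NC_001700\tscaffold_B\t88.00\t426\t51\t0\t15000\t15425\t120\t545\t1e-120\t700",
    "NC_001700\tscaffold_C\t85.00\t60\t9\t0\t3000\t3059\t10\t69\t1e-10\t90",
]

n_hits, total_bp = summarize(parse_tabular(example_rows), min_bits=150)
print(n_hits, total_bp)
```

Summed alignment length is only one possible way of reporting "base pairs"; overlapping hits on the same subject would need to be merged before the totals could be compared with Table 1.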
Sequencing Analyses
To unravel the phylogenetic relationships between numts and homologous mtDNA sequences, we first performed multiple alignments using ClustalX (Thompson et al. 1997), followed by phylogenetic analyses using a Kimura 2-parameter model and the minimum-evolution tree-building algorithm in MEGA2 (Kumar et al. 2001). Reliability of nodes defined by the phylogenetic trees was assessed using bootstrap replications.
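For readers unfamiliar with the Kimura 2-parameter model, the pairwise distance it defines can be sketched in a few lines of Python. This is only an illustration of the standard formula d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences; it is not the MEGA2 implementation, it does not build trees, and the function name and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Illustrative 20-bp alignment: 2 transitions and 1 transversion.
cymt_like = "ACGTACGTACGTACGTACGT"
numt_like = "GTTTACGTACGTACGTACGT"
print(round(k2p_distance(cymt_like, numt_like), 4))
```

The formula is undefined when the sequences are too divergent (the logarithm's argument becomes non-positive), so a production implementation would need to handle that case explicitly.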
The molecular date for the origin of each numt was estimated by applying the equation of Li et al. (1981), whereby the fraction of sequence divergence is δ = (μ₁ + μ₂)t, where μ₁ = 2.5 × 10⁻⁸ substitutions/site/year for mtDNA (Hasegawa et al. 1985; Lopez et al. 1997), μ₂ = 4.0 × 10⁻⁹ substitutions/site/year for nuclear pseudogene distance (Li 1997; Lopez et al. 1997), and t is the time elapsed.
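Rearranged for the elapsed time, the dating equation gives t = δ / (μ₁ + μ₂). A minimal Python sketch of this calculation follows, using the two rates quoted in the text. The function name is hypothetical, and the divergence value in the example is illustrative (back-calculated to correspond to 1.8 million years), not a measurement reported in this study.

```python
MU_MT = 2.5e-8    # substitutions/site/year, mtDNA (rate quoted in the text)
MU_NUC = 4.0e-9   # substitutions/site/year, nuclear pseudogene (rate quoted in the text)


def divergence_time(delta, mu1=MU_MT, mu2=MU_NUC):
    """Years elapsed since insertion, from delta = (mu1 + mu2) * t."""
    if delta < 0:
        raise ValueError("sequence divergence cannot be negative")
    return delta / (mu1 + mu2)


# Illustrative input only: a numt-to-cymt divergence of 5.22%.
years = divergence_time(0.0522)
print(f"{years / 1e6:.2f} million years")
```

Because the mtDNA rate dominates the sum, the estimate is far more sensitive to the assumed μ₁ than to μ₂.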
Results and Discussion
The MegaBLAST search of the cat nuclear genome sequences (contigs and unplaced reads) with the domestic cat mtDNA genome resulted in 489 hits totalling 334 458 bp (Table 1). Of these, 36 138 bp (10.8%) were classified as authentic mtDNA or cymt (shotgun sequences of mtDNA that were misassembled with the nuclear sequences), leaving 298 320 bp of numts (covering ∼99% of cymt sequence span). One-third of the numt sequences (96 094 bp) showed higher sequence identity to the previously described Lopez-numt than to the cat mtDNA genome and likely correspond to this numt lineage (Lopez et al. 1994). Indeed, scaffold 112 167 (∼78 kbp) showed 12 Lopez-numts (Figure 1) and likely represents a portion of the chromosome D2 numt tandem repeat (Lopez et al. 1994). However, given that domestic cats can have 38–76 repeats of the 7.9 kbp Lopez-numt copy (Lopez et al. 1994), the predicted amount of matching sequence should have been around 302 000–604 000 bp, or 3–6 fold greater than what was observed in the 1.9× coverage of the Abyssinian cat genome.
Table 1
Search | BLAST 16 words match | BLAST 16 words match | BLAST 16 words match | BLAST 8 words match |
Query | Cymt (reconstructed) | mtDNA (NC_001700) | Lopez-numt (U20754) | mtDNA (NC_001700) |
Hits | 489 | 499 | 283 | 699 |
Base pairs | 334 458 | 336 563 | 204 482 | 433 076 |
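The bookkeeping behind these totals can be checked with a few lines of arithmetic. The Python sketch below uses only figures quoted in the text and in Table 1; the variable names are hypothetical, and the expected range for the Lopez-numt repeat is taken directly from the text rather than recomputed.

```python
# Figures quoted in the text and Table 1 (W16 search, cymt query).
total_bp = 334_458          # all hits
cymt_bp = 36_138            # authentic mtDNA misassembled with nuclear sequence
lopez_like_bp = 96_094      # numts closer to the Lopez-numt than to mtDNA
lopez_query_bp = 204_482    # W16 search using the Lopez-numt as query

numt_bp = total_bp - cymt_bp                 # numt sequence
cymt_fraction = cymt_bp / total_bp           # share classified as cymt
other_numt_bp = numt_bp - lopez_like_bp      # numts outside the Lopez lineage
query_difference = total_bp - lopez_query_bp # cymt query vs Lopez-numt query

# Expected Lopez-numt content for 38-76 tandem repeats, as stated in the text.
expected_low, expected_high = 302_000, 604_000
fold_low = expected_low / lopez_like_bp
fold_high = expected_high / lopez_like_bp

print(numt_bp, other_numt_bp, query_difference)
print(f"cymt share: {cymt_fraction:.1%}")
print(f"expected/observed: {fold_low:.1f}-{fold_high:.1f} fold")
```

The shortfall relative to the expected repeat content is consistent with the text's observation that the 1.9× assembly captures only part of the chromosome D2 tandem array.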
The remaining two-thirds of the numt sequences (202 226 bp) probably represent insertions distinct from the Lopez-numt copy, as they have a broader range of homology and include portions of the cymt sequence that are not part of the Lopez-numt (which spans <CR-12S-16S-ND1-ND2-CO1-CO2>). These genes, from CO2/ATP8 to CytB/CR, accounted for the ∼130 kbp difference between BLAST searches using as query the cymt or the Lopez-numt sequence (Table 1). Phylogenetic analyses of homologous numts also show evidence of multiple historic insertions into the cat genome (Figure 2). These comprised several numt insertions since the onset of the Felidae family radiation (approximately 10.8 million years ago; Johnson et al. 2006) that included segments present (e.g., ND1 570 bp) and absent (e.g., CytB 426 bp) in the Lopez-numt (Figure 2).
Most of the numt sequences did not have corresponding flanking nuclear DNA regions (5′ or 3′) and thus could not be mapped in the cat genome (Figure 3). It should be noted, however, that when numt sequences are part of a large repeat motif, the probability of finding the flanking nuclear regions decreases considerably. The majority of numts were assigned to an “unknown chromosome aligning with mtDNA” (including most numts without nuclear flanking regions, namely, the multiple copies of the Lopez-numt lineage and also authentic mtDNA or cymt that were misassembled with the nuclear sequences). A few numts were assigned to an “unknown chromosome” and to an “unknown chromosome aligning with the dog genome” (Figure 3). Only 14.8% of the detected numts (i.e., 83 mtDNA sequence matches using a W16 search and bitscore >150) could be mapped onto cat chromosomes (Figure 3). These numts were distributed across most of the cat chromosomes, supporting the hypothesis that multiple independent numt insertions occurred in the cat genome within the last 11 million years (Figures 4 and 5; older integrations could be retrieved with the less stringent W8 search). None of the multiple copies homologous to the previously described Lopez-numt could be assigned to its chromosome D2 location (Lopez et al. 1994).
Because only a small portion of the cat numts could be assigned to specific chromosomes, it is likely that many other numts exist in the cat genome. Considering the overall amount of numts retrieved in the Abyssinian cat genome, the proportion of numt integration over time in cat seems to be relatively high (9.9 × 10⁻³%). Although this value should be interpreted with caution, as it was obtained from a 1.9× coverage of the domestic cat genome, it suggests that the proportion of numts in the cat genome is comparable to that of man (9.6 × 10⁻³%), the mammalian species with the largest numt repertoire described to date (Richly and Leister 2004). In addition, less stringent (W8) MegaBLAST sequence searches with the whole F. catus mtDNA genome identified around 100 kbp of more-divergent putative numts (Table 1), an indication of a series of more ancient numt insertions that await further annotation.
The overall proportion of numts in a genome is often larger than the incidence of novel numt insertions (primary integration), as mtDNA-like sequences may also result from duplication after insertion into the nucleus (secondary integration) (Tourmen et al. 2002; Bensasson et al. 2003; Hazkani-Covo et al. 2003). One of the best examples is that around one-third of the cat numts (3.2 × 10⁻³%) probably resulted from a unique insertion event that occurred 1.8 million years ago in chromosome D2 and was followed by segmental duplication (Lopez et al. 1994). Secondary integration also seems a likely explanation for the high frequency of related 16S-numts, particularly on cat chromosomes D1 and E3, and additional duplication events likely occurred in other chromosomes as well (see Figure 5).
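The two genome-wide percentages quoted above can be related to each other with a short calculation. The text does not state the genome size used as the denominator, so the Python sketch below back-calculates the size implied by the reported total and then derives the Lopez-numt share from it. The variable names are hypothetical and the implied genome size is an inference from the quoted figures, not a value given by the authors.

```python
numt_bp = 298_320        # total numt sequence (from the Results)
lopez_like_bp = 96_094   # numts attributed to the Lopez-numt lineage
total_pct = 9.9e-3       # reported numt proportion of the genome, in percent

# Genome size implied by the reported percentage (an inference, not a source value).
implied_genome_bp = numt_bp / (total_pct / 100)

# Share of the genome attributable to the Lopez-numt lineage, in percent.
lopez_pct = lopez_like_bp / implied_genome_bp * 100

print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gbp")
print(f"Lopez-numt lineage: {lopez_pct:.1e} %")
```

Under these figures the Lopez-numt lineage accounts for roughly a third of the total, in line with the proportion stated in the text.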
The cataloguing of numts in the cat genome is a valuable resource to recognize sources of numt contamination in population genetics (e.g., F. catus species-specific numt insertions) and phylogenetic studies within Felidae species (e.g., for numt insertions predating the F. catus species divergence). Indeed, numts can be identical in length to the usual size of the fragments isolated by polymerase chain reaction (PCR) technology (around 85% of the numts assigned to cat chromosomes are less than 850 bp long and have up to 95% sequence similarity with cymt) and can be preferentially amplified or coamplified (Perna and Kocher 1996; Zhang and Hewitt 1996a). Because numts are paralogs of the authentic mitochondrial sequences, they will confound genetic analyses when inadvertently included, especially when using more slowly evolving segments (Arctander 1995; Collura and Stewart 1995; Vanderkuyl et al. 1995; Zhang and Hewitt 1996b). The occurrence of numts bearing high similarity to cymt DNA, as with sequence heteroplasmy, necessitates more complicated data collection and analysis and in some species, like gorillas, can make analysis of mtDNA impractical (Thalmann et al. 2004). Furthermore, problems can also arise through the inadvertent incorporation of artifacts produced by in vitro recombination between numt and cymt sequence types during PCR amplification (Anthony et al. 2007). One implication is that explicit measures need to be taken to authenticate the mtDNA sequences generated (Bensasson et al. 2001; Anthony et al. 2007).
Conclusions
Compared with previous fine-scale studies on cat numt integration (Lopez et al. 1994, 1996), the recent release of the Abyssinian cat WGS sequences (1.9× coverage) allows a much more comprehensive understanding of the extent of mtDNA transfer into the domestic cat genome. The cat genome harbors a wide variety of numts that originated from multiple independent insertions, in some cases followed by segmental duplication, and are now widespread across most cat chromosomes. Nevertheless, the 1.9× coverage of the cat genome does not provide a complete catalog of cat numts: only around 15% of the detected numts could be mapped onto cat chromosomes, because most numt insertions lack sufficient flanking nuclear DNA sequence. However, this catalog of numts based on the 1.9× coverage of the cat genome provides insight into the evolutionary dynamics of numt integration and genome evolution in general. Moreover, it provides the framework to recognize sources of numt contamination in genetic surveys in the domestic cat and possibly within most species of the Felidae family. Indeed, some numts can be retraced over 40 million years of evolution (Schmitz et al. 2005). In more closely related species, such as man and chimpanzee (which diverged less than 6.3 million years ago; Patterson et al. 2006), the proportion of shared numts (orthologous numts found in both genomes at identical loci) can be quite high (>80%) (Hazkani-Covo and Graur 2007), thus demonstrating the potential utility of the domestic cat numt catalog for studies across the 37 species of the Felidae family, which originated less than 10.8 million years ago (Johnson et al. 2006).
Acknowledgments/Funding
This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under contract N01-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. This research was supported in part by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
References
Author notes
This paper was delivered at the 3rd International Conference on the Advances in Canine and Feline Genomics, School of Veterinary Medicine, University of California, Davis, CA, August 3–5, 2006.
Corresponding Editor: Urs Giger