Main

The amniote lineage divided into the ancestral lineages of mammals and reptiles 320 million years ago. Today, the surviving members of those lineages are mammals, comprising 4,500 species, and reptiles, containing 17,000 species. Within the reptiles, the two major clades diverged 280 million years ago: the lepidosaurs, which contain lizards (including snakes) and the tuatara; and the archosaurs, containing crocodilians and birds (the position of turtles remains unclear)6. For simplicity, we will refer here to lepidosaurs as lizards (Fig. 1).

Figure 1: Amniote phylogeny based on protein synonymous sites showing major features of amniote evolution.

Major characteristics of lizard evolution including homogenization of GC content, high sex chromosome turnover and high levels of repeat insertion are featured. Sex chromosome inventions are indicated in red. Branch length is proportional to dS (the synonymous substitution rate); dS of each branch is indicated above the line.

The study of the major genomic events that accompanied the transition to a fully terrestrial life cycle has been assisted by the sequencing of several mammal (K.L.-T. et al., manuscript submitted) and three bird genomes2,3,4. The genome of the lizard A. carolinensis thus fills an important gap in the coverage of amniotes, splitting the long branch between mammals and birds and allowing more robust evolutionary analysis of amniote genomes.

For instance, almost all reptilian genomes contain microchromosomes, but these have only been studied at a sequence level in birds2,7, raising the question as to whether the avian microchromosomes’ peculiar sequence features are universal across reptilian microchromosomes8. Another example is the study of sex chromosome evolution. Nearly all placental and marsupial mammals share homologous sex chromosomes (XY)9 and all birds share ZW sex chromosomes. However, lizards exhibit either genetic or temperature-dependent sex determination10. Characterization of lizard sex chromosomes would allow the study of previously unknown sex chromosomes and comparison of independent sex chromosome systems in closely related species.

Anolis lizards comprise a diverse clade of 400 described species distributed throughout the Neotropics. These lizards have radiated, often convergently, into a variety of ecological niches with attendant morphological adaptations, providing one of the best examples of adaptive radiation. In particular, their diversification into multiple replicate niches on diverse Caribbean islands via interspecific competition and natural selection has been documented in detail11. A. carolinensis is the only anole native to the USA and can be found from Florida and Texas up to North Carolina. We chose this species for genome sequencing because it is widely used as a reptile model for experimental ecology, behaviour, physiology, endocrinology, epizootics and, increasingly, genomics.

The green anole genome was sequenced and assembled (AnoCar 2.0) using DNA from a female A. carolinensis lizard (Supplementary Tables 1–4). Fluorescence in situ hybridization (FISH) of 405 bacterial artificial chromosome (BAC) clones (from a male) allowed the assembly scaffolds to be anchored to chromosomes (Supplementary Table 5 and Supplementary Fig. 1). The A. carolinensis genome has been reported to have a karyotype of n = 18 chromosomes, comprising six pairs of large macrochromosomes and 12 pairs of small microchromosomes12. The draft genome sequence is 1.78 Gb in size (see Supplementary Table 3 for assembly statistics) and represents an intermediate between genome assemblies of birds (0.9–1.3 Gb) and mammals (2.0–3.6 Gb).

We find that few chromosomal rearrangements occurred in the 280 million years since anole and chicken diverged, as had been hinted at by previous comparisons using Xenopus and chicken13. There are 259 syntenic blocks (defined as consecutive syntenic anchors that are consistent in order, orientation and spacing, at a resolution of 1 Mb) between lizard and chicken (Supplementary Table 6 and Supplementary Fig. 2). Interestingly, 19 out of 22 anchored chicken chromosomes are each syntenic to a single A. carolinensis chromosome over their entire lengths (Fig. 2a); by contrast, only 6 (of 23) human chromosomes are syntenic to a single opossum chromosome over their entire lengths, even though the species diverged only 148 million years ago14. Segmental duplications follow trends seen in other amniote genomes (Supplementary Note, Supplementary Table 7 and Supplementary Fig. 3).
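The block definition above (consecutive anchors consistent in order, orientation and spacing) can be illustrated with a minimal sketch. This is not the pipeline used for the analysis: the `Anchor` record, the function names and the reading of "a resolution of 1 Mb" as a maximum gap between neighbouring anchors are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    """Hypothetical record for one syntenic anchor between two genomes."""
    chrom_a: str  # chromosome in genome A (for example, lizard)
    pos_a: int    # position in genome A, in bp
    chrom_b: str  # chromosome in genome B (for example, chicken)
    pos_b: int    # position in genome B, in bp
    strand: int   # +1 if same orientation, -1 if inverted


def _extends(prev, cur, max_gap):
    """True if `cur` continues the block that currently ends at `prev`."""
    same_context = (prev.chrom_a, prev.chrom_b, prev.strand) == (
        cur.chrom_a, cur.chrom_b, cur.strand)
    if not same_context:
        return False
    # Order is preserved if genome B moves forward on '+' and backward on '-'.
    step_b = (cur.pos_b - prev.pos_b) * cur.strand
    if step_b <= 0:
        return False
    # Spacing: assumed here to mean both gaps stay within max_gap.
    return (cur.pos_a - prev.pos_a) <= max_gap and step_b <= max_gap


def syntenic_blocks(anchors, max_gap=1_000_000):
    """Group anchors into runs that are consistent in order, orientation and spacing."""
    blocks = []
    for anchor in sorted(anchors, key=lambda a: (a.chrom_a, a.pos_a)):
        if blocks and _extends(blocks[-1][-1], anchor, max_gap):
            blocks[-1].append(anchor)
        else:
            blocks.append([anchor])
    return blocks
```

Counting the returned lists gives the number of blocks under these assumptions; a chromosome in genome B that appears in blocks on only one chromosome of genome A would correspond to the "syntenic over its entire length" case described above.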

Figure 2: A. carolinensis–chicken synteny map reveals synteny of reptile microchromosomes but dissimilar GC and repeat content.

a, Very few rearrangements have occurred in the 280 million years since A. carolinensis and chicken diverged. A. carolinensis microchromosomes are exclusively syntenic to chicken microchromosomes. Horizontal coloured bars depict the six A. carolinensis macrochromosomes (1–6) and the six (of 12) A. carolinensis microchromosomes that have sequence anchored to them that is syntenic to the chicken genome (7, 8, 9, X, LGg, LGh). Chromosomes that could be ordered by size were assigned a number; the smaller microchromosomes that could not be distinguished by size were assigned a lowercase letter. Each colour corresponds to a different chicken chromosome as indicated in the key. Any part of an A. carolinensis chromosome that is syntenic to a chicken microchromosome is indicated by ‘m’. b, Chicken microchromosomes have both higher GC content and lower repeat content than chicken macrochromosomes, whereas A. carolinensis chromosomes do not vary in GC or repeat content by chromosome size. Large circles designate the GC percentage of each chromosome in the chicken and lizard genomes with greater than 100 kb of sequence anchored to it. Small circles designate the percentage of the genome made up of repetitive sequence of each chromosome in the chicken (blue circles) and lizard (red circles) genomes.

Approximately 30% of the A. carolinensis genome is composed of mobile elements, which comprise a much wider variety of active repeat families than is seen for either bird2 or mammalian15 genomes. The most active classes are long interspersed (LINE) elements (27%) and short interspersed (SINE) elements (16%)16 (Supplementary Table 8). The majority of LINE repeats belong to five groups (L1, L2, CR1, RTE and R4) and seem to be recent insertions based on their sequence similarity (divergence ranges from 0.00 to 0.76%; ref. 17). This contrasts with observations of mammalian genomes, where only a single family of LINEs, L1, has predominated over tens of millions of years. The DNA transposons comprise at least 68 families belonging to five superfamilies: hAT, Chapaev, Maverick, Tc/Mariner and Helitron18. As with retrotransposons, the majority of DNA transposon families seem to be relatively young, in contrast to the extremely few recently active DNA transposons found in other amniote genomes (Supplementary Table 9). Overall, A. carolinensis mobile elements feature significantly higher GC content (43.5%, P < 10^−20) than the genome-wide average of 40.3%. In addition to mobile elements, A. carolinensis exhibits a high density (3.5%) of tandem repeats, with length and frequency distributions similar to those of human microsatellite DNA15. We now know that amniote genomes come in at least three types: mammalian genomes are enriched for L1 elements and have a high degree of mobile element accumulation; bird genomes are repeat poor with very little mobile element activity; and the lizard genome contains an extremely wide diversity of active mobile element families but has a low rate of accumulation, which is reminiscent of the mobile element profile of teleostean fishes19.

Most reptile genomes contain microchromosomes, but the numbers vary among species; the A. carolinensis genome contains 12 pairs of microchromosomes12, whereas the chicken genome contains 28 pairs. Bird microchromosomes have very distinctive properties compared to bird macrochromosomes, such as higher GC and lower repeat contents2, whereas lizard microchromosomes do not exhibit these features (Fig. 2b). Remarkably, all sequence anchored to microchromosomes in A. carolinensis also aligns to microchromosomes in the chicken genome, and all but one A. carolinensis microchromosome is syntenic to only a single corresponding chicken microchromosome (Fig. 2a). Microchromosomes conserved between A. carolinensis and chicken thus could have arisen in the reptile ancestor, whereas the remaining chicken microchromosomes could be derived in the bird lineage. Alternatively, the remaining chicken microchromosomes could have been present in the reptile ancestor but fused to form macrochromosomes in the lizard lineage.

The A. carolinensis genome has surprisingly little regional variation of GC content, substantially less than previously observed for birds and mammals; it is the only amniote genome known whose nucleotide composition is as homogeneous as the frog genome5 (Supplementary Figs 4 and 5). Figure 3 illustrates how local GC content is evolutionarily conserved between human chromosome 14 and chicken chromosome 5, but to a much lesser degree with A. carolinensis chromosome 1. As all sequenced amniote genomes other than A. carolinensis contain these homologous varying levels of GC content (‘isochores’)20, the ancestral amniote GC heterogeneity is likely to have eroded towards homogeneity in this lizard’s lineage. It has been proposed that isochores with high GC content are a consequence of higher rates of GC-biased gene conversion in regions of higher recombination2. The greater GC homogeneity in the anole genome may thus reflect more uniform recombination rates, or else a substantially reduced bias towards GC during the resolution of gene conversion events in the A. carolinensis lineage (for a discussion, see ref. 5).
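Regional GC variation of the kind discussed above can be summarized by computing GC percentage in fixed windows along a sequence and then measuring the spread of those values. The following is a minimal sketch, not the method used for the analysis: the function names are hypothetical, the 20-kb default follows the window size quoted for Figure 3, and the use of the population standard deviation as a heterogeneity measure is an assumption.

```python
from statistics import pstdev


def gc_percent_windows(seq, window=20_000):
    """GC percentage in consecutive non-overlapping windows.

    A trailing partial window is dropped, and windows containing no
    A/C/G/T bases (for example, runs of N) are skipped.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        called = sum(chunk.count(base) for base in "ACGT")
        if called == 0:
            continue
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / called)
    return values


def gc_heterogeneity(seq, window=20_000):
    """Spread of windowed GC values (assumed measure: population standard deviation)."""
    values = gc_percent_windows(seq, window)
    return pstdev(values) if len(values) > 1 else 0.0
```

Under this measure, a genome with isochore-like structure would give a larger value than a compositionally homogeneous one when both are scanned with the same window size.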

Figure 3: The A. carolinensis genome lacks isochores.

The A. carolinensis genome shows only very local variation in GC content, unlike the human and chicken genomes, which also show larger trends in GC variation, sometimes called isochores. Syntenic regions of human chromosome 14, chicken chromosome 5 and A. carolinensis chromosome 1 are shown. The human and chicken regions are inverted and rearranged to align with the A. carolinensis region. Blue lines depict GC percentage in 20-kb windows. The purple line designates the genome average. Green lines represent examples of syntenic anchors between the three genomes.

Both temperature-dependent sex determination and XY genetic sex determination have been found in Iguania10. Within the genus Anolis, there are species with heteromorphic XY chromosomes (including those with multiple X and Y chromosomes), and others with entirely homomorphic chromosomes12. A. carolinensis is known to have genetic sex determination21, but the form of its sex chromosomes (ZW or XY) has thus far been unknown owing to a lack of obviously heteromorphic chromosomes.

In-depth examination of male and female cells using FISH allowed us to identify the microchromosome previously designated as ‘b’ as the A. carolinensis X chromosome; it is present in two copies in females and one in males. This chromosome is syntenic to chicken microchromosome 15. Eleven BACs assigned to two scaffolds, 154 (3.3 Mb) and chrUn0090 (1.8 Mb), hybridize via FISH to the p arms of the two X chromosomes in females, and hybridize to the p arm of the single X chromosome in males (Fig. 4 and Supplementary Fig. 1). A. carolinensis thereby shows a pattern representative of a male heterogametic system of genotypic sex determination. We have not identified the Y chromosome, but we hypothesize that A. carolinensis possesses both X and Y chromosomes, as both male and female cells contain the same number of chromosomes.

Figure 4: The A. carolinensis genome contains a newly discovered X chromosome.

a, b, The X chromosome, a microchromosome, is found in one copy in male A. carolinensis (a) and in two copies in females (b). The BAC 206M13 (CHORI-318 BAC library) is hybridized to the p arm of the X chromosome using FISH in both male and female metaphase spreads. 206M13 and ten other BACs showed this sex-specific pattern in cells derived from five male and five female individuals. Original magnification, ×1,000.

The 5.1 Mb of sequence assigned to the X chromosome contains 62 protein-coding genes (Supplementary Table 10); Gene Ontology (GO) terms associated with these genes show no significant enrichment. It is very likely that there is more X chromosome sequence that is currently labelled as unanchored scaffolds in the AnoCar 2.0 assembly. Identification of the A. carolinensis sex determination gene will require considerable functional biology, but we note that the chicken sex determination gene DMRT1 is located on A. carolinensis chromosome 2 and that SOX3 (the X chromosome paralogue of the therian mammal sex determination gene SRY) is located on an unanchored A. carolinensis scaffold; these genes are thus unlikely to be the A. carolinensis sex determination gene.

All ten A. carolinensis individuals (originating from South Carolina and Tennessee) used for FISH mapping showed large pericentromeric inversions in one or more of chromosomes 1–4, with no correlation between different chromosomal inversions or with the sex of the lizard (see Supplementary Note, Supplementary Table 11 and Supplementary Fig. 6).

A total of 17,472 protein-coding genes and 2,924 RNA genes were predicted from the A. carolinensis genome assembly (Ensembl release 56, September 2009). We built a phylogeny for all A. carolinensis genes and their homologues in eight other vertebrate species (human, mouse, dog, opossum, platypus, chicken, zebra finch and pufferfish), allowing us to identify a conservative set of 3,994 one-to-one orthologues, that is, genes that have not been duplicated or deleted in any of these vertebrates since their last common ancestor. These gene phylogenies were also used to identify genes that arose by duplication in the lizard lineage after the split with the avian lineage and, separately, those that were lost in the mammalian lineage after the mammal–reptile split (Fig. 1, Supplementary Note, Supplementary Fig. 7 and Supplementary Table 12).
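The idea of a one-to-one orthologue set can be illustrated with a simple copy-number filter. This sketch is a simplification and an assumption: the analysis described above used gene phylogenies, whereas the code below only keeps gene families that have exactly one member in every species considered. The function name, the input layout and the gene identifiers in the usage example are hypothetical.

```python
from collections import Counter


def one_to_one_families(families, species):
    """Keep families with exactly one gene in each of the given species.

    `families` maps a family identifier to a list of (species, gene) pairs.
    Returns a dict mapping each retained family to {species: gene}.
    """
    required = set(species)
    kept = {}
    for family_id, members in families.items():
        counts = Counter(sp for sp, _gene in members)
        # Reject families missing a species (loss) or with extra copies (duplication).
        if set(counts) == required and all(n == 1 for n in counts.values()):
            kept[family_id] = dict(members)
    return kept
```

For example, with species `["lizard", "chicken", "human"]`, a family listing one gene for each species is retained, whereas a family with two lizard copies or no human copy is dropped.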

We found 11 A. carolinensis opsin genes that have no mammalian orthologues (but have orthologues in invertebrates, fishes and frog), and thus seem to have been lost during mammalian evolution (Supplementary Table 13). The large repertoire of opsins may contribute to the excellent colour vision of anoles—including the ability to see in the ultraviolet range—and also may contribute to their hyperdiversity by allowing the evolution of diverse, species-specific colouration of the dewlap, which has an important role in sexual selection and species recognition11. Similarly, olfactory receptor and β-keratin genes are highly duplicated in A. carolinensis (Supplementary Note and Supplementary Fig. 9).

Many reptiles, including green anoles, differ from placental mammals in being oviparous (laying eggs). Vivipary in placental mammals is a derived state, reflected in their loss of some egg-related genes. We used mass spectrometry to identify proteins present in the immature A. carolinensis egg, as most egg proteins are produced in the mother’s body and then transported into the immature egg. We found that in contrast with mammals, reptiles have lineage-specific gene duplications, including in vitellogenins (VTGs), apovitellenin-1, ovomucin-α and three homologues of ovocalyxin-36, a chicken eggshell matrix protein.

Our results show rapid evolution of egg protein genes among amniotes. Specifically, we found proteins from 276 A. carolinensis genes in immature A. carolinensis eggs (Supplementary Tables 14 and 15), of which only 50 have been confirmed to be present in chicken eggs by mass spectrometry22,23. These genes include VTGs, a lysozyme, vitelline membrane outer layer protein 1 (VMO1) paralogues, protease inhibitors, natterin and nothepsin. By aligning genes that are one-to-one orthologues in A. carolinensis and chicken, we found that egg proteins evolve significantly more rapidly than non-egg proteins (mean dN/dS values (ratio of the rate of non-synonymous substitutions to the rate of synonymous substitutions) of 0.186 and 0.135, respectively; P = 1.2 × 10^−5), which reflects reduced purifying selection and/or more frequent episodes of adaptive evolution.
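A comparison of mean dN/dS between two gene sets, as in the paragraph above, can be sketched with a permutation test on the difference in means. The statistical test actually used is not stated here, so the choice of a permutation test, the function name and the example values in the usage note are assumptions for illustration only.

```python
import random
from statistics import mean


def permutation_test_mean_diff(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, estimated P value). Group labels are
    shuffled n_perm times; the P value is the fraction of shuffles whose
    absolute difference is at least as large as the observed one, with
    the usual +1 correction so that it is never exactly zero.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_test_mean_diff(egg_values, non_egg_values)` with two lists of per-gene dN/dS values (hypothetical variable names) returns the difference in means and an estimated P value; the smallest P value this approach can report is 1/(n_perm + 1).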

Using multiple vertebrate genome sequences, we identified three VMO1 paralogues (which we name α, β and γ) that we infer to have been present in the last common ancestor of all reptiles and mammals. Whereas at least one of VMO1-α, VMO1-β and VMO1-γ has been lost in all other amniote genomes, the A. carolinensis genome contains representatives of all three paralogues. Moreover, the A. carolinensis-specific VMO1-α family has grown to 13 members and has experienced positive selection of amino acid substitutions within a negatively charged, probably substrate-binding cavity; changes that, presumably, modify its lysozyme-like transferase activity (Supplementary Note, Supplementary Fig. 8 and Supplementary Tables 16 and 17).

The extensive and active repeat repertoire of A. carolinensis has allowed us to discover the origin of several mammalian conserved elements. Through the process of exaptation (a major change in function of a sequence during evolution), certain mobile elements that were active in the amniote ancestor have become conserved, and presumably functional, in mammals, while remaining active mobile elements in A. carolinensis. The origin of these conserved mammalian sequences in mobile elements was not recognizable without comparison to a distant and repeat-rich genome sequence24. We identified 96 such exapted elements (see Supplementary Table 18) in the human genome tracing back to mobile elements present in the amniote ancestor that are still present in A. carolinensis, particularly the CR1, L2 and gypsy families.

Although most exapted elements are non-coding and probably serve a regulatory function, we also identified a protein-coding exon that was exapted from an L2-like LINE, now constituting exon 2 in a mammal-specific N-terminal region of the MIER1 (mesoderm induction early response 1) protein. This exon is highly conserved across 29 mammals and therefore probably represents a mammalian innovation since the amniote ancestor.

GO terms associated with the transcription start site closest to each exapted element in the human genome show enrichment for neurodevelopmental genes (see Methods), with “ephrin receptor binding”, “nervous system development” and “synaptic transmission” being strongly enriched (all P values < 5 × 10^−3). These enrichments are consistent with adaptive changes in neurodevelopment occurring during the emergence of mammals.

Anolis lizards are a textbook case of adaptive radiation, having diversified independently on each island in the Greater Antilles and throughout the Neotropics, producing a wide variety of ecologically and morphologically differentiated species, with as many as 15 found at a single locality11. Although anoles are widely used as a model system for phylogenetic comparative studies, it has been difficult to determine the evolutionary relationships among major anole clades owing to rapid evolutionary radiations associated with access to new dimensions of ecological opportunity. Successfully resolving the relatively short branching events associated with such a radiation requires a wealth of data from loci evolving at an appropriate rate.

We used the genome sequence of A. carolinensis to develop a new phylogenomic data set comprising 20 kb of sequence data sampled from across the genomes of 93 species of anoles (Supplementary Tables 19 and 20). Analyses of this data set infer a well-supported phylogeny that reinforces and clarifies the adaptive and biogeographic history of anoles (Fig. 5, details in Supplementary Fig. 10). First, our phylogenomic analysis reaffirms previous molecular and morphological studies indicating that similar anole habitat specialists have evolved independently on each of the four large Greater Antillean islands. Second, our analyses suggest a complex biogeographic scenario involving a limited number of dispersal events between islands and extensive in situ diversification within islands. The closest relatives of Anolis occur on the mainland and the phylogeny confirms the existence of two colonizations, one into the southern Lesser Antilles and the second producing the diverse adaptive radiations throughout the rest of the Caribbean. Within this latter clade, anoles initially diversified primarily on the two larger Greater Antillean islands (although Puerto Rico also seems to have been involved) before subsequently undergoing secondary radiations on all of the islands and eventually returning to the mainland, where this back-colonization has produced an extensive evolutionary radiation. The phylogeny also indicates that very few inter-island dispersal events occurred in Greater Antillean evolution. Rather, the Greater Antillean faunas, renowned for the extent to which the same ecomorphs are found on each island, are primarily the result of convergent evolution25.

Figure 5: A phylogeny of 93 Anolis species clarifies the biogeographic history of anoles.

Anolis ecomorphs derive from convergent evolution and not from frequent inter-island migration. Using conserved primer pairs distributed across the genome of A. carolinensis, we obtain sequences from 46 genomically diverse loci evolving at a range of evolutionary rates and representing both protein-coding and non-coding regions. Maximum likelihood analyses of this new data set of 20 kb of aligned nucleotides infer nearly all previously established anole relationships while also partially resolving the basal relationships that have plagued previous studies. Open circles indicate bootstrap (bs) values < 70; grey-shaded circles, 70 < bs < 95; filled circles, bs > 95.

The genome sequence of A. carolinensis allows a deeper understanding of amniote evolution. Filling this important reptilian node with a sequenced genome has revealed derived states in each major amniote branch and has helped to illuminate the amniote ancestor. However, the tree of sequenced reptilian genomes is still extremely sparse, and the sequencing of additional non-avian reptiles would be necessary to fully understand how typical A. carolinensis and the sequenced bird genomes are of the entire reptile clade.

In addition to the utility of the A. carolinensis genome sequence as a representative of non-avian reptiles, Anolis species are a unique resource for the study of adaptive radiation and convergent evolution. With their invasions of and subsequent radiations on Caribbean islands, anoles provide a terrestrial analogue to stickleback and cichlid fish, which underwent adaptive evolution in separate aquatic environments. Just as genomic research in sticklebacks has deepened the study of aquatic ecological speciation, a large-scale genomic phylogenetic survey of the Caribbean anoles would be an opportunity for detailed study of adaptive evolution in a land animal26; in particular because anole genomes contain large numbers of active mobile elements that we speculate could form substrates for exaptation of novel regulatory elements.

Methods Summary

A full description of methods, including sample collection, sequencing, assembly, anchoring, mass spectrometry and all sequence analysis, can be found in Supplementary Information. All animal experiments were approved by the MIT Committee for Animal Care.