The transcriptome of the bowhead whale Balaena mysticetus reveals adaptations of the longest-lived mammal

Mammals vary dramatically in lifespan, by at least two orders of magnitude, but the molecular basis for this difference remains largely unknown. The bowhead whale Balaena mysticetus is the longest-lived mammal known, with an estimated maximal lifespan in excess of two hundred years. It is also one of the two largest animals and the most cold-adapted baleen whale species. Here, we report the first genome-wide gene expression analyses of the bowhead whale, based on the de novo assembly of its transcriptome. Bowhead whale or cetacean-specific changes in gene expression were identified in the liver, kidney and heart, and complemented with analyses of positively selected genes. Changes associated with altered insulin signaling and other gene expression patterns could help explain the remarkable longevity of bowhead whales as well as their adaptation to a lipid-rich diet. The data also reveal parallels in candidate longevity adaptations of the bowhead whale, naked mole rat and Brandt's bat. The bowhead whale transcriptome is a valuable resource for the study of this remarkable animal, including the evolution of longevity and its important correlates such as resistance to cancer and other diseases.

As a first step in identifying such patterns, we present the liver, kidney and heart transcriptomes of the bowhead whale. Comparison of the bowhead whale transcriptome with that of the related minke whale and other mammals enabled us to identify candidate genes for the exceptional longevity of the bowhead whale as well as molecular adaptations to a lipid-rich diet. Analyses of gene sequences and expression patterns also informed on various aspects of the biology and evolution of bowhead whales.

RESULTS AND DISCUSSION
We sequenced the liver, kidney and heart transcriptomes of bowhead whales by Illumina RNA-seq technology. The tissue samples (Fig. 1 and Supplemental Table 1) were sourced from native Iñupiaq Eskimo subsistence harvests in Barrow, Alaska. Using the de novo assembler Trinity [12,13], approximately 659 million paired-end short reads from the bowhead whale were assembled into a single transcriptome per tissue (Supplemental Table 2). Using the Ensembl mouse gene dataset [14] as reference, we identified 9,395 protein-coding genes with one-to-one orthology between diverse mammalian taxa, including whales, a bat, rodents, a tree shrew and a primate (Fig. 2A).
Phylogenetic analysis of the gene expression dataset revealed that the bowhead and minke whales cluster within one evolutionary branch, with a sister group comprising the cow and yak. The bowhead whale and minke whale diverged approximately 13.3 million years ago (Mya) (Fig. 2B). The estimated molecular divergence time of bowhead and minke whales is slightly less than the estimates from comprehensive morphological and molecular phylogenetic analyses of cetacean phylogeny, examining both extant and fossil lineages in simultaneous analyses [15,16]. It is thus likely that the common ancestor of the bowhead and minke whales lived approximately 13-27 million years ago.
To reveal candidate genes important in the bowhead's longevity, we compared the bowhead whale transcriptome with that of the minke whale (maximum lifespan of 50 years in the wild [17]), cow, yak, Brandt's bat, Chinese tree shrew, naked mole rat, rhesus macaque, mouse and rat. We also identified unique amino acid changes and rapidly evolving genes in the bowhead whale that may contribute to slow aging in this species.

Liver
The liver performs several functions vital to whole-body homeostasis. The integrity of liver function is thus important to longevity, and this organ has gained considerable attention in aging research. A total of 45 genes were differentially expressed in the bowhead whale liver compared to other mammals (Fig. 3, and Tables 1 and 2). In particular, the bowhead whale showed greatly reduced expression of growth factor receptor-bound protein 14 (Grb14) (Fig. 4A). Grb14 is normally highly expressed in the liver and kidney of rodents and humans [18]. In contrast to the liver, Grb14 showed moderate expression in the kidney of the bowhead whale (data not shown). GRB14 binds the receptors for insulin (IR) and IGF1 (IGF1R) to modulate downstream signaling [19][20][21][22]. Insulin itself decreases endogenous glucose production (gluconeogenesis) and promotes lipogenesis. Knockout of Grb14 in mouse reduces the amount of circulating insulin and enhances insulin responsiveness in the liver [21]. Similar effects were observed in primary murine hepatocytes, where RNAi knockdown of Grb14 augmented insulin-activated signaling, with an associated decrease in the expression of genes involved in gluconeogenesis [23]. During fasting and calorie restriction, hepatic gluconeogenesis is required for tissues that lack the enzymes needed to metabolize free fatty acids (e.g. the brain) [24]. Elevated expression of Cited2 was observed in the bowhead whale (Fig. 4B).
CITED2 is upregulated during fasting, increases the activity of transcriptional co-activator PGC-1α, and directly induces hepatic gluconeogenesis [25]. Interestingly, a recent study found that elevated PGC-1α activity retards aging in Drosophila [26], and it has been hypothesized to also influence longevity in mammals [27]. These data suggest that gluconeogenesis is sustained in the bowhead whale despite reduced expression of Grb14. Surprisingly, knockdown of Grb14 in mouse hepatocytes also potently inhibits maturation of sterol regulatory element binding transcription factor 1 (Srebf1, often denoted as SREBP1) and decreases lipogenesis, rendering lipogenic genes unresponsive to insulin [23]. In agreement, we observed decreased expression of insulin-induced gene 1 (Insig1, a downstream target of SREBP1 [23,28,29]) in the bowhead whale liver (Fig. 4C), indicative of reduced activity along the GRB14/SREBP1 axis.
For each gene, gene counts were normalized across all replicates. We used an absolute value of log2 Ratio≥2, a Benjamini-Hochberg corrected P-value≤0.05, and a B-value of at least 2.945 (representing a 95% probability that a gene is differentially expressed) as the threshold to judge the significance of gene expression difference between the bowhead whale and other mammals. A negative fold change denotes a higher gene expression compared to the other mammals examined, and vice versa.
Calorie restriction and modulation of the insulin/IGF1 signaling pathway are key modulators of aging and have potent effects on lipid homeostasis [30]. Lipid metabolism, therefore, plays a central role in lifespan regulation in metazoan eukaryotes. While carbohydrates are a major contributor to the metabolism of terrestrial mammals, marine mammals subsist on a high-energy, lipid-rich diet [31]. Notably, the bowhead whale diet consists of zooplankton with very high lipid content [32,33].
Toothed whales, such as the common bottlenose dolphin, which has a maximum lifespan of ~52 years in captivity [34], can exhibit pathology that parallels type 2 diabetes (T2D) and metabolic syndrome in humans. This includes a high prevalence of insulin resistance, fatty liver disease, and chronic overproduction of very-low-density lipoprotein (VLDL) by the liver (dyslipidemia) [35][36][37]. Our data suggest that reduced Grb14 expression in the bowhead whale improves energy homeostasis in an animal that depends on an exceptionally lipid-rich diet. This beneficial adaptation may serve to protect the long-lived bowhead whale against chronic dietary diseases.
Table 3. Genes differentially expressed in the bowhead and minke whale heart compared to other mammals. For each gene, gene counts were normalized across all replicates. We used an absolute value of log2 Ratio≥2, a Benjamini-Hochberg corrected P-value≤0.05, and a B-value of at least 2.945 (representing a 95% probability that a gene is differentially expressed) as the threshold to judge the significance of gene expression difference between whales and other mammals. A negative fold change denotes a higher gene expression compared to the other mammals examined, and vice versa.
It has been proposed that the longevity of large mammals, such as the bowhead whale and elephant, stems from a lack of non-human predators, slow development and concomitant tight control of nutrient sensing pathways, in particular the insulin/TOR signaling axis [38]. Small African mole rats and many bats live in protected environments and are long-lived relative to animals of similar size, but may in contrast have evolved genetic changes that allow them to slow down aging after a comparatively rapid development [38]. Sequence and expression changes of genes involved in hepatic metabolism have been reported for Brandt's bat [39] and naked mole rat [40], two terrestrial species in our dataset that are long-lived compared to related species. Interestingly, in agreement with the previous studies that compared a smaller set of species, we found that Foxo1 (forkhead box O1) and Fto (fat mass and obesity associated) were differentially expressed in the liver of the Brandt's bat and naked mole rat, respectively (Supplemental Tables 3-6). The transcription factor FOXO1, like PGC-1α, is a classical mediator of gluconeogenesis in species ranging from worms to mammals, and Foxo1 and its upstream regulator Creb1 [24] are expressed at a higher level in the Brandt's bat liver (Supplemental Table 3). Elevated hepatic Foxo1 mRNA expression in the Brandt's bat is consistent with the increased longevity seen in calorie-restricted and growth hormone receptor knockout mice [41], the worm C. elegans [42] and the fruit fly Drosophila melanogaster [43]. This result strongly suggests that elevated hepatic Foxo1 expression contributes to the longevity of the Brandt's bat. The naked mole rat has a highly divergent insulin peptide that may be compensated by autocrine/paracrine IGF2 in the adult liver and displays physiological changes consistent with an altered insulin/IGF1 axis [40,44]. In the naked mole rat, fat mass and obesity associated mRNA (encoded by Fto) is expressed at a lower level compared to the other species examined (Supplemental Table 5).
Fto regulates glucose metabolism in the liver and its reduced expression protects against obesity and metabolic syndrome, and associated pathology such as insulin resistance [45][46][47]. Taken together, we propose that fundamental longevity-promoting mechanisms in the long-lived bowhead whale, Brandt's bat and naked mole rat stem from overlapping but distinct expression and sequence changes in metabolism genes.
Living under water also entails a unique set of challenges. Compared with the other mammals, the bowhead liver had lower expression of the apoptosis-associated p53 target gene Perp (TP53 Apoptosis Effector) (Fig. 4D). Under hypoxic stress, p53 is activated to trigger cellular apoptosis as a protective response, but this may not be appropriate for mammals with only intermittent access to oxygen. It has been shown in mouse kidney that Perp knockdown protected cells from hypoxia-induced apoptosis [48]. Low levels of Perp may therefore increase stress tolerance in the bowhead whale liver. In addition, we observed increased expression of genes that enhance vasculature maintenance (EPH receptor A2 (Epha2) [49]), regulate vascular tone and proliferation (endothelial nitric oxide synthase (eNOS) (Nos3) [50,51]) and promote DNA repair (replication protein A2 (Rpa2) [52,53]) (Fig. 3, Tables 1 and 2), which may also represent adaptations to limited oxygen availability and water pressure in the aquatic environment.

Cardiovascular system
Other extraordinarily long-lived species, such as the naked mole rat, possess biological adaptations endowing resilience to aging-related diseases, including cardiovascular disease and cancer [54,55]. The availability of a single heart sample precluded differential gene expression analysis of the bowhead whale alone.
Using transcriptome data from a bowhead whale heart and a minke whale heart [8], we compared these marine mammals to the terrestrial ones and identified four differentially expressed genes in whales (Table 3). Of these, argininosuccinate lyase (Asl) is particularly interesting. The expression of Asl, which is essential for NO production in the heart [56], was 4.4-fold higher in the whales (Table 3). Reduction of nitrite to nitric oxide (NO) provides cytoprotection of tissues during hypoxic events [57] and serves to preserve cardiac function in the diving and hibernating red-eared slider turtle (Trachemys scripta elegans) [58]. Cardiovascular regulation is critical during diving in all marine mammal species, and evidence suggests that hypoxia-sensitive tissues such as the brain and heart have adapted to low oxygen conditions while diving [59]. We speculate that elevated expression of Asl improves cardiac metabolism and function of diving cetaceans.
Amino-acid residues that are uniquely altered in a lineage can reveal clues to functional adaptations [39,60]. In addition to differential gene expression patterns, we identified a unique amino acid change with potential relevance to vascular aging in the bowhead whale (Table 4). An examination of 68 vertebrate genomes, including 6 cetaceans (Supplemental Table 7), revealed a unique amino acid change in bowhead whale c-fos induced growth factor (Figf, coding for vascular endothelial growth factor D, VEGFD) (Fig. 5). VEGFD plays a role in the maintenance of vascular homeostasis [61,62]. The radical amino acid change, neutral polar residues replaced by a charged residue (SerThrGln174Arg), in bowhead whale VEGFD flanks a large hydrophobic surface that interacts with the cognate receptor VEGFR-2 [63]. Although this finding must be corroborated with experimental data, we speculate that the arginine substitution may serve to improve the interaction of VEGFD with its cognate receptor, potentially aiding maintenance of vascular health in bowhead whales.

Kidney
Aging-induced changes in the kidney include reduced repair and/or regeneration of cells, which is concomitant with decreases in glomerular filtration rate and blood flow [64,65]. Comparison of the bowhead whale kidney transcriptome with those of other mammals revealed that 53 genes were differentially expressed in the bowhead whale kidney (Tables 5 and 6). Among these was a battery of DNA repair associated genes. For example, SprT-like N-terminal domain (Sprtn) [66]; non-SMC condensin I complex, subunit D2 (Ncapd2) [67]; ligase I, DNA, ATP-dependent (Lig1) [68]; and Rpa2 [52,53], which was also elevated in the liver, were expressed at higher levels in the bowhead whale kidney (Table 5). The mitochondrial stress inhibitor Junb [69], as well as genes associated with tumor suppression, was also expressed at higher levels in the bowhead whale kidney. These latter genes included E4F transcription factor 1 (E4f1), which directly interacts with the tumor suppressors p53, RASSF1A, pRB and p14ARF [70]; ADP-ribosylation factor-like 2 (Arl2) [71]; protein kinase C, delta binding protein (Prkcdbp; also known as hSRBC) [72,73]; and Egr1 [74]. We speculate that this series of genes protects against age-related kidney tissue decline in the bowhead whale.
Figure 5. VEGFD residues at the VEGFR-2 receptor-ligand interface are highlighted in yellow, highly conserved cysteine residues that stabilize VEGFD in orange, and a radical amino acid change found in the bowhead whale, but not in 67 other species, is shown in red. (B) Modeled VEGFD structure. The location of the unique arginine residue in the bowhead whale protein is shown.
Expanding on the theme of improved genome integrity and repair, we found a radical amino acid change SerLeuThr202Cys in the HEAT2 domain of bowhead whale MMS19 nucleotide excision repair homolog (encoded by Mms19) (Table 4). This highly conserved protein plays an essential role in genome stability by facilitating iron-sulfur (FeS) cluster insertion into proteins involved in methionine biosynthesis, DNA replication and repair and telomere maintenance [75,76]. Reactive cysteines in assembly complex proteins, such as MMS19, assist in the transfer of FeS clusters to target proteins [77]. The amino acid change in bowhead whale MMS19 may confer systemic robustness to damage and/or longevity.
Table 5. Genes differentially expressed in the kidney of bowhead whales compared to other mammals.
For each gene, gene counts were normalized across all replicates. We used an absolute value of log2 Ratio≥2, a Benjamini-Hochberg corrected P-value≤0.05, and a B-value of at least 2.945 (representing a 95% probability that a gene is differentially expressed) as the threshold to judge the significance of gene expression difference between the bowhead whale and other mammals. A negative fold change denotes a higher gene expression compared to the other mammals examined, and vice versa.

Rapid gene evolution analysis
Identification of positively selected genes can reveal insights into unique adaptations. Of the 9,395 1:1 orthologs from the species shown in Fig. 2B, eight genes showed evidence of positive selection in the bowhead whale lineage (Table 7). Similar to comparisons between human and other primates [78], the small number of positively selected genes in the Arctic bowhead whale may reflect a reduced efficacy of natural selection. Indeed, the bowhead whale has a long generation time (~50 years) and a small population size compared to the common minke whale and the other mammals examined here [79]. Unique amino acid changes were recently identified in cetacean haptoglobin (encoded by Hp) [8], a protein that protects against hemoglobin-driven oxidative stress [80]. Hp is under positive selection in the bowhead whale (Table 7), suggesting that this gene is rapidly evolving in the long-lived bowhead. Four positively selected genes in the bowhead whale lineage have roles in cancer: mitochondrial tumor suppressor 1 (Mtus1) [81]; glycogen synthase kinase 3 alpha (Gsk3a) and prune exopolyphosphatase (Prune), which act in concert to regulate cell migration [82]; and cytoplasmic FMR1 interacting protein 1 (Cyfip1) [83].

Concluding comments
Here, we report the transcriptome of the bowhead whale, the longest-lived mammal known. We compared the gene expression of the bowhead whale to shorter-lived mammals, including the minke whale, which we estimate to have diverged from a common ancestor approximately 13-27 million years ago (Mya). This is similar to the great apes (gorillas, chimpanzees, orangutans and humans), which shared a common ancestor ~13 Mya [84], or human and rhesus macaques, which diverged ~25 Mya [85]. Among higher primates, humans are exceptionally long-lived [86]. Similarly, the bowhead whale has increased maximum lifespan compared to related cetaceans. It has been proposed that the difference in longevity between humans and other primates stems from differential expression of a small number of genes [87]. A recent study comparing humans to eight other mammals, including primates, revealed that 93 liver and 253 kidney genes showed evidence of human lineage-specific expression changes [88]. The number of genes differentially expressed in the bowhead whale liver (45 genes) and kidney (53 genes) compared to other mammals is similar, albeit obtained using a different computational method. We speculate that genes that are differentially expressed, carry unique coding sequence changes, or are rapidly evolving in the bowhead whale represent candidate longevity-promoting genes. We particularly stress the findings suggestive of altered insulin signaling and adaptation to a lipid-rich diet. The availability of a single heart tissue sample from the bowhead whale precluded identification of distinct gene expression patterns in the long-lived bowhead whale, but revealed that argininosuccinate lyase (Asl) may protect the heart of cetaceans during hypoxic events such as diving.
The bowhead whale transcriptome provides a valuable resource for further research, including genome annotation and analysis, the evolution of longevity, adaptation to an aquatic environment, conservation efforts and as a reference for closely related species.

METHODS
De novo transcriptome assembly. We obtained and performed RNA-seq on independent biological replicates of bowhead whales (3 liver, 4 kidney and 1 heart samples) (Supplemental Table 1). In addition, publicly available Illumina HiSeq 2000 RNA-seq data were obtained from the following species: the common minke whale Balaenoptera acutorostrata (1 liver, 1 kidney, 1 heart) [8], cow Bos taurus (3 livers, 3 kidneys, 3 hearts) [89], (domestic) yak Bos grunniens (1 liver, 1 heart) [90], Brandt's bat Myotis brandtii (2 livers, 2 kidneys, all from summer active bats) [39], naked mole rat Heterocephalus glaber (2 livers, 2 kidneys) [91], Chinese tree shrew Tupaia chinensis (1 liver, 1 kidney, 1 heart) [92], mouse Mus musculus (3 livers, 3 kidneys, 2 hearts) [93], rat Rattus norvegicus (2 livers, 3 kidneys, 2 hearts) [89] and rhesus macaque Macaca mulatta (3 livers, 3 kidneys, 2 hearts) [89].
Prior to assembly, all reads (in FASTQ format) were filtered for adapter contamination, ambiguous residues (N's) and low quality regions using nesoni v0.123 (http://www.vicbioinformatics.com/software.nesoni.shtml) with default settings. Trimmed read quality was visualized using FastQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc). For each species, trimmed reads from a tissue were pooled and assembled using Trinity v20140717 [12,13] with default parameters. Assembly statistics can be found in Supplemental Table 2.
Ortholog annotation. To identify orthologous protein-coding transcripts we employed a custom pipeline written in the R statistical computing language [93]. First, we extracted the longest open reading frame for each gene in Mouse Ensembl gene data set GRCm38.p2 (hereafter termed "Reference ORF") and confirmed they were all in frame, had start and stop codons, and did not have internal stop codons. For genes with multiple alternatively spliced transcripts, the longest transcript was kept. Putative orthologs between the mouse and Trinity assembled transcriptomes (each Trinity transcriptome contains a set of Trinity Transcripts) were identified by a bidirectional best-hit method using discontiguous MegaBLAST. MegaBLAST is bundled in v2.2.29 of the BLAST+ suite and optimized to find long cross-species alignments of highly similar sequences [94]. An "ortholog pair" was declared if a Reference ORF and a Trinity Transcript were mutual best hits, and an "ortholog set" was declared if such ortholog pairs could be identified in all species. We further refined the sequences in each ortholog set as follows: start and stop codons must be present in at least 80% of the sequences in the set; and sequences must be within ±50% of the median length of the sequences in the set.
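The bidirectional best-hit step and the two refinement filters described above can be illustrated with a minimal Python sketch. This is not the authors' R pipeline: the input format (tuples of query, subject and bit score), the function names and the reading of the two filters as set-level checks are all assumptions made for illustration.

```python
from statistics import median

STOP_CODONS = {"TAA", "TAG", "TGA"}


def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples, for example
    parsed from tabular BLAST output (hypothetical input format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(ref_vs_tx, tx_vs_ref):
    """Return {reference ORF: transcript} pairs that are mutual best hits."""
    forward = best_hits(ref_vs_tx)
    reverse = best_hits(tx_vs_ref)
    return {ref: tx for ref, tx in forward.items() if reverse.get(tx) == ref}


def has_start_and_stop(seq):
    """True if a coding sequence begins with ATG and ends with a stop codon."""
    seq = seq.upper()
    return seq.startswith("ATG") and seq[-3:] in STOP_CODONS


def passes_set_filters(seqs, min_complete=0.8, length_tol=0.5):
    """Apply the two refinement filters to one ortholog set.

    Assumed interpretation: at least 80% of sequences carry start and stop
    codons, and every sequence lies within +/-50% of the median length.
    """
    complete = sum(has_start_and_stop(s) for s in seqs) / len(seqs)
    mid = median(len(s) for s in seqs)
    lengths_ok = all(abs(len(s) - mid) <= length_tol * mid for s in seqs)
    return complete >= min_complete and lengths_ok
```

An ortholog set in the sense used above would then be a reference ORF for which `reciprocal_best_hits` returns a partner in every species and for which `passes_set_filters` holds.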
Analysis of differential gene expression. For each species, trimmed RNA-seq reads (in FASTQ format) for each biological replicate were aligned to ortholog sets using TopHat2 [95] and read counting was performed using featureCounts [96]. Raw counts were normalized by Trimmed Mean of M-values (TMM) correction [97] using the R package edgeR [98]. Library size normalized read counts were next subjected to the voom function (variance modeling at the observation level) in the limma R package v3.18.13 [99], with trend=TRUE for the eBayes function and correction for multiple testing (Benjamini-Hochberg false discovery rate of 0.05). Following limma analysis, strict parameters were set to denote genes as differentially expressed between the bowhead whale and other mammals: significant genes required an absolute log2 fold-change of at least 2.0, a Benjamini-Hochberg-adjusted P-value less than or equal to 0.05, and a B-value of at least 2.945, representing a 95% probability that each gene was differentially expressed.
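The link between the B-value cut-off and the stated 95% probability follows from the B-statistic being a natural-log odds, so that p = 1 / (1 + e^(-B)) and B = ln(0.95/0.05) ≈ 2.944. The Python sketch below illustrates that conversion together with the three-part threshold and a plain Benjamini-Hochberg adjustment. It is an illustrative stand-in, not the limma code used in the study, and the function names are hypothetical.

```python
import math


def b_to_probability(b):
    """Convert a log-odds B-statistic into a posterior probability."""
    return 1.0 / (1.0 + math.exp(-b))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differentially_expressed(log2_fc, adj_p, b,
                                min_abs_lfc=2.0, max_adj_p=0.05, min_b=2.945):
    """Apply the three thresholds quoted in the text to one gene."""
    return abs(log2_fc) >= min_abs_lfc and adj_p <= max_adj_p and b >= min_b
```

For example, `b_to_probability(2.945)` evaluates to roughly 0.95, which is where the 95% figure in the threshold comes from.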
Phylogenetic analysis. Orthologs identified from ten de novo liver transcriptome assemblies were joined into one 'super gene' for each species. Briefly, nucleotide sequences were aligned with ClustalO v1.2.1 [100] and trimmed using Gblocks v0.91b, preserving codon information [101,102]. A maximum likelihood (ML) phylogenetic tree was constructed in RAxML-8.0.0 [103] using all codon positions and, separately, the first and second codon positions. The best likelihood trees were searched under the GTR-GAMMA model with six categories of rate variation (500 bootstrap replicates were undertaken for estimation of node support). Bayesian molecular dating was adopted to estimate species divergence time using MCMCTree implemented in PAML (v4.4b) [104]. For each Bayesian analysis, 2,000,000 generations of MCMC (Markov chain Monte Carlo) analysis of phylogenetic trees were performed, with the first 2,000 generations discarded as burn-in. The remaining trees were sampled every 100 generations to build consensus trees. Calibration times (50-60 Mya) of divergence between Cetacea (includes the bowhead whale and minke whale) and Artiodactyla (includes the cow and yak) were obtained from fossil records [2,3].
Identification of proteins with unique amino acid changes. To obtain single best orthologs to human RefSeq proteins, as in the UCSC human 100 species multiple alignment track [105], Trinity FASTA files of the bowhead whale kidney (n=4) and liver (n=3) were concatenated and putative open reading frames (ORFs) identified using the Perl script TransDecoder bundled with Trinity [13]. Next, to associate predicted bowhead whale peptide sequences with human RefSeq IDs, tBLASTn v2.2.29+ of the BLAST+ suite [94] with an E-value cut-off set at 1e-5 was employed. The best match (>50% overall amino acid sequence identity along the entire sequence and spanning >75% of the length of the query sequence) was used to annotate the sequences. Protein sequences from two toothed whales, the bottlenose dolphin (Tursiops truncatus) and the killer whale (Orcinus orca), were available in the UCSC multiple alignment track. The genomes of an additional baleen whale, the minke whale (Balaenoptera acutorostrata) [8] (GenBank Assembly GCF_000493695.1), and two toothed whales, the sperm whale (Physeter catodon) (NCBI Assembly GCF_000472045.1) and the Yangtze River dolphin (Lipotes vexillifer) [11] (GenBank Assembly GCF_000442215.1), were queried using human UCSC multiway coding sequences and gmap v2014-07-28 (a genomic mapping and alignment program for mRNA and EST sequences) [106] with the parameters --cross-species --align --direction=sense_force -Y. The gmap output was parsed using custom Perl scripts. Bowhead whale proteins were aligned to orthologs from 67 vertebrate species (Supplemental Table 7) using ClustalO v1.2.1 [100].
In-house Perl scripts were used to parse the ClustalO output and identify unique amino acids. A Perl script scanned orthologous proteins for sites where types/groups of residues are unique to the bowhead whale. The script groups residues into four groups: acidic (ED), basic (KHR), cysteine (C) and "other" (STYNQGAVLIFPMW). For example, it identifies cases where the bowhead whale harbors E or D, whilst the other organisms contain exclusively basic, cysteine, or "other" residues at that particular site. The false positive rate of the detection of unique amino acids is approximately 1.33 per analysis; or in other words, one false positive per 315 unique amino acid residue candidates [39]. To validate the results pertaining to specific genes, we manually inspected multiple sequence alignments and interrogated raw bowhead whale transcriptome data for selected genes. We appreciate that, in the absence of bowhead whale genome data, unique amino acid inferences are putative when supported by a limited number of RNA-sequencing reads.
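The residue-group scan described above can be sketched in a few lines of Python. This is an illustration of the grouping logic only, not the in-house Perl script: the function name is hypothetical, and the decision to skip alignment columns containing gaps or ambiguous residues is an assumption that the text does not specify.

```python
# Residue groups as listed in the text.
GROUPS = {
    "acidic": "ED",
    "basic": "KHR",
    "cysteine": "C",
    "other": "STYNQGAVLIFPMW",
}
RESIDUE_TO_GROUP = {aa: name for name, aas in GROUPS.items() for aa in aas}


def unique_group_sites(target, others):
    """Return 0-based alignment columns where the target residue group is unique.

    `target` is the aligned sequence of the species of interest and `others`
    is a list of aligned sequences of equal length from the other species.
    Columns with a gap or unrecognized residue in any sequence are skipped
    (an assumption made for this sketch).
    """
    sites = []
    for col, residue in enumerate(target):
        target_group = RESIDUE_TO_GROUP.get(residue)
        if target_group is None:
            continue
        other_groups = [RESIDUE_TO_GROUP.get(seq[col]) for seq in others]
        if None in other_groups:
            continue
        if target_group not in other_groups:
            sites.append(col)
    return sites
```

With a toy alignment in which the target carries arginine where all other species carry serine, threonine or glutamine, the function reports that column, mirroring the kind of polar-to-charged change reported for VEGFD.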
Identification of genes under positive selection in the bowhead whale. The orthologous gene set (9,395 protein-coding genes) identified in our R pipeline was aligned by ClustalO [100]. We next employed Gblocks [101,102] to minimize the impact of multiple sequence alignment errors and divergent regions. We used the program CodeML in the PAML v4.6 package to perform the optimized branch-site test [104,107], as described previously [39]. Briefly, we compared PAML model A1 (where codons evolve neutrally and under purifying selection) with model A (where codons on the branch of interest can be under positive selection). Following PAML analysis, likelihood ratio test (LRT) P-values were computed assuming that the null distribution was a chi-squared distribution with 1 degree of freedom at a false discovery rate of 0.05. We applied a sequential Bonferroni correction to account for the multiple comparisons made in these analyses. Manual inspection of positive selection data is currently recommended in the literature [108,109], and we manually examined all significant alignments.
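The likelihood ratio test and the sequential Bonferroni step can be sketched as follows. The sketch assumes the test statistic is twice the difference in log-likelihood between the alternative and null models, uses the closed form of the chi-squared survival function for 1 degree of freedom, and reads "sequential Bonferroni" as Holm's step-down procedure. The function names and the example log-likelihoods are hypothetical, and this is not the authors' code.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test against chi-squared with 1 d.f.

    For 1 degree of freedom the chi-squared survival function reduces to
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(stat / 2.0))


def holm_reject(pvalues, alpha=0.05):
    """Holm's sequential Bonferroni: return a reject flag per hypothesis."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for k, i in enumerate(order):
        # The k-th smallest P-value is compared with alpha / (m - k).
        if pvalues[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break  # all larger P-values are also retained
    return reject


if __name__ == "__main__":
    # Hypothetical log-likelihoods for three genes: (null model, alternative model).
    fits = [(-5230.4, -5221.1), (-812.9, -812.3), (-1499.0, -1496.2)]
    pvals = [lrt_pvalue(null, alt) for null, alt in fits]
    print(pvals, holm_reject(pvals))
```

Holm's procedure is uniformly at least as powerful as a plain Bonferroni correction while still controlling the family-wise error rate, which is one reason a sequential variant is often preferred when thousands of genes are tested.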