Introduction

One of the most striking aspects of humans is our large brains. Human brains are about three times larger than those of chimpanzees, bonobos, and gorillas, which are our closest living relatives and the species with the next largest brains among extant primates1. Both absolute and relative brain mass generally increased throughout primate evolution, with independent increases occurring in all major clades2,3. Explorations of the genetic underpinnings of primate brain expansions can shed light on the mechanisms by which they have occurred. While some studies have focused on identifying ‘human-specific’ genomic alterations, often by comparing humans to the other apes or to chimpanzees and rhesus macaques only (e.g., segmental duplications of SRGAP2C and ARGHAP11B4; human-accelerated regulatory enhancer (HARE5) of FZD85), broader phylogenetic analyses have the potential to inform us about the extent to which certain genomic or neurodevelopmental processes are conserved across large-brained primate species. Previous work has typically focused on genes that regulate the extent or duration of progenitor cell production or death during neurogenesis6,7,8,9, and this approach has identified multiple genes and pathways that seem to have played a role in primate brain size evolution. For example, NIN, a gene involved in regulating neurogenic divisions of radial glial cells, was subject to positive selection during anthropoid primate evolution, and selection on this gene, as measured by the ratio of the rates of nonsynonymous to synonymous substitutions since the last common ancestor of the species in the data set (i.e., root-to-tip dN/dS), predicts brain size across anthropoid species10. Similarly, a recent genome-wide comparative analysis identified numerous conserved gene-brain associations (i.e., positive correlations between selection pressure and absolute or relative brain size) across three independent episodes of primate brain size expansion11. Additional studies have demonstrated associations between genes and the expansion of specific brain regions. For example, selection pressure on multiple genes involved in cerebellar development predicts cerebellum size across anthropoid species12.

Furthermore, selection on genes associated with human disorders that produce abnormal brain sizes has been proposed to play a major role in primate brain evolution13,14,15. For example, genes known to be involved in human primary autosomal recessive microcephaly have received considerable attention in the literature. These include ASPM, MCPH1, CDK5RAP2, and CENPJ, all of which are expressed in the fetal brain during neurogenesis16,17,18,19,20,21. These genes are involved in apoptosis22 and centrosome and microtubule formation23,24,25,26, the latter of which may regulate neural progenitor proliferation since spindle orientation influences cell division16 and cell cycle progression27. Phenotypic correlation studies in humans suggest that these genes affect total brain size and cortical surface area, but not cortical thickness, which is likely to reflect their involvement in regulating the size of the neural progenitor pool28,29,30. Previous work has demonstrated that these genes have been subject to positive selection across primates31,32,33,34, and that selection pressure on these genes is associated with both brain mass increases33 and decreases35 across primates (in addition to eutherian mammals more generally36). There is also evidence that these genes were involved in the evolution of the large human brain, as they may have been under selection in humans during the last 40 kya37,38,39 and human-specific SNPs and methylation patterns in these genes have been identified40. Selection on genes associated with microcephaly may also have contributed to differences in sexual dimorphism in brain size across species41. Consistent with this, microcephaly-related gene mutations may have sex-specific effects on human cranial volume30 via estrogen-mediated regulation42.

The many findings linking selection on microcephaly-associated genes to inter-specific variation in brain size is a validation of this candidate gene approach to investigating primate brain evolution. It also suggests that the investigation of genes associated with additional human disorders linked to abnormal brain sizes may also provide insight into the mechanisms underlying primate brain size evolution. Here, we examine 16 genes that have been linked to megalencephaly in humans, defined as an oversized brain that exceeds the age-related mean by two or more standard deviations43. Recent work suggests that children with both megalencephaly and autism exhibit increased cortical surface area, but not cortical thickness, relative to those who have autism only (i.e., without megalencephaly) or are developing along a typical trajectory44. This suggests that, similar to microcephaly-related genes, megalencephaly-related genes may affect the size of the neural progenitor pool. Dysregulation of two functionally related cellular pathways, the Ras/mitogen-activated protein kinase (RAS-MAPK) pathway and the phosphatidyl-inositiol 3-kinase/protein kinase B/mammalian target of rapamysin (PI3K-AKT-mTOR) pathway, account for the majority of megalencephaly syndromes. Both pathways are associated with cellular functions that are critical for proper brain development, including cellular proliferation, differentiation, cell cycle regulation, and survival/apoptosis. This creates the potential for these genes, like those associated with microcephaly conditions, to be involved in the genetic mechanisms underlying inter-specific variation in primate brain size. To assess the evidence for this, we examined genes in both pathways, including PTEN, PIK3CA, AKT1, AKT3, STRADA, TBC1D7, CCND2, and MTOR in the PI3K-AKT-mTOR pathway, and SPRED1 and RIN2 in the RAS-MAPK pathway43,45. We also examined additional genes that have been associated with megalencephaly syndromes, including KIF7 and OFD1 (involved in centrosome and microtubule assembly43), BRWD3 (involved in the Janus kinase/signal transducers and activators of transcription signaling (JAK-STAT) pathway, activation of which stimulates cell proliferation, differentiation, migration, and apoptosis45,46), GPC3 (involved in Notch signaling45), EXT2 (involved in neuron elongation47); and HEPACAM (involved in cell adhesion/morphogenesis43). All of these genes are expressed in the fetal brain during neurogenesis (brainspan.org).

We investigated the molecular evolution of these genes in relation to brain size across primates, with two aims: Aim 1: To determine whether megalencephaly-associated genes evolved under positive selection across primates. Given that several microcephaly-related genes were subject to positive selection throughout primate evolution, we predicted that one or more megalencephaly-related genes would similarly be under positive selection (Prediction 1). Aim 2: To determine whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size. Given that these genes are involved in the proliferation and survival of neurons, that absolute brain size increases linearly with the total number of neurons48, and that brain and body size may be genetically and developmentally decoupled2,49, we predicted that selection pressure on one or more megalencephaly-related genes would be correlated with absolute (Prediction 2), but not relative (Prediction 3), brain size across primates. These predictions are in line with previous work on genes associated with microcephaly33.

Results

Aim 1: To determine whether megalencephaly-associated genes evolved under positive selection across primates

To detect positive selection across primates, we employed site models in PAML50, which allow dN/dS ratios to vary across sites but not across lineages51. We then compared the likelihoods of two types of models, including: (1) a “nearly neutral” model which allows sites to fall into two categories, representing purifying selection and neutral evolution; and (2) a “positive selection” model which allows sites to fall into three categories, including purifying selection, neutral evolution, and positive selection52. We did not find evidence for positive selection for most (15/16) of the genes analyzed here. In each of these cases, the likelihood of the model that allowed sites to evolve by positive selection was not significantly higher than the likelihood of the model that allowed sites to evolve by purifying or neutral selection only (Table 1). We did find evidence for positive selection for one gene, OFD1 (likelihood ratio = 35.405, p = 2e−8, p-adj = 3.2e−7), with 1.8% of sites estimated to have dN/dS values greater than 1 (= 6.224). This supports our Prediction 1.

Table 1 Results for Aim 1: Testing whether megalencephaly-associated genes evolved under positive selection across primates.

Aim 2: To determine whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size

To examine relationships between selection pressure (i.e., root-to-tip dN/dS) and either absolute or relative brain size, we employed free-ratios branch models in PAML50, which allow dN/dS ratios to vary across branches but not across sites. We calculated root-to-tip dN/dS values for each species by the addition of dN values and dS values from the root to the terminal species branch and taking the ratio of the sums. These values were then used as predictors in phylogenetic generalized least squares (PGLS) regression models of brain size. We found no evidence for significant associations between selection pressure (i.e., root-to-tip dN/dS values) and absolute brain size for any of the genes analyzed (p-adj > 0.05; Table 2). When Callithrix jacchus and Microcebus murinus were excluded (i.e., species that have experienced secondary evolutionary decreases in brain size3; see “Methods”), we detected a positive relationship between selection pressure and absolute brain size for GPC3 and TBC1D7; however, these associations were not significant after FDR correction (GPC3: estimate = 1.637, p = 0.017, p-adj = 0.119) and TBC1D7 (estimate = 3.463, p = 0.011, p-adj = 0.119) (Fig. 1; Supplementary Table S1). The relationship for GPC3 may be linked to increased dN specifically (multiple regression: dN estimate = 1.883, p = 0.012, p-adj = 0.168) (Fig. 1; Supplementary Table S1). These results are bolstered by sensitivity analyses (in which we removed one species at a time and re-ran the regression models). Specifically, the association between root-to-tip dN/dS and absolute brain size was only significant for TBC1D7 when Callithrix was excluded (p = 0.003) and for GPC3 when Callithrix (p = 0.034) or Mandrillus (p = 0.042) were excluded. Overall, these results provide weak support for Prediction 2, as they do not survive FDR correction and may be sensitive to species sampling. Body size significantly predicted brain size in all models of relative brain size, and we did not find evidence for any significant associations between selection pressure and relative brain size in any analysis (p > 0.05; Table 2, Supplementary Table S1), consistent with our Prediction 3. We could not test any of these relationships for two genes in our analysis, AKT3 and PTEN, since most species had root-to-tip dN/dS values of zero for these genes (Supplementary Tables S2, S3).

Table 2 Results for Aim 2: Testing whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size.
Figure 1
figure 1

Two candidate genes exhibit associations between selection pressure and brain size in a subset of analyses. We present PGLS regression models of absolute brain size (log) ~ root-to-tip dN/dS (log) for GPC3 (left) and TBC1D7 (right). Each species is represented by a data point. The regression lines from models including all species are solid and those from models excluding Callithrix jacchus and Microcebus murinus are dashed. Callithrix jacchus and Microcebus murinus are represented by triangles. Regressions are significant (nominal p < 0.05) only when Callithrix jacchus and Microcebus murinus are excluded (see “Methods”, Table 2, Supplementary Table S1).

Discussion

In the present study, we investigated the evolutionary histories of genes associated with human megalencephaly to examine their potential roles in primate brain size evolution. We tested whether any of the candidate genes that we selected evolved under positive selection across the primate phylogeny, and identified one gene with this pattern, namely OFD1; however, we did not detect a relationship between selection pressure on this gene and brain size across species. This suggests that selection for changes to other (i.e., non-brain size) phenotypes facilitated evolutionary changes to this gene. Although we did not identify any significant associations between selection pressure and brain size for any of our candidate genes, we did find positive relationships for GPC3 and TBC1D7 when phyletic dwarfs were excluded; however, these findings should be interpreted with caution since: (i) they did not survive FDR correction; (ii) the association for GPC3 may be sensitive to species sampling; and (iii) increasing dN/dS values may reflect positive or relaxed selection. Accordingly, our results represent equivocal evidence that some megalencephaly-associated genes may have been involved in primate brain size evolution.

Although we found that one of our candidate genes, OFD1, evolved under positive selection across primates, selection pressure on this gene does not predict brain size. Accordingly, it is likely that selection for changes to other, non-brain size phenotypes influenced by this gene drove evolutionary changes to its coding regions. While certain mutations to this gene cause syndromes that include megalencephaly and central nervous system deficits, reflecting its role in brain development, OFD1 is also involved in the growth and development of other parts of the body, including limb bud patterning and bone development53. Accordingly, selection for changes to limb morphology or proportions, which vary greatly across primates, could account for our findings. Interestingly, previous work identified signatures of positive selection on OFD1 for certain mammalian branches, including the internal branch leading to the eutherians and the terminal branches leading to opossum, horse and tree shrew54. Although this study did not detect positive selection on any primate branches, this may reflect that only four primate species were included in their analysis.

We did identify two instances in which selection pressure predicted absolute brain size across species, namely for GPC3 and TBC1D7, when Callithrix jacchus and Microcebus murinus were excluded, but the association did not survive FDR correction. While a trend of increasing brain size is typical of most primate lineages, previous work has suggested that both callitrichids and cheirogaleids underwent secondary reductions in brain size as a consequence of selection for decreased body size (phyletic dwarfism3), and that some microcephaly genes actually facilitated this decrease in callitrichids35. In line with this, studies of microcephaly genes suggest that callitrichids are outliers in primate-wide analyses33, and that, within this clade, there is a negative relationship between root-to-tip dN/dS and brain size35. Together with these findings, our results suggest that megalencephaly-associated genes may similarly be involved in both evolutionary increases and decreases in brain size. However, root-to-tip dN/dS values were less than 1, so we must acknowledge that increasing dN/dS values in larger-brained lineages may reflect either increasing positive selection or relaxed selection.

For GPC3, the detected relationship between selection pressure and absolute brain size appears to be driven by an acceleration of dN specifically (Supplementary Table S1), although these results may be sensitive to species sampling (see “Results”). Previous work suggests that certain alterations to this gene lead to Simpson–Golabi–Behmel syndrome, an overgrowth syndrome that is often associated with megalencephaly. Specifically, this gene regulates embryonic growth by inhibiting the Hedgehog signaling pathway through competition for the hedgehog receptor55,56. These pathways may directly affect brain growth since hedgehog signaling increases the proliferation of neocortical precursors57. Additionally, neuronal differentiation is associated with upregulation of GPC3, suggesting the extracellular components encoded by this gene also help regulate the distribution and activity of extracellular signaling molecules during neuronal development58.

For TBC1D7, the relationship between selection pressure and brain size may be driven by an acceleration of dN and/or a deceleration of dS (neither coefficient estimate is significant in the multiple regression model; Supplementary Table S2). Although this result is not as straightforward as that for GPC3, it does not necessarily represent evidence against selection. Specifically, both dN and dS depend on local mutation rates, so when dS decreases (thereby changing dN/dS), we would expect dN to also decrease in the absence of selection. In this case, maintaining an elevated dN relative to a decreasing dS may reflect selection on TPC1D7 that is relevant to brain size evolution. This gene negatively regulates cell growth via suppression of mTOR (mammalian target of rapamycin) signaling59, and mTOR complexes regulate many functions critical to brain development, including proliferation, differentiation and migration60. Interestingly, a recent comparison of human organoid, chimpanzee organoid, and macaque primary cells suggested that human radial glia exhibit relatively greater increased mTOR activation61.

After FDR correction, we did not detect an association between selection pressure and brain size in any analysis. These findings were unexpected given the established links between primate brain size and genes associated with microcephaly, another disorder that produces abnormal brain sizes33. While parameter uncertainty may be due to low species sample sizes (and in some analyses, the inclusion of phyletic dwarfs), lack of significant associations may also reflect that genes associated with megalencephaly are within intermediary signaling pathways that have highly pleiotropic functions. In line with this, most syndromes that cause megalencephaly, including those linked to the genes examined here, also cause body overgrowth, cancers, and epilepsy as part of generalized overgrowth syndromes45. These pleiotropic, and sometimes also deleterious, effects of mutations to megalencephaly-related genes are likely to constrain their evolvability due to widespread, multivariate stabilizing selection62. For example, certain alterations to these genes could shift some traits (e.g., brain size) closer to their optima while simultaneously shifting other traits (e.g., body size) away from their optima, thereby hampering the ability of these genes to respond to selection for increased brain size. This is in contrast to microcephaly, which is not usually associated with profound somatic growth abnormalities63. Although the specific mechanisms that lead this phenotype to be brain-specific are not fully understood, new work suggests it may reflect differences between tissues in the expression of certain inhibitory splicing proteins when certain mutations create binding sites for these proteins within regions containing specific microcephaly genes64.

A subset of the results presented here partially overlap with prior work from Boddy et al.11, who took a different approach—performing a genome-wide analysis of thousands of orthologous genes across three independent episodes of brain size increase in primates. Many of the genes examined here were not included in the prior study (i.e., PTEN, PIK3CA, AKT3, MTOR, RIN2, EZH2, MED12, OFD1, BRWD3, GPC3) due to their justifiably conservative filtering approach. However, some genes did overlap, and neither study found significant associations between brain size and selection pressure for AKT1, STRADA, CCND2, or SPRED1. For KIF7, Boddy and colleagues11 found that the dN/dS ratio was higher in the Colobus versus Papio lineages, even though the latter exhibit larger brains, in line with our finding of a (non-significant) negative association between selection pressure on this gene and brain size. Finally, Boddy and colleagues11 found that selection pressure on TBC1D7 predicts a measure of relative brain size (EQ), while we found a weak association with absolute brain size; however, our analyses differ greatly with regard to species sampling (6 species versus 23 species).

The work presented here enhances our understanding of the shared genetic basis underlying changes in brain size across primates. While numerous studies have examined genes associated with human microcephaly, this is the first study, to our knowledge, to focus explicitly on genes associated with megalencephaly. While we focused on how changes to the structure of the proteins encoded by these genes may have played a role in primate brain size evolution, future studies of comparative variation in the regulation and expression levels of these genes may provide further insights. Interestingly, previous work suggests that the expression levels of multiple genes analyzed here: (1) are differentially expressed between human and rhesus macaque brains during the prenatal and early postnatal periods65; (2) are differentially expressed between human and chimpanzee cerebral organoids66; and (3) exhibit delayed patterns of change during human brain development relative to macaque brain development67. In addition, the expression level of PTEN appears to be correlated with brain size across primate species (N = 18), and there is evidence for selection in the proximal regulatory region of this gene68. Given that different genes in our analyses showed signatures of positive selection or equivocal gene-phenotype associations, this study highlights the importance of including phenotypic data when attempting to identify genes involved in brain size evolution. Furthermore, although the majority of neurological differences between species may be due to differences in gene expression, this work, along with studies of microcephaly-related genes, suggest that coding sequence changes also play an important role.

Methods

Data collection

For each candidate gene, we retrieved coding sequences from primate species that had whole genome sequences available on NCBI (maximum N = 23 species): Callithrix jacchusSaimiri boliviensisAotus nancymaaeCebus capucinus imitatorCercocebus atysMandrillus leucophaeusPapio anubisMacaca mulattaMacaca nemestrinaChlorocebus sabaeusNasalis larvatusRhinopithecus roxellanaRhinopithecus bietiColobus angolensisGorilla gorilla gorillaPan paniscusPan troglodytesHomo sapiens, Pongo abeliiNomascus leucogenysTarsius syrichtaMicrocebus murinus, and Otolemur garnettii (GenBank accession numbers in Supplementary Tables S4, S5). Annotations of coding sequences were confirmed via BLAST searches against each species’ genome assembly, using the Homo sapiens exon sequences as queries. Exons missing from automated annotations were completed via these BLAST searches, where possible. We confirmed the existence of or filled gaps in the coding regions of the candidate genes by conducting BLAST searches against the Short Read Archive (SRA). Exon sequences were aligned in Geneious69 using MUSCLE and default settings (alignments are available in Supplementary Data). Sequences that aligned poorly were removed from alignments and excluded from downstream analyses. Missing alignment data were excluded from PAML analyses (cleandata = 1)50. Accordingly, to preserve robust sampling across each gene sequence, species with < 70% gene coverage were excluded, and species sample sizes varied slightly across genes (AKT1: N = 20; AKT3: N = 23; BRWD3: N = 21; CCND2: N = 23; EXT2: N = 23; GPC3: N = 18; HEPACAM: N = 22; KIF7: N = 23; MTOR: N = 23; OFD1: N = 19; PIK3CA: N = 20; PTEN: N = 23; RIN2: N = 23; SPRED1: N = 23; STRADA: N = 23; TBC1D7: N = 23). Brain and body size data were collected for these species from published literature sources3,70,71,72 (see Supplementary Table S6).

Statistical analyses

A common measure used to infer selection pressures acting on coding regions of genes is the ratio of the rates of nonsynonymous to synonymous base changes (dN/dS). Nonsynonymous changes refer to those that result in an amino acid change, while synonymous mutations refer to those that do not cause a change in the amino acid sequence.

Aim 1: To determine whether megalencephaly-associated genes evolved under positive selection across primates

To detect positive selection across primates, we employed site models in PAML (version 4.8)50, which allow dN/dS ratios to vary across sites but not across lineages51. We compared two types of models, including: (1) a “nearly neutral” model (denoted as ‘M1a’ in PAML) which allows sites to fall into two categories, representing purifying selection (dN/dS < 1) and neutral evolution (dN/dS = 1); and (2) a “positive selection” model (denoted as ‘M2a’ in PAML) which allows sites to fall into three categories, including purifying selection (dN/dS < 1), neutral evolution (dN/dS = 1), and positive selection (dN/dS > 1)52. These nested models were compared using the likelihood ratio test statistic − 2[loglikelihood(M1a) − loglikelihood(M2a)], with degrees of freedom equal to the difference in the number of parameters estimated by each model. When model M2a exhibited a significantly higher likelihood value than model M1a, this was taken as evidence for positive selection. Small negative likelihood ratio values were assumed to be estimates of zero73.

Aim 2: To determine whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size

To examine relationships between selection pressure (i.e., root-to-tip dN/dS) and either absolute or relative brain size, we followed other studies74,75,76,77,78 in employing free-ratios branch models in PAML (version 4.8)50, which allow dN/dS ratios to vary across branches but not across sites. We then calculated root-to-tip dN/dS values for each species by the addition of dN values and dS values from the root to the terminal species branch and taking the ratio of the sums. These values were set as species data and used as predictors in regression models of brain size. Although some previous studies that used this approach incorporated the dN/dS values of terminal branches79,80, the root-to-tip dN/dS is a more appropriate measure because it is more inclusive of the evolutionary history of a locus. In addition, some studies have used lineage-specific branch models (i.e., two-branch models), in which one model is run for each lineage with dN/dS constrained to one value over all branches leading to that lineage, to obtain root-to-tip dN/dS values for each species11,33. The approach used here (and in other studies74,75,76,77,78) may produce root-to-tip dN/dS values that are more comparable across species, since species with shared internal branches will have the same values for those branches included in their root-to-tip dN/dS calculations; however, we acknowledge that the free ratio model implemented here is relatively more parameter rich than the two-branch model, which may impact parameter estimation. In any case, the results of both methods are expected to be similar since they both rely on reconstructed nucleotide sequences and neither is able to address the uncertainty surrounding rate estimates. For models of absolute brain size, we modelled brain size as a function of root-to-tip dN/dS. We modeled relative brain size by also including body size as a predictor. Additionally, we performed multiple regressions for both relative and absolute brain size models in order to examine relationships with dN and dS as independent variables. Prior to analysis, all variables were log-transformed to ensure that residuals were normally distributed. We report nominal and FDR adjusted p-values81.

Species do not represent independent data points due to their shared evolutionary history, such that closely related species are more likely to exhibit similar phenotypes to each other than are distantly related species. To control for shared evolutionary history, we used phylogenetic generalized least squares (PGLS) regression models, incorporating the topologies and branch lengths from the GenBank taxonomy consensus tree provided on the 10kTrees website82. For each model, we allowed the phylogenetic scaling factor (λ) to take the value of its maximum likelihood. In some of the analyses, maximum-likelihood estimations of λ were equal to zero, and the log-likelihood plots of lambda were very flat (i.e., all values of lambda had very similar likelihood values, suggesting uncertainty regarding the best value of lambda), which may reflect low species sample sizes. Accordingly, these models were run using an average value of lambda, weighted according to likelihood (although results did not change when lambda values of 0 were used for these models). We also repeated these analyses with lambda = 0 and 1.

Previous work has suggested that callitrichids and cheirogaleids underwent recent evolutionary decreases in brain size3 and that some microcephaly genes actually facilitated this decrease in callitrichids, altering the expected relationship between root-to-tip dN/dS and brain size in these species35. Accordingly, we repeated our regression analyses excluding both Callithrix jacchus and Microcebus murinus. To test the sensitivity of these results, we re-ran relevant models removing one species at a time.