CMIP and ATP2C2 Modulate Phonological Short-Term Memory in Language Impairment

Specific language impairment (SLI) is a common developmental disorder characterized by difficulties in language acquisition despite otherwise normal development and in the absence of any obvious explanatory factors. We performed a high-density screen of SLI1, a region of chromosome 16q that shows highly significant and consistent linkage to nonword repetition, a measure of phonological short-term memory that is commonly impaired in SLI. Using two independent language-impaired samples, one family-based (211 families) and another selected from a population cohort on the basis of extreme language measures (490 cases), we detected association to two genes in the SLI1 region: that encoding c-maf-inducing protein (CMIP, minP = 5.5 × 10−7 at rs6564903) and that encoding calcium-transporting ATPase, type2C, member2 (ATP2C2, minP = 2.0 × 10−5 at rs11860694). Regression modeling indicated that each of these loci exerts an independent effect upon nonword repetition ability. Despite the consistent findings in language-impaired samples, investigation in a large unselected cohort (n = 3612) did not detect association. We therefore propose that variants in CMIP and ATP2C2 act to modulate phonological short-term memory primarily in the context of language impairment. As such, this investigation supports the hypothesis that some causes of language impairment are distinct from factors that influence normal language variation. This work therefore implicates CMIP and ATP2C2 in the etiology of SLI and provides molecular evidence for the importance of phonological short-term memory in language acquisition.

Developmental speech and language disorders are a heterogeneous group of childhood conditions with variable presentation and etiology. Together, they account for 40% of pediatric referrals 1 and statements of educational need. 2 The term specific language impairment (SLI) defines a category of speech and language disorders in which a profound language impairment represents the primary deficit. 2 This disorder affects 5%-8% of preschool children 2 and is highly heritable. 3 Nonetheless, in contrast to other related developmental disabilities (e.g., dyslexia [MIM #127700] and attention deficit hyperactivity disorder [ADHD, MIM #143465]), relatively few genetic studies have been performed for SLI. SLI is a prototypical multifactorial disorder that is predicted to involve numerous genetic loci and environmental factors. 3 Three primary sites of linkage have been described 4,5 , the most robust of which is on chromosome 16q (SLI1, MIM #606711). This region is of interest because the linkage is highly specific to a single psychometric measure (nonword repetition). 4,6,7 The test for nonword repetition involves the repetition of nonsensical words of increasing length and complexity and is regarded as a measure of phonological (speech sound) processing and short-term memory. 8 Individuals with SLI typically perform particularly poorly on nonword repetition, even when their language difficulties have apparently resolved, leading to the postulation that a short-term memory deficit causes susceptibility to SLI 9 by impairing the retention of novel verbal information. 10 This paper incorporates two contingent investigations: an association screen of the SLI1 region in a cohort of language-impaired families and a subsequent replication study of detected association effects in an independent sample selected from the Avon Longitudinal Study of Parents and Children (ALSPAC) general-population cohort. 11,12 The association screen utilized 806 individuals from 211 families ascertained by the SLI Consortium (SLIC). This nuclear-family cohort was collected from five sites across the UK (The Newcomen Centre at Guy's Hospital, London; the Cambridge Language and Speech Project (CLASP) 13 ; the Child Life and Health Department at the University of Edinburgh 14 ; the Department of Child Health at the University of Aberdeen; and the Manchester Language Study 15,16 ) and included the families in whom the SLI1 linkage was originally identified. Ethical permission for each collection was granted by local ethics committees. SLIC families were all selected on the basis of a single proband with receptive and/or expressive language skills more than 1.5 SD below the normative mean for his or her age. A more detailed description of these samples and the exclusionary criteria applied to the SLIC collection can be found in previous publications. 4,6,7 Genotyping for the association screen was performed in two phases with a combination of Sequenom and Illumina technologies. We performed an initial high-density screen involving 1906 SNPs to tag all 58 genes (including introns, exons, and 5 Kb 5 0 and 2 Kb 3 0 of coding sequences) mapped to the 10 18 Any between-block gap that was more than 15 Kb in size was tagged with the Tagger algorithm. Two genes that mapped to the region (CDH13 [MIM #601364] and WWOX [MIM #605131]) were found to be larger than 1 Mb in size. For these two genes, blocks were built to cover the exonic regions only. Any region containing a SNP that met our predefined significance threshold (p < 0.001 in any one analysis or p < 0.01 across both analyses) was then supplemented with additional markers in a follow-up panel that included 138 SNPs, eight of which had previously been genotyped. Both phases of genotyping were completed prior to the replication study and were subjected to consistent quality-control procedures. The total genotype mismatch rate was 0.73% for duplicated SNPs and 0.76% for duplicated samples. Across both phases, 261 (12.7%) of SNPs were excluded at the quality-control stage. These included SNPs with a genotype rate of <80%, a minor-allele frequency of <2.5%, SNPs with unusual Beadstudio cluster patterns (Illumina) or atypical peaks in MassArray TyperAnalyser (Sequenom), SNPs with a GenTrain score of <0.5 (Illumina), and markers that showed consistent bad inheritances (>10 errors after data clean up). Across the entire region, the merged data set consisted of, on average, one SNP every 6.4 Kb. Across the known genes, there was on average one SNP every 4.5 Kb, and the largest remaining gap between blocks was 19,579 bp. Details of SNP coverage can be found in Table S1. Q-Q plots can be found in Figure S1. Given the consistent linkage between SLI1 and nonword repetition, all association analyses were based upon this measure. Our principal analysis involved the variance-components modeling of 28-item nonword repetition scores 8 within 211 SLIC families (ao option) as a quantitative trait and was performed within QTDT. 19 In addition, we performed a categorical case-control allelic test of association within PLINK. 20 In this case-control analysis, SLIC individuals with low nonword-repetition scores (>2 SD below population mean, n ¼ 79) were chosen as cases, and family members with above-average performance (>0.5 SD above population mean, n ¼ 71) were used as controls. To avoid interdependence, we selected only one case or control from each family unit.
The initial screen involved 1678 SNPs, of which thirteen (0.77%) exceeded our significance threshold, highlighting two primary regions of association (Table 1 and Figure 1). The follow-up panel chiefly included SNPs in these two regions and supported the association seen in the screen while reducing the evidence for association at other loci (Table 2 and Figure 1). Of the 105 SNPs tested in the follow-up panel, five (4.8%) were found to be significantly associated ( Table 2 and Figure 1). The first identified cluster of association lay across 26 Kb (exons 2-4) of the CMIP gene (MIM #610112; seven significant SNPs, minP ¼ 5 3 10 À7 ). This gene encodes an adaptor protein and has two isoforms, the shorter of which is involved in cell signaling pathways and is upregulated in minimal change nephrotic syndrome (MCNS), a childhood kidney disease. 21 Little is known about the function of the longer transcript. Both isoforms are expressed in the brain. 21 The second region of association was observed between exons 7 and 12 (10.8 Kb) of the ATP2C2 gene (six significant SNPs, minP ¼ 2 3 10 À5 ). This gene is one of two secretory-pathway Ca 2þ -ATPases (SPCAs) that move cytosolic calcium and manganese ions into the golgi. 22 Its expression is limited to the brain, testis, gastrointestinal tract, and respiratory tissues and mammary, salivary, and thyroid glands. 22 In the mammary gland, ATP2C2 expression facilitates the secretion of Ca 2þ into casein micelles during lactation. 23 Three lines of evidence indicate that the associations at CMIP and ATP2C2 represent separate effects. First, we did not see any indication of long-range linkage disequilibrium between the two loci (which lie almost 3 Mb apart) in the SLIC cohort or public data ( Figure S2). Second, the inclusion of a CMIP covariate in the linkage or association model did not affect the level of linkage or association seen at ATP2C2 (or vice versa for ATP2C2 covariates) ( Figure S3). Finally, in a stepwise regression model, the group mean for SLIC individuals carrying a double-risk genotype was found to be significantly lower than those who were homozygous for risk at a single locus (p ¼ 3.7 3 10 À6 , Table 3). In this model, the group mean for double-risk individuals was 15.8 points (1.05 SD) below that of individuals carrying nonrisk variants at both loci (Table 3). We therefore propose that CMIP and ATP2C2 independently regulate nonword repetition performance and together underlie the linkage seen between SLI and chromosome 16.
Our replication sample consisted of 490 cases selected from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. 11,12 This is a general-population sample that follows the development of 14,062 live-born individuals born in the southwest of England. The ALSPAC group periodically performs an assessment of the development of consenting individuals, and these measurements include tests of language ability. Informed written consent was obtained from the parents at the time of enrolment. Ethical approval for the study was obtained from the ALSPAC Law and Ethics Committee and the Local Research Ethics Committees. Because the current study focuses upon language impairment, we selected individuals from the lower extreme of language-related phenotype distributions (Children's Communication Checklist (CCC) 24 and Wechsler Objective Language Dimensions (WOLD) 25 ) for our replication sample. This included 665 individuals (10.3%) with a CCC pragmatic composite 1-3 SD below the ALSPAC population mean (123 % 3 % 145) or a WOLD listening comprehension score R2 SD below the ALSPAC population mean (%3). Of these individuals, 490 had completed a 12-item nonword repetition test. Because the genotyping in the replication sample was restricted to a single individual from each family, we performed a quantitative association analysis within PLINK 20 by using nonword repetition in a linear-regression framework. In addition, we used PLINK 20 to carry out a casecontrol analysis analogous to that described for SLIC. We selected cases and controls from the extremes of the nonword repetition performance distribution of the 490 selected individuals. As expected, given the extreme nature of the language impairment in the SLIC samples, the distribution of nonword repetition differed between the SLIC and ALSPAC cohorts. Therefore, in the replication cohort, the cut-offs used for cases and controls were less extreme than those applied for the association screen. Cases were selected from the identified replication sample to have nonword repetition scores R1 SD below the general-population mean (n ¼ 112), and controls had nonword repetition scores R1 SD above the general-population mean (n ¼ 72). Data were analyzed for three CMIP and three ATP2C2 SNPs (rs12927866, rs4265801, and rs16955705; and rs16973771, rs2875891, and rs8045507, respectively), and significant associations (p < 0.05) were seen for two CMIP and two ATP2C2 SNPs (Table 4 and Figure 2). Regression trends for ATP2C2 followed those seen in SLIC, replicating the previously described association. Association to CMIP was in an opposite direction from that described above (Table 4 and Figure 2). Although this result might represent a type I error, the consistency of significant association in light of the low number of SNPs tested supports a role for CMIP. Associations can occur in opposite directions if the relationship between the observed and causal variants differs between populations. 26 This is particularly true if multiple risk loci interact in an additive or multiplicative fashion 26 , as is predicted for CMIP. Identification of the causal variant will enable the further characterization of the relationship between risk variants in different populations.
Given the partial replication of association, we investigated whether the primary associated SNPs in ATP2C2 and CMIP had an effect upon additional language-and memory-related measures (Table S2). In SLIC, we found borderline association for ATP2C2 with measures of receptive language (oral directions 27 27 [p ¼ 0.04]), and vocabulary 28 (p ¼ 0.04). In the replication cohort, aside from nonword repetition, we only observed borderline association between ATP2C2 and counting span, a measure of working memory (p ¼ 0.01). In the replication sample, nonword repetition performance had been scored according to the number of syllables the nonword contained. For both CMIP and ATP2C2, the majority of association came from the five-syllable nonwords (p ¼ 0.016 and p ¼ 6 3 10 À4 , respectively) (Table S2). In neither sample did we observe association to reading-related tasks, which have been reported to show linkage to SLI1. 6 Nor did we find any association to digit span 28 or recalling sentences, 27 two measures that have a high memory load. This is consistent with the finding that nonword repetition correlates with SLI to a higher degree than other shortterm memory tests (e.g., digit span). The sensitivity of nonword repetition to SLI could be because it places heavier demands on processing of speech sounds than other memory tests as a result of the child's having to perceive and produce an unfamiliar sequence. 29 It is important to note that, although nonword repetition is a good marker for SLI, poor performance on nonword repetition is not a perfect correlate of this disorder. 30 In our study, 50% of SLIC probands performed poorly (>1 SD below the expected population mean) on nonword repetition, but a significant number (27%) scored above the expected population mean. These findings support recent opinion that deficits across multiple domains are required to cause persistent language impairments. 31 A recent genome-wide association study of ADHD listed a SNP (rs10514604; p ¼ 8 3 10 À7 ) in ATP2C2 within the top 30 significant associations. 32 Despite distinct defining characteristics, ADHD and SLI show a high level of comorbidity both with each other 32 and with disorders such as developmental coordination disorder, speech-sound disorder (SSD; MIM #608445), and dyslexia. [33][34][35] For example, individuals with SLI, SSD, ADHD, or dyslexia often present with linguistic deficits and impairments in short-term memory. 33 It has therefore been suggested that certain aspects of these disorders might share a common etiology. Given the high levels of co-occurrence, we did not exclude children affected by ADHD and dyslexia from our study samples. However, in some of our SLIC samples, data were available for the presence of hyperactivity, coordination, and reading problems. From this, we estimate that approximately one-third of our SLIC samples showed some evidence of ADHD or developmental coordination disorder and that approximately onehalf of our probands had reading problems. In the entire ASLPAC sample, 1.3% of individuals met criteria for ADHD. In the selected ALSPAC replication sample, the rate of ADHD increased to 3.7%. Thus, as expected, it is clear that the rate of developmental disorders across our cohorts is elevated over that expected in a population   Table 1). Eight SNPs were genotyped in both the screen and follow-up panels. All of these markers showed some evidence of association in the screen phase (p < 0.01) but had genotype success rates of <95%, and none lay within CMIP or ATP2C2. Each of the duplicated SNPs showed increased success rates and decreased association levels in the follow-up panel. SNP alleles are given with the minor allele in the SLIC sample first. Putative risk alleles are marked with an asterisk. p Quant gives the p value for the quantitative, family-based analysis. p case-cont gives the p value for the case-control analysis. p values <0.01 are shown in bold. The odds ratios indicate the ratio of case/control odds for each additional copy of the putative risk allele. Odds ratios were calculated within PLINK. The effect size is the estimated effect of each risk allele on the nonword-repetition score (in SD 5 SE). Effect sizes were calculated with MERLIN. Heritability gives the proportion of total variance explained by the SNP. Heritability estimates were calculated with MERLIN. The p Emp column gives empirical p values for the given SNP; these values were derived from permutations within QTDT or PLINK. The effects of CMIP (rs6564903) and ATP2C2 (rs11860694) on nonword-repetition performance were modeled as additive effects within a regression framework in the R package. This regression model included all available SLIC children with genotype and nonword-repetition data (n ¼ 503). Group means were calculated for each SNP in isolation (''Single SNP'' entries) and in combinations of genotypes (3 3 3 grid) across risk SNPs. Note that individuals carrying combinations of risk alleles performed significantly worse than those carrying risk variants at a single locus. Nonword-repetition scores are age adjusted and standardized against normal population controls with a mean of 100 and a SD of 15. sample. Nonetheless, the association detected in our samples shows a strong correlation to nonword-repetition ability which has repeatedly been shown to be a strong indicator of language impairment. 9,10 Furthermore, in ADHD samples, performance on the nonword-repetition task is correlated with linguistic ability rather than the presence of hyperactivity. 33,36 Thus, we conclude that variants in ATP2C2 might account for shared aspects of the linguistic deficit in SLI and ADHD. Given this possibility, we also postulate that ATP2C2 might contribute to phonological short-term memory in other developmental disorders.
Finally, we investigated the effects of ATP2C2 and CMIP on nonword-repetition performance at the population level. Across the entire unselected ALSPAC population (n ¼ 3612), there was no evidence for quantitative association between nonword-repetition ability and either locus (minP ¼ 0.48). Moreover, there were no differences in allele frequency for ATP2C2 or CMIP SNPs between either SLIC or replication-sample individuals and unselected European population controls (data not shown). Taken together, these data indicate that ATP2C2 and CMIP do not modulate nonword-repetition performance across the entire population, nor, in isolation, do they cause SNP alleles are given with the minor allele first. Putative risk alleles in the replication cohort are marked with an asterisk. p Quant shows the p value for the quantitative analysis. p < 0.05 are highlighted in bold. The odds ratio indicates the ratio of case/control odds for each additional copy of the putative risk allele. The 95% confidence intervals for the odds ratios of all significantly associated SNPs exceeded 1.0. The effect size is the estimated effect of each risk allele on the nonwordrepetition score (in SD). a predisposition to SLI. Instead, we propose that when combined with additional, as-yet-unidentified, susceptibility factors (either genetic or environmental), variants in ATP2C2 and CMIP have a detrimental effect upon nonword repetition performance and thus heighten the risk of developmental language impairments. This situation demonstrates a fundamental principle often overlooked in the mapping of complex disorders: that genetic variants might have selective effects in specific populations depending upon the genetic and environmental background. The question as to whether SLI constitutes a qualitatively distinct disorder caused by abnormal development of language abilities or merely represents the tail end of normal linguistic development is a matter of recent debate. 37 Although the absence of association in our population sample could reflect insufficient sample sizes or the insensitivity of psychometric tests to quantify variation beyond the lower extremes of the spectrum, it is obvious that the effects of ATP2C2 and CMIP upon nonword-repetition performance are particularly pertinent to individuals with language difficulties. As such, this investigation provides molecular evidence that, at least in terms of the effects described here, SLI represents a distinct disorder caused by genetic variants discrete from those that influence language ability in the general population.
In summary, we have used a positional fine-mapping approach to demonstrate association between ATP2C2 and CMIP and nonword repetition performance across two independent language-impaired populations. We propose that variants in both loci combine to modulate nonword-repetition performance in language-impaired populations. Both genes are expressed in the brain and represent good candidates for language-and memoryrelated processes. ATP2C2 is involved in the translocation of cytosolic calcium and manganese ions to the golgi. 22 Calcium homeostasis is important for the regulation of many neuronal processes, including working memory, synaptic plasticity, and neuronal motility 38 , and manganese dysregulation has been linked to Parkinsonism (MIM #168600), Alzheimer disease (MIM #104300), and disordered memory. 39 The functional role of CMIP is less defined, but it is known to interact with filamin A (MIM #300017) 40 and the NF-kappaB subunit RelA (MIM #164014). 41 The filaminA protein is involved in the reorganization of the actin cytoskeleton, which is of importance in the formation of the dendritic spine. 40 The NF-kB family of transcription factors plays a central role in many neuronal processes, including synaptic activity and memory formation, and members of this family have been implicated in neurodegenerative disorders. 42 Further characterization of the observed associations has enabled us to infer that SLI represents a qualitatively distinct disorder caused by a combination of genetic variants that disrupt multiple pathways important to the development of language. It is anticipated that the functional characterization of ATP2C2 and CMIP will promote a better understanding of the molecular basis of language acquisition and aid in the diagnosis and treatment of individuals affected by language disorders.

Supplemental Data
Supplemental Data include three figures and two tables and can be found with this article online at http://www.ajhg.org/.