Two complement receptor one alleles have opposing associations with cerebral malaria and interact with α+thalassaemia

Malaria has been a major driving force in the evolution of the human genome. In sub-Saharan African populations, two neighbouring polymorphisms in the Complement Receptor One (CR1) gene, named Sl2 and McCb, occur at high frequencies, consistent with selection by malaria. Previous studies have been inconclusive. Using a large case-control study of severe malaria in Kenyan children and statistical models adjusted for confounders, we estimate the relationship between Sl2 and McCb and malaria phenotypes, and find they have opposing associations. The Sl2 polymorphism is associated with markedly reduced odds of cerebral malaria and death, while the McCb polymorphism is associated with increased odds of cerebral malaria. We also identify an apparent interaction between Sl2 and α+thalassaemia, with the protective association of Sl2 greatest in children with normal α-globin. The complex relationship between these three mutations may explain previous conflicting findings, highlighting the importance of considering genetic interactions in disease-association studies.


Introduction
Complement Receptor One (CR1) plays a key role in the control of complement activation and the immune clearance of C3b/C4b-coated immune complexes. CR1 is expressed on a range of cells including red blood cells (RBCs), leucocytes and glomerular podocytes. A number of CR1 polymorphisms have been described, including four molecular weight variants and variation in the number of CR1 molecules expressed on the surface of RBCs (reviewed by Krych-Goldberg et al., 2002; Schmidt et al., 2015). Missense mutations of CR1 form the basis of the Knops blood group system of antigens, that includes the antithetical antigen pairs of Swain-Langley 1 and 2 (Sl1 and Sl2) and McCoy a and b (McCa and McCb) (Moulds, 2010). The non-synonymous single nucleotide polymorphisms (SNPs) A4828G (rs17047661) and A4795G (rs17047660) within exon 29 of the CR1 gene give rise to the Sl1/Sl2 and McCa/McCb alleles, encoding R1601G and K1590E, respectively (Moulds et al., 2001) (Figure 1). CR1 has been implicated in the pathogenesis of multiple diseases, with epidemiological and in vitro data suggesting a role in malaria (Schmidt et al., 2015).

eLife digest
Malaria kills more than half a million children in Africa every year. The disease is caused by the Plasmodium falciparum parasite, and mosquitos infected with the parasites spread them to humans when they bite. Once inside a human, the parasites infect the red blood cells. In severe cases, these red blood cells can stick to the walls of small blood vessels that supply the brain and so hinder the flow of oxygen, causing a coma. This is called cerebral malaria. Malaria can also result in the destruction of many oxygen-carrying red blood cells, which causes severe anemia. Both cerebral malaria and severe anemia can lead to death.
Small changes (called mutations) in certain human genes can protect against malaria. Over time, mutations that protect people living in Africa from dying from malaria have been passed down through generations. A good example is the sickle cell mutation, which causes red blood cells to be of an unusual shape, but also affects the ability of malaria parasites to grow normally within red cells. Finding new mutations that protect against malaria may help scientists understand how severe malaria happens and eventually develop new drugs and vaccines against the disease. Some studies have found that mutations in a gene called complement receptor 1 (CR1) may be protective, although others have disagreed. Now, Opi, Swann et al. show that children with one of the CR1 mutations were one-third less likely to get cerebral malaria and half as likely to die as children without the mutation. In the study, genetic and health information on more than 5,500 children in Kenya were analyzed to see if the severity of malaria differed depending on whether they had a CR1 mutation. They also found that the CR1 mutation is only protective against severe malaria when the child does not have another malaria-protective mutation called α-thalassemia. In children with α-thalassemia, the CR1 mutation does not make a difference.
The interaction between the CR1 mutation and α-thalassemia may explain why some studies did not show a benefit of CR1. If the researchers did not include α-thalassemia in their assessment, they could not have seen the whole picture. Future studies showing how the CR1 mutation protects against cerebral malaria could help identify new treatments that prevent severe disease or death. More study of interactions between genes that play a role in malaria may also be helpful.

Figure 1. ... Schmidt et al. (2015) and Krych-Goldberg et al. (2002). The ectodomain of CR1 is composed of 30 Complement Control Protein (CCP) domains which are organized into four 'Long Homologous Repeats' (LHR). The single-nucleotide polymorphisms determining the Sl and McC antigens of the Knops blood group system are found in CCP 25 in LHR-D (red). Various functions have been mapped to different regions of CR1, including Site 1 (decay accelerating activity for C3 convertases; binding of the complement component C4b and the P. falciparum invasion ligand PfRH4), and Site 2 (cofactor activity for Factor I; binding of C3b and C4b and P. falciparum rosetting). LHR-D is thought to bind C1q and Mannose Binding lectin (MBL), but the specific binding sites have not been mapped. TM, transmembrane region; CYT, cytoplasmic tail. DOI: https://doi.org/10.7554/eLife.31579.003
The Sl2 and McCb alleles occur at high frequencies only in populations of African origin (Figure 2) (Thathy et al., 2005; Zimmerman et al., 2003; Moulds et al., 2004; Noumsi et al., 2011; Fitness et al., 2004; Covas et al., 2007; Gandhi et al., 2009; Yoon et al., 2013; Hansson et al., 2013; Kariuki et al., 2013; Eid et al., 2010), which, given the historical prevalence of the malaria-causing parasite Plasmodium falciparum in sub-Saharan Africa, might suggest a possible survival advantage against malaria (Rowe et al., 1997; Rowe et al., 2000). CR1 is a receptor for the invasion of RBCs by P. falciparum merozoites (Spadafora et al., 2010; Tham et al., 2010) and for the formation of clusters of P. falciparum-infected RBCs (iRBCs) and uninfected RBCs, known as rosettes (Rowe et al., 1997). The rosetting phenotype is associated with severe malaria in sub-Saharan Africa (Doumbo et al., 2009), with pathological effects likely due to the obstruction of microcirculatory blood flow (Kaul et al., 1991). RBCs from donors with the high-frequency African CR1 Knops mutations bind poorly to the parasite ligand P. falciparum erythrocyte membrane protein-1 (PfEMP1) that mediates rosetting by iRBCs, potentially protecting against severe malaria by reducing rosetting (Rowe et al., 1997). Nevertheless, epidemiological data supporting this possibility are contradictory, with some studies showing an association between Sl and McC genotypes and severe malaria (Thathy et al., 2005; Kariuki et al., 2013; Tettey et al., 2015) and others finding none (Zimmerman et al., 2003; Hansson et al., 2013; Jallow et al., 2009; Manjurano et al., 2012; Toure et al., 2012; Rockett et al., 2014). Some previous studies have not considered Sl and McC genotypes together in the same statistical model, despite their physical adjacency in the CR1 molecule, nor taken into account potential interactions with other malaria resistance genes.
Given the important biological role of CR1 in malaria host-parasite interactions, we aimed to clarify the relationship between the Sl and McC alleles and severe malaria in a case-control study of Kenyan children. These investigations were supplemented with a separate longitudinal cohort study of Kenyan children, examining the associations of these alleles with uncomplicated malaria and other common childhood illnesses. Finally, through ex vivo laboratory studies, we investigated the influence of these alleles on the formation of P. falciparum rosettes as a potential functional explanation for these results.

Results
The Sl2/Sl2 genotype is associated with protection against cerebral malaria and death in the Kenyan case-control study
Data were obtained from 5545 children enrolled in a case-control study of severe malaria (Figure 3). The general characteristics of the cases and controls are shown in Supplementary file 1A, and the characteristics of the dataset by Sl and McC genotype are shown in Supplementary file 1B. The Sl2 and McCb allele frequencies (0.68 and 0.16 respectively) were comparable to those in other African populations (Figure 2). There was no significant deviation from Hardy-Weinberg equilibrium for the Sl or McC genotypes among controls (Supplementary file 1C).
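As an illustration of the allele-frequency and Hardy-Weinberg checks mentioned above, the following minimal Python sketch computes both from genotype counts. The counts are hypothetical (chosen only so that the allele frequency equals 0.68), the function names are ours, and a Pearson chi-square test is assumed; the source does not state which Hardy-Weinberg test was applied.

```python
# Illustrative sketch only. Genotype counts are hypothetical and a Pearson
# chi-square test is an assumption; the study's actual test is not stated.


def allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of the alternative allele from the three genotype counts."""
    n_individuals = n_ref_hom + n_het + n_alt_hom
    return (n_het + 2 * n_alt_hom) / (2 * n_individuals)


def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Pearson chi-square statistic (1 degree of freedom) for deviation from
    Hardy-Weinberg proportions. Requires both alleles to be present."""
    n = n_ref_hom + n_het + n_alt_hom
    q = allele_frequency(n_ref_hom, n_het, n_alt_hom)
    p = 1.0 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_alt_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution, 1 df, 5% significance level.
CHI_SQUARE_CRITICAL_1DF_5PCT = 3.841

# Hypothetical counts: Sl1/Sl1, Sl1/Sl2, Sl2/Sl2 among 100 controls.
counts = (10, 44, 46)
freq = allele_frequency(*counts)
stat = hwe_chi_square(*counts)
print(f"Sl2 allele frequency = {freq:.2f}")
print(f"chi-square = {stat:.3f}, deviates at 5% level: {stat > CHI_SQUARE_CRITICAL_1DF_5PCT}")
```

With small samples or rare alleles an exact test would usually be preferred over the chi-square approximation.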
Using a simple logistic regression model containing only Sl and McC genotypes (referred to as the unadjusted analysis below), we found a non-significant association between the Sl2 allele and severe malaria overall, with the Sl2/Sl2 genotype being associated with an OR for severe malaria of 0.90 (95% CI 0.79-1.01; p=0.07) (Supplementary file 1D). We attempted to refine this signal by fitting a more complete model to the data, including the potential confounding factors of ethnicity, location, sickle cell trait, ABO blood group and α+thalassaemia genotype, as well as considering possible first-order interactions between terms (referred to as the full adjusted analysis below). A significant protective association was observed for Sl2 in the recessive form (adjusted Odds Ratio (aOR) 0.78; 95% CI 0.64-0.95; p=0.011), which was most marked for cerebral malaria (aOR 0.67; 0.52-0.87; p=0.006) (Figure 4 and Table 1). The Sl2/Sl2 genotype was also associated with significant protection against death from severe malaria (aOR 0.50; 0.30-0.80; p=0.002), and death among children admitted with a specific diagnosis of cerebral malaria in the full adjusted analysis (aOR 0.44; 0.23-0.78; p=0.007) (Figure 4 and Table 1). Unexpectedly, we observed a significant interaction between Sl2 and α+thalassaemia genotype, such that the protective associations of Sl2 were only seen in individuals of normal α-globin genotype (Figure 5). We found no evidence for an association between Sl2 and any other clinical form of severe malaria (Table 1), or with P. falciparum parasite density (Figure 6).
The McCb allele is associated with increased susceptibility to cerebral malaria and death in the Kenyan case-control study
The unadjusted analysis showed a borderline significant association between McCb and increased susceptibility to severe malaria overall (OR 1.17; 1.00-1.25; p=0.056, Supplementary file 1D), and [...] (Figure 4 and Table 1). We found no association between McCb and any other clinical form of severe malaria (Table 1 and Supplementary file 1D) or with P. falciparum parasite density (Figure 6).

Analysis of haplotypic effects and genotype combinations
[...] observed in our data. We therefore reanalyzed the data under a haplotype model in which the per-individual count of each of the three observed haplotypes was included as a predictor along with the potential confounding factors, as well as under a genotypic model in which the count of each of the six possible Sl/McC genotype combinations was included as a predictor (Appendix 2). These analyses suggest an additive protective association with the Sl2/McCa haplotype (aOR = 0.85; 0.75-0.96; p=0.007), with broadly consistent results observed for analysis of genotype combinations (Supplementary file 1E and 1F). Thus, the opposing effects of Sl2 and McCb observed above could plausibly result from the protective association of a single haplotype at the locus, although this is difficult to distinguish from the individual SNPs acting independently and additively based on the statistical evidence alone.
The Sl2/Sl2 genotype was associated with protection against uncomplicated malaria in the Kenyan longitudinal cohort study
We next examined the association between Sl2 and McCb alleles and uncomplicated malaria in a longitudinal prospective study of 208 Kenyan children. General characteristics of the cohort study population by Sl and McC genotypes are shown in Supplementary file 1G. After adjusting for variables known to influence malaria susceptibility, the Sl2 allele was associated with a >50% reduction in the incidence of uncomplicated malaria (additive model) (Table 2; the number of episodes, incidence and unadjusted Incidence Rate Ratios for the diseases studied in the longitudinal cohort are shown in Supplementary file 1H, I and J). Once again, a significant interaction was seen with α+thalassaemia, such that the protective association of Sl2 was only demonstrated in children of normal α-globin genotype (Table 3). We found no significant association between the McCb allele and uncomplicated malaria (Table 2).
The McCb allele was associated with protection from common non-malarial childhood diseases in the Kenyan longitudinal cohort study
The data shown above are incompatible with malaria being the selective pressure for McCb in the Kenyan population, and suggest that other life-threatening childhood diseases may have been responsible for selection of McCb. We therefore used the same longitudinal cohort study to investigate whether the McCb and Sl2 alleles influence the risk of other childhood diseases. McCb was associated with borderline significant protection against several common infectious diseases including LRTIs, URTIs and gastroenteritis (Table 2). Sl2 was associated with a borderline reduced incidence of gastroenteritis (Table 2). The association of McCb with gastroenteritis was predominantly seen in children of normal α-globin genotype, echoing the interaction seen with Sl2 and malaria.
The Sl2 allele was associated with reduced ex vivo rosette frequency in P. falciparum clinical isolates from Mali
A previous in vitro study based on a culture-adapted P. falciparum parasite line suggested that RBCs from Sl2 genotype donors had a reduced ability to form rosettes, providing a possible mechanism for protection against severe malaria (Rowe et al., 1997). P. falciparum clinical isolates were not available from the Kenyan case-control study to investigate this potential mechanism in that population. However, the association of Sl and McC genotypes with ex vivo P. falciparum rosette frequency could be examined using 167 parasite isolates from a case-control study of children with clinical malaria in Mali (Doumbo et al., 2009). Analysis of this small case-control study suggested a protective association between the Sl2/Sl2 genotype and cerebral malaria (aOR 0.35, 95% CI 0.12-0.89, p=0.024) and the Sl2/Sl2-McCa/McCa genotype combination was associated with protection against cerebral malaria (aOR 0.14, 95% CI 0.02-0.84, p=0.031, Appendix 1). As such, we considered samples from this population to be appropriate for testing rosetting as a potential mechanism of action. The median rosette frequency (percentage of iRBCs that form rosettes) was significantly lower in P. falciparum isolates from malaria patients with one or more Sl2 alleles than in isolates from Sl1/Sl1

Discussion
The data presented here provide epidemiological evidence supporting a role for CR1 in the pathogenesis of cerebral malaria. Two neighboring CR1 polymorphisms belonging to the Knops blood group system of antigens had opposing associations with the risk of cerebral malaria. The Sl2/Sl2 genotype was associated with protection against cerebral malaria and death, while the McCb allele was associated with increased susceptibility (Figure 4 and Table 1). The Sl2 allele was also associated with significant protection against uncomplicated malaria, whereas the McCb allele was associated with borderline protection against several common infections in Kenyan children (Table 2). The protective association of Sl2 against cerebral malaria, death and uncomplicated malaria was influenced by α+thalassaemia, being most evident in children of normal α-globin genotype.
The protective association between Sl2 and cerebral malaria was first reported in a small case-control study from western Kenya (Thathy et al., 2005), but has remained controversial, especially as most prior studies have been underpowered. Hence, our study is the first adequately powered independent sample set that replicates the protective association between Sl2 and cerebral malaria. Other studies found no consistent significant associations between Sl genotypes and severe malaria (Zimmerman et al., 2003; Hansson et al., 2013; Jallow et al., 2009; Manjurano et al., 2012; Toure et al., 2012; Rockett et al., 2014), including a recent multi-centre candidate gene study that included the sample set analysed here. A weak association between McCb and an increased odds ratio for cerebral malaria was shown in the multi-centre study.
The complex interactions between Sl2, McCb and α+thalassaemia revealed by our study provide possible reasons for the previous inconsistent findings. Although Sl2 was associated with protection against cerebral malaria in our study, McCb and α+thalassaemia both counteracted this effect. The protective association of Sl2 was observed most clearly when both McCb and α+thalassaemia genotypes were included in the statistical model, something that has not been considered in previous studies. It is possible that some of the other discrepant genetic associations with severe malaria might result from interactions between multiple loci that vary across populations and may not be revealed by standard analyses. Biologically, it makes sense to account for McC genotype when investigating associations with Sl2 and vice versa, as the two polymorphisms encode changes only 11 amino acids apart in the CR1 molecule (Figure 1). The possibility that the observed association might be due to a haplotype rather than independent effects of Sl and McC cannot be discounted.
The interaction we describe here between Sl2 and α+thalassaemia is reminiscent of the epistatic interactions that have been observed between α+thalassaemia and other malaria-protective polymorphisms including sickle cell trait (HbAS) (Williams et al., 2005a) and haptoglobin (Atkinson et al., 2014). It is possible, therefore, that α+thalassaemia has a broad effect on multiple malaria-protective polymorphisms, influencing their restricted global frequencies (Penman et al., 2009), and contributing to the discrepant outcomes of previous association studies. Recent large genetic association studies on malaria do not include data on α+thalassaemia, because the causal deletions are not typed on automated platforms, instead requiring manual genotyping using labour-intensive PCR-based methods (Chong et al., 2000). Replication of the Sl2/α+thalassaemia interaction will be required, and we suggest that α+thalassaemia genotype should be included as an important confounding variable in future malaria epidemiological studies and that efforts should continue to discover the mechanism of protection afforded by α+thalassaemia, which remains controversial (Carlson et al., 1994; Fowkes et al., 2008; Krause et al., 2012; Opi et al., 2014; Opi et al., 2016).
We examined one possible biological mechanism by which the Sl2 allele might influence cerebral malaria by studying P. falciparum rosetting, a parasite virulence factor associated with severe malaria in African children (Doumbo et al., 2009). Previous in vitro experiments showed that CR1 is a receptor for P. falciparum rosetting on uninfected RBCs, and that RBCs serologically typed as negative for the Sl1 antigen (likely to be from donors with Sl1/Sl2 or Sl2/Sl2 genotypes) (Moulds et al., 2001) show reduced binding to the parasite rosetting ligand PfEMP1 (Rowe et al., 1997). In this study, we found a significantly lower median rosette frequency in P. falciparum parasite isolates from Malian patients with Sl2 genotypes compared to Sl1/Sl1 controls (Figure 4). Therefore, similar to HbC (Fairhurst et al., 2005), blood group O (Rowe et al., 2007) and RBC CR1 deficiency (Cockburn et al., 2004), it is possible that reduced rosetting and subsequent reduced microvascular obstruction (Kaul et al., 1991) may in part explain the protective association of Sl2 against cerebral malaria. However, given the protective association of Sl2 with uncomplicated malaria, and the possible associations of Sl2 and McCb with other common childhood infections, it seems likely that the Knops polymorphisms may be associated with broader effects, for example on the complement regulatory functions of CR1. Previously, we have shown that neither cofactor activity for the breakdown of C3b and C4b nor binding to C1q are influenced by the Sl2 and McCb mutations (Tetteh-Quarcoo et al., 2012). In addition, we can find no association between Knops genotype and CR1 clustering on erythrocytes (Paccaud et al., 1988; Swann et al., 2017). However, other potential effects such as altered immune complex binding and processing or activation of the complement lectin pathway via mannose-binding lectin (Ghiran et al., 2000) have not yet been investigated.
Our studies have several limitations. McCb homozygotes are relatively infrequent in Kenya, which limited our power to detect associations with McCb in the homozygous state. Our longitudinal cohort study generated several values of borderline statistical significance for the McCb allele, which are inconclusive. Studies with larger sample sizes will be needed to examine the specific associations of McCb with assorted childhood diseases. Another limitation is that our functional (Mali) and epidemiological (Kenya) studies were conducted in different populations. The mechanisms of rosetting and associations with malaria severity are thought to be similar across sub-Saharan Africa, suggesting that data collected in either location are likely to be comparable. Furthermore, examination of a small set of cerebral malaria cases and controls from Mali suggests a protective association between Sl2/Sl2 genotype and cerebral malaria also occurs in this setting (Appendix 1). Ideally, future epidemiological and functional studies of the effects of specific polymorphisms on malaria should be conducted within a single population, although this remains logistically challenging.
In conclusion, we show that two high frequency CR1 polymorphisms have opposing associations with cerebral malaria and death in Kenyan children. While the Sl2 allele may have reached high frequency in African populations by conferring a protective advantage against cerebral malaria, our data suggest that McCb arose due to a survival advantage afforded against other non-malarial infections (Noumsi et al., 2011; Fitness et al., 2004). Sl2 may in part protect against cerebral malaria by reducing rosetting, but additional effects seem likely. Further work is needed to examine both the epidemiological effects of the Knops polymorphisms on diverse childhood diseases, and the biological effects of the Sl2 and McCb polymorphisms on CR1 function. Future epidemiological studies should account for the effect of α+thalassaemia on the associations of Sl2 and McCb with malaria and other infectious diseases.

Datasets studied
This study uses data from three sources: a Kenyan case-control study of severe malaria, with samples collected between 2001 and 2010; a Kenyan longitudinal cohort study, with samples collected between 1998 and 2001; and a Malian case-control study performed between July 2000 and December 2001. Historic datasets (i.e. >10 years old) are widely used in genetic epidemiological studies of malaria due to the logistical challenges of sample collection in malaria endemic countries and the changing epidemiological patterns of disease.

The Kenyan study area
All epidemiological and clinical studies in Kenya were carried out in the area defined by the Kilifi Health and Demographic Surveillance System (KHDSS), with Kilifi County Hospital (KCH) serving as the primary point of care (Scott et al., 2012). Malaria transmission is seasonal in this region following the long and short rains. An Entomological Inoculation Rate (EIR) of up to 50 infective bites per person per year was measured in the late 1990s (Mbogo et al., 2003), but transmission has since declined (O'Meara et al., 2008).

The Kenyan case-control study
Between January 2001 and January 2008, children aged <14 years who were admitted to KCH with severe malaria were recruited as cases, as described previously, except that children who were resident outside the KHDSS were excluded (Figure 3). Severe malaria was defined as the presence of blood-film positive P. falciparum infection complicated by one or more of the following features: cerebral malaria (CM) (a Blantyre coma score (BCS) of <3), n = 943; severe malarial anaemia (SMA) (hemoglobin concentration of <5 g/dl), n = 483; respiratory distress (RD) (abnormally deep breathing), n = 522; or 'other severe malaria' (no CM, SMA or RD but other features including prostration (BCS 3 or 4), hypoglycemia and hyperparasitemia), n = 318. Controls (n = 3829) consisted of children 3-12 months of age who were born consecutively within the KHDSS study area between August 2006 and September 2010 and were recruited to an ongoing genetic cohort study (Williams et al., 2009). As such, controls were representative of the general population in terms of ethnicity and residence but not of age. The use of controls who are considerably younger than cases differs from the classical structure of a case-control study. However, this method (using cord blood or infant samples as controls) has been widely used in African genetic association studies (e.g. Band et al., 2013; Busby et al., 2016; Clarke et al., 2017) and is the most logistically feasible way of collecting sufficiently large numbers of control samples in many sub-Saharan African settings.
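The case definitions above can be expressed as a small rule set. The following Python sketch is illustrative only: the function and argument names are hypothetical, and it assumes the child has already been confirmed as a blood-film positive severe malaria admission, so that anyone without CM, SMA or RD falls into 'other severe malaria'. Because the features are not mutually exclusive, a child can carry more than one label.

```python
# Illustrative sketch only. Names are hypothetical; thresholds are those
# stated in the text (BCS < 3, hemoglobin < 5 g/dl, abnormally deep breathing).


def severe_malaria_features(blantyre_coma_score, hemoglobin_g_dl, deep_breathing):
    """Return the list of severe malaria features for a confirmed case.

    Assumes the child is already a blood-film positive severe malaria
    admission; this function only assigns the clinical sub-phenotypes.
    """
    features = []
    if blantyre_coma_score < 3:
        features.append("CM")   # cerebral malaria
    if hemoglobin_g_dl < 5:
        features.append("SMA")  # severe malarial anaemia
    if deep_breathing:
        features.append("RD")   # respiratory distress
    if not features:
        features.append("other severe malaria")
    return features


print(severe_malaria_features(2, 4.1, False))   # CM and SMA together
print(severe_malaria_features(4, 8.0, False))   # none of CM, SMA or RD
```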

Sample processing and quality control for the Kenyan case-control study
The Sl and McC polymorphisms were originally typed as part of a larger study by Rockett et al., 2014, which included case-control data from 12 global sites. In Kenya, 0.5 ml blood samples were collected into EDTA tubes and DNA extracted using Qiagen DNeasy blood kits (Qiagen, Crawley, UK). DNA was stored at −20 °C and shipped frozen to Oxford. Sample processing is described in detail in the supplementary methods of Rockett et al., 2014. Briefly, samples underwent a whole-genome amplification step using Primer-Extension Pre-Amplification. Genotyping was performed using SEQUENOM iPLEX Gold with 384 samples processed per chip. In Rockett et al.'s study, samples were typed for 73 SNPs; 55 of these SNPs were chosen on the basis of a known association with severe malaria, 3 SNPs were used to confirm gender and the remaining 15 SNPs to aid quality control. Samples were excluded if they did not have clinical data for gender or if genotypic gender of the sample did not match clinical gender. Samples were included if they were successfully genotyped for more than 90% of 65 'analysis' SNPs. The Kenyan samples studied by Rockett et al. originally comprised 2741 cases of severe malaria and 4183 controls. After the quality control of both phenotypic and genotypic data described above, 2268 cases and 3949 controls were analysed by Rockett et al., 2014.
Comparison between this study and Rockett et al., 2014
The 2268 Kenyan cases and 3949 controls that were analyzed by Rockett et al., 2014 were the starting point for our study. Children living outside the KHDSS were excluded, because this allowed us to use 'location' as a random effect in the final statistical model, which greatly improved model fit. Children with missing genotypes (Sl, McC, sickle cell, α+thalassaemia or ABO blood group) were also excluded (Figure 3). After applying these exclusion criteria, 1716 severe malaria cases and 3829 community controls were available for analysis.
Hence, the number of severe malaria cases differs between our study and Rockett et al., 2014 due to differing exclusion criteria. The inclusion of the severe malaria cases who lived outside the KHDSS into our statistical models did not alter the findings of our analysis (Supplementary file 1K). In both our study and Rockett et al., 2014, the control samples were identical and all came from within the KHDSS. Our study has 120 fewer controls than Rockett et al., 2014 due to missing genotypes, because we only used controls for whom full Sl, McC, sickle cell genotype, α+thalassaemia genotype and ABO blood group data were available.
Our analytical methods differed from Rockett et al., 2014, in that we included both Sl and McC in the same statistical model and adjusted for confounders, whereas Rockett et al. examined each SNP independently.
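The sample-level rules described in this section (the gender checks and genotyping call rate from Rockett et al., 2014, followed by the residence and complete-genotype requirements of this study) can be sketched as two filters. This is a minimal illustration under stated assumptions: the record layout, field names and function names are hypothetical and do not reproduce the authors' pipeline.

```python
# Illustrative sketch only. Field names and record layout are hypothetical.

ANALYSIS_SNP_COUNT = 65      # number of 'analysis' SNPs stated in the text
MIN_CALL_FRACTION = 0.90     # samples need MORE than 90% of these typed
REQUIRED_GENOTYPES = ("Sl", "McC", "sickle", "alpha_thal", "ABO")


def passes_genotyping_qc(clinical_gender, genotypic_gender, snps_called):
    """QC rules attributed to Rockett et al., 2014 in the text."""
    if clinical_gender is None:
        return False                      # no clinical gender recorded
    if clinical_gender != genotypic_gender:
        return False                      # genotypic and clinical gender differ
    return snps_called / ANALYSIS_SNP_COUNT > MIN_CALL_FRACTION


def eligible_for_this_study(record):
    """Additional exclusions applied in this study."""
    if not record["resident_in_khdss"]:
        return False                      # lived outside the KHDSS
    return all(record["genotypes"].get(g) is not None for g in REQUIRED_GENOTYPES)


example = {
    "resident_in_khdss": True,
    "genotypes": {"Sl": "Sl2/Sl2", "McC": "a/a", "sickle": "AA",
                  "alpha_thal": "normal", "ABO": "O"},
}
print(passes_genotyping_qc("F", "F", 59), eligible_for_this_study(example))
```

Note that 59 of 65 SNPs (about 90.8%) passes the call-rate rule, whereas 58 of 65 (about 89.2%) does not.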

The Kenyan longitudinal cohort study
This study has been described in detail previously (Nyakeriga et al., 2004). Briefly, this study was established with the aim of investigating the immuno-epidemiology of uncomplicated clinical malaria and other common childhood diseases in the northern part of the KHDSS study area, approximately 15 km from KCH (Williams et al., 2005b). The study was carried out between August 1998 and August 2001 involving children aged 0-10 years recruited either at the start of the study or at birth when born into study households during the study period. They were actively followed up on a once-weekly basis for both malaria and non-malaria related clinical events. In addition, on presentation with illnesses, cohort members were referred to a dedicated outpatient clinic for more detailed diagnostic tests. The cohort was monitored for the prevalence of asymptomatic P. falciparum infection through four cross-sectional surveys carried out in March, July and October 2000 and June 2001. Exclusion criteria included migration from the study area for more than 2 months, the withdrawal of consent and death. Uncomplicated clinical malaria was defined as fever (axillary temperature of >37.5 °C) in association with a P. falciparum positive slide at any density. The most common non-malaria-related clinical events reported during the study period included upper respiratory tract infections (URTIs), lower respiratory tract infections (LRTIs), gastroenteritis, helminth infections and skin infections, as defined in detail previously (Williams et al., 2005b). Malaria negative fever was defined as an axillary temperature of >37.5 °C in association with a slide negative for P. falciparum. This analysis includes 208 children aged <10 years for whom full Sl, McC, sickle cell genotype, α+thalassaemia genotype and ABO blood group data were available.

The Malian case-control study
This study has been described in detail previously (Lyke et al., 2003). Briefly, between July 2000 and December 2001, children ranging from 1 month to 14 years of age were recruited into a case-control study in the Bandiagara region in East Central Mali, an area of intense and seasonal P. falciparum malaria infection. In order to address the specific question of whether the Sl2/Sl2 genotype is associated with protection against cerebral malaria in Mali, only the subset of children suffering strictly defined cerebral malaria (a BCS of <3, with other obvious causes of coma excluded, n = 34) or uncomplicated malaria (n = 184, symptomatic children with P. falciparum parasitemia and an axillary temperature ≥37.5 °C, in the absence of another clear cause of fever), and for whom Sl and McC genotyping was available, were analyzed.

Ex vivo rosetting
The rosette frequency (percentage of mature infected erythrocytes forming rosettes with two or more uninfected erythrocytes) of P. falciparum isolates from patients recruited into the Mali case-control study was determined by microscopy after short term culture (18-36 hr), as described in detail previously (Doumbo et al., 2009). Of the 209 isolates studied previously (Doumbo et al., 2009), 167 were successfully genotyped for the Sl and McC alleles and are analysed here. The rosetting assays were performed before we genotyped the study participants, excluding observer bias. The rosette frequencies of parasites from hosts with differing Sl and McC genotypes were compared by a Kruskal-Wallis test with Dunn's multiple comparisons (Prism v6.0, Graphpad Inc, San Diego, CA).
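The rosette frequency definition above reduces to a simple proportion. The sketch below is a minimal illustration with hypothetical counts and names: each entry is the number of uninfected erythrocytes bound to one mature infected erythrocyte, and a rosette is scored when two or more are bound. It does not reproduce the microscopy protocol of Doumbo et al., 2009.

```python
# Illustrative sketch only. Input data and names are hypothetical.
from statistics import median


def rosette_frequency(bound_uninfected_counts, min_bound=2):
    """Percentage of mature infected erythrocytes forming rosettes.

    bound_uninfected_counts: one integer per infected erythrocyte counted,
    giving the number of uninfected erythrocytes bound to it.
    """
    if not bound_uninfected_counts:
        raise ValueError("at least one infected erythrocyte must be counted")
    rosetting = sum(1 for n in bound_uninfected_counts if n >= min_bound)
    return 100.0 * rosetting / len(bound_uninfected_counts)


# Hypothetical isolate: 10 infected erythrocytes scored, 4 of them in rosettes.
isolate = [0, 1, 2, 3, 0, 5, 0, 0, 2, 0]
print(rosette_frequency(isolate))

# Group medians (hypothetical frequencies per isolate) are what a
# rank-based comparison such as Kruskal-Wallis would contrast.
print(median([40.0, 12.5, 0.0, 25.0]))
```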

Laboratory procedures
DNA was extracted either from fresh or frozen whole blood by proprietary methods using either the semi-automated ABI PRISM 6100 Nucleic acid prep station (Applied Biosystems, Foster City, CA) or using QIAamp DNA Blood Mini Kits (Qiagen, West Sussex, UK). SNPs giving rise to the Sl and McC alleles were genotyped using either the SEQUENOM iPLEX Gold multiplex system (Agena Biosciences, Hamburg, Germany) (Kenyan study) or by an established PCR-RFLP method as described previously (Malian study). Genotyping for sickle cell trait (HbAS) and the common African α+thalassaemia variant caused by a 3.7 kb deletion in the HBA gene was performed by PCR as described in detail elsewhere (Chong et al., 2000;Waterfall and Cobb, 2001).

Statistical analysis
The effects of the Sl and McC alleles were examined in genotypic, dominant, recessive and additive models of inheritance, with the best fitting model selected based on the Akaike information criterion (AIC). Analyses for the Kilifi case-control study were performed in R (R Foundation for Statistical Computing, Vienna, Austria) (R Development Core Team, 2010) using the 'ggplot2', 'lme4', and 'HardyWeinberg' packages (Wickham, 2009;Bates et al., 2015;Graffelman and Camarena, 2008), while analyses for the longitudinal study were performed in Stata v11.2 (StataCorp, Texas, USA). In both studies, a p value of <0.05 was considered statistically significant. Graphs were generated using R or Prism v6.0 (Graphpad Inc, San Diego, CA).
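The four models of inheritance differ only in how the number of variant alleles is coded as a predictor. The sketch below shows one conventional coding and the AIC comparison (AIC = 2k - 2 ln L). The function names are hypothetical, and the log-likelihoods in the usage example are invented for illustration and are not results from this study.

```python
def encode_genotype(n_variant_alleles, model):
    """Code a genotype (0, 1 or 2 variant alleles) under a genetic model.

    Returns a list of predictor values: one column for the dominant,
    recessive and additive models, two indicator columns for the
    genotypic model (heterozygote, variant homozygote).
    """
    if n_variant_alleles not in (0, 1, 2):
        raise ValueError("allele count must be 0, 1 or 2")
    if model == "dominant":      # one or two copies versus none
        return [int(n_variant_alleles >= 1)]
    if model == "recessive":     # two copies versus none or one
        return [int(n_variant_alleles == 2)]
    if model == "additive":      # per-allele effect
        return [n_variant_alleles]
    if model == "genotypic":     # separate effect for each genotype
        return [int(n_variant_alleles == 1), int(n_variant_alleles == 2)]
    raise ValueError(f"unknown model: {model}")


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Pick the model name with the lowest AIC.

    `fits` maps a model name to (log_likelihood, n_params).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Invented log-likelihoods, for illustration only.
example_fits = {"additive": (-2130.0, 8), "genotypic": (-2129.8, 9)}
chosen = best_model(example_fits)
```

In this invented example the genotypic model has a slightly higher likelihood, but the penalty for its extra parameter leaves the additive model with the lower AIC.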
For the Kenyan case-control study, Sl and McC genotype were included together in a statistical model to examine their associations with malaria susceptibility. Odds Ratios (ORs) and 95% Confidence Intervals (CI) were generated using mixed effect logistic regression analysis both with and without adjustment for ethnicity and location of residence as random effects, and sickle cell genotype, α+thalassaemia genotype, and ABO blood group (O or non-O) as fixed effects (variables which have been associated with malaria susceptibility in multiple previous studies in this population) (Jallow et al., 2009;Rockett et al., 2014;Williams et al., 2005a;Atkinson et al., 2014;Rowe et al., 2007;Williams et al., 2005b;Fry et al., 2008;Malaria Genomic Epidemiology Network et al., 2015). The ethnicity variable was compressed from 28 categories to four: Giriama (n = 2728), Chonyi (n = 1800), Kauma (n = 588) and other (n = 429). Binary parameterization of the α+thalassaemia variable was used, that is, comparing those children with no α+thalassaemia alleles against those with one or more α+thalassaemia alleles. This division was chosen in accordance with a previous report showing that both heterozygous and homozygous α+thalassaemia genotypes are associated with protection against severe malaria and death in the Kilifi area (Williams et al., 2005c). A total of 2000 bootstrapped iterations were run to give 95% CIs and p values.
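The covariate recoding described above can be sketched as follows. This is a minimal illustration only: the function names and the input label conventions (for example, α+thalassaemia recorded as a count of deleted alleles) are assumptions, not the study's actual data format.

```python
# The three largest ethnic groups are kept; all others are pooled.
MAIN_ETHNIC_GROUPS = {"Giriama", "Chonyi", "Kauma"}


def compress_ethnicity(label):
    """Collapse the full list of ethnic groups into four categories."""
    return label if label in MAIN_ETHNIC_GROUPS else "other"


def alpha_thal_binary(n_deleted_alleles):
    """Binary α+thalassaemia coding: 0 = no alleles, 1 = one or more."""
    if n_deleted_alleles not in (0, 1, 2):
        raise ValueError("allele count must be 0, 1 or 2")
    return int(n_deleted_alleles >= 1)


def abo_binary(blood_group):
    """Reduce ABO blood group to 'O' versus 'non-O'."""
    if blood_group not in {"A", "B", "AB", "O"}:
        raise ValueError(f"unknown ABO group: {blood_group}")
    return "O" if blood_group == "O" else "non-O"


# Hypothetical record for one child.
record = {"ethnicity": "ExampleGroup", "alpha_thal": 2, "abo": "AB"}
recoded = (
    compress_ethnicity(record["ethnicity"]),
    alpha_thal_binary(record["alpha_thal"]),
    abo_binary(record["abo"]),
)
```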
For the Kenyan longitudinal cohort study, Incidence Rate Ratios (IRRs) and 95% CIs were generated using a random effects Poisson regression model that took into account within-person clustering. Data were examined with and without adjustment for confounding by McC genotype (for Sl analyses), Sl genotype (for McC analyses), sickle cell genotype, α+thalassaemia genotype, ABO blood group, ethnic group, season (defined as 3 monthly blocks), and age in months as a continuous variable.
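To make the quantity being estimated concrete, the sketch below computes a crude incidence rate ratio with a Wald 95% CI from event counts and person-time. This is a deliberate simplification: it ignores within-person clustering and confounders, so it is not the random effects Poisson model used in the study. The function name and the example numbers are hypothetical.

```python
import math


def incidence_rate_ratio(events_exp, time_exp, events_ref, time_ref, z=1.96):
    """Crude IRR (exposed versus reference) with a Wald confidence interval.

    events_*: number of disease episodes; time_*: person-time at risk.
    The standard error of log(IRR) is sqrt(1/events_exp + 1/events_ref).
    """
    if min(events_exp, events_ref) <= 0 or min(time_exp, time_ref) <= 0:
        raise ValueError("events and person-time must be positive")
    irr = (events_exp / time_exp) / (events_ref / time_ref)
    se_log = math.sqrt(1 / events_exp + 1 / events_ref)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts: 30 episodes in 100 child-years versus 60 in 100.
irr, lower, upper = incidence_rate_ratio(30, 100.0, 60, 100.0)
```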
For the Malian case-control study, ORs and 95% CIs were computed using mixed effect logistic regression analysis with adjustment for location of residence as a random effect and age, ABO blood group (O or non-O) and ethnicity (Dogon or non-Dogon) as fixed effects. Genotyping for α+thalassaemia was not available for the Malian study and sickle cell trait is extremely uncommon in this population; therefore, neither variable was included in the model. A total of 2000 bootstrapped iterations were run to give adjusted ORs.
Corrections for multiple comparisons were not performed; instead, all adjusted odds ratios, confidence intervals and p values have been clearly reported. This approach has been repeatedly advocated, particularly when dealing with biological data (Rothman, 1990;Perneger, 1998;Nakagawa, 2004;Fiedler et al., 2012;Rothman, 2014). A detailed description of the Malian dataset is given in Appendix 1, and a detailed description of the statistical model fitting for the Kenyan studies is given in Appendix 2. Bootstrapping was performed using the 'bootMer' function in package 'lme4' in R. For each model, 2000 iterations were run to calculate 95% confidence intervals and p values. If models did not converge over these 2000 iterations, they were inspected for singularities (i.e. a level of one of the variables having a value of 0, for example 0 cases living in Gede). If no singularities were identified, the bootstrapping was rerun using the optimiser 'bobyqa' with 10^5 evaluations.
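The general idea of a bootstrapped confidence interval can be sketched with a much simpler estimator than the one used here. The example below draws nonparametric resamples of individual records and takes percentile limits of a crude odds ratio. It does not reproduce 'bootMer' or the mixed effect models, and the function names, the 0.5 continuity correction and the example data are all assumptions made for illustration.

```python
import random


def odds_ratio(records):
    """Crude odds ratio from (exposed, case) boolean pairs.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that
    resamples with an empty cell do not cause division by zero.
    """
    a = b = c = d = 0.5
    for exposed, case in records:
        if exposed and case:
            a += 1
        elif exposed:
            b += 1
        elif case:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)


def bootstrap_ci(records, n_iter=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the odds ratio, resampling records."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(records)
    stats = sorted(
        odds_ratio([records[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_iter)
    )
    lower = stats[round((alpha / 2) * n_iter)]
    upper = stats[round((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Hypothetical data: 20/100 exposed children are cases, 40/100 unexposed.
data = ([(True, True)] * 20 + [(True, False)] * 80
        + [(False, True)] * 40 + [(False, False)] * 60)
point = odds_ratio(data)
lower, upper = bootstrap_ci(data, n_iter=200)
```

Resampling individuals, as done here, treats records as independent; a model with random effects for ethnicity and location needs a scheme that respects that grouping, which is one reason this sketch is not a substitute for the analysis described above.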

Additional information
Corrections for multiple comparisons were not performed in this study; instead, all adjusted odds ratios, confidence intervals and p values have been clearly reported. This approach has been repeatedly advocated, particularly when dealing with biological data (Rothman, 1990;Perneger, 1998;Nakagawa, 2004;Fiedler et al., 2012;Rothman, 2014). The stringency of corrections for multiple comparisons increases the risk of type II error, potentially discarding important findings. No single study can be considered conclusive and novel results will always require replication.
Exploration of alternative haplotype and combined genotype models
As one of the four possible Sl/McC haplotype combinations was not seen (Sl1/McCb), the Sl2 and McCb alleles are likely to be in complete linkage disequilibrium in this population sample (i.e. D'=1, no recombination between these two markers). This situation makes it difficult to distinguish statistically between a model where Sl and McC act independently and additively and a haplotype model. We considered the possibility that a haplotype model could provide an alternative explanation for our findings, with a separate true protective mutation being positively tagged by the Sl2 allele and negatively tagged by the McCb allele. Specifically, for each sample we computed the count of each of the three possible Sl/McC haplotypes (assuming only three haplotypes are segregating as above). We then re-fit the logistic regression model for cerebral malaria using haplotype counts as predictors, in addition to potential confounders included in the full adjusted analysis described above. This model estimated a non-zero protective effect of the Sl2/McCa haplotype (additive OR = 0.85; 95% CI 0.75-0.96; p=0.007), but did not fit as well as the full adjusted analysis described above (AIC = 4268.5, versus 4266.8 for the full analysis).
As both Sl and McC have sufficient structural effects to alter Knops blood group phenotype, it would appear reasonable to examine their function further before looking for other nearby mutations. In addition, no other strong effects near CR1 have been identified by GWAS that could explain the association. However, a haplotype model cannot be excluded as a possibility on the basis of our current data.
Exploration of the negative epistasis between sickle trait and α+thalassaemia
Previous studies have described a negative epistatic interaction between sickle trait and α+thalassaemia, reporting that α+thalassaemia homozygotes who also carry the sickle trait are not protected from severe malaria (Williams et al., 2005a). We wanted to ensure that an unrecognised relationship between sickle cell trait and Sl genotype did not account for the interaction between α+thalassaemia and Sl that we report in the current study.
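When only three haplotypes segregate, the haplotype counts for each child follow directly from the two genotypes, because every McCb allele must sit on an Sl2 background. A minimal sketch of that counting step is given below; the function name and the allele-count input convention are assumptions.

```python
def haplotype_counts(n_sl2, n_mccb):
    """Haplotype counts for one individual from Sl2 and McCb allele counts.

    Assumes only Sl1/McCa, Sl2/McCa and Sl2/McCb are segregating, i.e.
    the Sl1/McCb haplotype is absent. Under that assumption phase is
    unambiguous: each McCb allele is carried on an Sl2 haplotype.
    """
    for value in (n_sl2, n_mccb):
        if value not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1 or 2")
    if n_mccb > n_sl2:
        # More McCb than Sl2 alleles would require an Sl1/McCb haplotype.
        raise ValueError("genotype implies the unobserved Sl1/McCb haplotype")
    return {
        "Sl1/McCa": 2 - n_sl2,
        "Sl2/McCa": n_sl2 - n_mccb,
        "Sl2/McCb": n_mccb,
    }


# A child homozygous for Sl2 and heterozygous for McCb.
counts = haplotype_counts(n_sl2=2, n_mccb=1)
```

The three counts always sum to two, so at most two of them can enter a regression model alongside an intercept.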
Analysis of our current dataset confirmed the existence of negative epistasis between sickle trait and α+thalassaemia in this population (Supplementary file 1N). Of interest, however, this negative epistatic interaction was only seen in the severe malaria cases without cerebral malaria, whereas the α+thalassaemia/Sl interaction was specific to cerebral malaria cases (Supplementary file 1N). Therefore, the two interactions appear to be mutually exclusive.
The final adjusted analysis was also re-run on a restricted dataset which excluded the 664 children with sickle cell trait or sickle cell disease. The results of this analysis remained unchanged and the α+thalassaemia/Sl interaction persisted without any influence of sickle trait (Supplementary file 1O).
Sickle cell trait did not show a statistical interaction with either Sl or McC genotype. The sickle cell mutation is far less common in the KHDSS population than the α+thalassaemia mutation (~12% of children have one or more sickle cell alleles, compared to ~64% with one or more α+thalassaemia alleles). As such, even in a study as large as this one, the power to detect statistically significant interactions between all three of sickle, α+thalassaemia and Sl genotypes is greatly reduced. However, we found no evidence of a three-way interaction between these alleles. The raw data for the combined genotypes comprising sickle trait, α+thalassaemia and Sl for each clinical outcome are presented in Supplementary file 1P. Correlations between sickle cell, α+thalassaemia, Sl and McC are presented in Supplementary file 1Q.

Statistical model fitting for the longitudinal cohort study
Associations between Sl and McC and mild malaria and other non-malarial diseases in the longitudinal cohort study were tested in Stata v11.2 (StataCorp, Texas, USA) using a random effects Poisson regression analysis that accounted for within-person clustering of events. This analysis was restricted to children under 10 years old living in the Ngerenya area in the northern part of the KHDSS study area. The analysis was carried out on the 208 children from the cohort with full genotype, ethnic group, season and age data. The model selection process first involved univariate analyses testing for disease associations for Sl and McC independently without potential confounders in genotypic, dominant, recessive, heterozygous and additive models of inheritance. Models were compared using the Akaike information criterion (AIC) for fitness, with the model displaying the minimum AIC value for each respective genotype and outcome of interest chosen as the best fitting model. The unadjusted Incidence Rate Ratios and best fitting models are shown in Supplementary file 1J. For each disease outcome, the association with Sl genotype was then adjusted for confounding by McC (best genetic model chosen from the univariate analysis) and for explanatory variables previously associated with outcomes of interest: sickle cell genotype, α+thalassaemia genotype, ABO blood group genotype, ethnic group (Giriama, Chonyi and others), season (defined as 3 monthly blocks) and age in months as a continuous variable. AIC values were compared to identify the best fitting genetic model (Supplementary file 1R). The same process was carried out for the association of McC genotype with each disease outcome, with adjustment for Sl genotype and the other explanatory variables (Supplementary file 1S). For consistency of reporting here, the same explanatory variables are included in the statistical models for all disease outcomes in the data presented.
Optimizing the model fit for each outcome by removing explanatory variables that did not improve it made no material difference to the results shown here.
Finally, we tested for interactions between either Sl or McC and α+thalassaemia (represented as normal, heterozygous and homozygous genotypes) using the likelihood ratio test, with a p value of <0.05 indicating statistically significant evidence for interaction, and the appropriate interaction term included in the final model.
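The likelihood ratio test compares the log-likelihood of a model with the interaction term against that of the same model without it. The sketch below computes the test statistic and its chi-square p value using only the Python standard library. The function names are hypothetical, the log-likelihoods in the usage example are invented, and the choice of 2 degrees of freedom assumes a single-column genotype coding for Sl or McC interacting with a three-level α+thalassaemia genotype.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable.

    Uses closed forms, so only df = 1 and even df are supported.
    """
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        # P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        half = x / 2
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("only df = 1 or even df are implemented")


def likelihood_ratio_test(ll_reduced, ll_full, df):
    """Return (statistic, p value) for nested models.

    ll_reduced: log-likelihood without the interaction term;
    ll_full: log-likelihood with it; df: number of added parameters.
    """
    statistic = 2 * (ll_full - ll_reduced)
    return statistic, chi2_sf(max(statistic, 0.0), df)


# Invented log-likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test(-2135.0, -2131.0, df=2)
interaction_supported = p_value < 0.05
```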