Lack of Association of CD55 Receptor Genetic Variants and Severe Malaria in Ghanaian Children

In a recent report, the cellular receptor CD55 was identified as a molecule essential for the invasion of human erythrocytes by Plasmodium falciparum, the causal agent of the most severe form of malaria. As this invasion process represents a critical step during infection with the parasite, it was hypothesized that genetic variants in the gene could affect severe malaria (SM) susceptibility. We performed high-resolution variant discovery of rare and common genetic variants in the human CD55 gene. Association testing of these variants in over 1700 SM cases and unaffected control individuals from the malaria-endemic Ashanti Region in Ghana, West Africa, was performed on the basis of single variants, combined rare variant analyses, and reconstructed haplotypes. A total of 26 genetic variants were detected in coding and regulatory regions of CD55. Five variants were previously unknown. None of the single variants, rare variants, or haplotypes showed evidence for association with SM or P. falciparum density. Here, we present the first comprehensive analysis of variation in the CD55 gene in the context of SM and show that genetic variants present in a Ghanaian study group appear not to influence susceptibility to the disease.

Taken together, there is substantial evidence that human genetic variation affecting CD55 integrity on the surface of RBCs could influence SM phenotypes, rendering CD55 a promising candidate for association studies. Moreover, CD55 has been discussed as a potential drug target (Tham and Kennedy 2015) and, as such, information on the degree of sequence conservation among malaria-exposed populations is crucial.
Here, we screened a large case-control study group comprising more than 1700 subjects (May et al. 2007) from the Ashanti Region in Ghana, West Africa, for genetic variation in the CD55 gene and assessed the influence of common and rare coding variants, and the impact of CD55 haplotypes on SM phenotypes.

SM case-control group
The SM case-control group comprised 831 SM cases and 903 control individuals from the Ashanti Region in Ghana. SM was defined according to World Health Organization (WHO) criteria as described (May et al. 2007), and control subjects were unaffected healthy individuals.
Table 1 footnotes: (a) Severe malaria anemia (hemoglobin level < 5 g/dl) with or without additional complications including cerebral malaria. (b) Cerebral malaria (Blantyre Coma Score [BCS] < 3) with or without additional complications including severe anemia.
Within the SM case group, 507 (61.0%) presented with severe malaria anemia (SMA) and 165 (19.9%) had cerebral malaria (CM) (Table 1). The majority of the SM cases were also affected by additional severe manifestations such as prostration, hyperparasitemia, or hyperlactatemia, which were partly overlapping. Parasite density was determined by counting P. falciparum asexual blood stages in Giemsa-stained blood smears (May et al. 2007). The median age of cases and control individuals was 18 and 27 months, respectively (ranges 3-147 months in the case group and 4-161 months in the control group). Phenotyping, recruitment of study subjects, and DNA extraction methods have previously been described in detail (May et al. 2007; Schuldt et al. 2011). After genotyping and evaluation of association results for variant rs7542430, additional DNA samples from 2704 study subjects were genotyped in order to confirm initial findings. These subjects were cases and controls that were part of the same original study (May et al. 2007).

Variant discovery and genotyping
The gene encoding CD55 is located on chromosome 1 and spans 40 kb of genomic DNA. In order to screen the CD55 locus (chr1:207,494,817-207,534,311, hg19) for genetic variants, DNA from 1734 unrelated Ghanaian individuals (831 SM cases and 903 controls) was used for high-resolution melting (HRM) analyses. To this end, DNA samples were amplified by PCR using primers that capture the 11 CD55 exons, including a small proportion of their flanking regions as well as 140 bp of untranslated sequence upstream of the transcription start site. Oligonucleotides were designed using LightCycler Probe Design Software 2.0 (Roche Applied Science) against reference transcript NCBI NM_000574.4. Sequences of oligonucleotides and PCR conditions for HRM assays are listed in Supplemental Material, Table S1. Previously unknown single nucleotide polymorphisms (SNPs) and singletons were confirmed by resequencing of genomic DNA. Information about all novel variants detected in our study group was submitted to NCBI dbSNP and will be released in Build 148. In addition, two promoter variants, rs28371586 and rs7542430, were genotyped by allele-specific hybridization in a Roche LightCycler device (Table S2). Consequences of discovered amino acid missense mutations were analyzed by Polyphen2 and SIFT predictions (Ng and Henikoff 2001; Adzhubei et al. 2010).

Calculation of linkage disequilibrium (LD)
Haploview was used to generate the LD plot of the CD55 genomic region (Barrett et al. 2005). The LD calculation was based on healthy control individuals and included all variants with a global minor allele frequency (MAF) ≥ 1%.
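As an illustration of the pairwise LD measure that underlies such a plot, the following minimal Python sketch computes r² for two biallelic variants from phased haplotype counts. The function name and the counts are hypothetical and are not taken from the study data. Haploview estimates haplotype frequencies from unphased genotypes, a step this sketch does not attempt.

```python
def ld_r2(n_ab, n_aB, n_Ab, n_AB):
    """Return r^2 between two biallelic loci from phased haplotype counts.

    n_ab: haplotypes carrying allele a at locus 1 and allele b at locus 2
    n_aB: allele a at locus 1, allele B at locus 2
    n_Ab: allele A at locus 1, allele b at locus 2
    n_AB: allele A at locus 1, allele B at locus 2
    """
    n = n_ab + n_aB + n_Ab + n_AB
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / n
    p_A = (n_Ab + n_AB) / n          # frequency of allele A at locus 1
    p_B = (n_aB + n_AB) / n          # frequency of allele B at locus 2
    d = p_AB - p_A * p_B             # coefficient of linkage disequilibrium
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return 0.0                   # a monomorphic locus carries no LD information
    return d * d / denom


# Hypothetical counts for illustration only.
print(ld_r2(n_ab=900, n_aB=40, n_Ab=50, n_AB=10))
```

Values close to zero indicate that two variants are inherited nearly independently, whereas a value of 1 means that each allele at one locus always occurs with the same allele at the other.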

Association analyses of genetic variants with MAF ≥ 1%
Tests for association of seven variants with global MAFs ≥ 1% were done by logistic and linear regression analyses in PLINK v1.9 assuming an additive mode of inheritance (Purcell et al. 2007). In the regression models, ethnic group, age, and gender were used as covariates. The logarithm of the parasite density was used as a quantitative phenotype in a linear regression analysis in PLINK. In order to account for multiple testing, a correction factor of seven was used; hence, a p-value < 0.007 (0.05/7) was considered significant. A factor of seven was applied because the correlation between the seven SNPs was negligible (all pairwise r² values < 0.05; Figure S1); hence, the associations tested represented seven independent tests. All variants were tested for Hardy-Weinberg equilibrium (HWE) in PLINK, and genotype frequencies were found to comply with HWE.
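The multiple-testing threshold and the HWE check can be sketched in a few lines of Python. This is a simplified illustration and not the PLINK implementation: PLINK applies an exact HWE test, whereas the sketch uses the chi-square approximation with one degree of freedom. Function names and genotype counts are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests independent tests."""
    return alpha / n_tests


def hwe_chi2_pvalue(n_AA, n_Aa, n_aa):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


print(bonferroni_threshold(0.05, 7))     # threshold for seven independent tests
print(hwe_chi2_pvalue(820, 80, 3))       # hypothetical control genotype counts
```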
Power for detecting effects of variants in the case-control study was estimated with CATS (Skol et al. 2006). For power estimations, a disease prevalence of 2% for SM and p < 0.007 were assumed. Power calculations for single association tests resulted in a power of 99, 95, and 53% to detect a genotype-phenotype association in the SM, SMA, and CM study groups, respectively, when assuming a genotype relative risk of at least two and a MAF ≥ 5%.
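To make the inputs of such a power estimate concrete, the sketch below approximates the power of a two-sided allelic test under a multiplicative genetic model with a normal approximation. It is an assumption-laden simplification and not the CATS algorithm, so its output need not match the values reported above. The function name and the choice of the allelic test are illustrative.

```python
import math
from statistics import NormalDist


def allelic_power(n_cases, n_controls, maf, grr, prevalence, alpha):
    """Approximate power of a two-sided allelic test (multiplicative model)."""
    p, q = maf, 1 - maf
    geno = (q * q, 2 * p * q, p * p)         # HWE frequencies for 0, 1, 2 risk alleles
    rel = (1.0, grr, grr * grr)              # relative risks under the multiplicative model
    base = prevalence / sum(g * r for g, r in zip(geno, rel))
    pen = [base * r for r in rel]            # penetrance per genotype
    if max(pen) > 1:
        raise ValueError("inputs imply a penetrance above 1")
    case = [g * f / prevalence for g, f in zip(geno, pen)]
    ctrl = [g * (1 - f) / (1 - prevalence) for g, f in zip(geno, pen)]
    p_case = case[1] / 2 + case[2]           # risk allele frequency in cases
    p_ctrl = ctrl[1] / 2 + ctrl[2]           # risk allele frequency in unaffected controls
    # Each individual contributes two alleles.
    se = math.sqrt(p_case * (1 - p_case) / (2 * n_cases)
                   + p_ctrl * (1 - p_ctrl) / (2 * n_controls))
    z = abs(p_case - p_ctrl) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z - z_crit)


# Sample sizes and assumptions as stated in the text.
for label, n_cases in (("SM", 831), ("SMA", 507), ("CM", 165)):
    print(label, allelic_power(n_cases, 903, maf=0.05, grr=2.0,
                               prevalence=0.02, alpha=0.007))
```

The sketch shows why the CM subgroup is the limiting case: with the same controls and effect size, the smaller number of cases inflates the standard error of the allele frequency difference.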

Haplotype-based association testing
Full-length haplotype analyses were done for the case-control group with the HaploStats package v1.4.4 (Schaid et al. 2002) in R (version 3.1.0; http://www.r-project.org), including reconstructed haplotypes with estimated global frequencies ≥ 1%. In addition, subhaplotypes were evaluated in sliding window analyses capturing a minimum of two and a maximum of six alleles. SM, as well as the major manifestations SMA and CM, and parasite density (logarithmic scale) were used as phenotypes, and the three covariates ethnic group, age, and gender were included in the model. Empirical p-values were computed using default values.
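The sliding window scheme can be illustrated by enumerating the subhaplotype windows for an ordered list of markers. The Python sketch below assumes that windows consist of contiguous variants in chromosomal order. The marker labels are hypothetical placeholders, and the sketch only lists the windows; it does not reconstruct or test haplotypes as HaploStats does.

```python
def sliding_windows(markers, min_size=2, max_size=6):
    """List all windows of contiguous markers with min_size to max_size members."""
    windows = []
    largest = min(max_size, len(markers))
    for size in range(min_size, largest + 1):
        for start in range(len(markers) - size + 1):
            windows.append(tuple(markers[start:start + size]))
    return windows


# Hypothetical marker labels in chromosomal order.
markers = ["snp1", "snp2", "snp3", "snp4", "snp5", "snp6", "snp7"]
for window in sliding_windows(markers):
    print(window)
```

Each window would then be tested as a separate subhaplotype, which is why the number of tests grows quickly with the number of markers and window sizes.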

Association analyses of rare variants
Four different methods integrated in variant tools were used to assess the overall genetic burden due to rare variants (global MAF < 1%) (San Lucas et al. 2012). These included: (i) the combined and multivariate collapsing (CMC) method described by Li and Leal (2008), (ii) the weighted sum statistic (WSS) described by Madsen and Browning (2009), (iii) the variable-threshold (VT) model described by Price et al. (2010), and (iv) the C(α) test described by Neale et al. (2011). Variants were pooled on the basis of their MAF and their cumulative effect was tested in univariate and multivariate analyses, in which age, gender, and ethnic group were included as covariates. Tests for association of rare variant carrier status in CD55 were done for SM, SMA, CM, or hyperparasitemia, defined as > 200,000 asexual parasites/µl blood.
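The basic idea shared by collapsing approaches can be sketched as follows: each individual is reduced to a carrier indicator (at least one rare allele at any pooled variant), and carrier counts are compared between cases and controls. The Python sketch below uses a two-sided Fisher exact test for this comparison. It is a deliberately reduced illustration, without covariates or weighting, and does not reproduce the CMC, WSS, VT, or C(α) methods as implemented in variant tools. All names and genotype data are hypothetical.

```python
from math import comb


def carrier_status(genotypes):
    """Return 1 if the individual carries a rare allele at any pooled variant."""
    return int(any(g > 0 for g in genotypes))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)]
    p_obs = probs[a - lo]
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def collapsing_test(case_genotypes, control_genotypes):
    """Compare rare variant carrier counts between cases and controls."""
    case_carriers = sum(carrier_status(g) for g in case_genotypes)
    ctrl_carriers = sum(carrier_status(g) for g in control_genotypes)
    return fisher_exact_two_sided(
        case_carriers, len(case_genotypes) - case_carriers,
        ctrl_carriers, len(control_genotypes) - ctrl_carriers)


# Hypothetical rare allele counts per individual at three pooled variants.
cases = [(0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 0, 1)]
controls = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0)]
print(collapsing_test(cases, controls))
```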

Ethics statement
Ethical clearance was granted by the Committee for Research, Publications and Ethics of the School of Medical Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana. All procedures were explained to parents or guardians of the participating children in the local language, and written or thumb-printed informed consent was obtained.

Data availability
The CD55 mRNA reference sequence for design of HRM assays is accessible under NCBI NM_000574.4. Oligonucleotides and conditions for HRM reactions and for SNP genotyping are found in Table S1 and Table S2. Genomic locations and frequencies of the SNPs present in this Ghanaian study group are shown in Table 2. Information on previously unknown genetic variants was submitted to NCBI dbSNP and assigned SNP IDs are included in Table 2.

RESULTS
Variant discovery of CD55 was conducted in 1734 unrelated individuals from the Ashanti Region in Ghana. The study group comprised 831 SM cases and 903 apparently healthy control individuals. The genomic region screened included coding and regulatory regions of CD55 as well as the intron-exon boundaries. As a result, a total of 26 SNPs were found to be present in the study group. Of these, five variants were previously unknown (Table 2).
Table 4 footnotes: (a) Refers to SNP number as designated in Table 2. (b) Results of haplotype-specific score tests adjusted for gender, age, and ethnicity assuming an additive mode of inheritance. (c) Simulation p-values are computed based on a permuted reordering of the trait and covariates in HaploStats (Schaid et al. 2002).
Nineteen variants were very rare, with MAFs below 1%, including 11 singletons. Only three SNPs with MAFs ≥ 1% caused nonsynonymous amino acid exchanges in the receptor protein. These were R52H, L82R, and V333I, with MAFs of 4, 1, and 2% in the control group, respectively. According to Polyphen2 and SIFT predictions, all three substitutions were classified as benign. With the exception of V406A, all other missense mutations, each of them a singleton, were predicted to be probably damaging or damaging (Table 2). Seven variants had MAFs of at least 1%, and these were initially used in single variant association tests with SM, SMA, CM, or parasite density. The findings are summarized in Table 3.
In these tests, the G allele of rs7542430 was found to be associated with a reduction in risk of SM [odds ratio (OR) 0.44, 95% CI 0.25-0.79, p = 0.006]. However, when additional subjects of the study were analyzed (2588 cases and 1850 controls in total), the association did not hold (OR 0.92, 95% CI 0.67-1.26, p = 0.598).
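For readers less familiar with the effect measure, the following Python sketch computes an unadjusted odds ratio with a Wald-type (Woolf) confidence interval from a 2x2 table of allele counts. The ORs reported here were obtained from logistic regression with covariates, so this sketch is only an approximation of the concept. The function name and counts are hypothetical and are not the study data.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Unadjusted odds ratio and Woolf confidence interval.

    a: test alleles in cases       b: other alleles in cases
    c: test alleles in controls    d: other alleles in controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only.
print(odds_ratio_ci(20, 1642, 45, 1761))
```

An interval that includes 1, as in the extended sample described above, indicates that the data are compatible with no effect of the allele.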
Reconstruction of haplotypes generated eight full-length haplotypes with a global frequency above 1% (Table 4). Of these, haplotype CD55-1 was the most common, with an estimated frequency of 57% in the control subjects. Haplotype CD55-2 was found more frequently in SM cases than in controls (allele frequency [AF] 25% in cases and 21% in controls). This difference was nominally significant in the haplotype-specific score test (p = 0.03, empirical p = 0.04). A similar result was found for haplotype CD55-8, which had an estimated AF of 1% in the SM case group and 2% in the control group, and haplotype-specific p-values of 0.01 (Table 4). However, after correction for multiple hypothesis testing, the global test statistic produced p-values of 0.12 and 0.32, indicating no association of full-length haplotypes with any of the phenotypes.
In order to test for an accumulation of rare variants in either the case or control group, four different algorithms were applied: the CMC method, the WSS, the VT model, and the C(α) test. None of the approaches, which included univariate and multivariate collapsing tests, provided evidence for a joint effect of rare variants on the phenotypes tested (Table 5).

DISCUSSION
Following a recent report on CD55 and its essential role for P. falciparum survival, it was hypothesized that host genetic variation of CD55 may have an impact on SM disease severity. Here, we analyzed variants in the coding sequence and regulatory regions of CD55 in a large SM case-control study from Ghana. The findings of our study do not support an effect of the genetic variants tested on parasite density or susceptibility to major clinical manifestations of SM. Only three amino acid-changing variants with MAFs between 1 and 4% were detected, indicating a relatively high degree of sequence conservation in the CD55 exonic sequence in the Ghanaian population.
To our knowledge, this is the first study to identify novel genetic variants of CD55 in African subjects and test thoroughly for association with SM phenotypes. In their report, Egan et al. (2015) described two nonsynonymous CD55 polymorphisms that were significantly enriched in populations with ancestral or current exposure to malaria: CROM3 (R52L) in African Americans and in the Yoruba ethnic group from Nigeria, and CROM1 (A227P), present only in the Yoruba. The CROM3 genotype reached frequencies of 4.9% in African Americans and 5.7% in the Yoruba, but was not present in our Ghanaian study group. Instead, we found the coding variant R52H, which had a MAF of 4% in the case and control groups and did not show evidence for an association with any of the SM phenotypes. CROM1 was found once in the control group, indicating a very low frequency in the Ghanaian population. If this variant influenced disease outcome at all, it would have a very low impact on disease risk at population level. Because Egan et al. (2015) classified populations of the 1000 Genomes Project into high and low malaria-exposed populations and performed a simple comparison of MAFs between these two groups, it is possible that this analysis produced spurious results, as it may solely capture interethnic differences instead of malaria-specific associations. Moreover, the diversity of allele frequencies across the analyzed 1000 Genomes populations may be driven by other pathogens that exert selective pressure on the human genome. According to Egan et al. (2015), CD55 functions as a receptor not only for P. falciparum merozoites, but also for viral and bacterial pathogens. Therefore, selective pressure by pathogens other than P. falciparum is also conceivable and could have resulted in different frequencies of CD55 genetic variants.
Of note, a lack of association in the Ghanaian study group with the analyzed CD55 variants does not exclude significant associations with the same variants in other populations exposed to malaria. There are several coevolutionary effects between P. falciparum and its human host that have shaped the human genome (Kwiatkowski 2005; Bongfen et al. 2009). As a consequence, some types of adapted mechanisms based on structural genetic variation may be restricted to particular human subpopulations, as has been described for the protective HbC variant, which is found at high frequencies in West African populations but is not present in other malaria-exposed populations (Taylor et al. 2013). Likewise, there may be true associations between genetic variants and SM in additional chromosomal regions of the human genome that influence CD55 expression. Here, we screened the CD55 coding sequence, intron-exon boundaries, and part of the promoter region, because the main focus was laid on structural variation or splice variants of the CD55 receptor, which are likely to influence invasion of RBCs by P. falciparum merozoites. Another reason why existing genotype-phenotype associations may remain obscure is lack of power. The power of a study depends on the number of study participants and the MAFs and effect sizes of the alleles tested. For single SNP associations, our study provided enough power for all SNPs with MAFs > 5% in the SM and SMA study groups. However, the study was underpowered when analyzing SNPs with lower frequencies in the single SNP association tests. Hence, low frequency variants were analyzed using several collapsing approaches that have been developed in order to increase the power for association studies of rare variants.
In addition, there may be true associations between genetic variants and SM in chromosomal regions of the human genome that influence CD55 expression, but that were not covered in this study. Here, we mainly screened the CD55 coding sequence, including intron-exon boundaries and part of the promoter region, because the main focus was laid on structural variation or splice variants of the CD55 receptor, which are likely to influence invasion efficiency. In this regard, the exact mechanisms by which CD55 expression is regulated and the genetic variation with potential influence on expression levels on the RBC surface remain to be explored. For instance, a study that evaluated expression quantitative trait loci (eQTLs) in blood cells in samples from over 5300 donors did not identify any variation that functions as an eQTL for CD55 (Westra et al. 2013). One reason may be that the study did not include individuals of African ancestry. In this respect, CD55 expression studies in RBCs originating from African individuals would be helpful to specifically determine eQTLs for CD55 in RBCs.