Assessing risk for Mendelian disorders in a Bronx population

Abstract
Background: To identify variants likely responsible for Mendelian disorders among the three major ethnic groups in the Bronx that might be useful to include in genetic screening panels or whole exome sequencing filters, and to estimate their likely prevalence in these populations.
Methods: Variants from a high-density oligonucleotide screen of 192 members from each of the three ethnic-national populations (African Americans, Puerto Ricans, and Dominicans) were evaluated for overlap with next generation sequencing data. Variants were curated manually for clinical validity and utility using the American College of Medical Genetics (ACMG) scoring system. Additional variants were identified through literature review.
Results: A panel of 75 variants displaying autosomal dominant, autosomal recessive, autosomal recessive/digenic recessive, X-linked recessive, and X-linked dominant inheritance patterns and representing 39 Mendelian disorders was identified among these populations.
Conclusion: Screening for a broader range of disorders could offer the benefits of early or presymptomatic diagnosis and reproductive choice.


Introduction
Ethnicity-based genetic testing, particularly among well-defined ethnic groups, has been advocated as a valuable strategy that can aid in preventing genetic disease and improving public health at relatively low cost (Strauss et al. 2012). The advantage of such population-targeted screening has included the ability to identify carriers of well-characterized, population-associated deleterious recessive and X-linked conditions and thereby prevent the birth of affected children. One notable success of this approach has been the sharp decline in incidence of Tay-Sachs disease and other disorders among Ashkenazi Jews following carrier screening for these conditions (Kaback 2000). Yet despite these successes, there are indications that the spectrum of population-based risk for Mendelian disorders has been underestimated. For example, a recent study documenting carrier frequencies for 108 high-penetrance autosomal recessive conditions among individuals from 14 ethnic groups has demonstrated population risks beyond the original populations known to be at risk for these conditions (Lazarin et al. 2013). Genome-wide approaches to assess population risk for Mendelian conditions can further expand our understanding of the spectrum of disorders for which specific populations may be at risk. Together with the availability of highly multiplexed technologies, such as high-density oligonucleotide arrays, this information can be used to develop more comprehensive, cost-effective diagnostic and reproductive risk assessment assays. In addition, it enables the incorporation of predisposition and presymptomatic testing, such as for cancer risk.
Of the ethnic groups polled by the US Census Bureau, the largest are African Americans and Hispanic Latinos, comprising 13.3% and 17.6% of the population, respectively (https://www.census.gov/quickfacts/table/PST045216/00, accessed 3/16/17). Population growth of Hispanic Latinos accounted for over half of the growth of the total US population between 2000 and 2010. Puerto Ricans and Dominicans comprise 9.2% and 2.2%, respectively, of the total Hispanic Latino population and are concentrated in the major urban cities of New York, Florida, New Jersey, Pennsylvania, Massachusetts, California and Illinois (http://www.census.gov/prod/cen2010/briefs/c2010br-04.pdf, accessed 3/16/17), making them important populations in multiple regions of the US.
Although Puerto Ricans and Dominicans are products of admixture from similar source populations (Arawak Native American, West and Central African, and Iberian European), differences in the relative source population proportions, time of contact, number of contact events, and population growth after isolation have led to the development of genetically distinct populations with individual founder mutations (Li et al. 2014). Although Sewall Wright's papers, beginning as early as 1931, described island populations developing unique genetic constitutions based on migration, gene flow, selection and drift (Berniell-Lee et al. 2009), the genetic distinctiveness of these island populations has only recently been appreciated. As a result, most researchers, clinicians, and population databases have used only the broad classification of Hispanic Latino. Thus, our understanding of the clinically relevant genetic risks in these populations remains incomplete, limiting the utility of genetic testing among these populations. In addition, suboptimal provider communication, medical mistrust, and cultural beliefs have further limited access to testing and understanding of genetic risks (Lalueza-Fox et al. 2001; Burke et al. 2006; Kaplan et al. 2006; Suther and Kiros 2009; Burke and Korngiebel 2015).
In the borough of the Bronx, New York, African Americans and two Hispanic Latino subgroups (Puerto Ricans and Dominicans) collectively comprise 85% of the population of 1.4 million people (http://www.census.gov, accessed 5-26-11). As a first step in developing better genetics services for these groups, we sought to expand our understanding of the deleterious genetic variants that they carry by screening 192 healthy individuals from each of these three ethnonational groups using Affymetrix Axiom 319 arrays. The more than 300,000 probes found on Affymetrix Axiom 319 arrays are a group of markers enriched for likely pathogenic variants ascertained by whole exome screening of 120,000 ethnically diverse samples of European, African, Latino, and Asian ancestry representing multiple disease cohorts including type 2 diabetes, cancer, infectious disease, cardiovascular disease, and neurological/psychiatric disorders. Our study screens Dominicans, Puerto Ricans, and African Americans to determine whether any of these variants are relevant to these specific populations. Detected variants are further confirmed to be present in these specific populations, and their frequency is estimated by comparison to population-specific exome sequencing data sets. Confirmed variants are annotated according to ACMG guidelines (Richards et al. 2015) to provide a working set of common genetic disease risk variants useful to clinicians evaluating patients from these populations. Parallel literature review was also used to identify additional conditions and underlying variants documented for our populations of interest.
The parallel examination of these three groups offers distinct advantages. First, their overlapping Columbian and post-Columbian histories and occupancy of a small urban area suggest gene flow and consequent shared health risk, while their divergent pre-Columbian histories predict some disease risks unique to each group (Lalueza-Fox et al. 2001; Berniell-Lee et al. 2009; Li et al. 2014). Second, despite residing in New York, an area of potentially sophisticated medical knowledge and intervention, these populations continue to experience rates of morbidity and mortality from diseases with substantial genetic underpinnings similar to those seen among underserved populations throughout the US (Kaplan et al. 2006). Studies such as ours might ultimately engender broader use of and trust in genetic screening among these groups, providing a step toward closing public healthcare gaps (Burke et al. 2006; Suther and Kiros 2009; Burke and Korngiebel 2015).

Ethical compliance
Patients consented to the anonymous use of residual specimens after clinical testing according to an NYU IRB-approved protocol.
Blood samples were drawn from 192 individuals from each of the Bronx's three prominent ethnonational groups: Puerto Ricans, African Americans, and Dominicans. The subjects studied were healthy unrelated individuals (predominantly female) presenting for routine reproductive screening and reproductive risk assessment. Ethnonationality was self-reported as parental country of origin at the time of collection. Each subject's DNA was extracted through a standard protocol and hybridized to a separate Affymetrix Axiom® Exome 319 Genotyping array containing 318,916 probes using an experimental design similar to that reported previously (Cartmel et al. 2014). Genotype calling for all samples was performed using Affymetrix Genotyping Console™ following a standardized best practices protocol. Variants excluded from further analysis included those with mean allele frequencies (MAF) of either 0 (not present) or 1 (homozygous), and those that did not meet the expectations of Hardy-Weinberg equilibrium (P-value <0.05).
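The exclusion criteria above can be illustrated with a minimal sketch. The source does not state which Hardy-Weinberg test was applied, so the one-degree-of-freedom chi-square test used here is an assumption (exact tests are often preferred for rare variants), and the function names `hwe_p_value` and `passes_qc` are hypothetical.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 d.f.) test of Hardy-Weinberg proportions.

    Arguments are observed genotype counts for a biallelic variant.
    Assumption: the paper does not specify the test; this is one common choice.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, alpha=0.05):
    """Apply the two exclusion rules: monomorphic calls and HWE P-value < alpha."""
    n = n_aa + n_ab + n_bb
    freq_b = (2 * n_bb + n_ab) / (2 * n)
    if freq_b == 0.0 or freq_b == 1.0:
        return False  # allele absent, or every sample homozygous for it
    return hwe_p_value(n_aa, n_ab, n_bb) >= alpha


# Hypothetical genotype counts for 192 samples (not data from the study).
print(passes_qc(180, 12, 0))   # polymorphic, close to HWE proportions
print(passes_qc(150, 0, 42))   # no heterozygotes at all: strong HWE deviation
```

With 192 samples per group, a variant seen only in a handful of heterozygotes generally passes this filter, whereas a variant called exclusively as homozygous in many samples is removed as a probable genotyping artifact.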
In parallel, DNA samples from 100 Dominican subjects, randomly selected from our 192 sample set, were combined into 10 pools comprising 20 specimens each and subjected to whole exome sequencing (WES) on an Illumina platform. This strategy ensured that each exome was represented in two separate pools and thereby sequenced twice to ensure fidelity (Prabhu and Pe'er 2009). Variant calling was performed following a method used previously in our laboratory (Isakov et al. 2013). The pooled WES served as orthogonal validation for variants identified through the Axiom 319 chip in Dominicans (see Fig. 1). Orthogonal validation was also performed using the 1000 Genomes and ESP6500 databases for Puerto Ricans and African Americans, respectively (1000 Genomes Project Consortium 2010; Tennessen et al. 2012).
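The arithmetic of the pooling design (100 subjects, 10 pools of 20, every subject in two pools) can be checked with a short sketch. The actual pool assignment used in the study is not described here, so the two-round block/interleave layout below is purely an illustrative assumption, and `assign_pools` is a hypothetical helper.

```python
from collections import Counter


def assign_pools(n_subjects=100, pool_size=20):
    """Return {pool_id: [subject indices]} with each subject in two distinct pools.

    Assumed layout (not taken from the paper): round 1 groups consecutive
    subjects into blocks; round 2 interleaves them so that subjects sharing a
    round-1 pool are spread across the round-2 pools.
    Requires n_subjects to be divisible by pool_size.
    """
    pools_per_round = n_subjects // pool_size
    pools = {p: [] for p in range(2 * pools_per_round)}
    for s in range(n_subjects):
        pools[s // pool_size].append(s)                          # round 1
        pools[pools_per_round + s % pools_per_round].append(s)   # round 2
    return pools


pools = assign_pools()
membership = Counter(s for members in pools.values() for s in members)
print(len(pools), {len(m) for m in pools.values()}, set(membership.values()))
```

Because each subject contributes two alleles to each of its two pools, a variant seen in only one of the two pools containing a given carrier would be a candidate sequencing error rather than a true call.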
Variants were parsed by a custom web-crawler that queried three interrelated, open-source databases for pathogenicity: OMIM (http://omim.org), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), and dbSNP (http://www.ncbi.nlm.nih.gov/SNP/) (Baskovich et al. 2016). Variants flagged in the pipeline as pathogenic or possibly pathogenic were evaluated by at least two independent reviewers who performed literature review to confirm or reject the assessment. To compile the final variant list, variants were assigned one or more scores from the variant calling guidelines of the ACMG (Richards et al. 2015). To augment the panel, the literature was queried for variants underlying conditions known to be present in these populations of interest, which were subjected to a similar approach for assessing pathogenicity.
Estimated carrier frequencies and disease burden were calculated assuming Hardy-Weinberg equilibrium and using a mean minor allele frequency (MMAF) that weighted individual frequencies based on the number of subjects contributed by the respective database to the entire cohort. For Puerto Ricans, 1000GenomesPUR contributed 210 (39%) and Axiom 319 contributed 384 (61%) of all alleles to the final allele frequency pool. For African Americans, ESP6500 contributed 4406 (90%), 1000GenomesASW contributed 132 (3%), and Axiom 319 contributed 384 (7%) of all alleles to the final allele frequency pool. For Dominicans, whole exome sequencing contributed 200 (34%) and Axiom 319 contributed 384 (66%) of all alleles to the final allele frequency pool. "At risk" variants are reported for a likely disease burden of 1/60,000 or greater, as in our previous study (Baskovich et al. 2016). Thus, inclusion in the panel was limited to alleles with a MMAF of 0.001-0.1 for autosomal recessive or X-linked conditions and 0.001-0.02 for autosomal dominant conditions (Cartmel et al. 2014). Variants in the panel that exceeded these limits included APOL1 NM_003661.3:c.1152T>G (p.Ile384Met) and NM_003661.3:c.1024A>G (p.Ser342Gly) and G6PD NM_000402.4:c.292G>A (p.Val68Met), all of which have been hypothesized to have increased in frequency in ancestral African populations because of heterozygote advantage (Genovese et al. 2010; Sirugo et al. 2014). The panel also included alleles for some conditions with rare prevalence among one of these populations for which testing would have clinical utility.
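As an illustration of the calculation described above, the following sketch weights per-data-set allele frequencies by the number of alleles each data set contributes and then derives Hardy-Weinberg estimates. The allele counts (200 from WES, 384 from Axiom 319) are those given for Dominicans; the two allele frequencies are hypothetical values chosen for the example, and the function names are assumptions rather than the authors' pipeline.

```python
def weighted_maf(sources):
    """Allele-count-weighted mean minor allele frequency (MMAF).

    sources: list of (allele_frequency, n_alleles) pairs, one per data set.
    """
    total_alleles = sum(n for _, n in sources)
    return sum(freq * n for freq, n in sources) / total_alleles


def hardy_weinberg(q):
    """Expected genotype proportions for minor allele frequency q."""
    p = 1.0 - q
    return {"carrier": 2 * p * q, "homozygous": q * q}


def in_panel_range(mmaf, dominant=False):
    """MMAF inclusion window: 0.001-0.1 (recessive or X-linked), 0.001-0.02 (dominant)."""
    upper = 0.02 if dominant else 0.1
    return 0.001 <= mmaf <= upper


# Dominican allele counts from the text; frequencies are hypothetical.
dominican_sources = [(0.010, 200),   # pooled whole exome sequencing
                     (0.0052, 384)]  # Axiom 319 array
mmaf = weighted_maf(dominican_sources)
estimates = hardy_weinberg(mmaf)
print(f"MMAF={mmaf:.4f} carrier={estimates['carrier']:.4f} "
      f"homozygous={estimates['homozygous']:.2e} "
      f"included={in_panel_range(mmaf)}")
```

Weighting by allele count means the larger data set dominates the estimate, which is why ESP6500 drives the African American frequencies while the array data carry most of the weight for Dominicans.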

Results
This study identified 75 variants for 39 conditions (seven autosomal dominant, 29 autosomal recessive, one autosomal recessive/digenic recessive, one X-linked recessive, and one X-linked dominant) (Tables 1, S1 and S2). Of these conditions, 10 were observed in this study or reported previously for African Americans and Puerto Ricans, two were observed in this study or reported previously for African Americans and Dominicans, and three were observed in this study or reported previously for all three populations. Some of the early-onset autosomal recessive conditions in the panel are included in current newborn screening panels, such as galactosemia, cystic fibrosis, hemoglobin S and hemoglobin C disease (Watson et al. 2006). Some of these, such as cystic fibrosis, spinal muscular atrophy (SMA) and fragile X syndrome, are included in panels recommended for heterozygote detection (Grody et al. 2013). The remainder have not been recommended for population-based testing.
Predominantly adult-onset conditions included those inherited in both autosomal recessive and dominant patterns whose identification might be beneficial in leading to expedited intervention and/or anticipatory monitoring. Variants conferring risk for both malignant and nonmalignant diseases are included in this category. Eight high-risk variants in BRCA1 and BRCA2 were identified among African Americans and Puerto Ricans (Dean et al. 2015). Among the nonmalignant conditions identified in Puerto Ricans and African Americans were transthyretin amyloidosis (TTR NM_000371.3:c.424G>A (p.Val142Ile)), which typically manifests after age 60 with renal amyloidosis and cardiomyopathy (Jacobson et al. 1997; Shah et al. 2016), and the APOL1-related kidney diseases (APOL1 NM_003661.3:c.1152T>G (p.Ile384Met) and NM_003661.3:c.1024A>G (p.Ser342Gly)): focal segmental glomerulosclerosis, HIV-associated nephropathy (HIVAN), and hypertension-associated end stage kidney disease (ESKD), with risks significantly higher for those carrying two risk alleles (Genovese et al. 2010).
Based on a conservative estimate for medically significant conditions, for Puerto Ricans, the panel would detect 50% of individuals as heterozygotes for autosomal recessive disorders and 2.5% as heterozygotes for autosomal dominant disorders. Among couples, 3% would be at risk for having a child with a recessive disorder. For African Americans, the panel would detect 87% of individuals as heterozygotes for autosomal recessive disorders and 4.6% as heterozygotes for autosomal dominant disorders. Among couples, 23% would be at risk for having a child with a recessive disorder. For Dominicans, the panel would detect 23% of individuals as heterozygotes for autosomal recessive disorders. Among couples, 1.1% would be at risk for having a child with a recessive disorder.
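One way such panel-level figures can be derived from per-condition carrier frequencies is sketched below. The source does not state the exact formula used, so this sketch assumes that conditions are independent, that partners are unrelated, and that a couple is at risk when both partners carry a variant for the same recessive condition. The carrier frequencies are hypothetical, not values from Table 1.

```python
def carrier_of_any(carrier_freqs):
    """Probability that an individual is a carrier for at least one condition."""
    not_carrier = 1.0
    for c in carrier_freqs:
        not_carrier *= 1.0 - c
    return 1.0 - not_carrier


def couple_at_risk(carrier_freqs):
    """Probability that both partners carry a variant for the same condition.

    Assumes random mating, so the per-condition couple risk is c * c.
    """
    no_shared_condition = 1.0
    for c in carrier_freqs:
        no_shared_condition *= 1.0 - c * c
    return 1.0 - no_shared_condition


# Hypothetical per-condition carrier frequencies for one population.
panel = [0.08, 0.02, 0.01]
print(f"individual carrier rate: {carrier_of_any(panel):.3f}")
print(f"couples at risk:         {couple_at_risk(panel):.4f}")
```

Under these assumptions the couple risk is dominated by the most common conditions, since each condition contributes roughly the square of its carrier frequency.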
To gain insight into the population origins of the clinically significant variants identified in our study, we queried their frequencies among the populations known to have admixed during the peopling of the Americas (Lalueza-Fox et al. 2001) (Fig. 2). These populations included various European and African groups, for which there is representation in both the 1000G and ExAC databases. As no equivalent database exists for the pre-existing Caribbean Indigenous/Native populations (Tarawak tribes), the ExAC East Asian database was queried as a proxy for these frequencies, given the archeological record of migration through the Bering Strait in populating the New World. Several variants were found exclusively among African and/or Hispanic-Latino populations, suggesting an African origin. These included HBB:NM_000518.4:

Discussion
The aim of this study was to increase understanding of the genetic disease burden carried by Puerto Ricans, African Americans, and Dominicans as a first step in developing more effective clinical diagnosis and management strategies for these populations. We have broadened the list of clinically relevant conditions beyond those traditionally associated with these populations. Screening for the recessive and X-linked disorders provides opportunities for reproductive risk management beyond conditions currently included in ACMG-recommended or commercially available panels.
Early or presymptomatic identification of other conditions could lead to interventions that would reduce morbidity and mortality and, in some cases, shorten the diagnostic odyssey. [...] been effective in avoiding morbidities associated with hereditary folate malabsorption (HFM) (Qiu et al. 2006). The increased prevalence has prompted advocacy of newborn screening for this mutation among Puerto Ricans (Mahadeo et al. 2011). Transthyretin amyloidosis presents as a late-onset autosomal dominant disease with renal amyloidosis and cardiomyopathy and disproportionately affects African Americans (Jacobson et al. 1997; Shah et al. 2016). Liver transplant remains the only potentially curative treatment for transthyretin amyloidosis, and is best accomplished at an early age to reduce perioperative morbidity secondary to cardiac pathology. Close monitoring of renal and cardiac function may be pursued for non-operative candidates, and correct identification may guide appropriate supportive treatment. Either scenario emphasizes the personal clinical utility of presymptomatic genetic testing. Predicting breast cancer risk through testing for the BRCA1 and BRCA2 high-risk variants seen in our populations represents a paradigm shift toward a broader definition of population screening that includes predisposition screening (Lieberman et al. 2016). Despite an increased frequency of aggressive triple-negative breast cancer among African American and Hispanic Latino women, genetic testing for deleterious BRCA1 and BRCA2 high-risk variants to guide clinical decisions remains underutilized by members of either population (Olopade et al. 2003; Halbert et al. 2006; Lieberman et al. 2016). Yet, optimistically, women from these populations who are at moderate to high risk for carrying BRCA1 and BRCA2 alleles might be more receptive to genetic testing (Kessler et al. 2005).
Arguments have been made both for and against ethnicity-based genetic screening of unaffected adults. Arguments in favor of such screening cite the greater ease of identification and confidence in clinical management of well-understood genetic conditions and their underlying genetic alterations. Examples of this paradigm include successes in reducing morbidity from maple syrup urine disease and glutaric acidemia type 1 among Amish and Mennonite communities in Northern Pennsylvania and from Tay-Sachs disease among Ashkenazi Jews (Kaback 2000; Strauss et al. 2012). Arguments in opposition to such testing focus on concerns of potential discrimination and compulsory genetic testing for certain groups, as was the case for sickle cell disease during the 1970s (Fulda and Lykens 2006). Further, one study cited the risk of a negative self-image among individuals discovered to carry an autosomal recessive deleterious mutation, even when reassured of their lack of personal risk (Axworthy et al. 1996). Such concerns, together with the variety of disorders identified and the ways in which this information is applied (self-diagnosis or reproductive risk assessment), highlight the need for efforts to develop individualized and culturally sensitive genetic counseling to facilitate informed consent and access to genetic services for these populations.

Supporting Information
Additional Supporting Information may be found online in the supporting information tab for this article:
Table S1. List of conditions identified in this study by inheritance pattern in Puerto Ricans (PR), African Americans (AA), and Dominicans (Dom), identified through the analytical pipeline.
Table S2. List of conditions identified in this study by inheritance pattern in Puerto Ricans (PR), African Americans (AA), and Dominicans (Dom), identified through literature review.