Endoglin pathway genetic variation in preeclampsia: A validation study in Norwegian and Latina cohorts

Highlights
• The endoglin pathway’s role in preeclampsia has not been fully elucidated.
• Genetic variation in endoglin pathway genes is associated with preeclampsia.
• Endoglin pathway genetic variation in preeclampsia may vary by ancestry.

including endoglin/soluble endoglin, represents one potential mechanism that has garnered a great deal of attention over the last decade [6,7].
Endoglin (CD105; ENG) is a membrane-bound co-receptor of transforming growth factor beta (TGFβ) that regulates vascular tone [8,9], is essential for normal angiogenesis and vascular development [10], and negatively regulates trophoblast differentiation towards the invasive phenotype [11,12]. Throughout pregnancy, ENG mRNA expression is elevated in the placenta and the maternal blood (cellular and cell-free components) in women who develop preeclampsia [13][14][15][16][17][18][19]. A preliminary study has also demonstrated that ENG mRNA expression is significantly elevated in microvascular endothelial cells isolated from adipose tissue obtained at cesarean section in women with preeclampsia [20]. Moreover, soluble endoglin (sENG) protein is significantly elevated in the circulation of women with preeclampsia weeks to months before clinically overt disease [21,22]. Soluble endoglin, which is generated by the cleavage of the trans-membrane ENG receptor, is postulated to enter the maternal circulation from the placenta where it sequesters transforming growth factor beta (TGFβ) and inhibits TGFβ-mediated cell signaling and endothelial function [19,23].
These elevations in endoglin mRNA expression and circulating endoglin protein levels in women with preeclampsia, assumed to be due to the increased contribution from a dysfunctional placenta, prompted us to investigate the role that maternal genetic variation in ENG and other members of the TGFβ1 signaling system may play in the development of preeclampsia. Using a case-control genetic association study design and a tagging single nucleotide polymorphism (tSNP) approach, we previously found that genetic variation in ENG (rs11792480, rs10121110) and TGFβR2 (rs6550005) was significantly associated with the development of preeclampsia in American Caucasian women, while genetic variation in TGFβ1 (rs4803455, rs4803457), TGFβR1 (rs10739778), and TGFβR2 (rs6550005, rs1346907, rs877572) was significantly associated with the development of preeclampsia in African American women [24]. For this original study we defined the preeclampsia phenotype based on the presence of hypertension, proteinuria, and hyperuricemia, and normotensive controls were 1:1 frequency matched to preeclampsia cases on ancestry, age, and parity.
The purpose of this current study was to (1) validate our genetic association findings from the American Caucasian cohort (PEPP cohort) in a well-characterized Caucasian cohort from Norway and a well-characterized White Hispanic cohort from Southern California and (2) explore the ENG pathway for new associations in these two cohorts.

Study populations
De-identified clinical data and DNA samples were obtained from the Oslo Pregnancy Biobank and the "Pilot Study of Novel Candidate Genes for Preeclampsia" research study conducted at the University of Southern California, for Norwegian and Latina cohort participants, respectively.
Participants of the Oslo Pregnancy Biobank were women with singleton pregnancies who delivered via elective cesarean section between the years 2001 and 2008. Women with a history of chronic hypertension, renal disease, diabetes, or other chronic diseases were excluded. Preeclampsia was defined as new-onset hypertension and proteinuria after gestational week 20, in accordance with the American College of Obstetricians and Gynecologists 2002 criteria that were in place when the study was conducted [25]. New-onset hypertension was defined as systolic blood pressure (SBP) ≥ 140 mmHg or diastolic blood pressure (DBP) ≥ 90 mmHg after 20 weeks' gestation, measured on ≥ two occasions at least six hours apart in a previously normotensive woman. Proteinuria was defined as ≥ 0.3 g in a 24 h specimen or a urine dipstick ≥ 1+ on ≥ two midstream urine samples collected 6 h apart, in the absence of a urinary tract infection. Severe preeclampsia was defined as SBP ≥ 160 or DBP ≥ 110 on ≥ two occasions that were at least six hours apart (in accordance with the criteria outlined by the American College of Obstetricians and Gynecologists in 2002) [25]. HELLP syndrome was defined by the presence of hemolysis, elevated liver enzymes, and low platelets in the participants with preeclampsia. Partial HELLP was diagnosed when only two of the three signs were present. Normotensive (control) pregnancies were defined as those pregnancies in which a woman's blood pressure did not reach 140/90 mmHg and she did not develop proteinuria. For this validation study, all control participants delivered appropriate for gestational age infants at term. Based on self-reported country of birth and names, all Norwegian participants included in this current study were categorized as White.
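The blood pressure and proteinuria criteria above can be sketched as a small classification routine. This is only an illustrative sketch, not the cohorts' actual case-ascertainment procedure: the names `BPReading`, `has_hypertension`, `has_proteinuria`, and `classify` are hypothetical, the timing of dipstick samples is not checked, and the requirements of onset after gestational week 20, prior normotension, and absence of urinary tract infection are assumed to have been verified upstream.

```python
from dataclasses import dataclass


@dataclass
class BPReading:
    hours: float  # time of measurement, in hours since the first reading
    sbp: int      # systolic blood pressure, mmHg
    dbp: int      # diastolic blood pressure, mmHg


def has_hypertension(readings, sbp_cut=140, dbp_cut=90, min_gap_h=6.0):
    """True if at least two elevated readings exist that are >= min_gap_h apart."""
    elevated = sorted(r.hours for r in readings
                      if r.sbp >= sbp_cut or r.dbp >= dbp_cut)
    return len(elevated) >= 2 and elevated[-1] - elevated[0] >= min_gap_h


def has_proteinuria(protein_g_24h=None, dipsticks=()):
    """>= 0.3 g in a 24 h specimen, or dipstick >= 1+ on at least two samples.

    Simplification (assumption): the 6 h spacing of dipstick samples is not checked.
    """
    if protein_g_24h is not None and protein_g_24h >= 0.3:
        return True
    return sum(1 for d in dipsticks if d >= 1) >= 2


def classify(readings, protein_g_24h=None, dipsticks=()):
    """Label a pregnancy using the Oslo cohort's stated thresholds."""
    htn = has_hypertension(readings)
    prot = has_proteinuria(protein_g_24h, dipsticks)
    if htn and prot:
        severe = has_hypertension(readings, sbp_cut=160, dbp_cut=110)
        return "severe preeclampsia" if severe else "preeclampsia"
    any_elevated = any(r.sbp >= 140 or r.dbp >= 90 for r in readings)
    if not any_elevated and not prot:
        return "control"
    return "unclassified"  # e.g., hypertension without proteinuria
```

Note that the Latina cohort's definition of severe preeclampsia additionally required heavy proteinuria, so a sketch for that cohort would need an extra condition in the severe branch.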
The Oslo Pregnancy Biobank was approved by the Regional Committee of Medical Research Ethics in Eastern Norway, and use of DNA samples for the current study was approved by the University of Pittsburgh Institutional Review Board (IRB).
Participants of the "Pilot Study of Novel Candidate Genes for Preeclampsia" case-control research study (University of Southern California) were retrospectively recruited from delivery logs at the Los Angeles County and University of Southern California Women's and Children's Hospital or during their postpartum stay at the Women's and Children's Hospital between the years of 1999 and 2008. Women with lupus, chronic renal disease, multiple gestation, or sickle cell disease/trait were excluded from participation. Preeclampsia was also defined in alignment with the blood pressure and proteinuria criteria outlined by the American College of Obstetricians and Gynecologists in 2002 (please see criteria above) [25]. Severe preeclampsia was defined as SBP ≥ 160 mmHg or DBP ≥ 110 mmHg on ≥ two occasions at least 6 h apart AND proteinuria ≥ 500 mg/dL in a 24-h specimen or +3 on a urine dipstick. HELLP syndrome was defined by the presence of hemolysis (abnormal peripheral smear or LDH ≥ 600), elevated liver enzymes (ALT and/or AST ≥ 70), and low platelets (≤100,000). Partial HELLP was diagnosed when only two of the three signs were present. For this replication study, all participants with HELLP or partial HELLP also met the preeclampsia criteria. Uncomplicated pregnancies were defined as those pregnancies without significant hypertension. Based on self-report, all 175 Latina participants included in the current study were categorized as Hispanic Whites. The "Pilot Study of Novel Candidate Genes for Preeclampsia" research study was approved by the University of Southern California IRB, and use of the DNA samples in the current study was approved by the University of Pittsburgh IRB.

Genotyping methods
DNA aliquots obtained from the Norwegian cohort participants were extracted from EDTA peripheral blood samples using the MagNA Pure LC DNA Isolation Kit (Roche Diagnostics GmbH, Mannheim, Germany). DNA aliquots obtained from the Latina cohort participants were extracted from either peripheral blood samples or buccal swabs using the QIAamp DNA mini kit (Qiagen, Valencia, CA, USA), saliva samples using the Oragene kit (DNA Genotek, Ontario, Canada), or mouthwash samples using a phenol-chloroform protocol previously described [26]. The genomic DNA aliquots provided by both the Norwegian and Latina cohorts were whole genome amplified using the GE™ Healthcare illustra™ GenomiPhi V2 DNA Amplification kit (GE Healthcare Life Sciences, Little Chalfont, United Kingdom). Methods for polymorphism selection, genotyping (iPLEX® Gold-SNP Genotyping assay; Sequenom® Inc, San Diego, CA; now Agena Bioscience, San Diego, CA), and data reliability checks are fully described by Bell et al. (2013) [24] (Table 1). The UCSC Genome Browser was used to identify tSNP positions within the genome (Human Assembly Dec. 2013, GRCh38/hg38) [27]. We also used databases (UCSC Genome Browser, dbSNP) and conducted a literature search that combined gene name AND rs number as keywords (e.g., TGFβR2 AND rs11129420) to learn about the potential function of the tSNPs that were significantly associated with pregnancy outcome in the multivariate models.

Statistical analysis
Demographic and clinical characteristics were compared between cases and controls using parametric and non-parametric tests (e.g., Student's t-test, Fisher's exact test). For the main analyses, we utilized a tiered analytic approach. For the first analytic tier, we used Chi-square or Fisher's exact tests (univariate analysis) and multivariate regression models to evaluate three tSNPs that were found to be significantly associated with pregnancy outcome in our previous study (rs11792480, rs10121110, rs6550005). In the second analytic tier, we evaluated the remaining tSNPs that had been genotyped. Univariate analysis with either Chi-square or Fisher's exact tests was conducted to initially explore the relationship between pregnancy outcome and each tSNP. A p-value ≤ 0.15 was considered potentially important. Multiple logistic regression models were then used to assess the relationship between potentially important tSNPs and pregnancy outcome while controlling for maternal age at delivery, infant sex, parity, self-reported smoking status during the index pregnancy, and pre-pregnancy body mass index (BMI). The adjusted odds ratio, 95% confidence interval, and p-value for each tSNP were reported. The overall goodness-of-fit of the logistic models was assessed by the Hosmer-Lemeshow test. We did not correct for multiple comparisons. Power analysis revealed that a sample size of 140 achieves 80% power to detect a moderate effect size (W) of 0.2368 using a 1 degree of freedom Chi-square test with a significance level (alpha) of 0.05.
In addition, we carried out a sub-group analysis to explore the association between tSNP genotype distributions and early vs. late-onset preeclampsia (i.e., preterm vs. term preeclampsia). We used gestational age at delivery as a surrogate marker for preeclampsia onset (early-onset preeclampsia: delivering at < 37.0 weeks' gestation; late-onset preeclampsia: delivering at ≥ 37.0 weeks' gestation) and compared tSNP genotype distributions for each preeclampsia subgroup to the healthy control group using Fisher's exact tests.
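The reported power figure can be checked with a short calculation. The sketch below assumes that W is Cohen's w, so that the noncentrality parameter of the test statistic is n × W²; the function name `chi2_power_df1` is hypothetical, and this is not necessarily the software or procedure the authors used. It relies on the fact that a noncentral Chi-square variable with 1 degree of freedom is the square of a normal variable with unit variance.

```python
from math import sqrt
from statistics import NormalDist


def chi2_power_df1(n, w, alpha=0.05):
    """Power of a 1-df Chi-square test for sample size n and effect size w.

    Assumption: w is Cohen's w, so the noncentrality parameter is n * w**2.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # the 1-df Chi-square critical value is z_crit**2
    shift = sqrt(n) * w                # square root of the noncentrality parameter
    # The statistic is distributed as (Z + shift)**2 with Z standard normal,
    # so it exceeds z_crit**2 when Z + shift > z_crit or Z + shift < -z_crit.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


power = chi2_power_df1(n=140, w=0.2368)
print(f"power = {power:.3f}")
```

Under these assumptions the computed power is approximately 0.80, which agrees with the 80% figure stated above.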
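The subgroup assignment described above amounts to a threshold on gestational age at delivery followed by a tabulation of genotype counts per group. A minimal sketch follows; the function names and the record layout (status, gestational age in weeks, genotype) are hypothetical and chosen only for illustration, and the resulting counts would then be passed to a Fisher's exact test in a statistics package.

```python
from collections import Counter

EARLY_CUTOFF_WEEKS = 37.0  # delivery before this gestational age counts as early-onset


def onset_subgroup(ga_weeks):
    """Assign a preeclampsia case to a subgroup by gestational age at delivery."""
    return "early-onset" if ga_weeks < EARLY_CUTOFF_WEEKS else "late-onset"


def genotype_counts_by_group(records):
    """Tabulate genotype counts for controls and each preeclampsia subgroup.

    records: iterable of (status, ga_weeks, genotype), status 'case' or 'control'.
    """
    counts = {}
    for status, ga_weeks, genotype in records:
        group = "control" if status == "control" else onset_subgroup(ga_weeks)
        counts.setdefault(group, Counter())[genotype] += 1
    return counts


# Hypothetical records for illustration only (not study data).
example = [
    ("case", 34.2, "AG"),
    ("case", 36.9, "GG"),
    ("case", 37.0, "AA"),
    ("control", 39.5, "AG"),
    ("control", 40.1, "AG"),
]
table = genotype_counts_by_group(example)
```

Each subgroup's genotype counts are compared against the control counts separately, which is why controls are kept as their own group rather than being split by gestational age.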

Associations between ENG (rs11792480, rs10121110) and TGFβR2 (rs6550005) tSNPs and preeclampsia not replicated in Norwegian and Latina cohorts
The demographic and clinical characteristics for the Norwegian and Latina cohorts are presented in Tables 2 and 3, respectively. In our previous genetic association study, we found that variation in two endoglin tSNPs (rs11792480, rs10121110) was associated with susceptibility to/protection from preeclampsia in an American Caucasian cohort, while variation in a TGFβR2 tSNP (rs6550005) was associated with preeclampsia in both an American Caucasian and an African American cohort. We found additional tSNPs that were associated with preeclampsia in the African American cohort that were not found to be associated with preeclampsia in the American Caucasian cohort, but here we focused on tSNPs that were associated with preeclampsia in the American Caucasian cohort given that our replication cohorts were not of African ancestry.
In both the Norwegian and Latina cohorts, associations between each of the three tSNPs of interest (ENG: rs11792480, rs10121110; TGFβR2: rs6550005) and preeclampsia status were not replicated in either the unadjusted or the adjusted analyses (all p > 0.05) (refer to Supplemental Materials for tSNP/SNP genotype distributions). These associations were also non-significant in our exploratory subgroup analyses of early and late-onset preeclampsia.

Exploratory analysis provides additional support for the ENG pathway's involvement in preeclampsia
Six intronic tSNPs in two ENG pathway candidate genes were significantly associated with preeclampsia status in the Norwegian cohort: TGFβR1 (rs6478974) and TGFβR2 (rs11129420, rs6802220, rs1155708, rs3773640, rs3773663) (Table 4) (refer to Supplemental Materials for tSNP/SNP genotype distributions).
Our exploratory subgroup findings suggest that tSNPs in two ENG pathway candidate genes were also significantly associated with preeclampsia in the Latina cohort: ALK1 (rs706819, 3′ UTR variant) and TGFβR2 (rs984394, intronic) (Table 5). Because tSNP genotype distributions may differ by preeclampsia subgroup, we also compared preterm and term preeclampsia distributions to control distributions.

Discussion
Preeclampsia is a heterogeneous disorder of pregnancy whose etiology may stem from a variety of underlying mechanisms. Women who develop early-onset and late-onset preeclampsia differ on average on several aspects, including their risk of future cardiovascular disease [28], but a dysfunctional placenta is seen as a major common pathophysiological pathway for triggering the maternal syndrome [29]. Some of the proposed mechanisms include defects in placental implantation, oxidative stress, a susceptible maternal constitution, and anti-angiogenic/angiogenic factor imbalance stemming from a dysfunctional placenta [30]. Endoglin and its soluble form, sENG, are angiogenesis-modulatory factors that are thought to have a role in the development of preeclampsia, and associations have been found at the gene, mRNA, and protein level. We have previously shown that genetic variants in the ENG pathway are associated with an increased susceptibility to preeclampsia in both American Caucasian and African American women [24]. Others have shown that mRNA expression of ENG is increased in the placenta, maternal blood (cellular and non-cellular components), and maternal microvascular endothelial cells [13][14][15][16][17][18][19][20] while levels of sENG protein in the maternal circulation are elevated weeks to months before overt disease in women with preeclampsia [21,22]. Although the evidence to support the alterations in ENG mRNA
In this study we investigated the association between maternal genetic variation in the ENG pathway and susceptibility to/protection from preeclampsia in independent, well-characterized preeclampsia cohorts. We first sought to validate our genetic association findings from an American Caucasian preeclampsia cohort (ENG tSNPs rs11792480, rs10121110; TGFβR2 tSNP rs6550005) in Norwegian and Latina cohorts. We further explored the other gene candidates from the ENG pathway for new genetic associations with preeclampsia in these cohorts.
We were unable to validate/replicate the SNP-specific previous findings in these independent cohorts. However, we identified other associations between ENG pathway candidate genes and preeclampsia in the Norwegian (TGFβR1[ALK5], TGFβR2) and Latina (ALK1, TGFβR2) cohorts in our adjusted models. We have previously demonstrated that variation in TGFβR1[ALK5] was associated with preeclampsia in an African American cohort [24]. Moreover, we have demonstrated that genetic variation in TGFβR2 is associated with susceptibility to/protection from preeclampsia in all of the populations we have studied. Although the TGFβR2 tSNPs that were associated with a particular cohort may differ, these findings, along with the other associations in the Norwegian and Latina cohorts, provide additional evidence to support the ENG pathway's role/involvement in the development of preeclampsia.

Failure to validate/replicate associations between genetic variation in ENG and TGFβR2
Several potential explanations could account for our inability to validate our previous results in independent cohorts. First, these cohorts may not have been sufficiently powered to detect statistically significant associations when they truly existed. Second, tSNPs that were selected based on Caucasian ancestry (CEU population-European descendants living in Utah) may not be as informative in the Norwegian and Latina cohorts if the haploblocks tagged by the tSNPs differ by ancestry. Finally, differences in the definition of preeclampsia phenotype may underlie our failure to validate our previous results. In the original cohort, a diagnosis of preeclampsia was based on hypertension, proteinuria, and hyperuricemia. Hyperuricemia was included in the original cohort's research definition as a way to identify a more homogeneous and potentially more severe form of preeclampsia. In both the Norwegian and Latina cohorts, hyperuricemia was not included as a criterion for the preeclampsia phenotype. Given the heterogeneous nature of preeclampsia, these cohorts may represent different preeclampsia subtypes with differing etiologies.
Table notes (demographic and clinical characteristics tables). Abbreviations: BMI (kg/m²), body mass index (kilograms/meters squared). Pre-pregnancy BMI includes BMI pre-pregnancy or within the first trimester of pregnancy. Smoked during pregnancy ** (n, %): 7 (10.9%), 4 (4.0%), p = .08. * n = 11 controls and n = 2 cases were missing pre-pregnancy BMI values. ** n = 6 controls and n = 5 cases were missing values on smoking during pregnancy.

Additional evidence to support the ENG pathway's involvement in preeclampsia
Genetic variation in TGFβR1[ALK5] and TGFβR2 was associated with susceptibility to/protection from preeclampsia in the Norwegian cohort, while genetic variation in ALK1 and TGFβR2 was associated with susceptibility to/protection from preeclampsia in the Latina cohort. Seven of the eight significantly associated tSNPs (Tables 4 and 5; rs6478974, rs11129420, rs680220, rs1155708, rs3773640, rs3773663, rs9843942) are intronic, while the remaining tSNP (rs706819) is located in the 3′ UTR of the gene. Of the eight tSNPs, we were only able to locate a small amount of literature on TGFβR1 tSNP rs6478974, which has been associated with endometrial and gastric cancer in other populations [31,32].
TGFβR1[ALK5] and ALK1 are type 1 receptors of the TGFβ signaling cascade, which form a heteromeric complex with TGFβR2, the type 2 receptor of the TGFβ signaling cascade. In most cells, TGFβ signals via ALK5, but TGFβ can also signal via ALK1 in endothelial cells. The TGFβ1 ligand binds to these complexes, and propagation of the cell signal is involved in the control of a variety of processes including cellular proliferation, migration, and differentiation [10].
In vitro research has demonstrated that TGFβ1 is a negative regulator of trophoblast differentiation toward the invasive phenotype [33][34][35][36], and that endoglin is needed to facilitate this inhibitory effect [11,12]. Shallow trophoblast invasion and failed remodeling of the spiral arteries mostly affect the early-onset subtype of preeclampsia [37][38][39][40]. Furthermore, implantation and placental development are dependent on the successful interaction between fetal/placental and maternal/decidual factors [34]. As such, it is plausible that maternal or fetal genetic variation in any of the members of the ENG/TGFβ pathway could impact TGFβ signaling and the regulation of trophoblast invasion. Moreover, it is possible that maternal genetic variation in this pathway could also impact systemic endothelial function in the mother, as this pathway is involved in the maintenance of vascular homeostasis, including vasomotor tone [10,41,42].

Study limitations
In addition to small sample size and differences in ancestry that were previously discussed, there were several other limitations associated with this study. First, the cohorts' sample sizes were further reduced when we carried out our exploratory preeclampsia subgroup analyses, which limited our ability to adequately evaluate preeclampsia subgroups. Second, we did not adjust for multiple comparisons and therefore we cannot rule out the potential for type 1 error. As such, follow-up studies with larger sample sizes, including larger preeclampsia subgroups, will be needed to validate these results. Third, we were unable to successfully genotype three SNPs (TGFβ1 rs1800468; TGFβR2 rs1078985 and rs995435).

Summary
Overall, our results provide further support for the involvement and investigation of the endoglin pathway in preeclampsia, yet additional research with larger sample sizes, including larger samples of different preeclampsia subgroups, is needed. Due to the relatively low frequency of preeclampsia, there is a need for investigators across the globe to collaborate and share data/biological samples with each other so that we can continue to put preeclampsia's puzzle pieces together.

Funding
This work was supported by the National Institutes of Health (F32NR014622, T32NR009759) and the Preeclampsia Foundation.