A modest but significant effect of CGB5 gene promoter polymorphisms in modulating the risk of recurrent miscarriage

Objective To confirm the effect of single nucleotide polymorphisms (SNPs) in chorionic gonadotropin beta (CGB) genes in modulating the susceptibility to recurrent miscarriage (RM) in Danes and in a meta-analysis across Danes and the discovery samples from Estonia and Finland. Design Case-control association study, restriction fragment length polymorphism genotyping, resequencing. Setting Fertility clinics at the Rigshospitalet, Copenhagen, and Aalborg Hospital, Aalborg, Denmark. Patient(s) Four hundred fifty Danish women and men from couples with RM and 119 women with children and no miscarriages in new study. A total of 634 women and men from RM couples and 314 female controls in a combined study of Estonians, Finns, and Danes. Intervention(s) None. Main Outcome Measure(s) Distribution of CGB5 and CGB8 allele and haplotype frequencies in patients and controls. Result(s) For the majority of studied SNPs, the allelic and haplotypic distribution differed statistically between the Danish and the previous Estonian-Finnish sample. In Danes, two CGB5 promoter SNPs (c5-155; c5-142) exhibited a nonsignificant trend for higher allele frequency in fertile women compared with RM patients. The meta-analysis of results from three populations confirmed a modest but significant effect on carriage of c5-155C (odds ratio = 0.64; 95% confidence interval [CI] 0.44–0.94) and c5-142A (odds ratio = 0.66; 95% CI, 0.45–0.94) variants in reducing the risk of RM. None of the investigated genetic variants in the CGB8 gene was associated with RM. Conclusion(s) Carriage of particular variants in the promoter of the CGB5 gene seems to protect against RM. No common genetic variants in CGB5 and CGB8 were associated with increased RM susceptibility in the studied North European populations.

I n human pregnancy, the production of hCG, a placental hormone, is indispensable. Its classical function is considered to maintain the production of steroid hormones in the corpus luteum. In addition, hCG enhances blastocyst implantation, uterine vascularization, and angiogenesis, as well as regulates maintenance of uterine quiescence and immunological adaptation during pregnancy (1)(2)(3). Low levels of hCG during the first trimester of pregnancy are related to miscarriage and extrauterine pregnancy (4)(5)(6). Abnormal circulating levels of hCG and alterations in the hormone's glycosylation patterns have been described in several pathologies (trisomies, gestational trophoblastic diseases, malignant tumors, etc.) and implied in clinical diagnostics (3,7,8).
A clinical condition that may develop from low hCG is recurrent miscarriage (RM), defined as three or more consecutive pregnancy losses before 22 gestational weeks (9). Apart from the known risk factors for RM (parental chromosomal anomalies, maternal thrombophilic, anatomical, endocrine, or immunological disorders), >50% of the RM cases remain classified as idiopathic (10). As the prevalence of miscarriage among the first-degree relatives of the women with RM is increased (11), a notable fraction of unexplained RM cases is expected to represent carriers of genetic risk factors involved in RM pathogenesis. Due to an irreplacable role of hCG in normal gestation, genetic variants in genes encoding hCG subunits may affect gene expression and consequently the optimal levels of hormone production as well as pregnancy success.
HCG is a heterodimeric glycoprotein consisting of two dissimilar subunits, a and b. The a-subunit gene is shared among gonadotropins (hCG, LH, FSH) and TSH, whereas the b-subunit is hormone specific. In humans, the b-subunit of hCG is coded by four duplicated and highly homologous (97%-99% DNA identity) chorionic gonadotropin beta (CGB) genes (12)(13)(14)(15). All CGB genes encode identical hCG bsubunit proteins, which are critical to the level of intact circulating hCG (16). Still, the transcriptional activity among gene duplicates varies greatly, and there is also a large interindividual variation in the hCGbeta transcript levels (6,17,18). The majority, up to 82%, of the total pool of hCGbeta transcripts is provided by two genes, CGB8 and CGB5 (6,18).
We have recently conducted a clinical resequencing study of CGB5 and CGB8 genes among Estonian and Finnish patients with RM and fertile controls (19). The study identified three rare variations in the protein-coding exons resulting in amino acid changes in the hCG-beta protein (Val56Leu in CGB5; Arg8Trp and Pro73Arg in CGB8), and they may therefore be potential risk factors for the occurrence of RM. The subsequent detailed functional and structural analysis of these mutations concluded that only substitutions with neutral or mild functional consequences for hCG action might be tolerated in the major hCG-beta coding genes CGB5 and CGB8 (20). Additionally, the resequencing described six single nucleotide polymorphisms (SNPs) in the CGB5 and CGB8 genes located outside the exons with significantly lower frequency among RM patients compared with the control group and thus exhibiting a protective effect towards RM (19). These polymorphisms included four linked SNPs (c5-155G/C/c5-147G/del/c5-144T/C/c5-142T/A) in the upstream of the CGB5 gene (up to 350 bp relative to mRNA start site), which form the two main CGB5 promoter haplotypes that are composed of the combination of either major or minor alleles of these SNPs (Fig. 1). Association with RM susceptibility was also detected for two intronic SNPs in the CGB5 (c5þ1038C/T) and CGB8 genes (c8þ1045C/T) (Fig. 1).
This study aimed [1] to confirm the effect of the CGB5 (c5-155G/C, c5-142T/A, c5þ1038C/T) and CGB8 (c8þ1045C/T) polymorphisms on the susceptibility to RM by genotyping an independent sample set from Denmark and by an extended meta-analysis across the three study populations (Estonians, Finns, Danes); [2] to resequence the promoter region of the most actively transcribed hCG-betacoding gene CGB8 in the Danish RM cases and controls to discover novel potential genetic risk variants to RM. The meta-analysis confirmed a modest but significant effect of the CGB5 promoter variants c5-155C and c5-142A in reducing the risk to RM. Other investigated SNPs in the CGB5 and CGB8 genes exhibited no effect on RM susceptibility.

Study Subjects
Subjects recruited in the study were admitted to the Fertility Clinic, Rigshospitalet, Copenhagen, and the Department of Obstetrics and Gynaecology, Aalborg Hospital, Aalborg, from all over Denmark for investigation and treatment. The study sample set included 450 Caucasian patients diagnosed with RM (three or more pregnancy losses confirmed by the hospital records). The group of Danish idiopathic RM cases consisted of 199 couples and 52 single female patients. Because maternally and paternally derived gene variants contribute equally to the function of the fetal genome in placenta, the patient group included both the women and their partners who had experienced RM. In the Estonian-Finnish discovery study (19) as well as in the current Danish follow-up study, the control group was designed under the assumption that fertile women with no history of miscarriage are carrying gene variants supporting successful pregnancies. The male partners were not investigated among the control group because detailed reliable information on their reproductive history is challenging to collect. The Danish control group comprised 119 Caucasian age-matched fertile women from couples with no history of miscarriage and at least two normal pregnancies. None of the recruited female patients had uterine abnormalities found by hysteroscopy, uterine hydrosonography, or hysterosalpingography, and all RM patients and their husbands had normal karyotypes. All women were regularly menstruating with a cycle length of <35 days, and all had normal plasma thyroxin levels (detailed in [21,22]). The study was approved by the Ethics Committees of the Capital Region, Denmark.
The subsequent meta-analysis combined the Danish data set from the current study with the discovery data of the Estonian-Finnish sample (19). The patient group of unexplained RM comprised 35 couples and 29 single female patients from Estonia and 40 couples and five single female patients from Finland. For the RM patients, the recruitment criteria in the three study centers were identical. The Estonian-Finnish control group was formed from agematched fertile women with no history of miscarriage and consisted of 95 Estonians and 100 Finns (19). The definition of fertile female controls in the discovery study was based on at least one (Finnish) or three (Estonian) successful deliveries (the detailed description is in reference 19).

Genotyping and Resequencing
DNA was extracted from peripheral blood using an in-house protocol or Puregene DNA Isolation Kit (Gentra Systems), which are both based on the salting-out method for DNA extraction. The CGB5 ($1.7 kb fragment) and CGB8 (long-range polymerase chain reaction [PCR] $8.3 kb; nested PCR $2.5 kb fragment) genomic regions were amplified using previously described primers and PCR conditions (15,19) (Fig. 1A, Supplemental Table 1).
The 5 0 -upstream gene regulatory region of CGB8 was subjected to full resequencing, covering from 350 bp upstream relative to the mRNA start site to the end of exon 1 (at þ400 bp). Primer design for the additional PCR amplification and sequencing primers was implemented using the Primer3 software (http://frodo.wi.mit.edu/cgi-bin/primer3/ primer3_www.cgi). Sequences were resolved using ABI 3730 XL DNA Analyzer (Applied Biosystems) and analyzed by the Phred, Phrap and Consed package (23), which facilitates base calling from sequencing trace files, sequence quality assessment, and assembly. Polymorphisms were identified using the PolyPhred program (ver. 6.02.) (24) and confirmed by manual checking. A genetic variant was called only if it was observed in both forward and reverse orientations. The nomenclature of the polymorphisms was based on the  Table 1) for the amplification of the CGB5 and CGB8 genic regions are depicted with short arrows. The flanking regions of the genotyped SNPs (c5-155, c5-142, c5þ1038, c8þ1045) have been zoomed in and aligned between the two duplicate genes. Dots indicate identical nucleotides in the corresponding positions of CGB5 and CGB8. The SNP code corresponds to the gene name (c5 ¼ CGB5) and location relative to mRNA start site. The LD between the four polymorphisms in the CGB5 promoter region is expressed using the r 2 -statistic. (B) The SNPs identified in Danes within the resequenced region of CGB8 spanning the upstream region (À350 bp from mRNA start site) and the first exon (gray box; up to þ400 bp). The proximal promoter of the hCGbeta coding CGB genes necessary for full basal expression has been demonstrated to be located between nucleotide positions À362 and þ104 relative to mRNA start site (31). The direction of gene transcription is shown with a bold arrow. Singleton SNPs are marked with ''S,'' rare SNPs (MAF, <10%) with short bidirectional vertical lines and common SNPs (MAF R10%) with the long vertical lines. The position c8-186 is in strong LD with the SNP c8þ108 in CGB8 exon 1; r 2 ¼ 0.896, 0.971, and 1.0 in Danes, Estonians, and Finns, respectively. All SNPs are listed in Table 1.

Data Analysis
Allele frequencies were estimated, and conformance to Hardy-Weinberg equilibrium (HWE) in the full sample as well as in patient and control subgroups was calculated by Fisher's exact test implemented in the GenePOP software package (http://genepop.curtin.edu.au/index.html) (25). The statistical tests for population differentiation comparing allele and genotype frequencies of all studied SNPs among the three populations (Danish, Estonian, Finnish) were performed using GenePOP (25).
Association with the diagnosis of RM as a binary trait was assessed by the Cochran-Armitage test for trend. Association tests and calculation of LD between SNP pairs (r 2 ) were performed with the PLINK software, version 1.04 (http:// pngu.mgh.harvard.edu/$purcell/plink/). The LD r 2 -statistic represents the square of the correlation coefficient between the alleles at addressed loci.
For a meta-analysis including data from three recruitment centers, the inverse-variance method was implemented under a fixed-effects model using R, version 2.7.2 (R Development Core Team, http://www.r-project.org/). Odds ratios (OR) with 95% confidence intervals (CI) were calculated to show the strength and direction of the association. P< .05 was considered statistically significant.
Haplotypes within the resequenced region of CGB8 (À350 bp to þ400 bp relative to mRNA start) were determined based on all but singleton SNPs. Singleton polymorphisms carried in heterozygous status by one single individual were excluded from haplotype calculations as their location on either of the chromosomes cannot be reliably phased. Haplotypes were inferred from unphased genotype data using the Bayesian statistical method in the program PHASE 2.1.1 (http:// www.stat.washington.edu/stephens/), applying the model allowing recombination (26). The running parameters were number of iterations ¼ 1,000, thinning interval ¼ 1, and burn-in ¼ 100; the ÀÂ10 parameter was used for increasing the number of iterations of the final run of the algorithm. The relationship between inferred haplotypes was analyzed with NETWORK 4.6.1.0. software (http://www.fluxus-technology.com) using the Median-Joining network algorithm (27). Haplotype networks for CGB8 were calculated using SNPs covering the promoter region up to the end of the first exon.
To increase statistical power, the genetic data of the Danish, Estonian, and Finnish recruitment centers were combined in a meta-analysis across the three study samples (total number of 948 individuals; 634 RM patients and 314 fertile female controls; Table 2). The carrier status of the minor alleles of the CGB5 promoter SNPs exhibited a modest but significant protective effect against RM occurrence (P¼ .021; c5-155: OR ¼ 0.64; 95% CI, 0.44-0.94; and c5-142: OR ¼ 0.66; 95% CI, 0.45-0.94; Table 2). This result enhanced and confirmed the outcome of the original report (19). The meta-analysis including only Danish, Estonian, and Finnish female RM patients (n ¼ 349) compared with fertile female controls (n ¼ 314) showed the same direction and magnitude of the effect as the analysis in the full sample, but it did not reach statistical significance owing to the smaller sample size (c5-155: P¼ .116; OR ¼ 0.71; 95% CI, 0.46-1.08; c5-142: P¼ .089; OR ¼ 0.68; 95% CI, 0.44-1.06). Overall, both male and female partners of RM couples had a lower prevalence of the minor alleles of the studied CGB5 promoter SNPs (c5-155 and c5-142) compared with fertile controls (Supplemental Table 4). The allele frequencies of the genotyped intronic polymorphisms (CGB5: c5þ1038; CGB8: c8þ1045) did not differ between the Danish RM cases and fertile controls (MAF, 7.14% vs. 7.42%, P¼ .52; 0.55% vs. 0.43%, P¼ .83, respectively; Table 2).

Genetic Variation in CGB8 Promoter Does Not Affect RM Risk
The allelic distribution of SNPs (excluding singletons) in the resequenced CGB8 gene regulatory region (from À350 bp to þ400 bp from mRNA start) did not differ between the Danish RM patients and fertile controls (Supplemental Table 5), confirming the discovery analysis in the Estonian-Finnish sample (19). Concordantly, no statistical difference was detected in the CGB8 haplotype distribution between RM patients and fertile controls either ( Fig. 2B; Supplemental Table 3). We conclude that common genetic variants in the proximal regulatory region of CGB8 have no substantial effect on the susceptibility to RM.

DISCUSSION
Previously, we showed a significant association between six SNPs located in the promoter region or introns of the CGB5 and CGB8 genes and reduced susceptibility to unexplained RM among Estonians and Finns (19). The present study set out to confirm this finding in another European population (Danes) and in a meta-analysis across the three study

FIGURE 2
Networks of predicted haplotypes of the resequenced region of CGB8 spanning the upstream region (À350 bp from mRNA start site) and the first exon (up to þ400 bp). The size of each node is proportional to the haplotype frequency in the total analyzed data set and the length of connecting lines is proportional to the number of mutational steps between haplotypes. The nomenclature and detailed composition of haplotypes are shown in Supplemental Table 3. (A) Comparison of the haplotype distribution between the Danes (DEN; black; n ¼ 569) and Estonians-Finns (EST/FIN; white; n ¼ 379; [19]). The haplotypes were inferred from seven polymorphisms present more than once among the genotyped Danish-Estonian-Finnish individuals. (B) Comparison of the haplotype distribution between the recurrent miscarriage (RM) cases and fertile controls within the Danish (DEN) and the Estonian-Finnish (EST/FIN) study samples. The haplotypes in the Danes were formed from six and in the Estonian-Finnish sample from five polymorphisms, as some SNPs were population-specific or occurred as singletons in either of the analyzed study population.
populations. The two discovery samples, representing neighboring populations of Estonians and Finns, had exhibited similar allelic distributions of SNPs in the CGB5 and CGB8 genes, whereas the allele frequencies of the Danes appeared to be statistically different from the Estonian-Finnish sample. A recent large-scale study showed that the geography of European populations is also reflected in its genetic structure, where Scandinavians cluster together with western Europeans and the Estonian population is genetically closest to Finns (28). Thus, meta-analysis rather than pooling the samples across studies is a preferred approach for increasing study power.
As a major outcome, this study confirmed the effect of the CGB5 promoter variants on modulating the susceptibility to RM. The carrier status of the minor alleles of the two SNPs (c5-155, c5-142) investigated in the present study as the genetic markers for the CGB5 promoter haplotypes significantly reduced the risk of RM (meta-analysis, P¼ .021, OR ¼ 0.64 [0.44-0.94]). This RM-protective CGB5 promoter haplotype consists of the minor alleles of four SNPs (c5-155G/C; c5-147G/del; c5-144T/C; c5-142T/A) and is completely identical to the homologous region in the CGB8 gene, exhibiting no genetic variation in these positions (Fig. 1).
All humans have the CGB8 promoter haplotype c8-155C/-c8147del/c8-144C/c8-142A, which seems to provide the most optimally functioning promoter because CGB8 is responsible for up to 40% of hCG production in pregnancy (6). Most probably, originally humans had the CGB5 gene with a slightly less efficient main promoter variant c5-155G/c5-147G/c5-144T/c6-142T (Fig. 1A). The detected CGB5 RMprotective haplotype c5-155C/c5-147del/c5-144C/c5-142A originates from the CGB8 gene via a meiotic gene conversion event between the two promoter regions (15). We speculate that in some pregnancies, where the trophoblast growth is impaired (due to genetic, trombophilic, immunological, or other reasons), the placenta with the most efficient CGB5 promoter haplotype (originating from and identical to CGB8) may have a better capacity for extra hCG production that may eventually rescue the threatened fetuses. Subsequently, this CGB5 promoter haplotype is expected to become increasingly prevalent among humans and to exhibit a higher prevalence in couples with normal fertility than in those with RM. This is in agreement with the results of this study. We also suggest that the current CGB8 gene with the c8-155C/c8-147del/c8-144C/c8-142A promoter haplotype has already reached maximum efficiency. Therefore the detected common variations in this gene have neither evolutionary advantage nor effect on pregnancy success, and balancing selection is expected to rapidly eliminate new, less fit variants (19).
In conclusion, despite the essential role of hCG in human pregnancy, no common SNP or haplotype variants in the main hCGbeta coding genes (CGB5, CGB8) were associated with increased risk of RM among the analyzed North European samples. Instead, the evolution in human lineage seems to have favored the spread of CGB genetic variants (e.g., by gene conversion), which support a more efficient gene expression and may reduce the risk of pregnancy loss even in critical situations. Recent studies have suggested that apart from SNPs, the expression of CGB genes might be modified by Meta-analysis was performed using the inverse-variance method implemented under the fixed-effects model; estimation of the combined OR and statistical significance of association takes into account the sample sizes of each individual contributing study.
epigenetic mechanisms (29,30). A pilot study reported polymorphic DNA methylation in the CGB5 promoter region exclusively in placentas from RM cases leading to expressional silencing of the paternal alleles (29). Future larger studies have to target epigenetic modifications and also other non-SNP variations (e.g., copy number variations, gene deletions/duplications) in the CGB genes, which may have clinical importance in modulating susceptibility to pregnancy loss. SUPPLEMENTAL CGB5/7_3nestR Primers for resequencing CGB8 promoter and 5 0 UTR Note: The major variant of a polymorphism is marked with a capital, and the minor variant with a lowercase letter. The haplotypes are in concordance with haplotype networks (Fig. 2)