Maternal Serum Screening Markers and Adverse Outcome: A New Perspective

There have been a number of studies evaluating the association of aneuploidy serum markers with adverse pregnancy outcome. More recently, the development of potential treatments for these adverse outcomes as well as the introduction of cell-free fetal DNA (cffDNA) screening for aneuploidy necessitates a re-evaluation of the benefit of serum markers in the identification of adverse outcomes. Analysis of the literature indicates that the serum markers tend to perform better in identifying pregnancies at risk for the more severe but less frequent forms of individual pregnancy complications than for the more frequent but milder forms. As a result, studies which evaluate the association of biomarkers with a broad definition of a given condition may underestimate the ability of such markers to identify pregnancies that are destined to develop the more severe form of the condition. Consideration of general population screening using cffDNA alone must be weighed against the fact that traditional screening using serum markers enables detection of severe pregnancy complications, not detectable with cffDNA, many of which may be amenable to treatment.


Introduction
Prenatal screening for birth defects was initially implemented using a single biochemical marker (alpha-fetoprotein) to identify a single condition (open neural tube defects, ONTDs) in the second trimester of pregnancy [1,2]. Over the course of the last 30 years, the field has evolved so that multiple ultrasound and biochemical markers across the first and second trimesters are used to identify patients at risk not only for ONTDs but also for Down syndrome and trisomy 18/13 [3][4][5][6][7][8][9][10][11]. In addition, there have been a number of reports [12][13][14][15][16][17][18] regarding the effectiveness of the serum markers to identify pregnancies at high risk for additional adverse perinatal outcomes leading to a number of reviews and consensus opinions [19][20][21][22][23][24]. The purpose of such reviews was to evaluate serum markers which were already being used in aneuploidy screening to see if there was any additional benefit in identifying other conditions beyond the primary outcomes being screened. These reviews focused on improving pregnancy management through the use of additional counseling and follow-up ultrasound examination since the effectiveness of treatments for these other conditions was not well-established.
More recently, there have been new developments that need to be considered when evaluating aneuploidy screening markers for other adverse outcomes. Evaluation of cell-free fetal DNA (cffDNA) in maternal blood offers the opportunity to significantly improve the detection of Down syndrome while substantially reducing false positive rates [25][26][27][28]. This new technology can also be used to detect trisomy 18, trisomy 13 and sex chromosome abnormalities [28][29][30], albeit at somewhat lower detection rates than for trisomy 21. Based on the initial studies, the American Congress of Obstetricians and Gynecologists (ACOG) [31] concluded that cffDNA-based testing could be offered to pregnancies at high risk for aneuploidy, including those with advanced maternal age. Before it can be determined whether or not cffDNA testing should be applied to a low risk population, a number of factors need to be considered. First, the cost of the technology is significant and at present the cost of providing cffDNA testing to the entire population is substantially greater than that of current screening protocols even after factoring in the savings due to improved detection [32]. Second, the cffDNA approach is highly focused on the specific genetic disorders tested for and therefore, at present, cffDNA testing cannot detect the significant percentage of atypical abnormal karyotype results which are associated with abnormal serum or nuchal translucency values [33]. Furthermore, cffDNA testing does not appear to be useful in the identification of other adverse perinatal outcomes which can lead to severe perinatal morbidity and mortality such as preeclampsia, preterm birth and small for gestational age neonates [34]. Indeed, the incidence of severe morbidity, mortality and NICU admission exceeds the incidence of the genetic disorders that can be identified by cffDNA testing (Figure 1).
Figure 1. Comparison of incidence of adverse outcomes due to all causes with incidence of chromosomal abnormality. Data for incidence of adverse outcome from Lisonkova et al. [35]. Data for incidence of chromosomal abnormality from Wapner et al. [36]. NICU = Infant admitted to neonatal intensive care unit, PCS-Duplications/Deletions = Chromosome microdeletions and duplications with potential for clinical significance, T21/18/13/X/Y = trisomy 21, trisomy 18, trisomy 13 and sex chromosome abnormalities.
The ability to reduce the incidence of severe morbidity and mortality lies in the early identification and treatment of asymptomatic pregnancies.
Initial meta-analysis of 31 randomized trials indicated that aspirin had only a small beneficial effect on reducing the incidence of preeclampsia as a whole [37]. However, re-analysis of those studies by Bujold et al. [38] and Roberge et al. [39,40] showed that when aspirin was administered prior to 16 weeks the incidence of preeclampsia was reduced by 53%. Restricting the analysis to early-onset preeclampsia showed a reduction in incidence of approximately 90% [38]. In addition, the analysis showed that aspirin treatment prior to 16 weeks was effective in reducing fetal loss, fetal growth restriction, and preterm birth [38]. Other studies have reported on effective interventions for preterm birth such as cervical cerclage or progesterone [41][42][43][44][45]. In addition, fetal surgery is now a potential option in treating myelomeningocele [46][47][48]. Finally, early identification of placenta accreta can prepare the medical team for potential complications during delivery [49][50][51]. Below we review the association of adverse outcomes with respect to routine serum screening markers.

Preeclampsia
Several studies have evaluated first trimester free β human chorionic gonadotropin (free hCGβ) and pregnancy associated plasma protein A (PAPP-A) as markers for preeclampsia [17][52][53][54][55][56]. In general, these studies did not show an association of preeclampsia with free hCGβ but did show an association with low PAPP-A resulting in detection rates of 8%-15% at a false positive rate of 5%.
Morris et al. [23] performed a systematic meta-analysis of cohort studies evaluating second trimester markers and preeclampsia. There was significant variation among studies in the threshold used to identify patients at high risk as well as significant variation in screening performance. Kang et al. [57] found a significant association between inhibin and preeclampsia but not AFP and uE3. In a three-marker protocol including PAPP-A, inhibin and hCG, the detection rate was 40% at a 5% false positive rate. The FASTER trial [18] also showed an association between preeclampsia and inhibin with a detection rate of 17% at a false positive rate of 3%, but detection was not improved with additional markers.
Among preeclampsia pregnancies, approximately 70% of perinatal deaths and 60% of cases of severe neonatal morbidity occur in early onset (<34 weeks) preeclampsia even though these cases represent only about 10% of all preeclampsia cases [35]. As a result, a significant positive impact on perinatal morbidity and mortality can be achieved with effective screening programs for the early-onset form of the disease.
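The proportions quoted above can be turned into a ratio of per-case risk, which makes the concentration of harm in early-onset disease easier to see. The following is a minimal illustrative sketch rather than a calculation taken from the cited study: it assumes only the approximate shares given in the text (10% of cases, 70% of perinatal deaths, 60% of severe neonatal morbidity), and the function name is hypothetical.

```python
def per_case_risk_ratio(share_of_cases, share_of_events):
    """Ratio of per-case event risk inside a subgroup versus the remaining cases.

    share_of_cases:  fraction of all cases that fall in the subgroup
    share_of_events: fraction of all events (e.g., deaths) that occur in the subgroup
    """
    risk_inside = share_of_events / share_of_cases
    risk_outside = (1 - share_of_events) / (1 - share_of_cases)
    return risk_inside / risk_outside


# Approximate shares quoted in the text for early-onset (<34 weeks) preeclampsia.
deaths_ratio = per_case_risk_ratio(share_of_cases=0.10, share_of_events=0.70)
morbidity_ratio = per_case_risk_ratio(share_of_cases=0.10, share_of_events=0.60)

print(round(deaths_ratio, 1), round(morbidity_ratio, 1))
```

Under these approximate shares, an early-onset case carries roughly 21 times the perinatal death risk and roughly 13.5 times the severe morbidity risk of a later-onset case.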
Olsen et al. [58] found that elevated levels of inhibin, hCG and AFP could each identify 22%-28% of early-onset preeclampsia (<34 weeks) at an approximate 5% false positive rate and that the association of these markers with early-onset preeclampsia was stronger than the association with late-onset preeclampsia. Other studies indicate that PAPP-A may also perform better as a marker for early-onset preeclampsia than for late-onset preeclampsia [59][60][61]. Kang et al. [57] found that inhibin was more strongly associated with early-onset than with late-onset preeclampsia. Huang et al. [62] developed a multiple marker algorithm utilizing first trimester PAPP-A and second trimester AFP, hCG and uE3 to identify 18% of early-onset (<32 weeks) preeclampsia. Inclusion of inhibin and maternal characteristics (such as previous history or family history of preeclampsia, parity and chronic hypertension) into such a protocol could potentially lead to significantly higher detection.
Recent data have indicated that a direct screen including maternal characteristics, PAPP-A, placental growth factor (PlGF), uterine artery Doppler pulsatility index and mean arterial pressure can identify over 90% of early-onset preeclampsia pregnancies in the first trimester [63][64][65]. Coincidentally, PlGF and AFP have been demonstrated to be effective in first trimester Down syndrome screening when combined with PAPP-A and free hCGβ [66][67][68]. Thus, an expanded Down syndrome screening protocol may lead to early identification of early-onset preeclampsia.

Intrauterine Growth Restriction
Until recently, the terminology used to describe intrauterine growth restriction (IUGR) has been inconsistent and confusing [69] and the term small for gestational age (SGA) has been used interchangeably with IUGR. An estimated fetal weight below the tenth percentile can alert the clinician to small fetal size but does not effectively differentiate between those fetuses who are small for pathological reasons and those that are constitutionally small but healthy. In an effort to differentiate between pathologically small and constitutionally small fetuses, the PORTO study [70] evaluated stricter criteria for the classification of IUGR. The authors found that those pregnancies more likely to have adverse pregnancy outcome or NICU admissions had estimated fetal weight <3rd percentile or abnormal umbilical artery (UA) Doppler compared to those pregnancies with normal UA Doppler and estimated fetal birth weight between the 3rd and 10th percentiles.
In general, studies of serum screening markers have used birth weight rather than estimated fetal weight to describe IUGR. There appears to be a tendency for extreme analyte values to be associated with more extreme low birth weight. D'Antonio et al. [60] recently summarized the literature on the association of PAPP-A with IUGR. The detection rate of IUGR when it was defined as birth weight below the fifth percentile [17,54,60][71][72][73] is generally greater than when it was defined as birth weight below the tenth percentile [16,17,54][71][72][73][74][75] (Figure 2). As a follow-up to the FASTER trial, Dugoff et al. evaluated the effectiveness of the serum markers in identifying IUGR, defined either as below the tenth percentile or below the fifth percentile for birth weight [17,18]. Table 1 shows that PAPP-A, AFP, hCG, uE3 and inhibin identify a greater percentage of pregnancies with birth weight below the fifth percentile than those with birth weight below the tenth percentile. Since the below the tenth percentile group contains all of the pregnancies below the fifth percentile, the difference between the two percentile cut-offs may not appear as significant. However, a comparison of those pregnancies with birth weight ≤5th percentile with those with birth weight between the sixth and tenth percentiles shows that the detection rates of the serum markers are significantly greater for the more extreme low birth weight group (Table 1). Spencer et al. [15] found that the pattern observed with the other serum markers was also true for second trimester free hCGβ. The authors found that the association of low second trimester free hCGβ (<0.5 MoM) was greater for birth weight below the third percentile (relative risk of 2.30) than for birth weight below the tenth percentile (relative risk of 1.98).
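The effect of the nested percentile groups can be made explicit with a short calculation. The following is a minimal sketch, assuming that pregnancies at or below the fifth percentile make up about half of all pregnancies below the tenth percentile (which holds only if the percentile reference matches the study population). The detection rates used are hypothetical placeholders, not values from Table 1, and the function name is illustrative.

```python
def band_detection_rate(dr_below_5th, dr_below_10th, frac_below_5th=0.5):
    """Detection rate implied for the 6th-10th percentile band.

    The below-10th group is a mixture of the below-5th group (fraction
    frac_below_5th) and the 6th-10th band, so:
        dr_below_10th = f * dr_below_5th + (1 - f) * dr_band
    Solving for dr_band gives the expression returned here.
    """
    f = frac_below_5th
    return (dr_below_10th - f * dr_below_5th) / (1 - f)


# Hypothetical example: 20% detection below the 5th percentile and
# 15% detection below the 10th percentile.
dr_band = band_detection_rate(dr_below_5th=0.20, dr_below_10th=0.15)
print(round(dr_band, 2))  # 0.1
```

In this hypothetical case the two nested definitions differ by only 5 percentage points, yet the implied detection rate in the 6th-10th band is half that of the below-5th group, which is why the band comparison shows the contrast more clearly.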
Roman et al. [76] used estimated fetal weight below the tenth percentile as a criterion for IUGR. In addition, the authors stratified the IUGR cases based on umbilical artery (UA) Doppler. Compared to pregnancies with normal growth, those pregnancies with IUGR and absent or reversed end diastolic velocity (AREDV) had serum marker levels that differed more markedly than those of pregnancies with IUGR and normal UA Doppler. In addition, using a combination of serum markers, 73% of the IUGR/AREDV cases were detected at a 5% false positive rate. By including maternal factors (history of chronic hypertension, lupus, pregestational diabetes and thrombophilia) in addition to serum markers, 91% of IUGR/AREDV cases were identified. Further studies verifying the authors' results could potentially lead to a direct screen for pregnancies at risk for IUGR with abnormal UA Doppler, a condition associated with significant morbidity, mortality and NICU admission.

Preterm Birth
In 2012, 9.9% of singleton births were preterm (<37 weeks) including 2.8% which were early preterm (<34 weeks) [77]. Identification of pregnancies at high risk for preterm birth based on short cervix can identify approximately one third of preterm births [43]. Recent randomized controlled trials have indicated that treatment with progesterone [41][42][43] or cervical cerclage [44] can significantly reduce preterm birth. Since only one third of early preterm births (<34 weeks) have a short cervix below 1.5 cm, the addition of biochemical and other biophysical markers may lead to a reduction in the incidence of preterm birth and, by extension, a reduction in perinatal morbidity and mortality.
Several studies have evaluated the association of PAPP-A with preterm birth. Table 2 shows that PAPP-A is more strongly associated with early preterm birth than with preterm birth. At a 5% false positive rate, early preterm birth was identified in 9%-15% of cases compared to 5%-9% in preterm birth [16,17,53,54,78]. The positive likelihood ratio of PAPP-A below the fifth percentile ranged from 2 to 3 for early preterm birth. Goetzinger [79] evaluated PAPP-A at a 10% false positive rate and, contrary to the other studies, found higher detection in the preterm birth group (24%) than in the early preterm birth group (20%). However, when maternal characteristics (African American race, body mass index, prior preterm birth, history of chronic hypertension, history of pre-gestational diabetes) were factored in, the detection rate was the same (38%) in both groups.
Table 2 notes: results from [79] are at a 10% false positive rate. † Includes maternal characteristics of African American race, body mass index, prior preterm birth, history of chronic hypertension and history of pre-gestational diabetes.
Dugoff et al. [18] evaluated the association of second trimester markers with early preterm birth ≤32 weeks. The detection (false positive rates) for AFP, hCG, uE3 and inhibin were 9% (5%), 11% (1.7%), 17% (6.0%), and 22% (3.1%). The combination of any two abnormal analytes had a sensitivity of 16% with a false positive rate of 2.9%, a positive likelihood ratio of 5.5 and a negative likelihood ratio (0 or 1 abnormal markers) of 0.87. The combination of both an elevated AFP and inhibin had the largest association with early preterm birth (Odds ratio = 20.37).
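The likelihood ratios quoted above follow directly from the sensitivity and false positive rate. The following minimal sketch shows the arithmetic using the figures given in the text for any two abnormal analytes; the function name is illustrative, and the single-marker ratios it also computes are derived here from the quoted rates rather than reported by the authors.

```python
def likelihood_ratios(detection_rate, false_positive_rate):
    """Return (positive LR, negative LR) for a dichotomous screening test."""
    lr_positive = detection_rate / false_positive_rate
    lr_negative = (1 - detection_rate) / (1 - false_positive_rate)
    return lr_positive, lr_negative


# Any two abnormal second trimester analytes: sensitivity 16%, false positive rate 2.9%.
lr_pos, lr_neg = likelihood_ratios(0.16, 0.029)
print(round(lr_pos, 1), round(lr_neg, 2))  # 5.5 0.87

# Single markers, (detection rate, false positive rate) as quoted in the text.
single_markers = {
    "AFP": (0.09, 0.05),
    "hCG": (0.11, 0.017),
    "uE3": (0.17, 0.060),
    "inhibin": (0.22, 0.031),
}
single_lr_pos = {
    name: likelihood_ratios(dr, fpr)[0] for name, (dr, fpr) in single_markers.items()
}
```

Derived this way, the single-marker positive likelihood ratios are roughly 1.8 for AFP, 6.5 for hCG, 2.8 for uE3 and 7.1 for inhibin, which puts markers reported at different false positive rates on a common scale.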
Data on the incidence of early preterm birth and short cervix can also be converted to likelihood ratios. Using the summarized data from Werner et al. [80], the likelihood ratio for birth before 34 weeks for cervix length <1.5 cm, 1.6-2.5 cm and >2.5 cm are 24.3, 2.5 and 0.9, respectively. In a separate study, Heath et al. estimated the likelihood ratio for birth before 32 weeks for cervix length <1 cm, 1-2 cm, 2-3 cm, 3-4 cm, 4-5 cm, 5-6 cm and 6-7 cm to be 51.52, 2.66, 0.71, 0.48, 0.24, 0.04 and 0.01, respectively [81]. These results are largely in agreement with those of Werner et al. [80]. Iams et al. determined that the risk of preterm birth in pregnancies with negative fibronectin, large cervical length and no history of preterm birth is 1% compared to 64% in pregnancies with positive fibronectin, small cervical length and history of preterm birth [82,83]. Although not currently routine practice, the data in these studies can be adjusted by incorporation of early biochemical marker likelihood ratios to further refine risks and improve detection of early preterm birth.
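The refinement of risk described above amounts to multiplying prior odds by one or more likelihood ratios. The following is a minimal sketch under stated assumptions: the prior is the 2.8% incidence of early preterm birth quoted earlier, the cervical length likelihood ratio of 24.3 is the value quoted from Werner et al. [80], and the biochemical likelihood ratio of 2.5 is a hypothetical value chosen from within the range of 2 to 3 reported for low PAPP-A. Multiplying the two ratios assumes that the markers are independent, which has not been established here, and the function name is illustrative.

```python
def update_risk(prior_risk, *likelihood_ratios):
    """Convert a prior risk to posterior risk by applying likelihood ratios to the odds.

    Applying several ratios in sequence assumes the underlying markers are
    conditionally independent (an assumption, not a finding of the cited studies).
    """
    odds = prior_risk / (1 - prior_risk)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)


PRIOR_EARLY_PRETERM = 0.028   # incidence of birth <34 weeks quoted in the text
LR_CERVIX_BELOW_1_5_CM = 24.3 # likelihood ratio quoted from Werner et al. [80]
LR_LOW_PAPP_A = 2.5           # hypothetical value within the reported range of 2 to 3

risk_cervix_only = update_risk(PRIOR_EARLY_PRETERM, LR_CERVIX_BELOW_1_5_CM)
risk_combined = update_risk(PRIOR_EARLY_PRETERM, LR_CERVIX_BELOW_1_5_CM, LR_LOW_PAPP_A)

print(round(risk_cervix_only, 3), round(risk_combined, 3))
```

With these inputs the risk rises from 2.8% to about 41% with a short cervix alone and to about 64% when the hypothetical biochemical ratio is added. The figures are illustrative only, since the prior and the likelihood ratios come from different populations.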

Fetal Loss
First trimester screening typically takes place beginning at 11 weeks. However, in some programs the blood sample for biochemistry testing is drawn prior to the ultrasound and may be collected as early as 9 weeks. As a result, data on fetal loss can be stratified into three timeframes: loss prior to nuchal translucency (NT) ultrasound, loss prior to 24 weeks gestation and loss after 24 weeks gestation.
Cuckle et al. [84] and Krantz et al. [85] evaluated the association of free hCGβ and PAPP-A with fetal viability at the time of the nuchal translucency exam. Cuckle et al. [84] examined 155 patients where blood was drawn prior to the NT examination and found nine patients with non-viable pregnancies at the time of the ultrasound. The median MoMs for PAPP-A and free hCGβ in the nine non-viable pregnancies were 0.25 and 0.78, respectively. Krantz et al. [85] found median MoMs of 0.47 and 0.42, respectively, in 55 patients experiencing fetal loss prior to the NT ultrasound compared to a control group of 6464 unaffected patients. Using the predicted odds from logistic regression, the detection rates at false positive rates of 1%, 3%, 5% and 10% were 29%, 44%, 49% and 60%, respectively [85].
A number of studies [17,62][86][87][88] have found that first trimester free hCGβ and PAPP-A are both associated with increased risk of early fetal loss between the time of the nuchal translucency exam and 24 weeks gestation. Dugoff et al. [87] evaluated a multiple marker approach to identify pregnancies at high risk for early fetal loss using a methodology similar to that used in calculating Down syndrome risk based on multivariate Gaussian distributions. The authors found that at a 5% false positive rate PAPP-A plus maternal characteristics (maternal age, body mass index, race, parity, previous loss prior to 24 weeks, preterm birth <37 weeks and threatened abortion) could detect 23% of early fetal loss. Including AFP and uE3 in the model resulted in a detection rate of 39%. Increasing the false positive rate to 10% resulted in a detection rate of 46%. Huang et al. [62] also evaluated a multiple marker approach to identify pregnancies at high risk for early fetal loss. In their model the combination of AFP, second trimester hCG and uE3 detected 42% of early fetal loss. Separately, Benn et al. [89] found that among patients with elevated AFP and simultaneously low uE3 (<0.7 MoM), there was a 9-fold increase in risk of fetal death.
Screening for late fetal loss does not appear to be as effective, with observed detection rates between 3% and 20% at a 5% false positive rate [87,88,90]. The most promising approach was reported by Dugoff et al. [87], who found that the combination of inhibin with maternal factors (body mass index and race) could lead to the detection of 20% and 29% of late fetal loss at false positive rates of 5% and 10%, respectively. Table 3 summarizes the performance of the various markers and their association with fetal loss prior to the nuchal translucency (NT) examination, between the NT examination and 24 weeks and after 24 weeks, sorted by detection rate.

Placenta Accreta
Placenta accreta is a life-threatening obstetric complication resulting from abnormal placental implantation. The risk of placenta accreta increases significantly with placenta previa and the number of previous cesarean deliveries [91]. Currently, ultrasound and MRI are the best tools to diagnose placenta accreta although they have generally been applied only to high-risk pregnancies [92]. Therefore, improving the ability to identify pregnancies specifically at high risk for placenta accreta with biomarkers could help improve the efficient use of these imaging tools [92][93][94][95]. Desai et al. found that high levels of PAPP-A were associated with increased risk of placenta accreta [49] and moreover that PAPP-A is not associated with placenta previa or previous cesarean. Similarly, Dugoff et al. [17] also showed that PAPP-A was not associated with previa. This indicates that PAPP-A could potentially be combined with the clinical evaluation of placenta previa and history of cesarean section to help identify those patients that are at high risk for placenta accreta. Using a continuous model, Desai et al. [49] found that a PAPP-A of 2 MoM was associated with a 2-fold increase in risk and that a PAPP-A of 3 MoM was associated with a 4-fold increase in risk. On the other hand, a PAPP-A of 0.5 MoM was associated with a 5-fold decrease in risk. These risk changes are equivalent to the change in risk of one or two additional cesarean deliveries or two fewer cesarean deliveries, respectively.
In the second trimester, Zelop et al. [50] and Kupferminc et al. [51] showed in small studies that elevated AFP (>2.5 MoM) was associated with accreta. Hung et al. [96] and Dreux et al. [97] found that second trimester levels of AFP and free hCGβ were elevated in accreta. When AFP was above 2.5 MoM the odds ratio for placenta accreta was 8-10. When free hCGβ was above 2.5 MoM the odds ratio was 4-8.
Further study into the development of algorithms encompassing multiple marker cross-trimester protocols, repeat marker testing, prior history of cesarean section and existence of previa is warranted.

Open Neural Tube Defects
The concept of prenatal screening began with the use of AFP for the detection of open neural tube defects (ONTDs) and evolved so that the main focus of serum marker screening is now chromosomal abnormalities [1][2][3][4][5][6][7][8][9][10][11]. Current advances in non-invasive cffDNA testing are not directed at the identification of pregnancies affected by ONTDs. A first trimester ultrasound is effective in identifying anencephaly [98] but less so in identifying open spina bifida [99] although newer techniques may improve detection [100].
A second trimester anatomy scan can be effective in identifying neural tube defects in specialized centers focused on high risk pregnancies [101]; however, it has been demonstrated to be less effective in general practice focused on low risk pregnancies [102]. Therefore, ACOG recommends that maternal serum AFP screening be offered to all pregnant women and that those found to be at high risk for ONTD may be offered specialized ultrasound examination to identify the defect [103]. The importance of prenatal screening and detection of such defects may be even more relevant now that fetal surgery offers the promise of improved outcome in certain cases of open spina bifida [46][47][48].
The serum screen for open neural tube defects is straightforward, with labs using either a 2.0 MoM or 2.5 MoM cut-off. The detection rate of open spina bifida is approximately 10 percentage points greater with a 2.0 MoM cut-off than with a 2.5 MoM cut-off [104,105]. In addition, there have been significant improvements in AFP assays since screening was first introduced with radioimmunoassay in the 1970s. As a result, the distribution of AFP is much narrower and thus the use of a 2.0 MoM cut-off can result in false positive rates of 2% or less [105]. Even though most laboratories report a patient-specific risk for open spina bifida, the reported risk values may not be accurate because many of the a priori risks incorporated into algorithms are based on incidence data collected prior to the implementation of folic acid dietary supplementation, which has significantly reduced the risk of neural tube defects [106].
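The link between a narrower AFP distribution and a lower false positive rate at a fixed cut-off can be illustrated with a simple model. The following is a minimal sketch, assuming that AFP MoM values in unaffected pregnancies are log-normally distributed with a median of 1.0 MoM, as is conventional in MoM-based screening. The two standard deviations are hypothetical values chosen only to contrast a wider and a narrower assay distribution and are not taken from the cited studies.

```python
from math import log10
from statistics import NormalDist


def false_positive_rate(cutoff_mom, sd_log10_mom):
    """Fraction of unaffected pregnancies at or above the cut-off.

    Assumes log10(MoM) in unaffected pregnancies is Gaussian with mean 0
    (median 1.0 MoM) and the given standard deviation.
    """
    unaffected = NormalDist(mu=0.0, sigma=sd_log10_mom)
    return 1 - unaffected.cdf(log10(cutoff_mom))


SD_WIDER = 0.20     # hypothetical SD of log10 MoM for a less precise assay
SD_NARROWER = 0.14  # hypothetical SD of log10 MoM for a more precise assay

fpr_wide_2_0 = false_positive_rate(2.0, SD_WIDER)
fpr_narrow_2_0 = false_positive_rate(2.0, SD_NARROWER)
fpr_wide_2_5 = false_positive_rate(2.5, SD_WIDER)

for label, fpr in [
    ("2.0 MoM, wider distribution", fpr_wide_2_0),
    ("2.0 MoM, narrower distribution", fpr_narrow_2_0),
    ("2.5 MoM, wider distribution", fpr_wide_2_5),
]:
    print(f"{label}: {fpr:.1%}")
```

Under these assumed values, tightening the distribution lowers the false positive rate at 2.0 MoM to below 2%, comparable to what the higher 2.5 MoM cut-off achieves with the wider distribution, so the lower cut-off can be used without a large false positive penalty.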

Conclusions
For adverse outcomes such as IUGR, preeclampsia and preterm birth the clinical presentation may vary widely with respect to maternal/fetal morbidity and mortality. The more severe forms of these adverse outcomes have significantly higher rates of severe morbidity and mortality. The information presented in this review indicates that there is improved performance of serum markers with respect to the more severe forms of various pregnancy complications. Moreover, the most severe cases tend to occur less frequently than the milder forms of these conditions (Table 4). As a result, studies which evaluate the association of biomarkers with a broad definition of a given condition may underestimate the ability of such markers to identify pregnancies that are destined to develop the more severe form of the condition. Therefore, more effort should be made to narrowly define specific adverse outcomes which may be identified by maternal serum markers. Using these narrowly defined outcomes, clinicians can decide whether screening is worthwhile based on incidence rates and clinical impact. Published performance data among different studies are often inconsistent due to discrepancies in a number of factors such as the definition and/or description of the severity of the condition, the marker cut-offs used and the maternal characteristics incorporated into risk algorithms. Optimally, a risk-based approach similar to that used in aneuploidy screening would be used for each disease state, incorporating a consistent definition of the disease state, continuous multiple marker likelihood estimates and consistent estimates of a priori risks based on maternal characteristics. Additionally, refinements to the risk based on follow-up assessments after the completion of serum screening could further improve the process.
Clinicians are faced with a difficult dilemma in which they must balance the potential benefits of non-invasive genetic screening while not losing sight of the potential pitfalls in missing other adverse outcomes especially since there now appears to be opportunity to improve those outcomes with effective treatments. Aspirin shows great promise if administered prior to 16 weeks in reducing the risk of preeclampsia, IUGR, preterm birth and fetal death. Some of the protocols described above include second trimester markers and would require completion by 16 weeks to maximize the benefits of aspirin administration. However, it is likely that the effectiveness of aspirin is not based on a simple dichotomy of <16 weeks and ≥16 weeks so aspirin may still be effective at 17-18 weeks even if less so than at 16 weeks. More research is needed to evaluate the association between effectiveness and time of initiation of aspirin treatment.
Moving forward, the goal should be to develop and implement high-performance direct screening protocols for specifically defined adverse outcomes. When evaluating the adoption of cffDNA testing for aneuploidy, clinicians should ensure that they continue to utilize existing screening protocols or new direct screens to identify pregnancies at risk for adverse outcomes. Otherwise, there may potentially be an increase in the overall morbidity and mortality in the population.

Author Contributions
David Krantz gathered references and wrote the first draft. Terrence Hallahan, Jonathan Carmichael and David Janik reviewed the literature and participated in drafting the final manuscript.