The accuracy of cell-free DNA screening for fetal segmental copy number variants: A systematic review and meta-analysis

Background: The performance of cell-free DNA (cfDNA) screening for microscopic copy number variants (CNVs) is unclear. Objectives: This was a systematic review and meta-analysis to investigate the sensitivity, specificity and positive predictive value (PPV) of cfDNA screening for CNVs. Search Strategy: Articles indexed in EMBASE, PubMed or Web of Science before November 2022 were screened for inclusion. This protocol was registered with PROSPERO (23 March 2021, CRD42021250849) prior to initiation. Selection Criteria: Articles published in English that detailed diagnostic outcomes for at least 10 high-risk CNV results with cfDNA were considered for inclusion. Data Collection and Analysis: The PPV was calculated and pooled with random-effects models for double-arcsine transformed proportions, using cases with diagnostic confirmation. Overall sensitivity, specificity and a summary receiver-operating characteristics (ROC) curve were calculated using bivariate models. The risk of bias was assessed using QUADAS-2.

is not suitable as a diagnostic test. 3,4 More recently, cfDNA screening panels have been expanded beyond common aneuploidies to include segmental copy number variants (CNVs), which encompass sub-microscopic deletions and duplications.
The phenotypical consequences of CNVs vary significantly, depending on size of the variant and the gene region involved, but range from completely benign to incompatible with life. 5,6 Several clinical syndromes have been identified as being attributable to specific chromosomal microdeletions, including 22q.11.2 syndrome (previously termed DiGeorge syndrome), 15q.11 microdeletion (Angelman) syndrome, and 5p- (Cri Du Chat) syndrome. [7][8][9] Individual CNVs, such as 22q.11.2 syndrome, are generally rare. 7 When considered as a collective, their frequency increases up to 6% in fetuses with anatomical anomalies identified on ultrasound. 10 They also occur unpredictably, with 90-95% of 22q.11.2 syndrome diagnoses attributable to de novo aberrations, which are more likely to be pathological. 7,11 This, in combination with the absence of traditional identifiable risk factors for fetal genetic anomalies, such as advanced maternal age, which has no correlation with CNVs, makes a reliable prenatal screening method for CNVs desirable. 12,13
To date, the performance of cfDNA screening for CNVs has been less than ideal. Several studies have documented a significantly lower positive predictive value (PPV) for CNVs compared with common aneuploidies. [14][15][16][17][18] Similarly, the sensitivity of cfDNA screening for CNVs appears suboptimal. 6 There is, however, little consensus regarding these estimates, with values for both PPV and sensitivity varying dramatically across studies. 19 This systematic review and meta-analysis aims to investigate the diagnostic accuracy of cell-free DNA screening for CNVs.

| METHODS
We conducted a systematic review of the literature and meta-analysis of diagnostic accuracy to assess cfDNA screening for fetal microscopic CNVs in the general obstetric population, using results from prenatal or postnatal cytogenetic diagnostic tests as validation. The protocol for this review was registered with the International Prospective Register of Systematic Reviews (PROSPERO) (23 March 2021, CRD42021250849), prior to its initiation, and results are reported in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement. 20

| Eligibility criteria
Studies eligible for inclusion were original research articles or abstracts reporting the performance of cfDNA screening for fetal CNVs in a pregnant population. Studies that reported fetal diagnostic confirmation for <10 high-risk results were excluded. 21,22 Animal studies, systematic reviews, meta-analyses, case reports and articles not published in English (except where English translation was available) were excluded.

| Information sources, search strategy and selection process
PubMed, EMBASE and Web of Science were searched from inception to November 2022. The full search strategy, which was designed to investigate cfDNA screening for conditions other than the common trisomies as part of a broader research project, is provided in Appendix S1. Study selection was conducted using Covidence systematic review software (Veritas Health Innovation). Each title and abstract were independently reviewed for inclusion by two investigators (MA, SB or YR), with a third investigator consulted in case of disagreement (DR, FC or IF). Full texts of potentially eligible articles were then reviewed in a similar process. References of included articles were also manually screened to identify articles potentially missed in the search.

| Data collection process and data items
Following full-text review, data were manually extracted by one investigator per article (MA, SB or YR). In instances in which desired data were missing, investigators attempted to contact corresponding authors via email on two subsequent occasions, after which articles were excluded if no response was obtained. Information extracted included author names, year of publication, publication title, country, study design, anomalies screened for, populations screened, proportion of high-risk cfDNA screening results with and without diagnostic follow-up, and reported screening accuracy.
Studies that reported pregnancy outcomes only for high-risk cfDNA results were considered case series; those that reported the performance of cfDNA screening using case and control groups of fetuses with previously determined karyotypes were classified as case-control studies; and studies that prospectively assessed pregnancy outcome for both high- and low-risk cfDNA screening results were classified as cohort studies.
The populations screened in each article were categorised as predominantly low-risk (<50% of individuals with high baseline risk of fetal CNV) or high-risk (≥50% of individuals with high baseline risk of fetal CNV). This dichotomy was used because the majority of studies provided only grouped statistics for the population screened, so we were unable to assess participant background risk as a continuous variable. Recognised risk factors for CNVs included high-risk serum or combined screening results, fetal anomalies on ultrasound examination, and prior history of chromosomal anomalies. Advanced maternal age was not considered a risk factor. 23

| Risk of bias assessment
Risk of bias assessment for each included study was conducted independently by two investigators (MA, SB or YR), with discordant results resolved by consensus, using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. 24 For the 'patient selection' domains, the aforementioned categorisation of high- versus low-baseline aneuploidy risk was used to determine bias and applicability risks. Additionally, any studies in which participant selection was contrived (not random or sequential) were deemed high-risk of bias.
In the 'index test' domain, risk of bias was classified as high when fetal karyotype was known prior to cfDNA screening. Studies in which the cfDNA screening methodology or platform was not specified were labelled unclear for both bias and applicability.
For the 'reference standard' domains, studies were deemed high-risk for bias and applicability if >10% of diagnostic confirmations were ascertained by chorionic villus sampling (CVS), as these could include confined placental mosaicism and our analysis pertains to fetal outcomes. Additionally, studies in which >10% of diagnostic investigations were conducted by karyotyping (as opposed to microarray or high-depth sequencing), which may not detect sub-microscopic anomalies, were considered at high risk of bias, and studies in which the cytogenetic methods were unspecified were considered at unclear risk of bias.
For the 'flow and timing' domain, the threshold for low risk of bias was arbitrarily set at ≥80% diagnostic follow-up rate.
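The reference-standard and flow-and-timing rules above can be written as a small decision function. This is only an illustrative sketch: the function names are hypothetical, the order in which "high" and "unclear" are checked is an assumption, and labelling a follow-up rate below 80% as "high" (rather than merely "not low") is also an assumption not stated in the text.

```python
def reference_standard_risk(cvs_fraction, karyotype_fraction=None):
    """Risk-of-bias label for the 'reference standard' domain.

    cvs_fraction: share of diagnostic confirmations obtained by CVS (0-1).
    karyotype_fraction: share of investigations done by karyotyping (0-1),
        or None when the cytogenetic method was not specified.
    """
    if cvs_fraction > 0.10:
        return "high"      # possible confined placental mosaicism
    if karyotype_fraction is None:
        return "unclear"   # cytogenetic method unspecified
    if karyotype_fraction > 0.10:
        return "high"      # karyotype may miss sub-microscopic anomalies
    return "low"


def flow_and_timing_risk(follow_up_rate):
    """Low risk when at least 80% of high-risk results had diagnostic follow-up."""
    # Assumption: anything below the threshold is labelled "high".
    return "low" if follow_up_rate >= 0.80 else "high"


print(reference_standard_risk(0.05, 0.02), flow_and_timing_risk(0.68))
```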

| Effect measures, synthesis methods and statistical analysis
Positive predictive values were calculated using results from case series and cohort studies, excluding case-control studies. The PPV was calculated as the number of true-positive cfDNA results validated by a diagnostic investigation, divided by the total number of high-risk results with diagnostic follow-up. Screening results with no genetic confirmation were excluded from the analysis. The PPV was calculated for CNVs overall, and individually for the most common CNV syndromes including 22q.11.2, 15q.11 microdeletion and 5p-.
To stabilise the variances, and because some studies had a PPV of 0% or 100%, estimates were transformed using the Freeman-Tukey double-arcsine transformation. The transformed proportions were then pooled with random-effects models using inverse-variance weights and the DerSimonian and Laird method to estimate the between-study heterogeneity.
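To make the pooling steps concrete, the following is a minimal pure-Python sketch of a double-arcsine transformation followed by DerSimonian and Laird random-effects pooling. It is not the code used in the review, which relied on R packages. The study counts are hypothetical, the halved form of the transformation with variance 1/(4n + 2) is one common parameterisation, and the simple sin² back-transformation is an assumption made for brevity (statistical packages typically use a more careful inverse).

```python
import math


def freeman_tukey(x, n):
    """Halved double-arcsine transform of x true positives among n confirmed results."""
    y = 0.5 * (math.asin(math.sqrt(x / (n + 1)))
               + math.asin(math.sqrt((x + 1) / (n + 1))))
    v = 1.0 / (4 * n + 2)  # approximate variance on the transformed scale
    return y, v


def pool_dersimonian_laird(studies):
    """Pool (true_positives, confirmed) pairs; returns (pooled_ppv, tau2)."""
    ys, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]                        # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]            # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return math.sin(pooled) ** 2, tau2               # naive back-transform (assumption)


# Hypothetical studies: (true positives, high-risk results with diagnostic follow-up)
example = [(12, 30), (5, 20), (0, 11), (40, 90)]
ppv, tau2 = pool_dersimonian_laird(example)
print(f"pooled PPV ~ {ppv:.3f}, tau^2 = {tau2:.4f}")
```

Note that the third hypothetical study has a PPV of 0%, which the transformation handles without a continuity correction.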
We assessed heterogeneity by calculating the I² statistic. To explore heterogeneity, univariable mixed-effects meta-regression models were fitted to the data using predictors including year of publication, diagnostic test follow-up rates and baseline population risk category. Random-effects meta-analyses of subgroups according to baseline population risk (≥50% or <50% high-risk) and diagnostic follow-up rate (≥80% or <80%) were also conducted to investigate subgroup effects. With the aim of investigating the impact of bias on PPV estimates, sensitivity analysis was performed including only studies considered to be at low risk in all four 'bias' domains of the QUADAS-2 tool.
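As an illustration of the heterogeneity statistic, the sketch below computes Cochran's Q and the I² statistic from study-level effects and variances using the familiar formula I² = max(0, (Q − df)/Q) × 100. The function name and inputs are hypothetical, and meta-analysis packages may derive I² from the estimated between-study variance instead, so values can differ slightly from those reported by such software.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, I2 in percent) for effects with known sampling variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variability attributed to between-study differences
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical transformed effects and their variances
q, i2 = cochran_q_and_i2([0.55, 0.70, 0.42, 0.81], [0.004, 0.010, 0.006, 0.012])
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```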
Sensitivity and specificity were calculated using results from case-control studies and cohort studies with diagnostic confirmation ascertained for both high- and low-risk screening results. A bivariate random-effects model was then used to estimate pooled sensitivity and specificity and create a summary receiver-operating characteristics (sROC) curve. The 95% confidence interval (95% CI) for the area under the sROC curve was estimated using 5000 bootstrap samples.
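For orientation, the per-study quantities that feed a bivariate model can be sketched as follows. The code computes sensitivity and specificity from a 2×2 table and their logit transforms; it does not fit the bivariate random-effects model itself. The function names, the example counts and the 0.5 continuity correction added to every cell are assumptions for illustration.

```python
import math


def sens_spec(tp, fn, fp, tn):
    """Sensitivity and specificity from a 2x2 table of screening vs diagnosis."""
    return tp / (tp + fn), tn / (tn + fp)


def logit_pair(tp, fn, fp, tn, correction=0.5):
    """Logit sensitivity and logit specificity with a continuity correction.

    The correction (an assumption here) avoids infinite logits when a cell is zero.
    """
    tp, fn, fp, tn = (cell + correction for cell in (tp, fn, fp, tn))
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    return math.log(sens / (1 - sens)), math.log(spec / (1 - spec))


# Hypothetical study: 77 true positives, 23 false negatives,
# 5 false positives, 995 true negatives
print(sens_spec(77, 23, 5, 995))
print(logit_pair(77, 23, 5, 995))
```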
Publication bias and small-study effects were investigated through inspection of a funnel plot and with Egger's test. Analyses were conducted with the packages 'metafor' and 'mada' in R, and p-values < 0.05 were considered statistically significant. 25
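One common formulation of Egger's test regresses the standardised effect (effect divided by its standard error) on precision (one over the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below implements that regression in plain Python and returns the intercept with its t statistic. It is not the routine used in the review, the data are hypothetical, and the p-value (from a t distribution with k − 2 degrees of freedom) is deliberately left out.

```python
import math


def egger_regression(effects, std_errors):
    """Return (intercept, slope, t_statistic_of_intercept) for Egger's regression."""
    x = [1.0 / se for se in std_errors]                  # precision
    z = [y / se for y, se in zip(effects, std_errors)]   # standardised effect
    k = len(x)
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    resid_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = resid_ss / (k - 2)                              # residual variance
    se_int = math.sqrt(s2 * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, slope, t_stat


# Hypothetical transformed effects and standard errors
b0, b1, t = egger_regression([0.62, 0.55, 0.70, 0.48, 0.81],
                             [0.05, 0.08, 0.12, 0.06, 0.15])
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, t = {t:.2f}")
```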

| Study characteristics
In total, 7845 search results were identified, of which 1862 were duplicates. After screening the remaining 5983 results, 63 articles satisfied the inclusion criteria. 12,[14][15][16]18, The study selection process is shown in Figure 1. Authors were contacted for seven articles with incomplete data; data were obtained for one 38 and six were excluded. [84][85][86][87][88][89] Included studies were published between 2015 and 2022. Across all included studies, 1 591 459 women underwent cfDNA screening for CNVs, with 5481 receiving a high-risk result (screen-positive rate of 0.34%). This screen-positive rate may be over or underestimated, however, as some studies (n = 9) did not report the total number of women screened (thus reducing the denominator), or the number of women who received a high-risk result without diagnostics (n = 8) (thereby reducing the numerator). Diagnostic results were available for 3737 pregnancies that screened high-risk for a CNV (68.2%), including 934 at high-risk for one of the deletion syndromes (22q.11.2 syndrome, n = 632; 15q microdeletion, n = 179; 5p-syndrome, n = 123).
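The headline proportions in this paragraph can be reproduced with simple arithmetic. The short check below uses only the counts stated above; the variable names are illustrative.

```python
# Counts taken from the study characteristics paragraph
search_results, duplicates = 7845, 1862
screened_women = 1_591_459
high_risk_results = 5481
with_diagnostics = 3737
deletion_syndromes = {"22q.11.2": 632, "15q microdeletion": 179, "5p-": 123}

records_screened = search_results - duplicates
screen_positive_rate = 100 * high_risk_results / screened_women
follow_up_rate = 100 * with_diagnostics / high_risk_results

print(records_screened)                    # records left after de-duplication
print(round(screen_positive_rate, 2))      # screen-positive rate, percent
print(round(follow_up_rate, 1))            # diagnostic follow-up rate, percent
print(sum(deletion_syndromes.values()))    # high-risk results for deletion syndromes
```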

| Positive predictive value
The PPV of cfDNA screening for CNVs was generated using results from 3715 women with diagnostic confirmation. Meta-analysis revealed a pooled PPV of 37.5% (95% CI 30.6-44.8). The forest plot for PPV is shown in Figure 3. On sensitivity analysis for PPV using only results from the six studies deemed low risk in all four bias domains, the PPV was 34.3% (95% CI 22.3-47.4; Figure S3).
There was substantial statistical heterogeneity, with an I² statistic of 93.9% (p < 0.01). Meta-regressions were conducted for baseline cohort risk, year of publication and diagnostic follow-up rate; however, none significantly explained heterogeneity (p = 0.477, 0.445 and 0.327, respectively).
The sensitivity and specificity of cfDNA screening for 22q.11.2 syndrome were not meta-analysed because these outcomes were reported by only three articles; however, reports ranged between 69.6% and 90.0% for sensitivity, and 99.7% and 100.0% for specificity. 39,60,76

4 | DISCUSSION

| Main findings
This meta-analysis revealed that the overall PPV of cfDNA screening for fetal CNVs is <40%. The probability of fetal confirmation is slightly lower among women with a low baseline risk of aneuploidy (32%), when estimates are generated using only results from studies with high rates of diagnostic confirmation (34%), and when only studies at low risk of bias are considered (32%), although these differences were not statistically significant. The sensitivity of cfDNA screening for CNVs is 77%, and the specificity is over 99%. Another major finding of this review is the substantial heterogeneity among included studies.

| Interpretation
It is probable that the pooled PPV of 38% observed in this meta-analysis is overestimated because of biases amongst included studies. We observed a reduction in PPV when only studies at low risk of bias were considered, and although this reduction was not statistically significant, it does suggest that well-conducted studies lead to lower PPV estimates. Two prominent sources of bias likely to cause overestimation of the PPV are baseline cohort risk of CNVs and diagnostic confirmation rate. Pre-screening probability of fetal CNV is higher among women with risk factors, which increases the PPV. 90 Similarly, women with higher suspicion of CNVs are more likely to undergo diagnostic testing, thus incomplete diagnostic follow-up may lead to a greater proportion of high-risk pregnancies in the follow-up cohort. 91

FIGURE 3 Forest plot of positive predictive value of cell-free DNA screening for fetal copy number variants.

None of the tested variables explained the substantial heterogeneity among studies. For baseline cohort risk, this may be due to our categorisation of 'high risk' versus 'low risk' using an arbitrary cut-off value of 50%. For diagnostic confirmation rates, heterogeneity may arise between studies with similar follow-up depending on the diagnostic methods utilised, as some cytogenetic investigations such as karyotype are less reliable in detecting CNVs, and CVS may provide a false indication of fetal involvement, particularly when the ultrasound is normal. There are also several other factors that may contribute to heterogeneity, including cfDNA screening platform, technology, depth of sequencing and range of sizes of detected anomalies.
Results for individual CNV conditions were markedly more uncertain than those for grouped CNVs, with wider confidence intervals. The pooled PPV for 22q.11.2 syndrome was also markedly higher than that of grouped CNVs. This is likely attributable to the small cohort sizes, in which even a small number of true-positive results have the power to inflate PPV. There was also significant potential for bias among these studies, with only one study investigating 22q.11.2 syndrome at low risk of bias, and no studies investigating either 5p- or 15q.11 microdeletion syndromes at low risk of bias.
The sensitivity observed is considerably lower than that of cfDNA screening for common trisomies, as almost one-quarter of CNVs may go undetected by cfDNA screening, based on our findings. 1 This is likely attributable to the smaller aberration sizes of many CNVs compared with whole-chromosome aneuploidies, in tandem with the cost and technology constraints on sequencing depth in commercial cfDNA screening. There was one prominent outlier, which reported a sensitivity of 46.1%, although this study consisted exclusively of high-risk results for microdeletion syndromes other than 22q.11.2. 77 The specificity revealed in this meta-analysis was high, despite the presence of one outlier, which reported a specificity of 87.5%. 75

| Comparison with previous studies
To the best of our knowledge, this is the first meta-analysis investigating the performance of cfDNA screening for CNVs. A previous systematic review was conducted by Familiari et al., in which cfDNA screening was investigated exclusively for microdeletions or microduplications, and notably only in large cohorts with >5000 women. Despite including fewer studies because of different search methodologies, the researchers found a PPV of approximately 40%, similar to that for CNVs in this study, based on the results of seven studies. 19 Similar to the results of our meta-analysis, this study was also affected by substantial heterogeneity, with PPV ranging between 29% and 91%.

| Clinical implications
A reliable screening method for CNVs is desirable, particularly for the CNV syndromes, as these conditions have profound impacts on health and development. Nevertheless, the results of this meta-analysis demonstrate that the clinical implementation of extended cfDNA screening panels should be approached with caution.
Based on the results of this analysis, approximately one-third of women who receive a high-risk result will have an affected fetus. It should, however, be stressed that this is likely an overestimate, given the high degree of heterogeneity and biases affecting included studies. While low PPV is expected in screening for rare diseases despite reasonable sensitivity and high specificity, the clinical implications of false-positive results are not negligible. These include significant parental anxiety and procedure-related risks of diagnostic investigations, which should not be overlooked. 2 Even when fetal anomalies are observed on ultrasound prior to screening, diagnostic testing is arguably a more appropriate investigation for these pregnancies, as cfDNA is only appropriate for screening.
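The statement that low PPV is expected for rare conditions follows from Bayes' theorem: PPV = sensitivity × prevalence / (sensitivity × prevalence + (1 − specificity) × (1 − prevalence)). The sketch below illustrates this relationship. The sensitivity of 0.77 is the pooled value reported in this review; the specificity of 0.995 and the prevalence values are assumptions chosen purely for illustration, not estimates from the included studies.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed specificity and prevalences, for illustration only
for prevalence in (0.0005, 0.001, 0.005, 0.02):
    ppv = ppv_from_prevalence(0.77, 0.995, prevalence)
    print(f"prevalence {prevalence:.4f} -> PPV {ppv:.3f}")
```

Under these assumed inputs, the PPV rises steeply with prevalence, which is consistent with the concern that cohorts enriched for high-risk pregnancies yield higher PPV estimates.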
While the PPV is considerably lower than that of common aneuploidies such as trisomy 21, expanded screening for CNVs is defensible in the event of successful identification of a clinically significant anomaly that would otherwise be missed prenatally. However, the sensitivity revealed in this meta-analysis suggests that almost one-quarter of CNVs may be undetected by screening. Additionally, the benefit of screening for these anomalies is often questionable even in the event of successful identification, as the clinical consequences of many CNVs are poorly understood. 5,6 This creates a challenge for clinicians to provide genetic counselling for largely unpredictable phenotypes.

FIGURE 4 Summary receiver-operating characteristics (ROC) curve of cell-free DNA screening in the detection of copy number variants. The grey triangles represent estimates from nine individual studies, the closed circle represents the pooled estimate, and the dotted ellipse represents the 95% confidence region.

| Strengths and limitations
The primary strength of this meta-analysis is the number of articles reviewed, with the inclusion of more than 1.5 million women screened. By pooling these results, we were able to obtain estimates of PPV, sensitivity and specificity with relatively high precision. However, in pooling estimates, we are limited by the quality of the included studies which, as demonstrated by our bias assessments, varied considerably.
Another limitation pertains to the arbitrary selection of cut-off values of 50% for high versus low baseline cohort risk, and 80% for diagnostic confirmation rate in the subgroup analyses. While these values were selected to best capture any potential differences between subgroups, it is possible that the division of studies in this way may conceal more subtle gradient effects.
Finally, an integral adjunct to prenatal serum screening in clinical practice is ultrasound examination, as these results have significant influence on pre-test probability and, in turn, PPV. We were limited in this study, as most included articles did not report ultrasound findings, and consequently we were unable to analyse the association between ultrasound results and screening performance. Similarly, while it is desirable to stratify screening performance by other variables such as the type/size of CNV detected and cfDNA screening technologies utilised, this was not possible, as information on such variables was often not reported.
The performance of cfDNA screening is substantially poorer for CNVs than for common trisomies, with substantial heterogeneity in the literature. Women should be informed about these limitations prior to expanded cfDNA screening, and the low PPV should be carefully considered when counselling women who receive a high-risk result for a CNV.

FUNDING INFORMATION
No funding was required or obtained for this research.

CONFLICT OF INTEREST STATEMENT
BWM is supported by an NHMRC Investigator grant (GNT1176437). BWM reports consultancy for ObsEva and Merck and travel support and research grants from Merck. MM is employed as a genetic counsellor at a private genetic testing provider. DLR has received research grants from NHMRC and Norman-Beischer Medical Research Foundation. The remaining authors declare no competing interests. Completed disclosure of interest forms are available to view online as supporting information.

DATA AVAILABILITY STATEMENT
The datasets and code supporting the current study have not been deposited in a public repository but are available from the corresponding author on request.