Endocervical sampling in women with suspected cervical neoplasia: a systematic review and meta-analysis of diagnostic test accuracy studies

–95; 799 women; 7 studies; low quality of evidence) for endocervical brush and 70% (95% confidence interval, 42–89; 761 women; 7 studies; low quality of evidence) for endocervical curettage. Overall pooled specificity was 73% (95% confidence interval, 36–93; 799 women; 7 studies; low quality of evidence) for endocervical brush and 81% (95% confidence interval, 56–94; 761 women; 7 studies; low quality of evidence) for endocervical curettage. The risk ratio for inadequate samples with endocervical curettage compared with endocervical brush was 2.53 (95% confidence interval, 0.58–11.0; P = .215; low-certainty evidence). Two studies reported on patient discomfort; one found less discomfort in the endocervical brush group, and the other found no difference. CONCLUSION: No statistically significant difference was found between endocervical brush and endocervical curettage in diagnostic accuracy, inadequate sampling rate, or adverse effects, based on low-quality evidence. Variation in the characteristics of the women and in the resulting diagnostic pathways limits external validity.


Introduction
Cervical cancer is the fourth most frequent cancer in women 1 and is a major cause of death worldwide, especially in low-income countries. 2 It is well-established that cervical cancer is caused by infection with human papillomavirus (HPV) in epithelial cells, and the virus is found in approximately 99.7% of reported cervical cancers. 3 HPV-induced cervical neoplasia is a precursor to cancer, gradually developing from mild cervical intraepithelial neoplasia (CIN1) to severe degrees (CIN3) and eventually to invasive cancer. 4 This natural history of the disease has led to the implementation of primary prevention with HPV vaccines and secondary prevention with screening programs.
Cervical neoplasia is mostly asymptomatic, but women may present with postcoital bleeding. The strategy for diagnosing cervical neoplasia is most commonly colposcopy, biopsies from the transformation zone, and endocervical sampling. Indication for these diagnostic tests may be abnormal cytology, positive HPV DNA testing, or symptoms (eg, postcoital bleeding). A positive finding (indicating high-grade neoplasia) on biopsies or endocervical sampling will result in further diagnostics, for example a colposcopy-guided conization. Two commonly used methods for obtaining an endocervical or cervical canal sample are the endocervical brush (EB) and endocervical curettage (ECC). 5 The EB is a collection device with a brush made of flexible plastic hairs that follow the contours of the endocervical canal, which provides a cytologic sample (single cells) from the endocervical surface. 6 Different EB sampling devices exist, with a combined brush and spatula device being the most effective for collecting endocervical cells. 7 ECC, on the other hand, is performed using a metal curette device that provides a histologic (tissue) sample.
Clinical practice varies considerably in the choice between these 2 sampling methods, and no previous review has compared them. A systematic review of their diagnostic accuracy is therefore warranted.

Objective
We aimed to systematically review the existing literature and perform a meta-analysis of diagnostic test accuracy (DTA) to provide estimates of the diagnostic accuracy of the EB and ECC, respectively, for the detection of cervical neoplasia in women with any indication for colposcopy, biopsies, and endocervical sampling. Furthermore, we investigated patient discomfort and inadequate sampling.

Materials and Methods
Eligibility criteria, information sources, search strategy
All diagnostic studies and randomized clinical trials (RCTs), regardless of setting, including women with abnormal cytology in cervical cancer screening and women with possible symptoms of cervical cancer undergoing colposcopy, biopsy, EB, or ECC sampling were included. The target condition was cervical neoplasia, and the reference standard was the final histologic result of either conization or hysterectomy. Furthermore, we also addressed the possible discomfort caused by either method.
We conducted a systematic search in the following 4 bibliographic databases on June 9, 2022, with no restrictions regarding language or date of publication: MEDLINE (Ovid), Embase (Ovid), the Cochrane Library, and CINAHL (EBSCO). The search strategy was originally developed by a medical librarian (H.S.L.) in collaboration with a subject advisor (L.K.P.) for a Danish national clinical guideline and was updated for the current review. To accommodate peer review, the original search strategy published in the protocol was modified to include studies without "colposcopy" as an obligatory search term, and >1000 more studies were screened with the search performed on June 9, 2022.
The modified search was a combination of "cervical" AND "neoplasia" AND "curettage," with the corresponding synonyms in subject headings and title or abstract. The full search strategy can be found in Appendix 1. The search results were imported into Covidence 8 for removal of duplicates and abstract screening.

Study selection
All studies were screened by 2 authors independently, and any disagreement was resolved through discussion.

Data synthesis
Two authors (B.B.B. and J.B.S.) extracted information about year, time period, author, country, funding, and threshold for treatment, and enough information was collected to create a 2-by-2 table (true positive, false positive, true negative, false negative).
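The 2-by-2 table described above determines a study's sensitivity and specificity directly. As a minimal sketch, assuming a hypothetical `TwoByTwo` helper and illustrative counts that are not taken from any included study, the per-study calculation could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study's 2-by-2 table (hypothetical helper, not from the review)."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    tn: int  # index test negative, reference standard negative
    fn: int  # index test negative, reference standard positive

    def sensitivity(self) -> float:
        # Proportion of diseased women correctly identified by the index test.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased women correctly identified by the index test.
        return self.tn / (self.tn + self.fp)

    def prevalence(self) -> float:
        # Proportion of women with the target condition per the reference standard.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.fn) / total


# Illustrative counts only; they do not come from any study in this review.
example = TwoByTwo(tp=40, fp=10, tn=30, fn=20)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
print(f"prevalence  = {example.prevalence():.3f}")
```

Pooling such per-study estimates in a DTA meta-analysis requires a dedicated statistical model, which this sketch does not attempt.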

Assessment of risk of bias
The same authors also individually performed bias assessment using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool 9 and individually rated the certainty of evidence using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) methodology. 10 A full review protocol was published on February 24, 2021, on the International Prospective Register of Systematic Reviews (PROSPERO) (CRD42021222406) before conducting this review. 11 We adhered to the Preferred Reporting Items

AJOG at a Glance

Why was this study conducted?
In suspected cervical neoplasia the cervical canal can either be sampled by endocervical brush or endocervical curettage. There is clinical variation in the diagnostic approach, and no previous review has compared the 2 modalities.

Key findings
We included 7 studies and found no statistical difference between the endocervical brush and curettage in sensitivity (81% vs 70%) and specificity (73% vs 81%). The studies had risk of bias and were small. No statistical difference was found in inadequate sample rate or adverse effects either.

What does this add to what is known?
Endocervical brush might be an alternative to cervical curettage in women with suspected cervical dysplasia, but because of low-quality evidence, our findings were not conclusive.
Systematic Review ajog.org
A sensitivity analysis was performed to validate the credibility of the results, excluding studies at high risk of bias in QUADAS-2 domains, as described in our protocol. A separate analysis was done for each domain in turn. Furthermore, a separate analysis stratifying by chosen study design was performed.

Study selection
We identified 2636 studies, and after removal of 1014 duplicates, 2 authors screened 1622 studies. We excluded 1506 studies after title and abstract screening and obtained 116 studies for full-text screening, 4 of which were from additional sources. Seven studies 16–22 were ultimately included in the meta-analysis. The studies were conducted from 1988 to 2013, with a total of 1097 women included in the final analysis. The study characteristics are shown in Table 1.

Study characteristics
The 7 studies were performed in the United States, Denmark, Turkey, and Canada and included between 62 and 388 women. Three studies were RCTs 18,20,21 and 4 were cohort studies, 16,17,19 of which the study by Boardman et al 17 had the sequence of index tests randomized (table–envelope).
In the RCT by Mogensen et al, 21 women undergoing colposcopy-directed biopsies were randomized to either EB or ECC. Women with ≥CIN2 underwent conization, and 131 of 180 women were included in the final analysis. Women with CIN1 or an inadequate sample underwent follow-up with biopsies in a crossover design where previous EB changed to ECC and vice versa. The RCT by Klam et al 20 randomized women to either EB or ECC, and both index tests were assessed by a single pathologist. Only 147 of 315 women with biopsy-verified disease underwent conization. In addition, in the RCT by Goksedef et al 18 women were randomized to either EB or ECC, and both index tests were assessed by a single pathologist. However, in this study only 33 of 208 women with biopsy-verified disease underwent cone biopsy.
In all 4 cohort studies participants had both index tests done and were compared with either conization or hysterectomy. One of the cohort studies included women with conization performed at a single center, 16 and the other included women with any indication for conization or hysterectomy. 17 In 3 RCTs and 1 cohort study, women with an abnormal cytology were included. 18–21 A considerable variation in the prevalence of cervical neoplasia was present in these studies, ranging from 6% 20 to 90%. 22 The results of sensitivity and specificity for the individual included studies are shown in Figure 2. Detection of ≥CIN2 was chosen as the cutoff value; however, some studies did use other cutoffs (Table 1). The highest sensitivity reported was 97% for the EB and 94% for ECC. 22 The highest specificity was 97% for the EB 20 and 97% for ECC. 17

Risk of bias of included studies
The QUADAS-2 scores are shown in Table 2. All studies were rated as having unclear risk of bias in patient selection because the index tests were assessed in 2 settings: one where women were undergoing conization (2 of the cohort studies 16,17 ) and one based on abnormal cytology referral (3 RCTs and 1 cohort study 18–21 ). However, in both settings only women who underwent conization (reference standard) were included in the final analysis. Four studies 16–18,22 were rated to have concerns of applicability because of a more selected patient group than our predefined target population, which included all women with an indication for colposcopy, biopsies, and endocervical sampling (patient selection; QUADAS-2, domain 1). The index tests were generally assessed to have low risk of bias, but there were some applicability issues related to our research question. Klam et al and Goksedef et al 18,20 were rated to have high concerns regarding applicability because they used the same criteria for both ECC and EB (interpretation by a pathologist), which does not correspond to clinical practice (index test; QUADAS-2, domain 2). Most studies had insufficient details on the reference standard. Klam et al, 20 Mogensen et al, 21 and Goksedef et al 18 were RCTs in a clinical setup, and therefore not all women had a conization or hysterectomy done. Klam et al 20 was rated as having high risk of bias and applicability concerns regarding the reference standard because women without a reference test conducted were classified as true negatives. The rest of the included studies were rated as having unclear risk of bias and applicability concerns regarding the reference standard because different diagnostic criteria were used for conization and hysterectomy across the studies (QUADAS-2, domain 3). The criteria are shown in Table 1.

Synthesis of results
We performed a DTA meta-analysis for summary sensitivity and specificity for all 7 included studies. The forest plots for our main analysis are shown in Figures 2 and 3. The sensitivity analysis and the Bayesian receiver-operating characteristic curves are shown in Appendices 2 and 3.
The sensitivity analysis did not alter the interpretation of the results; however, in the analysis sorted by study design, the sensitivities of the 2 methods were equivalent. In the sensitivity analysis on risk of bias, no studies were excluded in QUADAS-2 domain 1 (all studies had unclear risk of bias) and in domain 2 (all studies had low risk of bias). One study 20 was omitted in QUADAS-2, domain 3 because it was found to have high risk of bias given that women without a reference standard were assigned to the true negative category. Three included studies 18,20,21 were omitted in QUADAS-2, domain 4 because not all included women had the reference standard assigned. When sorted by study design, the 3 RCTs reported equivalent sensitivity for both EB and ECC (Appendix 2).
We also performed a meta-analysis of summary RRs for risk of inadequate samples. The overall RR was 2.53 (95% CI, 0.58–11.0; P=.215) for ECC compared with EB for inadequate samples (Figure 4).
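The reported RR, CI, and P value can be checked against one another. The following is a rough sketch, assuming (this is not stated in the text) that the interval is a symmetric Wald-type interval on the log scale; the function name `p_from_ratio_ci` is hypothetical. Under that assumption, the standard error of log(RR) can be recovered from the CI width and converted to a two-sided P value.

```python
import math
from statistics import NormalDist


def p_from_ratio_ci(ratio: float, lower: float, upper: float, level: float = 0.95):
    """Back-calculate SE, z, and two-sided P from a ratio and its CI.

    Assumes a symmetric normal-approximation interval on the log scale
    (an assumption of this sketch, not a statement about the review's model).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return se_log, z, p_two_sided


# Values as reported in the text: RR 2.53 (95% CI, 0.58-11.0).
se_log, z, p_value = p_from_ratio_ci(2.53, 0.58, 11.0)

# On the log scale the point estimate should sit midway between the bounds,
# so the geometric mean of the bounds should be close to the reported RR.
geometric_midpoint = math.sqrt(0.58 * 11.0)

print(f"SE of log(RR) = {se_log:.3f}, z = {z:.3f}, P = {p_value:.3f}")
print(f"geometric midpoint of CI = {geometric_midpoint:.2f}")
```

Under these assumptions the recovered P value lands close to the reported P=.215, and the geometric midpoint of the interval is close to 2.53; small discrepancies are expected from rounding of the published figures.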
Two studies reported on discomfort, 18 We rated the evidence using the GRADE tool for diagnostic testing. 10 We downgraded the quality of evidence from high to low: one level for inconsistency, because the studies differed in design and in their criteria for performing either hysterectomy or conization, and another level for indirectness, because of differences in the included study populations, and for imprecision, because of the wide CIs of the estimates. The main results and the gradings are shown in Tables 3 and 4.

Main findings
We analyzed the results of 7 studies reporting the DTA of EB and ECC, and including data for 799 women undergoing EB and 761 women undergoing ECC.
The overall sensitivity was slightly higher for EB than for ECC (low-certainty evidence), whereas specificity was slightly higher for ECC than for EB (low-certainty evidence). Neither difference was statistically significant.
Furthermore, a nonsignificant increased risk of inadequate samples with ECC compared with EB was found, and no clinically relevant difference in discomfort between the tests was found.
As reported, EB was found to have a nonsignificantly higher sensitivity but lower specificity than ECC. It has been argued that because of this lower specificity EB is not as valuable as ECC as a diagnostic test, but is rather a superior screening tool. 16 Assuming a cervical neoplasia prevalence of 50% for women undergoing endocervical sampling, and using our pooled sensitivity and specificity estimates for EB and ECC, EB would detect 6 more cases of cervical neoplasia relative to ECC for every 100 women undergoing endocervical sampling (Tables 3 and 4). However, EB would also lead to 4 more false-positives compared with ECC. Although conization is considered safe, it is still associated with side effects such as increased perinatal mortality owing to preterm delivery, stenosis, and a very small risk of excessive bleeding. 23–26 The above estimates are based on previous testing and chosen populations in the included studies. The diagnostic accuracy may differ in a population with, for instance, only endocervical high-grade squamous intraepithelial lesion (HSIL) cytology and negative biopsies. A concern of too many false-positives (too low specificity) may limit the use of EB as a sole diagnostic test in this population for referral to a diagnostic conization. Accuracy in this situation must be addressed empirically because it was only in the study by Hoffman et al 19 that endocervical HSIL in itself elicited a diagnostic conization. The diagnostic pathway for these women differs globally. In the Australian and Canadian guidelines, for women with negative colposcopy-guided biopsies but with HSIL-positive cytobrush samples (and HPV-DNA–positive samples in Australia), a diagnostic excisional procedure should be considered. 27,28 However, in other countries, Denmark included, this is not the case, and isolated HSIL only warrants a follow-up examination. 5 It has been suggested that ECC may be more sensitive for disease in the endocervical canal because its sampling technique is more precise than EB, and the high sensitivity and lower specificity of EB could be related to ectocervical disease (contamination).
Damkjaer. Endocervical sampling in suspected cervical neoplasia. Am J Obstet Gynecol 2022.
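The comparison per 100 women at an assumed prevalence of 50% follows from simple arithmetic on the pooled point estimates (sensitivity 81% vs 70%, specificity 73% vs 81%). As a small sketch, with a hypothetical helper name `expected_counts` and ignoring the uncertainty around the pooled estimates, the calculation could be reproduced as follows:

```python
def expected_counts(prevalence: float, sensitivity: float, specificity: float, n: int = 100):
    """Expected true positives and false positives among n women.

    Uses point estimates only; the wide confidence intervals reported
    in the review are deliberately ignored in this sketch.
    """
    diseased = n * prevalence
    healthy = n - diseased
    true_positives = sensitivity * diseased
    false_positives = (1 - specificity) * healthy
    return true_positives, false_positives


PREVALENCE = 0.50  # assumed prevalence stated in the text

eb_tp, eb_fp = expected_counts(PREVALENCE, sensitivity=0.81, specificity=0.73)
ecc_tp, ecc_fp = expected_counts(PREVALENCE, sensitivity=0.70, specificity=0.81)

extra_cases_detected = eb_tp - ecc_tp    # additional true positives with EB
extra_false_positives = eb_fp - ecc_fp   # additional false positives with EB

print(f"EB:  {eb_tp:.1f} true positives, {eb_fp:.1f} false positives per 100 women")
print(f"ECC: {ecc_tp:.1f} true positives, {ecc_fp:.1f} false positives per 100 women")
print(f"EB vs ECC: {extra_cases_detected:.1f} more cases, "
      f"{extra_false_positives:.1f} more false positives")
```

With these inputs the difference is about 5.5 additional detected cases and about 4 additional false-positives per 100 women, consistent with the rounded figures of 6 and 4 given in the text. Changing `PREVALENCE` shows how strongly this trade-off depends on the population being tested.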
A modified sleeve technique has been proposed to increase the specificity of the EB by guiding and shielding the EB into the endocervical canal, thereby minimizing ectocervical contamination. 29 However, in the studies by Hoffman et al, 19 Mogensen et al, 21 and Klam et al 20 looking at verified endocervical disease, EB was still reported to have higher sensitivity for endocervical disease than ECC. In a study by Zahn et al, 30 the interobserver agreement in the interpretation of ECC by blinded pathologists was only moderate, which is a potential concern for the reliability of ECC.
In 2 of the included studies, 16,20 1 and 7 inadequate samples, respectively, were registered as true-negatives instead of being excluded from the analysis, as was done in the other studies. This misclassification could lead to a falsely lower sensitivity but higher specificity for ECC. Yet, because the 2 studies had few inadequate samples and the remaining 5 included studies excluded inadequate samples, this is not likely to be the reason for the diverging sensitivities of EB and ECC. In light of these findings, a recent study by Mihaljevic et al 31 has reported in a subanalysis that ECC samples with abundant material had higher sensitivity than those with scant material. We would have liked to have included this study, but some errors and inconsistencies were found in their 2×2 tables, and the corresponding author did not reply to our e-mail. Another study found but not yet published was an RCT by Gonzalez et al 32 that was presented in a conference paper in 2019 and has not undergone peer review. They did not publish the 2×2 tables, and we have not been able to contact the authors by e-mail.
Only 2 studies reported women's discomfort. One study did not find any difference, and the other found a slightly higher visual analog scale (VAS) score for ECC, but a VAS of 2.55 vs a VAS of 1.99 is not considered clinically relevant. 33 Our results are in accordance with the RCT from Undurraga et al, 34 reporting no difference in women's preference for the 2 procedures.

Strengths and limitations
The strengths of this review include a prespecified protocol, comprehensive bias assessment using the QUADAS-2 tool, and a thorough literature search conducted by an expert research librarian.
This review had some limitations, including a low number of studies and participants. DTA studies vary in study design because no clearly superior study design has been proposed. Therefore, the included studies varied in design between RCT and diagnostic cohort studies, which caused heterogeneity. Another limitation is potential selection bias because only women who underwent a conization and/or hysterectomy were included in the final analysis of all but 1 study. However, these diagnostic procedures are considered the gold standard and as such are necessary for proper diagnostic accuracy measurements. As mentioned, the diagnostic pathway for women varies globally, leading to variable prevalence of cervical neoplasia in the included studies. This variation needs to be addressed in future studies. Furthermore, some of the studies were >20 years old, and although no major technical breakthrough regarding endocervical samples has emerged, liquid-based cytology is now, for instance, the most used type of analysis, and HPV-DNA status is now also used as an adjunct to colposcopy referral.
Comparison with existing literature
Our results are in accordance with the European Guidelines for Quality Assurance in Cervical Cancer Screening, Second Edition, 35 and a review focusing on ECC by Driggers et al, 36 reporting higher sensitivity for EB than for ECC, although also nonsignificant, leading to a higher detection rate for EB.
A real-life evidence study 37 supported the use of ECC, especially in women aged >50 years. The finding that ECC may be more useful in older women is also supported by Gage et al. 38 This may be because of a higher risk of inadequate biopsies in elderly women because the transformation zone is retracted with age into the cervical canal. ECC then might be more useful because it gives a histologic evaluation of the endocervical canal. However, these studies did not compare ECC with EB. A study also examined combining EB and ECC for endocervical sampling, 39 reporting no difference in the amount of material obtained from using both sampling techniques. However, their finding was in direct contrast to the findings of the newer RCT by Undurraga et al. 34 More high-quality studies are needed to further address this idea of combined sampling.
Our results are in accordance with the RCT by Undurraga et al, 34 which reported a higher risk of inadequate samples with ECC than with EB (14.3% vs 2.0%; P=.002), and we also reported a higher, albeit nonsignificant, risk of inadequate samples using ECC.

Conclusion and implications
No statistically significant difference was found between EB and ECC in diagnostic accuracy, inadequate sampling rate, or adverse effects, based on low-quality evidence. Variation in the characteristics of the women and in the resulting diagnostic pathways limits external validity.