Cervical cancer screening in low- and middle-income countries: A systematic review of economic evaluation studies

Highlights
• Screening and early detection programs are a cornerstone of cervical cancer prevention.
• Human papillomavirus DNA-based testing reduced the incidence of cervical cancer (CC) at a cost below per capita GDP.
• Cost-effectiveness of HPV testing versus cytology in low- and middle-income countries.


Introduction
In November 2020, the Global Strategy for the Elimination of Cervical Cancer was launched during the World Health Assembly. According to the WHO, if effective actions are not taken, the number of cases of cervical cancer (CC) could increase to 700,000 by 2030, and the number of deaths could reach 400,000 each year in the next decade. 1 Vaccination, screening, and treatment are the cornerstones of the strategy, to which 194 countries have committed. The proposed global targets are 90% coverage of Human Papillomavirus (HPV) vaccination in girls under 15, 70% coverage with HPV testing among women aged 35 to 45 years old, and 90% coverage of treatment, including palliative care. 1
The main cause of CC is HPV. The disease mainly affects women over 30 years old, with a peak incidence between 45 and 50 years of age. It develops slowly and has a long phase before becoming invasive. If diagnosed in the early stages through adequate screening and early detection, it is treatable and more likely to be cured, and treatment is cheaper for the health system. 3
The Pap smear test (Papanicolaou test) is one of the screening methods used to detect CC. Its main advantage is its low cost. 4 In countries where this method is widely offered in organized public health programs, cervical cytology has significantly reduced incidence and mortality, particularly where the program achieves high target population coverage and has control and quality assurance in place. 5 One of the limitations of cytology-based screening is its low sensitivity for the detection of precursor lesions (cervical intraepithelial neoplasia grade 2 or higher [CIN2+]) compared to HPV testing. 6 Another limitation is the complexity of the logistical and care infrastructure needed to implement quality control and carry out the appropriate clinical management of women with positive screening results.
For these reasons, cervical cytology screening has not yet reached high population coverage in low- and middle-income countries, where it usually occurs opportunistically. 7,8 The discovery that persistent infections with a few genetically related HPV types cause virtually all cases of CC has led not only to vaccine development but also to HPV testing. Testing for high-risk (carcinogenic) HPV types is more sensitive than cytology, allowing for greater safety and longer screening intervals. 9 The implementation of screening and early detection programs is one of the cornerstones of cancer prevention. Despite evidence that early detection saves lives, global disparities in access to services persist. Economic assessments are relevant to support the decision to incorporate and implement the most cost-effective strategies available to reduce female mortality from CC, especially in low- and middle-income countries. 10 For low- and middle-income countries, which often face budgetary constraints in their health systems, achieving the goals proposed by the WHO requires investments in cost-effective interventions. Thus, it is necessary to consolidate evidence and optimize the distribution of resources in these settings.
This Systematic Review (SR) aimed to analyze the cost-effectiveness of CC screening strategies by comparing molecular tests for HPV with the Pap smear test in women from low- and middle-income countries.

Methods
This SR was performed according to the guidelines of the Center for Reviews and Dissemination (CRD) 11 and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist. 12 The protocol was previously registered in PROSPERO (CRD42020208135). 13

Eligibility criteria
Eligible studies were economic evaluations that reported cost-related results, such as the Incremental Cost-Effectiveness Ratio (ICER), Incremental Cost-Utility Ratio (ICUR), cost difference, or incremental costs, and measures of effectiveness, such as Years of Life Lost (YLL), Years of Life Saved (YLS), Quality-Adjusted Life Years (QALY), and Disability-Adjusted Life Years (DALY), comparing HPV DNA-based testing for cervical cancer screening with conventional cytology in women from low- and middle-income countries, without age, language, or publication date restrictions.
Studies were excluded if they were performed with populations from high-income countries, hysterectomized women, or HIV-positive women, or if screening was performed by visual inspection of the cervix, liquid-based cytology, or computer-assisted automated cytology. Publications that were not economic studies (clinical studies, systematic reviews) or that were preliminary studies (conference abstracts) were also excluded.

Sources of information and searches
The search for economic evaluation studies was carried out in August 2020 in the MEDLINE databases via PubMed, EMBASE, CRD (Centre for Review and Dissemination), and Latin American and Caribbean Literature in Health Sciences (LILACS) via the Virtual Health Library (BVS). Also, a manual search was performed in the reference list of the selected publications. In addition to terms related to screening methods, search strategies included terms related to cost-effectiveness (human papillomavirus tests; HPV test; Papanicolaou test; pap smear; economic evaluation) and were made available in the supplementary material (Chart 1).

Study selection
The selection was made by two researchers independently, who initially read titles and abstracts and then evaluated the full texts, using the Rayyan program 14 15 to exclude duplicates and merge records from different databases. Disagreements were discussed and solved by consensus or, eventually, by a third researcher.

Data extraction
Double-blinded data extraction was performed using a form previously prepared in an Excel spreadsheet (Microsoft Corp., Redmond, WA). Information extracted from selected articles included study characteristics (author and year of publication, location and economic classification according to the World Bank, type of economic study and modeling, time horizon, payer perspective, and discount rate), population characteristics, screening strategies used, and results of cost and effectiveness measures.
To enable comparisons between studies carried out in different years and countries and reported in different currencies, and to account for the effects of inflation, the ICER measures were converted to international dollars and updated to the year 2019. In this process, the authors used the FXTOP 2001-2020 tool, 16 a site for currency conversion and historical exchange rates, and the purchasing power parity tables provided by the World Bank. 17 A descriptive synthesis, including the analytic approach of the studies contextualized by geographical region and country income, was presented in tables.
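The adjustment described above can be sketched as a two-step calculation: inflate the reported value to 2019 prices, then convert it with a purchasing power parity (PPP) factor. This is only an illustration under stated assumptions, since the exact procedure followed with FXTOP and the World Bank tables is not detailed here. The function name, the use of a consumer price index ratio for inflation, and all numeric inputs are hypothetical.

```python
def to_2019_international_dollars(value_local, cpi_study_year, cpi_2019, ppp_2019):
    """Convert a local-currency value to 2019 international dollars.

    Assumed procedure (not confirmed by the review):
    1. inflate the value to 2019 prices with a consumer price index ratio;
    2. divide by the 2019 PPP conversion factor
       (local currency units per international dollar).
    """
    value_2019_local = value_local * (cpi_2019 / cpi_study_year)
    return value_2019_local / ppp_2019


# Hypothetical inputs, not taken from any included study.
icer_2019 = to_2019_international_dollars(
    value_local=12_000.0,   # ICER in local currency, study-year prices
    cpi_study_year=80.0,    # local price index in the study year
    cpi_2019=100.0,         # local price index in 2019
    ppp_2019=2.5,           # local currency units per international dollar
)
print(f"ICER in 2019 international dollars: {icer_2019:.2f}")
```

Inflating first and converting second keeps the whole adjustment in the study's own currency until the final step; other orders of operations are possible and can give slightly different results.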

Assessment of the report of economic evaluation studies
As there was no tool to assess the risk of bias in economic evaluations, the Consolidated Health Economic Evaluation Reporting Standards 18 (CHEERS) checklist was used as an instrument to verify whether the studies included in the SR contained the information considered essential.

Classification of studies according to dominance ranking
The Dominance Ranking Matrix (DRM) system, developed by the Joanna Briggs Institute, 19 was used to summarize and interpret the results of the economic evaluations. The DRM provides three classification options: strong dominance for the intervention, weak dominance for the intervention, and no dominance for the intervention. This hierarchical dominance matrix allows a visual summary of various economic analyses with different outcome measures (e.g., cost-effectiveness, cost-utility, cost-benefit) that could not otherwise be combined in a quantitative meta-analysis. 20,21 The authors emphasize that, although the DRM is not a method of quantitative synthesis, its hierarchical structure allows the level of dominance of an intervention to be interpreted based on the assessment of costs and health outcomes in a study.
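As a rough illustration of how the three DRM bands can be assigned, the sketch below classifies an intervention from the sign of its cost difference and effectiveness difference relative to the comparator. The rule follows the way this review describes the bands (lower cost with greater effectiveness is strong dominance; greater effectiveness at a higher cost is weak dominance; no gain at a higher cost is no dominance). The treatment of the remaining cells of the three-by-three matrix, the function names, and the tolerance are assumptions made for the example.

```python
def sign(x, tol=1e-9):
    """Return -1, 0, or +1 for a difference, treating tiny values as zero."""
    return (x > tol) - (x < -tol)


def classify_dominance(delta_cost, delta_effect):
    """Classify an intervention against its comparator.

    delta_cost   = cost(intervention) - cost(comparator)
    delta_effect = effect(intervention) - effect(comparator)
    """
    c, e = sign(delta_cost), sign(delta_effect)
    # Not more costly and not less effective, with at least one improvement.
    if c <= 0 and e >= 0 and (c, e) != (0, 0):
        return "strong dominance"
    # Cost and effect move in the same direction: a trade-off to be judged.
    if c == e:
        return "weak dominance"
    # More costly without more benefit, or less benefit without savings.
    return "no dominance"


# Hypothetical differences, for illustration only.
print(classify_dominance(delta_cost=-12.0, delta_effect=0.02))
print(classify_dominance(delta_cost=30.0, delta_effect=0.05))
print(classify_dominance(delta_cost=30.0, delta_effect=0.0))
```

Each band covers three of the nine cells, so the function only needs the direction of the two differences and not their size.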

Results
Four hundred sixty-seven potentially relevant publications were retrieved from the electronic databases, and 8 were identified through a manual search in the reference lists. After excluding duplicates (n = 34), 441 publications were chosen for the initial selection, of which 370 references were excluded. Seventy-one articles were selected for full reading, and of these, 15 were included in the SR (Fig. 1). The excluded studies, together with the reasons for their exclusions, were included in the supplementary material (Chart 2). Thirteen Cost-Effectiveness Analyses (CEA), one cost-minimization, and one cost-benefit analysis were found. The CEAs were complemented with budget impact analyses in 2 studies (Table 1). The analyses considered the health system perspective (n = 7), 23,24,27,32,33,35,36 followed by the patient and health system perspective (n = 6) 25,26,28,30,31,34 and the payer perspective (n = 2). 22,29 The predominant analytical models were the Markov (n = 7) 22,28,29,30,33,35,36 and Decision Tree models (n = 5). 23,24,27,32,34 Microsimulation models (n = 2) 25,26 and the Semi-Markov model (n = 1) were also used. 31 Table 2 summarizes the main results of the economic analysis studies. Cost-effectiveness values varied between studies due to the different strategies evaluated regarding age (35, 40, 45 years) and frequency of screening (1, 2, 3 times throughout life).
The outcome measures used were Years of Life Saved (YLS), Quality-Adjusted Life Years (QALY), and detected cases of high malignancy precursor lesions or cervical cancer.
Cost measures based on the perspective adopted included: costs per woman screened/100,000, lifetime costs, direct costs (medical and nonmedical), and programmatic costs. Only 4 studies showed results in international dollars, 25,28,32,33 the others used US dollars 22,23,26,27,30,31,36 or local currencies. 24,29,34,35 Regarding the cost-effectiveness threshold, most studies (n = 10) used the method recommended by the Commission on Macroeconomics and Health of the World Health Organization, 37 which establishes the value of the Gross Domestic Product (GDP) per capita of the country as a parameter to determine whether a technology is considered cost-effective (Table 2).
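To make the threshold rule concrete, the sketch below computes an ICER for a new strategy against a comparator and checks it against GDP per capita. The costs and effects are hypothetical, the function names are illustrative, and the `multiplier` argument is an assumption added so that other multiples of GDP per capita can be tried. The value of USD 1,702 is the lowest GDP per capita reported among the countries in this review.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per additional unit of effect (e.g., per YLS or QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when effectiveness is equal")
    return delta_cost / delta_effect


def is_cost_effective(icer_value, gdp_per_capita, multiplier=1.0):
    """Compare an ICER with a GDP-per-capita threshold (multiplier is assumed)."""
    return icer_value <= multiplier * gdp_per_capita


# Hypothetical per-woman lifetime costs (USD) and life years.
value = icer(cost_new=120.0, cost_old=90.0, effect_new=20.25, effect_old=20.0)
print(f"ICER: USD {value:.0f} per life year saved")
print("Below threshold:", is_cost_effective(value, gdp_per_capita=1702.0))
```

The ratio is only informative when the new strategy is both more effective and more costly; when it is cheaper and more effective it dominates outright, and no threshold comparison is needed.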
In 7 studies, the authors chose to present the cost-effectiveness threshold in US dollars. 22,26,30,33-36 Only 2 studies presented values in international dollars. 25,27 In one publication, the value was presented in local currency; 29 in another, willingness to pay was used as a measure, 23 and 3 studies provided no information. 24,27,31 As the per capita income of the included countries is relatively low (low- and middle-income countries, with GDP between USD 1,702 and USD 6,631 or Int$ 1,005 and Int$ 9,486), the HPV DNA test was considered cost-effective in different scenarios.
The sensitivity analysis performed in most studies was deterministic (n = 10); 24,26-30,33-36 only 3 studies presented a probabilistic analysis, 23,25,31 and 2 studies did not report one. 22,32 Among the parameters analyzed, those that significantly impacted the results were the sensitivity and specificity of the competing tests, the costs of HPV DNA testing and Pap smears, age-specific HPV prevalence, and cervical cancer incidence rates. Fig. 2 summarizes the items reported in the studies included in the SR, according to CHEERS. 18 It is important to highlight that this checklist is intended to guide the reporting of economic evaluation studies, not to assess their methodological quality. It was used to examine the 24 items that ideally should be included in publications on economic evaluation in health.
For all economic evaluations, data on study perspectives, comparators, costs and outcomes, findings, and limitations were reported; however, analytical methods were the least reported item.
Six of the studies included (40%) 22,28,30,33,35,36 presented transition probabilities; only one study explained the calculation of these probabilities or whether cycle correction was used, which are fundamental aspects in the development of Markov models. Only one study 25 (6.67%) reported the use of some calibration method.
None of the studies presented justification for the duration of the cycle, which must be based on the natural history of the disease.
Although the presentation of the reasons for choosing the specific type of decision model used is recommended, none of the authors stated their reasons. The description of the models and presentation of the figure or analytical scheme is in Fig. 2.
Results of applying the CHEERS 18 to each study are found in the supplementary material (Chart 3).
As for the performance of the screening tests based on accuracy measures, in all economic evaluations, both cervical cytology and HPV DNA tests (rapid test and hybrid capture) showed good specificity. Regarding sensitivity, there was a difference between the HPV-DNA tests and cervical cytology and within cervical cytology (ranging from 58.4% to 72%) across the studies (supplementary material, Chart 4). Table 3 shows the interpretation of the results of economic assessments by the classification of the JBI Dominance Matrix, except for two studies: the evaluation carried out by Nahvijou et al. 32 because it was a cost-minimization study and the study of Levin et al., 30 as it did not present data for the analysis.
The dominance interpretation varied between the studies analyzed (n = 13) or within the same study, depending on the strategies compared. 19 In 6 studies, 23,25-27,33,35 the HPV test was dominant compared to conventional cytology, meaning that the decision would favor incorporating the new test because it costs less, bringing savings to the health system, while increasing effectiveness.
Another five studies 22,28,29,35,36 showed a weak dominance of the HPV test against cervical cytology, showing greater effectiveness but with a higher cost. In this case, more information is needed on the priorities and preferences of decision-makers, such as ICER values and country cost-effectiveness thresholds.
In particular, the study by Nahvijou et al. 33 showed that the HPV test had strong or weak dominance, respectively, compared to conventional cytology, depending on the effectiveness measure used, QALY or YLS.
Finally, two studies 24,31 showed that the HPV test was not dominant over conventional cytology, a result unfavorable to its incorporation, as it showed no difference in clinical effectiveness and a higher cost. It is noteworthy that in the study conducted by Mandelblatt et al., 31 for one of the strategies evaluated (screening every ten years in women starting at 35 years of age and up to 55 years of age), the effectiveness of the HPV test was lower.

Discussion
This SR analyzed economic assessment studies used to examine the value and performance of testing for CC screening in women from low- and middle-income countries. Of the 15 studies included, the majority were conducted in upper-middle-income countries (71%), underscoring the need for local modeling studies in low- and lower-middle-income countries.
Most of the total economic evaluations were cost-effectiveness analyses (13 studies), in line with another review study. 38 However, the model selection, the analysis perspective, and the comparative screening strategies varied between the studies and revealed a specific methodological heterogeneity.
The societal perspective is generally recommended because it is the most comprehensive and includes costs for the health system, costs for the patient, costs from other sectors, and indirect costs due to loss of productivity. It also allows a complete analysis of all the opportunity costs attributable to disease and could be preferred for analyses such as Cost-Benefit Analysis (CBA), Cost-Effectiveness Analysis (CEA), and Cost-Utility Analysis (CUA). However, the societal perspective presumably requires the largest amount of data, often making it difficult to use in specific contexts. 39 In this SR, all studies adopted the perspective of the health system and the patient, possibly because it is difficult and time-consuming to estimate all cost components from the societal perspective, and because these two perspectives are the most used in economic evaluation studies, as they offer a more pragmatic answer to the study question. 40 There was variation in the types of costs used in the different studies. In 10 studies, 22,24,27,29,31-36 the authors presented only direct medical costs, while in the others, direct medical and non-medical costs were presented. 26,28,30,31 However, it is important to highlight that the estimate of non-medical direct costs is relevant, especially regarding the costs to patients and families (transport to and from the health service, food, and accommodation, among others). 39 Indirect costs should also be measured whenever possible, as they involve costs arising from absenteeism, that is, the period the patient is absent from work to receive treatment, or from lower productivity caused by the disease or its treatment. 39 Of the 15 studies included in the SR, 7 (46.67%) used Markov modeling. Markov models are advantageous in diseases with repeated events over time, such as cancer.
Their cyclical nature is convenient for characterizing interventions repeated on a schedule over time, as in CC screening strategies. 41 The model simulates disease progression for a specific cohort of patients, assigning a probability of progression and regression between phases/classifications from dysplasia to invasive cancer.
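A minimal cohort sketch of the kind of Markov model described above is given below. It assumes four hypothetical health states and made-up annual transition probabilities that allow both progression and regression; none of these values come from the included studies, and the state names and function name are illustrative only. Life years are counted at the end of each cycle, so no half-cycle correction is applied.

```python
STATES = ["well", "precancer", "cancer", "dead"]

# Hypothetical annual transition probabilities; each row sums to 1.
TRANSITIONS = {
    "well":      {"well": 0.95, "precancer": 0.04, "cancer": 0.00, "dead": 0.01},
    "precancer": {"well": 0.30, "precancer": 0.63, "cancer": 0.05, "dead": 0.02},
    "cancer":    {"well": 0.00, "precancer": 0.00, "cancer": 0.85, "dead": 0.15},
    "dead":      {"well": 0.00, "precancer": 0.00, "cancer": 0.00, "dead": 1.00},
}


def run_cohort(start, transitions, cycles):
    """Advance a cohort distribution through yearly cycles.

    Returns the final distribution over states and the life years
    accumulated per woman (share alive at the end of each cycle).
    """
    dist = dict(start)
    life_years = 0.0
    for _ in range(cycles):
        nxt = {state: 0.0 for state in STATES}
        for src, share in dist.items():
            for dst, prob in transitions[src].items():
                nxt[dst] += share * prob
        dist = nxt
        life_years += 1.0 - dist["dead"]
    return dist, life_years


start = {"well": 1.0, "precancer": 0.0, "cancer": 0.0, "dead": 0.0}
final, years = run_cohort(start, TRANSITIONS, cycles=20)
print({state: round(share, 3) for state, share in final.items()})
print(f"Life years per woman over 20 cycles: {years:.2f}")
```

A screening strategy would enter such a model by changing the transition probabilities (for example, moving detected precancer back to the well state after treatment) and by attaching costs to states and events; those extensions are left out to keep the sketch short.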
Five studies (33.33%) that used a Decision Tree, the simplest form of analytical model, were included in the SR. In this model, graphic resources are used to describe the possible paths taken by patients under the screening strategies, interventions, or treatments investigated. These paths include events and their respective probabilities of occurrence, and in the end, health costs and outcomes are assigned to each path taken. 42 The limited structure of the Decision Tree model makes it suitable for acute diseases over a short period; however, it has difficulty modeling situations with recurring events and long-term periods, as in the case of chronic diseases like CC. In these situations, the use of the Markov model is recommended. 43 Studies that used discrete individual-based models, microsimulation, or Semi-Markov models (n = 3; 20%) were also included in this SR.
Table 3 Cost-effectiveness of the included studies, according to the Joanna Briggs Institute (JBI) dominance classification matrix.
(+) The intervention is more costly or more effective than the comparator; (0) The intervention has a cost or outcome/benefit equal to the comparator; (-) The intervention is less costly or less effective than the comparator. (D; G; H), Intervention dominance (favorable); (A; E; I), Weak dominance of the intervention; (B; C; F), Non-dominance of the intervention (unfavorable).
As for the model parameters, although the performance of the screening tests showed high specificity ranging between 86% and 98%, there was a significant difference in sensitivity between molecular tests based on HPV DNA (Rapid test: 81% to 90%; Hybrid capture: 88% to 95%) and conventional cervical cytology by Pap smear test (range from 58.4% to 72%). These data corroborate the findings of Ginsberg et al., 44 which point to intra- and inter-regional differences in the sensitivity and specificity of screening interventions due to differences in method, sample collection, and the professional experience of physicians and laboratory technicians.
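The practical weight of this sensitivity gap can be illustrated with a small calculation of how many women with CIN2+ a single screening round would detect or miss. The sensitivities used are the lower bounds reported above for conventional cytology (58.4%) and hybrid capture (88%); the cohort size, the CIN2+ prevalence of 2%, and the function names are hypothetical choices made for the example.

```python
def expected_detected(n_screened, prevalence, sensitivity):
    """Expected number of true cases detected in one screening round."""
    return n_screened * prevalence * sensitivity


def expected_missed(n_screened, prevalence, sensitivity):
    """Expected number of true cases missed in one screening round."""
    return n_screened * prevalence * (1.0 - sensitivity)


N_WOMEN = 100_000      # hypothetical cohort size
PREVALENCE = 0.02      # hypothetical CIN2+ prevalence

for name, sens in [("Pap smear", 0.584), ("HPV hybrid capture", 0.88)]:
    found = expected_detected(N_WOMEN, PREVALENCE, sens)
    missed = expected_missed(N_WOMEN, PREVALENCE, sens)
    print(f"{name}: {found:.0f} detected, {missed:.0f} missed")
```

Under these assumptions the difference in missed cases comes entirely from sensitivity; a fuller comparison would also account for false positives through specificity and for the cost of follow-up.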
It is important for health technology managers and evaluators to be perceptive in adopting measures that prioritize the screening of populations at risk to reduce the cost per year of life saved or the incremental cost-effectiveness ratio.
Health technology assessment studies are permeated by uncertainties related to the model's structure, parameters, and methodology. Most of the studies included only addressed parameter uncertainty by univariate deterministic sensitivity analysis. However, it is necessary to evaluate the effects of model uncertainty on the results.
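For readers unfamiliar with the technique, the sketch below shows what a univariate (one-way) deterministic sensitivity analysis does: each parameter is set to its low and high value in turn while the others stay at their base values, and the result is recomputed. The toy model, its parameter names, and the base values are hypothetical; only the Pap smear sensitivity range (58.4% to 72%) is taken from this review.

```python
def toy_icer(params):
    """Toy model: extra cost per additional CIN2+ case detected (hypothetical)."""
    delta_cost = params["cost_hpv_test"] - params["cost_pap"]
    delta_effect = params["cases_per_woman"] * (
        params["sens_hpv"] - params["sens_pap"]
    )
    return delta_cost / delta_effect


def one_way(model, base, ranges):
    """Vary one parameter at a time between its low and high value."""
    results = {}
    for name, (low, high) in ranges.items():
        at_low = model({**base, name: low})
        at_high = model({**base, name: high})
        results[name] = (at_low, at_high)
    return results


BASE = {
    "cost_hpv_test": 15.0,    # hypothetical unit cost
    "cost_pap": 5.0,          # hypothetical unit cost
    "cases_per_woman": 0.02,  # hypothetical CIN2+ prevalence
    "sens_hpv": 0.90,
    "sens_pap": 0.65,
}
RANGES = {
    "sens_pap": (0.584, 0.72),      # range reported in this review
    "cost_hpv_test": (10.0, 20.0),  # hypothetical range
}

for name, (at_low, at_high) in one_way(toy_icer, BASE, RANGES).items():
    print(f"{name}: ICER from {at_low:.0f} to {at_high:.0f}")
```

Because only one parameter moves at a time, this approach cannot capture interactions between parameters or uncertainty in the model structure itself, which is the limitation noted above.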
Screening strategies varied concerning screening age (35, 40, 45 years) and frequency (1, 2, 3 times over a lifetime). The heterogeneity of the intervention and the evaluation outcome measure made it challenging to summarize qualitatively and, above all, quantitatively. In this regard, some authors consider it imprudent to perform a meta-analysis of economic analysis studies, as they believe that producing a robust scientific result is unlikely. This argument is based on variation in the use and costs of resources in different regional scenarios, in the peculiarity of the context (institutional, populational, behavioral, and cultural), and in the multiplicity of methods between the studies, which can interfere with cost-effectiveness measures due to heterogeneity.
The CHEERS instrument, 18 created to encourage greater standardization and promote better quality in the reports of economic evaluation studies in the health area, guides researchers in preparing the report and includes items in the checklist that ensure greater methodological transparency. However, it is not a tool for assessing methodological quality itself.
One of the main features of CHEERS 18 is that it makes a clear distinction between economic evaluation studies based on a single source of primary data and studies based on modeling for decision analysis with inputs from multiple sources.
In mathematical modeling, the description of the model and its parameters must enable the results to be reproduced by other authors. This is especially relevant when using complex models such as discrete event simulations or Markov microsimulations.
In this SR, regarding the quality of the report, the authors have found that aspects such as the perspective of the study, the comparator; costs and outcomes; findings, and limitations were described in all studies. However, the model's description was considered complete in less than half of the studies, and none of them showed the reasons for their choice.
The absence of reporting of the calibration method is possibly due to scarce standards for the calibration of models for cervical cancer screening, lack of consensus in the literature on the minimum specification that should be reported, and the insufficiency of local data to estimate the screening parameters. 45,46 According to Silva et al., 47 critical analysis through a script only signals the strengths and weaknesses of a study, and it is up to the evaluator to weigh the results according to the context of each investigation. Thus, it is common to find studies that do not meet all the requirements. Omitting an item may be a consequence of a lack of available information; even so, a justification should be given for each checklist item not addressed.
To determine the cost-effectiveness of screening, McMeekin et al. 48 have pointed out that it is necessary to define, among other things, the optimal age to start screening, whether abnormal Pap smears can be better stratified according to risk, and the positive predictive value of current screening strategies. The screening interval (every 5 or 10 years), coverage (50%-80%), and adherence or compliance with visits (1, 2, or 3 times over a lifetime) have been other aspects evaluated among the compared interventions.
The DRM shows the distribution of studies into three distinct bands, where a predominance of the number of studies in a given band will indicate the likely implication of the intervention. If more studies are located on a matrix space, this would mean a level of dominance associated with that range. However, if there are equal studies in two or three bands, no clear conclusions can be drawn. In this SR, the authors have found that most studies (11/13) are located in the favorable band for the intervention, and the distribution of the number of studies is similar in the classification of strong dominance (n = 6) and weak dominance (n = 5). This is the main methodological difference between the present study and the SR by Mezei et al. 10 The authors chose to evaluate the cost-effectiveness results by DRM 19 whilst Mezei et al. 10 only transformed the ICER results of the economic evaluations into international dollars and compared them directly.
The analysis of the dominance of strategies by the Joanna Briggs Institute's DRM (JBI, 2014) 19 in 13 of the included studies allowed us to observe how the results of analyzing benefits and costs between the investigated interventions can vary across country contexts, within the same country, and even within a single study, depending on the strategy. For the studies in which the HPV test was dominant compared to conventional cytology, 23,25-27,33,35 the dominance classification coincided with the authors' conclusions; however, the dominance matrix showed that in the studies conducted by Campos, 2015 and 2017, 25,26 the gain in effectiveness measured in years of life saved was very small (0.005-0.073 YLS and 0.004-0.065 YLS, respectively). On the other hand, the studies that showed weak dominance of HPV testing over smear cytology, with greater effectiveness but higher cost, 22,28,29,34,36 suggest that, although HPV-DNA testing prices are still very high, they could be negotiated based on purchase volume, considering that the HPV-DNA tests were associated with greater efficacy than conventional cytology due to their greater sensitivity and reproducibility. In all cases, the value of the ICER was lower than the cost-effectiveness threshold used (GDP per capita of the countries); however, in at least 2 of them, 35,36 the effectiveness gain measured by CIN2+ detected or by QALYs gained, respectively, was also small (0.001; 0.05), which suggests that a thorough evaluation is needed before deciding in favor of incorporating the new test.
One study 33 showed that the HPV-DNA test had strong dominance or weak dominance, respectively, compared to conventional cytology, depending on the measure of effectiveness used, QALY or YLS, pointing in the second case to lower effectiveness despite the lower cost of the HPV-DNA test. This difference may be related to the parameter values used in the model, a limitation mentioned by the study's authors.
Variation in how results are presented is another prominent feature of the studies. Some authors 19 argue that summarized results cannot be extracted from SRs covering different contexts, as opportunity costs, resources, comparators, and relevant interventions differ widely. However, the SR of economic assessments can be an additional tool for decision-makers, especially in understanding resource allocation and the potential impacts. This can be achieved by identifying gaps in the evidence base, alerting to outcomes that are essential for intervention selection/compensation, and providing a better understanding of the circumstances that favor cost-effective models/interventions. 19 Donaldson et al. 49 suggest that the value of SRs of economic analyses is not to generate a single result or reliable recommendation about cost-effectiveness but rather to help decision-makers understand the structure of the resource allocation problem and the potential impacts. Thus, the focus of this article was not to generate a summary estimate of the cost-effectiveness relationship but to demonstrate the variability and its determinants from one environment to another.

Conclusions
The main findings of this review indicate that the HPV-DNA test proved to be cost-effective compared to conventional smear cytology (Pap) in women from low- and middle-income countries for different strategies. Beginning screening at 35 years of age, repeating it every five years, and carrying out the test 2 or 3 times throughout life are successful strategies. However, as already discussed, the level of evidence (JBI) of the set of studies showed some disparities related to the types of outcomes evaluated and the types of costs used in the parameters of the models, according to the perspective adopted, which in some studies involved both the perspective of the health system and that of the patient.
While recognizing that differences in assessment contexts and populations imply that SRs of economic analyses are unlikely to produce unique answers, such reviews can provide policymakers, healthcare professionals, patients, and other decision-makers with relevant information to choose or trade off the intervention analyzed.
This review is relevant to public health policy in low- and middle-income countries because it shows evidence that, for most of the studies reviewed, the authors found at least one screening strategy that reduced the incidence of CC at a cost per year of life saved below the per capita GDP of the country investigated, showing the economic feasibility of saving thousands of lives per year by implementing cost-effective screening strategies.
In terms of research, new screening methods have been proposed, combining CC prevention strategies that include screening and vaccination, bringing methodological challenges to choosing and designing the analytical model in economic evaluations, which are increasingly being used in incorporating technologies. On the other hand, the variability in test accuracy values, specifically from conventional cytology, suggests that new economic assessments could benefit from evidence syntheses or systematic reviews that address this aspect.
The present study is relevant due to the high disease burden of this type of cancer and the number of preventable deaths in women from low- and middle-income countries when there is timely identification of HPV infection through effective screening and access to appropriate treatment.