Population-based screening – the difficulty of how to do more good than harm and how to achieve it

Screening people without symptoms of disease is an attractive idea. Screening allows early detection of disease or elevated risk of disease, and has the potential for improved treatment and reduction of mortality. The list of future screening opportunities is set to grow because of the refinement of screening techniques, the increasing frequency of degenerative and chronic diseases, and the steadily growing body of evidence on genetic predispositions for various diseases. But how should we decide on the diseases for which screening should be done and on recommendations for how it should be implemented? We use the example of prostate cancer and genetic screening to show the importance of considering screening as an ongoing population-based intervention with beneficial and harmful effects, and not simply the use of a test. Assessing whether screening should be recommended and implemented for any named disease is therefore a multi-dimensional task in health technology assessment. There are several countries that already use established processes and criteria to assess the appropriateness of screening. We argue that the Swiss healthcare system needs a nationwide screening commission mandated to conduct appropriate evidence-based evaluation of the impact of proposed screening interventions, to issue evidence-based recommendations, and to monitor the performance of screening programmes introduced. Without explicit processes there is a danger that beneficial screening programmes could be neglected and that ineffective, and potentially harmful, screening procedures could be introduced.


Introduction
"All screening programmes do harm; some do good as well, and of these, some do more good than harm at reasonable cost" JAM Gray in BMJ 2008 [1]
In March 2009, two large randomised trials reported conflicting results on the effect of testing for prostate specific antigen (PSA) in reducing deaths from prostate cancer in men aged over 50 [2,3]. Two earlier randomised trials conducted in Canada and Sweden [4][5][6] had been inconclusive [7]. In Switzerland, Kwiatkowski, Huber and Recker commented [8] on the results of the European study [2]. They highlighted the 20% reduction in prostate cancer-specific mortality over a median follow-up time of nine years, and their claim that PSA testing should now become widespread elicited a wide range of responses [9][10][11]. In contrast, the American study [3], nested in the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial [12], did not show a reduction in prostate cancer-specific mortality. Discussions to determine the reasons for the difference in results are ongoing [13,14].
The case of PSA testing for early detection of prostate cancer provides a good example of how difficult it is to decide whether population-wide screening should be recommended or not [13][14][15][16][17]. The Swiss Society of Urologists initially endorsed PSA testing but later revised its position to state that the available evidence does not allow routine PSA testing in men to be recommended ("Aufgrund der vorliegenden Datenlage kann ein systematisches Testen der männlichen Bevölkerung mit dem PSA-Test nicht befürwortet werden." [On the basis of the available data, systematic testing of the male population with the PSA test cannot be endorsed]; www.urologie.ch/upload/Prostatafrue-herkennung09.pdf). However, the Society goes on to state that PSA testing could be performed in men aged 50-70 years with a life expectancy of at least ten years, after careful briefing of subjects on PSA testing. The role of specialist medical societies in making recommendations on a complex public health intervention for the general population should be discussed.
In this article we cite examples from cancer and genetic screening to explain the issues involved in assessing the evidence for and against screening. We focus on screening as a population-based intervention involving administration of the screening test together with follow-up examination and treatment, all of which has benefits, harms and costs. We argue that, in Switzerland, these factors, together with plans for implementation and ongoing monitoring, should be considered before deciding whether or not to start a new screening programme. We describe the processes established in a number of other countries where an independent committee conducts this assessment, and suggest that a similar body should be established in Switzerland.

Benefits and harms of screening
Screening has been defined in many different ways (Box 1). Common to all definitions is that something is done to people who seem healthy. This sets screening apart from most healthcare interventions. When people actively present with a health problem that requires treatment, they accept that the diagnostic process or treatment carries some risk of inflicting harm. When the same processes are applied to seemingly healthy people, the acceptable level of risk is much lower.

Box 1
Screening definitions.
What should a screening intervention in healthy people achieve? With regard to cancer, the people who profit from screening are those who a) would have died from the cancer but are cured, owing to earlier detection; b) would have been successfully treated for their cancer but whose quality of life is improved owing to earlier detection and less debilitating treatment; and c) do not have cancer and are reassured by the results of a screening test that correctly shows they do not have the disease. However, screening can also be harmful. The people who do not benefit, and might be harmed by screening, are those who a) die from a screen-detected cancer but whose clinical course was not improved by treatment; b) have cancer but would have survived even without screening; c) have a screening-detected cancer that would not have surfaced clinically during their lifetime, resulting in overdiagnosis and unnecessary treatment; d) have cancer but have a false negative screening test result; e) have a false positive result, which results in anxiety or unnecessary further investigation and treatment.

Assessing the appropriateness of screening
More than 40 years ago, Wilson and Jungner [18] proposed a framework for evaluating the appropriateness of screening to rationalise the increasing use of tests for early disease detection. Several countries have since introduced permanent bodies, independent of government, whose purpose is to assess new and existing screening technologies and to make recommendations to healthcare providers and funding authorities. Examples include the U.S. Preventive Services Task Force (USPSTF), the New Zealand National Health Committee and the United Kingdom National Screening Committee (UKNSC). The UKNSC has updated the Wilson and Jungner criteria to develop a 22-item list of criteria concerning the condition, the screening test, the treatment and the screening programme, all of which are to be met before introducing a screening programme (see www.screening.nhs.uk/criteria). Here we highlight a range of issues that need to be evaluated.

Figure 1
The balance of benefits, harm and quality (Adapted from [40]).

Review article
Swiss Med Wkly. 2010;140:w13061

The need for randomised trials
For cancer screening, the randomised controlled trial with mortality as the outcome is the only study design that allows unbiased comparison of outcomes in screened and unscreened groups [12,19]. The UKNSC requires "evidence from high quality randomised controlled trials that the screening programme is effective in reducing mortality or morbidity".

Screening-detected cancers are different
In observational studies comparing screened and unscreened people, those whose cancer was diagnosed through screening often appear to survive longer than those who presented with symptoms, even if there is no benefit from screening. This is due to "lead time bias". In addition, tumours that are detected as a result of screening are more likely to be indolent, slow-growing or less aggressive than those that present with symptoms spontaneously or in the interval between two scheduled rounds of screening (interval cancers). This phenomenon is referred to as "length bias" [20][21][22].
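Lead time bias can be made concrete with a small calculation. The sketch below uses a single hypothetical patient whose ages at detection, symptoms and death are invented for illustration and are not taken from any study cited here. It assumes that screening brings the diagnosis forward but does not change the age at death.

```python
def years_survived(age_at_diagnosis, age_at_death):
    """Survival time measured from diagnosis, as observational studies report it."""
    return age_at_death - age_at_diagnosis


def is_five_year_survivor(age_at_diagnosis, age_at_death):
    """True if the patient is still alive five years after diagnosis."""
    return years_survived(age_at_diagnosis, age_at_death) >= 5


# Hypothetical ages (assumptions for illustration only).
AGE_SCREEN_DETECTED = 63   # tumour found by the screening test
AGE_SYMPTOMS = 67          # tumour would have presented clinically
AGE_DEATH = 70             # unchanged by screening in this scenario

unscreened_survival = years_survived(AGE_SYMPTOMS, AGE_DEATH)
screened_survival = years_survived(AGE_SCREEN_DETECTED, AGE_DEATH)
lead_time = AGE_SYMPTOMS - AGE_SCREEN_DETECTED

print("Survival after symptomatic diagnosis:", unscreened_survival, "years")
print("Survival after screen detection:", screened_survival, "years")
print("Lead time:", lead_time, "years")
print("Five-year survivor without screening:",
      is_five_year_survivor(AGE_SYMPTOMS, AGE_DEATH))
print("Five-year survivor with screening:",
      is_five_year_survivor(AGE_SCREEN_DETECTED, AGE_DEATH))
```

In this scenario the apparent gain in survival equals the lead time exactly, although the patient dies at the same age in both cases. This is why survival from diagnosis is not a suitable outcome for comparing screened and unscreened groups.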
Overdiagnosis is an extreme case of length bias: it refers to the detection of cancers by a screening test that would never have caused overt disease. These cancers result in unnecessary treatment and, at the very least, cause anxiety. Whether a screening-detected cancer is an overdiagnosis cannot be determined in the individual case. However, the extent of overdiagnosis in cancer can be shown from randomised trials where an elevated cancer incidence persists in the screened group in comparison with age-specific national cancer incidence data [23][24][25][26]. Furthermore, for several cancers, autopsy studies find many more tumours than would ever have become clinically relevant. The most prominent case is prostate cancer, for which overdiagnosis is estimated at 50 to 70%, meaning that cancer diagnosis in the target group for screening will increase by a factor of 1.5 to 1.7 when PSA testing is introduced [2,27,28].
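The arithmetic linking the overdiagnosis estimate to the increase in diagnoses can be sketched as follows. The sketch assumes that the 50 to 70% figure expresses excess diagnoses relative to the number of diagnoses expected without screening, which is the reading consistent with the factor of 1.5 to 1.7 given in the text. The function names are illustrative.

```python
def incidence_factor(excess_fraction):
    """Factor by which the number of diagnoses rises when screening adds
    `excess_fraction` extra diagnoses per diagnosis expected without screening."""
    return 1.0 + excess_fraction


def share_of_diagnoses_overdiagnosed(excess_fraction):
    """Proportion of all diagnoses in the screened population that are excess."""
    return excess_fraction / (1.0 + excess_fraction)


for excess in (0.5, 0.7):
    factor = incidence_factor(excess)
    share = share_of_diagnoses_overdiagnosed(excess)
    print(f"excess {excess:.0%}: diagnoses x{factor:.1f}, "
          f"{share:.0%} of all diagnoses are overdiagnoses")
```

The second function shows why the choice of denominator matters when overdiagnosis estimates are compared: an excess of 50% relative to unscreened incidence corresponds to about one third of all diagnoses made under screening.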

Choice of outcome in cancer screening trials
There is considerable debate about whether trials of cancer screening should use a reduction in total mortality or a reduction in cancer-specific mortality as the outcome [29][30][31]. In general, the choice depends largely on the proportion of all deaths attributable to the disease. If this proportion is small, an impact on overall mortality is unlikely to be observed. For PSA testing, however, uncertainty remains as to whether prostate cancer-specific mortality is appropriate, as overdiagnosis leads to overtreatment and may lead to an increased mortality risk for other causes of death.
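A rough calculation shows why an effect on total mortality is hard to observe and why harm from overtreatment matters. The sketch below is a simplified model, not a method from the cited trials. The share of deaths due to the cancer (3%) and the relative increase in deaths from other causes (1%) are hypothetical values. Only the 20% reduction in cancer-specific mortality is taken from the European study discussed above.

```python
def net_all_cause_reduction(cancer_mortality_reduction,
                            cancer_share_of_deaths,
                            other_cause_increase=0.0):
    """Relative change in all-cause mortality (positive means fewer deaths).

    cancer_mortality_reduction: relative reduction in cancer-specific deaths
    cancer_share_of_deaths:     proportion of all deaths due to the cancer
    other_cause_increase:       relative increase in deaths from other causes,
                                for example through overtreatment (assumption)
    """
    benefit = cancer_mortality_reduction * cancer_share_of_deaths
    harm = other_cause_increase * (1.0 - cancer_share_of_deaths)
    return benefit - harm


# 20% reduction in cancer deaths; hypothetical 3% share of all deaths.
no_harm = net_all_cause_reduction(0.20, 0.03)
# Same benefit, with a hypothetical 1% relative rise in other-cause deaths.
with_harm = net_all_cause_reduction(0.20, 0.03, other_cause_increase=0.01)

print(f"All-cause reduction without harm: {no_harm:.2%}")
print(f"All-cause change with harm: {with_harm:.2%}")
```

Under these assumptions a 20% reduction in cancer-specific mortality translates into a change in all-cause mortality of well under one per cent, and a small increase in other-cause mortality is enough to reverse its sign.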

Levels of evidence for cancer screening
Evidence from randomised trials of a beneficial effect of screening on cancer-specific mortality is limited to mammography for breast cancer [32,33] and faecal occult blood testing (FOBT) for colorectal cancer [34]. Data from randomised trials of the effectiveness of colonoscopy alone or in combination with FOBT are still awaited. Whilst screening for cervical cancer is well-established, there are no randomised trials to demonstrate its effectiveness. The observational evidence is, however, widely accepted as showing that regular cervical cytological screening lowers cervical cancer morbidity and mortality [35]. Technological advances in cervical cancer screening, following the discovery of human papillomavirus (HPV) as the causative agent, are now being evaluated in randomised trials. A cluster randomised trial in 52 villages in India suggests that testing for carcinogenic HPV types may improve the clinical utility and cost-effectiveness of cervical cancer screening by prolonging the screening interval in women with a negative test. A single round of HPV testing reduced the numbers of advanced cervical cancers and deaths from cervical cancer [36]. Additional randomised studies on the benefit of cervical cancer screening by HPV testing are currently ongoing in Canada, Finland, Italy, the Netherlands, Sweden and the UK [35].

The need for information on the harmful effects of screening and on cost-effectiveness
Policymakers, practitioners and the public also need solid evidence on the extent of the harmful effects of screening. Only then is it possible to judge whether the benefits outweigh the harms. In the case of PSA testing we now have weak evidence for a beneficial effect with regard to prostate cancer-specific mortality [2,3]. However, the two trials make it clear that the harm from overdiagnosis is a major problem. A better picture of the spectrum of harmful effects (including psychological, physical and financial costs of false positive results) should emerge from further evaluations of these large randomised trials and additional research [14,16]. The European Urological Association endorsed this view in April 2009 [37].
Resources spent on screening need to be compared with other population-based measures in primary and secondary prevention (opportunity cost) [38]. The UKNSC requires that the costs "should be economically balanced in relation to expenditure on medical care as a whole (i.e., value for money)". Cost-effectiveness evaluations needed for rational decisions on the introduction of population-based screening are always conditional on the quality of evidence regarding the effectiveness and harms of the screening procedure and on assumptions about the natural history of the disease. They need to be continuously revised in the light of medical and technical progress, for example increases in the cost of novel and targeted chemotherapies [39].

Implementation and quality control of population screening
The best available evidence usually stems from well conducted studies in dedicated settings. But screening is a programme, a whole chain of activities, and not a test alone [19]. Hence it is not guaranteed that the same balance of benefits and harms will be achieved when scaling up screening to a whole country and to all healthcare providers. Nationwide implementation requires a system to train staff involved in the screening and follow-up activities, continuous evaluation and quality control [1,19,40,41]. The ultimate goal of all these activities is to maximise benefit and minimise harm (fig. 1).
Whether screening should be implemented as a systematic programme or in an opportunistic way is an important decision. In systematic programmes eligible individuals are invited for screening at regular agreed intervals. All those in the target population are included, screening coverage can be monitored and the quality of screening ensured. But such systems are difficult to implement in the absence of population registries or in highly mobile target populations. In opportunistic screening, healthcare providers offer screening when people attend health care settings for unrelated reasons. There are advantages to using existing infrastructure, but people who do not use health services or use them rarely will not have the opportunity to be screened regularly. Healthcare professionals may forget to offer the screening test regularly if consultation times are limited. It is also more difficult to monitor the coverage and quality of opportunistic screening, especially if the population is not well-defined.
In the case of mammography screening for early detection of breast cancer [42], a whole set of indicators has been established which should be monitored regularly [43]. In Switzerland this type of monitoring and evaluation has, for example, been implemented in the systematic mammography screening programme in the canton of Vaud [44][45][46]. Obtaining some of the monitoring information was only possible because the canton of Vaud was already operating a population-based cancer registry.

The need for balanced information
Irrespective of whether screening is offered in a systematic or opportunistic manner, people need to be properly informed of the benefits and harms of the screening programme, from both the personal and population viewpoints, and the information must be offered in an understandable fashion [47][48][49]. It is thought to be challenging to achieve high participation rates in screening programmes while informing target groups in a balanced, transparent and comprehensible way [50]. Several studies have shown marked public overestimation of the benefits of screening for breast and prostate cancer, including in Switzerland [51][52][53]. Furthermore, it is important to understand the reasons why persons participate in screening programmes. A study in Norway surveyed women after an invitation for a first round of mammography screening [54]. Trust, gratitude and convenience were more important in the women's decision to participate than information on benefits and harms. However, it is possible to improve the presentation of statistical information by using clear reference classes and natural frequencies [55,56], so that persons or patients can make better-informed decisions.
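The idea of natural frequencies can be illustrated with a short sketch that turns test characteristics into counts of people in a stated reference class. The prevalence, sensitivity and specificity used here are hypothetical round numbers chosen for readability and do not describe any particular screening test.

```python
def natural_frequencies(prevalence, sensitivity, specificity, population=1000):
    """Express test performance as whole-number counts in a reference class.

    All inputs are proportions between 0 and 1. Counts are rounded to whole
    people, so results are approximate for small populations.
    """
    diseased = round(population * prevalence)
    healthy = population - diseased
    true_positives = round(diseased * sensitivity)
    false_positives = round(healthy * (1.0 - specificity))
    return {
        "population": population,
        "diseased": diseased,
        "healthy": healthy,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "all_positives": true_positives + false_positives,
    }


# Hypothetical test: 1% prevalence, 90% sensitivity, 91% specificity.
counts = natural_frequencies(prevalence=0.01, sensitivity=0.90, specificity=0.91)

print(f"Of {counts['population']} people screened, "
      f"{counts['diseased']} have the disease and "
      f"{counts['true_positives']} of them test positive.")
print(f"Of the {counts['healthy']} people without the disease, "
      f"{counts['false_positives']} also test positive.")
print(f"So of {counts['all_positives']} people with a positive test, "
      f"{counts['true_positives']} actually have the disease.")
```

The same information expressed as conditional probabilities ("sensitivity 90%, specificity 91%") is much harder to interpret than the statement about counts, which makes the reference class explicit.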

New screening challenges: population genetic screening
The availability of genetic tests is expected to grow more rapidly than that of other screening tests [57][58][59]. The hope of health improvement through genomics ranges from susceptibility testing to prevention of chronic diseases through targeted chemoprevention or behavioural interventions, testing for susceptibility genes in early detection, and testing for gene variants or expression profiles for targeted treatment [57]. However, evaluation of genetic tests for public health practice remains poorly structured. Since the benefits and harms in genetic and non-genetic population screening do not differ in substantial ways, the same criteria established for non-genetic screening apply [58,60,61]. But additional ethical, legal, and social aspects and possible harms need to be considered [62]. For example, a positive test result for BRCA1 mutations and breast cancer risk in grandmother and daughter automatically implies the presence of a BRCA1 mutation in the mother, whether she agreed to testing or not [63].
In presymptomatic genetic screening additional safeguards against discrimination by insurance companies and employers, as well as against social stigmatisation, may become necessary. For example, a national targeted population carrier screening programme for severe and frequent genetic diseases in Israel has been implemented. It is targeted at Jewish and non-Jewish communities with a high degree of consanguinity and therefore prevalent genetic syndromes [64]. To avoid stigmatisation, genetic testing in these communities is offered to all couples in their reproductive period, irrespective of their family history. While genetic testing in this programme is not mandatory, the Jewish ultraorthodox community requires genetic screening before marriage and the test result is one of the factors considered in the decision-making process for prearranged marriages. Premarital testing for thalassaemia is mandatory in some Middle-Eastern countries such as Iran and Saudi Arabia, albeit with no implications for the provision of marriage certificates by the government. To lower the incidence of thalassaemia a law has been enacted in Iran allowing termination of pregnancies before and up to the 120th day of pregnancy in cases of severe foetal disease [64].

Genetic screening in newborns
Cystic fibrosis (CF) exemplifies the challenges in implementing DNA tests in neonatal screening. According to data from randomised trials and observational studies, newborn screening for cystic fibrosis is associated with better growth and other nutritional indicators, lower morbidity, lower early mortality and improved lung function [65]. It has also been associated with economic benefits in some studies [66], improved quality of life in family members, and improved reproductive decision-making regarding additional children [67]. Potential harms of neonatal screening for cystic fibrosis include false-positive results leading to unnecessary follow-up tests and associated risk of acquisition of Pseudomonas aeruginosa infections in cystic fibrosis clinics, premature diagnosis of mild or atypical cases, identification of asymptomatic mutation carriers, and the risk of not recognising the presence of specific CFTR mutations [67]. The types of screening test and how to use them in cascade remain controversial and need adaptation to the population-specific genotype distribution and health system. Over 1600 mutations have been identified in the CFTR gene. In many cases their penetrance remains unclear and positive tests for homozygous or compound heterozygous mutation status are of unclear predictive value. Accordingly, two-step testing models for CF newborn screening are generally applied. Given the large number of mutations in the CFTR gene, the first step is the analysis of immunoreactive trypsin (IRT) levels in dried blood spots. As IRT testing is associated with poor specificity and positive predictive values, a second test is essential, often a second IRT, a DNA test or a combination of all these. The implementation of CF newborn screening as well as the screening protocols adopted vary widely across Europe [68,69].
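Why a second test step is needed for a rare condition can be shown with Bayes' rule. The sketch below is generic: the prevalence, sensitivities and specificities are hypothetical values, not measured characteristics of IRT or DNA testing for cystic fibrosis, and the two tests are assumed to be conditionally independent given disease status.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' rule)."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1.0 - prevalence) * (1.0 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


def two_step_ppv(prevalence, first_test, second_test):
    """PPV after two positive tests applied in sequence.

    Each test is a (sensitivity, specificity) tuple. The PPV after the first
    step serves as the pre-test probability for the second step, which assumes
    the tests are conditionally independent given disease status.
    """
    after_first = positive_predictive_value(prevalence, *first_test)
    return positive_predictive_value(after_first, *second_test)


# Hypothetical values for illustration only.
PREVALENCE = 1 / 3000            # rare condition in newborns
FIRST_TEST = (0.95, 0.99)        # (sensitivity, specificity) of step one
SECOND_TEST = (0.95, 0.99)       # (sensitivity, specificity) of step two

single = positive_predictive_value(PREVALENCE, *FIRST_TEST)
double = two_step_ppv(PREVALENCE, FIRST_TEST, SECOND_TEST)

print(f"PPV after one positive test: {single:.1%}")
print(f"PPV after two positive tests: {double:.1%}")
```

Even with a highly specific first test, most positive results in a rare condition are false positives. Restricting the second test to first-step positives raises the pre-test probability and therefore the predictive value of a positive result.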
Neonatal screening for alpha-1-antitrypsin deficiency (AATD) provides insight into the risks and benefits of genetic testing for late onset disorders [70]. AATD is an autosomal co-dominant genetic disorder. Various mutations of the SERPINA1 gene can cause liver disease or emphysema, smokers being more prone to develop the latter [71]. Early screening and detection of severe AAT deficiency in the AATD-screening programme for newborns in Sweden was found to prevent uptake of smoking among AATD adolescents, but did not affect parents' smoking behaviour [72]. In contrast, the α Coded Testing (ACT) study reported persistent smoking among AATD individuals after a positive test result [70]. As most AATD subjects have a normal childhood and adulthood, and often have normal life expectancy in the absence of inhaled irritants such as tobacco smoke [71], ethical considerations on screening of newborns for AATD nevertheless arise.

Population genetic screening for adult-onset common diseases
Many novel genes with common low-penetrance variants and small relative and population-attributable risks for chronic adult-onset disorders are currently being identified. Their net benefit for population screening remains largely unresolved. Hereditary haemochromatosis, a prevalent inherited condition chiefly caused by a single mutation in the HFE gene, was long viewed as the "poster child" for population genetic screening [59,73]. Clinical symptoms in haemochromatosis (i.e., fatigue, arthritis, impotence, cirrhosis, diabetes, cardiomyopathy) are the result of iron overload and can be prevented efficiently and at low cost by venesection. Initially it was believed that most subjects homozygous for the HFE mutation would ultimately develop haemochromatosis. Results from longitudinal studies now suggest much lower penetrance of the mutations, especially among women due to their monthly blood loss. This probably changes the cost-benefit balance of population HFE screening. If considered at all, screening should be targeted at men only or at specific age groups.
Despite a large number of recent studies linking various genetic variants to cardiovascular disease or type 2 diabetes, it is difficult to derive a net benefit from this new information. Adding information about novel genetic variants associated with the risk of developing these diseases to established risk scores failed to improve risk prediction beyond that of obesity, smoking, cholesterol levels and family history [74][75][76][77]. Independent studies and evaluations are currently underway to assess the clinical utility of these tests (www.egappreviews.org/).

Knowledge synthesis and cost-effectiveness also necessary in genomics
To promote the integration of validated genomic knowledge into medical and public health practice it is imperative to perform state-of-the-art evidence synthesis [57]. Consensus guidelines to assess the credibility of genetic associations have been defined and use three criteria: a) amount of evidence, b) replication of associations, and c) protection of observed associations from bias [78]. RCTs to assess the clinical utility of genomic testing are very few in number, since assessing the effect of a genetic test on disease-specific morbidity and mortality may raise ethical concerns. Randomised trials could, however, be beneficial in answering key questions of relevance to population screening [57], such as assessment of the differences in the effectiveness of lifestyle or therapeutic interventions between genetic subgroups. In assessing the clinical utility of a genetic test, non-medical benefits and harms are also relevant. Test information can be of individual use ("personal utility") in the absence of effective medical intervention. Genetic testing to identify people at elevated risk of Alzheimer's disease is not paralleled by effective interventions to prevent the disease. Yet in the Risk Evaluation and Education for Alzheimer disease study (REVEAL) some subjects testing positive found the information helpful in preparing themselves and their families for potential development of the disease at a later age [79].
Future decisions about the allocation of scarce health care resources will require structured economic assessments [63]. The economic burden will increase further if testing is left to the free market. Subjects opting to supply their DNA to companies offering direct-to-consumer genetic testing (e.g., 23andme (www.23andme.com/) or Navigenics (www.navigenics.com)) are likely to seek medical advice and, potentially, further testing and screening. For this and other reasons the expert committee on genetic testing in humans has published a warning against direct-to-consumer genetic testing which does not comply with legal standards in Switzerland (www.bag.admin.ch/themen/medizin/00683/02724/04638/07332/index.html?lang=de).

A national screening commission for Switzerland
A central, multidisciplinary screening commission in Switzerland would help ensure that the introduction of population-based screening is evidence-based and safeguard against medical and non-medical harms. An expert panel for screening would provide essential skills in evaluating the existing evidence in a structured way, determining the structure of screening programmes and the provision of balanced information, and deciding on additional data needs before and after implementation of a screening programme.
A screening commission in Switzerland would have international and national roles. International multi-centre trials are often needed to generate the study sizes needed to obtain definitive evidence about the efficacy and effectiveness of screening. The complexity of screening-related issues might also necessitate collaboration with other international commissions. At the national level, data are also required for policy development and implementation, including estimating the prevalence of disease and risk factors (genetic and non-genetic), investigating cultural acceptance of screening and informed consent procedures, evaluating the psychosocial impact, assessing the availability of adequate health care services for screening and follow-up interventions, and economic evaluation. A screening commission will therefore also have a central role in stimulating and directing research topics and infrastructure. National registration of diagnoses, large, internationally harmonised clinical and population cohorts and biobanks, as well as research on the communication, behavioural and social aspects of screening, are fundamental to obtaining data for policy decisions in the area of population screening.
There is no specific advisory body on screening in Switzerland at present. Two federal commissions deal with some screening-related issues: the Federal Commission for Medical Services (Eidg. Leistungs- und Grundsatzkommission) advises the Federal Department of Home Affairs on the reimbursement of specific procedures and services, including screening tests, in the context of compulsory health insurance; and the Federal Commission on Genetic Tests will be confronted with various issues surrounding predictive genetic tests. Neither commission, however, evaluates the screening interventions themselves. Screening is not the only complex medical intervention for which comprehensive evaluation is needed to advise healthcare professionals, health policy decision makers and the public. There are at least three other specialised federal commissions in Switzerland giving advice to professionals and the authorities on the adoption of new interventions after in-depth evaluation: immunisation; AIDS-related questions; and tobacco, alcohol and drug use. These commissions have their own budgets for communication and the conduct of evaluations and assessments.
Health technology assessment (HTA) provides an established methodological framework for evaluation that is well suited to the assessment of a complex population-based intervention such as screening (Box 2: HTA definitions [80]). HTA summarises available information on clinical and cost effectiveness as well as on societal aspects of health technologies; the results are directed mainly to decision makers at the institutional, administrative and political levels in assisting decision-making on implementation, financing or reimbursement. The following example illustrates the application of HTA to screening. In 1998 the Advisory Council on Women's Health in British Columbia, Canada, asked the British Columbia Office of Health Technology Assessment (BCOHTA) to: review current practice on triple marker screening (TMS) in pregnancy for early diagnosis of Down's syndrome; assess the performance of the tests; and critically examine the broader social, ethical and economic implications of establishing a TMS programme in British Columbia. BCOHTA used a variety of methods to address the research questions. Quantitative methods included a systematic literature review, analysis of routinely collected data, and economic modelling. Qualitative methods included focus group interviews with parents and caregivers of children with Down's syndrome or spina bifida, genetic counsellors, and primary care providers. The authors then formulated several policy options on how to offer and organise TMS in this Canadian province [81].

Box 2
Characteristics of HTA.

Conclusion
There is increasing pressure from politicians, healthcare providers and the public to make new screening tests available even in the absence of RCT evidence. Opportunities for screening are bound to increase in view of the increasing prevalence of degenerative diseases and technological advances in diagnostic and therapeutic procedures. There are good examples showing that effective screening may have a profound impact on the population's health, e.g., in cardiovascular disease (screening for hypertension or for dyslipidaemia) or in congenital malformations (e.g., aneuploidies). However, providing a new screening test on its own does not automatically result in a health benefit for the population screened.
The examples of PSA, breast cancer and genetic screening show that, in Switzerland, there is a deficit in the structure of scientific advice to the population, healthcare providers and the authorities. It is therefore time for Switzerland to follow the example of other countries. The Swiss healthcare system needs a national screening commission that is not influenced by vested interests and is mandated to conduct HTA on specific screening-related questions, to give advice to the public, clinicians, and decision makers, to issue recommendations and to supervise the performance of the screening programmes that are introduced. Without such an explicit effort, there is a danger in Switzerland that some beneficial screening programmes will be neglected and other ineffective, inefficient and potentially harmful screening procedures introduced.

Funding / potential competing interests
No funding; no competing interests.