Cohort Profile: the Million Women Study

The Million Women Study started recruiting participants over 20 years ago, in 1996. The initial stimulus was to obtain robust prospective information on the risk of breast cancer associated with use of different types of menopausal hormone therapy (HT). When planning the necessary large-scale prospective study, an equally important aim was to obtain reliable information on the effects of other potentially modifiable factors that affect women's health as they age.
In the early 1990s use of HT increased rapidly in the UK and elsewhere, stimulated in part by claims that use of HT could improve general well-being and increase life expectancy. By the mid-1990s, however, worldwide evidence was beginning to show that HT preparations increased breast cancer risk, though there was little information about the effect of the type of HT most commonly used in Europe, containing both oestrogens and progestagens. 1 It was also clear that women born in the 1940s, who reached adulthood in the 1960s, had considerably different lifestyles compared with previous generations. For example, large proportions had begun smoking and using oral contraceptives as teenagers and young adults, and the long-term effects of these behaviours could not be studied reliably until the 1990s. At the same time there was growing concern about the effects of the increasing prevalence of obesity, and claims that other factors such as diet had important effects on health, all of which required large-scale prospective evidence.
The UK National Health Service (NHS) provides extraordinarily efficient ways of establishing and maintaining long-term follow-up for large prospective epidemiological studies. Over 99% of the UK population, and all Million Women Study participants, are registered with the NHS, and every individual has a unique NHS number. Electronic linkage, using each individual's NHS number, to routinely collected NHS databases provides virtually complete follow-up information about deaths, emigrations, cancer registrations and hospital admissions.
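As a minimal sketch of the linkage idea described above (not the study's actual pipeline), the Python below joins a cohort list to a routinely collected dataset on a shared identifier. The record layouts, field names and identifier values are all hypothetical and invented for illustration.

```python
from collections import defaultdict


def link_records(cohort, events, key="nhs_number"):
    """Attach every event record to the cohort member with the same identifier."""
    events_by_id = defaultdict(list)
    for event in events:
        events_by_id[event[key]].append(event)
    # Members with no matching events get an empty list rather than being dropped;
    # events for identifiers outside the cohort are ignored.
    return {member[key]: events_by_id.get(member[key], []) for member in cohort}


# Hypothetical, made-up records (these are not real NHS numbers or real data).
cohort = [{"nhs_number": "A001"}, {"nhs_number": "A002"}]
routine_data = [
    {"nhs_number": "A001", "type": "hospital_admission", "date": "2005-03-14"},
    {"nhs_number": "A001", "type": "cancer_registration", "date": "2010-07-02"},
    {"nhs_number": "Z999", "type": "hospital_admission", "date": "2008-01-20"},
]

linked = link_records(cohort, routine_data)
```

In this toy example the first participant is linked to two events and the second to none; the record for an identifier outside the cohort is not linked to anyone.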
The NHS Breast Screening Programme invites all UK women registered with the NHS, of a specified age, for free routine breast screening every 3 years. In 1996-2001 the programme routinely invited women aged 50-64 years for mammographic screening, by sending each individual a letter offering them a specific date and time at a specific screening centre. In 66 NHS screening centres, the Million Women Study recruitment questionnaire was included with the invitation letter for screening. Pilot studies in 1994-96 had shown that inclusion of a questionnaire with the invitation did not affect uptake of breast screening. 2 The coordinating centre for the Million Women Study is based in the Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford.
The study was set up in collaboration with the NHS Breast Screening Programme, and is now funded mainly by the UK Medical Research Council and Cancer Research UK. Further information, including the study protocol, copies of questionnaires, data collected, data access policy and list of publications, can be found on the study website [www.millionwomenstudy.org].
Who is in the cohort?
Between 1996 and 2001, the Million Women Study recruited about one in every four UK women born in 1935-1950, i.e. in the eligible age range (50-64 years) at the time of recruitment. The 66 NHS breast screening centres that recruited participants (Figure 1) covered about half of the UK population. Half of the women invited by the participating screening centres brought completed questionnaires with them when they were screened, or posted their questionnaires to the Million Women Study Coordinating Centre. The recruitment questionnaire asked about sociodemographic, anthropometric, behavioural and reproductive factors, and also about women's past health. Participants gave written consent for re-contact and for follow-up through screening clinic and other medical records.
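The "one in every four" figure follows from the two approximate proportions given above. The short calculation below is only a back-of-the-envelope check; it assumes that essentially all eligible women in the areas covered were invited, and it treats the rounded proportions as exact.

```python
# Approximate proportions taken from the paragraph above.
population_coverage = 0.5  # participating centres covered about half of the UK population
response_fraction = 0.5    # half of the invited women returned a questionnaire

# Fraction of all eligible UK women recruited, under the stated assumption.
recruited_fraction = population_coverage * response_fraction
```

The product is one quarter, consistent with the statement that about one in four eligible UK women joined the study.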
The study design and methods were first reported in 1999. 3 Selected characteristics of the cohort at recruitment, and details of subsequent follow-up, are shown in Table 1. The cohort included 1.32 million women without previous cancer. The median year of birth of the cohort was 1942, and at recruitment in 1996-2001 women were aged 56 [standard deviation (SD) 5] years on average. The study includes women with a wide range of backgrounds, behaviours and lifestyles at recruitment and their characteristics were, not surprisingly, broadly similar to those of all UK women of their age at the time. A fifth were current smokers at recruitment. Typical of UK women of their age, almost a quarter reported that they did not drink alcohol, and among the drinkers the average consumption was about five drinks per week. The cohort includes some 150 000 women (11%) from the lowest quintile of the national deprivation index, based on the Townsend score, 4 and so although the proportion is somewhat less than the national average, there are sufficiently large numbers to study associations reliably across the full range of socioeconomic status in the UK.

How often have they been followed up?
The Million Women Study is an open-ended prospective study of women in England and Scotland. The entire cohort is followed up annually by record linkage to routinely collected NHS data on deaths, emigrations, cancers and hospital admissions. For some study participants, additional electronic linked health data are available, for example for primary care consultations and prescriptions, and for cancer screening. Table 2 gives a summary of the type of health and health care follow-up data routinely collected, and of data providers.
Data on deaths and hospital admissions are available to 1 April 2017. By that date, 14% (185 233) of women had died, and 85% (1 128 056) had had at least one admission to hospital. Only 1.4% (n = 19 705) of the cohort had been lost to follow-up by emigration, withdrawal from the NHS or for some other reason. Such women are included in relevant analyses until the date of their withdrawal from follow-up. Data on registered cancers have been provided to 1 January 2016, and 15% (n = 201 988) of women had an incident cancer (excluding non-melanoma skin cancer) registered by this date.
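The rule that women lost to follow-up contribute to analyses until their date of withdrawal can be sketched as a simple censoring calculation. This is an illustrative sketch only, not the study's analysis code: the function name, its parameters and the participant dates are hypothetical, and only the end date of 1 April 2017 comes from the text.

```python
from datetime import date

END_OF_FOLLOW_UP = date(2017, 4, 1)  # last date with death and admission data (from the text)


def follow_up_years(recruited, died=None, lost=None, end=END_OF_FOLLOW_UP):
    """Years at risk from recruitment to the earliest of death, loss to follow-up or end date."""
    exit_date = min(d for d in (died, lost, end) if d is not None)
    return (exit_date - recruited).days / 365.25


# Hypothetical participants; all dates are invented for illustration.
complete = follow_up_years(date(1998, 4, 1))                          # followed to the end date
emigrated = follow_up_years(date(1998, 4, 1), lost=date(2008, 4, 1))  # censored at emigration
```

For analyses of cancer incidence, the `end` argument would instead be set to 1 January 2016, the date to which cancer registrations were available.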
Thus far, four re-survey postal questionnaires (at 3, 8, 12 and 15 years, on average, after recruitment) have been sent to all survivors, to obtain information on important factors that may change over time, such as smoking, alcohol consumption, weight and physical activity, and to collect new information on other exposures. Information has also been collected for subsets of participants through additional postal and online questionnaires, e.g. for diet and daily activities.
What has been measured?
At recruitment, and through the subsequent four re-survey questionnaires, information on about 1400 variables has been collected, about a range of sociodemographic, lifestyle and other personal factors. The questionnaires, and a summary table of the information collected at each, can be viewed on the study website. Some characteristics of women who responded to the 3-year and 8-year re-surveys are shown in Table 3.
Various validation and other studies have been done in subsets of the cohort to address methodological issues of measurement error, regression dilution and changes in exposures over time. [5][6][7][8][9][10] For example, height, weight, waist circumference, hip circumference and blood pressure were measured for about 4000 women to quantify measurement errors in self-reported data. 5 Self-reported information on menopausal HT use, 6 cervical screening, 7 and knee and hip replacement 8 has been compared with that in NHS records. Online 24-h recall of diet has been collected and repeat dietary questionnaires completed, to assess the repeatability of self-reported dietary data over time. 9,10 Clinical outcomes recorded in routinely collected NHS data have been compared with those recorded in medical notes, primary care records or screening records for breast cancer, 11 vascular disease, 12 motor neurone disease 13 and dementia. 14 All the investigations have indicated the excellent reliability of the routinely collected NHS diagnostic data. Since 2006, blood samples have been collected from about a 5% sample of women in the study for genetic and biochemical analyses, mainly concerning breast cancer. 15,16
What has it found? Key findings and publications
Full details of the wide range of findings and publications using Million Women Study data are available on the study website. Two key findings are as follows.
i. We have shown that the risks of cancers of the breast and endometrium vary substantially by the type of HT used. 17,18 Use of oestrogen-progestagen preparations causes much greater increases in the risk of breast cancer than oestrogen-only preparations, whereas the reverse is found for endometrial cancer. For ovarian cancer, use of HT slightly increases risk, but there is no difference in the effects of oestrogen-progestagen and oestrogen-only preparations. 19 Because breast cancer is much more common than endometrial or ovarian cancer, the overall effect of HT on the three cancer types is dominated by the effects on breast cancer. Hence users of oestrogen-progestagen HT have substantially higher absolute risks of the three cancers together than users of oestrogen-only preparations or than women who do not use HT (Figure 2).
ii. We have shown that the hazards of smoking, and also the benefits of stopping smoking, in women are greater than previously thought. 20 The Million Women Study is well placed to estimate these risks, because women born in the 1930s and 1940s were the first generation in the UK to start smoking substantial numbers of cigarettes regularly in early adulthood and to continue to do so throughout their lives. Smokers were three times more likely to die prematurely than never smokers, the equivalent of losing 11 years of life, on average. We also found that stopping smoking is more effective in reducing the excess risk than previously thought; a woman who stops smoking at age 40 avoids about 90% of the excess risk associated with continued smoking (Figure 3).
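To make the smoking figures quoted above concrete, the sketch below converts "three times the risk" and "about 90% of the excess risk avoided" into an approximate remaining relative risk. It is a back-of-the-envelope illustration that treats the rounded figures in the text as exact; the function name and parameters are hypothetical and this is not the study's own analysis.

```python
def remaining_relative_risk(rr_continuing, fraction_excess_avoided):
    """Relative risk (vs never smokers) left after avoiding a fraction of the excess risk."""
    excess = rr_continuing - 1.0  # excess risk of continuing smokers, relative to never smokers
    return 1.0 + excess * (1.0 - fraction_excess_avoided)


# Approximate figures from the text: threefold risk in smokers, about 90% of the excess avoided.
rr_after_stopping_at_40 = remaining_relative_risk(3.0, 0.90)
```

On these rounded inputs, a woman who stops at age 40 would be left with a relative risk of roughly 1.2 compared with a never smoker, rather than about 3 if she continued to smoke.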
What are the main strengths and weaknesses of the study?
Strengths include: the large size of the cohort, which provides sufficient statistical power to study outcomes reliably and to compare risks across disease types and subtypes; the prospective collection of exposure information; and the virtually complete, long-term follow-up for major outcomes through linkage to routinely collected, complete, reliable, national electronic health records. Re-surveys and measurements of various factors provide repeated measures of changing exposures, so that analyses can take account of measurement error, regression dilution and changes in exposure over time. Data linkage offers the potential for continued, cost-effective follow-up for many more years and, as routine health databases expand, for incorporation of additional data.
The study has the advantage that data are available to address a wide range of questions, now and in the future. Its observational nature, however, means that it is sometimes difficult to establish causation. The great majority of study participants are of White ethnicity (96%) and the cohort has no information on men. Blood samples are available only for a minority of participants, and were provided several years after recruitment. The current lack of routinely recorded information available for NHS outpatient diagnoses, and the fact that information is currently available from primary care only for a sample of the cohort, mean that there is at present limited information on outcomes not involving hospital admission.
Can I get hold of the data? Can I find out more?
The Million Women Study welcomes proposals for data access and sharing from bona fide researchers; details of the study, the data available and the data access application process can be found on the study website [http://www.millionwomenstudy.org/data_access]. Enquiries should be directed through the website to the Administrator, Richard Doll Centenary Archive.