Forecasted trends in vaccination coverage and correlations with socioeconomic factors: a global time-series analysis over 30 years

Background Incomplete immunisation coverage causes preventable illness and death in both developing and developed countries. Identification of factors that might modulate coverage could inform effective immunisation programmes and policies. We constructed a performance indicator that could quantitatively approximate measures of the susceptibility of immunisation programmes to coverage losses, with an aim to identify correlations between trends in vaccine coverage and socioeconomic factors.

Methods We undertook a data-driven time-series analysis to examine trends in coverage of diphtheria, tetanus, and pertussis (DTP) vaccination across 190 countries over the past 30 years. We grouped countries into six world regions according to WHO classifications. We used Gaussian process regression to forecast future coverage rates and provide a vaccine performance index: a summary measure of the strength of immunisation coverage in a country.

Findings Overall vaccine coverage increased in all six world regions between 1980 and 2010, with variation in volatility and trends. Our vaccine performance index identified that 53 countries had more than a 50% chance of missing the Global Vaccine Action Plan (GVAP) target of 90% worldwide coverage with three doses of DTP (DTP3) by 2015. These countries were mostly in sub-Saharan Africa and south Asia, but Austria and Ukraine also featured. Factors associated with DTP3 immunisation coverage varied by world region: personal income (Spearman’s ρ=0·66, p=0·0011) and government health spending (0·66, p<0·0001) were informative of immunisation coverage in the Eastern Mediterranean between 1980 and 2010, whereas primary school completion was informative of coverage in Africa (0·56, p<0·0001) over the same period. The proportion of births attended by skilled health staff correlated significantly with immunisation coverage across many world regions.
of failing to achieve the GVAP target of 90% coverage by 2015, and could aid policy makers’ assessments of the strength and resilience of immunisation programmes. Weakening correlations with socioeconomic factors show a need to tackle vaccine confidence, whereas strengthening correlations point to clear factors to address.


Introduction
Rates of vaccine-preventable diseases have decreased in many parts of the world in the past few decades, but many children remain unvaccinated. In 2013, UNICEF reported that 21·8 million children younger than 1 year had not completed the diphtheria, tetanus, and pertussis (DTP) immunisation series, with a similar number not receiving a single measles vaccination. 1 Although access to vaccinations is the main barrier in many settings, growing numbers of parents do not immunise their children based on their own personal attitudes. 2,3 Both socioeconomic and attitudinal barriers to vaccination coverage have been identified in several countries. A set of recurring socioeconomic and demographic correlates of coverage has been reported, with parental education level, [4][5][6][7][8] age, 6,9,10 employment status and workplace, 4,6,7 religion, 11 ethnic origin, 12 gender of the child, 13 poverty, [14][15][16][17][18] and distance to health-care facilities 4,6,11,14,17,18 all linked to vaccine uptake, although marked differences often exist between countries. 11 Repeating themes have likewise been identified in examination of personal reasons for vaccine acceptance or delay, which range from perceived risks about potential adverse events, to religious or political beliefs that are external to, albeit influential in, vaccination. 9,[19][20][21][22][23][24][25][26][27] Trust in health-care professionals 19,20,23,24 and the government 19,23,26 also features highly.
Recommendations to address gaps in vaccine coverage are context dependent. Targeting of mothers with low education 28 or low health literacy, 29 and dissemination of more information to communicate vaccine benefits and risks transparently 2 have been suggested to help reduce vaccine hesitancy, as has the customisation of messages and engagement efforts for specific groups. 30 Efficient identification of clusters of non-vaccinators through monitoring of local immunisation rates has been identified as a key public health challenge for the prevention of outbreaks when local groups adopt a non-vaccination status (as was the case in the 2014-15 measles outbreak at Disneyland, Anaheim, CA, USA). 31 Monitoring of trust in immunisation programmes through various indices has also been proposed as a way to better understand the circumstances in which vaccine hesitancy might arise. 32,33 To build on the current literature, we constructed a vaccine performance index: a summary measure of the strength of immunisation coverage in a country. This index could be used to describe the variability in vaccine coverage induced by confidence and access issues, allowing for an intercountry analysis of trends. We interpreted the coverage values in view of the Global Vaccine Action Plan (GVAP)'s target of attaining 90% vaccination coverage in all countries by 2015. 34 We discuss the implications of both the correlative study and the performance index on immunisation strategies. We undertook a large-scale, worldwide correlative analysis to investigate the association between a range of socioeconomic factors and vaccine coverage, and to identify temporal trends and differences between WHO world regions.

Data
We examined a broad range of quantitative data for 190 descriptive socioeconomic factors for 190 countries from 1980 to 2010. Factors spanned economics, health care, industry, demographics, communications, infrastructure, physical geography, trade, and education. This breadth of socioeconomic factors was used to expose, in an unbiased way, which indicators might be related to immunisation coverage. Countries were grouped into six world regions according to WHO classifications: Africa, the Americas, the Eastern Mediterranean, Europe, South-East Asia, and the Western Pacific. 35 We obtained data from the Gapminder website (www.gapminder.org), which draws from sources including the World Bank, the International Labour Organization, and WHO. A data curation approach was used to filter and impute missing data (appendix).
Data for vaccine coverage (the proportion of a target group immunised) were obtained for BCG tuberculosis vaccine, DTP1 and DTP3 (representing the first and third doses, respectively, and typically scheduled for 6 months after birth), measles-containing vaccine (MCV), and polio (POL3, representing the third dose of the polio vaccination

Research in context
Evidence before this study We searched Google Scholar between Jan 1, 1980, and Jan 1, 2016, for studies assessing socioeconomic correlates of vaccine coverage. Our search terms included the words "socio-economic", "demographic", "correlates", and "vaccine coverage", with variations in each individual term allowed. For example, "vaccine" could be replaced with "vaccination" or "immunise" with "immunisation", in addition to specific names of vaccine-preventable diseases such as "DTP" and "MMR"; alternative spellings, such as "immunize", were also allowed. For attitudinal correlates of vaccine coverage, we used search terms such as "vaccine hesitancy", "vaccine acceptance", and "vaccine delay". We identified few publications that examined differences in both attitudinal and socioeconomic correlates between countries, although we found some recent reviews that assessed differing attitudes towards vaccination uptake across Europe, and differing socioeconomic correlates in east African countries. Both attitudinal and socioeconomic surveys reported various correlates of vaccine uptake. Attitudes range from personal, religious, and political beliefs, to trust in health care and the government. Socioeconomic correlates vary between countries, but education level, religion, ethnic origin, and distance to health-care facilities are all recurring themes.

Added value of this study
We studied correlations between 190 socioeconomic factors and immunisation coverage in 190 countries, and report variations in the strength of socioeconomic correlates between world regions and across time. To our knowledge, this work represents the broadest analysis of socioeconomic links to vaccine coverage yet available, allowing insights into how socioeconomic correlates modulate coverage and how these correlates vary by world region. Forecasting of vaccine coverage time series enabled us to summarise the recent trend and variability in uptake rates and construct a vaccine performance index, which is, as far as we know, the first quantitatively derived marker of vaccine performance. Our predictive performance index represents a distinctive, interpretable measure to assess likely future vaccine coverage behaviour and resilience of immunisation programmes to volatile changes triggered by external shocks, whether driven by political rumour or natural disaster.

Implications of all the available evidence
By use of a probabilistic forecast of future vaccination levels, we have provided a world map of a vaccine performance index that is informative of both the likelihood of stagnated or substandard coverage levels and the chance that coverage levels will decline. These forecasts can readily identify countries that are likely to fail to achieve the Global Vaccine Action Plan target of 90% coverage of diphtheria, tetanus, and pertussis vaccine by 2015. We speculate that regions with strong socioeconomic ties to vaccination coverage can be interpreted as targets for intervention (because socioeconomic factors present barriers to vaccination), but that some regions (namely, Europe and the Americas), with low ties to socioeconomic factors, have increased attitudinal barriers to vaccination.
programme) from the 2011 WHO-UNICEF estimates of vaccine coverage 36 for the same countries and across the same time period, although we used the latest estimates in our vaccine performance index. Immunisation rates have increased across all WHO regions in the past three decades: coverage is highest in Europe and the Americas, with other regions achieving less complete coverage (figure 1A). Joint investigation of all vaccine time series showed that strong mutual correlations exist between the DTP1, DTP3, MCV, and POL3 vaccines, but that BCG falls outside this grouping (figure 1B). Time series of DTP3 coverage in individual countries differ substantially in magnitude, variability, and trends (figure 1C). We henceforth focused on DTP3 coverage because of this high correlation and in accordance with previous literature in which DTP3 is regarded as a marker of the strength of a country's immunisation programmes (because the vaccine requires three different administrations). 37 A correlative analysis of the other vaccines is presented in the appendix.
We illustrate time-series data for socioeconomic factors with two examples: female education (figure 1D) and rural access to water (figure 1E). Figure 1F shows an example scatter plot of average DTP3 coverage against female education levels for three regions. We further explored the dataset by use of t-distributed stochastic neighbour embedding (t-SNE), which is a local structure-preserving dimensionality-reduction technique that can improve visualisation quality of high-dimensional data over other embedding methods. 38 We used t-SNE for visualisation of a similarity matrix, constructed by taking correlations across all countries between all pairs of socioeconomic factors and vaccines in a single year ($x_{it}$ for socioeconomic factor or vaccine $i$ in year $t$) and averaging the correlation across all 31 years:

$$S_{ij} = \frac{1}{T} \sum_{t=1}^{T} \rho(x_{it}, x_{jt})$$

where $S_{ij}$ are elements of the similarity matrix $S$, $T$ is the total number of years, and $\rho$ is the Spearman's rank correlation between two socioeconomic factors (or vaccines). Application of t-SNE showed strong links between socioeconomic factors in similar categories and revealed a separation of the BCG vaccine, which is closer in t-SNE space to factors relating to health spending, from the other four vaccines, which seem to be more correlated with schooling variables (figure 1G).
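The construction of the similarity matrix can be illustrated with a minimal sketch. The authors report using MATLAB; the Python below is only an illustration under stated assumptions: the function names (`average_ranks`, `spearman`, `similarity_matrix`), the nested-list data layout, and the toy numbers are all hypothetical and not taken from the study.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = tied_rank
        start = end + 1
    return ranks

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the two rank vectors.
    Undefined (division by zero) if either input is constant."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

def similarity_matrix(data):
    """data[t][i] holds the per-country values of variable i in year t.
    Returns S with S[i][j] = mean over years of spearman(x_it, x_jt)."""
    n_years = len(data)
    n_vars = len(data[0])
    return [
        [sum(spearman(data[t][i], data[t][j]) for t in range(n_years)) / n_years
         for j in range(n_vars)]
        for i in range(n_vars)
    ]

# Made-up toy data: 2 years, 3 variables, 4 countries.
toy = [
    [[1, 2, 3, 4], [10, 20, 30, 40], [4, 3, 2, 1]],
    [[1, 2, 3, 4], [10, 30, 20, 40], [4, 3, 2, 1]],
]
S = similarity_matrix(toy)
```

In this toy case the first two variables agree perfectly in one year and nearly so in the other, so their averaged similarity is high, whereas the third variable is perfectly anti-correlated with the first. A matrix of this kind would then be passed to an embedding method such as t-SNE, which is not shown here.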

Correlative analysis
We used Spearman's rank correlation to explore links between coverage and socioeconomic factors (figure 1F) because coverage data are confined to the 0-100 interval and non-linear relations often arise between coverage and socioeconomic factors (which might be ordinal). To investigate the strengths of correlations with coverage over time, we computed the correlation between each factor ($x_t$) and coverage ($y_t$) in a given year ($t$) across all countries in a particular region, $\rho(x_t, y_t)$ (dropping the index $i$ that refers to the socioeconomic indicator); we then took the time-average of this value:

$$\bar{\rho} = \frac{1}{t_2 - t_1 + 1} \sum_{t=t_1}^{t_2} \rho(x_t, y_t)$$

where $t_1$ is the most distant year and $t_2$ is the most recent year. For historic trends, we considered 1980 as $t_1$ and 2010 as $t_2$ and, for recent trends, we considered 2001 as $t_1$. To focus on the strongest signals, we agglomerated the top-ten correlating socioeconomic factors in each region across both time ranges for a cross comparison of informative correlates between regions; we restricted these socioeconomic factors to those with a Bonferroni-corrected meta p value (against the null hypothesis that the set of correlations were uninformative) of less than 0·01. In addition to these socioeconomic factors, we included other factors that are related and useful for comparison: primary school completion rate (male), improved sanitation access (overall and rural), and improved urban water access (appendix).
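The historic and recent windows differ only in the first year of the average. The short Python sketch below shows that averaging step, assuming the yearly correlations between one factor and coverage in one region have already been computed. The function name `time_averaged_correlation` and the yearly values are hypothetical and purely illustrative.

```python
def time_averaged_correlation(rho_by_year, t1, t2):
    """Mean of the yearly correlations rho(x_t, y_t) for t1 <= t <= t2 (inclusive)."""
    years = range(t1, t2 + 1)
    missing = [t for t in years if t not in rho_by_year]
    if missing:
        raise ValueError(f"no correlation available for years {missing}")
    return sum(rho_by_year[t] for t in years) / len(years)

# Made-up yearly correlations: strong until 2000, weaker afterwards.
rho_by_year = {year: (0.6 if year <= 2000 else 0.2) for year in range(1980, 2011)}

historic = time_averaged_correlation(rho_by_year, 1980, 2010)  # 31 years
recent = time_averaged_correlation(rho_by_year, 2001, 2010)    # 10 years
```

With these illustrative numbers the recent average is lower than the historic one, which is the pattern the text describes as a weakening correlation.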

Forecasting
We first logit-transformed vaccine coverage data:

$$y' = \log\left(\frac{y}{100 - y}\right)$$

where $y'$ is transformed vaccine coverage and $y$ is raw coverage, to map the coverage values to the real line, so that Gaussian distributions over datapoints were not truncated at values of 0 or 100. We used Gaussian processes, with a squared exponential covariance function and linear mean function (to account for the broadly linear increases in trend in this transformed space), to forecast vaccine coverage data, based on previous coverage values. The appendix summarises the prediction accuracy of this method, which is quantified by forecasting on a test dataset. We used the predictive distribution over forecasted values to define a summary measure, the vaccine performance index, which balances the (desirable) probability of attainment of high future coverage against the (undesirable) probability of experiencing a future negative perturbation to coverage, informed by preceding trends and variability. We defined the vaccine performance index as the probability of achieving coverage in excess of a threshold value ($v$), less the probability of a drop in coverage of size at least $d$:

$$\mathrm{VPI}(t_*) = P(y_{t_*} > v) - P(y_{t_N} - y_{t_*} \geq d)$$

where $\mathrm{VPI}(t_*)$ is the value of the vaccine performance index at time $t_*$, $t_N$ is the most recent point for which coverage exists, and $t_* = t_N + \tau$ is the forecasted point (so that, for example, $t_N = 1999$ and $t_* = 2000$ for a country's 2000 vaccine performance index value) and for which we mapped the predictive distributions back to the range 0-100. This expression is reweightable if the reader or policy maker wishes to focus on a different combination of strategic properties. There are natural alternative forms to this index; for example, to assess the GVAP goal of 90% coverage in all countries by 2015, we set $\tau$ equal to 2 years, $v$ equal to 90, and $d$ equal to infinity, thus using our model to forecast the probability that coverage in 2015 will exceed 90%. We used MATLAB (version 2015a) for all analyses.
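To make the index concrete, the following Python sketch computes a value of this kind from a Gaussian forecast in logit space. It is not the authors' MATLAB implementation. It assumes that the logit takes the form log(y / (100 - y)), that a Gaussian process fit (not shown) has already produced a predictive mean `mu` and standard deviation `sigma` for the transformed coverage at the forecast point, and that the drop is measured relative to the last observed coverage. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

def logit_percent(y):
    """Map coverage y in (0, 100) to the real line."""
    return math.log(y / (100.0 - y))

def vaccine_performance_index(mu, sigma, last_coverage, v=95.0, d=2.0):
    """Return P(y* > v) - P(y* <= last_coverage - d), where the forecast
    logit(y*) is assumed to be Normal(mu, sigma).

    Because the logit is monotone, both probabilities can be evaluated
    in logit space without transforming the distribution back."""
    forecast = NormalDist(mu, sigma)
    p_exceed = 1.0 - forecast.cdf(logit_percent(v))
    drop_level = last_coverage - d
    if drop_level <= 0.0:
        # A drop this large is impossible on the 0-100 scale (covers d = infinity).
        p_drop = 0.0
    else:
        p_drop = forecast.cdf(logit_percent(drop_level))
    return p_exceed - p_drop

# Made-up forecast: centred on 92% coverage, moderate uncertainty.
mu, sigma = logit_percent(92.0), 0.4

# Default form: exceed 95% versus a drop of at least 2 points.
vpi_default = vaccine_performance_index(mu, sigma, last_coverage=92.0)

# GVAP-style form: v = 90 and d = infinity reduce the index to P(y* > 90).
gvap_probability = vaccine_performance_index(
    mu, sigma, last_coverage=92.0, v=90.0, d=math.inf
)
```

The index lies between -1 (a drop is near certain and the threshold is out of reach) and +1 (the threshold is near certain and a drop is very unlikely). Setting `d` to infinity removes the penalty term, which is how the text describes the GVAP assessment.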

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Results
We assessed countries' trends and variability by considering the probability of coverage exceeding 95% minus the probability of coverage falling by more than 2% 1 year in the future (ie, v=95, τ=1, and d=2); the appendix details the justification for these parameter values and alternative forms of this index, which might be natural alternatives that address different policy-making goals. Figure 2A 2, appendix). Surprisingly, many European countries have struggled to increase coverage to top levels: Norway, Iceland, Ukraine, and Romania, despite reasonably high DTP3 coverage rates in 2013 (83%, 91%, 94%, and 89%, respectively) all have vaccine performance index values far from +1 (figure 2, appendix), signifying coverage rates that are below 95%, and show little sign of improvement beyond this threshold. Findings for Africa are mixed: north African countries have high vaccine performance index values, whereas countries in sub-Saharan Africa generally have low values due to high variability and low coverage, with some exceptions, notably Rwanda, Burundi, Botswana, and Zimbabwe (figure 2, appendix). These low and volatile coverage rates are likely to be a sign of the vast number of people in sub-Saharan Africa with poor continued access to latter doses in the DTP programme.
South-East Asia generally performs poorly; many countries have vaccine performance index values close to -1 (figure 2, appendix), suggesting that chances of a decline in coverage in the following year are high. Many Eastern Mediterranean countries have shown clear improvement in the past decade, with countries typically presenting high, sustained coverage rates, although Syria, Iraq, Pakistan, Afghanistan, and Yemen are notable exceptions (figure 2, appendix).
To assess the GVAP goal of 90% coverage in all countries by 2015, we used our vaccine performance index to forecast the probability that coverage in 2015 (2 years after the most recent available datapoints, in 2013) will exceed 90%. This analysis yielded a bimodal distribution (figure 3), whereby countries will either comfortably reach this goal or fall rather short of it (with the exception of South America, which has many borderline countries). European countries generally perform better than other regions (with the notable exceptions of Denmark, Iceland, Romania, Austria, Moldova, San Marino, and Ukraine), with an almost certain chance of reaching the GVAP goal (on the basis of their recent trend; figure 3). Countries in sub-Saharan Africa and the Indian subcontinent (with only few exceptions) seem likely to drastically miss the GVAP target (figure 3). This inference and forecasting approach highlights not only regions that require stronger action, but also borderline countries where comparatively minor action might realise the GVAP goal.
From the broad range of 190 socioeconomic factors investigated, those shown in figure 4A can be interpreted as the most likely to have a genuine (and consistent) link with DTP3 coverage over time. The strength of correlation between socioeconomic factors and DTP3 coverage varies between regions, over time, and across socioeconomic factors (figure 4A). Births attended by skilled health staff and access to sanitation are the two factors that have mean-averaged p values of less than 0·05 across the largest number of regions (figure 4A). In Africa, the Americas, and Europe, the magnitudes of correlations are high historically, but have decreased recently, and display patterns that are broadly similar: the importance of access to water and sanitation, births attended by skilled health staff, and economic factors in determining coverage is historically high, but diminishing (particularly in the Americas; figure 4A). An exception to this finding is the correlation with the service industry (the service socioeconomic factor is the net output of the service industry, defined as wholesale and retail trade, transport, and other governmental services such as education and health care 39 ), which is reasonably high and increasing in Africa and the Americas. By contrast, in the Eastern Mediterranean and the Western Pacific region, socioeconomic factors have remained correlated over time (figure 4A). The South-East Asia region has a more mixed pattern of recent changes in socioeconomic connections with, for example, a decreasing link to rates of primary school completion and an increasing link to water access and sanitation (figure 4A).
In both Africa and the Eastern Mediterranean, educational variables and access to water are particularly informative about DTP3 coverage, whilst in the Eastern Mediterranean alone, government health spending, income metrics, access to sanitation, phone use, and CO2 emissions are strongly linked with coverage (figure 4A). Socioeconomic correlates are generally low (both recently and historically) in Europe, and are reasonably low in the Americas (figure 4A). In South-East Asia, education ratio and access to water are notably high correlates that have not decreased in importance (figure 4A). In the Western Pacific region, births attended by skilled health staff is the strongest correlate (and one that appears to be becoming more informative of DTP3 coverage; figure 4A).
The association between public health spending and coverage varies globally, with a strong link in the Eastern Mediterranean and weaker links in South-East Asia, Africa, and the Western Pacific (where the link is increasing), and the Americas (where it is decreasing; figure 4A). Only in the Eastern Mediterranean is gross domestic product per capita the strongest correlate of vaccine coverage; in all other regions the strongest links are with less directly economic factors (figure 4A).
We noted a significant difference in the correlation strength between DTP3 coverage and male and female education in Africa and South-East Asia (figure 4B; p value from paired t tests <0·0001 for both); correlations were not significant for the other regions at the p=0·05 level, but paternal education was more informative in the Western Pacific.
We plotted these correlations for the other vaccinations we examined; some notable observations arose from the comparison of these other vaccines with DTP3 (appendix). One regional trend was that the effect of public health spending and access to sanitation in South-East Asia, which was moderate for DTP3, was very high and often increasing for other vaccines. Another observation was that the Eastern Mediterranean, where we recorded stable or decreasing correlations with several socioeconomic factors for DTP3, instead showed increasing links with all factors (except investments) for BCG. The Americas, which had decreasing correlations for most factors with DTP3, showed decreases to almost Europe-like absences of coverage correlation in DTP1 and MCV, although all other regions and factors had similar patterns for DTP3 and MCV coverage. Europe is broadly the same across all vaccines except BCG, for which correlations seem to be negative between factors such as government health spending, sanitation access, income, and BCG coverage. All estimated correlations in the study were robust to our imputation procedure (appendix).

Discussion
In keeping with findings from previous studies, our study found several socioeconomic factors to be informative of immunisation levels, such as correlates related to health-care facilities 3 and educational variables. [4][5][6][7][8] Literature from the past few years has indeed identified various socioeconomic and attitudinal barriers to vaccination coverage for a number of countries, such as parental education level, 4-8 age, 6,9,10 employment status and workplace, 4,6,7 poverty, [14][15][16][17][18] inequity, 40 and distance to health-care facilities. 4,6,11,14,17,18 Findings from a systematic review 41 emphasised the link between out-of-hospital birth and reduced immunisation rates in countries with medium or low Human Development Index (HDI) scores, and associated the use of private health-care services and no health insurance with low immunisation levels in countries with very high HDI scores. We likewise found a link between income and births attended by skilled health staff and immunisation levels. We believe we are the first to identify several other potential barriers to immunisation coverage; for example, the link between governmental health spending and primary school completion rates (in addition to income) across Eastern Mediterranean countries, and access to water and primary school completion rates in Africa. We suggest that having a set of low-magnitude socioeconomic correlations with vaccine coverage corresponds to an encouraging state wherein vaccine access is available to most of the population. Conversely, when strong socioeconomic correlations exist, they signal potential limiting factors in vaccine access, although further analyses would be needed to infer causation between these factors and coverage.
Hence, Europe, which has displayed consistently low socioeconomic correlations, enjoys high access to all vaccines; however, it suffers from non-infrastructural barriers to vaccination that centre on religious or philosophical beliefs and perceived risks rather than access, in contrast, we suspect, with other regions in which access is the main barrier. 3,42 Notably, the Americas has shown low (and decreasing) socioeconomic correlations with DTP1, MCV, and BCG, but higher correlations (although currently decreasing) associated with DTP3. This pattern suggests a transitional stage wherein vaccines in the region are broadly accessible, but some factors (including attendance of medical staff at births) are restricting uptake of DTP3. This notion seems to be supported in the literature, which cites a range of socioeconomic determinants (including refusal from groups with higher socioeconomic status, 43,44 personal beliefs, and low access to health-care facilities) as barriers to vaccination. 3 We note the strength and consistency of births attended by skilled health staff as an informer of DTP3 coverage rates, and highlight the potential of this factor for use as a proxy indicator of the condition of a health-care system.
We speculate that regions where correlations are strengthening can be interpreted as more pressing targets for intervention. Examples include the ratio of girls to boys in primary education, public health spending, and sanitation in South-East Asia and the Eastern Mediterranean, and rural access to clean water (which is widely believed to be able to prevent polio in some parts of Nigeria 27 ) and the presence of skilled health staff in Africa, where a potential surrogate, children born in a public or private health-care facility, is consistently linked to completion of routine vaccinations. 11

Although an aim of the present study was to provide a global overview of correlates of immunisation coverage, we note several limitations. Regional variability is likely to exist within countries and also between other sets of countries grouped by regions other than those defined by the WHO. A large-scale global analysis investigating regional trends and identifying factors associated both within and between countries could therefore uncover more nuanced trends. Higher-frequency data from social media sources might be able to predict trends in vaccination uptake behaviour more accurately than can lower-frequency national coverage data. The vaccine performance index methodology can be naturally integrated with models incorporating high-frequency time-series predictive variables to more accurately forecast immunisation levels as high-frequency data for vaccine sentiments become increasingly available. Such fine-grained analyses could facilitate the use of time lags in exploration of predictive models of vaccine coverage using combinations of high-frequency social media data and low-frequency socioeconomic factors.
Review of historical trends and variability in vaccination rates can aid policy makers' assessments of the strength and resilience of immunisation programmes. The 2014 GVAP report notes that a third of the world's 194 countries were failing to reach 90% coverage in 2013, 45 but these static summaries conceal both the vaccination track record and the possible future immunisation levels. A WHO report 46 notes that 65 countries had failed to achieve either "90% coverage nationally or 80% coverage in every district or equivalent administrative unit against diphtheria, tetanus and whooping cough by 2015", which is in line with predictions from our vaccine performance index.
We thus present our vaccine performance index as a tool for future research, which could enhance GVAP assessment reports for future targets by providing predictive measures that account for coverage trends and variability.

Contributors
All authors conceived and designed the study, contributed to data interpretation, and wrote the manuscript. AdF and HJL did the literature search. AdF collected the data. AdF and DMDS created the figures. AdF and IGJ did the data analysis.

Declaration of interests
HJL has received personal fees from GSK; honoraria for attending advisory board meetings from Merck Vaccines; and grants from the UK National Institute for Health Research, the EU Innovative Medicines Initiative, Novartis, the Centre for Strategic and International Studies, and WHO, outside the submitted work. All other authors declare no competing interests.