Can We Use Routine Data for Strategic Decision Making? A Time Trend Comparison Between Survey and Routine Data in Mali

Routine data, which are available more regularly than "gold standard" survey data, can be used to inform programmatic decisions in Mali at the national level. However, caution is needed when using these data at a subnational level.

Data on the coverage of health interventions are necessary to evaluate project or program impact and thus guide future efforts. Household survey data, including from the U.S. Agency for International Development-funded Demographic and Health Surveys (DHS) or the United Nations Children's Fund-funded Multiple Indicator Cluster Surveys, are generally considered to use "gold standard" methods and, therefore, to produce high-quality population-based data. 1,2 These data are publicly available and free to access. However, because the surveys are expensive to conduct, they are conducted relatively infrequently. Country decision makers must make frequent programmatic decisions and adjustments, and survey data, typically available every 3 to 5 years, cannot be used for this purpose. In addition, while survey data are typically available at the subnational level (e.g., regional level), they are usually not available for smaller geographic areas such as districts. 1

To overcome these limitations, country-level planners often turn to routine data sources to fill these data gaps. Routine data collected by the health system are typically available on a monthly or quarterly basis. If properly collected and managed, they allow stakeholders to observe changes in coverage from year to year that can be important for time-sensitive decision making. At the district health facility level, they serve as an appropriate source for operational decision making. Because they are collected and managed by national staff from the health system, their collection is more sustainable. Finally, routine data are relatively inexpensive to collect. 2

Health information systems (HIS), which rely on routine data, can have significant limitations. Although many countries publish annual reports based on HIS data, 3 it can be difficult to access the data underlying the reports. 4 Some indicators of interest are not collected or are not aggregated in the HIS; when indicators are available, the data are sometimes of poor quality. 1,5,6 In many countries, the HIS is limited to data from public health facilities; data from private facilities are not included. For intervention coverage measures, the denominators (the estimates of the population in need) are often based on census projections. The accuracy of these projections can be affected by factors like the time since the last census and internal population movements.
Despite these well-documented limitations, routine data are often the de facto data source used for programmatic planning in low- and middle-income countries, particularly where no recent household survey exists. Within the context of the Global Affairs Canada-funded National Evaluation Platform, dedicated to improving evidence-based decision making, 7-10 a team of Malian researchers found that at least 5 major maternal, newborn, and child health and nutrition (MNCHN) programs rely on HIS data as their source of coverage indicator data in Mali. 11 A 2013 evaluation of Mali's HIS concluded that it had poor data quality in general, due in large part to poor data archiving and uneven record keeping. Regional HIS data were also found to be of generally higher quality than district-level data. 12

Reducing maternal, newborn, and under-5 mortality is a priority for the Government of Mali, and data are needed to inform this work. Mali's decennial plan for health and social development (Plan Décennal de Développement Sanitaire et Social 2014-2023) recognizes the need to increase routine data quality, timeliness, and use for decision making at all levels. 13 Given that routine data are widely used for planning and evaluation and to fill the gaps between household surveys in Mali, it is of interest to decision makers to understand the comparability of these data. While some differences in the levels of coverage indicators between the 2 data sources are expected (because routine data are limited to the public health sector and because of some differences in indicator definitions), it would be useful to know whether the HIS captures the same time trends as population-based surveys. To answer this question, we compared time trends in routine and household survey data from 2001 to 2012 in Mali for 3 indicators to inform the use of routine data by decision makers in Mali.

METHODS
This analysis focused on 3 indicators: modern and traditional contraceptive prevalence rate (CPR), coverage of 3 doses of diphtheria-pertussis-tetanus vaccine (DPT3), and institutional delivery. We focused on these because they were the most complete indicators across regions and years in the HIS and represented a range of services across the continuum of care.

Data Sources and Quality Assessment
We used DHS data collected in 2001, 14 2006, and 2012. For routine data, coverage estimates were obtained from HIS-validated annual reports. Numerators and denominators were double extracted in a standardized format for each indicator, year, region, and at the national level from 2001 to 2012. For each indicator, the numerator as reported in the HIS (e.g., number of institutional deliveries) was independently extracted from the electronic database by 2 different individuals and then compared. Cases of discordance were discussed and verified by returning to the HIS database (Développement Sanitaire du Mali) until consensus was reached among the data extractors. Access to data was facilitated by the fact that the authors carrying out this work were part of Mali's National Evaluation Platform, as Keita et al. describe. 7 This group of researchers includes members at the Cellule de Planification et de la Statistique, where HIS data are stored. Table 1 compares HIS and DHS definitions of the 3 indicators, and Table 2 shows changes in indicator definitions in the routine HIS data over time.

Data Analysis
We calculated cluster-stratified, survey-weighted coverage estimates and their standard errors using DHS data for each indicator and survey at national and regional levels. To calculate estimates for routine data, we divided the numerator as reported in the HIS by the estimated population denominator (using population projections from the 2009 national census) 17 at national and regional levels, and we calculated standard errors for both the survey and routine coverage estimates.
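To make the routine coverage calculation concrete, here is a minimal sketch in Python. The function name and the numbers are ours, not from the article, and the binomial approximation for the standard error is an assumption, since the article does not specify how routine variances were computed:

```python
import math

def routine_coverage(numerator, projected_population):
    """Routine coverage estimate: HIS-reported service counts divided by
    the census-projected population in need."""
    p = numerator / projected_population
    # Binomial approximation for the standard error -- an assumption,
    # not the article's stated method.
    se = math.sqrt(p * (1 - p) / projected_population)
    return p, se

# Hypothetical region: 38,000 institutional deliveries recorded in the
# HIS against a projected 60,000 expected births
coverage, se = routine_coverage(38_000, 60_000)
```

Any error in the projected denominator (e.g., unrecorded internal migration since the census) propagates directly into the coverage estimate, which is why denominator accuracy matters so much for routine data.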
Because we had 3 survey estimates available, we defined 2 time intervals for comparison: 2001-2006 and 2006-2012. We visualized survey and routine coverage estimates with their 95% confidence intervals and compared the direction of the time trends in each interval.
To assess whether time trends from routine and survey data differed significantly from 2001 to 2006 or from 2006 to 2012, we standardized the difference of differences: for each interval, we subtracted the change in the survey estimate from the change in the routine estimate and divided this quantity by the square root of the sum of the survey and routine variances. Assuming a Gaussian distribution with mean 0 and standard deviation 1, we calculated the probability of a difference as extreme as or more extreme than the one we observed between survey and routine data. We report P values for each comparison, noting that there is a 5% chance that a random observation from a Gaussian distribution will yield a significant P value by chance alone. We did not adjust P values for multiple comparisons.
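The test described above can be sketched as follows; the function name and the example figures are hypothetical, and the two-sided tail probability is computed with the complementary error function:

```python
import math

def trend_difference_pvalue(routine_change, survey_change,
                            routine_var, survey_var):
    """Standardized difference of differences between routine and survey
    coverage trends, with a two-sided P value under N(0, 1)."""
    # Subtract the survey change from the routine change over the interval
    diff = routine_change - survey_change
    # Standardize by the square root of the summed variances
    z = diff / math.sqrt(routine_var + survey_var)
    # Two-sided tail probability for a standard normal variate
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical interval (2001-2006): routine coverage rose 12 percentage
# points and survey coverage 5 points, with made-up variances
z, p = trend_difference_pvalue(0.12, 0.05, 0.0010, 0.0008)
```

Because the same calculation is repeated for every indicator, region, and interval, some comparisons will cross the 5% significance threshold by chance alone, which is why the unadjusted P values noted above should be read with care.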

Data Availability
We were able to obtain coverage estimates from the routine reports and databases for the national level and all regions for the 3 indicators we examined from 2001 to 2012. National-level time trends were broadly consistent between routine and survey data (Tables 3 and 4). In addition, there were large differences between the point estimates from routine and survey data; routine data underestimated coverage of CPR and institutional delivery and overestimated DPT3 coverage relative to survey data.

Regional-Level Time Trends
Time trends for all 3 indicators varied widely between regions, particularly for DPT3 and institutional delivery using routine data, and there was far less consistency between survey and routine time trends relative to national estimates (Figures 1 and 2; Tables 5 and 6). CPR time trends at the regional level were not significantly different between routine and survey data; the only exceptions were Gao (2001-2006) and Ségou (2006-2012) (Tables 5 and 6). In addition, the direction of the CPR trends was consistent between routine and survey data except for Kidal and Tombouctou (2001-2006), although in both cases the difference between the routine and survey estimates was very small.

DISCUSSION
We aimed to compare routine and survey data trends over approximately 10 years at national and regional levels in Mali. We found that time trends for CPR, DPT3, and institutional delivery indicators in Mali were broadly similar between routine and survey data at the national level but were much more inconsistent at the regional level.

This comparison is relevant to country and global stakeholders for several reasons. First, although household surveys are the preferred source for population-based measures of coverage, they are available only intermittently, every 3-5 years or even more infrequently, and therefore are of limited utility to support regular decision making. 18 Second, routine data are available at low levels of disaggregation and would be a more granular alternative to survey data. Third, in Mali, researchers and planners already rely heavily on routine data. 3 Given that the HIS is managed by Ministry of Health staff and that the sense of country ownership over these data is high, routine data are more sustainable than externally coordinated and funded household surveys. However, it is important to understand to what extent these data may capture population changes in intervention coverage.

Previous studies have reported poor quality of routine data, 1,5,6 but there have been limited assessments of external validity. Other analyses have found differences between routine and survey data with respect to point estimates, 18 and some have found few identifiable patterns. 19 We note that there were differences in indicator definitions between survey and routine data (Table 1). This is frequently the case with survey and routine data because the data sources capture different kinds of data, and it may in part explain the differences we observed. However, routine data are often used as a proxy for survey data, so comparing the 2 data sources remains relevant.
We focused primarily on the comparability of time trends rather than specific indicator levels, as both the direction and magnitude of time trends are often used by stakeholders to make decisions about which interventions or geographic areas to prioritize. We found that national-level time trends were more comparable than regional-level trends, which may be due to the denominators used for the routine coverage estimates. Denominators for coverage indicators in routine data are typically based on projections from the most recent census. Internal migration-which would affect regional and district denominators but not national denominators-is often not captured in census projections. Depending on how recent these data are, the accuracy of the denominator may be affected. 20 The most recent census available in Mali at the time of the analysis was held in 2009. 17 An alternate approach for groups looking to replicate this analysis could be to use DHS-derived denominators for this analysis which would capture the distribution of women of reproductive age, births, and children by region.
The comparability of survey and routine data was generally better for CPR than for DPT3 and institutional delivery. This may be related to the fact that CPR changed very little from 2001 to 2012. In addition, the denominator for CPR, all women aged 15 to 49 years, is broader than the denominators for the other 2 indicators (pregnant women and children aged 12-23 months) and may be less subject to error in census projections. 21 We found that at the national level, routine data overestimated vaccine coverage but underestimated CPR and institutional delivery coverage relative to survey data.

Data extraction and cleaning for this analysis was a time- and labor-intensive process that required meticulous processing. Changes in district boundaries and indicator definitions further complicated the process. Carrying out this rigorous, detail-oriented endeavor regularly is not realistically feasible. However, we note that at the time of analysis, Mali did not have the District Health Information System, version 2 (DHIS2; https://www.dhis2.org/) in place. 23 It is likely that the time and effort burden required for this process would have been considerably lighter if such a platform had already been established. 19 With additional investments in building both more robust reporting systems 18,20,22,24 and strong data use capacity, including regular data quality assessments, 8,25,26 routine data quality is likely to improve, and this level of 1-time, in-depth data cleaning may not be necessary.
With the introduction of DHIS2 in Mali, more standardized indicator definitions will be used. The capacity of this platform to produce data visualizations at more granular levels (i.e., health facility or district level) and to increase detection of data quality issues at that level can lead to improved quality of aggregated data at the regional or national level. Building an information culture whereby managers are incentivized to use the data collected to make concrete changes in their health facility or district is a way to ensure that HIS data quality continues to improve. 27 In a case study from Ethiopia, integrated supportive supervision, in which managers aim to work with staff to review data and find solutions rather than adopting a punitive approach, has led to more accurate data being recorded and to data being used for decision making. 28 While our findings currently do not support using routine data for impact evaluations, initiatives such as these could eventually result in data of sufficient quality to be appropriate for this purpose.

Limitations
Because DHSs were conducted only every 5-6 years, we were not able to look at more granular time trends. Additionally, the 2012 DHS excluded 3 regions and several districts due to the security situation in these areas at the time of data collection. Because of this, we were unable to assess if our findings held true from 2006 to 2012 in excluded regions.
Furthermore, since household survey data are generally assumed to be of higher quality than routine data, we assumed that the DHS data represented "truth." We recognize, however, that household survey data have their own set of data quality issues, 29,30 and some household survey estimates may have substantial nonsampling error. In addition, regional estimates in the Mali DHS had wide confidence bounds due to relatively small sample sizes, which may have limited our ability to detect significant differences between survey and routine data at the regional level.
We focused only on the external consistency of routine data and did not look at other data quality metrics, namely completeness, timeliness, internal consistency, and representativeness. Assessing these metrics may have led to a more complete picture of where data quality gaps exist and how they could be addressed. Taken together, these limitations may limit the validity of our findings if, for example, the DHS results did indeed have significant data quality issues or if other dimensions of data quality not explored in this article were of low quality.

CONCLUSION
Given the frequent use of routine data in maternal, newborn, and child health programs in Mali, we aimed to assess the difference between indicator time trends from routine and household survey data to guide decision makers in Mali.
Improving the quality and accessibility of routine data is a high priority in many low- and middle-income countries, and as part of this effort, it is important to assess the quality and usability of routine data in their current state. Trends in routine data appeared comparable to trends in household survey data at the national level and therefore may be appropriate for use at that level, but time trends in routine data should be interpreted with caution at the subnational level. Given these findings, routine coverage data in Mali may not be suitable for impact evaluations, as evaluators need precise, accurate estimates of change to understand the extent to which a program is working. However, these data might be useful for planning and prioritization if stakeholders keep in mind the potential error associated with subnational estimates. Given the potential for routine data to be a sustainable and timely source of appropriately disaggregated data, the push for improving the quality of routine data through exercises such as these should continue to be prioritized.