Metric partnerships: global burden of disease estimates within the World Bank, the World Health Organisation and the Institute for Health Metrics and Evaluation

The global burden of disease study—which has been affiliated with the World Bank and the World Health Organisation (WHO) and is now housed in the Institute for Health Metrics and Evaluation (IHME)—has become a very important tool to global health governance since it was first published in the 1993 World Development Report. In this article, based on literature review of primary and secondary sources as well as field notes from public events, we present first a summary of the origins and evolution of the GBD over the past 25 years. We then analyse two illustrative examples of estimates and the ways in which they gloss over the assumptions and knowledge gaps in their production, highlighting the importance of historical context by country and by disease in the quality of health data. Finally, we delve into the question of the end users of these estimates and the tensions that lie at the heart of producing estimates of local, national, and global burdens of disease. These tensions bring to light the different institutional ethics and motivations of IHME, WHO, and the World Bank, and they draw our attention to the importance of estimate methodologies in representing problems and their solutions in global health. With the rise in the investment in and the power of global health estimates, the question of representing global health problems becomes ever more entangled in decisions made about how to adjust reported numbers and to evolving statistical science. Ultimately, more work needs to be done to create evidence that is relevant and meaningful on country and district levels, which means shifting resources and support for quantitative—and qualitative—data production, analysis, and synthesis to countries that are the targeted beneficiaries of such global health estimates.


Amendments from Version 1
We believe that the careful feedback that all four of our reviewers provide in their comments, drawn from their own experiences working in this field, highlights how complicated the politics of global health metrics are and how important the implications of these complications are. Along with minor changes for correction and clarification, we have made the following changes to our manuscript in response to comments we have received: We agree that a much more comprehensive overview of the literature and interviews with key contributors to the development of the GBD are necessary to analyse further its limitations and implications. We are more careful to caveat our discussion and make clear that this is an analysis of the history of the GBD based on a limited data set. Like other articles in the global health governance field (e.g., Kickbusch & Reddy 2015; Doyle & Patel 2008), we are providing an analysis based on published materials and public debate. We hope to follow up this preliminary analysis with a more comprehensive analysis by interviewing key members of the field.

Introduction
At the 71st World Health Assembly in Geneva in May 2018, a political alliance was struck between Chris Murray, Director of the Institute for Health Metrics and Evaluation (IHME), and Tedros Adhanom Ghebreyesus, Director-General of the World Health Organisation (WHO). IHME and WHO signed a Memorandum of Understanding (MOU), attempting to resolve what Lancet editor-in-chief Richard Horton called "a Cold War" of methodological, institutional, and philosophical tensions that had occasionally erupted between the two organisations since IHME was established in 2007 1 . Since IHME started putting out the results of its Global Burden of Disease (GBD) study in 2012 2,3 , there have been controversial differences between IHME's estimates of disease burden and those of WHO 4,5 , particularly around malaria 6 , tuberculosis 7 , causes of child death 8,9 , and maternal mortality 10 .
The explicit agreement by IHME and WHO to produce a single GBD study-"a series of capstone papers summarizing high-level findings would be published in The Lancet 11 " before being used in official WHO documents-raises important questions about what an alliance of global burden of disease estimates means for global health governance and health policy in the global South. In official language, these estimates' producers have always meant them to go hand in hand with the development of better data collection and vital registration systems in countries with limited data collection infrastructure 12,13 . In the 2018 MOU itself, in fact, both WHO and IHME assert that "estimates are no replacement for data from strong surveillance systems 11 ". In reality, producing burden of disease estimates has always required WHO to reconcile the differences between its own disease estimates and country reported numbers. With this new alliance, the organisation will also have to reconcile estimates and methodologies from IHME with its own estimates and country reported numbers, as well as the two very different philosophical perspectives of the two institutions. These acts of reconciliation are mirrored by the attempted alliance between the World Bank and IHME to produce the Human Capital Index and the methodological tensions revealed between the two organisations with regards to the use of global disease burden estimates 14 .
The practice of estimation in health development is by no means new. Early examples of the use of mathematical models based on assumptions include Daniel Bernoulli's 1766 predictions of smallpox morbidity and mortality rates, should the English government not take on inoculation practices 15,16 . However, as has been remarked by many scholars 17 , the amount of money and labour that went into estimation-production radically increased in the era of the Millennium Development Goals and is expected to grow in the era of measuring the progress of the much more complicated Sustainable Development Goals. This includes IHME's production of the SDG-index, which was created particularly with the aim to assess the SDGs' measurability 18 .
In this article, we present first a summary of the origins and evolution of the GBD, which has been the object of much scrutiny 19 , over the past 25 years. Then, we analyse two illustrative examples of GBD estimates and the ways in which they gloss over the assumptions and knowledge gaps in their production while attempting to quantify context and data quality, highlighting the importance of the historical context by country and by disease in the quality of health data. Finally, we delve into the question of the end users of these estimates. Who uses these estimates and who does not, and what tensions lie at the heart of producing estimates of local, national, and global burdens of disease? These tensions bring to light the different institutional ethics and motivations of IHME, WHO, and the World Bank, and draw our attention to how the production of data and estimates is key to representing problems and their solutions in global health. With the rise in the investment in and the power of global health estimates, the question of representing global health problems becomes even more entangled in decisions made about how to adjust reported numbers and statistical equations. The tensions at the heart of the GBD study are of utmost importance because those who represent global health problems are also those who determine how money and influence flow to address them.

Methods
This analysis paper relies on two data sources. First, we used published articles to construct a timeline of the history and methodological conflicts of the global burden of disease study.
Second, we analysed popular media representations of partnerships between WHO, IHME, and the World Bank, as well as field notes taken by M.T. at three public events at IHME and the World Bank. These three events were IHME's 20th Global Burden of Disease Anniversary event in September 2017, IHME's Annual Board Meeting in June 2018, and Sir George Alleyne's lecture on human capital at the World Bank headquarters in July 2018. The paper does not attempt to give a comprehensive history of the GBD study, which over the course of its twenty-five-year history has included a very large cast of contributors (including those who have generously provided feedback on the first version of this article) and which has been the topic of extensive debate. Instead, it provides an introductory history of the development of the study and key moments in this debate in order to discuss the implications of two recent attempted and achieved partnerships between IHME, the World Bank and WHO (the Human Capital Project with the World Bank, and the 2018 MOU with WHO to create a single set of global health estimates).

Key historical moments and tensions at the heart of the Global Burden of Disease study
The GBD and the disability-adjusted life year (DALY) have been subject to detailed scrutiny since their inception in the early 1990s. The DALY and the GBD first came onto the world scene out of a partnership between Chris Murray, Alan Lopez, and Dean Jamison a , with the explicit institutional support of Harvard University, WHO, and the World Bank 20 . The Bank requested "a comparative, comprehensive, and detailed study of health loss worldwide to provide the basis for objective assessments about the probable benefits of applying packages of interventions 21 ," as part of the Bank's increasing influence and financial investment in international health development in the 1980s and early 1990s 22 . The team's work culminated in the publication of the 1993 World Development Report (WDR), Investing in Health 23 , which synthesized decades of work in health economics into the new health metric, the DALY, to provide punchy, useable language for justifying public and private, national and international investment in health. Murray and Lopez themselves define the study as a "systematic scientific effort to quantify the comparative magnitude of health loss from diseases, injuries, and risks by age, sex, and population over time 21 ".
The three explicitly stated aims of the GBD were "(i) to decouple epidemiological assessment of the magnitude of health problems from advocacy by interest groups of particular health policies or interventions; (ii) to include in international health policy debates information on non-fatal health outcomes along with information on mortality; and (iii) to undertake the quantification of health problems in units that can also be used in economic appraisal 24 ". The study's architects were motivated by the aim of providing estimates of morbidity rates, as well as mortality rates, for diseases and conditions from a neutral, objective perspective. These estimates were built to be plugged into economic appraisal calculations to give international health organisations and governments guidance for prioritising health interventions based on economic reasoning. Until that point, assessments of global disease morbidity and mortality rates had been the responsibility of different programmes at WHO that focused only on specific diseases and were concerned largely with mortality rates, with estimates that leant heavily toward overestimation 25 . The DALY, as a metric which quantified morbidity over time, was introduced as a means of drawing global attention to diseases and injuries that burdened populations but did not always result in death.
For all future GBD studies, 1990 served as the benchmark year, and over the course of the GBD's life, mortality and morbidity estimates for that year were consistently reworked based on new data inputs and new estimation procedures. The next big moment for the GBD was 1997, when Richard Horton at The Lancet published the first peer-reviewed series of articles based on the research that was at the heart of the 1993 WDR, which Murray and Lopez call the "first complete revision of the GBD 1990 study 21 ." The publication of these four articles 12,26-28 in The Lancet was so foundational to the genesis of the GBD that it served as the starting point for IHME's 20th anniversary celebration of the event in Seattle in 2017. Still using the year 1990, these articles addressed a few of the major concerns that scholars had identified in the original study published in 1993, including the criticism of the way the study weighted DALYs by age 29 . The DALY quantifies healthy time lost to illness or death against a standardised average healthy life expectation, by combining years of life lost to death (YLL) and years of life lost to disability (YLD). The healthy life expectancy (HALE) is estimated for every population represented in the study-determined by location, sex, and year-and it is the reference against which YLLs and YLDs are subtracted to determine DALYs for each population.
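The combination of YLLs and YLDs described above can be made concrete with a minimal sketch. It assumes the simplest textbook form of the calculation (deaths multiplied by a standard remaining life expectancy, and cases multiplied by a disability weight and an average duration) and leaves out the age weighting and discounting debated in the early studies. The function names and all input numbers are hypothetical, chosen only for illustration; they are not GBD figures, nor a reproduction of the study's actual procedure.

```python
def years_of_life_lost(deaths: float, remaining_life_expectancy: float) -> float:
    """YLL: deaths times the standard years of life remaining at the age of death."""
    return deaths * remaining_life_expectancy


def years_lost_to_disability(cases: float, disability_weight: float,
                             average_duration_years: float) -> float:
    """YLD: cases times a 0-1 disability weight times average duration in years."""
    if not 0.0 <= disability_weight <= 1.0:
        raise ValueError("disability weight must lie between 0 and 1")
    return cases * disability_weight * average_duration_years


def dalys(yll: float, yld: float) -> float:
    """DALYs as the simple sum of the fatal and non-fatal components."""
    return yll + yld


# Hypothetical inputs for one condition in one population (not GBD data).
example_yll = years_of_life_lost(deaths=100, remaining_life_expectancy=30)
example_yld = years_lost_to_disability(cases=1000, disability_weight=0.25,
                                       average_duration_years=2)
example_dalys = dalys(example_yll, example_yld)
print(example_yll, example_yld, example_dalys)
```

Because the disability weight multiplies every non-fatal case, even a small change in how a health state is valued shifts the total directly, which helps explain why the weights attract so much debate.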
In each cycle of the GBD since, its architects have carefully reassessed the nature of the disability weights used to condition DALYs, among other changes over the decades ( Table 1). As authors at WHO assert is the case in their description of the evolution of the GBD, determining disability weights has proven to be one of the most contentious issues of producing global health estimates 30 . Producing disability weights requires an agreement on how to "define, measure and numerically value time lived in non-fatal health states". At the heart of determining this numerical value are debates about "societal" vs. "individual" judgements on the severity of certain losses of health over others, as well as whether to privilege health professionals' perspective over the general population's 31 . In an attempt to address these conflicting perspectives on valuing health, the GBD and GHE since 2010 have based their disability weights on a "comprehensive re-estimation of disability weights" based on surveys from the "general population 32 ".
By 2000, Chris Murray had joined Alan Lopez at WHO, and they published their national health systems ranking using the GBD and the DALY, with the explicit institutional support of WHO. They had the particular support of Gro Harlem Brundtland and

a These three individuals are listed as three key figures among many others whose labour was essential to the process of producing and disseminating the global burden of disease study.

Table 1. Key GBD studies from 1990 to the present. These are the official studies indicated as such by Chris Murray and Alan Lopez in their description of the evolution of the GBD 33 . Key articles and methodological changes are described here, but they are by no means exhaustive.
[Table 1 body not recoverable from the source; surviving column headers: GBD Study, Year.]

Given their large global health portfolio, BMGF also had a clear interest in ensuring timely, independent, and robust production of burden of disease estimates, and IHME became their guide to do this. IHME lists impartiality as one of its five core principles: "For health evidence to be useful, it also must be credible, generated by a scientific process unimpeded by political, financial, or other types of interference. IHME was created to fill a gap in global health: to separate the measurement and evaluation of health policies and programs from the process of creating, implementing, and advocating for policies and programs 42 ".
In 2012, the group published their first GBD study in The Lancet, covering the years 1990, 2005, and 2010, calling it the GBD 2010 43 . As we will touch on more extensively in the next section, this study caused a stir in the global health community, as some crucial estimates, like those of malaria morbidity and maternal mortality, diverged heavily from those put out by WHO and its UN agency partners for the same year 44 . In 2015, IHME published the revamped GBD 2013 in The Lancet, including estimates for all of the years between 1990-2013, and since 2016, they have published complete annual reassessments of these years' estimates, adding new assessments in each cycle. IHME's GBD 2017 was published in The Lancet in November 2018.
The DALY, the metric at the heart of the GBD, was introduced explicitly as a means to draw global attention to diseases that do not only kill but also disable. By quantifying the loss of healthy years, the DALY was meant to unveil suffering, like lower back pain or depression, that weighs heavy on the world. Some scholars have argued that the metric, so ready to be used for cost-benefit analysis, and its proliferation more fundamentally have contributed to an intensification of economic rationality in and the neoliberalisation of global health thinking 45,46 . Medical anthropologist Vincanne Adams argues that "the DALY provides an economic measure of human productive value by calculating loss of productivity due to disease or disability 47 ". As a tool that was created for the World Bank's new increased investment in global health development, the DALY has from the start tied economic values to human suffering and attempts to alleviate it 48 . Explicitly meant to be useable for ministers of finance and drawing a particular American version of economics into the sphere of international health development, the DALY has led to the reign of health economics in global health governance, a recent shift which Sridhar calls the economic gaze 49 .
The GBD in its current form is also the result of a particular institutional view on the world. Expanding on Peter Byass' conceptual framework of global health estimates 50 , we argue that there are actually four different institutional positions that frame global health metrics production. We would call these four institutional positions "traditional academic population health department", "think tank", "health-policy oriented multilateral", and "economic-policy oriented multilateral", the last three of which are exemplified by IHME, WHO, and the World Bank. We argue that IHME is not a traditional academic population health department governed by academic publishing logics, which Byass contrasts against the governing logics of UN agencies' global health estimate production. Although there are aspects of academic scientific rigor that define IHME's estimate production-including oversight by an Independent Advisory Committee-they are also governed by the demand for deliverables by BMGF and have privileged access to private industry's data streams because of their ambiguous institutional nature. As multilateral organisations, WHO and the World Bank have contractual relationships with officials in the countries with which they work, and thus are responsible for producing statistics that are legible to country officials. With different mandates for the ultimate goal of such production of statistics-one which places wellbeing above all else and the other economic prosperity above all else-WHO and the World Bank nonetheless are both required to harmonise the interests of a wide range of national, international, and civil society actors, including in the process of producing contextualised evidence for disease burden, which IHME asserts is a form of 'politicizing' evidence b . 
The GBD in its current form has also been born out of collaborations with both WHO and the World Bank, meaning that the current collaboration with WHO is only the latest of a larger history of interactions and compromises between these three frameworks of measuring health's progress and value in the world.
b We thank our reviewer Colin Matthews for emphasising this point.

On the production of the global burden of disease
One of the key challenges in producing a complete picture of the world's health has been the lack of adequate civil registration and vital statistics systems or health information systems in several countries and the need to use a limited dataset to estimate the global burden of disease 43 . This has resulted in a proliferation of modelling as a tool for estimating the larger picture, according to the data available. Inherent in producing estimates of population health is the question of how comfortable estimate producers are with extrapolating robust-seeming estimates from little or no data. "Imputing" data, in statistics parlance and the global health context, means bridging over conceived gaps in available data in one country with estimates based on data that does exist in comparable countries, often defined as comparable in terms of levels of GDP and regional proximity. This allows IHME, for example, to have estimates of malaria morbidity in the Central African Republic, where disease surveillance work has been incomplete since civil wars broke out in 2012 51 . Estimates' level of uncertainty is directly related to the presence of health data infrastructure, meaning that estimates are least robust for countries with weaker health systems, which are often those countries perceived as needing disease burden estimates the most. Since one of IHME's fundamental principles is that "[too] often, no estimate of a problem is interpreted as an estimate of no problem 21 ", the organisation is known to be much more comfortable with imputation than the global health organisations that often use its data such as the World Bank, WHO, and UNICEF.
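The idea of imputation described above can be illustrated with a deliberately crude sketch: a missing national rate is filled with the mean of reported rates from countries in the same region whose GDP per capita falls within a chosen tolerance. This is an assumption-laden toy, not the method of IHME, WHO, or any other producer, whose models are far more elaborate. The function name, the `gdp_tolerance` parameter, the country labels, and every number are hypothetical.

```python
from statistics import mean


def impute_from_comparators(records, gdp_tolerance=0.5):
    """Fill missing rates with the mean rate of same-region countries of similar GDP.

    records: list of dicts with keys 'country', 'region', 'gdp_per_capita', 'rate'
    (rate is None when no data were reported). Returns new dicts carrying an
    'imputed' flag so filled values stay distinguishable from reported ones.
    """
    filled = []
    for rec in records:
        if rec["rate"] is not None:
            filled.append({**rec, "imputed": False})
            continue
        allowed_gap = gdp_tolerance * rec["gdp_per_capita"]
        comparators = [
            other["rate"]
            for other in records
            if other["rate"] is not None
            and other["region"] == rec["region"]
            and abs(other["gdp_per_capita"] - rec["gdp_per_capita"]) <= allowed_gap
        ]
        # With no comparable country the gap is left open rather than guessed.
        value = mean(comparators) if comparators else None
        filled.append({**rec, "rate": value, "imputed": value is not None})
    return filled


# Hypothetical countries and rates, for illustration only.
example_records = [
    {"country": "A", "region": "X", "gdp_per_capita": 1000, "rate": 10.0},
    {"country": "B", "region": "X", "gdp_per_capita": 1200, "rate": 20.0},
    {"country": "C", "region": "X", "gdp_per_capita": 1100, "rate": None},
    {"country": "D", "region": "Y", "gdp_per_capita": 1100, "rate": 50.0},
]
imputed_records = impute_from_comparators(example_records)
for rec in imputed_records:
    print(rec["country"], rec["rate"], rec["imputed"])
```

Keeping the `imputed` flag alongside each value is a design choice that speaks to the concern raised in this section: once a filled-in number circulates without such a marker, users can no longer tell it apart from a reported one.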
However, imputing is not exclusively the practice of IHME. Various WHO, World Bank, and multilateral partnership programmes also practice imputation and other forms of data correction. While it is widely used, disagreements lie in the degree to which various organisations are comfortable with using imputation. Imputation and data correction can lead to tension between various producers of global health estimates, as well as between estimate producers and country officials. When the estimates of global disease and injury burden are used by other organisations or in health policy, they do not make explicit the complex methodologies nor the underlying primary data from country level involved in their creation. Discrepancies between IHME and WHO (and other UN agencies) sets of estimates have the effect of both reminding users of the complexities behind them, while also creating confusion for health policy makers 17 . Even without having to address the discrepancies between its estimates and IHME's, WHO has had to reconcile its own official estimates and country-reported morbidity and mortality rates. In its production of global and national health estimates, WHO uses multiple sources of data, although not as many as IHME, including Demographic Household Surveys (DHS) and the World Health Survey, to round out data provided through administratively reported data in public health clinics, in order to address potential bias within such production systems. As a result, WHO estimates can often be quite different than those that health ministers and finance ministers gather from their own statistics offices, and when global health estimates diverge dramatically from nationally gathered numbers, health ministers can unsurprisingly mistrust them 17 .
Furthermore, on the scale of global health estimates themselves, one of the most controversial discrepancies between IHME's and UN agencies' estimates, as well as with nationally reported numbers, are those of malaria mortality rates 6 . In 2010, IHME reported 1,238,000 deaths due to malaria, and that 524,000 of those were amongst individuals five years or older 2 . For the same year, WHO reported 655,000 deaths in total with approximately 91,700 of those amongst individuals five years or older 52 . This is due partly to the problem of a lack of information and diagnostic capacity in many places where malaria is endemic, but it is also due to discrepancies in defining the presence of malaria and the causality of a death. Since decreasing malaria morbidity and mortality was an explicit part of the Millennium Development Goals and the global health fight against malaria is heavily funded, this discrepancy caused a tumult in the global health world, which had three years to go to meet the Goals 53 . Both IHME and WHO have used estimations of parasite density produced by the Malaria Atlas Project in the process of determining malaria mortality 54 , but IHME's estimates also take into account a different interpretation of verbal autopsy data for morbidity in adults, based on the category of "fever of unknown cause 55 ", potentially influenced by the Gates Foundation's political investment in the eradication of the disease.
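The size and location of the malaria discrepancy can be checked with simple arithmetic on the figures quoted above. The sketch below uses only those four numbers; the variable and function names are ours, and the calculation is an illustration rather than part of either organisation's methodology.

```python
# 2010 malaria death estimates as quoted in the text above.
ihme = {"total": 1_238_000, "age_5_plus": 524_000}
who = {"total": 655_000, "age_5_plus": 91_700}


def share_age_5_plus(estimate):
    """Fraction of estimated deaths occurring at age five or older."""
    return estimate["age_5_plus"] / estimate["total"]


total_ratio = ihme["total"] / who["total"]
total_gap = ihme["total"] - who["total"]
gap_age_5_plus = ihme["age_5_plus"] - who["age_5_plus"]
# Portion of the overall gap that comes from the five-and-older age group.
gap_share_age_5_plus = gap_age_5_plus / total_gap

print(f"IHME total is {total_ratio:.2f} times the WHO total")
print(f"Share of deaths at age 5+: IHME {share_age_5_plus(ihme):.1%}, WHO {share_age_5_plus(who):.1%}")
print(f"Share of the total gap arising at age 5+: {gap_share_age_5_plus:.1%}")
```

On these figures, IHME's total is nearly double WHO's, and roughly three quarters of the gap sits in the five-and-older age group, which is consistent with the role the text assigns to the differing treatment of adult deaths.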
Another notable discrepancy between the two sets of estimates is that of maternal mortality rates (MMR) 10,56 . In 2010, four UN agencies (UNICEF, UNFPA, the World Bank, and WHO) produced MMR estimates for the time period of 1990-2008 57 , while IHME produced estimates for the time period of 1980-2008 58 . For the final year of the study, 2008, the difference between the two estimates was small: IHME estimated 342,900 maternal deaths compared with the UN agencies' estimate of 358,000. However, the two estimates differed markedly on their 1990 estimates, which changed the degree of global decrease in maternal deaths 10 . As estimates are used by global health organisations and putatively by country-level health policy makers to measure the success of certain kinds of interventions or approaches to health problems, differences in estimated change certainly confound attempts to distinguish failing from successful health campaigns.
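A short calculation shows why agreement on the final year does not imply agreement on the trend. The 2008 figures below are those quoted above; the two 1990 baselines are HYPOTHETICAL values invented purely to illustrate the mechanism, since the actual 1990 estimates are not given in this text. The function name is likewise ours.

```python
# 2008 maternal death estimates as quoted in the text above.
ihme_2008 = 342_900
un_2008 = 358_000


def percent_decline(baseline, final):
    """Percentage fall from a baseline value to a final value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline - final) / baseline


gap_2008 = un_2008 - ihme_2008
relative_gap_2008 = 100 * gap_2008 / un_2008

# HYPOTHETICAL 1990 baselines, not taken from either study.
hypothetical_1990_low = 400_000
hypothetical_1990_high = 550_000

decline_from_low = percent_decline(hypothetical_1990_low, ihme_2008)
decline_from_high = percent_decline(hypothetical_1990_high, un_2008)

print(f"2008 gap: {gap_2008:,} deaths ({relative_gap_2008:.1f}% of the UN figure)")
print(f"Decline from the lower hypothetical baseline: {decline_from_low:.1f}%")
print(f"Decline from the higher hypothetical baseline: {decline_from_high:.1f}%")
```

With end points only about four percent apart, the implied decline still differs by a wide margin once the starting points diverge, which is the sense in which differing 1990 estimates change the story of progress.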
The fact of the matter is that the production of these estimates is conditioned by many levels of assumption, and they can result in numbers that are unrecognizable at the local level. In anthropologist-physician Clare Wendland's analysis of maternal mortality ratio production and maternal care in a hospital in Malawi, she was confronted by health workers who expressed shock at the estimates produced far away that were being used to define MMR in the country: "They are saying we will meet the Millennium Development Goals. But I can't believe it. If it's that low, why are we still seeing this [much public mourning] every day? 59 " Additionally, the availability and quality of local empirical data that serves as a starting point for these estimates are often determined by global health organisations' priorities. Data collection systems may be thrown up around certain issues in conditions of precarity or where no problem was perceived before, such as cholera rates in post-disaster Haiti, malaria and HIV rates that were collected despite data retention strikes in Senegal 60 , and Zika across the Americas in the wake of it being defined an epidemic by WHO 61,62 . What data are produced is defined by political and societal priorities and what types of data collection systems are funded by donors. This problem on the level of collecting empirical data becomes part of the larger political entanglements of the global health estimates they are used to produce.

Who uses these estimates, and how?
The significant investment in disease burden estimates raises the question of who uses these estimates and how they are consumed. Tracking the use of its studies has proven difficult for IHME itself, which relies on following citation data, tracking the use of its data visualization tools, feedback from its collaborative network, and overseeing awards like its Roux Prize, to determine who and which agencies have used its data. From these sources, we can see that their data is used by other academic researchers (i.e., a high number of citations), global health organisations (i.e., the use of GBD estimates in policy reports), and, to a lesser degree, ministers of health and local politicians (i.e., the use of GBD estimates in national and subnational health policy). Organisations like WHO and the World Bank have at times used IHME's data and at other times produced their own. The most direct consumer of this data is the Gates Foundation itself, the largest funder of IHME, which has mandated that the group produce a yearly revamp of their GBD study 63 . The justification given is that BMGF uses IHME data to inform its investment portfolio. Since the organisation does not make its funding justification public, it is unclear how much GBD data is used regularly within BMGF to inform its funding portfolio.
With respect to WHO, the 2018 IHME-WHO Memorandum of Understanding is the second of its kind, the first having been signed in 2015. According to Boerma and Mathers, the first memorandum was signed "to encourage collaboration on country capacity strengthening, data sharing, and interaction on methods, tools, and actual global health estimates 64 ". Part of this collaboration was the production of the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER), which was a response to a call by WHO in 2013 for consensus and better guidance for reporting and interpreting health data 65 . When Dr. Tedros, who was on IHME's founding board in 2007 as noted above, became WHO's Director-General in 2017, he expressed his interest in reconciling the two systems of estimation. In the 2018 MOU, IHME and WHO carefully outline how the two organisations would collaborate on the General Programme of Work 2019-2023 (GPW 13), in policy dialogue and country capacity building, in publications, and specifically on the production of a single Global Burden of Disease study, the explanation of which occupies most of the document 11 . The agreement will require IHME and UN agencies to confront the methodological tensions at the heart of their differing approaches. Whether it will result in a more careful approach to global health estimates, one that more explicitly communicates uncertainty and addresses the ethical issues at the heart of the study, is a question yet to be answered 17 . The GBD alliance has the potential to create a global health data monopoly, extending the already extensive reach of the BMGF further into WHO in determining how global health problems are known and what kinds of approaches to health problems are viable.
In its continued relationship with the World Bank, IHME has been most recently tied to the Human Capital Project. In October 2017 at Columbia University, World Bank President Jim Kim formally announced the Human Capital Project, after first presenting on it at the 20th GBD Anniversary event at IHME. He explained how the World Bank would publish the next generation of the health systems ranking from the 2000 World Health Report in Fall 2018, in partnership with IHME. The Human Capital Index would measure the educational attainment, educational quality, and functional health levels of each country 66 . In his 2017 presentation at IHME, he argued for the inclusion of estimates from IHME's GBD study in the proxy for functional health levels of each country. The goal of this new index was to measure countries' investments in the education and health of their own citizenry. The larger human capital debate, of which the development of the DALY is a part, is a theoretical argument that economists have put forward to argue that education, health, and other social services are not expenditures but investments in a country's economy 67 . As Flabbi and Gatti explain, when Gary Becker first introduced the idea that "investing in human capital is akin to investing in physical capital," it was quite controversial 68,69 . Even when the 1993 World Development Report 70 was released, which leans heavily on the human capital argument, there was pushback from mainstream economists, as the Senior Director of the Health, Nutrition, and Population team, Tim Evans, reminded the audience at a talk on human capital at the World Bank in July 2018.
However, the execution of the Bank's Human Capital Index project has resulted instead in two separate methodologies of estimating human capital, one produced by IHME 71 and one by the World Bank 14 . This is a reversal of Bank President Jim Kim's earlier assertion about forging a partnership between the two organisations. At the July 2018 talk on human capital at the World Bank, a team member of the Bank's Human Capital Index project was asked why they were not taking advantage of IHME data and the "visibility" of the GBD and DALYs in their calculation and ranking of countries' investment in and status of education and health. The team member argued that both the WHO and IHME versions of the GBD used higher levels of imputation than his team habitually used. He acknowledged that they did go first to the IHME data, but quickly realized that many parts of the estimates were based on scarce empirical data. For the Bank team, this was problematic for three reasons. First, imputing data meant losing the line of sight for ministers of finance, ministers of health, and ministers of education. World Bank staff would not be able to remind ministers of which study was used to produce the numbers or of the history and context of the numbers referenced, and thus the numbers would not be recognizable at the country level. Second, Bank staff would then lose the platform for advocating for better data collection and the importance of addressing the structural problems and inequities that exist beneath the data gaps. Finally, tracking progress on these indicators then becomes very complicated, as estimates change from year to year because of changes in estimation or statistical science as well as changes in material conditions.
As Boerma and his colleagues remind us, "Neither country policy makers nor the global development community are best served by a global flood of health estimates derived from complex models as investments in country data collection, analytical capacity, and use are lagging" 64 .
This methodological conflict between IHME and the World Bank shines a light on one of the conflicts at the heart of the use of IHME estimates in the monitoring of progress on the Sustainable Development Goals. There is first the problem mentioned here by the World Bank representative, and highlighted by others 72 , that with the complete revamping of the methodology of the GBD study every year, there will always be confusion over whether progress from one year to the next is the result of change in the outside world or change in estimation science. Secondly, and most importantly for understanding the significance of the 2018 IHME-WHO MOU, there is the conflict of interest inherent in the BMGF nearly fully financing the production of the GBD while also being one of the largest financial supporters of WHO. Because of the model of production of the GBD, which lies between and outside academic and multilateral frameworks and is not open for country-level review, BMGF maintains outsized control over the means of actually measuring the success of its own investments within the UN agency system c .
Beyond those critiques, a central value of the GBD can be found precisely in its ability to compare countries and regions in the context of health development. In the 2018 IHME-WHO MOU, the two organisations argue that the "GBD's utility is largely for comparisons across locations and over time" 11 . It is precisely this demand for comparability that calls and allows for the standardization of health data and the filling of data gaps 17,33 . However, it is worth asking why global comparability is so important to achieving global health goals. The argument is that this work of comparing promotes healthy competition between countries, spurring those who see themselves as lagging behind into action, as global health leader Sir George Alleyne put it at a talk he gave on the World Bank's Human Capital Project in July 2018. Alleyne added that there is something inherently human about thinking about the world in the framework of hierarchies and that statistics have long contributed to how we determine rank in such hierarchies. It is this assumption about the nature of competition that has prompted projects like the Human Capital Project, of course, and also the 2000 World Health Systems ranking.
In the quantification and rationalization of uncertainty, in an attempt to eradicate it, do other forms of evidence become delegitimized that are important in the process of determining health policy priorities? What are the larger ramifications of practices of standardization, data correction, and imputation that are performed particularly with the goal of making local contexts readable from a satellite's view of the world? We would argue that it is certainly up for debate whether humans are doomed to be dominated by the work of competition rather than collaboration, and that there is an explicit history to the active construction and reification through statistics of "scientific" conceptions of hierarchies that benefit those who are at the top 73,74 .

Conclusion
Fundamentally, the production of the GBD and its centrality to global health governance raises the question of the larger effects of producing numerical assessments of disease burden "from a distance" 75 . How data is collected and how estimates are produced actually shape how we understand global health problems and their potential solutions, and thus estimates should never be taken for granted. When health workers, health policy makers, or even patients do not recognize their experiences in these estimates, like the Malawian physician mentioned above, the usefulness of such estimates and the kinds of knowledge with which they must be accompanied should be assessed. These estimates carry with them particular assumptions about the nature of illness and economics that attempt to universalize experiences of suffering and its impacts on our lives. They are excellent advocacy tools, but unfortunately their power extends beyond merely highlighting a problem, as they also leverage assumed health interventions along with them. What happens far less frequently, with important exceptions 33,76 , is the transfer of skills necessary to produce these estimates on a much more local level and investment in vital registration systems and health information systems 77,78 .
When numbers are called upon to "speak for" the health needs and priorities of populations in the global South, we must also ask the question of who produces these numbers, their apparent apolitical neutrality, and the broader governance structures that allow for them to hold the power they do. This does not mean that these estimates should not be produced, as global burden of disease estimates are an important "first pass" of the health profiles of different countries. However, they cannot be proxies for suffering in and of themselves, and global health benefits greatly from the visibility of the scientific and methodological conflicts that are at the heart of tensions between the two sets of powerful health estimates produced by WHO and by IHME. Ultimately, more work needs to be done to create evidence that is relevant and meaningful on country and district levels, which means shifting resources and support for quantitative (and qualitative) data production, analysis, and synthesis to countries that are the targeted beneficiaries of such global health estimates.

c We thank our reviewers, Carla AbouZahr and Peter Byass, for emphasising the importance of this point.

Data availability
The primary and secondary sources used are cited. M.T.'s field notes from the three public events are under restricted availability. See the American Anthropological Association's 2003 "Statement on the Confidentiality of Field Notes" for more information. However, highly redacted versions of them may be available for academic research purposes. In order to access these highly redacted field notes, please contact M.T. (marlee.tichenor@ed.ac.uk), outlining why the field notes would be relevant for your research and how they will be used.

© 2019 Boerma T. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Ties Boerma
Centre for Global Public Health (CGPH), University of Manitoba, Winnipeg, Canada

The Global Burden of Disease (GBD) has become a major instrument in global health and appears destined to become even more important, if only because of the massive investments by the Gates Foundation into the production of GBD estimates. This paper reviews the position of the GBD as a tool in global health in the light of a recent agreement between IHME and WHO to produce a single GBD study, focusing on possible positive and negative implications of such an agreement.
The paper is a useful piece to stimulate debate. The methods used by the authors have limitations (literature review and field notes from three public events) and could have benefited from more extensive and systematic research on the subject. This would have led to a more comprehensive understanding of the history of GBD and relevant concurrent developments in the global estimation space, but in spite of this shortcoming it is a thoughtful contribution which can serve to generate more reflections. In this light, there are a few areas where the paper could benefit from more comprehensive considerations:

Scope of GBD
The GBD initially was intended to provide a comprehensive analysis of health loss. The key outputs were mortality by cause, DALYs and healthy life expectancies (HALE). The latter measure is not mentioned in the paper but is probably the most suitable single measure for assessment of progress towards the health goal of the sustainable development goals (health and wellbeing for all). In recent years, the GBD study has become synonymous with any estimate of individual disease burden, from maternal mortality to mental illness, and is an essential component of indexes such as those for IHME health SDG and the IHME quality and access index. WHO (as well as UNICEF and UNAIDS) has traditionally generated estimates through its programs, working with expert groups with broad representation from the world's leading scientists in their respective fields. WHO fitted these disease-specific estimates into an overall envelope of the total number of deaths in the world to produce an overall picture of the burden of disease and ensure that overestimation would not occur. WHO's decision to rely much more on IHME's GBD for the overall burden of disease process makes sense. A key point, however, is the extent to which the specific WHO programs will be able to continue to produce independent estimates of disease burden in their respective areas, through their own expert group mechanisms, with IHME involvement where appropriate, and supported by solid investments by WHO.

Global actors
The focus of the paper is on WHO and the World Bank. There are however other actors that need to be considered when moving forward. For instance, the United Nations Population Division produces estimates of the current and future population for each country on a regular basis. These estimates are produced in collaboration with countries. IHME is now using its own population estimates and this has a major impact on all GBD estimates. A very different actor in global health is the Lancet. The journal is mentioned frequently in the paper but its role is not examined. The Lancet has been instrumental in driving the global health agenda during the past decades and has been an effective vehicle for the publication of UN health estimates. The Lancet has been publishing volumes on the GBD with dazzling frequency and details on country statistics and rankings. The Lancet is owned by Elsevier, one of the largest publishing companies in the world, with a different set of institutional ethics and motivations that are worth considering in the debate.

Monitoring
There is a tendency to increasingly use the GBD for the monitoring of progress in the context of major international health goals, most notably the SDG. The GBD study can produce almost any index to assess country progress for any year through the modelling exercises. The distinction between actual country progress and prediction becomes blurred. This is not a good practice and certainly puts countries in the role of spectators.

Countries
The paper rightly raises concerns on the role of countries in the GBD exercise. This is not unique to the IHME GBD and is also a challenge for WHO and other UN agencies. WHO's consultation processes to share estimates with countries for inputs (e.g. bring in new data, challenge methods and assumptions) are important but need considerable strengthening to make countries a full partner in these processes. Until today, IHME has invested far too little in making tools available to countries. In spite of all its sophistication it should be possible to develop GBD tools that can be put in the hands of countries and with default values and methods regularly updated through a web-based process. This would be a major step forward, especially if combined with analyses to identify data gaps and what can be done to reduce those gaps. The SDG have clearly laid out an agenda where countries are much more central than global actors. This deserves a systematic approach by WHO and its partners including IHME.

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results?

Carla AbouZahr
Bloomberg Data for Health Initiative, Melbourne, Australia
CAZ Consulting Sárl, Geneva, Switzerland

This is a timely and welcome effort to trace the history of the origins and evolution of the GBD. The article covers two interrelated topics: the evolution of the GBD since its origins in the 1993 World Development Report; and the implications of the recently signed MOU between IHME and WHO, described by the editor of the Lancet as ending "a cold war" between the two institutions. The authors present a summary of the development of GBD and raise important questions about the role of GBD estimates with regard to global health governance, health policy and relationships between donors and countries. However, there are some gaps in the historical account and in the analysis that should be addressed prior to publication.
The Methods section describes the two sources upon which the paper is based: published articles on GBD, along with "media representations on WHO, IHME and World Bank partnerships" and notes taken during public events, two of which were IHME organised, including the 20 year anniversary. I do not think this is a sufficient basis upon which to draw conclusions. It would have been advisable to have consulted with other individuals and institutions involved in the evolution of GBD. The reporting of the 1993 World Development Report appears to have been largely influenced by reference 20, the 2017 paper by Murray and Lopez in which three individuals are singled out as contributors to WDR 1993 (with a footnote to "many others whose labour was essential"). There is no mention of the large teams drawn from multiple agencies, including not only the World Bank and WHO but also academia and other UN agencies. This is a weakness because the institutional support provided by global agencies was instrumental for the long-term sustainability of GBD.
The authors quote the original purpose of GBD, to provide "a detailed study of health loss to provide the basis for objective assessments about the probable benefits of applying packages of interventions." This is correct, but the drive to "decouple epidemiological assessment …. from advocacy by interest groups of particular health policies or interventions," seems to have crept in later, along with criticism of the estimation methods used by WHO, UNICEF and others.
The authors might have taken a more critical view of this judgement and questioned whether technical experts working on specific health topics are incapable of "a neutral, objective perspective" (page 4). I wonder if they discussed these issues with WHO staff. There is no mention of the various technical evaluation reference groups established by WHO to guide the estimation process and provide objective, independent advice on the methods and results. See, for example, http://cherg.org/main.html and http://www.epidem.org. Given that this paper purports to trace the evolution of GBD, they could have paid more attention to the ways in which the GBD evolved to take account of the criticisms that arose from the beginning. The authors state that age weighting was dropped in the 1997 GBD but this should be checked. I believe that age and time discounting were not dropped until the 2010 GBD following the advice from an array of scientists and ethicists. The authors affirm that "the architects of the GBD have carefully reassessed ….. the disability weights, ….. among other changes over the decades" but provide no specifics.
They might also have considered whether other criticisms of GBD have been sufficiently addressed, for example, neglect of the gender dimensions of health, inadequate examination of the underlying social and economic determinants of health, and the exclusion of certain adverse health outcomes, notably stillbirths. More discussion of the ways in which the DALY metric has been modified (or not) over time and the implications of such changes would have added value to the history.
The authors could have been more inquisitive about how the data needed to fuel the annual GBD updates are gathered and from whom. Perhaps part of the rationale for the MOU is to enable IHME to benefit from WHO's privileged access to country data? They might also have examined the unintended effects of the annual GBD revisions which rely on ever more complex and data-hungry algorithms. The risk is that the GBD turns into an academic extractive industry that takes valuable resources (data) from countries and adds value in a faraway institution that is accountable only to its own funding partner.
An aspect that the authors do not mention is that the GBD undertaking would not be possible without the huge advances in information technology behind the complex calculations and algorithms that drive the GBD. Such IT capacities and skills are not widely available, even in many middle-income settings, let alone in the global south. In their absence, it is impossible to truly understand the inner workings of the GBD or to replicate what it does.
The section on users and uses of GBD is very interesting but could have been enriched by a more careful analysis of the ways in which GBD is used and by whom. Most users welcome the GBD for its overview of global patterns of death and disease and value IHME innovation in areas such as health futures, trend forecasting, costing and cost-effectiveness, and visualization of complex data. However, the GBD work is contentious not only because estimates differ between WHO and IHME but also because increasingly the indicator estimates are used for monitoring progress on indicators, such as those included in the SDGs, that are highly politicised. Using the estimates for monitoring is technically problematic; in practice the estimates are predictions of indicator values for a given year extrapolated from past trends. The use of predictions for monitoring purposes has been criticised by Murray among others.
In discussing the available information on users, the authors could have thought more about the reasons for the relative lack of use by country decision makers. Did the authors consult with some of the country policy makers whose data are used in the development of GBD? The GBD annual updates typically involve a complete re-estimation of the whole time series rather than simply adding new values for recent years, effectively shifting the goal posts at each reiteration of the estimates. From a policy-maker's perspective in a particular country, it is hard to decipher whether the updates reflect new data or changes in the estimation methods. This can seem highly confusing, not to say dubious, even in countries with strong statistical systems. The use of confidence intervals to reduce the impact of the differences may allay the concerns of technical experts but can be hard to explain to policy makers.
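The "shifting goal posts" point can be made concrete with a small sketch. It is not drawn from any GBD or WHO method; the function name, the two "vintages" and every figure below are hypothetical and chosen only for illustration. It assumes two publication rounds of the same indicator and separates the change a reader sees when comparing headline figures across rounds into the rewriting of the earlier year and the trend within the newer round.

```python
def decompose_change(old_vintage, new_vintage, base_year, latest_year):
    """Split the naive change between two publication rounds into two parts.

    old_vintage, new_vintage: dicts mapping year -> estimate (hypothetical units).
    """
    # What a reader sees when comparing headline figures across publications.
    naive = new_vintage[latest_year] - old_vintage[base_year]
    # How much the base-year estimate itself was rewritten by the update.
    revision = new_vintage[base_year] - old_vintage[base_year]
    # Change over time measured within one consistent round of estimates.
    trend = new_vintage[latest_year] - new_vintage[base_year]
    return {"naive": naive, "revision": revision, "trend": trend}


# Hypothetical figures, invented purely for illustration.
previous_round = {2015: 400}
current_round = {2015: 350, 2016: 340}
print(decompose_change(previous_round, current_round, 2015, 2016))
# {'naive': -60, 'revision': -50, 'trend': -10}
```

In this invented case most of the apparent fall comes from re-estimating the earlier year. The sketch cannot say whether that revision reflects new data or new methods, which is the ambiguity described above.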
The authors do not suggest how the GBD might be made more useful and relevant to country policy makers or how to reduce the inevitable tensions arising when global estimates differ from country values. They could draw on the WHO experience of sharing its estimates with countries prior to publication and working with them to explain why the modelled estimates differ from country reported values. Inevitably there are disagreements, but the approach has contributed to improved understanding of the estimation process, capacity development and trust building between countries and agencies. IHME does not engage in similar country consultation processes prior to publication. I agree with the authors' concerns about the implications for global health governance and health policy of "an alliance of global burden of disease estimates" which may potentially create a "global health data monopoly." They might have explored this issue in more detail. While this is not the first attempt to improve collaboration between WHO and IHME, the tenor in this instance seems to place IHME, with its vast human and financial resources, firmly in the lead, rather than on equal terms with WHO. The authors suggest that the alliance will require WHO to "reconcile estimates and methodologies from IHME with its own estimates and country reported numbers" and that the GBD findings would be published in the Lancet, "before being used in official WHO documents". If this is correct, WHO will find itself in the position of having to justify IHME estimates to its own Member States! It would have been instructive had the authors pursued some of their contacts to answer some more detailed questions. What is the balance of gains and losses if global and country health policy-makers have access to only a single source of global health data? What are the implications for WHO's governance architecture if the secretariat has to take on the responsibility of reconciling country data with IHME estimates?
Will WHO continue to develop its own estimates? What will happen to the technical advisory mechanisms established by WHO programmes that offer good models of inter-agency collaboration? Are there potential conflicts of interest given the growing role of the Gates Foundation in both IHME and WHO as well as disease-focussed health programmes around the world?
The authors' overall conclusions seem incontrovertible. The future development of the GBD has major implications for countries and for the international development system in terms of governance, health policy, transparency, ownership, and accountability. Further open discussion of these issues is much needed.
Editorial comments:
Table 1 should include the WHO update on methods in 2018 (https://www.who.int/healthinfo/global_burden_disease/GlobalDALY_method_2000_2016.pdf).
Table 1 should provide correct citations.
Table 1 Line 1: there is no mention of WHO institutional support to the 1993 WDR.
Table 1 Line 3: there were several WHRs that were based on GBD analyses and these should be cited correctly, 1999, 2000, 2001, 2002. All used the concept of DALYs.
Table 12: the comment "new methodologies for finding mortality" should be explained.

There is a danger of over-personalising the admittedly somewhat fraught history of the Global Burden of Disease (GBD) programme over a quarter of a century. The consistent personal aspect of this history is tied up around Chris Murray and Alan Lopez, who at the outset of the process were WHO staffers, a point that does not come across clearly from the start here. Not surprisingly, they have moved institutions and attracted different funding support several times over the whole period. Thus WHO and the World Bank were involved from the start, and continue to be so; the 2018 MoU between WHO and GBD is the latest development in this long saga. There have also been significant collaborations along the way, for example the development of the GATHER guidelines for reporting global health estimates, which was the direct product of technical work involving both WHO and IHME, and should be mentioned here. However, there have certainly been institutional and personal frictions associated with the complex history of GBD, which are important to explore and understand.

The 25 years of GBD:
This section describes much of the practice and politics of GBD development, but in parts has a more journalistic than academic style, and particularly lacks a conceptual framework as the basis for understanding some of the successes and failures along the way. My suggestion would be that insights into this process should be based on understanding and comparing the varying world views of the key institutions and individuals involved. In a 2010 commentary, when WHO-GBD relationships were particularly difficult, I suggested that an important source of difference in understanding arose between the UN world view, in which member states are constituents who need to be carried along in the process of deriving estimates, versus academic institutions' world view in which research sources and methods need to be processed as cleverly as possible and promulgated to the world.
(note two typos here in the acronym for Bill and Melinda Gates Foundation, should be BMGF)

On the production of GBD:
Partly deriving from GBD's distinctive world view, of seeking to assert estimates of everything for everywhere at all times (which is a noble academic ambition), a lot of gap-filling has to be done because the ideal underlying data simply don't exist. In many instances, imputed estimates turn out very plausibly, and therefore, it might be argued, usefully. However, taking this approach does have its dangers, and there will be examples where for some reason the estimating process falls over (particularly if over-sophisticated modelling is applied to very sparse data). An interesting example was seen in GBD estimates of dengue deaths and case numbers, which, when divided to derive case-fatality rates, improbably showed that the USA had the highest dengue case-fatality rate in the world. Thus there seems to be a case for more effective plausibility checking mechanisms for GBD estimates, rather than necessarily considering what comes out of the modelling as some kind of "truth".
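A plausibility check of the kind suggested here might look like the following minimal sketch. It is an assumption-laden illustration rather than anything used by GBD or WHO: the function name, the threshold of five times the median, the location labels and all counts are hypothetical. It divides estimated deaths by estimated cases for each location and flags locations whose implied case-fatality rate sits far above the median across locations.

```python
import statistics


def flag_implausible_cfr(estimates, max_ratio=5.0):
    """Flag locations whose implied case-fatality rate looks out of line.

    estimates: dict mapping location -> (estimated deaths, estimated cases).
    max_ratio: hypothetical threshold, as a multiple of the median rate.
    """
    # Implied case-fatality rate: deaths divided by cases, skipping zero cases.
    cfr = {
        loc: deaths / cases
        for loc, (deaths, cases) in estimates.items()
        if cases > 0
    }
    median_rate = statistics.median(cfr.values())
    # A flag is a prompt for human review, not evidence of an error.
    return sorted(loc for loc, rate in cfr.items() if rate > max_ratio * median_rate)


# Hypothetical counts, invented purely for illustration.
example = {
    "Location A": (50, 100_000),
    "Location B": (40, 80_000),
    "Location C": (30, 50_000),
    "Location D": (20, 1_000),
}
print(flag_implausible_cfr(example))  # ['Location D']
```

The median is used as the reference because a single extreme value does not shift it much; any real check would need thresholds agreed with disease experts.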

Who uses these estimates?:
Perhaps a more important question is "How are these estimates used?". While the point is made that BMGF mandates and supports the GBD process as a means of prioritising its global health priorities, you should also discuss this point as a potential conflict of interest. Every development agency rightly delights in their own successes, but this becomes tricky if they are also supporting the underlying metrics. At the

Thank you for your careful analysis of our article and for your very important critiques and references, and we have attempted to address your most important points. As we mentioned in the main response text, we have included a framework for conceptualising the different modes of global health estimate production, building from your own conceptual framework that compares the "UN world view" with that of academic institutions (Byass 2010). As we mention in a footnote in the article, we are grateful for your emphasis on BMGF's conflict of interest with regards to funding both global health investments and the potentially primary mode of measuring their success in the SDG framework.
Competing Interests: No competing interests were disclosed.

Version 2
Author Response 29 Jan 2020

Marlee Tichenor, University of Edinburgh, Edinburgh, UK
In response to Colin Mathers' comment on the previous version of our article: Thank you for your careful analysis of our article and for your very important critiques and references, and we have attempted to address your most important points. We appreciate your point that there are multiple ways of interpreting the relationships between the different institutions involved in the GBD study over its lifespan, particularly between WHO and IHME since the latter's creation in 2007, and that the nature of that relationship has changed over time. We use Horton's shorthand of "a Cold War" as a gesture to these varied tensions; in the introductory paragraph, we have tried to be clearer that Horton's take is one of many to describe this longstanding relationship between the two organisations. With regards to the benchmark year of 1990, we did not mean to indicate that the same data from that year has been used in GBD studies since, but instead to use that year to highlight the number of times that mortality and morbidity estimates have been reworked based on new data and new procedures in every iteration of the study in the years since. We have also corrected the point we made about the heart of the discrepancies between IHME and WHO estimates of malaria mortality and morbidity.
Competing Interests: No competing interests were disclosed.

Version 1
Reader Comment 05 May 2019

Colin Mathers, Former WHO Coordinator for Mortality and Health Analysis, Geneva, Switzerland
As a former WHO staff member, who played a key role in the production and clearance of WHO health statistics over the last 15 years, and a long-time collaborator with the GBD enterprise and with IHME, I read this paper with considerable interest and note that the authors state that they primarily used notes taken at three IHME events. There were apparently no inputs from WHO staff involved in the interaction with IHME and GBD. This has resulted in some inaccuracies in the paper, some of which I address in the comments below. It is disappointing that an article examining the interaction between IHME and WHO/UN did not make the effort to include inputs from WHO and UN people who are closely involved in global estimates as well.
Despite what the editor of the Lancet, Richard Horton, is quoted as saying in the paper, there was no so-called "cold war" between WHO and IHME before 2012. Ties Boerma and I were members of the core scientific group for the first GBD2010. This was the central scientific decision-making group set up in 2007, with 15 members of whom 9 were from outside Chris Murray's research group. I and many other WHO staff contributed to the work of the GBD over the next five years, though Ties and I became increasingly concerned that the external core group were being excluded from access to the data and analyses. Around the period 2011 to 2012, six of the external core group members withdrew from the core group due to this and related issues. Apart from myself and Ties Boerma from WHO, this included Bob Black and Neff Walker from Johns Hopkins University, and Ken Hill and Dean Jamison from Harvard University. From WHO's point of view, there was no cold war (1), and various WHO staff continued to provide data and contribute to GBD analyses, and WHO continued to make use of analyses derived from the IHME GBD results. However, because we could not gain access to data and analyses, WHO staff were unable to agree to be authors on GBD papers and WHO as an institution was unable to endorse the results. Perhaps more importantly, WHO was also unable to examine areas where GBD results differed from WHO and other UN statistics in order to reconcile differences and potentially improve global health statistics. On page 4, the paper claims that the GBD 1990 data were reworked in various ways and used for the next 25 years, until IHME undertook the GBD2010. This is quite incorrect. During the period from 1999 through to 2008, the majority of mortality and morbidity estimates (for almost all diseases of public health importance) were revised with new inputs. 
This included development of new model life tables at WHO, a big growth in disease-specific modelling both at WHO and by academic collaborators, and the establishment of various UN interagency groups, particularly for MDG targeted diseases. I have reviewed WHO work on GBD during the period 1999-2008 and estimate that morbidity and disability estimates were revised using new data for around 90% of the disease and injury causes (including all those of public health importance) and mortality estimates were revised for 100% of causes. Disability weights were the main area where a comprehensive update was not carried out, though quite a few were revised using a European study (2), the World Health Surveys (3) and other sources of population information on health states.
The paper is incorrect in saying that the difference in malaria mortality estimates is because the IHME uses MAP parasite prevalence. WHO also uses the same parasite prevalence data as a major input to its estimates of malaria mortality (4). The big difference arises from IHME interpretation of verbal autopsy data in a way which maps much more "fever of unknown cause" to malaria for adults than WHO does.
The paper notes the difference in the IHME estimated trend for maternal mortality compared to that estimated by WHO, although there is little difference in the latest year estimates. Both IHME and WHO methods estimate the proportion of all female deaths in the reproductive period that are maternal deaths, and these estimates are reasonably similar. The trend difference in numbers of deaths arises because the IHME life tables have flatter adult female mortality trends than the UN life tables (5). The IHME life tables place greater credence on sibling history data for periods long before surveys and have flatter adult mortality trends in parts of Africa. This results in flatter maternal mortality trends.
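The arithmetic behind this point can be sketched in a few lines of Python. This is a minimal illustration only: all numbers are hypothetical and are not IHME, WHO, or UN estimates, and the function names are illustrative. It assumes maternal deaths are obtained by multiplying the proportion maternal by all-cause deaths among women of reproductive age, and shows that two death envelopes that agree in the latest year but differ in trend give different maternal mortality trends even when the proportion maternal is identical.

```python
# Hypothetical numbers for illustration only; not IHME, WHO, or UN estimates.

def maternal_deaths(proportion_maternal, female_deaths):
    """Maternal deaths = proportion maternal x all-cause deaths in women of reproductive age."""
    return [p * d for p, d in zip(proportion_maternal, female_deaths)]

def percent_change(series):
    """Percentage change from the first to the last value of a series."""
    return 100.0 * (series[-1] - series[0]) / series[0]

years = [1990, 2000, 2010]
pm = [0.10, 0.10, 0.10]  # same proportion maternal in both scenarios (assumed)

# Two hypothetical all-cause envelopes that agree in the latest year.
envelope_steep = [200_000, 150_000, 100_000]  # steeper decline in adult female mortality
envelope_flat = [120_000, 110_000, 100_000]   # flatter decline in adult female mortality

steep = maternal_deaths(pm, envelope_steep)
flat = maternal_deaths(pm, envelope_flat)

for label, series in (("steeper envelope", steep), ("flatter envelope", flat)):
    print(f"{label}: latest year {series[-1]:.0f} deaths, "
          f"change since {years[0]} {percent_change(series):.1f}%")
```

In this toy setup the latest-year numbers coincide, so the entire difference in trend comes from the envelope, not from the proportion maternal.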
In the discussion, the authors question the value of competition in achieving global health goals and link this to the emphasis in the GBD and indeed in all the UN global health statistics on the comparability of statistics across locations and times. While it is arguable whether the whole global targets setting process spurs healthy competition between countries, the concern about comparability in statistics is essentially a concern to have meaningful statistics. And any statistic is only meaningful and interpretable through comparison. For example, an average death rate of 8,945 per 100,000 population is uninterpretable to almost everyone, unless put in a comparative context. Measurement only has meaning if a standard scale is used (or at least fixed scales that can be translated to each other). Since bias varies over time as well as over space, you could argue that lack of concern for comparability would be like tracking your weight with a scale whose zero is varying in an unknown way over time.
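The weighing-scale analogy can be made concrete with a small Python sketch. The values are hypothetical and chosen purely for illustration: a true weight that never changes is read on a scale whose zero drifts, so the apparent change over time is produced entirely by the drift.

```python
# Hypothetical illustration: a constant true weight read on a scale whose zero drifts.
true_weight = [70.0] * 5                  # kg, unchanged over five weighings (assumed)
zero_offset = [0.0, 0.5, -1.0, 1.5, 2.0]  # drift in the scale's zero, unknown to the user (assumed)

# What the user actually sees at each weighing.
readings = [w + b for w, b in zip(true_weight, zero_offset)]

apparent_change = readings[-1] - readings[0]   # what the readings suggest
true_change = true_weight[-1] - true_weight[0]  # what actually happened

print("readings:", readings)
print("apparent change:", apparent_change, "kg; true change:", true_change, "kg")
```

Without knowing the offsets, the readings cannot be compared with one another, which is the sense in which comparability is a precondition for a meaningful statistic.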
The authors do raise relevant and important issues around the potential creation of a global health data monopoly, the concentration of analytic skills in a first-world institution, and the broader governance structures and accountability for statistics. Many developing countries have little interest in the outputs of a US academic group, but are very concerned about WHO and UN statistics. UN agencies have a mandate to produce statistics and some responsibility to consult with countries. IHME has tried to spin this as "political interference", which has largely not been the case, at least in my experience carrying out a central statistical clearance role in WHO and in working with the various UN interagency groups. The downside of IHME "independence" is that there have been quite drastic changes in methods and estimates from revision to revision for some causes and topics, with little responsiveness in some cases to those who pointed out problems before publication. A recent example is drug overdose deaths for the USA, where GBD2016 excluded prescription opioid deaths (without documenting this) for unknown reasons, and GBD2017 included them, resulting in a more than doubling of drug overdose deaths. The sudden introduction of very different birth denominators in GBD2016 similarly knocked around half a million child deaths off the global total compared to the UN total (which previously was almost identical). IHME is now estimating its own population and birth numbers, so the mortality and other outputs are inhabiting a parallel demographic universe to those of the UN agencies. This makes the issues of understanding differences even more complex and opaque, and I suspect it will unfortunately limit the ability of UN agencies to make direct use of IHME results.