The effect of health financing systems on health system outcomes: A cross‐country panel analysis

Abstract Several low‐ and middle‐income countries are considering health financing system reforms to accelerate progress toward universal health coverage (UHC). However, empirical evidence of the effect of health financing systems on health system outcomes is scarce, partly because it is difficult to quantitatively capture the ‘health financing system’. We assign country‐year observations to one of three health financing systems (i.e., predominantly out‐of‐pocket, social health insurance (SHI) or government‐financed), using clustering based on out‐of‐pocket, contributory SHI and non‐contributory government expenditure, as a percentage of total health expenditures. We then estimate the effect of these different systems on health system outcomes, using fixed effects regressions. We find that transitions from OOP‐dominant to government‐financed systems improved most outcomes more than did transitions to SHI systems. Transitions to government financing increases life expectancy (+1.3 years, p < 0.05) and reduces under‐5 mortality (−8.7%, p < 0.05) and catastrophic health expenditure incidence (−3.3 percentage points, p < 0.05). Results are robust to several sensitivity tests. It is more likely that increases in non‐contributory government financing rather than SHI financing improve health system outcomes. Notable reasons include SHI's higher implementation costs and more limited coverage. These results may raise a warning for policymakers considering SHI reforms to reach UHC.

via SHI agencies, implying that a contribution is required to access services, irrespective of whether the contribution is subsidized by the government or not. Government financing refers to any other non-contributory public health expenditure, that is, where access to services is automatic, not linked to contributions, and usually based on citizenship or residency status. In either case, the aim is to increase pooled public health expenditure and to transition away from out-of-pocket (OOP) private health expenditure (World Health Organization, 2019) toward UHC. HFS reforms entail substantial long-term administrative efforts (e.g., setting up new laws and functional agencies), and may impact financial risk protection and population health for years to come.
Despite the importance of HFS as a major factor for achieving UHC, there is scarce empirical cross-country evidence on the impact of HFSs on health system outcomes. Two important but regionally focused studies (on OECD and Eastern European countries) from more than a decade ago concluded that introducing SHI led to no improvement or even to a deterioration of health outcomes, while having increased costs (Wagstaff, 2009a;Wagstaff and Moreno-Serra, 2009a). A common issue in these studies is that a country's HFS, depending on existing laws, could only be classified as either "tax-based" or "SHI". By allowing only these two classifications, countries financed predominantly by OOP expenditures were (mis-)classified as either tax-based or SHI. In addition, only the effects of transitioning from tax-based to SHI HFS were examined, thus ignoring the potential effects of transitioning from predominantly OOP to either tax-based or SHI HFSs. Another global study found that (proportional) increases in expenditure in contributory SHI and non-contributory government financing are positively correlated with service coverage indicators, but only non-contributory government financing is correlated with improvements in financial risk protection (Wagstaff & Neelsen, 2020). This study investigated the association of HFS with financial risk protection and service coverage but not health status, controlled only for GDP per capita, and most importantly did not investigate transitions from OOP to either SHI or government financing predominant HFS, which is arguably the decision commonly faced by policymakers when contemplating potential paths toward UHC. Other, broadly related studies have investigated the impact of public health expenditure on health system outcomes, mostly finding a positive effect (Nakamura et al., 2016), yet without differentiating public health expenditures into government or SHI financing sources. Finally, a recent systematic review of relevant country case-studies concludes that public health insurance, defined as SHI and community-based health insurance, appears to reduce financial risk protection (Erlangga et al., 2019). However, it is not clear whether this effect is applicable to the entire populations of SHI-countries, or to SHI beneficiaries alone.
In this paper, we seek to assess the impact of different HFSs on health system outcomes (i.e., health status, financial risk protection and utilization (Kutzin, 2008)), and on health expenditures, with a view to informing decisions about potential transitions to either contributory SHI or non-contributory government financing, aimed at accelerating progress toward UHC. We also shed light on potential contextual factors likely to affect the impact of HFSs.
We find that transitions from OOP-to SHI-predominant HFS resulted in increased total health expenditure. However, transitions to government-predominant HFS resulted in greater immunization coverage, and improved health system outcomes (life expectancy, under-5 mortality and incidence of catastrophic health expenditure). As potential reasons, we discuss the role of (higher) costs for implementing SHI, its benefits being contribution-linked, the tendency to favor secondary/tertiary care expenditures, and SHI's limited ability to decrease OOP expenditures. We also detect a role for contextual factors: in particular, increases in informal sector size diminish the effects of HFS on most health system outcomes. Other contextual factors considered (GDP per capita, governance) also act as effect modifiers of HFS, albeit to a lesser extent.
Endogeneity, driven by reverse causality (e.g., as countries with low financial risk protection may be more likely to introduce SHI) and omitted variable bias, is a central challenge in all studies investigating the association between HFS, health expenditure and health system outcomes (Nakamura et al., 2016), and our study is no exception in this regard. We seek to address endogeneity via fixed effects regressions, exploiting the variation in HFS generated by the health financing transition, which allows controlling for the influence of unobservable or unmeasured time-invariant factors. As our results are robust to most, but not all, different specifications and outcomes, concerns regarding endogeneity driven by reverse causality are not completely resolved. For this reason, we do not claim to provide entirely causal evidence.
This paper contributes to the literature in several ways. First, we refine the classification of HFS by using a machine-learning, data-driven approach, which allows us to distinguish contributory SHI-, non-contributory government-, and OOP-predominant HFSs. Second, we examine the separate effects of transitions from OOP-predominant to SHIor government-predominant HFSs. This is an advance on previous studies that commonly considered public health expenditure as a bundled aggregate, irrespectively of its specific financing nature (Filmer & Pritchett, 1999;Moreno-Serra & Smith, 2015;Nakamura et al., 2016;Rajkumar & Swaroop, 2008), and on studies that did not model transitions from OOP-to SHI-or government-predominant HFSs (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a;Wagstaff & Neelsen, 2020). We also use panel data across more country-years than previous HFS studies, and-in order to reduce omitted variable bias-we take into account the potential role of multiple contextual factors that were used in the public health expenditure literature (Nakamura et al., 2016), but were neglected in previous SHI-related studies (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a;Wagstaff & Neelsen, 2020). Finally, we provide more depth to the conclusion that "context matters", by empirically investigating interactions between contextual factors (e.g., informal sector size) and HFS transitions.
While these results should not be taken to imply that non-contributory government-predominant HFSs are always 'better' than contributory SHI-predominant systems, they may raise a warning to policymakers favoring the path of SHI to accelerate progress toward UHC, while reassuring those aiming for expansions of non-contributory government-financed systems. GABANI et Al. F I G U R E 1 Conceptual framework. Source: authors' elaboration, expanding frameworks presented in (Kutzin, 2008). The conceptual framework follows a logic model representation. Black lines represent potential causal pathways between HFS and health system outcomes; numbers attached to black lines refer to hypotheses listed in the "Hypotheses" section. The two hypotheses and two unintended consequences noted in the figure are not exhaustive. Red lines represent the pathways investigated by this study. Blue lines represent pathways investigated by existing cross-country regression studies: Moreno-Serra, 2009, andNeelsen, 2020 [Colour figure can be viewed at wileyonlinelibrary.com] Figure 1 illustrates the hypothetical pathways mapping HFS reforms to health system outcomes through intermediate outputs and outcomes. More detailed pathway examples are provided in Appendix 1. A country with 'predominant-OOP HFS' has its total health expenditure (THE) predominantly contributed through OOP, as identified via cluster analysis, and similarly for other classification categories (more on this in Section 3.1).

| HEALTH FINANCING SYSTEMS (HFS) AND HYPOTHETICAL EFFECTS ON HEALTH SYSTEM OUTCOMES
As shown in the above framework, governments may reform their HFSs to increase (pooled, pre-paid) public revenues and public health expenditure. These efforts are in line with the recommendation of increasing public health expenditures in order to avoid that OOP expenditures increase impoverishment (Xu et al., 2003) and decrease utilization of health services (Qin et al., 2019). For example, assume a country with very high OOP health expenditures decides to subsidize completely all health services to under-5 year old children and pregnant women. The pool of (fully) covered patients increases and more public revenues (payroll contributions and/or general taxes) are then raised to pay for those services. OOP expenditures of pregnant women and families with under-5 year old children will be expected to decrease as services previously paid by OOP are now subsidized by the government via pre-paid taxes: these families are now more protected against financial risk (related to OOP health expenditures). Government non-contributory financing emerges as the largest contributor to THE, exemplifying what we call a transition (Fan & Savedoff, 2014a;World Health Organization, 2019) from OOP-predominant to government-predominant HFS. As in the health financing transition described in the literature (Fan & Savedoff, 2014a), THE per capita is likely to grow while government financing expenditure becomes predominant and OOP expenditure as percentage of THE decreases. Higher THE, especially via increased government financing and decreased OOP expenditure, would translate into more services offered to the population (Qin et al., 2019). Assuming there is demand for the services, the population will use more services than before, and, if those services are of sufficient quality, population health will improve (Moreno-Serra & Smith, 2015). Similarly, THE may be spent more efficiently when it is pooled and pre-paid: families paying for services OOP do not pool together their financial resources and pre-pay for complex and efficient services, such as vaccination or public health campaigns, and neither can they share the risks of ill health across life-stages (old-young) or social strata (rich-poor). Pre-paid pooled expenditures in principle allow the government to deliver cost-effective, preventative health services such as vaccinations, community health, and others, as well as to pool the risks of different individuals together (Moreno-Serra & Smith, 2015). In both these cases (i.e., when THE increases and/or when THE is spent more efficiently), if pregnant women and families with children under-5 who utilize public services belong to poorer population groups, health equity may also be positively impacted.
Contextual factors and unintended consequences are included in the conceptual framework. Contextual factors are to a certain extent not directly part of-and to a certain extent may be external to-HFS transitions (e.g., quality of governance, education or income per capita), and may modify the effect of HFSs on health system outcomes (Rajkumar & Swaroop, 2008). One example of an unintended consequence is that the introduction of SHI funded via payroll contributions SHI may drive lower formal employment (Savedoff, 2004;Wagstaff & Moreno-Serra, 2009b;Yazbeck et al., 2020), which in turn may reduce the amount of revenues generated to fund the system and hence limit SHI coverage. Figure 1 also shows the effects investigated by previous studies (blue lines), and the effects investigated in this paper (red lines), highlighting our contribution to the literature. An important clarification is required: as shown by the red arrow, we do not investigate the effect of an SHI reform in the way this has been done in (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a), which would classify as "SHI" any country with SHI policies, laws and institutions. The effect we investigate is that of health expenditure transitions, from being OOP-predominant to being government-or SHI-predominant. Take for example, Ghana, which introduced the National Health Insurance Fund (NHIF, a form of SHI) in 2004. By 2017, OOP still accounted for the largest proportion of Ghana's total health expenditures. Hence, Ghana is classified in this paper as an OOP-predominant HFS country, rather than a SHI country, despite the existence of SHI laws and institutions. Therefore, our analysis does not estimate the effect of the introduction of the SHI policy in Ghana. Similarly, Brazil's public health system was instituted by law in 1990, but its HFS transitioned from being OOP-predominant to being government-predominant only in the 2000-2017 period: our study measures the effect of Brazil's (and other countries') health financing transition out of OOP expenditure, rather than the introduction of laws to expand primary healthcare services. As these examples illustrate, our baseline estimates of the effects for SHI-and government financing-predominant HFS should be interpreted as the effects of SHI or government financing policies that are successful in making a HFS transition from OOP predominance to SHI or government financing predominance. SHI or government financing policies that fail to do so would result in HFS being classified as OOP predominant HFS. The effect of increased SHI or government financing expenditure that does not translate into a change in predominant HFS is explored in models where we use SHI and government finance as % of THE as treatment variables (Appendix 6), instead of using SHI-and government financing-predominant HFS dummies, and in models exploring within-group changes in financing arrangements percentages (Appendix 4).
Government financing is universal in that it provides healthcare coverage to the population automatically based on residency or citizenship status, without requiring a direct contribution. Health services are pre-paid, usually by general taxation, and there is usually a common pool for all residents/citizens. Predominantly government-financed countries, whose public health systems are often referred to as "national health service", are for example, UK, Italy, Spain, Australia, Canada, and Cuba. Publicly funded health insurance schemes that are entirely non-contributory (e.g., Thailand Universal Coverage Scheme (Sumriddetchkajorn et al., 2019), or India Ayushman Bharat Pradhan Mantri Jan Aarogya Yojana) are also considered government financing. Due to data limitations, non-contributory government financing arrangements that show features typical of health insurance schemes (e.g., provider and payer split, health insurance premiums or budgets paid by the government) cannot be separated from other non-contributory government financing arrangements. SHI-financing is also pre-paid, but it differentiates itself from government financing by being contributory: a contribution has to be paid for a person/household to be able to receive healthcare coverage. Traditionally, the contribution is a deduction from the person's payroll. Individuals or households that do not contribute are not covered. In recent 'extended' forms, population groups that are usually identified as unable or ineligible for payroll or premium contributions are covered through government subsidies out of general tax revenues. In either of these cases, there may be different pools in the same country. Examples of SHI-predominant countries are for example, Germany, France, Austria, Japan, Poland, and Turkey. Both government financing and SHI financing are heterogeneous and implementation differs by country. OOP financing is generally characterized by private citizens buying or paying for health services when needed, without any pre-payment or risk pooling. Some government financing-and SHI-predominant HFSs may have OOP co-payments made by citizens/members: these fees are included in OOP expenditures. OOP-predominant countries are for example, Armenia, Bangladesh, Mali, Ecuador, Liberia and India.
Previous studies have classified into the "SHI" group those countries with SHI laws, SHI institutions and/or earmarked payroll deductions (Wagstaff, 2009b;Wagstaff & Moreno-Serra, 2009a) (i.e., the Bismarck model). All other countries were usually classified as "tax-based" (i.e., the Beveridge model (van der Zee & Kroneman, 2007)). This approach arguably runs the risk of potentially having misclassified OOP-predominant countries as tax-based HFS (Armenia, Azerbaijan, Ukraine, Uzbekistan, Kyrgyz Republic). As shown in Table 1, in all these countries, OOP expenditure is the main contributor to THE. In addition, we provide examples from other countries not included in (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a).
One option is to use expenditure data to define HFS via arbitrary thresholds. However, arbitrary choices may also misclassify countries with no clearly predominant financing arrangement.
By contrast, a clustering approach provides a classification that has two main benefits: it is largely data-driven and uses as input health expenditures, rather than more arbitrary classification mechanisms based on information, which would be hard to interpret or collect across all world countries. Using k-means clustering (MacQueen, 1967), each country-year combination is assigned to the HFS that has the closest mean values of government-, SHI-and OOP-expenditure as percentage of THE. More detail regarding the clustering procedure is provided in Appendix 3. In this approach, the arbitrary choices are limited to the input factors and the number of groups. For the input factors, OOP, SHI and government financing expenditure as % of THE are chosen because, together, they make 89% of total health expenditure in our sample. Other schemes (non-profit institutions serving households (NPISH), voluntary health insurance) are below 5% as a % of THE, and in no country-year observation are found to be the largest scheme. We choose to have three groups because in this way we can better address the research question (i.e., the effect of transitions from OOP to SHI and government financing HFSs), and because clustering optimization analyses (Makles, 2012) suggest that three groups is an optimal choice (see Appendix 3). The HFS variable generated by the analysis has three possible values: government-, SHI-and OOP-predominant HFS by country-year. We use the word "predominant" because in all cases HFS are a mix of different health financing arrangements: while one arrangement is predominant, other arrangements coexist. In fact, another benefit of clustering is that it recognizes the mixed nature of HFS by considering data regarding all three major health financing arrangements when assigning country-year observations to HFS groups. In this paper, a health financing transition is defined as a country's "switch" that lasts at least 2 years from an OOP-predominant HFS to a SHI-or government-predominant HFS.
As the definition of the predominant HFS by country-year may affect our results, we explore the robustness of our main results to different 'predominant HFS' definitions. First, we define the predominant HFS using the highest value between government-, SHI-and OOP-expenditure as percentage of THE. Second, to address concerns that country-year observations may be classified as OOP-predominant while having OOP expenditures as % of THE below 40% (see Table 1), we use different thresholds to define OOP-predominant HFS. In other words, we define a country-year observation as OOP-predominant only if OOP expenditures as percentage of THE is larger than a threshold t, for example, 50%, 45%, 40%, etc. Third, we add other health financing arrangements variables to the clustering procedure so that all health financing arrangements making up 100% of THE (i.e., NPISH as % of THE, voluntary health insurance as % of THE, enterprise schemes as % of THE, and rest of the world schemes as % of THE) are considered.

| Empirical strategy
The main specification is as follows: Where represents an outcome of interest from Figure 1, in country at time . and are HFS dummies that take value 1 if the country-year observation respectively belongs to the SHI-predominant or government-predominant HFS group, and 0 otherwise. OOP is the reference HFS. is a vector of control variables. represents time fixed effects (FE), and country FE, which respectively control for cross-country shocks and time-invariant unobservable variables. Coefficients 1 and 2 can be interpreted as the within-country effect on outcome of transitioning (i.e., switching) from OOP, the reference category, to SHI-and government-predominant HFS, holding controls (detailed later) constant. Note: *possible classifications: OOP-, government-or SHI-predominant,. As an example, we take 2017 data. Countries in italics were not included in (Wagstaff, 2009b;Wagstaff & Moreno-Serra, 2009a), we classified them based on the rules used in those papers. The sum of OOP, government financing and SHI as % of THE may not equal 100% due to other health financing arrangements (e.g., voluntary private health insurance arrangements, non-resident arrangements).
T A B L E 1 Comparison of countries' health financing system (HFS) classification across studies To investigate the question "how does context matter", we augment our model by interacting SHI-and government-predominant HFS dummies with several contextual factors (Equation 2), detailed later. = + 1 (SHI × CF ) + 2 ( × ) + + SHI + + + + + For the contextual factor analysis, we are interested in the interaction terms coefficients 1 and 2 , which will be interpreted as modification on the effect of and on outcome by computing and at different values of . The main model in Equation (2) is similar to a generalized difference-in-difference (DiD) estimator, with two reversible treatments, one reference group (OOP predominant group), and different treatment timing (i.e., a country can switch from the OOP group to or groups and vice versa at any ). DiD assumes a parallel trend: we therefore subject our results to tests of the DiD parallel trend assumption as done in (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a).

| Specification tests
We conduct tests of the parallel trend assumption using random trend and differential trend models. In the random trend model, we relax the parallel trend assumption by adding country-specific linear trends ( ), as shown in the following equation: We estimate Equation (3) with and without country-specific linear trends. We then test whether the and effects are different in FE models with country-specific trends (FECS) and FE models without them (FE) (Clogg et al., 1995): This test allows using SEs clustered at country level. Non-rejection of the tests in Equation (4) (2 tests per model, one for 1 and one for 2 ) would suggest that FECS and FE are not different, that are not correlated with SHI or GOV , and that the parallel trend assumption (PTA) is consistent with our data. This can be seen intuitively: Equation (3) without is equal to Equation (1). The random trend model assumes that each country trend is linear and is not affected by SHI and . These assumptions are likely to not hold in our case, as it is likely that SHI and affect country trends. We therefore relax the parallel trend assumption using a differential trend model (Blundell & Costa Dias, 2009;Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a). The error term is now: Where is an unobserved (differential) trend whose effect on the outcomes is different across SHI-, government-and OOP-predominant countries. This allows each HFS group trend to be non-linear and modified by SHI and , as shown in the following equation: Equation (6) can be estimated via fixed effects, with interactions between year dummies (first year dummy is excluded and used as reference) and treatment dummies: The effect of each transition can be calculated as the average effect of SHI and , respectively: As shown in (Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a) the PTA in the differential trend implies that ( − ) = ( − ) = 0 , which can be tested via the following nonlinear restriction, for 1 and 2 : Again, non-rejection of these tests would suggest that the PTA is consistent with our data. This can be seen intuitively: Equation (6) reduces itself to Equation (1) when ( − ) = ( − ) = 0 Reverse causality does remain a concern, as a country will likely increase SHI and government financing when population health is deteriorating (e.g., a health crisis such as Ebola or COVID-19): we expect that reverse causality will bias the estimated coefficients for SHI and government HFSs downward for life expectancy, and upward for mortality and catastrophic health expenditure incidence. We run a test of reverse causality (in a Granger sense) used in the related literature (Gruber & Hanratty, 1995;Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a), noting that the test does not necessarily imply causality (Angrist a&nd Pischke, 2008a). We add to Equations (1), Equations (3) and (7) lead HFS variables ( +1 , +1 ) that indicate whether the following year there will be a transition from OOP to SHI or government financing. A non-zero coefficient would suggest that endogeneity is not appropriately addressed, while a zero coefficient would indicate the opposite.
Finally, the recent literature on country and time FE regressions has highlighted the problem ("negative weights") that, in the context of heterogeneous treatment effects, the FE estimator is a weighted average of different effects, including the treatment effect of early versus late treatment adopter, and vice-versa (Goodman-Bacon, 2018). We therefore decompose 1 and 2 (Equation 1) to explore whether this issue is affecting our results, for countries for which the transition was staggered (i.e., they remained exposed to the HFS they transitioned to).
Stata 14 (StataCorp, 2015) has been used. Heteroskedastic-and within-panel serial correlation-robust SEs, clustered at the country level, are reported. A replication package is provided in the data availability statement.

| DATA
We use annual data for the 2000-2017 period across a global sample of countries from different sources; due to data limitations, our main models include 124 countries. Sample construction details, variables definition, and source datasets are provided in Appendix 2.

| Health financing data
The data on health expenditures (by financing arrangement) as percentages of THE, which is used for the cluster analysis, are from the WHO Global Health Expenditure Database (GHED), for the 2000-2017 period. WHO collects GHED data from countries using the System of Health Accounts (SHA) 2011 methodology (OECD et al., 2017). We use data under the "Health Care Financing Schemes" section, classification codes HF.1-4. In this paper, we use "arrangement" as a synonym of scheme, to avoid confusion with HFS (health financing system). If tax revenues are used to finance a SHI agency providing contributory SHI coverage, those revenues are "channelled via" SHI and are counted as SHI expenditure. Predominance can be read as "health expenditures channelled predominantly via a" non-contributory government, contributory SHI, or OOP arrangement, based on clustering results. As noted in the literature (Rannan-Eliya, 2008), OOP financing estimates suffer from potential data quality concerns. SHI as a health financing scheme comprises both compulsory public health insurance (96% of total SHI, across all countries, 2000-2017) and compulsory private health insurance (4% of total SHI financing, across all countries, 2000-2017).

| Intermediate outcomes and health system outcomes
Intermediate outcomes comprise the immunization coverage index (i.e., the average of measles, DPT and hepatitis immunization rates) from World Bank World Development Indicators (WDI) (World Bank, 2019c), and (logged) THE per capita in current US$ from WHO GHED. Health status health system outcomes are life expectancy (LE), maternal mortality (MM), and under-5 child mortality (U5M), also from WDI. Mortality outcomes have been logged, as done in the related literature. The World Bank Health Equity and Financial Protection indicators (HEFPI) dataset (World Bank, 2019a) has been used for the financial risk protection health system outcomes. Since there are many different measures of financial risk protection, the most commonly used (Wagstaff et al., 2018) has been chosen: catastrophic health expenditure incidence at the 10% level (CAT 10%). Health equity and the UHC index are not used as an outcome due to data limitations. Data for the UHC index was available only for 2010 within the data period of the analysis 2000-2017 from the Global Burden of Disease UHC dataset (Lozano et al., 2020).

| Contextual factors: Control variables and interaction terms
We select control variables (contextual factors in our conceptual framework, Figure 1) that may confound the relationship between public health expenditure and health outcomes (Nakamura et al., 2016). The WDI dataset was used for (logged) GDP per capita (PPP, constant 2011 US$), education (primary school enrollment gross %), urbanization rate, % population with drinking water access, Gini index, and proportion of population above-65 and below-14 (Nakamura et al., 2016). The Worldwide Governance Indicators (WGI) dataset (World Bank, 2019b) was used to extract the control variables government effectiveness and corruption control. We do not control for THE, hospital beds and health workforce, as these factors would be on the causal pathway between HFS and health system outcomes (i.e., "bad controls" (Angrist & Pischke, 2008b)).
Contextual factors used as interaction terms in Equation (2) are often cited as "conditions required for" HFS to be successful : (logged) GDP per capita, government effectiveness, corruption control, percentage of health revenues from payroll contributions (i.e., labour-tax), informal sector size (informal workers as % of non-agricultural jobs), and general government expenditure (GGE) as % of GDP.

| RESULTS
This section is organized as follows: first, we show clustering results, then we present FE estimates, and, finally, tests and robustness checks including sub-sample analysis (e.g., for LMICs specifically) are shown. Figure 2 shows the results of the k-means clustering analysis. In the 2000-2017 period, the proportion of predominantly-OOP countries decreased (−8%), while SHI-predominant and government predominant increased (+4% each). The clustering analysis confirms the health financing transition from OOP to public health expenditure (Fan and Savedoff, 2014b), that is, government and SHI HFS (World Health Organization, 2019). Table 2 provides descriptive statistics for the three HFS groups. In SHI-predominant country-year observations, SHI channelled expenditure does not exceed 50% of THE, and there is slightly higher public health expenditure as a proportion of THE (i.e., sum of SHI and government financing as a % of THE) versus predominantly-government financed systems. Table 2 suggests that selection into a HFS may not be random: SHI-predominant observations show higher income, better health systems outcomes, higher THE, lower investments in primary health care (PHC) and lower informal sector size, versus other HFS′ groups. The OOP-predominant HFS group is characterized, on average, by government financing being almost 30% of THE: in other words, OOP-predominant systems are government financing systems with low public health expenditure. Figure 3 focuses on the countries that switched from OOP to SHI or government financing HFS, or vice versa, in 2000-2017 (full list across 18 years in Appendix 3). Given that in FE regressions, within-country variation is the focus (see 3.2.1), we note that seven countries switched from OOP to SHI, and 30 countries switched from OOP to government financing systems. SHI transitions show a lower decrease in OOP expenditures as % of THE (−6% percentage points), compared to government financing transitions (−13% percentage points). In both cases, the main public health expenditure arrangement increased significantly.

Variable
Used as  In SHI transitions, not only OOP but also government financing did decrease (−8% of THE). For the government financing predominant HFS, the transition from OOP predominant to government financing predominant HFS is driven by growth in GGE as % of GDP (+6%) and growth in domestic health expenditure as % of GGE (+18%), which have finally resulted in a substantial increase in non-contributory government financing as % of THE ( Figure 4). Table 3 shows estimates from Equation (1). HFS coefficients in Table 3 represent the decrease/increase in the dependent variable ("outcome") as a result of switching to a government-or SHI-predominant HFS, from the reference OOP-predominant system. HFS coefficients for logged outcomes (THE per capita, U5M and MM) are interpreted as ∆ % = ( − 1) . In terms of intermediate outcomes (as depicted in Figure 1), SHI transitions increase THE (column 1, +12.4%), while no such effect is visible for government financing transitions. FE estimates in column  show that transitioning from a predominantly OOP to government-or SHI-predominant HFS have effects that are not statistically different from zero on immunization coverage. GABANI et Al.

F I G U R E 3
Average of OOP, SHI and government financing as % of THE, during health financing transitions. Source: Author elaboration. The figure shows SHI, OOP and government financing as % of THE for countries that switched from OOP-to SHI-predominant and government financing-predominant HFS. The sum of OOP, government financing and SHI as % of THE may not equal 100% due to other health financing arrangements (e.g., voluntary private health insurance, non-resident arrangements) F I G U R E 4 Histogram of parallel trend assumption specification tests p-values. Source: Author elaboration. Histogram of p-values resulting from PTA tests of the  and dummy variables, across 2 specifications, random trend model PTA test (Equation 4) and differential trend model PTA test (Equation 9), all 6 outcomes (total of 24 tests) As for health system outcomes, transitioning from OOP-to government-predominant HFS shows-for LE, U5M and CAT10%, respectively-rather strong evidence of an improvement (LE: +1.3 years, U5M: −8.7%, CAT10%: −3.3% points). Government-predominant HFS transitions improve LE, U5M, and CAT10% (three out of four health system outcomes) significantly (p < 0.05) more so than SHI-predominant HFS transitions. However, for CAT10%, the SHI lead tests-results not shown, see -suggest that SHI reverse causality may be a concern: since the SHI HFS lead is statistically different from zero, it appears that SHI transitions occur when CAT10% is high, and high CAT10% "anticipates" SHI transitions. No significant SHI transitions effects are found for maternal or under-5 mortality.
One concern is that health financing mix heterogeneity within HFS groups may affect our results. For example, an increase in SHI-financing as % of THE within the OOP predominant group may affect outcomes. To scrutinize this, we run FE regressions of government, SHI, and OOP expenditures as a percentage of THE on all outcomes within the government-, SHI-and OOP-predominant sub-groups: in only six models out of 36, within-group changes in financing arrangements show effects on outcomes different from zero (at 10% level) (see Appendix 4). In other words, within-group heterogeneity in the percentage of expenditure channelled via different health financing arrangements has limited impact on outcomes.

| How does context matter?
Estimates of Equation (2) using all six outcomes and six contextual factors (GDP per capita, informal sector size, proportion of health revenues from labor taxes, government expenditure as percentage of GDP, control of corruption, government effectiveness) are presented in Appendix 5. In seven of the 36 models estimated, at least one interaction term is significant (5% level), confirming empirically a non-trivial role of contextual factors. We report on those significant estimates only.
The informal sector size is the contextual factor modifying the effect of HFS transitions in most cases: a one percentage point increase in informal sector size together with a transition to SHI-predominant HFS increases U5M by 0.6%, and decreases immunization coverage by 0.2% points (the latter, when informal sector is beyond 65%). The same increase in informal sector size together with government financing HFS has a very similar effect on immunization coverage, but no effect on U5M. An increase in the log of GDP per capita improves the negative effect of SHI transitions on immunization coverage (+4.8% points), while better corruption control together with SHI-predominant HFS transition delivered higher general government health expenditure as % of general government expenditure (+0.9% points in general government health expenditure per 1 point increase in the control of corruption index).
A percentage point increase in general government expenditure (as % of GDP) together with government-predominant HFS transitions decreases general government health expenditure (% of general government expenditure), but the effect is very small (−0.08% points): possibly ministries of finance having large budgets tend to prioritize health sector funding slightly less  (1). Robust SEs, clustered at country-level, in parentheses. Details on HFS switches are detailed in Appendix 3. Full regression results including control variables are presented in Appendix 4. All models control for all variables listed as "control" in Table 2. p-values for two-sided t-tests are reported as: ***p < 0.01, **p < 0.05, *p < 0.1.
T A B L E 3 FE estimates for intermediate outcomes, health system outcomes as a proportion of total budget, when high absolute funding levels are considered sufficient. A one percentage point increase in health revenues coming from labor taxes together with SHI-predominant transitions also decreases general government health expenditure (% of general government expenditure) (−0.12% points).

| Specification tests and robustness checks
We present first the results of parallel trend assumption specification tests (Equations 4 and 9) which suggest that the parallel trend assumption is consistent with our data in the large majority of cases (∼75%), justifying the use of Equation (1) as our main specification. In the cases interested by potential parallel trend assumption rejections, we present random trend and differential trend model estimates (see Appendix 6, Panel G). These results do not change our conclusion. Second, we present in Figure 5 the results of the reverse causality tests (p-values of and 1-year leads in Equations (1), (3), and (7)): the vast majority of lead HFS (∼85%) are not significantly different from zero, suggesting that reverse causality is a rather limited issue across the vast majority of outcomes. However, the reverse causality (in a Granger sense) tests suggest that SHI transition occur when CAT 10% is particularly high, therefore the SHI coefficient for CAT 10% is likely affected by reverse causality.
Beyond specification tests, we subject our estimates to a series of robustness checks (Appendix 6). First, based on potentially very different contextual patterns in LMICs as compared to high-income countries, we restrict the sample to LMICs. We also run sub-group analyses restricting the sample to the high-and middle-income countries, and to middle-income countries only. Second, given concerns about public health data quality (Moreno-Serra & Smith, 2015;Nakamura et al., 2016), we remove outliers (approx. 1% of the sample) using a non-arbitrary methodology (Billor et al., 2000). Third, we use 1-year lagged HFS independent variables as HFS effects on health system outcomes may not be contemporaneous. To explore the robustness of our main results to potentially lagged effects and reverse causality (in a Granger sense), we also implement visual event studies. Fourth, since general government expenditure as a percentage of GDP may limit the impact of HFS, we add it as a control variable. Fifth, we estimate Equation (1) using government and SHI as % of THE instead of HFS dummy variables (removing OOP expenditures as % of THE from the model due to collinearity issues, since the sum of all health financing arrangements is 100%). Sixth, we provide estimates of random trend and differential trend models in cases where the parallel trend assumption is rejected. Seventh, In the related literature, mortality outcomes have been either log-transformed (Bokhari et al., 2007;Filmer & Pritchett, 1999;Rajkumar & Swaroop, 2008) or un-transformed (Moreno-Serra & Smith, 2015; Wagstaff, 2009a;Wagstaff & Moreno-Serra, 2009a): to accommodate this alternative practice, in Panel H we use the natural units version of previously logged outcomes. Eighth, since the use of one of our control variables (Gini index) results in a loss of approximately half of total observations, we remove it to check for potential selection bias induced by missing observations. In addition, we remove other controls so that all countries in the dataset are included in the regression. Ninth, we add development assistance for health (as % of THE) to the list of control variables. Finally, we apply adjustments to all time-varying controls that are related to HFS treatment (Zeldow & Hatfield, 2021).
Our baseline results are largely robust to the vast majority of the above-mentioned specifications changes. Using random trend and differential models that relax the parallel trend assumption, the estimated coefficients on LE and CAT10% lose significance but show the same sign as the baseline coefficients. Adding unit-specific linear trends (i.e., random trend model) would increase the importance of countries treated at the beginning and end of the panel (Goodman-Bacon, 2018 outcomes, baseline results are unaffected by relaxing the parallel trend assumption. In one specification (HFS percentages), the government HFS effect on LE loses significance. In the same specification, government financing improves CAT10% and U5M significantly more than SHI (not shown). Our results are particularly sensitive to this robustness check: while our clustering-based HFS definition captures changes in the predominant HFS, HFS percentages capture changes in THE composition regardless of the predominant HFS. In other words, these results suggest that increasing SHI or government financing as a percentage of THE may not have a sizable impact on health outcomes, if the predominant HFS does not change.
Event studies confirm that government financing improves LE and U5M, in particular after 4-5 years, while SHI does not show improvements for any outcome. Using different definitions of HFS (i.e., using the highest value between government financing, SHI and OOP expenditures as a percentage of THE, and using OOP thresholds, as described in Section 3.1) does not substantially affect the main results either. In the "highest HFS value" specification, the coefficient for SHI effect on THE lose significance and the coefficient for SHI effect on logged MM shows a worsening, significant effect (+12.1% points, p < 0.01), suggesting possible health system outcomes worsening due to SHI transitions. The main results are also confirmed when adding all health financing arrangements variables in WHO GHED (i.e., NPISH as % of THE, enterprise schemes as % of THE, voluntary health insurance as % of THE), and when we add all health financing arrangement variables plus THE per capita as input variables to the clustering procedure: in either case, government financing HFS performs better than SHI HFS for all outcomes. Setting the number of clusters to four also shows that government financing predominant HFS perform better than predominant SHI HFS for U5M and CHE 10%, and never shows government financing being worse than SHI. Using additional health system outcomes (CHE 25%, impoverishment driven by OOP expenditures at the 1.90US$ and 3.20US$ poverty line, male and female adult mortality), and health system outcomes from different data sources (i.e., World Bank WDI "Maternal Mortality Ratio, National Estimates"; infant mortality and U5M from Demographic and Health Surveys), confirms that in most cases government financing predominant HFSs show better outcomes than SHI-predominant HFSs.
Our baseline results are not affected by comparisons of late and early switchers ("negative weights"): the weight of 1 and 2 (Equation 1) driven by comparing late and early switchers outcomes is marginal for countries switching from OOP-predominant to government financing HFS (weight 3%-5%) and to SHI-predominant HFS (weight 1%-2%).
Across our three main specifications (DiD FE, random trend, and differential trend models), government financing transitions improve (at the 10% level) outcomes more than SHI transitions in most cases (53% of the time, Figure 6). The effects of SHI transitions on health outcomes do not exceed that of government financing transitions for any of the outcomes. Estimates from robustness checks largely confirm these conclusions.

| DISCUSSION
Achieving UHC is a widely shared health policy objective, and several countries are considering health financing systems (HFS) reforms  to accelerate progress toward UHC. These HFS reforms seek to accelerate the health financing transition (Fan & Savedoff, 2014b; World Health Organization, 2019) from OOP-predominant to public health expenditure (i.e., SHI-or government-predominant expenditure as % of THE) predominant HFS. As policymakers face alternative health financing paths, it is important to understand what (if any) differences to health system outcomes they make.  7]). Whenever the p-value is below 10%, there is at least suggestive evidence that government financing HFS transitions show better results than SHI HFS transitions. The p-values are 15, resulting from three models (FE model, random trend model and differential trend model) times five health system outcomes (LE, U5M, MM, CAT 10%, immunization). THE is not considered as an outcome because larger health expenditure is desirable only if larger expenditure translates into more services coverage Our main research objective has been to investigate the effect of transitions from OOP-predominant to government-or SHI-predominant HFSs on health system outcomes (i.e., health status, financial risk protection and utilization). Based on a conceptual framework for HFS transitions, we model HFS transitions from OOP-predominant to SHI-and government-predominant HFSs, assigning each country-year observation to a predominant HFS using clustering-a machine learning approach. We estimate the effect of HFS transitions on intermediate and health system outcomes via FE regressions, controlling for time-invariant as well as several contextual factors, while excluding potential "bad controls" (Angrist & Pischke, 2008b) on the causal pathway.
Transitions from OOP-to both government-predominant and SHI-predominant HFSs are both expected to deliver health system outcomes improvements via increased public health expenditure (see Section 2). However, we find that the effects of government-predominant HFS transitions was more favourable than SHI-predominant HFS transitions, for most outcomes. For the few outcomes where this was not the case, SHI and government-predominant HFSs showed similar results. Hence, there is no outcome for which SHI transitions showed significantly better outcomes than government financing. These results are robust to most checks and tests.
Why do transitions to government financing appear to be superior to those to SHI? While we do not conduct a formal mediation analysis, we discuss several hypotheses on channels of influence, commenting on how the data may or may not support each possible channel.
The main difference between government and SHI financing is that SHI requires contributions made by or on behalf of the person accessing healthcare services. Despite recent cases of general taxation funding SHI expenditure (World Health Organization, 2019), SHI remains mostly financed by regular, typically wage-related contributions (i.e., labor taxes, see Table 2): for many LMIC countries, this means that while formal workers are covered via compulsory contributions, for large parts of the population (i.e., informal workers) insurance coverage is voluntary (Barasa et al., 2021). SHI arrangements to cover the uninsured vary considerably across countries, and may generate pool fragmentation and pro-rich bias (Barasa et al., 2021) (e.g., a pool with comprehensive benefit package for well-off formal workers, and another one with a limited benefit package for the poor, the elderly, or an otherwise defined population group). Even when the non-contributing poor or vulnerable are covered by subsidies, the informal non-poor may be left out of affordable and quality options (Wagstaff, 2010). In our findings, informal sector size turns out indeed as the contextual factor with the biggest negative impact on the effects of HFS transitions.
SHI expansions may also come at higher costs and take longer time, compared to expansions of existing government financing mechanisms (see column (World Health Organization, 2020), Appendix 6). SHI requires institutional, technical and managerial capacity, and substantial investment to collect revenues and manage the provider-payment system (Wagstaff, 2010). Limited regulatory capacity of purchasing institutions has been noted as a key issue (Wagstaff, 2010), and the time to develop capacity is not negligible: several countries in Western Europe took more than 70 years to reach UHC via SHI (Carrin & James, 2005). Expanding existing government financing arrangements would likely require less costs and time. The non-healthcare-related costs of SHI introductions or expansions may increase public health expenditure versus an OOP-predominant-system, with little improvements to healthcare coverage and finally health outcomes. SHI HFS have also traditionally focused on secondary/ tertiary healthcare (Wagstaff, 2010) (suggested by Table 2, PHC expenditure descriptive statistics), which may be less efficient than PHC (Anderson et al., 2018). A full assessment of the relative performance of different types of HFS reforms would of course require a comparison of both the incremental costs and benefits of either HFS-type-a challenge that is beyond the scope of this paper, and one that has hitherto not been met in the existing research (Kreif et al., 2021).
SHI transitions appear to not have succeeded in decreasing OOP expenditures as % of THE by as much as government financing transitions. SHI transitions decreased the reliance of THE on OOP expenditures, but they did so partially at the expense of non-contributory government financing (see Figure 3). By contrast, government financing transitions did not result in a significant decrease in SHI financing (as % of THE), as illustrated by the experience of Moldova and Russia (see Appendix 7): increases in SHI expenditure (as % of THE) were accompanied by substantial decreases in government financing (as % of THE), less so in OOP (as % of THE), and a flattening of the U5M curve. At the same time, THE in both countries continued to grow.
Estimates using SHI and government financing as a % of THE (rather than predominant financing dummy variables) do not support the idea of SHI as a complementary arrangement either (Appendix 6, Panel F). Increases in SHI expenditure (% of THE) increased THE, but did not improve outcomes. This is compatible with the hypothesis that SHI for formal workers may result in pool fragmentation and pro-rich health expenditure (Wagstaff, 2010), and that implementation costs are a reason for SHI's limited effects. Both these issues arise regardless of SHI being a complementary or a predominant HFS. Rather than introducing SHI as a complementary arrangement, favourable SHI features (e.g., provider-purchaser split, explicit benefit packages entitlement, beneficiaries included in governance bodies, covering vulnerable groups via ad-hoc interventions (Wagstaff, 2009a;Wagstaff, 2010;Wagstaff & Moreno-Serra, 2009a)) could be included in existing government financing systems, and vice-versa (e.g., via removing SHI link between contributions and services' access, making it de-facto government financing).
Government-financed systems may have undesirable features, too. While automatic universal coverage is a positive feature, benefit packages are often too ambitious, so that the "depth" of this coverage and the actual package of services delivered is often limited in LMICs (Wagstaff, 2013). Often, the purchaser-provider split is missing, and when it is present, there is no joint decision-making body, which includes purchaser(s), covered populations and providers. These arrangements can be implemented in government-predominant HFS, but they are more typical of SHI-predominant HFS. SHI-predominant HFS could see a positive healthcare coverage effect from efficient purchaser-provider systems: however, for such effect to materialize, a well-functioning provider network is required. Assuming that a higher GDP per capita may mean better provider networks, the fact that SHI transitions have a more beneficial effect on immunization coverage when GDP per capita is higher (see Appendix 5 and Section 5.3) seem to support this idea. Similarly, a realistic and explicit benefit package, and the idea of entitlement provided by SHI, are seen as the main advantages of SHI , and could be considered for inclusion in government financing systems.
Other contextual factors play a role, too. Perhaps counter-intuitively, higher labour-tax financing resulted in decreases in government health expenditure (in % of general government expenditure), for SHI-predominant HFS transitions (see Appendix 5, column 36). Ministries of Finance may respond to higher SHI labour-tax revenues by decreasing transfers from general tax revenues to health. While we find no evidence that a (proportional) increase in labour-tax revenues modifies HFS effects on health, labour-tax increases may increase informal sector size (Wagstaff, 2010;Wagstaff & Moreno-Serra, 2009b), which we find worsen SHI effects on health (Appendix 5, columns 1-6). Since we do not investigate the HFS impact on labor outcomes, and research in LMICs on this topic is limited (Le et al., 2019), this is an area for further research.
Many countries are contemplating SHI reforms for different reasons Yazbeck et al., 2020): increasing financial autonomy and increased budgets for health via earmarked-to-health labor taxes, the political attraction of providing entitlements (usually to formal sector workers, which include civil servants), and considering SHI enrollment as the UHC coverage measure. With government financing, all citizens/residents are covered, and the issue is the depth of such coverage, which is difficult to measure, while with SHI there is the SHI coverage measure to report on as "progress toward UHC". Further research could focus on other reasons driving a resurgence in SHI reforms.
Since concerns about reverse causality and the parallel trends assumption could not be entirely resolved, and our results were robust to most-but not all-different specifications and outcomes, we do not claim to have presented fully causal impact estimates. The limitations that are to be borne in mind when interpreting the findings include: first, we do not take into consideration more extensive HFS and health system heterogeneity due to data limitations. Other health system features (e.g., gatekeeping, different provider-payment systems, pooling fragmentation, private-public providers, provider networks, governance structures, etc.) may also affect health system outcomes, but data on a global scale does not exist to capture those. Second, we note that the sample comprises only seven largely middle-income countries that transitioned from OOP to SHI predominant systems (as mentioned in Section 5.1). However, interactions of the HFS treatment variable with log GDP per capita show limited heterogeneity in the effect of HFSs due to changes in log GDP per capita (Appendix 5), suggesting that this might not be a major issue. Finally, we have not addressed formally "how" (e.g., via mediation analysis) or "for whom" different HFSs work (e.g., health equity), due to data limitations.
While bearing these caveats in mind, the policy implication of these findings is that policymakers considering SHI transitions to accelerate progress toward UHC should take these results as a call for caution. For LMIC policymakers facing the challenge of large informal sectors, re higher poverty rates, and often not-well-functioning provider networks, the odds of accelerating progress toward UHC via introduction or expansion of contributory SHI appear more contained, as noted in the recent literature (Barasa et al., 2021). Pursuing the road toward non-contributory financing expansions to accelerate progress toward UHC would appear as the more promising avenue, based on our findings. But then again, one cannot exclude the possibility that SHI can be made to work well for health system outcomes, and we cannot present non-contributory government financing as being unambiguously superior to contributory SHI in every situation (Savedoff, 2004;Wagstaff, 2010;Yazbeck et al., 2020). For both expansions, our contextual factors analysis findings suggest that SHI performs better when informal sector is smaller, GDP per capita is higher, and, to a lesser extent, when control of corruption is higher and labor tax financing is lower. Other contextual factors that may improve the effects of SHI transitions comprise higher wages, functioning provider networks, higher government technical, regulatory and financial capacity, and lower average household size (Hsiao et al., 2007). Information regarding these contextual factors, and their expected trend, can further strengthen decision-making confidence regarding HFSs reforms. These policy implications and findings are also relevant for governmental and non-governmental development partners supporting governments in moving toward UHC via technical and financial assistance.

ACKNOWLEDGMENTS
We would like to thank Dr. Rodrigo Moreno-Serra for thoughtful comments, participants and especially discussants at the Health Economics Study Group meeting 2020, the IHEA conference 2021, the World Bank Health Financing Global Solutions Group seminar in October 2021, and the Centre for Health Economics seminar in February 2021. Jacopo Gabani receives funding via the Alan Maynard PhD scholarship from the University of York, Centre for Health Economics. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

CONFLICT OF INTEREST
None of the authors has any conflict of interest to declare.

DATA AVAILABILITY STATEMENT
Datasets are detailed in the data section, and corresponding links are in the references section. Datasets and Stata do files are available upon request to the corresponding author, and a replication package with all datasets and do files is also available at the Open Science Framework repository https://osf.io/snczj/.

ETHICS STATEMENT
We did not have to obtain ethical approval. All data used is secondary country-level data from publicly available sources. The below figure helps clarify the idea that health financing reforms may impact health system outcomes through different pathways depending on the health financing function that is being reformed, as shown below. We have taken out 18 small and island countries, given that governance, health systems and health financing for those countries present peculiarities when compared to other countries. In small and island countries, changes in predominant HFS are more frequent than other countries (0.72 changes per country for small and island countries, vs. 0.43 in other countries, in the 2000-2017 period), and there are transitions from SHI to government financing system (and vice-versa) that are not seen in other countries, raising concerns about data quality and relevance in the context of a cross-country analysis.

APPENDIX 2. Detail of variables and data construction
Baseline results change when these countries are included, and when the HFS is defined via clustering. However, the main conclusion that the effect of government financing on health system outcomes is more or as favourable as that of SHI transitions is maintained. We also include all countries and define the predominant HFS using the highest value among government financing, SHI and OOP expenditures as % of THE, showing that estimates are similar to our baseline results and that, again, the conclusion that government financing effect on health system outcomes is better or as good as SHI is confirmed. These checks are presented with other robustness checks in Appendix 6, Table A3, panels E and F.
Variable definitions and source: GABANI et Al.

Variable Definition Source
Government financing as % of THE "Participation is automatic: For all citizens/residents; or a specific group of the population (e.g., the poor) defined by law/government regulation." WHO GHED (

APPENDIX 3. Health financing arrangement details, and k-means clustering
The health financing arrangement classifications (government financing, SHI and OOP) are not decided by the authors: they are defined by the System of Health Accounts (SHA), Health Financing Scheme (Chapter 7) standards, which form the basis for the WHO GHED. The reader is referred to the SHA manual for a detailed description of the health financing schemes mentioned and used in the paper. Figure A1 clarifies the meaning of government-financed, SHI-financed and OOP-financed. Government financing refers to "Government schemes". SHI-financing refers to Compulsory contributory health insurance schemes, which includes both SHI-proper and private compulsory health insurance. We call it SHI-financing to avoid confusion, since this is the usual name in the literature. Community based health insurance (not shown in the figure) is included under voluntary payment schemes. We also emphasize that in WHO GHED, what matters is the health financing scheme through which the monies are spent: if tax revenues are used to finance the country SHI agency providing contributory SHI, then such monies will be recorded under SHI (see p. 170 of the SHA 2011 manual (StataCorp, 2015)). However, if government pays premiums or finance the budget of an agency providing non-contributory health insurance or services, those monies count as government financing. To avoid confusion, THE in this paper refers to "current health expenditure" in the GHED dataset. Finally, we note that external financing (i.e., development assistance for health) will be considered part of any given country's public financing scheme (i.e., SHI or government financing) if on-budget. When off-budget, external financing will largely be considered as financing via not-for-profit institutions serving households.
K-means clustering (MacQueen, 1967) is an unsupervized machine learning technique described in many books. The k-means cluster algorithm assigns each country-year combination (i.e., each observation) to the cluster with the least squared Euclidean distance (other distances can be used), which is the cluster with the closest mean.
In this study, we choose to have three clusters as we expect to have country-year combinations that belong to one of the following three groups: predominantly government-financed, SHI-financed or OOP financed. We choose that the variables used for clustering are government-, SHI-, and OOP-financing as a % of THE. The algorithm starts by assigning random cluster "centroids" (a vector of three values, government schemes, SHI, and OOP, as a % of THE) and assigning each country-year combination to the nearest cluster (i.e., the cluster with the least Euclidean distance). At this point, a new cluster centroid is calculated: it is a vector of the mean of the country-year combinations assigned to the cluster. The process is repeated until the cluster centroids position does not change. Via this process, k-means minimizes intra-cluster variance: in other words, the sum of squared distances between each country-year combination and the centroid of the assigned cluster is minimized.
K-means clustering therefore considers, for each country-year combination, the vector of government schemes, SHI, and OOP, each as a % of THE, and that k-means is non-arbitrary, except in the choice of the number of clusters, which we do based on theoretical reasoning about having 3 groups (government schemes, SHI, and OOP) and the variables used for clustering (government schemes, SHI, and OOP, each as a % of THE). One problem with this approach is that countries may move repeatedly and in short period of times in-and-out a certain group, therefore switches lasting only 1 year have been removed. We note that inclusion in a group is reversible: countries can go from SHI to OOP or OOP to government financing, and vice versa.
It might occur that a country-year observation is classified, for example, in the government financing HFS group, even if OOP as % of THE is higher than government financing as % of THE. This is because each country-year observation is a vector of three values: government financing as % of THE, OOP as % of THE, and SHI as % of THE. Each cluster can also be thought of as a vector of government financing, OOP, and SHI, as % of THE (called centroids), which are measured as the means of all the country-year observations within that same cluster.
The k-means clustering algorithm does not formally consider which one of the three values forming the country-year observation vector (government financing as % of THE, OOP as % of THE, and SHI as % of THE) is the highest. Country-year observations are classified into each group (government financing, OOP, SHI) based on the country-year observation vector Euclidean squared distance to the cluster centroids vector. Therefore, a country-year observation with a very high OOP as % of THE might be classified as "government-financing" when its distance to the government financing cluster is shorter than the distance to the OOP cluster. However, the OOP, government-financing, and SHI cluster will show, respectively, OOP, government financing, and SHI as % of THE as the highest value of the three. This is because the k-means clustering algorithm minimizes distance within clusters' observations and maximizes distance across clusters.
The centroids of the final clusters are the average OOP as % of THE, SHI as % of THE, and government financing as % of THE for each cluster, which are shown in Table 2, Section 5.1.
These clustering concerns are substantially less relevant when we run the following robustness checks: (1) We classify country-year into HFS groups/clusters using the largest value between OOP, SHI and government financing as % of THE. In 229 country-year observations (8% of total observations) the predominant HFS defined via clustering is different from the predominant HFS defined by the "largest % of THE" method. An example of this is shown below, for all countries, Year 2017. (2) We use a minimum threshold of OOP as % of THE to classify country-year observations in the OOP-predominant group that is, a country-year can only be classified as OOP-predominant if the cluster procedure identifies it as OOP predominant and its OOP as % of THE is higher than a certain threshold (50%, 45%, 40%, etc.) We note that data is usually standardized when variables used for clustering are on different scales (e.g., age and income). In our case, all clustering variables are percentage of THE, therefore no standardization is required. Number of switches resulting from cluster analysis, with countries in brackets: There are 26 cases (1% of all country-year observations), for which the clustering analysis resulted in a 1-year long HFS switch. In such cases, the previous year predominant HFS was used: Bolivia, 2001;Central African Republic, 2015;Republic of Congo, 2009;Djibouti, 2003;Ethiopia, 2004;Gambia, 2002;Guinea-Bissau, 2002;Guyana, 2010;Jamaica, 2001;Kazakhstan, 2002;Kyrgyzstan, 2009;Laos, 2009;Lebanon, 2012;Madagascar, 2002;Madagascar, 2009;Madagascar, 2013;Moldova, 2005;Moldova, 2009;Mauritius, 2003;Mongolia, 2005;Panama, 2016;Trinidad and Tobago, 2014;Tunisia, 2013;Ukraine, 2011;Uzbekistan, 2014;Vietnam, 2001. Regarding the choice of the number of clusters, we measure the within-sum-of-squares (WSS), the ln (WSS), 2 defined as (1) and the proportional reduction of error coefficient (PRE), defined as . While for WSS, ln (WSS), and 2 the optimal cluster number is seen where there is a kink in the curve, or when the decrease becomes smaller versus previous decreases, in the PRE methodology the optimal cluster number is identified when the ( ) value is large (vs. other ( ) values). The graphs suggest that the optimal number of clusters is three, in line with the number of clustering input variables (OOP, SHI and government financing as % of THE, making 89% of THE on average). This is also in line with the literature on HFS, which often suggested using SHI and government financing predominant HFS systems (Wagstaff & Moreno-Serra, 2009), to which we have added OOP predominant HFSs. We also run a robustness check using four clusters instead of three, and find that the overall conclusion is not changed.
In the below Table we present, for Year 2017 and all countries, SHI as % of THE, OOP as % of THE, government financing as % of THE, and voluntary health insurance as % of THE, the predominant HFS defined using clustering, and the predominant HFS defined as the HFS with the largest % of THE. Countries in which there is a difference between the two HFS definitions are noted in italic (9 cases, 6% of total country observations in 2017).  (1). Robust SEs, clustered at country-level, in parentheses. Details on HFS switches are detailed in Appendix 3. All models control for all variables listed as "control" in Table 2. p-values for two-sided t-tests are reported as: ***p < 0.01, **p < 0.05, *p < 0.1.
Source: Author elaboration. (1) (3)   Note: Datasets discussed in Section 4. Models detail explained in the robustness checks section. Baseline model controls variable included in all models unless specified. In italic, in panel G, coefficients that are statistically different from the baseline model at the 10% level based on tests in Equations (4) and (9). Robust standard errors in parenthesis.
Source: Author elaboration.  (1) is estimated using different income sub-groups, including all countries available in the dataset, and finally including all countries and using a different definition for HFS (highest value), as detailed in the robustness checks section and in Appendix 2.
Source: Authors' elaboration. Datasets discussed in section 4.  (1) is estimated using additional outcomes, as detailed in the Results section. For DHS outcomes, fewer controls were used because otherwise the number of observations would drop below 100. The controls used were: GDP per capita, urbanization, control of corruption, government effectiveness, % population above 65 and % population below 14 years old, in addition to fixed effects. For DHS outcomes, under-5 mortality and infant mortality is provided as "average for the 5 and 10-year period before the survey": when it was 5-year, the observation year was recorded as "survey year minus three", when it was 10 years, the year considered was "survey year minus five". In other words, we attach the "past 5 and 10 years average reading" to the mid-year of that same 5 or 10 year period. Robust standard errors reported, clustered at country-level. P < 0.1*, p < 0.05**, p < 0.01***.

T A B L E A 3 (Continued)
Source: Author elaboration.

T A B L E A 4 Additional outcomes
Notes: these are event studies plots, for five outcomes out of six outcomes used for our main results. For CAT 10%, data limitations do not allow an event study. In the case of government financing, there are 30 countries switching between OOP and government financing predomaning HFSs. However, only 17 countries switch only once during the 2000-2018 study period. These 17 countries are included in the event study, while the remaining 13 countries who switch back-and-forth between OOP and government financing are excluded, as the interpretation of their coefficients is not possible.
The equation used for the above plots is: = + 5 ∑ =0 − + 2 +2 + + + + Adding further leads resulted in omitted variables, therefore we limited leads to only 2 years before the switch from OOP to government financing.
The baseline omitted case is the first lead, where k = 1.
Notes: these are event studies plots, for five outcomes out of six outcomes used for our main results. For CAT 10%, data limitations do not allow an event study. In the case of government financing, there are eight countries switching between OOP and SHI predomaning HFSs. Seven countries switch only once during the 2000-2018 study period. These seven countries are included in the event study, while the remaining 1 countries who switch back-and-forth between OOP and SHI are excluded, as the interpretation of their coefficients is not possible.
The equation used for the above plots is: