The impact of performance‐based financing within local health systems: Evidence from Mozambique

Abstract Most evidence on Performance Based Financing (PBF) in low‐income settings has focused on services delivered by providers in targeted health administrations, with limited understanding of how effects on health and care vary within them. We evaluated the population effects of a program implemented in two provinces in Mozambique, focusing on child, maternal and HIV/AIDS care and knowledge. We used a difference‐in‐difference estimation strategy applied to data on mothers from the Demographic Health Surveys, linked to information on their closest health facility. The impact of PBF was limited. HIV testing during antenatal care increased, particularly for women who were wealthier, more educated, or residing in Gaza Province. Knowledge about transmission of HIV from mother‐to‐child, and its prevention, increased, particularly for women who were less wealthy, less educated, or residing in Nampula Province. Exploiting the roll‐out by facility, we found that the effects were concentrated on less wealthy and less educated women, whose closest facility was in the referral network of a PBF facility. Results suggest that HIV testing and knowledge promotion increased in the whole district, as a strategy to boost referral for highly incentivized HIV services delivered in PBF facilities. However, demand‐side constraints may prevent the use of those services.

Performance Based Financing schemes implemented in low- and middle-income countries (LMICs) are complex interventions, involving financial incentives, and often also the provision of additional resources, training and supervision (Kok et al., 2015; Kovacs et al., 2020). These are intended to increase the capacity for delivery, and to improve the quality and quantity of the services provided, translating into more widely distributed service use and consequently health benefits. Service use, and therefore the achievement of targets, can be limited by demand-side constraints, which include financial constraints raised by user fees, travel expenses, the opportunity cost of time, or individual health care seeking attitudes (Singh et al., 2021). Heterogeneous effects in service use typically arise either because of demand-side constraints, usually varying by socio-economic status, or because of differences in service provision. For example, services may differ in type, quality, and mode of delivery, between populations in the immediate catchment area of a PBF facility and those further away served by other facilities.
Local health systems usually comprise health administrations, which oversee and coordinate a network of local providers (Anselmi et al., 2018; Martineau et al., 2018). Health administrations are typically involved in PBF schemes, particularly in supervision, training, and performance verification. They are also rewarded when they achieve targets. Through the response of local health administrations to incentives, PBF may affect not only the services delivered by PBF facilities, but also the organization of services, outreach activities, or referral mechanisms within the local health system (Bertone et al., 2013; Singh et al., 2021; Witter et al., 2013). However, evidence on the effects of PBF beyond enrolled facilities, and their manifestations, is still scarce. With limited exceptions (De Allegri et al., 2018), most evaluations cover schemes rolled out by administrative area, typically districts, and simultaneously in all providers (Kovacs et al., 2020). Evaluators have then defined PBF exposure simply based on whether the scheme was rolled out in a given health administration (Bonfrer et al., 2014; Fichera et al., 2021; Gage & Bauhoff, 2021; Sherry et al., 2017; Van de Poel et al., 2016). This has prevented the examination of diversified mechanisms of exposure, and related heterogeneous effects, driven by the response of various actors in the system, including health administrations and those providers not directly enrolled in the scheme.
The investigation of heterogeneous effects may also be limited by data availability. Many evaluations have used primary data collected in pre-designated intervention and control areas (Basinga et al., 2011; Binyaruka et al., 2015; Das et al., 2016; Eijkenaar et al., 2013; Falisse et al., 2012; Peabody et al., 2011; Renmans et al., 2016; Soeters et al., 2011; Witter et al., 2012). These include richer information on mechanisms and impact (Anselmi et al., 2017; Bertone et al., 2016; Falisse et al., 2012; Ngo et al., 2016; Peabody et al., 2011), but are limited in temporal and geographical coverage, and generally focus specifically on targeted populations and outcomes. Few studies have used health facility data from national surveys or administrative datasets, with multiple time-points, to analyze changes in the volume and quality of services delivered (Falisse et al., 2014; Ngo et al., 2016). Heterogeneity in service use or health outcomes within the served population has been studied mostly using data from national household surveys which contain information on health and care outcomes, alongside socio-economic and other individual and household characteristics (Bonfrer et al., 2014; Fichera et al., 2021; Sherry et al., 2017; Van de Poel et al., 2016).
We analyzed the effects of a PBF scheme targeting 18 healthcare indicators related to the provision of HIV and pre- and post-natal maternal and child health services in Mozambique. The scheme was gradually rolled out across districts, and facilities within districts, in Gaza and Nampula provinces between 2011 and 2017. A previous evaluation assessed the impact of the program on the volume of services delivered by health facilities using routine data up to 2013 from the two intervention and two neighboring provinces (Rajkotia et al., 2017). Most indicators responded differently in each province. At least nine of them, including prevention of mother-to-child HIV transmission (PMTCT) and pediatric HIV treatment services, increased by over 50%. The impact was visible after 18 months and was sustained afterward. No negative impact, nor spill-over effects on non-incentivized indicators, were detected (Rajkotia et al., 2017). The motivation of healthcare workers also increased (Gergen et al., 2018).
We examined the impact over a longer time-period of 5 years, and we focused on heterogeneous effects on population health care use and knowledge. We used data from the Demographic and Health Surveys (DHS) 2011 and 2015 on the birth of the youngest child, and on mothers' personal and household characteristics, effectively covering births between 2007 and 2015. We linked the DHS data with the national registry of health facilities through the geolocation of clusters and facilities, and we added information on enrollment into PBF for each facility. We exploited the phased roll-out, and we applied a difference-in-difference estimation strategy to assess the effect on indicators which were either directly targeted, or potentially affected by the response to PBF incentives. We focused on the use of maternal and child care services, HIV testing during ANC and knowledge of PMTCT.
This study brings four contributions to the literature on PBF. We add evidence on the population effects of PBF, more specifically on HIV indicators and related knowledge, and on heterogeneities by wealth, education, and province. Finally, using alternative definitions of exposure, we examine effects for populations within and beyond the catchment area of PBF facilities. This allows us to examine the involvement of local health administrations and non-targeted facilities in boosting demand for incentivized services through the district referral network. HIV testing during antenatal care (ANC) increased, particularly for wealthier and more educated women, and for women residing in Gaza. Knowledge of mother-to-child transmission of HIV and its prevention increased, particularly for less wealthy and less educated women, and for women residing in Nampula. The impact was limited to services provided in the whole district, and it was concentrated in the referral network of PBF facilities. This is where new HIV cases could be found and referred to PBF facilities for highly incentivized and rewarded services. Taken together, the impact on knowledge and on testing beyond the catchment area of PBF facilities is suggestive of supply-side attempts to boost demand through HIV/AIDS case finding and behavioral change. The stronger impact on knowledge amongst less wealthy and less educated women, and on testing for wealthier and more educated women, indicates that supply-side attempts to boost demand may be counteracted by socio-economic constraints on access to services.

| The healthcare system
Healthcare in Mozambique is mostly publicly funded and provided. There are only a few private healthcare facilities, and they are concentrated in the capital, Maputo City (Instituto National de Saude, 2018). Healthcare is almost free at the point of delivery. Official outpatient fees in public facilities are negligible (MZM 2 to 5, equivalent to USD 0.03-0.08), and exemptions cover the large majority of the population (indigents, children under five, pregnant women, the chronically ill, and patients suffering from malaria, TB and HIV/AIDS) (MISAU, 2012). Primary and secondary healthcare are managed at the district level and in 2012 were provided by 1314 health clinics and health centers and by 66 district hospitals, in 11 provinces and 142 districts, including Maputo City (Figure A1 in Appendix).
Integrated district planning and management have been a key feature of the National Health Service architecture since the 1980s, when it was first set up following independence from Portugal. The core organizational structure remains in place (MISAU, 2002), with an increased number of facilities, increasingly complex service provision, and budgets decentralized to the District Governments. National, provincial and district guidelines for planning, management and monitoring recognize the key role of district health systems, including health administrations and health facilities (DPS Cabo Delgado, 2013). As in other settings where the health system is hierarchically organized, the district administrations (Serviço Distrital de Saúde, Mulher e Acção Social, SDSMAS) define and coordinate the role of health facilities within their jurisdiction. They also plan, manage and distribute financial resources, human resources and drugs, and they are responsible for training and supervision (Anselmi et al., 2018; Martineau et al., 2018). All facilities provide basic primary outpatient care. Larger health centers provide full primary care and inpatient care, while district hospitals provide secondary care, including surgery.
Primary care is the backbone of district services. As in similar settings (Francetic et al., 2020), a hierarchical referral system guarantees the continuum of care. Larger referral facilities typically provide secondary and tertiary care, serving as a hub for other more peripheral primary care facilities. Each district has one hospital, or a larger health center, which provides secondary care, and is usually located in the major urban center. There are generally one to three referral facilities within a district, depending on its population and geography. These include one district hospital (or larger health center) providing secondary care, and possibly a few larger health centers providing primary and inpatient care. The remaining facilities provide basic primary care. The referral system is key to district functioning. Higher- and lower-level facilities within the same referral network collaborate, with lower-level facilities directing patients to the higher-level ones for more complex services, and serving as a base for outreach activities organized by the higher-level facilities. Coordinated and supported by district health administrations, higher-level facilities oversee planning and supervision within lower-level facilities, and organize various outreach activities, mostly for preventive services. Additionally, there are voluntary health workers supporting health facilities, particularly smaller health centers, through the provision of outreach services in the community.
Maternal and child health services are provided in all facilities. However, some clinics lack the resources to assist deliveries, which tend to take place in larger health centers or district hospitals (MISAU, 2012). HIV antiretroviral treatment (ART) is provided only in a few designated facilities within a district, typically the larger ones. Rapid HIV testing can be delivered in all facilities and through outreach campaigns. Despite improvements in service delivery and access over the last decade, vaccinations and institutional deliveries appear to have decreased between 2011 and 2015 (MISAU et al., 2015), as did resources from international donors (Anselmi, 2017).

| Performance Based Financing
The implementation of Performance Based Financing started in January 2011 in two of the 11 provinces, which differed in population and health characteristics. Gaza, in the South, was amongst the provinces with the highest HIV prevalence (25.1%), and Nampula, in the North, was amongst those with the lowest (4.6%) (MISAU, 2009). In Gaza only 18% of the population was in the three lowest wealth quintiles, compared with 62% in Nampula (Table 2).
The PBF program was funded by the United States President's Emergency Plan for AIDS Relief (PEPFAR) through the Centers for Disease Control and Prevention (CDC) and was implemented by the Elizabeth Glaser Pediatric AIDS Foundation (EGPAF). The CDC was already actively supporting Gaza and Nampula provinces in providing HIV/AIDS care. The health facilities enrolled in the program were selected from those serving a large population and offering HIV/AIDS services, including adult or pediatric ART and PMTCT. Those facilities had to meet minimum staffing requirements and either have a bank account or be under the financial supervision of the SDSMAS. The PBF scheme was rolled out over four periods: Phase 1 starting in January 2011 (30 facilities in Nampula and 18 in Gaza); Phase 2 in March 2012 (41 facilities in Nampula and 23 in Gaza); Phase 3 in September 2013 (15 facilities in Nampula and 25 in Gaza); and Phase 4 in September 2014 (no facilities in Nampula and 21 in Gaza). No new districts were enrolled from September 2013 onwards (Figures A2 and A3 in Appendix). No change was implemented in the remaining facilities as part of the scheme.
The PBF scheme was a complex intervention which: provided incentives to improve the delivery of specific services; increased the resources available; strengthened capacity; and affected local governance. The scheme incentivized 21 facility indicators, listed in Table 1 with their respective prices, which were mostly taken from the existing monitoring frameworks, and which covered maternal and child care, tuberculosis, PMTCT, pediatric HIV, and adult HIV care and treatment (Rajkotia et al., 2017). Additional incentivized indicators related to the monitoring and management functions of the SDSMAS, and included: supervision of health facilities, district PBF committee meetings, timely submission of correct monitoring reports, efficient management of human resources, financial management aligned with set norms and procedures, and assessment of the quality of health facilities (DPS Nampula, 2018). Some indicators were changed when targets were reached, but this happened after the period covered by this study. Performance was assessed quarterly, jointly by EGPAF and the Provincial Directorate of Health (Gergen et al., 2018). Quality was assessed every 6 months, with checklists covering prevention and control of diseases, and maternity and HIV services (Gergen et al., 2018). Incentives consisted of a quantity-based bonus, with weighting for quality and remoteness, which could be distributed to staff as a salary top-up (60% share) or used to improve facilities (40% share). Bonuses contributed up to 50% of the operating costs (Gergen et al., 2018). Targeted districts also received additional resources and training.

| Rationale for heterogeneous effects by exposure
With a roll-out by facility, changes stimulated by the incentives could affect the population in a district in different ways, depending on the response of the PBF facilities, of the health administration, and of the remaining facilities (Bertone et al., 2013). Testing the impact of the scheme for different populations can therefore provide insights into the involvement of health administrations and facilities in the response to PBF incentives.
First, the whole district population may be affected because of the overarching activity of health administrations. Providing health administrations with additional funding, and rewarding them for their own and their health facilities' performance, may stimulate improvements in their functions. These can in turn lead to improved resources, supervision, training, and coordination in all facilities, as well as increased vaccinations, information campaigns, and outreach activities. Ultimately, we may expect improved service delivery, and increased use by the served population, particularly for targeted services, or those complementary to them. For example, we may expect increased information campaigns or HIV testing, to find HIV cases and stimulate demand for the highly rewarded ART services.
Second, improved provision of targeted services in PBF facilities can affect all population groups within a reachable distance, either by satisfying previously unmet demand, or by stimulating further demand through increased quality. Patients directly served by a PBF facility, typically those for whom that is the closest facility, are the most likely to be affected. However, while it would be natural for patients to seek care from their closest facility, some may bypass the referral system and seek care from facilities which are further away, but offer more comprehensive or higher quality services (Leonard, 2014). The improved reputation of PBF facilities, or the implementation of outreach activities beyond their immediate catchment area to attract patients, may stimulate bypassing. Bypassing is more likely to involve patients with lower demand-side constraints (e.g., those with lower opportunity cost of traveling further away, or those with higher perceived benefit from better care) (Singh et al., 2021).
Third, 26% of PBF facilities are referral facilities (73% in phase 1, 29% in phase 2, 5% in phase 3 and 0% in phase 4). Referral facilities attract and receive patients from the catchment areas of the peripheral facilities in their network. With the support of health administrations, they also organize outreach activities, share good practices, and provide training and resources to more peripheral facilities (Give et al., 2019). These activities may intensify if PBF referral facilities actively seek to boost demand and reach the target for incentivized services, by increasing the referral of patients from the catchment areas of peripheral facilities (Singh et al., 2021). Populations in the referral network of a PBF facility, but in the catchment area of a non-PBF facility, can therefore be affected by the scheme.
The DHS Program provides researchers with accurate and representative data for over 90 countries. These data are of the highest standard achievable in Mozambique and are used as a national and international reference to monitor the progress of child and maternal health indicators. DHS surveys have previously been used for robust impact evaluations elsewhere (Bonfrer et al., 2014; Fichera et al., 2021; Gage & Bauhoff, 2021; Sherry et al., 2017; Van de Poel et al., 2016).
The DHS is a repeated cross-sectional survey, nationally representative of women of reproductive age (15-49), of their children, and of men aged 15-49 (or 59). The household questionnaire covers the roster (age, sex, relationship to the head of the household, education, parental survivorship, residence, and birth registration) and various other characteristics. Household characteristics include the information used to compute the DHS wealth index, namely: asset ownership, materials used for housing construction, and access to water and sanitation (Rutstein & Johnson, 2004). The individual questionnaire covers fertility, mortality, family planning, marriage, reproductive health, child health, nutrition, and HIV/AIDS. The geographic coordinates of the cluster (the village or urban neighborhood of residence at the time of the interview) are made available with a random displacement of at most 5 Km (DHS, 2020). Women of reproductive age who have been pregnant during the 5 years preceding the interview are asked about their use of care prior to, during, and after delivery, and about the care, survival, and health of their children. Information on child mortality and vaccination history is collected for all births in the 5 years preceding the interview, while information on ANC, institutional delivery and post-natal care is collected only for the last birth. All information is reported by the mother. We used information on child vaccination, and on the mother's healthcare use and knowledge about HIV PMTCT. We considered the most recent birth only, and we constructed a pooled cross-sectional sample of births, and related mothers and children, covering the period between 2007 and 2015. We restricted the sample to Gaza, Nampula and their neighboring provinces Inhambane, Maputo Province and Zambezia, which exhibited parallel trends pre-intervention for the majority of outcomes. We used 3937 observations for the analysis of HIV knowledge and testing, ANC and institutional delivery, and 2759 observations (excluding children born less than 12 months prior to the interview) for immunization within 9 or 12 months.

| Service Availability and Readiness Assessment (SARA) survey
We created a census of health facilities operating between 2007 and 2015 using data on the exact geo-location, type of facility and opening year from the 2018 Service Availability and Readiness Assessment (SARA) Survey (Instituto National de Saude, 2018). The SARA includes information on various characteristics (e.g., location, opening year, type, access to water and electricity) and resources (e.g., staff, equipment and drug availability) for every health facility, except for six located on remote islands which were not reachable at the time of the survey. Unfortunately, the SARA does not indicate the services provided by each facility, for example, HIV/AIDS anti-retroviral treatment. We checked for facility closure using a previous census validated in 2011 (Anselmi et al., 2015). No facilities were opened or closed in any PBF district between the roll-out of the scheme in 2011 and the end of the study period in 2015. We identified each facility's referral facility using information about districts from national and provincial operational plans and data on the volume of services delivered by each facility from the national health information system (Anselmi et al., 2018). We validated our list with the Ministry of Health.

| Performance based financing roll-out
EGPAF provided information on the year and month in which each district and each facility were enrolled in PBF and started receiving bonus payments (Figures A2 and A3 in Appendix). We considered an SDSMAS enrolled when PBF was rolled out in at least one facility in the district.

| Outcomes
For each birth, we considered 11 of the available outcomes that can be affected by changes in the availability and quality of targeted services, and which are either directly targeted by PBF, or functional to increasing their demand (Table 1).

| Antenatal care and HIV testing
We used five measures of ANC use and quality: a continuous measure of the total number of ANC visits, and four binary indicators taking value one if the mother: (i) had at least four ANC visits; (ii) attended ANC with a qualified care professional; (iii) was offered an HIV test; (iv) was tested during ANC visits. Although testing was not directly incentivized, we hypothesized an effect because a test is required to identify potential users of incentivized HIV services. The administration of ART to HIV+ pregnant women to prevent transmission (PMTCT) (USD 6.5), and the initiation of antiretroviral treatment (ART) by pregnant women (USD 10), are two highly rewarded indicators. Their delivery requires not only identifying target mothers through testing, but also sensitizing them to increase demand for treatment and adherence. Testing can be performed both in PBF and non-PBF facilities, during ANC consultations in a facility, or through outreach activities.

| Knowledge about transmission of HIV from mother-to-child
We used two binary indicators. The first takes value one if the mother was aware of HIV and of its transmission from mother-to-child (MTC) during pregnancy, delivery, and via breastfeeding. The second takes value one if the mother was aware of drugs that can prevent mother-to-child transmission (PMTCT). These two variables are proxies for the intensity and quality of activities aimed at HIV prevention, which when effective can increase awareness. Because PBF targeted the provision of complete PMTCT, we hypothesized an increase in HIV awareness activities, both in PBF and non-PBF facilities, or in the community, which would increase knowledge.

| Institutional delivery
We used two binary indicators for institutional delivery, the first taking value one if the most recent birth took place in a healthcare facility, and the second taking value one if the delivery was assisted by a professional.

| Immunization
We generated two binary variables taking value one if the child was fully vaccinated either within 9 months, as per international guidelines, or within 12 months, as per Mozambique guidelines. The full vaccination cycle included polio (three doses), tuberculosis (Bacillus Calmette-Guérin, BCG), and diphtheria, pertussis (whooping cough), and tetanus (DPT) (three doses). Using the standardized codes provided by the DHS program, we could generate the variables only for children with a vaccination card and with valid vaccination dates reported. Vaccinations are administered through routine outpatient visits or through outreach campaigns organized by provincial and district health administrations.

| Control variables
We included a set of covariates measured at the time of the interview. We included the mother's years of education, her age (15-49) in 5-year bins, and a binary indicator for Christian religion. Not only may behaviors differ by religion, but Christian faith organizations are also involved in providing child and maternal healthcare, with the potential to affect outcomes more favorably in this group (Widmer et al., 2011). We controlled for household composition, including: (i) whether the head of the household was female, and their age; (ii) the number of children aged five and below, to account for experience in childcare; and (iii) the number of household members, to account for resource generation and use. We accounted for household wealth, measured by the DHS wealth index in quintiles (Rutstein & Johnson, 2004), for urban versus rural residence, and for access to services. We proxied access to services by the distance to the nearest health facility (straight-line, or crow-fly, distance in Km between the DHS cluster and the facility geolocation), and by car or truck ownership.
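As an illustration of the distance covariate, the sketch below computes the straight-line distance between a cluster and each facility with the haversine formula, and keeps the closest facility. It is a minimal sketch rather than the procedure used in the study: the coordinates, the facility identifiers (`HF_A`, `HF_B`) and the function names are hypothetical, and the random displacement of DHS cluster coordinates is ignored.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in Km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_facility(cluster, facilities):
    """Return (facility id, distance in Km) for the facility closest to a cluster."""
    def dist(f):
        return haversine_km(cluster["lat"], cluster["lon"], f["lat"], f["lon"])
    best = min(facilities, key=dist)
    return best["id"], dist(best)


# Hypothetical cluster and facilities (illustrative coordinates only).
cluster = {"lat": -24.70, "lon": 33.50}
facilities = [
    {"id": "HF_A", "lat": -24.90, "lon": 33.50},
    {"id": "HF_B", "lat": -24.75, "lon": 33.52},
]

facility_id, distance_km = nearest_facility(cluster, facilities)
distance_sq = distance_km ** 2  # squared term, for non-linear distance effects
print(facility_id, round(distance_km, 1))
```

The squared term is computed alongside the distance because the regression specification controls for both distance and squared distance from the closest facility.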

| Average effects
In line with similar evaluations (Bonfrer et al., 2014; Fichera et al., 2021; Gage & Bauhoff, 2021; Sherry et al., 2017; Van de Poel et al., 2016), we defined exposure as residence in a PBF district at the time of conception (9 months prior to the date of birth), assuming that the mother had not moved between then and the interview.
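The exposure rule can be sketched as follows, working in the century month code (CMC) in which the DHS records dates, that is, the number of months elapsed since the start of 1900. This is an illustrative sketch: the function names are hypothetical, and treating a conception in the same month as the roll-out as exposed is an assumption, not something stated in the text.

```python
def to_cmc(year, month):
    """Century month code: January 1900 is 1, January 2011 is 1333."""
    return (year - 1900) * 12 + month


def exposed_at_conception(birth_cmc, rollout_cmc, gestation_months=9):
    """Return 1 if conception fell on or after the PBF start in the district.

    rollout_cmc is None for districts that were never enrolled.
    Assumption: conception in the roll-out month itself counts as exposed.
    """
    if rollout_cmc is None:
        return 0
    conception_cmc = birth_cmc - gestation_months
    return int(conception_cmc >= rollout_cmc)


phase1_start = to_cmc(2011, 1)  # Phase 1 began in January 2011

# Birth in October 2011: conception in January 2011, hence exposed.
print(exposed_at_conception(to_cmc(2011, 10), phase1_start))
# Birth in September 2011: conception in December 2010, hence not exposed.
print(exposed_at_conception(to_cmc(2011, 9), phase1_start))
```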
We estimated the effect of PBF on the outcomes of interest using a difference-in-difference (DID) estimation strategy, which compares changes in exposed (treated) and unexposed (control) households, before and after the implementation of PBF (Ashenfelter, 1978; Bonfrer et al., 2014; Fichera et al., 2021; Van de Poel et al., 2016).
We estimated Equation (1) separately for each outcome:

$$Y_{ilt} = \beta_0 + \beta_1 DID_{lt} + \beta_2' Post_t + \beta_3' X_{ilt} + \gamma_y + \delta_m + \theta_l + \varepsilon_{ilt} \qquad (1)$$

where $Y_{ilt}$ is the outcome for mother (or child) $i$, in area $l$ at time $t$. $DID_{lt}$ varies due to the phased roll-out, and takes value one if conception happened after PBF was implemented in area $l$, and zero otherwise. $\beta_1$ is the coefficient of interest, indicating the effect of PBF on the mother's (or child's) outcome. $Post_t$ is a set of four binary variables (three when exposure is defined by district, as only new facilities were enrolled in phase four), indicating if child conception was before or after the start of each roll-out phase. $X_{ilt}$ is a vector of control variables, including the household's and mother's characteristics, distance and squared distance from the closest facility, accounting for linear and non-linear effects, and survey year. $\gamma_y$ and $\delta_m$ are two sets of time fixed effects for year and month of conception, controlling for unobserved time-varying heterogeneity, for example, in access to services. $\theta_l$ are district fixed effects and $\varepsilon_{ilt}$ is the individual idiosyncratic error term.
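To give intuition for the coefficient of interest in Equation (1), the sketch below works through the simplest possible case: two groups, two periods, no covariates and no fixed effects. In that saturated case the DID coefficient equals the difference between the change in the weighted mean outcome of the treated group and the change in the control group. The data and the names (`did_2x2`, `rows`) are made up for illustration; the full specification with phased roll-out, controls and fixed effects would require a regression package.

```python
def weighted_mean(values, weights):
    """Mean of values using survey-style weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def did_2x2(rows):
    """Difference-in-difference of weighted cell means.

    rows: list of dicts with keys 'y' (binary outcome), 'treated' (0/1),
    'post' (0/1) and 'weight' (population weight).
    """
    cell = {}
    for g in (0, 1):
        for p in (0, 1):
            sub = [r for r in rows if r["treated"] == g and r["post"] == p]
            cell[g, p] = weighted_mean([r["y"] for r in sub],
                                       [r["weight"] for r in sub])
    return (cell[1, 1] - cell[1, 0]) - (cell[0, 1] - cell[0, 0])


# Made-up observations: y = 1 if the mother was tested for HIV during ANC.
rows = [
    {"y": 0, "treated": 1, "post": 0, "weight": 1.0},
    {"y": 1, "treated": 1, "post": 0, "weight": 1.0},
    {"y": 1, "treated": 1, "post": 1, "weight": 1.0},
    {"y": 1, "treated": 1, "post": 1, "weight": 1.0},
    {"y": 0, "treated": 0, "post": 0, "weight": 1.0},
    {"y": 1, "treated": 0, "post": 0, "weight": 1.0},
    {"y": 1, "treated": 0, "post": 1, "weight": 1.0},
    {"y": 0, "treated": 0, "post": 1, "weight": 3.0},
]

effect = did_2x2(rows)
print(effect)
```

With a binary outcome, the estimate reads as a change in the probability of the outcome, which is the interpretation a linear probability model gives.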
We estimated Equation (1) using linear probability models, with robust standard errors clustered at the DHS primary sampling unit (cluster) and with women's individual population weights (Croft et al., 2018). We accounted for multiple hypothesis testing (11 outcomes) using the Bonferroni-Holm method to estimate family-wise p-values with 500 bootstrap replications (Giacalone et al., 2018; Holm, 1979; Jones et al., 2019).
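The step-down logic of the Bonferroni-Holm correction can be sketched as below. This is the plain analytical version applied to a list of unadjusted p-values, not the bootstrap-based procedure used in the analysis; the function name and the example p-values are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each is
    capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Three hypothetical unadjusted p-values.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
print(adj)
```

In this example the second hypothesis has a larger raw p-value than the third, but both receive the same adjusted value because of the monotonicity step.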

| Heterogeneous effects by wealth, education, and province
Heterogeneous outcomes by wealth, education, or contextual characteristics may reflect differences in the opportunity cost of seeking or using care, and highlight the presence of demand-side constraints (Binyaruka et al., 2018, 2020; Fichera et al., 2021; Van de Poel et al., 2016).
We replicated the analysis for each sub-group separately, as in similar recent studies (Fichera et al., 2021). We characterized sub-groups by the mother's wealth (three lower vs. two higher quintiles), and education (up to four vs. five or more years), with thresholds approximately defined by the median. We then tested for heterogeneous effects by province. We re-estimated Equation (1) on sub-groups including each treated province and its respective control group, first Nampula with Zambezia, and then Gaza with Maputo Province and Inhambane.

| Heterogeneous effects by exposure
We investigated response mechanisms within the district by considering three alternative definitions of exposure to PBF, depending on the hypothesized involvement of the health administration and non-PBF facilities. We re-estimated Equation (1) for each outcome, with each alternative definition of exposure, and with facility rather than district fixed effects.
First, we considered clusters (villages) exposed, if PBF was implemented in the closest health facility, where the population would naturally first seek routine care, or from which it would receive any outreach service.A significant effect associated with this, but not other definitions of exposure, would suggest a poor response by health administrations and other facilities in the referral network, with benefits restricted to the immediate catchment area of PBF facilities.
Second, we considered that populations could seek care within a reachable distance, and we defined as exposed those clusters within a given distance from a PBF facility.We defined "reachable distance" based on the radius of the designated catchment area, which varies across facilities, and is generally determined by district planners so that each village is in the catchment area of at least one facility.The radius is larger where the population is sparser, and varies between 8 and 18 Km outside the capital.We used the radius in Gaza (15 Km), which is larger than in Nampula (11 Km) (MISAU, 2012).For each cluster, the closest facility, and possibly also others, were situated within a distance of 15 Km.A significant effect associated with this, but not other definitions of exposure, could indicate that PBF facilities sought to stimulate demand beyond their usual catchment area, inducing patients to bypass the referral system.
Finally, we considered that PBF could trigger a supply response within the whole referral network, and we defined a cluster exposed if its closest facility was in the referral network of a PBF facility.The referral network has a geographical coverage which in some cases is equivalent to the whole district.Therefore exposure based on the referral network overlaps only partially with exposure based on distance from a PBF facility.A significant effect associated with this, but not other definitions of exposure, would suggest that PBF facilities exploited their network to increase demand and referral for targeted services.
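The three exposure definitions can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: distances are great-circle (haversine) distances, and the facility fields `pbf` and `in_pbf_network` are hypothetical names. The sketch also assumes that a PBF facility is flagged as belonging to its own referral network.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points given in decimal degrees.
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def exposure_flags(cluster, facilities, radius_km=15.0):
    # cluster: (lat, lon) of the survey cluster.
    # facilities: list of dicts with hypothetical keys
    #   lat, lon, pbf (bool), in_pbf_network (bool).
    dists = [
        (haversine_km(cluster[0], cluster[1], f["lat"], f["lon"]), f)
        for f in facilities
    ]
    _, closest = min(dists, key=lambda d: d[0])
    return {
        # Definition 1: the closest facility implements PBF.
        "closest_facility": closest["pbf"],
        # Definition 2: any PBF facility lies within the chosen radius.
        "within_radius": any(d <= radius_km and f["pbf"] for d, f in dists),
        # Definition 3: the closest facility is in a PBF referral network.
        "referral_network": closest["in_pbf_network"],
    }
```

A cluster can be exposed under one definition and not another, which is what allows the three sets of estimates to be compared.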

| Parallel trends
DID estimation for causal inference requires the assumption of parallel trends in the outcome variables between treated and non-treated units in the pre-intervention period (Angrist & Pischke, 2008). For each outcome, we tested for both linear and non-linear parallel trends by re-estimating Equation (1) on observations related to conceptions which happened before the rollout of PBF (January 2011). First, we introduced the interaction between future PBF exposure and conception-date linear time trends. Second, we interacted 6-month period indicators with future exposure to PBF to test for differences between exposed and non-exposed mothers in each period. We graphed the coefficient estimates with their 95% confidence bands, and we considered parallel trends to hold if the estimates were not significantly different from zero (Wichman, 2017). We tested the parallel trends assumption for each subgroup used for the analysis of heterogeneous effects, as well as for the robustness checks described in the next section.
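Since Equation (1) is not reproduced in this section, the two pre-trend tests can only be written here in a generic form. The notation below is an assumption introduced for exposition: $y_{ict}$ is an outcome for mother $i$ in cluster $c$ with conception date $t$, $F_c$ indicates future exposure to PBF, $X_{ict}$ collects the controls, $\mu$ and $\lambda_t$ stand for the location and time fixed effects, and $k$ indexes 6-month periods with one period $k_0$ omitted as reference.

```latex
% Linear pre-trend test (sample: conceptions before January 2011)
y_{ict} = \alpha + \gamma \,(F_c \times t) + X_{ict}'\beta + \mu + \lambda_t + \varepsilon_{ict},
\qquad H_0:\ \gamma = 0

% Non-linear pre-trend test with 6-month period indicators
y_{ict} = \alpha + \sum_{k \neq k_0} \delta_k \,\bigl(F_c \times \mathbf{1}[t \in k]\bigr)
        + X_{ict}'\beta + \mu + \lambda_t + \varepsilon_{ict},
\qquad H_0:\ \delta_k = 0 \ \text{for each } k
```

Under this reading, parallel trends are taken to hold when $\gamma$ and each $\delta_k$ are not significantly different from zero.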

| Robustness checks
We ran several robustness checks. First, we re-estimated the models excluding all mothers living in non-PBF districts in Nampula and Gaza, to remove potential bias within province from the involvement of the Provincial Directorate of Health. Second, we re-estimated the models for heterogeneous effects associated with exposure defined either by distance or by referral facility's enrolment by excluding mothers residing in PBF districts, but not directly exposed, to avoid bias from potential effects within the districts. Third, we re-estimated the regressions for binary outcomes using non-linear models. Fourth, we tested for potential selection of districts into the PBF program using district-level data on health facilities, population health, socio-economic, and other characteristics available for the pre-intervention period (Anselmi et al., 2018). We estimated the association between PBF roll-out and district characteristics with two logistic models, using either information for 2010 or the average between 2008, 2009 and 2010. Finally, we replicated the analysis for exposure by distance using a radius of 25 Km, a rounding-up of the distance walkable in the maximum travel time to a facility reported in national household surveys (4 h), at a human walking speed of 6 Km/hour (MISAU, 2012).
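The alternative radius follows from a simple calculation using the figures quoted above:

```latex
4\ \text{h} \times 6\ \text{Km/h} = 24\ \text{Km} \;\longrightarrow\; 25\ \text{Km (rounded up)}
```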

| Descriptive statistics
Table 2 presents the descriptive statistics separately for Gaza, Nampula, and the control group, which includes Maputo Province, Inhambane and Zambezia, before and after the implementation of PBF. The exposure of mothers to PBF varied in Gaza between 68%, when defined by district, and 26%, when defined by closest facility, and in Nampula between 64% and 21%.
Outcomes were generally better in Gaza than in Nampula before 2011. The average number of ANC visits per pregnant woman was 4.1 in Gaza and 3.1 in Nampula, and the percentage of mothers with at least four ANC visits was 66% in Gaza and 39% in Nampula, with most consultations done with a professional in both provinces. The percentages of mothers offered an HIV test and tested for HIV at ANC were 75% and 70% in Gaza and 46% and 39% in Nampula. Complete knowledge of HIV mother-to-child transmission and knowledge of existing drugs to avoid transmission were 63% and 71% in Gaza and 67% and 56% in Nampula. Institutional deliveries were higher in Gaza (74%) than in Nampula (59%), and they were mostly attended by a professional. Sixty-seven percent of children in Gaza and 52% in Nampula had a vaccination card reporting each vaccination date, and were fully vaccinated within 12 months, according to the full vaccination cycle duration in Mozambique. Most outcomes tended to improve, except for vaccinations, ANC visits and knowledge of HIV vertical transmission in Nampula. Table A1 in the Appendix presents outcomes in Gaza and Nampula, before and after the implementation of PBF, and for exposed and non-exposed samples.
There was variation in the control variables between the two PBF provinces and the control group. Seventy-nine percent of mothers were in the two richest quintiles in Gaza, versus only 21% in Nampula and 47% in the control group. Education was higher in Gaza, 4.3 years, compared with 3.0 years in Nampula, and 3.8 years in the control group. Distance from the closest health facility was lower in Gaza (3.1 Km) than in Nampula (5.3 Km) and in the control group (5.2 Km). Most mothers lived in rural areas: 76% in Gaza, 64% in Nampula and 54% in the control group. There were also demographic differences, for example, in the age of the mother, in the average number of children under 5 per household, in the average household size and in the gender and age of the head of household.

| Average effect
Table 3 presents the effect of PBF on each outcome. Performance Based Financing had a positive effect only on HIV testing performed during ANC (13 pp, Bonferroni-Holm corrected p = 0.037), on knowledge of HIV mother-to-child transmission (MTCT) (19 pp, p < 0.001) and on knowledge of drugs to prevent it (PMTCT) (31 pp, p < 0.001). There were no significant effects on any other outcome.
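For readers unfamiliar with the Bonferroni-Holm correction behind the reported p-values, a minimal sketch of the standard step-down adjustment is given below. The function name is hypothetical, and this section does not state which family of hypotheses the authors corrected over, so the example is generic rather than a reproduction of their calculation.

```python
def holm_adjust(pvalues):
    # Holm step-down adjustment: sort p-values in ascending order, multiply
    # the p-value at 0-based position `rank` by (m - rank), cap at 1, and
    # enforce monotonicity with a running maximum.
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted
```

An outcome is then judged significant at a given level if its adjusted p-value falls below that level, which is more conservative than using unadjusted p-values but less so than a plain Bonferroni correction.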

| Heterogeneous effects by wealth, education and province
Table 4 presents the effects by wealth, education and province. The increase in HIV testing was more pronounced for wealthier women (19 pp, p = 0.001), more educated women (13 pp, p = 0.096), and in Gaza (17 pp, p = 0.003). Conversely, the increase in HIV-related knowledge was higher for less wealthy women (28 pp, p < 0.001 for MTCT and 38 pp, p < 0.001 for PMTCT), for less educated women (29 pp, p < 0.001 for MTCT and 43 pp, p < 0.001 for PMTCT) and in Nampula (30 pp, p < 0.001 for MTCT and 37 pp, p < 0.001 for PMTCT).

| Heterogeneous effects by exposure
Table 5 shows that on average there were no significant improvements for populations within a reachable distance from a PBF facility. There were increases in HIV testing offered (16 pp, p = 0.008) and performed (24 pp, p < 0.001), and in knowledge about HIV MTCT (21 pp, p = 0.018) and PMTCT (25 pp, p < 0.001), but only for those whose closest facility was in the referral network of a PBF facility.
Table 6 illustrates heterogeneity in the effects by wealth, education and province for different definitions of exposure to PBF. The improvements for women exposed via referral network were stronger for less wealthy women in HIV testing offered (24 pp, p = 0.014), HIV testing performed (27 pp, p = 0.005), and knowledge about HIV MTCT (32 pp, p = 0.033) and PMTCT (30 pp, p = 0.005). For wealthier women there were smaller than average improvements, and only in HIV testing performed (24 pp, p = 0.013) and in knowledge about HIV PMTCT (27 pp, p < 0.001), when exposed via referral network. There were also smaller improvements in HIV testing (16 pp, p = 0.076) and in knowledge about PMTCT (18 pp, p = 0.065) for women within 15 Km from a PBF facility, and in knowledge about PMTCT (29 pp, p = 0.001) for women closest to a PBF facility.
The improvements when exposed via referral network were higher for less educated women in terms of HIV testing offered (18 pp, p = 0.009) and performed (27 pp, p = 0.006), and knowledge about HIV MTCT (37 pp, p = 0.0154) and PMTCT (43 pp, p < 0.001). There were also increases in knowledge about MTCT (24 pp, p = 0.081) for less educated women within 15 Km from a PBF facility. There were no effects for more educated women.
In Nampula there were improvements in HIV testing performed (22 pp, p = 0.006) and in knowledge about HIV MTCT (25 pp, p = 0.079) for those exposed via referral network. In Gaza there were also improvements in HIV testing performed (23 pp, p = 0.019) and in knowledge about HIV PMTCT (31 pp, p = 0.026) for those exposed via referral network.

| Parallel trends test and robustness checks
Linear and non-linear trends were parallel for most outcomes in the pre-intervention period, with few exceptions for which we prefer not to rely on causal interpretation. Detailed test results are presented in the Appendix (Figures A4 to A7 and Tables A2-A8). The results presented in Tables 3-6 are reported with their corresponding unadjusted and adjusted p-values in Table A10.
Results were robust to removing non-exposed mothers within PBF provinces and within PBF districts, and to removing Maputo City from the control group. The sign and statistical significance of the coefficients remained unchanged when using non-linear models for binary outcomes. None of the district characteristics in the pre-intervention period was associated with PBF roll-out, except for the percentage of economically active population and facility staffing levels (Table A9 in the Appendix). When defining exposure by distance using a radius of 25 Km, the trends in outcomes pre-intervention were often nonparallel, and no significant effects were found.

| DISCUSSION
We evaluated the effects of a PBF scheme implemented in Gaza and Nampula provinces in Mozambique, on pre- and post-natal maternal and child care, and on HIV testing and knowledge. The impact was limited. HIV testing during ANC increased, particularly for wealthier and more educated women, and in Gaza. Knowledge about MTCT and PMTCT also improved, particularly for less educated women, and in Nampula.
Exploiting the program roll-out by facility, and using alternative definitions of exposure, we investigated the role of health facilities' referral networks and of district health administrations. HIV testing and knowledge did not improve for women in proximity of a PBF facility, but only for those whose closest facility is in the referral network of a PBF facility, or within the district more broadly. Antenatal care visits and institutional deliveries increased for mothers in the referral network of a PBF facility in Gaza. These results present a complex picture, with effects mainly in indicators related to the provision of HIV services and highly rewarded by incentives. Results suggest that PBF affected the whole local health system. There are signs that PBF incentives increased supply, particularly for services that help refer patients from within the whole district to PBF facilities. However, demand-side constraints seemed to remain for less wealthy and less educated women.
There is evidence of impact limited to increased HIV knowledge and testing, and of heterogeneous effects within districts, which complements findings from Rajkotia et al. (2017), who assessed only changes in the volume of incentivized services delivered by PBF facilities. The hypothesis that PBF facilities seek to stimulate demand for targeted and highly rewarded services through referral from lower to higher level facilities is consistent with increased delivery of ART for pregnant women, ANC visits and PMTCT services in PBF facilities. We also found that this improvement was achieved through increased service use by women outside the immediate catchment area, but within the referral network, of PBF facilities, or within the district. This supports the hypothesis that outreach activities were carried out to stimulate demand and referral to increase service delivery in PBF facilities. Our results differ substantially for Nampula, where we did not find any evidence of increased ANC, institutional delivery, or vaccinations. This difference suggests that increased service delivery may have been achieved at the intensive rather than extensive margin, or that effects may have weakened over the longer period covered in our study.
Although not directly targeted, the increase in HIV testing and knowledge may reflect attempts to identify target patients, and to boost demand and referral for highly rewarded HIV maternal and child care services. The increase in knowledge about HIV mother-to-child transmission, and about drugs that can prevent it, supports the hypothesis that efforts were made to stimulate demand and use of services in PBF facilities, rather than to increase knowledge per se. Improvements in knowledge and testing amongst the less educated are consistent with the hypothesis that outreach activities were undertaken amongst this group currently underusing services. However, the analysis by wealth reveals that there is an increase in service use, mostly testing, for wealthier women. This is consistent with evidence from other studies which highlight the role of demand-side constraints in shaping health care seeking behavior, whereby even when knowledge has improved, only those who have the means increase their use of services (Binyaruka et al., 2018, 2020; Fichera et al., 2021; Van de Poel et al., 2016). Contextual demand and supply differences highlight the importance of structural demand-side barriers to access and explain different results by province. Gaza, where effects are stronger, has a higher number of facilities, notably more resourced. The population is less sparse, wealthier, and better educated. Weaker effects in Nampula suggest that incentives alone may not produce results, unless barriers to access, on both the demand and supply side, are addressed.
Exploiting the roll-out by facility within the same district, we examined the effects on populations beyond the immediate catchment area of PBF facilities, and the possible mechanisms driving them. HIV testing and knowledge did not improve for mothers in proximity of a PBF facility, supporting the hypothesis that outreach activities were implemented through the referral network and under the coordination of the district administration. According to information provided by local policymakers, district administrations contribute to redistributing resources, including HIV tests, across facilities, to promote fairness and to support PBF facilities in achieving their targets. Tests may have been redistributed toward more peripheral areas for case finding and referral to PBF facilities for HIV treatment. Evidence of increased testing, particularly for wealthier women, supports the hypothesis that lower-level facilities were engaged in stimulating demand, but that changes were counteracted by demand-side constraints for less wealthy women. Evidence of increased referral and outreach activities, with concentration of benefits in populations with higher socio-economic status, was found in other settings (Fichera et al., 2021; Singh et al., 2021).
Data limitations constrained our analysis. Firstly, we had a limited set of outcomes, but we could assess population effects, not only on healthcare but also on knowledge, using data of reliable quality. We were also limited to defining proximity based on a radius around the health facility, without accounting for access to main roads and transport. Although our interpretation is consistent with evidence from a qualitative study (Gergen et al., 2018), we could only hypothesize increased referral activities within the district, as we do not know from the data in which facility women sought care. Covariates were observed at the time of the interview, and not specifically at childbirth or conception. This may be a concern if PBF had an immediate effect on covariates, especially wealth and education, or if the mother moved between pregnancy and the interview. However, the DHS wealth index uses a wide range of assets, which makes it robust over time (Rutstein & Johnson, 2004). Educational status and the number of years at school are unlikely to change for mothers. We cannot control for migration, as in other studies in the region which use DHS data and make the same assumption (Bonfrer et al., 2014; Fichera et al., 2021; Sherry et al., 2017; Van de Poel et al., 2016). The DHS data do not report the facility where mothers sought care, so we relied on distance from facilities in the district to define exposure to PBF. Because the geo-coordinates of the DHS clusters are randomly displaced, clusters could incorrectly fall within a district or an area around a facility, or be incorrectly associated with the closest facility. Finally, we could not disentangle the effects over time using alternative study designs, nor control for every concurrent program implemented in the PBF districts. Although the major concurrent programmes were implemented on a national scale, the expansion of ART service provision to non-PBF facilities may have been intensified in PBF districts, contributing to explain our results.
The impact of PBF remained limited to highly rewarded indicators, for which changes in the supply of services may have occurred, but with the potential impact reduced by demand-side constraints impeding service use. Performance Based Financing can increase HIV testing during ANC, and knowledge about prevention of HIV transmission from mother to child. These were not directly targeted by the program, but served to find patients and increase referral to PBF facilities for highly incentivized services. Increased efforts to stimulate demand may facilitate outreach to more disadvantaged populations within the whole district. However, increases in subsequent use of services, such as ANC and institutional deliveries, may be constrained by demand-side or contextual characteristics. The impact may remain limited to wealthier populations, especially if patients are referred to more distant PBF facilities for incentivized services.
Further research using richer and more precise data is required to investigate the mechanisms within local health systems which could explain our results. However, we provide evidence that both health administrations and the way in which systems are organised shape the effects of PBF. We also provide evidence that structural contextual and supply factors, alongside demand-side constraints, may limit the effectiveness of incentives to increase service use. Strengthening the capacity of local health systems and designing interventions which account for demand-side constraints and for the organization of services is key to achieving widespread access and improved service use, beyond targeted indicators and directly served populations.
Summary of targeted indicators and related outcomes in DHS 2011 and 2015.
Constant, year of birth, month of birth, distance to health facility, squared distance to health facility, mothers' wealth quintile, years of education, 5-year age-group, number of children, Christian religion, household size, female head of household, age of head of household, rural residence, car ownership and fixed effects for closest health facility. DHS population weights applied to observations. Standard errors clustered at the DHS primary sampling unit level in parentheses. *p < 0.10, **p < 0.05, ***p < 0.01, based on Bonferroni-Holm p-value.

TABLE 4 Effects of performance based financing by wealth, education and province.
TABLE 5 Effects of performance based financing by exposure.
TABLE 6 Effects of performance based financing by exposure, wealth, education and province.
Note (Tables 4 to 6): Constant, year of birth, month of birth, distance to health facility, squared distance to health facility, mothers' wealth quintile, years of education, 5-year age-group, number of children, Christian religion, household size, female head of household, age of head of household, rural residence, car ownership and fixed effects for closest health facility. DHS population weights applied to observations. Standard errors clustered at the DHS primary sampling unit level in parentheses. *p < 0.10, **p < 0.05, ***p < 0.01, based on Bonferroni-Holm p-value.