Improving quality of care through payment for performance: examining effects on the availability and stock out of essential medical commodities in Tanzania

Objective: To evaluate the effects of payment-for-performance (P4P) on the availability and stock out rate of reproductive, maternal, newborn and child health (RMNCH) medical commodities in Tanzania and assess the distributional effects.
Methods: The availability of RMNCH commodities (medicines, supplies and equipment) on the day of the survey, and stock outs for at least one day in the 90 days prior to the survey, were measured in 75 intervention and 75 comparison facilities in January 2012 and 13 months later. Composite scores for each sub-group of commodities were generated. A difference-in-differences linear regression was used to estimate the effect of P4P on outcomes and differential effects by facility location, level of care, ownership and socio-economic status of the catchment population.
Results: We estimated a significant increase in the availability of medicines by 8.4 percentage points (95% CI: 3.0% to 13.7%; p=0.002) and an 8.3 percentage point increase (95% CI: 0.01% to 16.5%; p=0.050) in the availability of medical supplies. P4P had no effect on the availability of functioning equipment. Effects on stock out rates were similar. Effects were generally equally distributed across facilities, with effects on stock outs of many medicines being pro-poor, and greater effects in facilities in rural compared to urban districts.
Conclusion: P4P can improve the availability of medicines and medical supplies, especially in poor, rural areas, when these commodities are incentivised at both facility and district levels, making services more acceptable, effective and affordable, and enhancing progress towards universal health coverage.


Introduction
The availability of essential medical commodities (medicines, medical supplies and equipment) is a key component of effective service delivery required for maintaining population health [1]. Shortages of medical commodities are associated with poor structural quality, or poor quality relating to the attributes of the setting in which care delivery occurs [2,3], low levels of patient satisfaction, and preventable deaths [4][5][6][7][8][9]. Medicine and supply shortages in public facilities are also responsible for a large share of the out-of-pocket payments faced by households in low and middle income settings limiting the affordability of care [1,10]. However, ensuring the availability of essential medical commodities remains a challenge for many low income country health systems.
According to the United Nations commission on life-saving commodities, Pay-for-Performance (P4P) is a strategy to improve access to life-saving commodities for maternal and child health [11,12]. P4P provides financial incentives to providers and/or health care managers based on the achievement of pre-defined performance targets and is currently being rolled out in many low income countries [13,14]. P4P could theoretically affect the availability of medical commodities by, for example, incentivising the provision of intermittent preventive treatment (IPT) for malaria during antenatal care (ANC); through facility-level bonus payments which can be used to procure commodities; and by incentivising district managers to reduce drug stock out rates.
However, empirically, only four studies have reported on the effect of P4P on the availability of medical commodities in low income countries. The effects are varied with no effects on the availability of drugs and equipment in Afghanistan [15]; no effects on patient perceptions of drug availability in Burundi [16]; an increase in patient perceptions of drug availability in the Democratic Republic of Congo (DRC) [17]; and a reduction in the availability of vaccines and equipment in another study from the DRC [18]. Only one study reports on stock out rates [18] and none of the studies shed light on the pathways through which such changes occurred. Previous studies have not examined the potential heterogeneity of effects across facilities and effects on commodities related to non-incentivised services (spillover effects). This paper examines the effect of P4P on the availability of medicines, medical supplies and equipment for reproductive, maternal, newborn and child health in Tanzania, and assesses whether these effects differed by facility location, level of care, facility ownership and socio-economic status of the facility's catchment population.

Study Setting
In the 1990s, Tanzania began a process of decentralisation of government functions including health services, involving the transfer of power from central to local government authorities [19]. As a result, district level managers are responsible for preparing annual health sector plans and budgets to implement health programmes and renovations in facilities, and for generating and managing resources for the district. District managers are supported by a regional health management team, while health facility governing committees oversee the implementation of plans and the management of resources at facility level. Public health facilities order medical commodities on a quarterly basis, based on an estimate of the quantities needed; they submit requests to the district, which reviews them and forwards them to the medical stores department (MSD), and commodities are then distributed to facilities (the 'pull' system) [20][21][22]. Districts and facilities can also use their own funds (e.g. insurance contributions, user fees, and P4P bonus payments) to procure commodities in case of stock-outs [22][23][24]. Non-public hospitals that are contracted by districts to deliver services on behalf of the Ministry of Health and Social Welfare (MoHSW) also receive medical commodities from the MSD. All other non-public facilities procure commodities from the MSD, foreign or local manufacturers, or privately owned accredited drug dispensing outlets (ADDOs) and pharmacies [25][26][27]. Some commodities (vaccines, anti-retrovirals (ARVs), vitamin A and family planning) are managed through disease-specific vertical programmes, which are financed externally, and distributed via the MSD or directly to facilities [24,28,29]. The MSD supply chain suffers from a shortage of commodities, inadequate budget allocations, inadequate tracking mechanisms and late delivery of required commodities [8,22,24,30].
As a result, facilities experience regular shortages of essential drugs and supplies especially in the public sector [22,24,30,31]. For example, out of 1297 facilities surveyed in 2012 only 41% stocked the 14 essential tracer medicines at the time of the survey [31]. An assessment in 2010 found that the MSD fulfilled 68% of hospital orders and 67% of orders from health centres and dispensaries [32].

P4P in Tanzania
In 2011, the MoHSW in Tanzania, with financial support from the Government of Norway, introduced a P4P scheme in Pwani region to improve reproductive, maternal, newborn and child health (RMNCH), which is ongoing. Pwani region is one of 30 regions in the country and has seven districts with more than 209 health facilities and a population of just over a million [33]. Financial incentives are given to health facilities, district and regional managers based on their performance on pre-defined service delivery targets (Table 1) [34,35]. Most of the targets at facility level pertain to increases in service coverage, with four that involve the provision of medicines such as Antiretroviral therapy (ART), IPT during ANC, vaccines and supplies such as partographs. District managers are rewarded for reducing the proportion of facilities in the district reporting stock-outs of essential medicines (Appendix 1a) for at least one week. Districts are required to verify facility performance reports, resulting in more frequent contact between district managers and providers which may also help reduce stock-outs. Facilities are required to open bank accounts to receive performance payments.
Facility and district performance data are verified every six months (one cycle). For dispensaries the maximum payout, if all targets are fully attained, is USD 820 per cycle; while maximum payouts are USD 3,220 and USD 6,790 for health centres and hospitals, respectively. Incentive payouts at facility-level include bonuses to staff (equivalent to 10% of monthly salary) and funds that can be used to procure drugs and supplies and for facility improvement (10% of the total in hospitals and 25% in lower level facilities). District and regional managers receive bonus payments of up to USD 3,000 per cycle based on the performance of facilities in their district or region.
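The payout rules above can be turned into a short worked calculation. The sketch below assumes that "10% of the total" and "25% of the total" refer to the maximum payout per cycle for each facility level; that reading, and the names `MAX_PAYOUT_USD`, `FACILITY_SHARE` and `max_facility_funds`, are illustrative assumptions rather than part of the scheme's documentation.

```python
# Maximum payout per six-month cycle, in USD, as stated in the text.
MAX_PAYOUT_USD = {"dispensary": 820, "health centre": 3220, "hospital": 6790}

# Assumed share of the total payout that can be used for drugs, supplies
# and facility improvement: 10% in hospitals, 25% in lower level facilities.
FACILITY_SHARE = {"dispensary": 0.25, "health centre": 0.25, "hospital": 0.10}


def max_facility_funds(level):
    """Upper bound (USD per cycle) on funds available for drugs, supplies and improvements."""
    return round(MAX_PAYOUT_USD[level] * FACILITY_SHARE[level], 2)


for level in MAX_PAYOUT_USD:
    print(level, max_facility_funds(level))
```

Under this reading, a dispensary that attains every target would have at most about USD 205 per cycle to spend on commodities and facility improvement, which gives a sense of the scale of procurement the bonus could support.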

Study design
This study uses data from a controlled before and after study of the P4P scheme in Pwani region, Tanzania, conducted in all seven intervention districts and four comparison districts from Morogoro and Lindi regions [34,35]. Baseline data were collected in January 2012, and follow-up data 13 months later.

Data sources
The data on availability and stock out of essential RMNCH commodities within the previous 90 days were collected through a survey of 75 facilities in each study arm. In the intervention arm we included all 6 hospitals and 16 health centres that were eligible for the P4P scheme, and a random sample of 53 eligible dispensaries. A corresponding number of facilities were surveyed in the comparison arm. The facility survey also documented facility characteristics and was administered to the facility in-charge. To proxy the socioeconomic status (SES) of the facility catchment population, we used data from a survey of 1500 households of women who had delivered in the 12 months prior to the baseline survey, and a similar sample in the follow-up survey (20 households sampled from the catchment area of each facility). More details on data sources and data collection are provided elsewhere [34,35].

Outcome Measures
Our main outcomes are the availability of RMNCH medicines, medical supplies and functioning equipment, and stock-out of medicines and supplies at the facility. If a commodity was available on the day of the survey, the availability outcome was coded 1 and 0 otherwise; if a commodity was out of stock for at least one day in the 90 days prior to the survey, the stock-out outcome was coded 1 and 0 otherwise (Appendix 1a).
Medical commodities were classified in terms of their therapeutic use as: antibiotics, antimalarials, antihypertensives, antidiarrheal, ARVs, oxytocics, vaccines, family planning, vitamin A, medical supplies and medical equipment (Appendix 1a). We differentiated between items which relate directly to a P4P target and those which do not, to examine potential spillover effects. Items were also classified according to their beneficiary/recipient group along the RMNCH continuum of care based on the World Health Organisation classification of priority medicines [11,12]. For each of these groupings, we generated composite scores based on an un-weighted mean score across items in the group, which can be interpreted as the mean percentage availability/stock out rate within the grouping across facilities. We measured the proportion of facilities with availability/stock-out of the respective commodity groups. In generating the indices we gave equal weight to each commodity item for ease of interpretation, but we acknowledge that some items may be more effective than others in improving health outcomes.
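The composite score described above can be sketched in a few lines. The example assumes each item is stored as a 0/1 indicator per facility; the facility names, item names and helper functions are hypothetical and the values are not study data.

```python
from statistics import mean


def composite_score(items):
    """Un-weighted mean of 0/1 item indicators for one facility, as a percentage."""
    if not items:
        raise ValueError("a commodity group needs at least one item")
    return 100.0 * sum(items.values()) / len(items)


def group_mean_score(facilities):
    """Mean of the facility-level composite scores across facilities."""
    return mean(composite_score(items) for items in facilities.values())


# Hypothetical availability indicators: 1 = available on the survey day, 0 = not.
group = {
    "facility_A": {"item_1": 1, "item_2": 1, "item_3": 0},
    "facility_B": {"item_1": 1, "item_2": 0, "item_3": 0},
}
scores = {name: composite_score(items) for name, items in group.items()}
overall = group_mean_score(group)
print(scores, overall)
```

Because every item carries the same weight, a facility stocking two of three items scores about 66.7% regardless of which two items they are, which is the interpretive simplification acknowledged in the text.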

Sub-group effects
We examined whether the effects of P4P differed with the wealth of the facility catchment population to see if benefits were pro-poor, given the greater burden of out-of-pocket payments from stock-outs on poorer groups [1,10,30,36]. We also examined effects by facility ownership (public/non-public) given the differing procurement and supply systems in public and non-public sectors; level of care (dispensary/health centre or hospital) given that dispensaries are typically worse off in drug availability [7,31,37]; and whether the facility was in an urban or rural district, as facilities in urban districts are better connected by roads, facilitating the distribution of commodities relative to those in rural districts.
We generated a wealth score for each household in the catchment area of the facility, based on ownership of 42 household items and characteristics, using principal component analysis (PCA) [38,39] (Appendix 1c). We calculated the average wealth score of the 20 households sampled within the facility catchment area. We ranked facilities by these scores from poorest (low score) to least poor, and split them into terciles (poorest, middle and least poor).
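The wealth ranking can be illustrated with a minimal sketch. It assumes the wealth score is the first principal component of standardised 0/1 ownership indicators, computed here with NumPy's SVD rather than the software used in the study. The five items, six facilities and four households per facility are hypothetical stand-ins for the 42 items and 20 households in the survey, and the function names are illustrative.

```python
import numpy as np


def wealth_scores(assets):
    """First principal component of standardised asset indicators (one row per household)."""
    X = np.asarray(assets, dtype=float)
    std = X.std(axis=0)
    keep = std > 0  # items with no variation carry no information
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / std[keep]
    # The right singular vectors of Z are the principal axes.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    loadings = vt[0]
    # The sign of a component is arbitrary; orient it so owning more scores higher.
    if loadings.sum() < 0:
        loadings = -loadings
    return Z @ loadings


def facility_terciles(scores, facility_ids):
    """Average household scores by facility and split facilities into terciles."""
    ids = np.asarray(facility_ids)
    facilities = np.unique(ids)
    means = np.array([scores[ids == f].mean() for f in facilities])
    order = np.argsort(means, kind="stable")  # poorest facility first
    names = ("poorest", "middle", "least poor")
    labels = np.empty(len(facilities), dtype=object)
    for rank, idx in enumerate(order):
        labels[idx] = names[rank * 3 // len(facilities)]
    return dict(zip(facilities.tolist(), labels.tolist()))


# Hypothetical data: a household at wealth level k owns the first k of 5 items.
n_items = 5
household_levels = {
    0: [0, 0, 1, 0], 1: [1, 1, 2, 0], 2: [2, 2, 3, 1],
    3: [3, 3, 4, 2], 4: [4, 4, 5, 3], 5: [5, 5, 5, 4],
}
assets, facility_ids = [], []
for facility, levels in household_levels.items():
    for level in levels:
        assets.append([1 if item < level else 0 for item in range(n_items)])
        facility_ids.append(facility)

scores = wealth_scores(assets)
terciles = facility_terciles(scores, facility_ids)
print(terciles)
```

Averaging household scores before ranking means a facility's tercile reflects its sampled catchment households as a group, so a small sample per facility can shift a facility across tercile boundaries.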

Statistical analysis
We compared facility characteristics and outcome scores across study arms by using t-tests adjusting for clustering at the facility level. We used a linear difference-in-differences regression model to identify the effects of P4P on the availability and stock-out of medical commodities (1):

$$Y_{it} = \beta_0 + \beta_1 (P4P_i \times \delta_t) + \beta_2 \delta_t + \gamma_i + \varepsilon_{it} \qquad (1)$$

where $Y_{it}$ is the outcome of facility $i$ at time $t$. $P4P_i$ is a dummy variable, taking the value 1 if a facility is exposed to P4P and 0 if not. We controlled for time invariant determinants with facility fixed effects $\gamma_i$; and year fixed effects $\delta_t$. The error term is $\varepsilon_{it}$. The effect of P4P on the outcome is given by $\beta_1$.
In order to examine sub-group effects, we included a triple interaction term between treatment effect ($P4P_i \times \delta_t$) and sub-grouping variable $X_i$. The associated two-order interaction terms were also included. The coefficient of interest for the differential effect is $\beta_3$ (2):

$$Y_{it} = \beta_0 + \beta_1 (P4P_i \times \delta_t) + \beta_2 \delta_t + \beta_3 (P4P_i \times \delta_t \times X_i) + \beta_4 (P4P_i \times X_i) + \gamma_i + \varepsilon_{it} \qquad (2)$$

For each of the effects we report the confidence interval based on standard errors that are clustered at the facility level. As a robustness check, we clustered the standard errors at the district level and used the bootstrapping method to adjust for the small number of clusters [40]. We were unable to test whether the availability and stock out outcomes were parallel between study arms prior to the intervention. However, we tested and confirmed that trends in facility level utilisation for all incentivised services were parallel prior to the intervention [35,41]. All analyses were performed using STATA version 13.
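The logic of the two estimators can be sketched without a regression package. With two survey rounds and a balanced panel, the fixed-effects difference-in-differences coefficient equals the mean change in intervention facilities minus the mean change in comparison facilities. Under a fully interacted specification, which is an assumption of this sketch, the differential effect equals the difference between the sub-group estimates. The data and function names below are hypothetical, and the sketch does not compute the clustered or bootstrapped standard errors used in the study.

```python
from statistics import mean


def did(rows):
    """Difference-in-differences from (treated, baseline, followup) rows."""
    treated = [f - b for t, b, f in rows if t == 1]
    control = [f - b for t, b, f in rows if t == 0]
    return mean(treated) - mean(control)


def differential_effect(rows):
    """DiD in sub-group x=1 minus DiD in sub-group x=0, from (treated, x, baseline, followup) rows."""
    in_group = [(t, b, f) for t, x, b, f in rows if x == 1]
    out_group = [(t, b, f) for t, x, b, f in rows if x == 0]
    return did(in_group) - did(out_group)


# Hypothetical composite availability scores (%), not study data.
# Columns: treated (1 = P4P), rural (1 = rural district), baseline, follow-up.
facilities = [
    (1, 1, 50.0, 65.0), (1, 1, 54.0, 67.0),
    (0, 1, 52.0, 55.0), (0, 1, 56.0, 61.0),
    (1, 0, 70.0, 76.0), (1, 0, 66.0, 70.0),
    (0, 0, 68.0, 70.0), (0, 0, 72.0, 76.0),
]

overall = did([(t, b, f) for t, _, b, f in facilities])
rural_vs_urban = differential_effect(facilities)
print(overall, rural_vs_urban)
```

In this toy panel the overall estimate is 6 percentage points, made up of a 10 point effect in rural facilities and a 2 point effect in urban ones, so the differential effect is 8 points.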

Results
Baseline facility characteristics were fairly balanced across study arms (Table 2). However, facilities in the intervention arm were serving poorer populations than those in the comparison arm.
P4P was associated with an 8.4 percentage point increase (95% CI: 3.0% to 13.7%) in the availability of all 37 medicines combined (p=0.002, 13.8% increase from baseline) and an 8.3 percentage point increase (95% CI: 0.01% to 16.5%) in the availability of medical supplies, though this was only borderline significant (p=0.050, 12.9% increase from baseline) (Table 3). P4P had no effect on the availability of functioning equipment. Effects were noted for some medicines associated with P4P targets (antimalarials, antihypertensives and oxytocics used for deliveries) and for supplies (partograph), though the latter effect was only borderline significant; no effects were found for vaccines, family planning and ARVs. Effects were also observed for items that were not clearly linked to service targets, but were incentivized for district managers (antibiotics).
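The relative changes quoted alongside the percentage point effects follow the formula given in the table notes, % change = (beta / baseline mean) x 100. The sketch below applies that formula and inverts it to back out the baseline means implied by the rounded figures in the text. The implied baselines are approximations derived for illustration, not values reported by the study, and the function names are hypothetical.

```python
def percent_change(beta_pp, baseline_mean_pct):
    """Relative change (%) implied by an effect in percentage points and a baseline mean (%)."""
    return 100.0 * beta_pp / baseline_mean_pct


def implied_baseline(beta_pp, pct_change):
    """Baseline mean (%) implied by an effect in percentage points and its relative change (%)."""
    return 100.0 * beta_pp / pct_change


# Rounded figures from the text: medicines combined and medical supplies.
medicines_baseline = implied_baseline(8.4, 13.8)
supplies_baseline = implied_baseline(8.3, 12.9)
print(round(medicines_baseline, 1), round(supplies_baseline, 1))
```

Reading the two measures together helps with interpretation: similar percentage point gains correspond to slightly different relative gains because the baseline availability of the two commodity groups differs.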
P4P was also associated with a reduction in stock outs of medicines and medical supplies (Table 4). Most of the items for which we found a significant increase in availability were also less likely to be out of stock. In addition, there was a borderline significant 10.2 percentage point reduction in vaccine stock outs (p=0.073, 59.6% reduction from baseline) and a 13.6 percentage point reduction in stock outs of family planning medicines (p=0.062, 29.9% reduction from baseline) (Table 4). The effects of P4P on IPT and partograph stock outs were not significant.
P4P reduced the stock out of medicines across the RMNCH continuum of care, and that of medical supplies benefiting mothers and newborns (Appendix 1b). Effects on availability were most pronounced for maternal, newborn and child medicines and reproductive health supplies.
The effect of P4P on the stock out of medicines overall was pro-poor, with the reduction in facilities in the poorest tercile being 24.5 percentage points greater than that in the least poor tercile (p=0.019); specifically, the effects on the stock outs of antimalarials, antibiotics and oxytocics were pro-poor; effects on antimalarial availability were also marginally pro-poor (Table 5). P4P had a greater effect on the availability of medicines and medical supplies in facilities in rural districts (by 10.4 percentage points, p=0.051, and 22 percentage points, p=0.003, respectively). Similarly, the effect of P4P on the availability and stock outs of antimalarials was greater in facilities in rural than in urban districts (23.1 percentage points, p=0.020; and 23.1 percentage points, p=0.070, respectively). The effect of P4P on the availability and stock out of antihypertensives was greater in health centres and hospitals than in dispensaries (by 19.9 percentage points (p=0.020) and 26.1 percentage points (p=0.064), respectively). There were no differential effects by facility ownership.
When standard errors were clustered at the district level, the effects on the availability of antimalarials, oxytocics and delivery care drugs combined, and on the stock out of oxytocics, vaccines and delivery care drugs combined were maintained (results not shown). However, the effects on composite indices for medicines combined and medical supplies were no longer significant.

Discussion
We examined the effects of P4P on the availability and stock out rate of medical commodities for RMNCH. P4P was associated with significant improvements in availability and reductions in stock outs of medicines and medical supplies, but had no effect on the availability of equipment. Among medicines, the main effects were for drugs associated with the delivery of some incentivized services: antimalarials, drugs to induce labour and manage bleeding (oxytocics) or manage hypertension during delivery (antihypertensives). However, there was little or no evidence of effects on medicines linked to other incentivised services such as vaccines, family planning, ARVs, and supplies such as the partograph. P4P improved the availability/reduced stock outs for some of the drugs that districts were incentivised for, including antibiotics (ampicillin, amoxicillin, gentamycin, and flagyl). However, the scheme also reduced the stock out of antibiotics that were not tied to any incentive (e.g. cotrimoxazole, chloramphenicol and crystapen injection). This suggests that P4P schemes have the potential to improve drug availability beyond those drugs that are directly linked to the delivery of incentivised services. Effects were generally equally distributed across facilities, with effects on medicine stock outs being pro-poor in many cases, and greater in facilities in rural compared to urban districts. The greater improvements in the availability/stock out reduction of antihypertensives in higher level facilities are likely to reflect the greater number of obstetric referral cases at these facilities, and the associated need.
There are a variety of potential pathways to P4P effects on medicines and supplies in our study. The effect may in part be due to the provision of medicines being a pre-condition for meeting certain performance targets (for example, IPT during ANC). The financial autonomy resulting from bank accounts enabled facilities to use bonus funds and cost sharing revenue (from user fees and community-based insurance) to procure drugs and supplies, consistent with findings from a process evaluation carried out alongside this study [42]. Incentives to district managers to limit drug stock outs were also important, given the role of district managers in the procurement and supply process. By providing incentives to facilities and districts, the scheme ensured that stakeholders at all levels were working towards the same goals. The verification system under P4P also meant that district supervision was intensified, providing more opportunities for district managers to identify and address stock outs of a wider range of drugs.
A number of medicines associated with incentivised services were not affected by P4P (vaccines, ARVs and family planning). The procurement of these items depended on donor funding [24,28,29]. The average availability of vaccines was above 94% at baseline (91% for family planning), so there was also little scope for improvement. Tanzania faced a problem with shortages of ARVs during the period of this study due to the introduction of a new treatment regimen, weak procurement mechanisms, and shortages of ARVs on the global market that were outside of facilities' control [43]. The lack of effect on equipment availability may be due to the lack of incentives attached to equipment availability at the facility or the district level. The cost of equipment is also higher than that of many drugs and supplies, which may have deterred facilities from such investments.
Our study stands in contrast to a recent review from low and middle income countries concluding that P4P is not effective in improving structural quality of care [44]. Our finding of increased availability of drugs is consistent with that reported from South Kivu province in the DRC [17], but contrary to the findings from Afghanistan [15], Burundi [16,45] and Katanga province in the DRC [18], which showed no improvement. The differences in context and variation in program design likely explain the difference in effects. In Afghanistan, Burundi and the DRC, drugs/supplies were incentivised through service targets and providers had financial autonomy, as in Tanzania [15][16][17]. In Burundi up to 50% of the bonus could be used to procure drugs; however, this was not clearly the case in the other settings. Unlike the Pwani scheme, many schemes weight bonus payments with structural quality scores, which include the availability of drugs and supplies [15][16][17]. While facilities could channel a percentage of their bonus to districts in the DRC [46], districts were not directly incentivised, nor were they incentivised in other settings.
Despite the importance of assessing distributional effects within program evaluation [47,48], ours is the first study to examine the heterogeneity of the effect of P4P on medical commodities. The pro-poor effects on medicines are encouraging as are the pro-rural effects and these are consistent with universal health coverage (UHC) goals and efforts to meet the sustainable development goal (SDG) 3.
There are several limitations to this study. First, we used household data from the facility catchment area to proxy the SES of the facility's location, based on a sample of 20 households which may not have accurately reflected the entire catchment population. Second, there was an imbalance in SES across study arms; however, our results were reasonably robust when dividing facilities into SES groups in each arm separately. Third, we were unable to control for time-varying confounding factors due to lack of data, but confounding bias due to time-invariant factors was adjusted for through fixed effects estimation. Fourth, although we tested and confirmed the assumption of parallel trends in facility utilisation outcomes prior to the intervention, we could not test it for drug availability and stock-out outcomes due to lack of historical data on these outcomes. We were also unable to capture seasonal fluctuations in drug availability, as this requires time series data which were not available. Finally, potential type I errors due to multiple hypothesis testing are a concern for inference; however, we used sub-groups of items to minimize the risk of this error.

Conclusion
Our study has shown that P4P, when introduced with facility and district level incentives and in a context where facilities and local government authorities have autonomy over the use of funds, can improve the availability of drugs and supplies, making services more acceptable, effective and affordable, especially in facilities serving poor, rural populations, and thereby enhancing progress towards universal health coverage [1,10].

Funding
The Government of Norway funded the data collection for the program evaluation that was used in this paper (grant numbers: TAN-3108 and TAN 13/0005. http://www.norad.no/en/) and the UK Department for International Development (DFID) as part of the Consortium for Research on Resilient and Responsive Health Systems (RESYST) supported the data analysis and writing of this paper. This study is part of a PhD thesis at the University of Bergen for Peter Binyaruka, who is financially supported by the Norwegian State Education Loan Fund. The funding bodies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Contributions
PB was involved in designing this sub-study with JB, oversaw data collection, analysed the data, interpreted the results and drafted the manuscript. JB designed this sub-study and the impact evaluation within which it is embedded, designed the study tools, was involved in guiding analysis and interpreting the data, and contributed to drafting the manuscript. All authors read and approved the final manuscript.

Competing interests:
The authors of this manuscript have the following competing interests: all authors were funded by the Government of Norway to undertake the data collection associated with this research. The Government of Norway also funded the P4P programme in Pwani region of Tanzania. The funder of the study had no role in data analysis, data interpretation, or writing of the manuscript.

Ethical Issues
The evaluation study received ethical approval from the Ifakara Health Institute institutional review board (approval number: 1BI1IRB/38) and the ethics committee of the London School of Hygiene & Tropical Medicine. Study participants provided written consent to participate in this study by signing a consent form that was read out to them by the interviewers. This consent form was reviewed and approved by the ethics committees prior to the start of the research.

Notes: Items included for medicines combined (37), medical supplies (11) and equipment (16); "targeted" are commodities linked to services targeted/incentivized by P4P; Number of observations (N) is small for ARVs, family planning and vaccines because not all facilities stock these commodities; *The % D = (beta / baseline mean) x100, where the baseline mean of the dependent variable is for the intervention facilities; †The Beta is the estimated intervention effect controlling for a year dummy and facility-fixed effects; *** denotes significance at 1%, ** at 5%, and * at 10% level.
Notes: Reference category in brackets: for poorest and middle SES (least poor SES), rural (urban), public (non-public), and dispensary (hospital & health centres); †The Beta is the estimated average intervention effect controlling for a year dummy and facility-fixed effects; the differential effects of P4P are estimated controlling for a year dummy and facility-fixed effects; statistically significant differential effects are in bold (P-value <0.10).

ARVs - targeted (7): Zidovudine, stavudine, lamivudine, tenofovir, nevirapine, efavirenz, and emtricitabine.
6. Incentivized essential medicines to district managers: Oxytocics, Antihypertensives, Antimalarials, Antidiarrheal, Antibiotics (Gentamycin, Ampicillin, Amoxicillin, Flagyl), Vaccines, ARVs, Iron, Folic Acid, Salbutamol, Dexamethasone and Family planning commodities.
Notes: For measurement, if a commodity was available on the day of the survey, the outcome was coded 1 and 0 otherwise; if a commodity was out of stock for at least one day in 90 days prior to the survey the outcome was coded 1 and 0 otherwise.

Notes: RH=Reproductive Health; MH=Maternal Health; NH=Newborn Health; CH=Child Health; *The % D = (beta / baseline mean) x100, where the baseline mean of the dependent variable is for the intervention facilities; †The Beta is the estimated intervention effect controlling for a year dummy and facility-fixed effects; *** denotes significance at 1%, ** at 5%, and * at 10% level.