Assessment of health facility quality improvements, United Republic of Tanzania

Abstract
Objective To identify contextual factors associated with quality improvements in primary health-care facilities in the United Republic of Tanzania between two star rating assessments, focusing on local district administration and proximity to other facilities.
Methods Facilities underwent star rating assessments in 2015 and between 2017 and 2018; quality was rated from zero to five stars. The consolidated framework for implementation research, adapted to a low-income context, was used to identify variables associated with star rating improvements between assessments. Facility data were obtained from several secondary sources. The proportion of the variance in facility improvement observed at facility and district levels and the influence of nearby facilities and district administration were estimated using multilevel regression models and a hierarchical spatial autoregressive model, respectively.
Findings Star ratings improved at 4028 of 5595 (72%) primary care facilities. Factors associated with improvement included: (i) star rating in 2015; (ii) facility type (e.g. hospital) and ownership (e.g. public); (iii) participation in, or eligibility for, a results-based financing programme; (iv) local population density; and (v) distance from a major road. Overall, 20% of the variance in facility improvement was associated with district administration. Geographical clustering indicated that improvement at a facility was also associated with improvements at nearby facilities.
Conclusion Although the majority of facilities improved their star rating, there were substantial variations between facilities. Both district administration and proximity to high-performing facilities influenced improvements. Quality improvement interventions should take advantage of factors operating above the facility level, such as peer learning and peer pressure.


Introduction
Following the Astana declaration in 2018, primary health care is again high on the global health agenda and many countries are renewing their commitments to strengthening primary care. 1 However, poor care quality is often a limiting factor. In the United Republic of Tanzania in 2016, an estimated 45 000 deaths were due to poor care quality and many involved conditions that could be addressed in primary care. 2 Traditionally, quality improvement in health care has focused on micro-level approaches that rely on changing the practices of individual workers or facilities. 3,4 However, these strategies may have a limited impact in complex, adaptive, health systems. Macrolevel and meso-level strategies that affect whole systems or geographical areas are needed to address the social, political, economic and organizational structures underlying poor care quality. 3,5 In the United Republic of Tanzania, the Health Quality Assurance Division of the health ministry did a star rating assessment in 2015 as part of a government initiative to improve service delivery. [6][7][8] A data feedback approach was adopted. First, assessment teams, which comprised two independent health workers and one member of the local district's health management team, collected data on care quality from all primary care facilities. 8 Each facility was given a rating of between zero and five stars. 9 Then, health facility administrators developed a quality improvement plan tailored to tackling specific quality gaps. 8 Ratings were also discussed with district and regional health management teams. 8 Some, but not all, administrations took an active interest and supported facilities. For example, some used the assessment tool as a supportive supervision checklist or encouraged facilities to learn from one another. However, decisions on whether and how to improve quality were taken locally; there was no universal plan and no national incentives for improvement. 
Originally, the health ministry planned to close facilities with zero stars but too many facilities met that criterion. Facilities were reassessed between 2017 and 2018.
Although the star rating assessment was the country's flagship strategy for improving the quality of primary care, it was implemented at a time when other health system changes may have influenced quality. First, a larger government initiative for improving service delivery prioritized decentralization. 6 Fiscal responsibility was delegated to local districts in 2014 and, in 2018, decentralized further to frontline health facilities through direct facility financing. 10,11 Second, in 2015 a results-based financing programme was implemented in public facilities in eight of the country's 31 regions to address health-care quality and utilization. 12 A facility needed more than zero stars at baseline assessment or reassessment to be eligible for results-based financing. Facilities in programme regions that did not meet this criterion initially received a starter fund of 10 million Tanzanian shillings (about 4500 United States dollars). Once enrolled in the results-based financing programme, facilities' performance was evaluated using criteria that differed from the star rating assessment,

Research
Health facility quality improvements, United Republic of Tanzania Anna D Gage et al.
though with some overlap. Third, three additional regions and two districts in a fourth region received starter funds to improve quality in zero-star-rated public facilities, independently of the results-based financing programme. The aim of our study was to identify micro- and meso-level factors associated with quality improvements at health-care facilities between two star rating assessment rounds in order to characterize the context in which quality improvement plans could be most effective. In particular, we determined whether improvements were related across groups of facilities by assessing how they were influenced by local district administration or geographical proximity to other facilities. Better understanding of how quality improvements are affected by the context in which a facility functions will help countries similar to the United Republic of Tanzania develop more targeted and effective strategies for quality improvement.

Conceptual framework
We adapted the consolidated framework for implementation research to make it applicable to a low-income context and suitable for a nationwide assessment. 13,14 Details are available from the data repository. 15 First, a structural environment was added to the outer setting domain and the inner setting domain was limited to the constructs of: (i) structural characteristics; (ii) networks and communications; and (iii) culture. Second, the modified framework conceptualized two pathways through which the outer setting could influence quality improvement: (i) district council administration (i.e. location within a district); and (ii) geographical proximity to other facilities. 16 Policies, management, supervision and funds are the responsibility of the local district council, which is the lowest level of government charged with health facility administration in the country. Urban administrations included town councils and municipalities and all rural administrations were district councils; we use the term district council to refer to both rural and urban administrations. 17 A facility's location and immediate surroundings may independently influence its ability to implement quality improvement plans in low-income settings where facilities are isolated because of poor communications and high transportation costs. For example, proximity to a high-performing facility may encourage peer learning.

Study sample
The 2015 baseline star rating assessment covered 6993 primary health-care facilities (i.e. public and private dispensaries, health centres and primary-level hospitals) in mainland United Republic of Tanzania. The assessment excluded: (i) facilities in Pemba and Zanzibar; (ii) national, zonal and regional referral hospitals; and (iii) stand-alone clinics, such as maternity homes and dental clinics. Reassessment took place between 2017 and 2018 and covered 7289 facilities. Our study included all facilities with star ratings from the two assessment rounds. Our analysis excluded: (i) the Dar es Salaam region because baseline assessment data were unavailable; (ii) institutional facilities, such as prisons, military and police facilities, and those with an unknown management type (2% of facilities); and (iii) facilities without geographical coordinates. Coordinates were obtained from the United Republic of Tanzania's 2019 master health facility database.

Dependent variable
The primary dependent variable was the change in a facility's star rating between baseline assessment and reassessment. Star rating assessments covered four domains and twelve subdomains, which were awarded different score weightings (Table 1); they included measures of both structural quality (e.g. medicines and equipment) and process quality (e.g. adherence to clinical guidelines and patients' experience), as assessed through facility audits, record reviews and interviews with providers and clients. 9 Dispensaries, health centres and primary-level hospitals each had their own assessment tools, which included additional items as the level increased. 9 The overall score ranged from 0 to 100%. Stars were awarded according to the lowest domain score: 0 to 19%: no stars; 20 to 39%: one star; 40 to 59%: two stars; 60 to 79%: three stars; 80 to 89%: four stars; and 90 to 100%: five stars. The analysis was repeated using the change in overall score as a secondary dependent variable.
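The banding rule above can be sketched in a few lines of Python. This is only an illustration, not the assessment tool itself: the function name is hypothetical, the input is assumed to be a list of domain scores in percent, and the published integer bands (e.g. 0 to 19%, 20 to 39%) are assumed to extend to fractional scores as half-open intervals.

```python
def stars_from_domain_scores(domain_scores):
    """Return the star rating implied by the lowest domain score (in percent).

    Hypothetical helper: the bands follow the text (0-19: 0 stars, 20-39: 1,
    40-59: 2, 60-79: 3, 80-89: 4, 90-100: 5). Treating a score such as 19.5
    as belonging to the lower band is an assumption.
    """
    if not domain_scores:
        raise ValueError("at least one domain score is required")
    if any(score < 0 or score > 100 for score in domain_scores):
        raise ValueError("domain scores must lie between 0 and 100")

    # Stars are awarded according to the lowest-scoring domain.
    lowest = min(domain_scores)

    # (lower bound of band in percent, stars awarded), highest band first.
    bands = [(90, 5), (80, 4), (60, 3), (40, 2), (20, 1)]
    for lower_bound, stars in bands:
        if lowest >= lower_bound:
            return stars
    return 0
```

Under this rule, a facility scoring 85%, 45%, 70% and 92% across the four domains would receive two stars, because the rating is driven by the weakest domain rather than by the overall score.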

Independent variables
We identified contextual factors that could influence a facility's ability to improve quality using the modified conceptual framework. Values for independent variables were obtained from a range of data sources (Table 2), preferably for 2015 to correspond with the time of baseline assessment. Variables obtained from Demographic and Health Surveys were calculated for individual districts and applied to all facilities in the district. As these surveys are representative only at the regional level and some districts had very small sample sizes, we smoothed the variations arising from the small sample sizes by calculating predicted values for the variables using a null, three-level, random intercept model with households nested within districts and regions. 23 As limited data were available on inner setting characteristics, we performed a secondary analysis on a subset of facilities covered by the service provision assessment carried out from 2014 to 2015 (Table 2). 22 Service provision assessments are nationally representative facility surveys that include data on facility management and provider motivation. Facilities covered by this assessment were linked to star rating data using geographical coordinates. Data on other covariates came from WorldPop, OpenStreetMap and Natural Earth. 18,20,21

Analysis
Guided by the conceptual framework, we estimated the contribution of the covariates to quality improvement using a two-level, random intercept model with facilities nested within districts. The percentage variation in improvement explained by a set of covariates was calculated as the difference between the variance in the adjusted model (which included these covariates) and the null model (which did not include any covariates) divided by the null model variance. For a subsample of facilities, we repeated the calculations using a full random intercept model that included data on additional variables available only from the service provision assessment.
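The two variance summaries used in this analysis can be illustrated with a short Python sketch. It assumes the variance components have already been estimated by a two-level random intercept model; the function names and the numbers in the usage note are hypothetical, and the explained variance is read here as the reduction in total variance relative to the null model.

```python
def district_share(var_district, var_facility):
    """Share of the total variance lying between districts.

    Both arguments are variance components from a two-level random
    intercept model (hypothetical inputs, not values from the study).
    """
    total = var_district + var_facility
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_district / total


def proportion_explained(null_total_variance, adjusted_total_variance):
    """Proportion of the null-model variance explained by the covariates.

    Computed as (null variance - adjusted variance) / null variance.
    """
    if null_total_variance <= 0:
        raise ValueError("null-model variance must be positive")
    return (null_total_variance - adjusted_total_variance) / null_total_variance
```

For example, with made-up components of 0.20 between districts and 0.80 between facilities, `district_share(0.20, 0.80)` gives 0.20; if adding covariates reduced the total variance from 1.00 to 0.71, `proportion_explained(1.00, 0.71)` would give about 0.29.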
We examined the contribution of the geographical proximity of facilities to quality improvement using spatial analyses. First, we mapped the improvement using interpolation with an inverse distance weighting and clipped to 10 km around the facility to visualize trends. We calculated Moran's I for the change in star rating between the two assessments. Moran's I provides a measure of spatial autocorrelation by comparison with the null hypothesis of complete spatial randomness. 24 An inverse distance weighting matrix (i.e. with a weighting of 1/x², where x is the distance between facilities) was applied for facilities within 50 km of the index facility. Furthermore, the residuals of the two-level, random intercept model were also tested for spatial autocorrelation that was not explained by district or other covariates. Finally, as we found that the residuals were still autocorrelated, we used a hierarchical spatial autoregressive model to explicitly model spatial relationships at both facility and district levels. 25 This model included spatial lag terms at facility and district levels: these terms can be interpreted as associations between the improvement in a facility's star rating and improvements in nearby facilities and in adjacent districts, respectively. Additional details of the hierarchical spatial model are available from the data repository. 15

The National Institute for Medical Research of the United Republic of Tanzania and the Ifakara Health Institute Institutional Review Board approved the original study and the Harvard Institutional Review Board determined that this secondary analysis did not involve research on human subjects. All analyses were conducted in R v. 3.6.1 (The R Foundation, Vienna, Austria).
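As an illustration of the Moran's I calculation with inverse distance squared weights and a 50 km cut-off, the following Python sketch uses NumPy. It is not the authors' R code: the function names are hypothetical, coordinates are assumed to be already projected to planar kilometres, co-located facilities (distance zero) are given zero weight, the weights are not row-standardized, and no significance test is included.

```python
import numpy as np


def inverse_distance_weights(coords_km, cutoff_km=50.0, power=2.0):
    """Build a weights matrix with w_ij = 1 / d_ij**power for 0 < d_ij <= cutoff.

    coords_km: sequence of (x, y) positions in kilometres (assumed planar).
    Pairs beyond the cut-off, and the diagonal, receive zero weight.
    """
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = np.zeros_like(dist)
    mask = (dist > 0) & (dist <= cutoff_km)
    weights[mask] = dist[mask] ** (-power)
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the centred values."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    s0 = weights.sum()        # sum of all weights
    denom = (z ** 2).sum()    # sum of squared deviations
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    return (x.size / s0) * (z @ weights @ z) / denom
```

Values near zero are consistent with spatial randomness, positive values indicate that similar changes in star rating cluster together, and negative values indicate that neighbouring facilities tend to differ.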

Results
Star rating scores from two assessments were available for 5595 facilities that met inclusion criteria. Overall, 81% (4534/5595) were public facilities and 85% (4777/5595) were dispensaries (Table 2). Facility performance at baseline was poor: 34% (1927/5595) scored zero stars and 52% (2892/5595) scored one star. In total, 15% (835/5595) of facilities participated in the results-based financing programme and a further 11% (637/5595) were public facilities in programme regions that were ineligible because they had zero stars. There was an average of 47 facilities per district and 15 facilities per 100 000 people. Of 672 facilities with data available from the service provision assessment, 76% (508/672) had received external supervision in the past 6 months and 15% (99/672) had undergone routine quality assurance before star rating assessments started in 2015. Fig. 1 shows the proportion of facilities with a changed star rating between baseline and reassessment. Overall, 3% (181/5595) had a lower star rating at reassessment, 25% (1386/5595) received the same score, 45% (2531/5595) improved by one star and 27% (1497/5595) improved by two or more stars. There was no difference in improvements between dispensaries, health centres and primary-level hospitals. Public facilities improved more than for-profit and non-profit private facilities. There was a strong association with baseline rating: facilities with zero stars at baseline showed the largest improvements. Facilities with a lower score at reassessment than baseline more often had a score of two or higher at baseline and were more often a for-profit or non-profit private facility. Although scores decreased across all domains in facilities with lower reassessment scores, the largest declines were in the service charter fulfilment and accountability domain. Fig. 2 shows the baseline star rating and improvement in star rating for all 5595 facilities assessed.
At baseline, star ratings were best in Arusha and Kilimanjaro regions and poorest in Kigoma and Mtwara regions. Facilities improved most in the Pwani region and in regions surrounding Lake Victoria, except the Mara region. Facilities improved least in the Mara, Tanga and Ruvuma regions. There was significant geographical clustering of both baseline ratings (Moran's I: 0.17; P < 0.01) and star rating improvements (Moran's I: 0.18; P < 0.01). Table 3 shows the results of the random intercept models based on the full study sample of 5595 facilities. With the null model, 20% of the variance in facility improvement was due to the variance between districts, whereas 80% was due to the variance between facilities within districts (see footnotes to Table 3). With the outer setting model, 29% of the total variance was explained by all covariates included in the model. With the full model, which also included inner setting variables with data available for all facilities, 33% of the total variance was explained by the covariates. Relative to baseline performance, primary-level hospitals and health centres improved more than dispensaries, and public facilities improved more than for-profit and non-profit private facilities. In addition, higher population density around a facility and the facility's proximity to a major road were significantly associated with greater improvements. Participation in the results-based financing programme was associated with an average 0.37-star improvement (95% confidence interval, CI: 0.27 to 0.46), whereas being ineligible for the programme was associated with a 0.60-star improvement (95% CI: 0.50 to 0.70). Receipt of a starter fund was not significantly associated with an improvement. For every one-star increment in a facility's baseline score, there was a 0.69-star decrease (95% CI: −0.78 to −0.60) in performance, which indicates that initially low-performing facilities improved more than others, irrespective of other contextual factors.
The results of models based on the subsample of 672 facilities covered by the service provision assessment are presented in Table 4 (available at: http://www.who.int/bulletin/volumes/98/12/20-258145). With the null model, 18% of the variance in facility improvement was due to the variance between districts (see footnotes to Table 4). With the full model, which included all variables in the full model in Table 3, the covariates explained 39% of the total variance. With the final model, the addition of data on variables included in the service provision assessment contributed only one percentage point to the explained variance. However, this model identified two additional variables associated with a greater improvement in star rating: the number of full-time health workers in each facility and routine use of a quality assurance system. Models based on the change in overall score gave similar findings. 15 With the random intercept models, Moran's I for the residuals was 0.05 (Table 3), significantly lower than the unadjusted value of 0.18 (P < 0.05), which indicates that a large portion of the spatial autocorrelation was due to the facilities' district. However, the significant autocorrelation of the residuals suggests that spatial factors other than district may be associated with quality improvement. The results of the hierarchical spatial autoregressive model that included spatial lag terms at both levels are also presented in Table 3. Both lag terms were large and significantly associated with quality improvement. In this model, covariates accounted for 38% of the total variance (see footnote to Table 3). In addition, the proportion of women in a district who gave birth in a facility and the baseline rank of a facility relative to other facilities in the same district were also significantly associated with the improvement in star rating. Facilities with a low baseline rank showed greater improvements.

Discussion
The success of quality improvement interventions in health facilities depends on processes within facilities and the context in which they operate. Our study in the United Republic of Tanzania found that both district and proximity to high-performing facilities influenced a facility's ability to improve care quality, as assessed using the star rating assessment. In addition, baseline star rating, facility type and participation in, or ineligibility for, results-based financing were also important predictors of improvement. For example, facilities that were ineligible for results-based financing because of a low star rating improved more than facilities that received funding through results-based financing. The incentive of becoming eligible may have had a greater effect than incentives provided by results-based financing itself. Also, district councils may have put pressure on ineligible facilities to pursue additional funding. Primary-level hospitals and public facilities were more likely to improve than dispensaries or privately managed facilities. Private facilities may have felt less pressure to improve given their independent funding or received less support from district council administrations. Dispensaries may have had fewer financial and human resources to devote to improvement than primary-level hospitals.
More research is needed to understand the causal mechanisms behind our findings. Qualitative interviews with facility managers conducted by the Ifakara Health Institute confirmed the importance of context (Sanam Roder-DeWan, Ifakara Health Institute, unpublished observations, 2019). For example, managers noted that star ratings stimulated competition between neighbouring facilities, with some facilities being envious of surrounding facilities' high baseline scores. Managers also thought district council administration was critical for clarifying and strengthening facilities' accountability, both to the council administration and the community.
This study has several limitations. First, limited data were available, particularly for adapting the consolidated framework for implementation research to a national programme in a low-income setting. 13 Ideally, it would be helpful to have more data on: (i) interactions between district councils and facilities; (ii) facilities' culture and readiness to change; and (iii) health worker characteristics, such as self-efficacy and knowledge of the intervention. Supplementary data from the service provision assessment were still limited in these areas. Second, the star rating tool is limited in the way it assesses quality: (i) it does not consider health outcomes data; and (ii) it includes a large number of input variables, which may have overemphasized their importance relative to process variables. Moreover, data on users' experiences may have been influenced by their varied and growing expectations of the health-care system. 26 The health ministry is revising the star rating tool to put less emphasis on inputs in future. Finally, the collection of star rating data by health workers affiliated with district health management teams had two consequences: (i) we were unable to disentangle the variance due to district council administration from the variance due to data collectors; and (ii) we were unable to tell if some council health management teams had inflated their ratings during reassessment to show greater improvements.

Table 3 notes (partial):
… Table 2 with data available for all facilities.
d The full model included inner and outer setting variables listed in Table 2 with data available for all facilities.
e The natural log of this variable was used in the analytical models.
f Distance was expressed in 10-km units.
g Spatial lag terms at facility and district levels represent associations between improvement in a facility's star rating and improvements in nearby facilities and adjacent districts, respectively.
h With the null model, the proportion of the variance associated with districts was 20% (i.e. …

Despite these limitations, our study has implications for policy. First, our findings suggest that the data feedback strategy did not operate simply at the micro level, where only facility-level characteristics influence quality improvements. Instead, meso- and macrolevel factors, such as district council administration, peer learning and pressure from neighbouring facilities, were also important. Health system interventions that target districts could be explicitly designed to take advantage of the influence of health facilities and networks within those districts. 27 In the next star rating assessment in the country, facilities will be given certificates to post publicly to promote social accountability and encourage peers. Second, the strong association we found between quality improvement and baseline performance indicates there was a strong floor effect, or regression to the mean: most improvements occurred in low-performing facilities. When many facilities have one or two stars, new strategies may be required to promote further improvement. Finally, the difference in improvement between facilities ineligible for, and enrolled in, a results-based financing programme indicates that incentives for such financing could be redesigned. 28