Can patients improve the quality of care they receive? Experimental evidence from Senegal

Highlights
• We use standardised patients to test if quality of care increases when patients share more information in rural Senegal.
• Providers are 27% more likely to correctly manage patients who volunteer more information at the start of the consultation.
• Low performance when patients share less information is not due to providers’ lack of knowledge or clinical skills.
• Instead, low motivation may limit providers’ ability to adapt their effort to patients’ inputs in the consultation.
• Encouraging patients to be more active in consultations could improve the quality of healthcare in low-income settings.

SPs were asked to indicate how much they paid for the consultation. Finally, the debriefing questionnaire included a section on drugs dispensed or prescribed. This section was completed by team leaders, who were trained health professionals (nurses and midwives), based on information retrieved from prescriptions given to SPs or drugs dispensed by providers. All drugs dispensed and all prescriptions given to SPs were photographed. Team leaders listed all of the drugs that were given or prescribed. The same was done for any tests or follow-up exams.

SP case overview
SPs were given a detailed script that documents their personal and medical history, which is shown below.
• Personal history
Your job: You work on building sites in Dakar as a mechanic. When you do that, you live in a room with other builders (8 per room).
You have just come back after working in Dakar for 2 months, repairing cars. You were filling in for somebody and now you are waiting for the next building site to start.
How you arrived in the local area: In the village, you live with a cousin whom you wanted to visit because the building site you were working on in Dakar is completed.

• Medical history
Your health in general: You don't drink, you don't smoke, and you have not had any specific health problems in the past (except for malaria, which you contracted several times). Everyone in your family is generally in good health: both your parents are alive and in good health, and your brothers and sisters are also in good health.
The problem that brings you to the facility: In the past 2 weeks, you have not done much because you felt very tired, you did not really have an appetite and you coughed a lot. For about a week now, you wake up at night, sweating. You thought that this would pass on its own, which is why you did not go to see a doctor before, but your wife told you to come because you were not getting better.

• Standardised answers to questions
SPs were trained to give standardised answers to a list of questions that providers were likely to ask. During the training workshop, SPs were taught to provide exactly these responses. SPs were also taught to improvise in cases where providers asked unexpected questions. Whenever SPs were asked about a symptom or problem not listed below, they indicated that they did not have the symptom or did not experience the problem.
Did you consult someone else (traditional healer, another provider) before?
No
How do you live?
In Dakar, I share a room with 8 people. In the village, I live with my family.
Is the place you live well aerated?
Not in Dakar, but in the village.

Data
We use three measures of provider effort: an index score based on IRT, the proportion of relevant history questions asked and physical examinations performed, and the duration of the consultation.
The IRT score and the simple score capturing the proportion of items done are both based on the history questions and physical examinations shown below. As providers in the high-information (treatment) group are told by patients that they have blood in their sputum and have lost weight, these two items are not used when calculating effort for these providers.
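As an illustration of the simple proportion score described above, the following minimal sketch computes the share of checklist items completed and drops the two manipulated items for providers in the high-information group. The item names and the example consultation are hypothetical placeholders, not the checklist used in the study.

```python
from typing import Dict

# Hypothetical names for the two items volunteered by high-information SPs.
MANIPULATED_ITEMS = {"blood_in_sputum", "weight_loss"}


def effort_proportion(items_done: Dict[str, bool], high_information: bool) -> float:
    """Share of checklist items completed by the provider.

    For the high-information (treatment) group, the manipulated items are
    excluded from both the numerator and the denominator.
    """
    excluded = MANIPULATED_ITEMS if high_information else set()
    scored = {name: done for name, done in items_done.items() if name not in excluded}
    if not scored:
        raise ValueError("no scorable items")
    return sum(scored.values()) / len(scored)


# Hypothetical consultation record: True means the provider did the item.
consultation = {
    "cough_duration": True,
    "fever": False,
    "blood_in_sputum": False,
    "weight_loss": False,
    "chest_auscultation": True,
    "temperature": False,
}

control_score = effort_proportion(consultation, high_information=False)   # 2 of 6 items
treatment_score = effort_proportion(consultation, high_information=True)  # 2 of 4 items
```

The same record yields different scores in the two groups only because the denominator changes, which is why the manipulated items are dropped rather than counted as not done.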

Methods
Item response theory (IRT), first applied in the context of psychometrics and educational testing, is a model-based approach to developing and scoring tests (or checklists) for estimating latent individual traits (such as ability, or in this case, effort). Using maximum likelihood methods, the IRT score assigns greater weight to items that are more difficult (less likely to be completed) and that discriminate better among providers in terms of their unobserved effort (Das and Hammer 2005).
In this application, we employ a two-parameter ('2pl') logistic function for modelling the probabilities that providers successfully complete individual checklist items, given their unobserved effort level, using the irt 2pl command in STATA (STATA, 2021). The two-parameter approach allows for variation in both difficulty and discrimination levels. In our analysis, items (history questions and physical examinations) are binary, in the sense that they are either correct (1 = completed by the provider) or incorrect (0 = not completed by the provider). Items are therefore allowed to differ in how well they discriminate among providers as well as in their level of difficulty.
In constructing the IRT score, we separately run the analysis for providers in the treatment and the control group. We exclude the items that were manipulated in the experiment for providers in the treatment group, to avoid unnecessarily rewarding (or punishing) providers for (not) asking for information they already received from patients. All other items listed in Appendix A2 are included.
As described in further detail in STATA (2021), we let $Y_{ij}$ denote the yet-to-be-observed outcome for item $i$ from person $j$, and $y_{ij}$ the observed value of $Y_{ij}$. We refer to $y_{ij} = 1$ as correct and to $y_{ij} = 0$ as incorrect.
Based on the IRT parameterisation, the probability of person $j$ with latent trait level $\theta_j$ (clinical competence) providing a correct response to item $i$ is given by (STATA, 2021, p.37):
$$\Pr(Y_{ij} = 1 \mid a_i, b_i, \theta_j) = \frac{\exp\{a_i(\theta_j - b_i)\}}{1 + \exp\{a_i(\theta_j - b_i)\}}$$
Here, $a_i$ refers to the discrimination and $b_i$ to the difficulty of item $i$. To identify these parameters, we fit a model using the slope-intercept form (STATA, 2021, p.37).
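To make the link between the two parameterisations concrete, the sketch below evaluates the two-parameter logistic probability in both forms. It assumes the standard relationship between them: a slope alpha and intercept beta correspond to discrimination a = alpha and difficulty b = -beta / alpha. The function names and numerical values are illustrative only and are not estimates from the study.

```python
import math
from typing import Tuple


def p_correct_irt(theta: float, a: float, b: float) -> float:
    """2PL probability in the IRT parameterisation: logistic of a * (theta - b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_correct_slope_intercept(theta: float, alpha: float, beta: float) -> float:
    """2PL probability in slope-intercept form: logistic of alpha * theta + beta."""
    return 1.0 / (1.0 + math.exp(-(alpha * theta + beta)))


def to_irt_parameters(alpha: float, beta: float) -> Tuple[float, float]:
    """Convert slope and intercept to discrimination a and difficulty b."""
    if alpha == 0:
        raise ValueError("difficulty is undefined when the slope is zero")
    return alpha, -beta / alpha


# Illustrative values: slope 1.5 and intercept -0.75 imply a = 1.5, b = 0.5.
a, b = to_irt_parameters(alpha=1.5, beta=-0.75)

# A provider whose latent trait equals the item's difficulty completes it
# with probability one half, whatever the discrimination.
p_at_difficulty = p_correct_irt(theta=b, a=a, b=b)
```

A higher discrimination makes the probability rise more steeply around the difficulty level, so such an item separates providers just below and just above that level more sharply.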

Results
The table below shows output from IRT analysis for the low-information group.

Restricting analysis to most discriminating items
We construct an alternative version of Figure 3 in the main text, where we re-measure effort as the proportion of items done by providers, including only the most discriminatory 50% of items identified through IRT analysis presented above. We regress the quality of case management on provider effort, separately for patients in the treatment and control group and plot predicted probabilities.
The left-hand graph in Figure A2 measures effort as the proportion of items done by providers, including only the most discriminatory half of items, identified through IRT analysis. The right-hand graph in Figure A2 uses the duration of the consultation.
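The restriction to the most discriminating half of items can be sketched as follows. This is a minimal illustration under stated assumptions: the discrimination values and item names are made up, and rounding up when the number of items is odd is our choice rather than something stated in the text.

```python
import math
from typing import Dict, List


def most_discriminating(discrimination: Dict[str, float], share: float = 0.5) -> List[str]:
    """Return the top `share` of items, ranked by estimated discrimination."""
    ranked = sorted(discrimination, key=lambda item: discrimination[item], reverse=True)
    keep = math.ceil(len(ranked) * share)  # assumption: round up for odd counts
    return ranked[:keep]


def restricted_effort(items_done: Dict[str, bool], kept_items: List[str]) -> float:
    """Proportion of the retained items that the provider completed."""
    return sum(items_done[item] for item in kept_items) / len(kept_items)


# Hypothetical discrimination estimates for four checklist items.
discrimination = {
    "cough_duration": 1.8,
    "fever": 0.6,
    "night_sweats": 1.2,
    "temperature": 0.4,
}
top_items = most_discriminating(discrimination)

# Hypothetical consultation: the provider did three of the four items.
items_done = {
    "cough_duration": True,
    "fever": True,
    "night_sweats": False,
    "temperature": True,
}
effort_all = sum(items_done.values()) / len(items_done)
effort_top = restricted_effort(items_done, top_items)
```

In this example the provider scores 0.75 on the full checklist but 0.5 on the restricted one, because one of the items skipped is among the most discriminating.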
The items identified as most discriminating by the IRT analysis for the low-information group are:

A.4 Measuring structural quality
We create a simple aggregate score of structural quality that indicates the proportion of essential drugs and equipment available in facilities. The list of essential drugs and equipment is shown below.
Notes: Results from OLS regressions are reported. All models control for facility and provider characteristics (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). *** p<0.01, ** p<0.05, * p<0.1

A.7 Vignette experiment
The 119 healthcare workers who indicated that they conduct curative adult consultations were asked to complete the clinical vignette task. Providers were randomised, based on a random draw, to assess a hypothetical patient using either the generic introduction (low-information vignette, n=67) or the detailed introduction (high-information vignette, n=52). As shown in Table A5 below, we find that the randomisation was successful, as providers in the two groups are balanced on observed characteristics.
Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). The IRT score has a mean of zero and a standard deviation of one. *** p<0.01, ** p<0.05, * p<0.1
Notes: Results from OLS regressions are reported. All models control for facility and provider characteristics (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). Provider knowledge is measured in terms of whether providers correctly managed an identical case of TB in the vignette task, as well as a competence index representing the weighted proportion of clinically indicated history questions asked and examinations performed in the vignette (aggregated using item response theory). *** p<0.01, ** p<0.05, * p<0.1
Notes: Results from OLS regressions with robust standard errors are reported. Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). All models control for SP fixed effects.

A.8 Heterogeneous effects
The IRT score has a mean of zero and a standard deviation of one. *** p<0.01, ** p<0.05, * p<0.1.
Notes: Results from OLS regressions with robust standard errors are reported. Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). All models control for SP fixed effects.
The IRT score has a mean of zero and a standard deviation of one. *** p<0.01, ** p<0.05, * p<0.1.

A.9 Robustness checks
A.9.1 Alternative mechanisms
Notes: Results from OLS regressions are reported. The outcome in all models is the probability of correct case management. All models control for SP fixed effects. Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). Models 2 and 3 control for the number of patients waiting when SPs arrived and for whether providers had been informed of SP visits beforehand. *** p<0.01, ** p<0.05, * p<0.1
Notes: Results from OLS regressions with robust standard errors are reported. The outcome in all models is provider effort (IRT score). Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). The IRT score has a mean of zero and a standard deviation of one. *** p<0.01, ** p<0.05, * p<0.1
Notes: Results from OLS regressions with robust standard errors are reported. The outcome in all models is the duration of the consultation. Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). *** p<0.01, ** p<0.05, * p<0.1
Notes: Results from OLS regressions with robust standard errors are reported. The outcome in all models is provider effort (IRT score). Controls were selected using Post-Double-Selection (PDS) LASSO. The PDS LASSO procedure did not select any controls. *** p<0.01, ** p<0.05, * p<0.1
Notes: Models show log-odds after Firth logistic regressions. The outcome in all models is the probability of correct case management. All models control for facility and provider characteristics (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). All models control for SP fixed effects. Models 2 and 3 control for the number of patients waiting when SPs arrived and for whether providers had been informed of SP visits beforehand. *** p<0.01, ** p<0.05, * p<0.1.
194 Observations 197 197 191
Notes: Marginal effects from probit regressions are reported. The outcome in all models is the probability of correct case management. Facility and provider characteristics are as shown in Table 1 (type of facility, target population, competition, distance to higher-level facility, facility participation in a results-based financing scheme, proportion of essential drugs and equipment available; provider gender, skill, experience, training). All models control for SP fixed effects. Models 2 and 3 control for the number of patients waiting when SPs arrived and for whether providers had been informed of SP visits beforehand. *** p<0.01, ** p<0.05, * p<0.1