Paying for efficiency: Incentivising same-day discharges in the English NHS.

We study a pay-for-efficiency scheme that encourages hospitals to admit and discharge patients on the same calendar day when clinically appropriate. Since 2010, hospitals in the English NHS have been incentivised by a higher price for patients treated as same-day discharges than for those who stay overnight, despite the former being less costly. We analyse administrative data for patients treated during 2006-2014 for 191 conditions for which same-day discharge is clinically appropriate, of which 32 are incentivised. Using difference-in-difference and synthetic control methods, we find that the policy generally had a positive impact, with a statistically significant effect in 14 out of the 32 conditions. The median elasticity is 0.24 for planned and 0.01 for emergency conditions. Condition-specific design features explain some, but not all, of the differential responses.


Introduction
Many healthcare systems reimburse hospitals through prospective payment systems (PPS) in which the price for a defined unit of activity, such as a Diagnosis Related Group (DRG), is set in advance and is equal across hospitals (Paris et al. 2010). Economic theory predicts that hospitals will expand activity in areas where price exceeds marginal costs and minimise activity in areas where they stand to make a loss. 1 This form of reimbursement should encourage hospitals to engage in efficient care processes and cost reduction strategies to improve profit margins (Shleifer 1985; Ellis and McGuire 1986; Ma 1994; Hodgkin and McGuire 1994).
One way to reduce costs is by reducing length of stay (LoS), this being an important cost driver. For some patients it may be possible to reduce overnight stays to zero, specifically those for whom care can be provided safely 2 within a setting in which patients are admitted, treated and discharged on the same day ('same day discharge' (SDD)). Not only may an SDD be less costly to provide, it might also be beneficial to some patients if they can recover in the comfort of their own home and are less exposed to potentially infectious hospital environments. Increasing SDDs for these patients generates a welfare improvement driven by lower provider costs and unaltered or improved health benefits for patients. The British Association of Day Surgery (BADS) (2006) has recommended the adoption of SDD for 157 types of planned surgery and the British Association for Ambulatory Emergency Care (BAAEC) (2014) has identified a range of 34 conditions that require urgent care but where a subsequent overnight stay for observation is generally considered unnecessary. Implementing these recommendations is also in the financial interest of hospitals reimbursed according to the English form of PPS, which pays the same amount for SDD admissions and for admissions with an overnight hospital stay, despite the cost of providing SDD care being lower (Street and Maynard 2007). 3 Therefore, hospitals can improve profits by increasing the proportion of patients treated on an SDD basis rather than keeping them in hospital overnight.
Despite these recommendations and financial incentives, SDD rates are lower than is clinically recommended for a wide range of conditions (Department of Health 2009) (see also Figure 1). The reasons for these low rates may relate to financial constraints on hospitals that limit their ability to invest in dedicated same-day facilities or reluctance by doctors to change established working practices. One way to encourage hospitals and doctors to increase uptake of SDD care is to increase the SDD price. This has been the approach taken in England under a payment reform known as the SDD bonus policy (Monitor & NHS England 2014). Hospitals receive an SDD bonus on top of the base DRG price for treating a patient as an SDD compared to an overnight admission. Starting in 2010, the reform has been progressively applied to 32 different conditions.
1 (Semi-)altruistic providers may be willing to treat patients for which marginal costs exceed price as long as the financial losses are offset by sufficient patient benefit. The extent to which this is possible depends on the potential for cross-subsidisation within the organisation, and whether they face a soft budget constraint (Brekke et al. 2015).
2 As early as 1985, the Royal College of Surgeons of England noted that "it should be clear to all concerned, the surgeon, the nursing staff, and in particular the patient, that day-surgery is in no way inferior to conventional admission for those procedures for which it is appropriate, indeed it is better." (Royal College of Surgeons of England 1985).
3 For example, in 2013/14 the average cost of planned surgery carried out as a day case in the English NHS was £698 compared to the average cost of £3,375 for overnight stays (https://www.kingsfund.org.uk/blog/2015/07/day-case-surgery-good-news-story-nhs).
Our analysis of this policy reform makes two main contributions to the literature. First, it contributes to our understanding of economic incentives in the health sector by exploiting unique features of the SDD policy that relate to the economic importance of the bonus and the focus on efficiency (as opposed to other dimensions such as quality or overall volume). It is designed to incentivise technical efficiency, by paying hospitals extra to reduce length of stay and use of care inputs, such as staff time and hospital beds, by shifting care delivery from more expensive overnight wards to less costly same day settings. A distinctive feature of the SDD bonus policy is that the incentive scheme is high-powered, in that it pays more for the less costly SDD treatment. This contrasts with the common form of PPS in which prices are set at average cost (Shleifer 1985), either pooled across SDD and overnight stay (e.g. as in England), or separately for each admission type (e.g. as in Norway where the price is lower for an SDD than an overnight stay in line with the different average costs). In England, the cost advantage varies across the 32 conditions from 23% to 71% lower for SDD than for an overnight hospital stay in the pre-policy period. The SDD bonus compounds this advantage and is also economically significant, varying from 8% to 66% more than for an overnight stay. We are able to exploit this heterogeneity in the size of the incentive to assess whether it predicts changes in behaviour.
We also contribute to analytical studies that employ relatively new synthetic control (SC) methods and compare these to more traditional difference-in-difference (DID) methods. To evaluate the effectiveness of the policy we exploit the fact that incentives have been applied to 32 conditions, using non-incentivised conditions as control groups. SC methods are a potentially useful addition to the analytical armoury in situations where it is possible to draw on a large number of potential control groups. Following the pioneering work by Abadie and Gardeazabal (2003) and Abadie et al. (2010), SC methods are receiving increasing attention in the wider economic literature (Billmeier and Nannicini 2013; Bharadwaj et al. 2014; Green et al. 2014; Acemoglu et al. 2017). Within health economics, SC methods have been applied to study the effect of co-payments (Olsen and Melberg 2018), tax incentives (Fletcher et al. 2015; Bilgel and Galle 2015), public health interventions such as malaria eradication (Barofsky et al. 2015), and expansion of health insurance (Hu et al. 2018; Hernaes 2018). SC methods have been very rarely applied to provider incentives. We are only aware of one study by Kreif et al. (2016), which applies SC methods to evaluate the effect of a regional pay-for-performance (P4P) scheme in England on mortality rates. These studies all consider a single policy initiative with associated idiosyncrasies, which provides limited evidence on the general applicability of SC methods for policy evaluations typically considered in health economics. In contrast, we evaluate 32 policy variants of a particular payment reform following a common analysis plan (e.g. sample period, unit of assessment, criteria for selecting suitable control groups, etc.). This yields insights into whether DID and SC methods generate consistent conclusions in terms of point estimates and statistical inference under a range of different scenarios.
Our key findings on the effectiveness of the policy are as follows. We find that the policy led to a statistically significant increase in SDD rates of 5 percentage points (pp) for planned conditions and 1pp for emergency conditions. However, there is considerable heterogeneity across conditions, with eight out of 13 planned conditions showing statistically significant positive effects in DID analysis. Estimated effects range from -2 to +22pp changes in SDD rates. Results are more mixed for emergency conditions, where we find that the policy had a statistically significant positive effect on six out of 19 emergency conditions but caused reductions in SDD rates for two conditions. The range of estimated effects is also narrower (-6 to +6pp) and more centred around zero. The median elasticity of SDD rates to price is 0.24 for planned conditions and 0.01 for emergency conditions (overall median = 0.09). Elasticities are larger for conditions with larger post-policy price differences between SDD and overnight care, and, for planned conditions only, with bigger profit margins. In relation to the methods employed, our analysis suggests that DID and SC methods provide similar point estimates when there is a large pool of potential control conditions to choose from, as is the case for planned conditions. However, even in such favourable instances, inference from SC methods is still considerably more conservative, resulting in fewer statistically significant findings than in DID analysis.
Our analysis relates to two strands of the literature within the broader area of hospital incentive schemes (Chandra et al. 2011). First, we contribute to studies that focus on the effect of changes in prices designed to encourage hospitals to reduce LoS. It is well established that PPS encourages reductions in LoS compared to either fee-for-service or global budgeting arrangements, by making hospitals more cost-conscious than the alternative funding regimes. This was examined in pioneering work by Rosko and Broyles (1986), Salkever et al. (1986), Long et al. (1987), Lave and Frank (1990) and others in the US Medicare and Medicaid systems, and has subsequently been confirmed in a range of other countries (e.g. Shmueli et al. 2002; Farrar et al. 2009; Moreno-Serra and Wagstaff 2010; O'Reilly et al. 2012). As well as finding general reductions in LoS, Farrar et al. (2009) estimated that the introduction of PPS in the English NHS led to a 0.4 to 0.8% increase in SDD rates for planned surgery. Much less is known about the ability of payers to influence LoS through deliberate price setting within a PPS arrangement. Shin (2019) exploits the 2005 change in Medicare's definition of payment areas, which generated exogenous area-specific price shocks. The study found that the higher price did not affect volume, LoS or quality of services, but that it induced a shift of patients into higher-paying DRGs. This is in line with Dafny (2005), who found that a 10% increase in price due to the removal of an age criterion in the allocation of patients to DRGs led to upcoding without significant change in LoS. Verzulli et al. (2017) study the effect of a one-time price increase for a subset of DRGs in the Emilia-Romagna region of Italy. They find evidence that hospitals expand the provision of surgery in response to more generous reimbursement but this has no effect on waiting times or LoS. More closely related to our setting, Januleviciute et al. (2016) examine the choice of SDD care versus overnight stay in the Norwegian context, where prices are differentiated by admission type. They find no evidence that hospitals respond to intertemporal variation in the price mark-ups for overnight stays relative to SDD care by changing their discharge practice.
In none of the above-mentioned settings were prices set with the explicit aim to reduce LoS.
A noteworthy exception is the study by Allen et al. (2016), who considered the impact of the SDD bonus policy in England on a single incentivised condition, cholecystectomy, within a DID framework with a control group of all non-incentivised procedures recommended for SDD care.
This study found an increase in SDD rates of 5.8 percentage points in the first 12 months following the policy introduction. As well as comparing DID and SC methods, we extend this earlier analysis to 31 additional conditions, allowing us to examine the generalisability of the previous result and study the determinants of the potentially heterogeneous responses to the SDD bonus. Furthermore, we examine longer-term effects, up to five years after the introduction of the bonus, allowing us to examine whether short-term effects are maintained over time.
Our study also contributes to a second strand of literature evaluating P4P programmes. A recent study reviews 34 hospital sector P4P schemes in high-income countries (Milstein and Schreyögg 2016). Most of the P4P schemes reviewed focus on incentivising quality, either through rewarding health outcomes or process measures of quality, and involve small or moderate bonuses of 5% or less (Cashin et al. 2014). Effects are generally modest in size, short-lived and sometimes associated with unintended consequences. In contrast to the existing P4P literature, the policy we evaluate has two distinct features. First, few P4P schemes incentivise technical efficiency directly, so this study contributes to the small literature on what we label "pay-for-efficiency" (P4E) schemes. Second, the SDD bonus policy is much more high-powered than previous P4P schemes and, therefore, our analysis can shed light on whether limited responsiveness to P4P schemes as documented in the literature is simply due to insufficient financial incentive, as has been hypothesised (Milstein and Schreyögg 2016).
The study is organised as follows. Section 2 provides the institutional background and describes the SDD pricing policy. Section 3 describes the data. Section 4 outlines the empirical methods. Section 5 describes the results. Section 6 is devoted to discussion and concluding remarks.

Institutional background and behavioural predictions
[…] in some cases, other characteristics such as age (Department of Health 2002; Grašič et al. 2015).
Initially limited to a small number of planned conditions, PPS has been extended progressively over time and now covers most hospital activity.
Before the SDD policy was introduced, the HRG payment was the same for both same day and overnight stays across planned treatments 4 . This was not the case for emergency care, where the payment for same day treatments was lower than for overnight stays (to reduce the incentive to admit less severe patients for overnight observation).
4 Hospitals also receive additional per diem payments for each additional night a patient stays in hospital beyond a HRG-specific long-stay trim point. This trim point is set at the 75th percentile plus 1.5x the interquartile range of the LoS distribution in the HRG. Such long-stay adjustments are not relevant to our study since the SDD policy is directed at the low end of the LoS distribution.
From 2010, the English Department of Health has gradually introduced explicit incentives in the form of the SDD bonuses, which give a stronger financial incentive to reduce LoS. For patients allocated to the same HRG, the policy involved increasing the payment for someone treated on an SDD basis, with an offsetting reduction in the base HRG price for those who stay overnight. The difference between these two prices constitutes the SDD bonus. To qualify for the bonus payment, the patient has to be admitted and discharged on the same day.
In addition, for planned treatments, the care has to be scheduled as SDD in advance of admission.
New conditions to be incentivised are announced six months in advance of introduction.
Since the introduction of the SDD bonus policy the price for same day discharge is systematically higher than for overnight stay across the 32 SDD conditions. As an example, in 2010 hospitals were paid £329 (or 24%) more for cholecystectomy (gall bladder removal) provided as SDD (Department of Health 2009). The absolute and relative size of the price differential varies considerably across the 32 incentivised conditions, ranging from 8% to 66% of the overnight admission price. Once introduced, bonus differentials are fairly stable over time 7 . Table 1 provides an overview of the incentivised SDD conditions, the financial year in which the incentive was introduced 8 , the price with and without the SDD incentive, the average cost of care reported by NHS hospitals in the year prior to the policy, as well as the SDD rate and the number of patients eligible in the twelve months prior to announcement of the incentive for that condition.
5 In some cases, additional exclusion criteria are applied to limit the scope of the SDD bonus to non-complex patients. In these cases, the group of patients with incentivised prices attached is a subset of those given in relevant directories and recommended rates can be considered a lower bound of what is clinically appropriate.
6 An exception is 'simple mastectomy' which has been incentivised since 2011 despite an annual volume of about 4,000 patients.
Notice that in the pre-policy period hospitals already had a financial incentive to treat planned patients as SDD up to the recommended rate (RR), given that the cost of SDD is nearly always lower than the cost of an overnight stay. But as shown below in Section 3, hospitals had very low planned SDD rates in the pre-policy period, always well below the RR. This could be due to the motivations of the doctor providing treatment or the constraining features of the hospital in which the doctor works, which we discuss in turn.
As regards low motivation, slow uptake of SDD may reflect poor dissemination about best practice. Doctors may have established practices and be reluctant to engage in disruptive innovations or simply may not be aware of or doubt the evidence that SDD is as safe as traditional practice involving overnight admission for the conditions concerned. They may also struggle to identify the patient population that is suitable for SDD, particularly if it is not recommended for all patients, i.e. RR < 100%. Greater uptake of SDD may also require some re-training (e.g. in laparoscopic surgical techniques) that carries monetary and time costs for doctors.
The hospital in which the doctor works may be constrained in its ability to extend SDD to more patients. To a limited extent, SDD treatments can be offered in a normal hospital setting.
However, scaling-up the provision of SDD treatment requires dedicated physical space and facilities.
The hospital may have to invest in a dedicated facility, either by opening up new buildings or by engaging in re-organisation of existing wards. This would involve fixed costs which would be justifiable to senior managers only if it offers the prospect of long-term financial returns. Hospitals may not undertake this investment, particularly if they face borrowing constraints that restrict their access to capital funds (Marini et al. 2008; Thompson and McKee 2011). Moreover, managers faced with the various day-to-day issues of running a hospital may find it difficult to allocate the necessary time and resources to engage in more strategic re-organisations. Paying a bonus for activity conducted on an SDD basis may be sufficient to overcome both clinical and managerial resistance.
7 The bonus as a percentage of base price changed by more than 5% from introduction to the financial year 2014/15 for six out of 32 SDD conditions. This variation arises due to changes to the base price that reflects year-on-year variation in the reported cost data used for price setting rather than because of purposeful policy refinement.
8 Financial years run from 1st April to 31st March of the following calendar year. If incentive applied to more than one HRG within a condition, the price and cost information shown are weighted averages according to volume. Pre- and post-policy refer to the 12 months before or after the policy start, respectively. The pre-policy SDD rate is calculated in the 12 months prior to the policy announcement and therefore not affected by anticipatory effects.
More formally, denote the pre-policy period with $\alpha = 0$ and the post-policy period with $\alpha = 1$. The price for a HRG $g$ in year $k$ in the pre-policy period, $P_{0,k,g}$, is proportional to the average cost of care reported across all English NHS hospitals for patients (admitted as planned or emergency) who were treated three years before,
$$\bar{C}_{k-3,g} = \frac{\sum_{j=1}^{J} N_{k-3,j,g}\, C_{k-3,j,g}}{\sum_{j=1}^{J} N_{k-3,j,g}},$$
where $j = 1, \dots, J$ indexes hospitals, $N_{k-3,j,g}$ is the number of patients in hospital $j$, and $C_{k-3,j,g}$ is the average cost of patients in hospital $j$ 9 . Prices are further adjusted to account for inflation ($I$) and expected general technical efficiency improvement ($E$) factors 10 . Therefore, the pre-policy price is $P_{0,k,g} = \bar{C}_{k-3,g} \times I_k \times E_k$ with $I_k > 1$ and $E_k < 1$. For most planned treatments, hospitals are paid the same for patients admitted and discharged on the same day (SDD) or overnight stays (ON). Therefore, $P_{0,k,g} = P^{SDD}_{0,k,g} = P^{ON}_{0,k,g}$ if treatment is planned. However, a short-stay adjustment is applied to patients admitted as an emergency and discharged on the same day. The adjustment takes the form of a factor $0 \leq \lambda \leq 1$ which takes the value 1 if the national average length of stay for the HRG is less than or equal to two nights and increasingly smaller values as average length of stay increases. Therefore, emergency care including at least one overnight stay has a price constructed equivalently to planned care, $P^{ON}_{0,k,g} = P_{0,k,g}$, while $P^{SDD}_{0,k,g} = \lambda P_{0,k,g}$.
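The price construction described above can be illustrated with a short sketch. It assumes that the national average cost is a volume-weighted mean of hospital-level average costs; the function names and all input values are hypothetical and chosen purely for illustration.

```python
def base_price(volumes, avg_costs, inflation, efficiency):
    """Pre-policy HRG price: volume-weighted mean cost times I and E.

    Assumption: hospital average costs are weighted by patient volume.
    """
    total_patients = sum(volumes)
    mean_cost = sum(n * c for n, c in zip(volumes, avg_costs)) / total_patients
    return mean_cost * inflation * efficiency


def emergency_sdd_price(price, short_stay_factor):
    """Pre-policy emergency SDD price: lambda * P, with 0 <= lambda <= 1."""
    if not 0 <= short_stay_factor <= 1:
        raise ValueError("short_stay_factor must lie in [0, 1]")
    return short_stay_factor * price


# Hypothetical inputs: two hospitals reporting costs for year k-3.
price = base_price(volumes=[100, 300], avg_costs=[1200.0, 1000.0],
                   inflation=1.04, efficiency=0.96)
sdd_price = emergency_sdd_price(price, short_stay_factor=0.5)
```

In this example the larger hospital dominates the weighted average, and the emergency SDD price is half the overnight price because the hypothetical short-stay factor is 0.5.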
We compare the financial incentives that hospitals faced before and after the policy. To keep the presentation simple, we suppress the HRG and year notation (g and k) and also assume that (i) each hospital has a total volume of patients treated (either as SDD or overnight) equal to N and that this is constant over time, (ii) each hospital has identical costs, therefore also suppressing j, but average costs can vary over time before and after the policy (for example as a result of the change in case-mix arising from a change in the proportion of patients treated as overnight admission).
In summary, the price pre-policy is $P_0$ and the price post-policy is $P^{SDD}_1$ for a same-day discharge and $P^{ON}_1$ for an overnight stay. Hospital incentives are driven not only by differences in prices but also by differences in costs. Define $C^{ON}_0$ and $C^{SDD}_0$ as, respectively, the average cost of an overnight stay and of a same-day discharge in the pre-policy period (and $C^{ON}_1$, $C^{SDD}_1$ in the post-policy period).
9 All NHS hospitals provide detailed reference cost information to the Department of Health on an annual basis. These data are collated in the reference cost schedule and provide information on the average cost of production across hospitals, further broken down by admission type. 10 The base price is further adjusted for hospital-specific factors such as local cost of capital and labour and specialist hospital status. As the policy evaluated is national and applies equally to all hospitals, these hospital-specific adjustments do not affect the incentives created.
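Let $N^{SDD}_0$ and $N^{SDD}_1$ denote the number of patients treated as SDD in the pre-policy and the post-policy period. The profit function for planned SDD activity, denoted $\pi$, in the pre-policy and the post-policy period is given respectively by
$$\pi_0 = P_0 N - C^{SDD}_0 N^{SDD}_0 - C^{ON}_0 (N - N^{SDD}_0),$$
$$\pi_1 = P^{SDD}_1 N^{SDD}_1 + P^{ON}_1 (N - N^{SDD}_1) - C^{SDD}_1 N^{SDD}_1 - C^{ON}_1 (N - N^{SDD}_1),$$
and the difference in profit before and after the policy is:
$$\pi_1 - \pi_0 = (P^{SDD}_1 - P_0) N^{SDD}_1 - (P_0 - P^{ON}_1)(N - N^{SDD}_1) + (C^{ON}_0 - C^{SDD}_0)(N^{SDD}_1 - N^{SDD}_0) - \left[ (C^{SDD}_1 - C^{SDD}_0) N^{SDD}_1 + (C^{ON}_1 - C^{ON}_0)(N - N^{SDD}_1) \right]. \tag{3}$$
Under the assumptions outlined above, the first term is positive and gives the additional revenues for every treatment which is provided as SDD. The second term is negative and is given by the reduction in revenues due to a reduction in the overnight price. The third term is positive if the SDD price induces an increase in the SDD rate, which is less costly (evaluated at pre-policy costs).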
The fourth and last term, in square brackets, relates to changes in the average costs, which can be due to patient composition or external factors, the sign being generally indeterminate. We could argue, for example, that patients who are treated as SDD after the policy are at the margin more severe, so that this will translate into an increase in the average cost of SDD and a reduction in the average cost of an overnight stay (see Siciliani (2006) and Hafsteinsdottir and Siciliani (2010) for more formal theoretical models). However, we assume that the increase in average costs for SDD is relatively small, so that an increase in SDD rates leads to a reduction in overall costs (i.e. the sum of the third and fourth term is positive).
The analysis highlights that the SDD pricing policy generates a financial incentive for hospitals, $P^{SDD}_1 - P^{ON}_1 > 0$, to increase planned SDD treatments, but the overall effect on profits also depends on the reduction in the base price. A similar analysis holds for emergency care where the only difference is that pre-policy the price was higher for overnight treatments, i.e. $P^{ON}_0 > P^{SDD}_0$. Differentiating equation 3 with respect to the number of SDD treatments, $N^{SDD}_1$, we obtain the financial incentive to treat an additional patient as an SDD. This is given by $(P^{SDD}_1 - P^{ON}_1) + (C^{ON}_1 - C^{SDD}_1)$, which is always positive whenever the cost of SDD activity is lower than the cost of an overnight admission. The expression suggests that, potentially, hospitals have a strong financial incentive to increase the number of SDD patients.
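As a numerical illustration of this argument, the sketch below compares pre- and post-policy profit for a single hypothetical hospital and HRG. The function names and all prices, costs and volumes are assumptions made up for the example; only the structure follows the text (a common pre-policy price, a higher SDD price and lower overnight price post-policy, and SDD being cheaper to provide).

```python
def profit(p_sdd, p_on, c_sdd, c_on, n_sdd, n_total):
    """Profit = (price - average cost) summed over SDD and overnight patients."""
    n_on = n_total - n_sdd
    return (p_sdd - c_sdd) * n_sdd + (p_on - c_on) * n_on


def marginal_incentive(p_sdd, p_on, c_sdd, c_on):
    """Profit gain from treating one more patient as SDD rather than overnight,
    holding average costs fixed."""
    return (p_sdd - p_on) + (c_on - c_sdd)


# Hypothetical values (not taken from the paper).
N = 1000
pre = profit(p_sdd=1000, p_on=1000, c_sdd=700, c_on=1100, n_sdd=300, n_total=N)
post = profit(p_sdd=1200, p_on=950, c_sdd=720, c_on=1120, n_sdd=450, n_total=N)
gain_per_switch = marginal_incentive(p_sdd=1200, p_on=950, c_sdd=720, c_on=1120)
```

With these made-up numbers the hospital gains from the policy overall, and each additional patient moved from an overnight stay to SDD adds the price differential plus the cost saving to profit.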

Data
We use data from Hospital Episode Statistics (HES) […] the potential for growth. Observed SDD rates may change over time due to unrelated changes in medical technology which facilitate SDD treatment for specific subpopulations of patients. To account for this, we apply an indirect standardisation approach to calculate risk-adjusted quarterly rates of SDD for each hospital and condition in our dataset, holding the relationship between patient characteristics and the probability of SDD constant over time. We construct a set of risk-adjustment variables from HES including patient age (coded as a categorical variable in 10-year bands with separate categories for 19-24 and >85), gender (male = 1), number of Elixhauser comorbidities (coded as 0, 1, 2-3, 4-6 and 7+) (Elixhauser et al. 1998) and whether the patient had any past emergency admissions within 365 days (yes = 1). As a measure of socio-economic status, we use the income deprivation score of […]
[…] where $Y_i$ is a binary indicator that takes the value of one if the patient was admitted and discharged on the same calendar day. As our primary concern is changes in the risk relationship over time that are common to all hospitals, we do not include hospital fixed effects in this equation.
The predicted probabilities $\hat{Y}_{ijt}$ for patients $i$ in hospital $j$ in quarter $t$ are then used to derive the risk-adjusted hospital-quarter rate $\hat{Y}_{jt}$ […]. Equations 4 and 5 are estimated separately for each of the 191 conditions in our sample. Note that, as long as the same case-mix adjustment model is used for all periods, our choice of Quarter 2 (April-June) 2006 as the base quarter is arbitrary. Further, since the prediction model for $\hat{Y}_{jt}$ is based on large numbers of patients, we can safely ignore sampling uncertainty in parameter estimates used to adjust for case-mix differences.
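The logic of indirect standardisation can be sketched as follows. This is a simplified illustration under stated assumptions, not the estimation used in the paper: the logit model is replaced by stratum-specific SDD rates from the base period, and the risk-adjusted rate is assumed to take the common form (observed / expected) × reference rate. All names and data are hypothetical.

```python
from collections import defaultdict


def base_rates(base_patients):
    """Stratum-specific SDD rates in the base period.

    Each patient is a (stratum, sdd) pair with sdd equal to 0 or 1.
    Stands in for the predicted probabilities of a risk model.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_sdd, n_total]
    for stratum, sdd in base_patients:
        counts[stratum][0] += sdd
        counts[stratum][1] += 1
    return {s: n_sdd / n for s, (n_sdd, n) in counts.items()}


def risk_adjusted_rate(patients, rates, reference_rate):
    """Indirectly standardised rate: (observed / expected) * reference rate."""
    observed = sum(sdd for _, sdd in patients)
    expected = sum(rates[stratum] for stratum, _ in patients)
    return observed / expected * reference_rate


# Hypothetical base-period patients: (risk stratum, same-day discharge).
base = [("low", 1), ("low", 1), ("low", 1), ("low", 0),
        ("high", 1), ("high", 0), ("high", 0), ("high", 0)]
rates = base_rates(base)
reference = sum(sdd for _, sdd in base) / len(base)

# Hypothetical hospital-quarter with a high-risk case mix.
hospital_quarter = [("high", 1), ("high", 0), ("high", 0), ("low", 1)]
adjusted = risk_adjusted_rate(hospital_quarter, rates, reference)
```

Here the hospital's crude SDD rate is 50%, but because its case mix is weighted towards high-risk patients it achieves more same-day discharges than expected, so its adjusted rate exceeds the reference rate.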
Hospitals are consulted on any changes to the payment system (including the introduction of SDD bonuses applied to other conditions) approximately six months prior to the change. This gives them time to adapt to the new policy before the actual implementation, which may bias observed pre-policy rates. We therefore exclude data for the six months prior to the condition being incentivised. For some conditions, eligibility criteria were refined over time to restrict the incentive to a more tightly defined patient population, in which case we apply the criteria that were valid when the financial incentive first applied to ensure consistency throughout the study period.
The overall sample includes 11,336,138 patients with incentivised conditions and 21,121,500 patients with non-incentivised conditions. Descriptive statistics for case-mix variables by incentivised condition are available in Table A1 in the Appendix. Each hospital is observed for up to 34 quarters per condition. The number of hospital-quarter observations varies across the incentivised conditions and ranges from 3,022 (#5 Endoscopic prostate resection) to 9,245 (#7 Hernia repair).
12 We use a logit regression model to avoid predicting outside the probability range of 0 to 1. This is less of an issue when drawing inference about DID regression coefficients as described in section 4.1.

Methods
Our empirical analysis seeks to estimate the causal effect of the SDD bonus policy on the probability that a patient admitted with an incentivised condition is discharged on the same day as admission 13 .
We perform separate analyses for each of the 32 incentivised conditions. For each incentivised condition, we estimate DID and SC models, both of which aim to control for common exogenous shocks and underlying time trends by means of a comparison with a control condition. We consider as potential control conditions all non-incentivised conditions from the BADS / BAAEC directories that: (i) follow the same admission pathway (planned or emergency); (ii) have an RR within ±15pp of that of the incentivised condition to avoid differential ceiling effects 14 ; (iii) have SDD rates that are no more than 30pp apart from that of the incentivised condition at the start of our sample period (Q2 2006); and (iv) have at least, on average, 300 admissions per quarter over the pre-policy period.
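The four eligibility criteria amount to a simple filter over candidate conditions, sketched below. The `Condition` record, its field names and the example conditions are hypothetical; the thresholds are those listed in criteria (i) to (iv).

```python
from dataclasses import dataclass


@dataclass
class Condition:
    name: str
    pathway: str                      # "planned" or "emergency"
    rr: float                         # recommended SDD rate, in percent
    sdd_rate_q2_2006: float           # SDD rate at start of sample, in percent
    mean_quarterly_admissions: float  # average over the pre-policy period


def eligible_controls(treated, candidates):
    """Return the candidate controls that satisfy criteria (i) to (iv)."""
    return [
        c for c in candidates
        if c.pathway == treated.pathway                                # (i)
        and abs(c.rr - treated.rr) <= 15                               # (ii)
        and abs(c.sdd_rate_q2_2006 - treated.sdd_rate_q2_2006) <= 30   # (iii)
        and c.mean_quarterly_admissions >= 300                         # (iv)
    ]


# Hypothetical incentivised condition and candidate pool.
treated = Condition("incentivised", "planned", 60, 20, 2000)
pool = [
    Condition("A", "planned", 70, 35, 500),     # passes all criteria
    Condition("B", "emergency", 60, 20, 900),   # fails (i)
    Condition("C", "planned", 90, 25, 800),     # fails (ii)
    Condition("D", "planned", 55, 60, 1200),    # fails (iii)
    Condition("E", "planned", 65, 15, 120),     # fails (iv)
]
controls = eligible_controls(treated, pool)
```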

Difference-in-difference analysis
Our DID approach relies on selecting, for each incentivised condition, a single control condition that is not affected by the SDD bonus policy and that satisfies the parallel trends assumption, i.e. it responds similarly to the same external influences. If more than one potential control condition satisfies these requirements, we select the one which minimises the difference in trends in the proportion of SDDs prior to the introduction of the pricing policy (i.e. matching on pre-trends), where pre-policy trends for each condition are estimated from separate linear regressions of $\hat{Y}_{jt}$ on a continuous measure of time as well as hospital and seasonal fixed effects.
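Matching on pre-trends can be sketched as follows. For brevity, this illustration fits a linear trend to a single aggregate pre-policy series per condition and therefore omits the hospital and seasonal fixed effects used in the regressions described above; the function names and all series are hypothetical.

```python
def ols_slope(y):
    """Slope from a least-squares regression of y on t = 0, 1, 2, ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return sxy / sxx


def best_control(treated_series, control_series):
    """Name of the control whose pre-policy trend is closest to the treated one."""
    target = ols_slope(treated_series)
    return min(control_series,
               key=lambda name: abs(ols_slope(control_series[name]) - target))


# Hypothetical pre-policy quarterly SDD rates (percent).
treated_pre = [10.0, 11.0, 12.0, 13.0]
candidates = {
    "A": [20.0, 20.5, 21.0, 21.5],   # flatter trend than the treated condition
    "B": [5.0, 6.1, 6.9, 8.0],       # trend close to the treated condition
}
chosen = best_control(treated_pre, candidates)
```

Note that the selection depends only on slopes, not on levels: control "B" is chosen even though its SDD rate is well below that of the incentivised condition.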
For each incentivised condition, we then estimate the following DID model: whereŶ cjt is the risk-adjusted rate of SDD in hospital j in quarter t and for condition c ∈ [0, 1], where 1 denotes the incentivised condition, ϕ ct is a vector of condition-specific seasonal effects (spring, summer, autumn, winter), and ν cj is a vector of condition-specific hospital fixed effects, 13 Our analysis focuses on the intensive margin. Hospitals may also respond to the financial incentive by increasing the volume of incentivised activity. However, we do not observe faster annual growths in volume of activity after the introduction of the SDD bonus (pre: 6.5% vs. post: 2.3%, p = 0.264). Furthermore, the growth in non-incentivised conditions over the 9 year period (mean = 13.3% per year) exceeds that of the incentivised conditions (mean = 5.4%). Appendix Table A2 shows annual volumes of activity for the incentivised conditions. 14 See also Allen et al. (2016). While it is possible mathematically for SDD rates to approach 100%, we expect the RR to act as a natural ceiling that is unlikely to be breached.
which capture unobserved time-invariant differences amongst hospitals (e.g. management quality, local demand) in the propensity to discharge patients on the same day as admission 15 .
The dummy variable SDD c takes the value of 1 if condition c is incentivised by the SDD bonus and 0 otherwise and D t is a dummy variable that takes on the value of 1 after the introduction of the SDD bonus in t = t , and zero otherwise. The coefficient of interest is τ , which denotes the average treatment effects on the treated (ATT) over the post-policy period. ω cjt is an idiosyncratic error term.
We also identify separate ATTs τ_k for each of the post-policy years k = 1, ..., K by replacing the single dummy variable D_t with a vector of dummy variables, each taking the value 1 for a specific post-policy year k. These models thus allow for a delayed impact of the SDD policy, which may arise because clinical processes take time to reorganise. Alternatively, positive policy effects may fade over time due to increasing marginal costs of further improvements.
All models are estimated as linear probability models with standard errors clustered at hospital level.
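The logic of the pooled and year-specific ATTs can be illustrated with a minimal sketch based on group means. It assumes balanced, already risk-adjusted rates and ignores fixed effects, seasonality and clustered standard errors, so it is an illustration of the DID contrast and not the regression model itself. All numbers are hypothetical.

```python
from statistics import mean

def did(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return ((mean(treated_post) - mean(treated_pre))
            - (mean(control_post) - mean(control_pre)))

# Hypothetical SDD rates (proportions) before the policy.
treated_pre = [0.40, 0.42]
control_pre = [0.50, 0.52]
# Hypothetical SDD rates in each post-policy year k.
treated_post = {1: [0.46, 0.48], 2: [0.52, 0.54]}
control_post = {1: [0.52, 0.54], 2: [0.53, 0.55]}

# Year-specific effects tau_k: one contrast per post-policy year.
tau_k = {k: did(treated_pre, treated_post[k], control_pre, control_post[k])
         for k in treated_post}

# Pooled effect tau: all post-policy observations taken together.
tau = did(treated_pre,
          [y for k in treated_post for y in treated_post[k]],
          control_pre,
          [y for k in control_post for y in control_post[k]])
```

In this example the effect grows from 4pp in the first year to 9pp in the second, the kind of delayed response that the year-specific models are designed to pick up.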

Synthetic control analysis
The validity of our DID estimates may be compromised by two challenges. First, in our study, we consider a large pool of potential control conditions, several of which may be suitable to model the counterfactual outcome. The results of the DID analysis may be sensitive to the choice of control condition, for example because of idiosyncratic shocks or measurement error in the control condition.
Second, while we select DID control conditions based on pre-policy trends, the assumption of parallel trends applies to unobserved counterfactual outcomes and can therefore never be tested. Furthermore, by matching on levels, the SC method provides reassurance that the synthetic control condition is well matched to the incentivised condition on time-invariant unobservables and that both have similar scope for improvement (and, in this study, a similar risk of ceiling effects).
15 We allow for hospital fixed effects to vary between the intervention and the control condition to account for any differences in a hospital's relative propensity to discharge patients with different clinical conditions on the same day. For example, a hospital may be 5pp more likely than the average hospital to discharge patients with the incentivised condition on the same day and 12pp more likely to do so for patients with the control condition. In this case, forcing a common hospital fixed effect for both groups would be inappropriate.
The SC method requires a panel data structure with the same units of observation being followed over time. We aggregate the risk-adjusted hospital-quarter data to national SDD rates at the level of condition-quarters, weighting by hospitals' quarterly volumes of patients. The pool of potential control conditions is the same as for the DID analysis. Each potential control condition is assigned a non-negative weight (the weights together sum to 1) according to a loss function that minimises the discrepancy between the incentivised and SC conditions in terms of pre-policy SDD rates, expressed as the root mean squared prediction error (RMSPE), and a set of average pre-policy patient characteristics (see Section 3). The difference between observed and counterfactual outcomes provides an estimate of the ATT and can be evaluated over different time periods to recover both τ_k and τ. 16
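The weighting step can be illustrated with a minimal sketch. It assumes that only pre-policy SDD rates enter the loss (the patient characteristics are omitted) and uses a simple Frank-Wolfe search over the weight simplex. This is not the algorithm implemented in the synth command; the function names and data are hypothetical.

```python
from math import sqrt

def synthetic(weights, controls):
    """Weighted average of the control series, period by period."""
    periods = len(controls[0])
    return [sum(w * c[t] for w, c in zip(weights, controls)) for t in range(periods)]

def rmspe(actual, predicted):
    """Root mean squared prediction error between two series."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def sc_weights(treated_pre, controls_pre, iters=20000):
    """Frank-Wolfe search for non-negative weights that sum to 1."""
    n = len(controls_pre)
    w = [1.0 / n] * n  # start from equal weights
    for _ in range(iters):
        synth = synthetic(w, controls_pre)
        resid = [y - s for y, s in zip(treated_pre, synth)]
        # Gradient of the squared error for each weight (constant factor dropped).
        grad = [-sum(r * x for r, x in zip(resid, c)) for c in controls_pre]
        best = min(range(n), key=grad.__getitem__)
        # Shifting weight towards `best` moves the synthetic series this way.
        direction = [x - s for x, s in zip(controls_pre[best], synth)]
        denom = sum(d * d for d in direction)
        if denom < 1e-18:
            break
        # Exact line search for a quadratic loss, clipped to stay on the simplex.
        gamma = sum(r * d for r, d in zip(resid, direction)) / denom
        gamma = max(0.0, min(1.0, gamma))
        if gamma == 0.0:
            break
        w = [(1.0 - gamma) * wi for wi in w]
        w[best] += gamma
    return w

# Hypothetical national pre-policy SDD rates for three control conditions.
controls = [
    [0.30, 0.31, 0.33, 0.32, 0.34, 0.35, 0.37, 0.36],
    [0.60, 0.62, 0.61, 0.63, 0.65, 0.64, 0.66, 0.68],
    [0.80, 0.79, 0.81, 0.80, 0.82, 0.81, 0.83, 0.82],
]
# Hypothetical incentivised condition, built as an equal mix of the first two.
treated = [0.5 * a + 0.5 * b for a, b in zip(controls[0], controls[1])]

weights = sc_weights(treated, controls)
fit = rmspe(treated, synthetic(weights, controls))
```

Applying the same weights to the post-policy control series would give the counterfactual path, and the gap between observed and synthetic rates gives the estimated effect for any chosen period.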
The SC method applies a different inference framework than standard econometric analysis, which poses a challenge for comparative inference. As there is only a single observation per condition and time point, it is not possible to construct traditional standard errors. Instead, we adopt the approach of placebo tests originally proposed by Abadie et al. (2010). We estimate a set of SC models, as described above, but treat each potential control condition in turn as if it were the incentivised condition, with the incentivised condition added to the pool of potential control conditions. In each iteration, we calculate the ratio of post- to pre-intervention RMSPE. P-values are constructed as the proportion of RMSPE ratios that are at least as large as that of the original model for the incentivised condition. 17 We convert these placebo p-values to standard errors through a normal approximation. The quality of this inference framework relies on the number of potential control conditions; for example, with only 19 potential control conditions, the smallest p-value that could be calculated is 1/(1+19) = 0.05. Note that no standard errors can be computed if p = 1.
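The placebo inference described above can be sketched as follows. The p-value calculation follows the text directly. The conversion to a standard error assumes a two-sided test, which the text does not specify, so that step is illustrative only. Function names and numbers are hypothetical.

```python
from statistics import NormalDist

def placebo_p_value(treated_ratio, placebo_ratios):
    """Share of RMSPE ratios, the treated one included, at least as large as the treated ratio."""
    all_ratios = [treated_ratio] + list(placebo_ratios)
    return sum(r >= treated_ratio for r in all_ratios) / len(all_ratios)

def se_from_p(estimate, p):
    """Standard error implied by a two-sided p-value under a normal approximation (assumed)."""
    if p >= 1.0:
        return None  # implied z-statistic is zero, so no finite standard error exists
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return abs(estimate) / z

# Hypothetical post/pre RMSPE ratios: 19 placebo runs, all smaller than the treated ratio.
treated_ratio = 8.0
placebo_ratios = [1.0 + 0.1 * i for i in range(19)]

p = placebo_p_value(treated_ratio, placebo_ratios)  # 1 / (19 + 1)
se = se_from_p(0.053, p)                            # hypothetical ATT of 5.3pp
```

With 19 placebo conditions the most extreme outcome yields p = 0.05, which illustrates why a pool of at least 20 potential controls is needed for p-values below that threshold.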
All computations are performed using the user-written synth command in Stata 14.
16 The estimated treatment effects are approximately unbiased under two key assumptions: a linear relationship between the covariates and the outcome variable and a sufficiently long pre-policy time period relative to the variance of the error term.
17 Because the main estimate is also compared against itself, the numerator of this ratio is always ≥ 1 and the denominator is V + 1, where V is the number of potential controls.
Overall, for both methods and diagnostic statistics, the fit of the control condition is better for planned care, where there is a large number of potential control conditions to choose from (16 to 85), than for emergency care (2 to 7). The small number of emergency control conditions also limits the scope for inference after SC estimation. Only eleven out of 32 incentivised conditions have a set of at least 20 potential control conditions necessary to generate p-values <0.05.

Policy effect on SDD rates
Tables 3 and 4 present the results of our DID and SC analyses. Figures 2 and 3 summarise the main quantities of interest, the estimated ATT over the post-policy period (τ) and associated 95% confidence intervals, in the form of forest plots. Results are presented for all 32 conditions, with light grey, dashed confidence intervals flagging control conditions with trend divergence of >1pp per year, i.e. where we deem the underlying identification assumptions to be less clearly met.
Notes: Slope and Level indicate the trend in SDD rates and the average SDD rate in the intervention group prior to the policy introduction. ∆Slope and ∆Level denote the difference between intervention and control groups. All slope estimates are per quarter.
* Number of potential control conditions with a non-zero weight assigned to them.
where a statistically significant increase in SDD rates can be ascribed to the policy. This is shown as an example in Figure 4.
For emergency conditions, the DID analysis identifies statistically significant positive effects for six conditions and negative effects for two conditions. The size of the effects is generally smaller than those estimated for planned conditions, with no point estimate exceeding 6pp. Given the small number of potential control conditions, the SC estimates are less reliable and deviate substantially from the DID results. Moreover, the placebo tests cannot reject the possibility that these results reflect chance variation, as evidenced by very wide confidence intervals.
The pooled effect across conditions according to our DID results is a 5.3pp increase in the probability of SDD for planned patients and a 1.4pp increase for emergency patients (Tables 3 and 4), both of which are statistically significant at p<0.001. 19 These DID results translate into approximately 28,400 additional patients (95% CI: 23,297 to 33,502) admitted, treated and discharged on the same day in a year across all incentivised conditions (Figure 5). 20 Most of these additional patients receive treatment for chest pain, where a small change in SDD rates applies to a large patient population. There is no single pattern to these developments with all possible permutations present.

Robustness checks
We conduct two robustness checks, presented in Table 5, to rule out alternative explanations of our results. First, the introduction of incentives to increase SDD rates for some conditions might lead to changes in SDD provision more broadly. These spillovers might be positive, for example if clinicians apply their new skills to non-incentivised clinical conditions, or negative, for example if increasing the provision of SDD care requires resources that are also in demand for other patients, such as specialised day surgery beds. Spillover effects are most likely to occur within the same clinical department, as departments are where hospital resources such as clinical personnel and beds are managed on a day-to-day basis. To test for spillovers, we re-estimate our analyses excluding potential control conditions that are performed in the same clinical department as the incentivised condition. 21 We find our results to be substantively unchanged, suggesting that spillovers are unlikely to drive our main estimates.
Second, for planned conditions, hospitals only receive the higher SDD price if they both schedule and provide SDD care. Hospitals that are already achieving high SDD rates prior to the policy but record poorly whether they have scheduled that care in advance to be delivered on the same day may therefore be able to increase their payment simply by better recording scheduling plans. If so, observed changes in the incentivised outcome may not reflect changes in patient care but
19 The overall effects are calculated as weighted averages, where the weights (

Association with incentive design features
Thus far, our results have demonstrated that the response to the SDD bonus policy varies substantially across incentivised conditions. We now investigate if this variation is associated with features of the design of SDD incentives. Since the 32 conditions incentivised by the policy vary in the size of the price differential P_SDD
where Ȳ_Pre is the observed outcome for the incentivised condition in the year before the announcement period. Focussing on the DID estimates, we find a median elasticity of 0.24 across the 13 planned conditions, and 0.01 across the 19 emergency conditions. Five conditions show an elasticity above 1.
As there are just 32 conditions, it is not possible to conduct multivariate regression analysis of incentive design features that may affect the elasticity of the policy response. We therefore resort to univariate correlation analyses which are presented in the form of scatter plots in Figure 7. Hospitals may respond more strongly for conditions offering relatively higher financial returns.  Figure 7c shows the association between the policy response and the total incentive, capturing both price and cost differences between SDD and ON, the latter being approximated by information on average costs in the year prior to the policy introduction.
We find suggestive evidence that larger elasticities are concentrated in conditions with higher SDD prices, but not with larger price differences. Moreover, elasticities appear to increase in the size of the total incentive ∆(P − AC) = (P_SDD − AC_SDD) − (P_ON − AC_ON).
We also explore whether responses appear to be driven by clinical reasons. We hypothesise that responses to the SDD bonus are more pronounced if pre-policy SDD rates are lower and the gap to the RR is larger, therefore giving more scope for improvement. Figure 7d provides some support that larger elasticities occur for planned conditions with lower pre-policy SDD rates. However, somewhat counterintuitively, Figure 7e suggests a negative relationship between the elasticities and the gap between existing practice (i.e. pre-policy SDD rate) and the RR for planned SDD care. One potential mechanism for this finding is that the gap between existing practice and the recommended rate is larger when the costs or other limitations to higher SDD rates discussed above are larger. In such cases, the additional incentive created by the policy may still be insufficient for a larger number of hospitals, reflected in a lower national response.

Conclusions
We have assessed the long-term impact of a generous pricing policy designed to encourage hospitals to treat patients as a 'same day discharge', involving admission, treatment and discharge on the same calendar day. Although SDD is considered clinically appropriate and has lower costs, English policy makers have been frustrated by the low rates of SDD for many conditions. Consequently, in order to encourage behavioural change by doctors and hospitals, policy makers have set prices for SDD that are well above average costs and are also higher than the price for patients allocated to the same DRG who have an overnight stay.
Economic theory predicts that a significant price differential would result in greater provision of treatment on an SDD basis. An early study into the policy impact for one condition, cholecystectomy, suggested that the SDD pricing policy met short-term policy objectives (Allen et al. 2016). Since this study, the policy has been rolled out to 31 more conditions. Our study set out to assess how far these earlier findings would be generalisable to these other conditions, whether short-term impacts  across all incentivised conditions. Indeed, for two conditions the response is negative: despite the enhanced price advantage, fewer SDD treatments are provided post-policy than predicted. For others there is no apparent response. Nor are we able to identify any general temporal pattern in the policy response, with both rapid and delayed uptake of SDD practices being observed. These mixed results mirror those of the literature on P4P, which provides inconclusive evidence for the effectiveness of using financial incentives to drive quality (Milstein and Schreyögg 2016).
This lack of generalisability cautions against drawing firm conclusions from a single analysis.
Indeed, cholecystectomy turns out to be the condition exhibiting the second greatest positive response among the 32 conditions. Moreover, while Milstein and Schreyögg (2016)  whereas such concerns are less prominent when care is scheduled in advance. Also, emergency admissions occur at unpredictable points in the day, making it difficult to achieve SDD for some patients; particularly those admitted late in the evening. This may limit the scope for rapid increases in SDD rates in emergency conditions compared to planned conditions.
It has been argued that the limited impact of P4P schemes is due to incentives being too small (Milstein and Schreyögg 2016). In this study, for all conditions, the price incentive was more high-powered than that typically associated with P4P schemes. But there was significant variation across the conditions in terms of the relative size of the incentive, and we exploit this to investigate the association of incentive size and the estimated clinical response across 32 conditions. There is suggestive evidence that the response to the incentive was greater for conditions with higher SDD prices post policy and with lower SDD rates pre policy. There does not appear to be an association between the size of the price differential, i.e. the marginal reimbursement that hospitals attract from adopting SDD care, and the size of the response. However, there is a positive association, especially for planned conditions, when both price and cost advantages of SDD care are taken into consideration.
On the methodological side, our study highlights an important shortcoming of the SC method compared to more traditional DID analysis in a policy evaluation context commonly encountered by applied health economists. Because the SC method aims to make inference about a treatment based on a single treated unit followed over time, the scope for statistical inference is limited to placebo tests.
There are two important limitations to our study that should be addressed by future research. First, while we do not find evidence of spillovers from incentivised to non-incentivised SDD conditions, we cannot rule out that spillovers among the 32 incentivised conditions contribute to the limited overall policy effect that we observe. For example, hospitals may find it difficult to increase SDD rates for a condition that starts to be incentivised if dedicated inputs (e.g. day beds on specialised wards) are limited and have already been allocated to another condition where the incentive has been in place for longer. Our analysis treats all 32 incentivised conditions as independent and therefore cannot detect such spillovers. To address this, future research would need to develop a more complex model of inter-hospital allocation of resources that also incorporates the changes in incentive structure over time, which goes beyond the scope of the current paper.
Second, our analysis focusses on changes in discharge behaviour and does not analyse effects on patients' health outcomes. The assumed welfare effects of the SDD policy are predicated upon the clinical consensus and existing evidence (e.g. Gilliard et al. (2006), Marla and Stallard (2009), Vaughan et al. (2013), and NICE (2014)) that SDD care is as safe and effective as care involving overnight stays. Future research should seek to confirm this assumption.
In conclusion, we find some evidence that hospitals respond to price signals and that payers, therefore, can use pricing instruments to improve technical efficiency. However, there appears to be substantial variation in hospitals' reactions even among similar types of financial incentives that is not explained by the size of the financial incentive or the clinical setting in which it is applied. It has been said that a randomised controlled trial demonstrates only that something works for one group of patients in one particular context but may not be generalisable (Rothwell 2005). Similarly, a pricing policy that appears to work as intended in one area may not be effective when applied elsewhere, hence the need for continued experimentation and evaluation.