Evaluating Hospital Efficiency Adjusting for Quality Indicators: An Application to Portuguese NHS Hospitals


Introduction
The increase in healthcare costs is a source of concern in most health systems, and the adoption of measures to control costs is at the center of current health policies in most European countries. Efficiency has become a key concern for hospital managers, who are under pressure to provide more services at a lower cost. Associated with this concern, there is an increasing interest of hospital managers and health administration authorities in designing methods to evaluate hospital performance. The ability to rank efficient hospitals over their inefficient counterparts provides a benchmark for hospital managers to discover and reduce potential inefficiencies, and provides health administration authorities with measures that may be used to reward good managers. With this increasing interest in hospital performance, a vast academic literature has emerged on measures and comparisons of hospital efficiency, where quantitative measures of inputs are compared with quantitative measures of outputs, usually using Data Envelopment Analysis (DEA) or Stochastic Frontier Analysis (SFA).
As this focus on efficiency has increased, so have the concerns that it may induce managers to neglect service quality. Since health outcomes depend on the quantity, but also on the quality, of the healthcare services provided, a hospital may increase its measured efficiency while health outcomes deteriorate, if there is an efficiency/quality trade-off. Several authors have investigated the existence of an efficiency/quality trade-off, but all tended to focus on a limited set of quality indicators, related to only a part of hospital production, due to data availability (for instance, [1,2]). Hospitals are multi-product units, providing inpatient care, outpatient visits, emergency care and ambulatory surgery interventions, and measures of quality that relate to only one of these lines of production are not good measures of the quality of the services hospitals provide.
The objective of this paper is to develop a methodology to incorporate measures of quality for the main lines of hospital production in efficiency analysis, applied to Portuguese NHS hospitals, in order to assess whether there is a trade-off between efficiency and quality in Portuguese hospitals. We develop a methodology to compute DEA technical efficiency scores adjusted for output quality, for a sample of Portuguese NHS hospitals in 2009. The quality of Portuguese hospital healthcare services is measured by a set of indicators based on data from a 2009 survey of patients, designed by the ACSS (Administração Central do Sistema de Saúde) and Universidade Nova, whose main goal is to provide an independent system of regular evaluation of patient satisfaction and of hospital quality, as perceived by users of Portuguese NHS hospitals, the "Sistema de Avaliação da Qualidade Apercebida e da Satisfação dos Utentes dos Hospitais EPE e SPA 2009". The rest of this paper is organized as follows. In section 4 we provide a brief review of the relevant literature. Section 5 describes the data used in this paper, while the methodology is presented in section 6. Our results are in section 7. Section 8 concludes.
When DEA efficiency scores are adjusted for output quality, the decision making units that lie on the technical efficiency frontier remain largely unaltered, even if a great weight is given to quality indicators over quantity indicators of output. Nevertheless, we find that, outside of the frontier, adjusting for quality does have an impact on efficiency scores.
We conclude that the empirical evidence is not sufficient to identify a clear trade-off between efficiency and quality in the hospitals under review, implying the possibility that efficiency gains may be achieved without a significant sacrifice of service quality. Nevertheless, there is enough evidence to conclude that analyzing hospital efficiency without consideration of differences in quality of service will generate biased results. When perceived quality is brought to the analysis, the gap between efficient and inefficient units tends to widen.
The survey, by ISEGI-UNL, is based on data collected through phone interviews from July 2009 until March 2010 on a sample of 28,669 individuals. As a result, four satisfaction indices were created, covering inpatient visits, outpatient visits, emergency episodes and ambulatory surgery interventions. The methodology employed is compatible with the American Customer Satisfaction Index (ACSI).

DEA input measures
Following several papers on hospital efficiency, we use physical input data as proxies for the labor and capital factors [3]. The labor proxies are: (i) Doctors, the number of doctors; (ii) Nurses, the number of nurses; and (iii) OtherStaff, the remaining staff in the unit's service. As a proxy for capital we use the number of beds (Beds). Alternatively, in the robustness analysis, we also look at total costs, in million euros, as reported in the hospitals' financial accounts.

DEA output measures
The output measures comprise inpatient visits, outpatient visits, emergency episodes, and surgery interventions (ambulatory plus non-ambulatory).
The advantage of using this set of physical outputs is that most of the outputs correspond directly to the perceived quality/patient satisfaction indices. Table 1 provides descriptive statistics of the data.
Inpatient visits are very heterogeneous; therefore, we adjust this variable by weighting it with a case-mix index (CMI), computed as a length-of-stay-based case-mix index for 2009. To construct this index, we follow [8], using the length of each inpatient visit as a proxy for the complexity of each case.
This index uses the length of stay as a proxy for the resource use associated with each diagnosis. The underlying idea is that hospitals with a higher incidence of diagnoses with long treatment duration have a higher case-mix index. To do so, the mean length of stay (los) of each main diagnosis m = 1, …, M over all Portuguese hospitals i = 1, …, N is computed. Then, using equation (1), the los of each diagnosis is compared with the overall average los of all diagnoses.

The review in [3] covers 317 published papers on frontier efficiency measurement in healthcare, concluding that even though there is an increasing use of parametric techniques, such as stochastic frontier analysis, around three quarters of the papers use nonparametric data envelopment analysis. Studies of hospital efficiency are the most common (52% of the papers reviewed in [3]).
However, very few papers take into consideration that output quality may vary across hospitals, and that traditional quantitative measures of output do not capture all the relevant dimensions of efficiency. The importance of incorporating quality measures in DEA hospital efficiency analyses has been recognized in a few recent papers. The study in [4] includes multiple quality indicators as outputs in a DEA efficiency study of 667 American hospitals, concluding that lower technical efficiency is associated with poorer risk-adjusted quality outcomes in the study hospitals. The study in [1] uses DEA and a sample of Virginia hospitals to examine performance measures of quality and relate them to technical efficiency, concluding that some of the technically efficient hospitals were performing well as far as quality measures were concerned. The study in [5] applies DEA to input and output data from 1377 urban hospitals, includes nurse-sensitive measures of quality, and concludes that higher quality in some dimensions of care need not be achieved as a result of higher costs or through reduced access to healthcare. More recently, [2] analyzes the evolution of efficiency and quality in Andalusian hospitals during the years 1997-2004 and rules out the existence of an efficiency-quality trade-off. The studies in [6,7], investigating the relationship between hospital efficiency and structural quality for 348 Turkish hospitals, and examining the trends of productivity, efficiency and quality changes of hospitals in Shenzhen city over the period 2006-2010, conclude that the efficiency-quality trade-off does not exist for large hospitals, but could exist for small hospitals. The existing evidence seems to link poor quality outcomes to higher cost. However, we are not aware of any study where the quality proxies have a direct relationship with the quantitative output measures. This is the main contribution of this paper.
Our paper is the first, to our knowledge, to apply quality-adjusted DEA efficiency measures that incorporate quality indicators for each of the hospital outputs, providing a better measure of whether an efficiency/quality trade-off exists in the Portuguese hospital sector. Furthermore, our analysis allows us to investigate whether abstracting from a qualitatively oriented approach produces biased results.

Data

Data sources
We use cross-sectional data from a sample of 37 Portuguese general hospitals for 2009. Although data were available also for oncology and psychiatric hospitals, they were not included given the highly specialized nature of these units. The choice of this particular sample rested on data availability for physical inputs and outputs. The sample is quite representative, since the hospitals in it were responsible for a very high share of the total production of Portuguese hospitals in the previous year (e.g., the hospitals in the sample were responsible for 80% of the inpatient visits, of emergency episodes and of outpatient visits and 95% of the surgery interventions).
To measure the impact of perceived quality/patient satisfaction on the DEA efficiency scores, we resort to the report produced by the ACSS and ISEGI-UNL.

In the construction of the 2009 LCMI, certain specialized units, such as oncology centers, were removed from the sample. Once the index obtained in equation (4) is calculated, the risk-adjusted inpatient admissions can be calculated as its product with the effective production.
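The length-of-stay case-mix adjustment described in this section can be illustrated with a minimal sketch. Equations (1) to (4) are not reproduced here, so the construction below is only one plausible reading and not necessarily the authors' exact index: it assumes that the weight of a diagnosis is its mean los (pooled over all hospitals) divided by the overall mean los, and that a hospital's index is the average weight of the visits it treated. The function name, the data layout and the toy figures are hypothetical.

```python
from collections import defaultdict

def los_case_mix_index(visits):
    """Length-of-stay-based case-mix index per hospital (illustrative).

    visits: list of (hospital, diagnosis, length_of_stay) tuples,
            one tuple per inpatient visit (hypothetical layout).
    """
    los_sum = defaultdict(float)   # total los per diagnosis
    cases = defaultdict(int)       # number of visits per diagnosis
    for _, diagnosis, los in visits:
        los_sum[diagnosis] += los
        cases[diagnosis] += 1

    # Overall mean los across all visits and diagnoses.
    overall_mean = sum(los_sum.values()) / sum(cases.values())

    # Assumed weight: mean los of the diagnosis over the overall mean los.
    weight = {m: (los_sum[m] / cases[m]) / overall_mean for m in los_sum}

    # Assumed hospital index: average weight of the visits it treated.
    w_sum = defaultdict(float)
    n_visits = defaultdict(int)
    for hospital, diagnosis, _ in visits:
        w_sum[hospital] += weight[diagnosis]
        n_visits[hospital] += 1
    return {h: w_sum[h] / n_visits[h] for h in w_sum}

# Toy data: hospital B treats more long-stay cases than hospital A.
visits = [
    ("A", "d1", 2), ("A", "d1", 4), ("A", "d2", 10),
    ("B", "d1", 3), ("B", "d2", 12), ("B", "d2", 14),
]
cmi = los_case_mix_index(visits)
admissions = {"A": 3, "B": 3}
# Adjusted admissions: index times effective production.
adjusted_admissions = {h: cmi[h] * admissions[h] for h in cmi}
```

Under these assumptions the hospital with the heavier case mix receives an index above one, so its adjusted inpatient output exceeds its raw count, while total adjusted output across hospitals equals the raw total.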

Methodology
This paper aims to answer two research questions: (i) Does a trade-off between efficiency and quality exist in Portuguese NHS hospitals?
(ii) Are DEA efficiency scores biased when strictly quantitative outputs are considered?
In this study we follow the majority of the academic literature on measures and comparisons of hospital efficiency, and use Data Envelopment Analysis (DEA) to compare quantitative measures of inputs with quantitative measures of outputs (section 4). DEA is a nonstochastic, nonparametric estimation method that determines a "best-practice" frontier given the available data. Hospital efficiency scores are then computed with respect to this reference. We use radial output-oriented efficiency measures, i.e., we estimate the maximal proportional expansion achievable for the vector of outputs (see [9] for a more complete description of distance-function-based efficiency measures). The lower bound of these efficiency scores is 1.00, which indicates that the decision making unit (DMU) is output-efficient, and a score greater than unity points to opportunities for efficiency improvement. Our research interest focuses on the procedures taken by hospital management to improve patient satisfaction holding the unit's resources fixed. Therefore, we adopt an output orientation, i.e., we ask what maximum expansion of outputs hospitals can achieve holding inputs fixed.

In the description of the methodology, we follow the notation used by [10]. The production technology is represented by the correspondence between the vectors of outputs $y$ that can be produced and the vectors of inputs $x$ used to produce them, as follows:

$$P(x) = \left\{\, y : y \le zM,\; zK \le x,\; \sum_{h=1}^{n} z_h = 1,\; z \in \mathbb{R}^{n}_{+} \,\right\}$$

where $z$ is the vector of weights that generate the convex combinations of inputs (where $K$ is an n-by-k matrix) and outputs (where $M$ is an n-by-m matrix). Under this formulation, the technology exhibits variable returns to scale (VRS), as imposed by the restriction on the summation of the elements of $z$.
Given this characterization of the productive technology, the Farrell output-based efficiency measure can be computed by solving the following linear programming problem for each DMU $i$, where the scalar $\theta$ is the technical efficiency score:

$$\max_{\theta,\, z}\ \theta \quad \text{subject to} \quad \theta\, y_i \le zM,\quad zK \le x_i,\quad \sum_{h=1}^{n} z_h = 1,\quad z \in \mathbb{R}^{n}_{+}$$

Given the purpose of our analysis, we define a perceived quality index (PQI) that allows us to adjust the quantitative output measures. The index is defined as follows:

$$PQI_{ij} = \frac{PQ_{ij}}{\overline{PQ}_{j}} \qquad (7)$$

where $PQI_{ij}$ is the value of the index for DMU $i$ and output $j$, $PQ_{ij}$ is the perceived quality indicator designed by ACSS and ISEGI-UNL for DMU $i$ and output $j$, and $\overline{PQ}_{j}$ is the average value of the indicator for output $j$ in our sample. As equation (7) shows, a value greater (lower) than one indicates that DMU $i$ is providing a service with perceived quality standards above (below) the average of the DMUs included in the analysis. Once the index is obtained, it is used to adjust the quantitative output measures. This adjustment is performed simply by multiplying the quantitative output measures by the PQI, so that DMUs providing a service with above-average quality standards produce more quality-adjusted output and are therefore rated as more efficient.
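The output-oriented VRS linear program and the quality adjustment described above can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the paper: it relies on NumPy and on `scipy.optimize.linprog` with the HiGHS solver, and the function names `output_efficiency` and `perceived_quality_index`, as well as the toy data, are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def output_efficiency(X, Y):
    """Farrell output-oriented VRS scores (theta >= 1, 1 = efficient).

    X: (n, k) array of inputs, Y: (n, m) array of outputs, one row per DMU.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, k = X.shape
    m = Y.shape[1]
    scores = np.empty(n)
    for i in range(n):
        # Decision variables: [theta, z_1, ..., z_n]; maximize theta.
        c = np.zeros(n + 1)
        c[0] = -1.0
        # theta * y_i - z M <= 0 (one row per output).
        A_out = np.hstack([Y[i].reshape(-1, 1), -Y.T])
        # z K <= x_i (one row per input).
        A_in = np.hstack([np.zeros((k, 1)), X.T])
        A_ub = np.vstack([A_out, A_in])
        b_ub = np.concatenate([np.zeros(m), X[i]])
        # VRS restriction: the weights z sum to one.
        A_eq = np.concatenate([[0.0], np.ones(n)]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[i] = res.x[0]
    return scores

def perceived_quality_index(PQ):
    """PQI: each indicator divided by its sample average, per output column."""
    PQ = np.asarray(PQ, dtype=float)
    return PQ / PQ.mean(axis=0)

# Toy data (hypothetical): three DMUs, one input, one output, one indicator.
X = np.array([[1.0], [2.0], [4.0]])
Y = np.array([[1.0], [3.0], [2.0]])
PQ = np.array([[80.0], [90.0], [70.0]])

baseline = output_efficiency(X, Y)                                # BL scores
adjusted = output_efficiency(X, Y * perceived_quality_index(PQ))  # PQA scores
```

In this toy case the two frontier units keep a score of 1 after the adjustment, while the third unit, which combines low output with below-average perceived quality, moves further from the frontier.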
The objective is to obtain one set of efficiency scores that does not take the perceived quality indicators into account (baseline, BL, scores) and another that does (perceived-quality-adjusted, PQA, scores). We then compare the two sets to determine the effect of the perceived quality adjustment using a Wilcoxon matched-pairs test. This nonparametric test analyzes the relation between two paired groups: it tests whether a treatment has an effect in the population by looking at the median of the differences between the paired observations. Table 2 presents descriptive statistics for the output-oriented efficiency scores.
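A one-tailed Wilcoxon matched-pairs comparison of the two sets of scores might look like the sketch below. It uses `scipy.stats.wilcoxon`; the scores are invented for illustration and are not the paper's results. Two choices are assumptions of this sketch: pairs with no change (such as units that stay on the frontier) are dropped before the test, and the alternative hypothesis is that the adjusted output-oriented scores are larger, i.e., that units move further from the frontier.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical output-oriented scores (1.00 = efficient) for ten DMUs.
baseline = np.array([1.00, 1.00, 1.12, 1.25, 1.31, 1.08, 1.40, 1.19, 1.05, 1.27])
adjusted = np.array([1.00, 1.00, 1.15, 1.31, 1.30, 1.12, 1.52, 1.26, 1.07, 1.35])

# Drop pairs with no change; they carry no information for the signed-rank test.
changed = adjusted != baseline

# One-tailed test: are the adjusted scores systematically larger than baseline?
result = wilcoxon(adjusted[changed], baseline[changed], alternative="greater")
p_value = result.pvalue
```

A small `p_value` would lead to rejecting the null hypothesis that the paired differences are symmetric around zero in favor of a systematic increase in the adjusted scores.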

Results
The high percentage of units on the frontier reflects the limited number of observations and the broad definition of the technology. Nevertheless, under this specification, the perceived quality adjustment

Note: choosing an input orientation would not significantly change the results.

The results shown in the previous table demonstrate that the efficient units remain on the "best-practice" frontier regardless of the treatment given to perceived quality indicators. This could happen for two reasons: (i) the perceived quality indicators do not provide useful information for differentiating the decision making units; or (ii) the most efficient units are also the ones that operate under the highest perceived quality standards. The empirical evidence, as given by the results of the one-tailed Wilcoxon matched-pairs test, seems to favor the second alternative. If alternative (i) held, the null hypothesis of the one-tailed Wilcoxon matched-pairs test could not be rejected at the usual significance levels, which is not the case. Although we see some significant shifts in the rankings of several units (Appendices 1 and 2), the most pronounced effect is the increase in the skewness of the distribution of the results (Figure 1). This indicates that the perceived quality effect leads to a decrease in the efficiency scores of the units located below the frontier. Therefore, we find evidence that not taking quality into account leads, in general, to an overestimation of the efficiency scores of inefficient hospitals. Some exceptions do exist, though.
The results presented above show that the empirical evidence is not sufficient to identify a clear trade-off between efficiency and quality in the hospitals under review, implying the possibility that efficiency gains may be achieved without a significant sacrifice of service quality. Nevertheless, there is enough evidence to conclude that analyzing hospital efficiency without consideration of differences in quality of service will generate biased results. When perceived quality is brought to the analysis, the gap between efficient and inefficient units tends to widen, as confirmed by the one-tailed Wilcoxon matched-pairs test.
The high number of units deemed efficient in the specification chosen previously recommends some caution in interpreting the results. The reduced dimension of the sample, unfortunately, does not allow for a higher degree of freedom. Nevertheless, a specification with physical inputs presents a more complete description of the technology and is therefore the main focus of our analysis.
To check the robustness of our results, we look at total costs as a measure of input. The perceived quality adjustment produces a decrease of five percentage points in the share of efficient units, and passes the Wilcoxon matched-pairs test at the 5% significance level, as we can see in Table 3.
Again, we find that most of the units deemed efficient in a purely quantitative specification remain so when the perceived quality indicators are taken into account. The same holds for the generalized decrease in the inefficient units' technical efficiency scores. Therefore, we conclude that our findings are robust to the input specification.

Conclusion
The purpose of this paper is to answer two questions. The first one is whether a trade-off between efficiency and quality exists in Portuguese NHS hospitals. The second one is whether DEA efficiency scores are biased when strictly quantitative outputs are considered.
To answer the first question, we use a set of indicators based on data from a 2009 survey of patients, whose main goal is to provide an independent system of regular evaluation of patient satisfaction and of hospital quality, as perceived by users of Portuguese NHS hospitals. Our analysis suggests that a trade-off between efficiency and quality does not seem to exist. Therefore, strictly quantitative output specifications tend to provide a complete picture for those units deemed efficient.
To answer the second question, we use a full set of quality indicators covering the hospitals' four main lines of production, contrary to previous literature that tended to focus on quality indicators related to only part of the hospitals' activity. We find that for those units deemed inefficient, abstracting from quality adjustments may lead to the overestimation of their technical efficiency scores.
We conclude that the empirical evidence is not sufficient to identify a clear trade-off between efficiency and quality in the hospitals under review, implying the possibility that efficiency gains may be achieved without a significant sacrifice of service quality. Nevertheless, there is enough evidence to conclude that analyzing hospital efficiency without consideration of differences in quality of service will generate biased results.
When DEA efficiency scores are adjusted for output quality, the decision making units that lie on the technical efficiency frontier remain largely unaltered, even if a great weight is given to quality indicators over quantity indicators of output. Nevertheless, we find that, outside of the frontier, adjusting for quality does have an impact on efficiency scores. Our analysis of quality-adjusted efficiency scores, using a Wilcoxon matched-pairs test, reveals that the median efficiency scores are statistically different at the usual levels of significance.