Robustness of efficiency scores in data envelopment analysis with interval scale data

Our paper focuses on a robustness analysis of eﬃciency scores in the context of Data Envelopment Analysis (DEA) assuming interval scale data, as deﬁned in A. Dehnokhalaji, P. J. Korhonen, M. Köksalan, N. Nasrabadi and J. Wallenius, “Eﬃciency Analysis to incorporate interval scale data”, European Journal of Operational Research 207 (2), 2010, pp. 1116–1121. We ﬁrst show that the deﬁnition of the eﬃciency score used in our paper is a well-deﬁned measure according to Aparicio and Pastor (J. Aparicio and J. T. Pastor, “A well-deﬁned eﬃciency measure for dealing with closest targets in DEA”, Applied Mathematics and Computation 219 (17), 2013, pp. 9142–9154.). Next, we characterize how robust the eﬃciency scores are with respect to improvements and deteriorations of inputs and outputs. We illustrate our analysis with two examples: a simple numerical example and a more complex example using real-world data. © 2021 The Authors. Published by Elsevier B.V. This is


Introduction
The concept of robustness (stability) or sensitivity has been widely studied in Data Envelopment Analysis literature.It considers a variety of changes in the data set, such as adding or omitting a Decision Making Unit (DMU), adding or withdrawing some input/output measures, or changing input/output values.It investigates whether the efficiency scores obtained from DEA models are sensitive or robust (stable) against such changes in data.Charnes, Cooper, Lewin, Moray, & Rousseau (1984) did some early work on sensitivity analysis in DEA.They assumed occurrence of perturbations for a single output for an arbitrary unit and found ranges of variation to preserve the efficiency status of that unit.Charnes, Haag, Jaska, & Semple (1992) ; Charnes, Rousseau, & Semple (1996) developed some metrics, such as "distance" or "length" of a vector in order to measure "radii of stability" which are the thresholds within which data modifications will not change the efficiency/inefficiency status of units.Their work was continued by Seiford and Zhu ( Seiford & Zhu, 1998, 1999a, 1999b) .Thompson, Dharmapala, & Thrall (1994) considered the case where changes happen in data of all DMUs simultaneously and they developed sensitivity analysis for efficiency measures.They applied their idea to Kansas farming and Illinois coal mining.
Furthermore, the sensitivity analysis framework is not only restricted to deterministic methodological approaches.Several studies have been conducted to treat data changes by statistical methods.For instance, the DEA estimators of the best practice production functions have been proposed for the case of multiple inputs-one output, when the one-sided deviations from that kind of production function are considered as stochastic variations in technical inefficiency.On the other hand, bootstrap methods were proposed to deal with multiple inputs-multiple outputs, where the sensitivity of the efficiency score can be tested by repeatedly sampling from the original samples.See Banker (1993) , Banker & Natarasan (2004) , Korostolev, Simar, & Tsybakov (1995a,b) , Simar (1996) and Simar & Wilson (1998, 2004) for more details.Zhu (1996) proposed a method for sensitivity analysis in DEA by solving linear programming problems whose optimal values yield particular regions of stability.They provided necessary and sufficient conditions for upward variations of inputs and for downward variations of outputs of an (extremely) efficient DMU, which keeps its efficiency score unchanged.Dehnokhalaji, Korhonen, Koksalan, Nasrabadi, & Wallenius (2010) proposed an efficiency analysis for interval scale data.Interval scales are numeric scales, in which not only the order of values, but also the differences between the values are known.Common examples are the Celsius-scale and time measured in hours and seconds.We formulated a mixed integer programming model based on the idea of finding a hyperplane which separates better units from non-better ones.We defined the efficiency score as the ratio with the numerator equal to the number of non-better units and denominator equal to the number of all units.Two questions arise.Is the defined efficiency measure a well-defined efficiency measure?Moreover, how stable is this efficiency measure to data alteration?
In our current study we prove that the defined interval scale efficiency measure is a well-defined efficiency measure based on the criteria provided by Aparicio & Pastor (2013) .Moreover, we develop a stability analysis approach to investigate how robust the obtained efficiency scores are regarding data changes.We formulate two multi-objective mixed integer programming models separately to measure the robustness of improvements and deteriorations and then provide an interval for each input and output value, resulting in an unchanged efficiency score for all efficient and inefficient DMUs.We show that the modelling can be done via linear programming formulations in case that changes occur just for one input or output, and the stability region reduces to an interval for each individual input and output.
This paper unfolds as follows.Section 2 provides some preliminary DEA considerations followed by our DEA-based model for incorporating interval scale data.In Section 3 we conduct a stability analysis for the interval scale data setting.Section 4 gives a simple numerical example, which illustrates the proposed stability analysis model.A real application is provided in Section 5 and Section 6 concludes the paper.

Basic concepts of DEA
which consists of all feasible inputs and outputs.We assume less is better for inputs and more for outputs, as usual.The concept of efficiency is defined as follows.
Definition 1.A point ( y * , x * ) ∈ P is said to be efficient in set P iff (if and only if) there does not exist (y , x ) ∈ P such that y ≥ y * and x ≤ x * and (y , x ) = ( y * , x * ) .
The concept of efficiency can also be defined equivalently in the context of vector optimization.To do so, consider the following vector maximization problem defined on set P as: " max " ( α, β) Clearly the point ( y * , x * ) ∈ P is efficient in set P if and only if ( α * , β * ) = (0 , 0) is the only Pareto optimal solution of problem (2) .
Suppose that X is an m × n matrix and Y is a p × n matrix with input and output vectors as their columns, respectively.Also, let x j = (x 1 j , x 2 j , . . ., x m j ) and y j = (y 1 j , y 2 j , . . ., y p j ) denote the input and output vectors of DMU j , respectively, for j = 1 , 2 , . . ., n .
In DEA, the production possibility set is constructed as ( Cooper, Seiford, & Tone (20 0 0) ): where the definition of depends on the technology assumptions.Specifically, for a variable returns to scale technology, we have = { λ ∈ R n + | 1 λ = 1 } , and the corresponding set P is denoted by P V RS .
Similarly, for a constant returns to scale technology we have = R n + and the set P is denoted by P CRS , correspondingly.
To simplify our notation, we denote u j = y j −x j and call it an input-output vector.Also, let U = Y −X .We call the set of all possible input-output vectors T , i.e. we have: Especially, the corresponding set T for variable and constant returns to scale technologies is denoted by T V RS and T CRS , respectively.
We Envelopment Form (a ) Multiplier Form (b) (5) It is well-known that u o is efficient if and only if the optimal value of the basic additive model ( 5) is equal to zero.While the basic additive model with variable returns to scale (5) is translation invariant w.r.t both inputs and outputs, its main drawback is that it does not provide a measure of (in-)efficiency.Now, consider an arbitrary input-output vector ū ∈ R m + s .If ū does not belong to T V RS , then there exists a hyperplane in R p+ mspace that separates ū and the convex set T V RS .In other words, there exists ρ ∈ R p+ m + and scalar η ∈ R such that −ρ T U + η1 ≥ 0 and −ρ T ū + η < 0 .This explains the idea of the following lemma which provides a necessary and sufficient condition for an inputoutput vector to belong to Proof.By the definition of T V RS , we know that ū ∈ T V RS iff the following system has a solution: Now, by Farkas's lemma, the above system has a solution iff the following system has no solution: and the proof is complete.
Assume that J = { 1 , . . ., n } and B ⊆ J is a subset of all observed units with the property that there exists (ρ, η) Then the production possibility set corresponding to set B and the associated input-output set T are denoted by P (B ) and T (B ) , respectively.It is clear that for each j ∈ J such that j / ∈ B , we have u j / ∈ T (B ) , or equivalently ( y j , x j ) / ∈ P (B ) .With this notation, the original production possibility set P defined in (3) is denoted by P (J) and the corresponding set T defined in (4) is denoted by T (J) .

A DEA-Based model for interval scale data
As mentioned before, one of the main assumptions of the basic DEA model is that all the variables are measured on a ratio scale.Dehnokhalaji et al. (2010) proposed an efficiency analysis approach that would work both for interval scale and ratio scale data.The idea of their model is to find a hyperplane passing through a unit under evaluation, u o , to classify the observed units into two different categories: non-better and better units than the unit under assessment.They adopted the idea from Koksalan, Buyukbasaran, Ozpeynirci, & Wallenius (2010) , where they assigned weights to input and output entries to maximize the number of units each DMU being evaluated could outrank.In other words, they formulated a model which finds the maximal subset of the observed units in which u o is efficient.
The main difference between the traditional DEA models and the ones in Dehnokhalaji et al. (2010) approach is that in the latter paper the authors did not measure the distance of u o from the efficient frontier.Instead, they obtained the minimum number of units to omit from the entire set of DMUs to make u o efficient.If u o is not efficient, some of the inequalities ρ T U + η1 ≥ 0 , are not satisfied.They used the big M method to minimize the number of violated inequalities where violating an inequality constraint implies the corresponding unit is not required to be omitted to make u o efficient.To this end, they solved the following mixed integer programming model: where M is assumed to be a sufficiently large positive number.The aim of the model ( 9) is to find the minimum number of units that should be omitted from the set of observations J = { 1 , . . ., n } in order to make u o efficient.To illustrate, note that if u o is efficient then none of the units need to be omitted from J and therefore the optimal value of model ( 9) is equal to zero.On the other hand, if u o is inefficient then for each hyperplane some of the inequalities −ρ T u j + η ≥ 0 , j ∈ J are not satisfied.Therefore, the model uses the big-M method to minimize the number of violated inequalities.Actually, by adding an additional term Mz j , to each inequality −ρ T u j + η ≥ 0 , where M is a sufficiently large positive number and z j ∈ { 0 , 1 } , one can be sure that all constraints −ρ T u j + η + Mz j ≥ 0 are satisfied for suitable values of z j .In fact, if u j needs to be omitted, we have z j = 1 , and if u j is not required to be omitted we have z j = 0 .In this regard, the objective function n j =1 , j = o z j suitably reflects our aim which is to find the minimum number of units that should be omitted.
Generally, assuming that (ρ * , η * , z * ) is an optimal solution for model ( 9) the set R o = { j ∈ J | z * j = 1 } determines the minimal subset of units which are to be omitted in order to make u o efficient and therefore J o = { j ∈ J | z * j = 0 } gives the maximal subset of units such that u o is efficient within J o ∪ { o} ( Dehnokhalaji et al., 2010) In other words, R o can be interpreted as the units which are better than u o and γ o = n j=1 z * j gives the minimum number of units which have to be omitted from the set of observations J in order to make u o efficient.
Based on model ( 9) the efficiency score of u o is defined as it has been proved that this efficiency score is translation invariant ( Dehnokhalaji et al., 2010 ).
Note that model ( 9) can be considered as a modification of the original multiplier BCC model in DEA ( Banker, Charnes, & Cooper, 1984 ) 1 We may present the basic idea of Dehnokhalaji et al. (2010) based on the basic additive model ( 5) .Therefore, we formulate the following model for evaluating u o : where M is assumed to be a sufficiently large positive number, as before.The main difference between models ( 9) and ( 10) is in their normalizing constraint.Model ( 9) is formulated based on the basic BCC model, which is intrinsically radial-oriented, whereas model (10) can be considered as a modification of the non-radial additive model ( 5) .Both models provide the same efficiency score for a unit under assessment.Hereafter, we apply model (10) for efficiency analysis and robustness.This model is more helpful and straightforward when investigating the concept of robustness in this paper.The sets J o and R o and the efficiency score e o are defined in model ( 10) similar to the ones we have in model (9) .

Main properties of the efficiency score
Aparicio & Pastor (2013) listed four main properties that a welldefined efficiency measure should meet.Definition 2. Assume that a performance evaluation model defines an efficiency score eff o for DMU o .The score eff o is a well-defined efficiency measure iff it satisfies the following properties: The following theorem verifies the validity of the efficiency measure obtained from model (10) .
Theorem 1. Assume that e o is the efficiency score of DMU o obtained from model (10) .Then: 3. e o is unit and translation invariant.4. If DMU q dominates DMU o , then e q > e o .
Proof.See Appendix.As it can be observed, parts 1, 2 and 3 in Theorem 1 clarify the fact that e o satisfies properties (P1), (P2) and (P3) of a welldefined efficiency measure.Moreover, part 4 means that the efficiency measure e o is strictly increasing w.r.t. the partial order "domination" in the set J = { 1 , . . ., n } , i.e. (P4) holds for each two distinct observed units.However, it is worthwhile to note that the efficiency score of u o may still remain unchanged even if its inputs and outputs are improved/deteriorated.The main reason is that e o does not depend on the distance of u o from the efficiency frontier and a hyperplane passing through u o which classifies the set J into two subsets J o and R o , in such a way that the cardinality of R o is minimal, determines its efficiency score.In this regard, the efficiency score of u o may not be affected by small perturbations in its input/output vector.To investigate this issue, we consider the following example.
Example 1.Consider a set of 10 units, each producing two outputs y 1 and y 2 using the same amount of a single input x = 1 .The data set is given in Table 1 and the production possibility set in output space is described in Fig. 1 .Solving model (10) for each unit, the corresponding efficiency scores are presented in the last row of Table 1 .
Model (10) is coded in C++ and solved using the callable library of IBM ILOG CPLEX 12.6.The second constraint set in model ( 10) is formulated using indicator constraints feature of CPLEX 2 .
The indicator constraint states that if z j is equal to 0, then −ρ T u j + η must be greater than or equal to 0.
Notice that the interval scale efficiency scores classify units into efficient and inefficient units and the concept of weak efficiency does not apply in our context.Now, to examine the stability of the obtained efficiency scores, we solve model (10) when we have some perturbations in the data set.For example, let E be the unit under evaluation and assume that its first output increases by 0.5.It can be seen that with this new data set, the efficiency score of E is still equal to 0.7, i.e. it remains unchanged after this perturbation.A similar story exists for a decrease of 0.2 in the second output.This fact is illustrated in Fig. 2 .The main reason is that both of the hyperplanes H E and H E passing through E and E , respectively, exclude the same number of units as H E , passing through E, does.
In the above example measure e o obtained from model ( 10) is relatively stable w.r.t.small perturbations in u o .In what follows, we find the stability region for maintaining the efficiency score of u o .It is worthwhile to note that the efficiency score of u o may also be affected by small perturbations in the input-output vector of (some of) the other units.However, this issue is not going to be investigated here.Actually, we just investigate stability of the efficiency score of u o when perturbations occur in its own inputoutput vector u o .

Robustness of efficiency scores with interval scale data
The aim of this section is to find maximal admissible perturbations in input-output vector u o for DMU o in order to keep the efficiency score e o unchanged.The perturbations may occur in inputs, outputs, or both.Moreover, from this point of view, two types of perturbations will be considered for u o : improvement and deteri- where u o ± stands for the (improved/deteriorated) perturbed unit and e ( u o ) = e o .As mentioned before, in this study we assume that u o is replaced with a virtual perturbed unit u o ± .By doing so, the number of all units is still equal to n , and therefore we can perform our sensitivity analysis on the optimal value of model (10) , i.e. on γ o , since we know that e o = 1 − γo n .Therefore, problem (12) finds the maximum admissible changes some parameters of model ( 10) such that its optimal value remains unchanged.In the following, we solve model ( 12) based on the structure and the interpretation of the corresponding model ( 10) .Also, we investigate improvements and deteriorations, separately.3

Stability region for improvements
In this section, we obtain the maximal admissible improvements in input-output vector u o such that its efficiency score remains unchanged.We know that J o is a maximal subset of J − { o} with the property that u o is efficient within J o ∪ { o} .This means that for each Therefore, a sufficient condition for γ ( does not exceed the frontier of some T (J ) for J ⊆ J − { o} and In other words, so long as u o + falls into the intersection of all sets T (J ) with the above property one can conclude that its efficiency score remains stable.Accordingly, we can formulate the following vector optimization problem to calculate the maximal admissible improvements which keep the efficiency score of u o unchanged: " max " s.t.
where γ o is the optimal value of model (10) .
To illustrate the constraints of model ( 13) , consider unit E in Example 1 , with | J E | = 6 .It can be easily verified that there exist three sets J with the property that where there exists a hyperplane strongly separating T (J ) and the remaining units J − J .
The corresponding production possibility sets T (J 1 ) , T (J 2 ) and T (J 3 ) are depicted in Figure ( 3 ).Moreover, the intersection of these sets, i.e. the dashed region gives the feasible region of model ( 13) .Accordingly, the black region is the stability region for improvements for unit E.
The following theorem clarifies the validity of model ( 13) in determining the maximal admissible improvements for u o .
Theorem 2. Suppose that is a weakly Pareto optimal solution for model ( 13) and γ (u o + ) is the optimal value of model ( 10) evalu- ating u o + .Then, (i).for each ≥ 0 such that ≤ and = , we have is a Pareto optimal solution, we have γ ( Proof.See Appendix. Part (i) of Theorem 2 provides a region in which u o + can be located and still has the same efficiency score as u o .As it was explained before, the efficiency score increases by 1 /n as soon as u o + reaches the frontier of T (J ) , for some J ⊆ J − { o} with Assume that changes are only admissible for the t-th output.Then we have = δ e t , where e t is the t-th unit vector in R m + s .Therefore, Model (13) can be written as: max δ s.t.(14) or equivalently, as: min δ s.t.
and this is equivalent to the model below: min δ s.t.
where cl(A ) denotes the closure of A and A c is the complement of A .According to previous section, one can establish the following equivalent form consequently: Model ( 17) is not a mixed-integer linear one because of the constraint −ρ T ( u o + δ e t ) + η ≤ 0 , which includes multiplication of two variables.But it is easy to prove that this constraint is binding at optimality.Also, the normalization constraint || ρ|| = 1 can be written as ρ t = 1 .Hence model ( 17) can be written as the following mixed-integer linear program: min (18)

Stability region for deteriorations
In order to find the maximal deteriorations for u o which keep its efficiency score unchanged, we adopt our previous idea for improvements.We know that J o is a maximal subset of J − { o} such that u o is efficient w.r.t.J o ∪ { o} .Then, so long as u o − does not belong to T (J ) , for some ) .Based on this, to find the maximal admissible deteriorations we formulate the following vector optimization problem: " min " s.t.
To illustrate the above model, again consider unit E in Example 1 with | J E | = 6 .It can be easily seen that there exist just two sets J with the property that J ⊆ J = { A, B, . . ., J} − { E} , | J | = 6 , and T (J ) ∩ (J − J ) = ∅ which are The corresponding production possibility sets T (J 1 ) and T (J 2 ) are depicted in Fig. 4 .As discussed, the intersection of the above sets, i.e. the dashed region, is the feasible region of model ( 19) and the black region is the stability region for deteriorations.The following theorem clarifies the validity of problem ( 19) .
and this is equivalent to the model below: where cl(A ) denotes the closure of A and A c is the complement of A .According to previous section, one can establish the following equivalent form: 23) is not a mixed-integer linear one because of the constraint −ρ T ( u o − δ e t ) + η ≤ 0 which includes multiplication of two variables.This constraint is binding at optimality.Also, the normalization constraint || ρ|| = 1 can be written as ρ t = 1 .Hence model ( 17) can be written as the following mixed-integer linear program: max In what follows we provide a numerical example and investigate the robustness of efficiency scores by obtaining the stability interval of each individual input/output, by solving the mixedinteger linear programming optimization problems ( 18) and (24) , separately.

Numerical example
Consider the data set presented in Table 1 for Example 1 .We find the maximum admissible perturbation for each individual output for each unit under evaluation, which keeps the efficiency score unchanged.Based on the theory provided above, we consider improvements and deteriorations separately.
Let E be the unit under evaluation with y 1 = 5 , y 2 = 3 .5 , x 1 = 1 and γ E = 3 .In order to find the maximal admissible improvements for output t ( t = 1 , 2 ), we write model (18) for this unit as follows: min Similarly, model ( 24) is written for unit E to find the maximal admissible deteriorations for output t ( t = 1 , 2 ), as follows: Both models ( 25) and ( 26) are solved for t = 1 and t = 2 to find the maximum admissible improvement and deterioration for the first output and the second output of unit E, respectively.
The optimal solutions of models ( 25) and ( 26 Finally, the admissible perturbations for all units are presented as intervals in Table ( 2 ).The terms δ i , δ i denote the values of admissible improvements and deteriorations for output i , respec-tively, i = 1 , 2 .Also, SI i stands for the stability interval of output i .
As can be seen from Table 2 , for efficient units D, G, I and J, the admissible improvements corresponding to outputs are unbounded, i.e. δ1 = δ2 = ∞ .The reason is that all these units are efficient and therefore any increment in each output keeps them efficient provided that that they are feasible.

Application
Consider the problem of evaluating the quality of research in different departments of University of Eastern Finland (former University of Joensuu), Finland, which has been investigated in Dehnokhalaji et al. (2010) .The data set consists of 26 departments assessed on seven criteria by using a scale where 1 represents "Poor" and 7 represents "Excellent".These criteria were used as outputs in our approach.Identical inputs of 1 were assumed for all units.The original data set and interval scale efficiency scores obtained by model ( 10) are reported in Table ( 3 ).
Running models ( 18) and ( 24) for each individual output, we can obtain the maximum admissible improvements and deteriorations for each output, separately.The results are presented in Table ( 4 ).Here we provide the stability interval for each individual output for all units.Note that for each unit these values are obtained under the assumption that the data for all other units remains unchanged.The terms δ i , δ i denote the values of admissible improvements and deteriorations for output i , respectively.To il- lustrate the results, consider unit 4. As the table shows, if output 1, output 2, output 3, output 5, output 6 or output 7 of unit 4 is decreased by 4, 1, 3, 3, 4, and 3 units, respectively, the efficiency score of this unit remains unchanged.However, any deterioration in output 4 will change the efficiency score of unit 4. By a similar argument, we observe that any improvement in output 1, output 2, output 6, or output 7 of unit 4 will change the efficiency of this unit and if output 3, output 4, or output 5 of unit 4 is increased by 1, 1.5, and 0.333 respectively, the efficiency score of unit 4 remains unchanged.Units 12 and 26 are efficient and as expected, the admissible improvements corresponding to all outputs for both units are unbounded.
The optimal objective function obtained by models (10), (18) , and (24) are sensitive to the selection of M for some DMUs with the lowest efficiency scores, and so we solved the three models using the indicator constraints in CPLEX callable library instead of using the Big-M constraints in order to obtain more accurate results.For instance, for DMU 21, the efficiency scores obtained by model ( 9) in Dehnokhalaji et al. (2010) was reported to be equal to 0.231, while the indicator constraints resulted in the efficiency score of 0.269 as you can see from Table (3) .
Also, as can be seen from Table 4 , the interval scale efficiency scores are more sensitive to improvements than to deterioration of output variables for most units.However we cannot conclude that this happens for all data sets, since our defined efficiency score is dependent on the location of the units in the production possibility set.

Conclusion
In this paper we performed a robustness analysis on the interval scale efficiency measure proposed by Dehnokhalaji et al. (2010) .We first showed that our interval scale efficiency measure is well-defined.Then we investigated the robustness of this efficiency measure and formulated a multi-objective mixed integer program, which finds the admissible region for perturbations.In the simple case, where perturbations are considered for an individual input/output, we solve two single-objective programs, which result in an allowable interval for the corresponding input/output.We ran the model on a simple numerical example and also on a real-world data set consisting of 26 Decision Making Units.
Recall that our proposed efficiency measure depends on the number of units being evaluated.As the number of units increases, the obtained scores for all units are more diverse and can better differentiate among Decision Making Units.In general, the stability intervals get shorter when the number of units increases.

Fig. 2 .
Fig. 2. Some perturbations in unit E in the output space.
Moreover, as soon as u o + reaches the frontier of T (J ) , for some J ⊆ J − { o} with | J | = | J o | + 1 , its efficiency score is increased by 1 n (Recall that the efficiency score of u o is defined as e o = 1 − γo n = | J o | +1 n , where | J o | denotes cardinality of J o .When the numerator increases by 1, the efficiency score increases by 1 n ).

Fig. 3 .
Fig. 3. Stability region for improvements for the data set of Example 1 .

Fig. 4 .
Fig. 4. Stability region for deteriorations for the data set of Example 1 .
and (ii).If ˜ ≥ 0 is Pareto optimal, then for each such that ≥ ˜ and = ˜ , we have γ ( u o − ) > γ ( u o ) .Proof.See Appendix Again, let = δ e t , where e t is the t-th unit vector in R m + s .Therefore, Model (19) can be written as: min δ s.t.

Table 1
Data set for Example 1 .
Fig. 1.Decision Making Units in the output space.

Table 2
Stability intervals for individual outputs .

Table 3
Research assessment exercise data set at the University of Eastern Finland and the efficiency scores .

Table 4
Robustness of efficiency scores for each output perturbation .