A COMMON SET OF WEIGHT APPROACH USING AN IDEAL DECISION MAKING UNIT IN DATA ENVELOPMENT ANALYSIS

Abstract. Data envelopment analysis (DEA) is a common non-parametric frontier analysis method. The multiplier framework of DEA allows flexibility in the selection of endogenous input and output weights of decision making units (DMUs) so as to cautiously measure their efficiency. The calculation of DEA scores requires the solution of one linear program per DMU and generates an individual set of endogenous weights (multipliers) for each performance dimension. Given the large number of DMUs in real applications, the computational and conceptual complexities are considerable, with weights that are potentially zero-valued or incommensurable across units. In this paper, we propose a two-phase algorithm to address these two problems. In the first phase, we define an ideal DMU (IDMU), which is a hypothetical DMU consuming the least inputs to secure the most outputs. In the second phase, we use the IDMU in an LP model with a small number of constraints to determine a common set of weights (CSW). Finally, we calculate the efficiency of the DMUs with the obtained CSW. The proposed model is applied to a numerical example and to a case study using panel data from 286 Danish district heating plants to illustrate the applicability of the proposed method.


1. Introduction. Data envelopment analysis (DEA) (Charnes et al., [7]) is a widely used non-parametric frontier analysis method, implemented in linear programming, for comparing the inputs and outputs of a set of comparable decision making units (DMUs). The flexibility of the DEA models, i.e. the absence of a priori assumptions on the production technology, is implemented through a minimal spanning hull. This naturally implies that the weights (multipliers) vary across the frontier for a non-degenerate dataset, making each unit appear in its most favorable light. The results in terms of technical efficiency, however, may be counterintuitive and overly conservative when weights are heavily biased towards inputs or outputs of minor importance in an economic or preferential sense. This flexibility, in combination with the number of input and output dimensions, normally renders a relatively high proportion of the DMUs fully technically efficient, in particular for limited datasets. This lack of discriminatory power has been subject to extensive work in the DEA literature.
In problems with a small number of DMUs, the number of efficient DMUs is normally high and the obtained results have no practical value in discriminating among the DMUs. Therefore, Li and Reeves [22] suggested utilizing multiple objectives, such as minimax and minisum efficiency, in addition to the standard DEA objective function in order to increase discrimination between DMUs. Friedman and Sinuany-Stern [14] developed a model based on canonical correlation analysis (CCA) and used linear discrimination to find a score function that ranked DMUs. Other DEA approaches aim at finding a common set of weights (CSW) and focus on the efficiency scores that result from the proposed weights. For example, some approaches are based on the idea of minimizing the differences between the DEA efficiency scores and those obtained with the CSW (Despotis, [13]; Kao and Hung, [19]).
Cook et al. [12] examined various conditions that are imposed on the multipliers in DEA and suggested an approach for breaking ties on the frontier in each case. Andersen and Petersen [3] developed a modified version of DEA (also known as superefficiency) based on an adjusted reference set, where the unit compared is excluded from its reference set. The resulting model allows for technical efficiency scores higher than one, corresponding to the radial distance from the production space less the DMU under evaluation. Bogetoft [6], laying the foundations for the theory of DEA models in economic regulation, showed that the adjusted reference set model (superefficiency) was necessary for the incentive properties of DEA in regulation to hold. Alternative formulations to address numerical infeasibilities (e.g. for the variable returns to scale (VRS) case and for zero-valued dimensions) in the original Andersen and Petersen [3] model have been proposed by Thrall [37], Mehrabian et al. [25], Saati et al. [31], Tone [38], Jahanshahloo et al. [18] and Li et al. [21], among others.
Sexton et al. [33] proposed a DEA extension known as the cross-evaluation method to identify best performing DMUs and to rank DMUs using cross-efficiency scores that are linked to all DMUs. Sueyoshi [34] used the modified slacks-based model to rank efficient units and overcome the problem of instability in superefficiency models. Kao and Hung [19] proposed a compromise solution approach for generating a CSW under the DEA framework. The efficiency scores calculated from the standard DEA model were regarded as the ideal solution for the DMUs to achieve. The model derived a CSW minimizing the vector difference between the CSW-efficiency scores and the endogenous weight scores. The Kao and Hung [19] solutions have uniqueness and Pareto optimality properties previously not warranted by the CSW models.
Liu and Peng [23] introduced CSW to determine the single most favorable CSW for DMUs on the DEA frontier with regards to maximizing the group's efficiency score. They showed that each DMU determines the efficiency score under its most favorable weights attached to its input indices and output indices. A similar approach, but based on economic foundations and game theory is developed in Agrell and Bogetoft [2], where a set of regulated entities agree on a common set of weights as a proxy for model specification in a normative regulation setting. The dimensions attributed with zero weights are interpreted as excluded from the consensus model and the implementation among the agents is assumed to be made through side payments. Liu and Peng [24] proposed a modified procedure where the endogenous CSW maximizing aggregate efficiency can be adjusted by exogenous weight restrictions from one of the DMUs. The procedure is claimed to be more (preferentially) robust. Wang et al. [42] proposed a methodology where an a priori subjective choice of the number of frontier units predetermines LP models for endogenously finding a corresponding CSW.
Wang and Chin [40] proposed a neutral DEA model for cross-efficiency evaluation, differing from the classical game theoretical formulations where the DMUs seek to either maximize or minimize the scores of the reference set. Jahanshahloo et al. [16] proposed a variant of an additive frontier model, in which the "ideal DMU", defined as the closest point weakly dominating the production possibility set, is used as a new projection point. Hosseinzadeh Lotfi et al. [15] defined some artificial units called "aggregate units" in an approach based on the effects of deleting an efficient DMU relative to the other efficient DMUs. Wang et al. [41] proposed a methodology based on regression analysis to seek a CSW that is easy to estimate and produces a full ranking for all DMUs. As in Kao and Hung [19], the CSW is derived by minimizing the vector distance between the technical efficiency scores.
In this paper, we propose a two-phase procedure to reduce the computational complexities and costs and to overcome pitfalls in DEA, including zero-valued weights, weights that differ across units, and multiple solutions. Initially, we define an ideal DMU (IDMU), which is a hypothetical DMU consuming the least inputs to produce the most outputs. We then use the IDMU in an LP model with a reduced number of constraints to determine a CSW. Finally, we calculate the efficiency of the DMUs with the obtained CSW. The proposed models are applied to a numerical example and to panel data from 286 Danish district heating plants subject to sector regulation.
The remainder of this paper is organized as follows. In Section 2 we present the basic DEA models. In Section 3 we show how to obtain a desirable performance measure by imposing weight restrictions. In Section 4 we present the details of our proposed method and Section 5 contains a numerical example and the case study in energy regulation. The paper is closed in Section 6 with conclusions and some suggested future research directions.
2. Basic DEA models. Let us assume that there are $n$ DMUs to be evaluated, where each DMU$_j$ ($j = 1, 2, \dots, n$) converts $m$ inputs into $s$ outputs. Suppose that $x_{ij}$ ($i = 1, \dots, m$; $j = 1, \dots, n$) and $y_{rj}$ ($r = 1, \dots, s$; $j = 1, \dots, n$) are the $i$th input and the $r$th output of DMU$_j$, respectively. The relative efficiency of DMU$_p$, $p \in \{1, \dots, n\}$, is defined as the maximum value of $W_p$ and can be obtained by using the following programming model (for constant returns to scale, CRS) proposed by Charnes et al. [7]:

$$
\begin{aligned}
\max\ & W_p = \frac{\sum_{r=1}^{s} \mu_r y_{rp}}{\sum_{i=1}^{m} \omega_i x_{ip}} \\
\text{s.t.}\ & \frac{\sum_{r=1}^{s} \mu_r y_{rj}}{\sum_{i=1}^{m} \omega_i x_{ij}} \le 1, \quad j = 1, \dots, n, \\
& \mu_r, \omega_i \ge \varepsilon, \quad r = 1, \dots, s,\ i = 1, \dots, m,
\end{aligned}
\tag{1}
$$

where $\omega_i$ and $\mu_r$ are the input and output weights assigned to the $i$th input and $r$th output and $\varepsilon > 0$ is a non-Archimedean element smaller than any positive real number. The prior fractional model is normally transformed by applying the method of Charnes and Cooper [9] into the linear program (2), in which $u_r$ and $v_i$ denote the transformed output and input weights:

$$
\begin{aligned}
\max\ & W_p = \sum_{r=1}^{s} u_r y_{rp} \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ip} = 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& u_r, v_i \ge \varepsilon, \quad r = 1, \dots, s,\ i = 1, \dots, m.
\end{aligned}
\tag{2}
$$

The optimal value of (1), $W_p^*$, constitutes a radial technical efficiency measure for DMU$_p$. If $W_p^* = 1$, DMU$_p$ is efficient; otherwise, DMU$_p$ is inefficient. Note that model (2) is called a DEA multiplier (weights) model. The relative efficiency of DMU$_p$ ($W_p$) is determined by assigning weights to the inputs and outputs of the DMU and maximizing the ratio of the weighted sum of outputs to the weighted sum of inputs.
The only underlying assumption for the weights of the inputs and outputs is non-negativity (called "total weights flexibility"), which is recognized as either a weakness or a strength of the traditional DEA models. The strength of allowing such flexibility is that no a priori assumptions are necessary with respect to the underlying production technology. The weakness of the flexibility is that the technical efficiency estimates may rely upon widely different weights, some of which may be economically or preferentially unrealistic. In addition, because of this flexibility, some of the input and output weights take the infinitesimal value $\varepsilon$. As a result, the relative efficiency of a DMU may not adequately reflect its performance, since some of the inputs and outputs are completely neglected by the DMU.
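To make the ratio definition concrete, the following minimal sketch evaluates the weighted output-to-input ratio of every DMU for one fixed weight vector and rescales the ratios so that the best unit scores 1. Since the rescaled weights satisfy the ratio constraints of the fractional model, each rescaled score is a lower bound on the corresponding CCR efficiency (ignoring the ε bound). The data and the function names `ratio` and `relative_scores` are hypothetical and serve only as an illustration.

```python
def ratio(u, v, x, y):
    """Weighted sum of outputs divided by weighted sum of inputs for one DMU."""
    weighted_out = sum(ur * yr for ur, yr in zip(u, y))
    weighted_in = sum(vi * xi for vi, xi in zip(v, x))
    return weighted_out / weighted_in


def relative_scores(u, v, X, Y):
    """Ratios of all DMUs under the same weights, rescaled so the best is 1."""
    raw = [ratio(u, v, x, y) for x, y in zip(X, Y)]
    best = max(raw)
    return [r / best for r in raw]


# Hypothetical data: 3 DMUs (rows), 2 inputs and 2 outputs (columns).
X = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0]]
Y = [[1.0, 2.0], [2.0, 1.0], [2.0, 2.0]]

# Two weight vectors that each ignore one input and one output.
scores_a = relative_scores(u=[1.0, 0.0], v=[0.0, 1.0], X=X, Y=Y)
scores_b = relative_scores(u=[0.0, 1.0], v=[1.0, 0.0], X=X, Y=Y)
print(scores_a)
print(scores_b)
```

Under the first weight vector the second DMU looks best, under the second weight vector the first DMU does. Model (2) lets each unit choose the vector that favors it most, which is why zero weights and unit-specific weights arise.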
3. Weights. In the previous section, we discussed the shortcomings of total weight flexibility in the DEA models. To obtain a desirable performance measure, we can curb this flexibility either by imposing exogenous weight restrictions or by equalizing the input and output weights across DMUs through an endogenous common set of weights (CSW).
3.1. Weight restrictions. Preference information (also called "weight restrictions") introduced by decision makers may appear in different forms, therefore requiring various treatments. The number of variables and DMUs used is directly linked with the discriminating power of DEA models and also with the potential number of zero weights. If a DEA assessment contains many variables, a DMU can probably find at least one factor on which it appears in the best possible light and ignore all other factors. Furthermore, in some situations, the evaluation of a small number of DMUs with a given number of inputs and outputs may not provide the desired degree of discrimination between DMUs. Incorporation of preference information on the production space by means of restrictions on inputs or outputs can deal with this difficulty.
The complete flexibility in the selection of weights is considered an advantage of DEA. However, many value judgement approaches have been proposed in the literature. Weights restriction is the most straightforward method for incorporating preference information in DEA. The first weights restrictions method in DEA was introduced by Thompson et al. [36].
The most popular weights restriction method in DEA is the assurance regions (ARs), which impose ratios between weights to be within certain ranges (Khalili et al. [20]). Sarrico and Dyson [32] introduced the concept of ARs into weights restrictions. They showed that the use of the AR is preferable to proportional weights restrictions. Bernroider and Stix [5] further studied the interaction between bound setting in the assurance region method and the validity of ranking outcomes in DEA. Another weights restriction method used commonly to bound the DEA multipliers is the cone ratio approach (Charnes et al., [11]; Charnes et al., [10]). Several different cone ratio models have been proposed for particular behavioral goals. Weight constraints are used in the literature based on more objective information such as price ranges (Thompson et al., [35]), or more subjective information such as individual or group judgements or preferences (Paradi et al., [26]), or a combination of objective and subjective information (Asmild et al., [4]).
Model (3) is known as the simplest form of incorporating weights restrictions (called absolute weight bounds) in the constant returns to scale (CRS or CCR) model:

$$
\begin{aligned}
\max\ & W_p = \sum_{r=1}^{s} u_r y_{rp} \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ip} = 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& U_r^l \le u_r \le U_r^u, \quad r = 1, \dots, s, \\
& V_i^l \le v_i \le V_i^u, \quad i = 1, \dots, m,
\end{aligned}
\tag{3}
$$

where $U_r^l$ and $U_r^u$ are positive lower and upper bounds on the output weights, while $V_i^l$ and $V_i^u$ are positive lower and upper bounds on the input weights, respectively. This formulation enables the model to determine the most favorable endogenous input and output weights within certain defined common bounds. In other words, controlling the weights may reflect the preferences of the evaluator or the organization, but careful analysis is necessary in defining weight restrictions in order to safeguard the economic interpretation of the result, to avoid infeasibility in (3) and ultimately to justify the exogenous ad hoc intervention in the performance evaluation. An advantage of the use of weight restrictions is that they increase the discriminatory power of the model.

3.2. Common set of weights. In some cases, widely differing weights for the same factor may not be relevant or acceptable in the evaluation of the DMU. This problem is related to extremely large or small weights assigned to given inputs or outputs. In such a case, it may be worthwhile to diminish the dispersion in the optimal weights assigned to the inputs and/or outputs by each DMU. The CSW procedure proposed by Roll et al. [28] is the extreme case where no flexibility is allowed in the selection of the input and output weights. In fact, CSW is a special case of the weight restrictions stated in the previous sub-section, i.e., the case where there is no inter-unit weight flexibility but the weights are still endogenously determined from the production set.
The DEA literature reports on several developments for determining CSW (Chiang et al., [8]; Jahanshahloo et al., [17]; Ramón et al., [27]; Saati, [29]; Saati and Memariani, [30]). Saati [29] proposed a method to specify a CSW in the DEA assessment. Here, we briefly review the method of Saati [29] because: (1) the method was the impetus for the CSW approach proposed in this study; and (2) the numerical example in Saati [29] is used in Section 5 to demonstrate the applicability and exhibit the efficacy of our procedures and algorithms.
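Model (4) below is used to obtain the upper bound of the output weights, and the analogous model can be applied to obtain the upper bound of the input weights:

$$
\begin{aligned}
\max\ & u_r \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ij} \le 1, \quad j = 1, \dots, n, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& u_r, v_i \ge 0, \quad r = 1, \dots, s,\ i = 1, \dots, m.
\end{aligned}
\tag{4}
$$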
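We can use the following equations for determining the upper levels of weights:

$$
U_r^u = \frac{1}{\max_j \{y_{rj}\}}, \quad r = 1, \dots, s, \qquad
V_i^u = \frac{1}{\max_j \{x_{ij}\}}, \quad i = 1, \dots, m.
\tag{5}
$$

We should note that the denominators of each equation in (5) must be non-zero. The following model is then used:

$$
\begin{aligned}
\max\ & \phi \\
\text{s.t.}\ & \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& U_r^l + \phi\,(U_r^u - U_r^l) \le u_r \le U_r^u - \phi\,(U_r^u - U_r^l), \quad r = 1, \dots, s, \\
& V_i^l + \phi\,(V_i^u - V_i^l) \le v_i \le V_i^u - \phi\,(V_i^u - V_i^l), \quad i = 1, \dots, m.
\end{aligned}
\tag{6}
$$

This model defines a set of bounded constraints on the input and output weights according to the central value approach. Note that $\phi$ can take values between 0 and 0.5. To clarify, if $\phi = 0$, then $U_r^l \le u_r \le U_r^u$ and $V_i^l \le v_i \le V_i^u$, and if $\phi = 0.5$, then $u_r = (U_r^l + U_r^u)/2$ and $v_i = (V_i^l + V_i^u)/2$, i.e., all weights are placed in the middle of their respective bounds. Assuming that the lower bounds are equal to zero in model (6) and that the upper bounds are calculated by (5), the following simplified model results:

$$
\begin{aligned}
\max\ & \phi \\
\text{s.t.}\ & \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, \dots, n, \\
& \phi\, U_r^u \le u_r \le (1 - \phi)\, U_r^u, \quad r = 1, \dots, s, \\
& \phi\, V_i^u \le v_i \le (1 - \phi)\, V_i^u, \quad i = 1, \dots, m.
\end{aligned}
\tag{7}
$$

After determination of a CSW by (7), the efficiency of each DMU can be calculated as:

$$
E_j = \frac{\sum_{r=1}^{s} u_r^* y_{rj}}{\sum_{i=1}^{m} v_i^* x_{ij}}, \quad j = 1, \dots, n,
\tag{8}
$$

where $v_i^*$ and $u_r^*$ are the optimal weights obtained from (7) that are assigned to the $i$th input and $r$th output, respectively.
4. Proposed method. One instance of model (2) must be solved for each DMU, so the number of LPs needed to solve a DEA problem equals the number of DMUs. In addition, the number of constraints in model (2) is equal to the number of DMUs plus one (i.e., $n + 1$). Even for small problems, this number is large enough to increase the computational complexity. Measuring the efficiency of DMUs by DEA model (2) corresponds to the estimation of a "best practice frontier". As a matter of fact, the efficiency frontier represents the "production possibility set" because the efficiency frontier consists of the maximum output levels for given input levels, or of the minimum input levels for given output levels. Hence, if the efficiency frontier is determined without solving LP models, then the above-mentioned computational complexities can be substantially reduced in the multiplier DEA problems.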
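Let us proceed with our earlier assumption that there are $n$ DMUs under consideration, where each DMU$_j$ ($j = 1, \dots, n$) uses $m$ inputs to produce $s$ outputs. The following two-phase procedure is used to extend the DEA model. In the first phase, we define an ideal DMU (IDMU) which basically is a vector norm applied to the empirical production space. The IDMU is an artificial DMU, weakly dominating all real DMUs. The inputs and outputs of the IDMU, denoted by $\bar{x}_i$ and $\bar{y}_r$, respectively, are defined as:

$$
\bar{x}_i = \min_j \{x_{ij}\}, \quad i = 1, \dots, m, \qquad
\bar{y}_r = \max_j \{y_{rj}\}, \quad r = 1, \dots, s.
\tag{9}
$$

In the second phase, after determination of the IDMU, we use model (7) for this DMU as follows:

$$
\begin{aligned}
\max\ & \phi \\
\text{s.t.}\ & \sum_{r=1}^{s} u_r \bar{y}_r - \sum_{i=1}^{m} v_i \bar{x}_i \le 0, \\
& \phi\, U_r^u \le u_r \le (1 - \phi)\, U_r^u, \quad r = 1, \dots, s, \\
& \phi\, V_i^u \le v_i \le (1 - \phi)\, V_i^u, \quad i = 1, \dots, m.
\end{aligned}
\tag{10}
$$

In contrast to (7), model (10) does not consider all DMUs in the assessment. Note that the first constraint can be changed from an inequality to an equality, and that formula (5) is used to calculate the upper and lower bounds. As a result, model (10) can be rewritten as:

$$
\begin{aligned}
\max\ & \phi \\
\text{s.t.}\ & \sum_{r=1}^{s} u_r \bar{y}_r - \sum_{i=1}^{m} v_i \bar{x}_i = 0, \\
& \phi\, U_r^u \le u_r \le (1 - \phi)\, U_r^u, \quad r = 1, \dots, s, \\
& \phi\, V_i^u \le v_i \le (1 - \phi)\, V_i^u, \quad i = 1, \dots, m,
\end{aligned}
\tag{11}
$$

with $U_r^u$ and $V_i^u$ calculated by (5). The DMUs actually in the reference set are then simply evaluated using the CSW obtained through model (11).
Proposition 1. Model (11) is always feasible and its optimal value is bounded.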
Proposition 2. All DMUs are dominated by the IDMU-efficient frontier.
Note that propositions 1 and 2 are necessary conditions for calculating an efficiency metric of the DMUs. Once we identify a CSW via model (11), the raw technical efficiency scores of the DMUs are directly obtained from (8) with the optimal values of model (11).
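The two phases can be sketched in a few lines of Python. This is a minimal sketch under two assumptions that the text does not state explicitly: the upper bounds entering model (11) are taken at the IDMU itself, i.e. 1/ȳ_r for the output weights and 1/x̄_i for the input weights, and all data are strictly positive. Under these assumed bounds the weighted IDMU output lies between φs and (1 − φ)s and the weighted IDMU input between φm and (1 − φ)m, so the equality constraint can be met exactly when φ ≤ min(m, s)/(m + s), and no LP solver is needed. This closed form is a shortcut that follows from the assumed bounds, not a result stated in the paper; with other bounds an LP solver would be required. All function names and data are hypothetical.

```python
def ideal_dmu(X, Y):
    """Phase 1, as in (9): least observed inputs and most observed outputs."""
    x_bar = [min(column) for column in zip(*X)]
    y_bar = [max(column) for column in zip(*Y)]
    return x_bar, y_bar


def common_weights(x_bar, y_bar):
    """Phase 2: optimum of model (11) under the ASSUMED bounds 1/y_bar, 1/x_bar.

    The equality constraint makes the weighted IDMU output and input meet at
    m*s/(m+s), which places every weight at an end point of its interval.
    """
    m, s = len(x_bar), len(y_bar)
    phi = min(m, s) / (m + s)
    v = [(s / (m + s)) / xi for xi in x_bar]
    u = [(m / (m + s)) / yr for yr in y_bar]
    return phi, v, u


def raw_scores(X, Y, v, u):
    """Raw CSW efficiency of each DMU: weighted outputs over weighted inputs."""
    scores = []
    for x, y in zip(X, Y):
        weighted_out = sum(ur * yr for ur, yr in zip(u, y))
        weighted_in = sum(vi * xi for vi, xi in zip(v, x))
        scores.append(weighted_out / weighted_in)
    return scores


def normalize(scores):
    """Rescale the raw scores so that the best DMU obtains efficiency 1."""
    best = max(scores)
    return [e / best for e in scores]


# Hypothetical data: 3 DMUs (rows), 2 inputs and 1 output (columns).
X = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0]]
Y = [[1.0], [2.0], [2.0]]

x_bar, y_bar = ideal_dmu(X, Y)
phi, v, u = common_weights(x_bar, y_bar)
raw = raw_scores(X, Y, v, u)
adjusted = normalize(raw)
print(x_bar, y_bar, phi)
print(raw)
print(adjusted)
```

For this toy data set the IDMU uses (2, 2) to produce 2, φ equals 1/3, no raw score exceeds 1, and the normalized scores are 0.5, 1 and 0.75.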
The raw scores do not necessarily contain a unit-valued efficiency. We can therefore normalize the efficiencies as follows:

$$
\tilde{E}_j = \frac{E_j}{\max_k \{E_k\}}, \quad j = 1, \dots, n,
\tag{12}
$$

where $E_j$ is the raw score of DMU$_j$ obtained from (8).
5. Illustration of the proposed method. In this section, we use the proposed method to demonstrate the applicability of our framework and to exhibit the simplicity and efficacy of the procedures with two examples. In the first illustration, we address a problem with 10 hypothetical DMUs introduced in Saati [29]. The second example is the application of our method to a real case study taken from Agrell and Bogetoft [1] with 286 Danish district heating plants.

5.1. A simple example. Consider ten DMUs, using three inputs to produce four outputs, presented in Table 1. The efficiencies of the DMUs, obtained by solving the CCR model (2) ten times with ε = 0, are shown in the second column of Table 2. It is clear that the discriminatory power is weak because 70% of the DMUs are technically efficient, i.e., their efficiency scores are equal to unity. As reported in Table 2, a large number of input and output weights take the value zero and different weights are obtained for each DMU. Note that the weight of the third input is always zero, implying that this dimension is effectively ignored in the efficiency analysis.
In order to deal with these problems, we use the CSW method proposed in this paper. We first calculate the ideal DMU using (9), as shown in the last row of Table 1. Model (11) is then instantiated for this IDMU as model (13), which includes eight variables and fifteen constraints. The optimal value of the objective function φ is 0.43 and the common weights (the optimal solution of (13)) are $v^* = \{0.001, 0.209, 0.006\}$ and $u^* = \{0.088, 0.120, 0.401, 0.282\}$, respectively. The maximum raw efficiency is equal to 0.612. We recalculate the efficiency values using (12), as reported in the third column of Table 2. Contrary to the CCR endogenous weights, the calculated CSW are all non-zero. In terms of technical efficiency, the conventional CCR method classifies 7 of the 10 DMUs as efficient, whereas the CSW yields only one technically (adjusted) efficient DMU; see the third column of Table 2. Thus, the proposed CSW method increases the discriminatory power. Furthermore, the common weights obtained from Saati [29] are $v^* = \{0.001, 0.198, 0.006\}$ and $u^* = \{0.040, 0.058, 0.329, 0.181\}$. It is clear that the input weights of the proposed CSW method and of the method of Saati [29] are almost identical. Also, the preference order of the output weights in the proposed method is $u_1^* < u_2^* < u_4^* < u_3^*$, and it is easy to see that the same order holds for the output weights in the method of Saati [29]. However, the proposed CSW method is computationally more economical.
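As a plausibility check on the reported optimum, assume (this is not stated explicitly in the text) that the upper bounds in model (11) are evaluated at the IDMU, i.e. $U_r^u = 1/\bar{y}_r$ and $V_i^u = 1/\bar{x}_i$, where $\bar{x}_i$ and $\bar{y}_r$ denote the IDMU inputs and outputs. With the three input weights ($m = 3$) and four output weights ($s = 4$) reported above, the weighted IDMU output and input are then confined to intervals that depend only on φ:

```latex
\phi s \le \sum_{r=1}^{s} u_r \bar{y}_r \le (1-\phi)\, s,
\qquad
\phi m \le \sum_{i=1}^{m} v_i \bar{x}_i \le (1-\phi)\, m .
% The equality constraint of (11) requires the two intervals to overlap:
\phi s \le (1-\phi)\, m
\;\Longrightarrow\;
\phi \le \frac{m}{m+s} = \frac{3}{7} \approx 0.43 .
```

Under this assumption the largest admissible φ is 3/7, which agrees with the reported value of 0.43 to two decimals. At that value every output weight sits at its lower bound and every input weight at its upper bound.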

5.2. A case study. In this section, we illustrate the proposed models using the panel data in Agrell and Bogetoft [1] on district heating plants in Denmark. The dataset contains 286 DMUs for 1998/99 with two inputs and four outputs, presented in Table 3.
The operating expenditure $X_1$ in kDKK and the primary fuel input $X_2$ in GJ were selected as the input parameters. The heat energy delivered $Y_1$ in GJ, the electrical energy delivered $Y_2$ in GWh, the heat capacity utilized $Y_3$ in MW and the total length of pipelines $Y_4$ in km were selected as the output parameters. We first use the CCR model to measure the efficiency of the plants. Table 4 provides details for the CCR model. Note that the problems of the CCR models mentioned in the previous example concerning the extreme and zero-valued weights are more drastic owing to the number and diversity of the DMUs in a real dataset. The application here is for regulated utilities under a cost-plus regulation, meaning that they can charge their actual costs to captive consumers. In Agrell and Bogetoft [1], the regulatory authority demanded an efficiency analysis of the sector so as to determine whether the utilities abused their monopoly rights by causing excessive costs. A naïve application of DEA using endogenous weights would overestimate the efficiency of these homogeneous plants, as plants with small individual differences in input (fuel) and output (heat vs. electricity) profile will claim individual weights that have no relevance with respect to the real value of inputs and outputs. However, a normative cost-efficiency analysis using an exogenous set of weights (i.e. costs) would require the regulator to know a priori the relative value of distribution costs (pipelines) versus heat deliveries $Y_1$, an action that would likely be challenged by the firms. Thus, we have a compelling case for a common set of weights, since all firms should be compared on equal ground in determining whether the sector is inefficient or not.
We obtain the ideal DMU by using the fourth and fifth columns of Table 3 for inputs and outputs, respectively. Next, we apply the proposed CSW approach to this example. The common weights for inputs and outputs are $v^* = \{4.83\text{E}{-03},\ 6.45\text{E}{-02}\}$ and $u^* = \{2.09\text{E}{-04},\ 3.20\text{E}{-05},\ 1.19\text{E}{-03},\ 1.39\text{E}{-04}\}$, respectively. We can use these weights to calculate the efficiency of all DMUs. The raw technical scores are adjusted using the transformation in (12), as shown in the third column of Table 4. The conventional multiplier CCR needs to solve 286 LPs in measuring the efficiency of the DMUs, whereas our method solves only one common weights model. The proposed method thus solves 285 fewer LP models than the conventional method, which shows that it is computationally economical. As shown in Table 4, CCR almost disregards the effect of the operating expenditure in evaluating the plants. In addition, CCR rates 9% of the plants as technically efficient, which illustrates its weak discriminatory power.
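The size comparison can be tallied with a few lines of Python. The sketch assumes that model (2) has n + 1 constraints and is solved once per DMU, and that model (11) has one IDMU constraint plus a lower and an upper bound for each of the m + s weights, which reproduces the fifteen constraints reported for the numerical example. The function name `lp_workload` is hypothetical.

```python
def lp_workload(n, m, s):
    """Number of LPs and constraints per LP for CCR model (2) and model (11)."""
    ccr = {"lps": n, "constraints_per_lp": n + 1}
    # One IDMU constraint plus two bounds per weight (assumed counting rule).
    csw = {"lps": 1, "constraints_per_lp": 1 + 2 * (m + s)}
    return ccr, csw


# Numerical example: 10 DMUs with 3 input weights and 4 output weights.
example_ccr, example_csw = lp_workload(n=10, m=3, s=4)
# Case study: 286 plants with 2 inputs and 4 outputs.
case_ccr, case_csw = lp_workload(n=286, m=2, s=4)

print(example_ccr, example_csw)
print(case_ccr, case_csw)
print("LPs saved in the case study:", case_ccr["lps"] - case_csw["lps"])
```

For the case study this gives 286 LPs with 287 constraints each for the CCR model, against a single LP with 13 constraints for the common weights model.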
Given that the application aims at determining sets of plants to form a reasonable cost norm, sets of plants to benefit from state subsidies and plants that potentially should have their licenses revoked, we divide the results into broader categories using color code analogies. The classes are given in Table 5. According to this classification scheme, if multiple plants fall in the same class, their relative efficiency values are used to rank them within that class. The results are summarized in Table 6 for each class. The second column of Table 6 shows the percentage of DMUs placed in each class. This classification provides insightful information, since the regulator notes that the efficient cost norm in Denmark is determined by a very limited set of firms, mostly larger multi-fuel plants or plants using industrial heat. According to Table 6, two percent and seven percent of the plants, respectively, are likely implementing best practice, whereas 29% are severely inefficient. The latter category mainly consists of small mono-fuel plants with short networks and high heat losses. Under competitive settings, these plants would probably never have been built. In addition, 25% and 37% of the plants are suggested for restructuring and operating cost support, respectively.
6. Conclusions and future research directions. DEA has been widely used to measure the relative efficiency of a group of homogeneous DMUs with multiple inputs and multiple outputs. The standard DEA model gives individual DMUs the utmost flexibility in selecting the weights for inputs and outputs. A shortcoming of this flexibility is that it hampers a common base for comparison. The weights of the inputs and outputs show each DMU in its most favorable light as long as the efficiency scores of all DMUs calculated from the same set of weights do not exceed 1. These sets of weights are typically different for each of the particular DMUs. Moreover, standard DEA models often produce a large number of weights which are zero, implying that the associated dimensions do not enter into consideration for the given DMU. DEA involves the solution of an LP problem to fit a non-stochastic, non-parametric production frontier based on the input-output data. The number of LP models in DEA problems is directly dependent on the number of DMUs under consideration. Consequently, the computational complexities and costs increase rapidly as the number of DMUs increases in a DEA problem.
We propose a two-phase procedure that determines the efficiency frontier without solving the original DEA LPs. Although the proposed method is computationally efficient, it has unavoidable limitations. While the conventional DEA problem creates the minimal piece-wise linear envelopment for the empirical production possibility set, our approach uses an exterior projection equivalent to the maximum set that contains the production possibility set in a hypercube tangential to the extreme observations in the sample. The CSW obtained by this maximum set is not defined by a priori preference information, but given through some selected properties of the production possibility set. Naturally, a maximal set in a hypercube cannot provide the same detailed dual information as a tight piece-wise linear envelope; thus, the economic interpretations and valid decompositions of the ranking scores obtained remain to be investigated.
The applications of computationally efficient CSW models are manifold; in this paper we demonstrate how the Danish energy regulator could have obtained additional and useful information about the overall state of the regulated sector by solving a single linear program. Similar settings are found not only in regulation, but also in human resources and supply chain management, where the evaluator is frequently unaware of the preference or cost weights for the performance dimensions but nevertheless strives to avoid obviously nonsensical relative valuations. The simple approach suggested in this paper can thus be used both ex post, to rank individual units, and ex ante, to get an idea of how a sector values different dimensions in best practice.