Efficiency Analysis of Public Transportation Subunits Using DEA and Bootstrap Approaches – Dakar Dem Dikk Case Study

Transportation plays an important role in the development of countries around the world. A crucial step in the transportation planning process is measuring the efficiency of transportation systems in order to guarantee the desired level of service. This paper investigates the relative efficiencies of the lines of the main public transportation company, Dakar Dem Dikk (DDD), in Dakar (Senegal). The objective is to apply Data Envelopment Analysis (DEA) and bootstrapping approaches in order to identify opportunities for improvement. We examine the technical efficiency of the 24 lines of DDD using output-oriented DEA models under Constant Returns to Scale (CRS) and Variable Returns to Scale (VRS). We apply the bootstrap approach for bias correction and to construct confidence intervals for our estimates. Finally, we characterize the returns to scale of the lines. The results show that there is room for improvement for the lines and that some lines have potential for restructuring.


Introduction
Throughout history, transport has been the backbone of economic growth worldwide. The ability to continuously improve performance and the capacity to serve people through shared passenger infrastructures are among the key factors for building effective transport systems (Swift, 2014). Within such a framework, public transportation appears as the most important component, especially in large cities of developing countries, like Dakar, the capital of Senegal. The population of Dakar is large and continuously growing. With a very high mobility demand, moving people through urban and suburban areas constitutes a real challenge, owing to the lack of adequate infrastructure and the high level of congestion. Perceived as a sustainable mobility solution, public transport has gained much interest from the Senegalese government, which created the Dakar Dem Dikk (DDD) company in 2001 with the objective of serving the urban and suburban areas of Dakar. In order to provide a satisfactory service level for the population, DDD company must not only abide by standard conformity guidelines (including safety, reliability, cleanliness, regularity, availability, and scheduling) but also respect its budgetary commitments. The importance of the latter requirement was stressed by the creation in 2014 of a new unit, PPSE (Pôle Pilotage et Suivi Evaluation), whose role is to monitor the activities of DEX (Direction de l'EXploitation). In addition to the general public, DDD company provides specialized transport services, such as the transport of school children, pensioners, and state agents.
Innovative ways of increasing revenue ought to be investigated within a competitive market, using the existing resources more efficiently. Instead of dealing with performance indicators of the whole company, a more judicious approach consists in treating the transportation lines as autonomous decision units (which is true in practice). The analysis of the lines' efficiency therefore enables the managers to monitor the activities of each line separately.
In this paper, we carry out a performance analysis of the lines of DDD company. Conducting a performance evaluation of the lines is a necessary step in developing a meaningful set of benchmarks for best practices. We use Data Envelopment Analysis (DEA) to determine the efficiency score of each transportation line, identify potential sources of inefficiency and, hence, alert the management to low-performing lines and suggest ways for improvement.
The DEA bootstrapping approach is also applied for bias correction and confidence interval construction for the efficiency scores. To our knowledge, this study is the first attempt to evaluate the performance of the lines of a public transportation company in Senegal. Such a focused study can help stakeholders to determine the current competitive position of DDD company, in addition to supporting decisions pertaining to the improvement of operational performance and resource reallocation.
The rest of the paper is organized as follows. In section 2, we review part of the literature on the application of DEA and DEA bootstrap in transport. Section 3 is devoted to the methodology adopted to conduct our study. Section 4 presents the data used for the study, followed by an analysis of the results in section 5. Finally, we conclude with prospective recommendations in section 6.

Literature Review
Data envelopment analysis (DEA) is a powerful method for evaluating the performance of decision making units (DMUs) that consume multiple inputs (resources) to generate multiple outputs (products). DEA has been widely used to analyze the performance of public transport systems. Some recent studies applying DEA and/or bootstrap methods are summarized in Table 1. The bootstrap method is used to construct confidence intervals and apply bias correction to efficiency estimates.

All the aforementioned studies are concerned with the analysis of entire systems, except Barnum et al. (2007) and Devaraj et al. (2015). Barnum et al. (2007) evaluate the efficiency of subunits within a single organization rather than entire units. A similar approach is adopted in Devaraj et al. (2015) but, due to the small sample size, bootstrapping is used for bias correction. Von Hirschhausen and Cullmann (2010) also use the bootstrap method to assess the robustness of the efficiency estimates and to test the hypothesis of global and individual constant returns to scale for entire transportation units.
In our study, we address comparable objectives but within a different contextual setting. We consider subunits within a single company.

Data Envelopment Analysis
DEA was developed by Charnes, Cooper and Rhodes (1978). It is a tool for measuring the «relative» efficiency of organizations, called Decision Making Units (DMUs), via weights associated with input and output factors. Based on linear programming techniques, DEA can be used to measure technical efficiency, allocation effectiveness of inputs and outputs, and economic performance of production means (Banker, Charnes & Cooper, 1984). DEA has been used in many contexts and areas (Beasley, 1990; Beasley, 1995; Beasley, 2003; Le Floc'h & Mardle, 2005; Cullinane, Ji, & Wang, 2005). To achieve technical efficiency, one can be interested in either input reduction (input orientation) or output augmentation (output orientation). In the input orientation, the objective is to produce the observed outputs with minimum resources and, as a result, inefficiency is treated in terms of input excess. In the output orientation, the objective is to maximize output production without exceeding the given input levels, so inefficiency is instead addressed via output slacks. See, e.g., Charnes et al. (1978), Banker et al. (1984), Zhu (2002), Cooper and Seiford (2007), and Cooper, Seiford and Zhu (2011) for more details on the DEA method. Since DDD company is operating within a competitive environment, the key objective for each line is to increase the company's profits. Hence, we use output-oriented models to evaluate the efficiency of the lines. Assuming n lines, each line j consuming m inputs $x_{ij}$ (i = 1, ..., m) to produce s outputs $y_{rj}$ (r = 1, ..., s), the efficiency $\omega_0$ of line 0 can be measured through the following CCR (Charnes, Cooper and Rhodes) output-oriented envelopment model (1):

\[
\begin{aligned}
\max\quad & \omega_0 \\
\text{s.t.}\quad & \sum_{j=1}^{n} x_{ij}\lambda_j \le x_{i0}, && i = 1, \dots, m \\
& \sum_{j=1}^{n} y_{rj}\lambda_j \ge \omega_0\, y_{r0}, && r = 1, \dots, s \\
& \lambda_j \ge 0, && j = 1, \dots, n
\end{aligned}
\tag{1}
\]

where $\lambda_j$ are the weights of the lines j that are included in the benchmarking set of the line under evaluation (line 0), and $\omega_0$ also measures the feasible expansion of the output levels of line 0.
The CCR model (1) assumes Constant Returns to Scale (CRS). The VRS model, also called the BCC model (Banker et al., 1984), is obtained by adding the convexity condition $\sum_{j=1}^{n} \lambda_j = 1$ to model (1). Assuming VRS allows the efficiency measure to be independent of scale inefficiency: it separates pure technical efficiency from scale efficiency. The CRS assumption may result in efficiency measures influenced by scale efficiency when not all DMUs are operating at optimal scale. At the optimal solution of model (1), we have $\omega_0^* \ge 1$. The output technical efficiency of the line under evaluation is given by $1/\omega_0^*$. In model (1), input and output slack values may exist. After solving (1), we have

\[
s_i^{-*} = x_{i0} - \sum_{j=1}^{n} x_{ij}\lambda_j^*, \quad i = 1, \dots, m, \qquad
s_r^{+*} = \sum_{j=1}^{n} y_{rj}\lambda_j^* - \omega_0^*\, y_{r0}, \quad r = 1, \dots, s,
\]

where $s_i^{-*}$ and $s_r^{+*}$ are the input and output slacks, respectively, and $\omega_0^*$ and $\lambda_j^*$ are the optimal values from (1). To be efficient, a line must satisfy the following two conditions (Charnes et al., 1978): (i) $\omega_0^* = 1$; and (ii) all slacks are zero, that is, $s_i^{-*} = 0$ for all i and $s_r^{+*} = 0$ for all r.
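As an illustration only, the following Python sketch evaluates the output-oriented scores in the special case of a single input and a single output, where model (1) and its VRS variant can be solved by direct enumeration instead of a linear programming solver. The data, the function names and the restriction to one input and one output are assumptions made for this example; the case study itself uses three inputs, three outputs and an LP solver.

```python
def crs_output_score(x, y, k):
    """Optimal omega for unit k in model (1) with one input and one output.

    Under CRS the best attainable output per unit of input is the largest
    ratio y_j / x_j, so unit k could produce x[k] * best_ratio.
    """
    best_ratio = max(yj / xj for xj, yj in zip(x, y))
    return best_ratio * x[k] / y[k]


def vrs_output_score(x, y, k):
    """Optimal omega for unit k once the condition sum(lambda) = 1 is added.

    With one input constraint and one convexity constraint, an optimal
    solution combines at most two units, so singles and pairs are enumerated.
    """
    x0, y0 = x[k], y[k]
    units = list(zip(x, y))
    # Single units that use no more input than unit k (unit k itself qualifies).
    best = max(yj for xj, yj in units if xj <= x0)
    # Convex combinations of a smaller and a larger unit that use exactly x0.
    for xa, ya in units:
        for xb, yb in units:
            if xa < x0 < xb:
                best = max(best, ya + (yb - ya) * (x0 - xa) / (xb - xa))
    return best / y0


# Hypothetical data: one input and one output for four units.
x = [2.0, 4.0, 6.0, 5.0]
y = [2.0, 6.0, 8.0, 5.0]

for k in range(len(x)):
    w_crs = crs_output_score(x, y, k)
    w_vrs = vrs_output_score(x, y, k)
    # Technical efficiency is the reciprocal of omega.
    print(k, round(1 / w_crs, 4), round(1 / w_vrs, 4))
```

In this toy data set the last unit has a larger score under CRS than under VRS, which is the kind of gap that the scale efficiency analysis later attributes to operating scale.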
If any slack value is positive at the optimal solution of model (1), the line under evaluation can improve further: after expanding its output levels by the proportion $\omega_0^*$, the corresponding input can still be reduced, or the corresponding output increased, by the slack amount. We use a new approach introduced by Carlos, Bana, João Carlos, Soares and Lidia (2016) to draw the CCR and the BCC efficient frontiers and the positions of the lines in relation to these frontiers in a bi-dimensional graph. In both cases, the efficient frontier is reduced to a straight line from the origin that bisects the quadrant. As in the standard DEA graphical representation, the efficient lines are on the efficient frontier and the inefficient ones are below it. Such a graphical representation clearly indicates how far the inefficient lines are from the efficient frontier. Moreover, the efficiency scores can be determined geometrically in the graph, as in the standard CCR and BCC models. These graphical representations can support decision makers by giving a global view of the lines with respect to the efficient frontiers and of their positions relative to each other.

DEA Bootstrapping
The DEA efficiency scores are upward biased by construction because they are based on the empirical frontier and not on the true unknown production frontier (Roets & Christiaens, 2015). Moreover, DEA estimates are highly sensitive to sampling variations and errors in the data. To overcome these shortcomings, we apply a bootstrap method (Simar & Wilson, 1998; Simar & Wilson, 2000; Simar & Wilson, 2007; Bogetoft & Otto, 2011). The bootstrapping concept is based on the idea that simulating the sampling distribution of interest is possible by mimicking the data-generating process (DGP). Under the assumption that the original data sample is generated by the DGP, the DEA efficiency scores are re-estimated with the «simulated» data. Through multiple replications of this process, a Monte Carlo approximation of the sampling distribution is derived from the empirical distribution of the bootstrap values (Oukil, Channouf, & Al-Zaidi, 2016).
Let us consider a data set of size n, $x_1, x_2, \dots, x_n$. The idea of the bootstrap method is to sample observations from this data set with replacement and create a new random data set of the same size as the original (Bogetoft & Otto, 2011). Assume we have a sample of p elements, $x_1, x_2, \dots, x_p$. Resampling these p elements with replacement from the original sample yields a bootstrap sample $x^b$ of size p. The statistic of interest $t(x^b)$, called a replicate, can be estimated using the bootstrap sample $x^b$. One repeats this process B times, thereby creating a sample of B replicates $t(x^b)$ (b = 1, 2, ..., B). A bootstrap estimate of the standard error of $t(x)$ with B replications is obtained as

\[
\widehat{se}_B = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( t(x^b) - \bar{t} \right)^2}, \qquad \text{where} \quad \bar{t} = \frac{1}{B} \sum_{b=1}^{B} t(x^b)
\]

is the mean of the B replications of the statistic in question.
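The resampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the plain bootstrap standard error for a generic statistic, not the DEA-specific procedure of Bogetoft and Otto (2011); the sample values, the choice of the mean as the statistic, the number of replications and the function names are all assumptions made for the example.

```python
import random


def bootstrap_standard_error(sample, statistic, B=1000, seed=0):
    """Return (standard error, replicates) of `statistic` over B bootstrap samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    p = len(sample)
    replicates = []
    for _ in range(B):
        # Draw p observations with replacement from the original sample.
        resample = [rng.choice(sample) for _ in range(p)]
        replicates.append(statistic(resample))
    mean_rep = sum(replicates) / B
    variance = sum((t - mean_rep) ** 2 for t in replicates) / (B - 1)
    return variance ** 0.5, replicates


def mean(values):
    return sum(values) / len(values)


# Hypothetical sample of ten observations.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 4.9]
se, replicates = bootstrap_standard_error(sample, mean, B=500, seed=42)
print(round(se, 4))
```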
In our analysis we apply the bootstrap algorithm presented in Bogetoft and Otto (2011) to estimate the bias and variance of the DEA efficiency scores and to construct confidence intervals.
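Let B be the number of bootstrap replications and $\omega^{*b}$ (b = 1, ..., B) the bootstrap estimate of $\omega$ for each of these samples, where $\hat{\omega}$ is the DEA-estimated efficiency for the initial sample. The mean efficiency score can be determined for each line as:

\[
\bar{\omega}^{*} = \frac{1}{B} \sum_{b=1}^{B} \omega^{*b}
\]

and the variance of the bootstrap estimate is:

\[
\hat{\sigma}^{2} = \frac{1}{B} \sum_{b=1}^{B} \left( \omega^{*b} - \bar{\omega}^{*} \right)^2 .
\]

The bootstrap estimate of the bias is:

\[
\mathrm{bias}^{*} = \bar{\omega}^{*} - \hat{\omega} .
\]

A bias-corrected estimator of $\omega$ (the true but unknown efficiency) is then

\[
\tilde{\omega} = \hat{\omega} - \mathrm{bias}^{*} = 2\hat{\omega} - \bar{\omega}^{*} .
\]

The confidence intervals are determined by

\[
\Pr\left( \hat{c}_{\alpha/2} \le \omega^{*b} - \hat{\omega} \le \hat{c}_{1-\alpha/2} \,\middle|\, \hat{T} \right) = 1 - \alpha ,
\]

with $\hat{T}$ the estimated DEA technology, and $\hat{c}_{\alpha/2}$ and $\hat{c}_{1-\alpha/2}$ the estimated lower and upper quantiles, respectively. The resulting $(1-\alpha)$ confidence interval for $\omega$ is $[\hat{\omega} - \hat{c}_{1-\alpha/2},\ \hat{\omega} - \hat{c}_{\alpha/2}]$.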
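To make these quantities concrete, the sketch below computes the bootstrap mean, variance, bias, bias-corrected score and confidence interval from a given list of bootstrap replicates. It assumes the replicates have already been produced by some DEA bootstrap routine; the numbers, the function names and the linearly interpolated quantile rule are assumptions for illustration and are not taken from the case study.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def bootstrap_summary(omega_hat, replicates, alpha=0.05):
    """Bias, bias-corrected score and (1 - alpha) interval from bootstrap replicates."""
    B = len(replicates)
    mean_rep = sum(replicates) / B
    variance = sum((w - mean_rep) ** 2 for w in replicates) / B
    bias = mean_rep - omega_hat
    corrected = omega_hat - bias  # equals 2 * omega_hat - mean_rep
    # Quantiles of the differences between replicates and the DEA estimate.
    diffs = sorted(w - omega_hat for w in replicates)
    c_low = quantile(diffs, alpha / 2)
    c_high = quantile(diffs, 1 - alpha / 2)
    interval = (omega_hat - c_high, omega_hat - c_low)
    return {"mean": mean_rep, "variance": variance, "bias": bias,
            "corrected": corrected, "interval": interval}


# Hypothetical DEA score and four bootstrap replicates for one line.
summary = bootstrap_summary(0.95, [0.96, 0.97, 0.98, 0.97])
print(summary)
```

With replicates that lie above the original estimate, as in this toy case, the bias is positive and both the corrected score and the interval fall below the original DEA score.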

Case Study
In service efficiency studies related to transport, three basic inputs are generally considered, namely, labor, fuel, and capital (Sampaio et al., 2008). By contrast, there is no exhaustive list of outputs, since the latter vary depending on the specific transportation system. For urban transit systems, Devaraj et al. (2015) provide a non-exhaustive list of input and output variables that are usually employed in DEA-based studies.
To conduct our study, data were collected from the DDD operations management department for a total of 24 lines, which include 12 urban lines and 12 suburban lines. Although both line categories operate under similar conditions, the suburban lines might need to leave the urban areas to serve the peripheral suburbs. Three inputs are considered, that is, fuel, number of buses and line length. Fuel measures the number of liters of gas oil consumed on each line. Since this input is implicitly a proxy for air pollution, it reflects perfectly the dictum «less is better» (Cook & Joe, 2015). The number of buses available on each line can be treated as a dual variable, depending on whether the objective is customer satisfaction or cost reduction. In the year 2014, DDD company was operating only 74% of its buses, which further affected the quality of service offered to customers. The number of buses and line length are used as physical measures for labor and capital.
The outputs that are strongly influenced by these inputs are Distance, Receipts and Passengers. The distance refers to the total distance traveled on each line, which depends on the number of buses that are effectively operating on the line. The receipts and passengers indicate, respectively, the total receipts collected and the number of passengers served on each line. Table 2 presents descriptive statistics of the data used.

DEA-CCR
We solve the CCR envelopment model (1) for DDD company. The numerical tests were performed using the commercial MILP solver IBM-CPLEX (2013). The numerical experiments were executed on a computer with 5×Intel(R) Core(TM)4 Duo CPU 2.60GHz and 8.0 GB of RAM, under a UNIX system. Table 3 shows an average efficiency score above 98% for the year 2015, which suggests that, overall, DDD's lines are performing well. However, only 7 out of 24 lines are efficient, that is, 29.17% of the lines. Moreover, four of these lines are suburban (lines 2, 5, 16, 219). Among the efficient lines, line 2 is a benchmark for 17 inefficient lines (Table 3).
Being the most frequent benchmark, line 2 is a well-performing line that is likely the most suitable role model for less efficient lines (Thanassoulis, 2001). Thus, line 2 deserves particular attention from DDD's managers. All lines whose efficiency score is less than 1 are inefficient and, to improve their efficiency, they need to imitate the corresponding benchmarks. For instance, line 1 needs to copy its peers, which are lines 2, 5, 7 and 16, and increase all its outputs by a factor of 1.0092 using the current input levels.
The potential input reductions or output expansions for inefficient lines are identified through slack analysis. The results in Table 4 show that, for all the inefficient lines, the receipts and the number of passengers can be increased while the number of buses and the line length can be decreased. For instance, line 232 has significant potential for an increase in receipts generation (105.97%) and in passenger transportation (93.35%), and for a reduction in the number of buses (78.80%) and in length (71.39%).
There is no excess in fuel consumption and no shortfall in distance traveled. In order to detect and monitor these trends, DEA experts and decision makers could work together (Barnum et al., 2007).
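The way an expansion factor and slacks translate into improvement targets can be sketched as follows. This is an illustrative calculation under the usual DEA projection (outputs scaled by the optimal factor plus output slack, inputs reduced by input slack); the numbers and function names are hypothetical, and the exact convention used to compute the percentages in Table 4 is not stated in the text.

```python
def output_targets(outputs, omega, output_slacks):
    """Target outputs: proportional expansion by omega plus any output slack."""
    return [omega * y + s for y, s in zip(outputs, output_slacks)]


def input_targets(inputs, input_slacks):
    """Target inputs: observed inputs minus any input slack (excess)."""
    return [x - s for x, s in zip(inputs, input_slacks)]


def percent_change(actual, target):
    """Percentage change needed to move each observed value to its target."""
    return [100.0 * (t - a) / a for a, t in zip(actual, target)]


# Hypothetical line with two outputs, two inputs and an optimal factor of 1.25.
outputs, out_slacks = [100.0, 200.0], [0.0, 10.0]
inputs, in_slacks = [10.0, 40.0], [2.0, 0.0]

y_target = output_targets(outputs, 1.25, out_slacks)
x_target = input_targets(inputs, in_slacks)
print(percent_change(outputs, y_target))
print(percent_change(inputs, x_target))
```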
When a slack value is positive at the optimal solution, the corresponding input or output constraint is nonbinding and the dual variable associated with the constraint equals 0. Conversely, when a slack value is zero, the constraint is binding and its dual variable may be positive. It can be observed from Table 4 that the fuel and distance traveled slacks are zero everywhere. This is because, in the multiplier model corresponding to the dual of model (1), very high weights are attributed to the fuel and distance variables and very low or zero weights are given to the other variables. This is one of the drawbacks of DEA self-evaluation analysis. This is why approaches such as cross-efficiency analysis or weight restrictions have been developed in DEA to deal with these problems (Sexton, Silkman, & Hogan, 1986; Doyle & Green, 1994; Podinovski, 2016; Simar & Wilson, 2007). In particular, the weight restrictions approach allows incorporating expert opinion regarding the relative importance of particular inputs and outputs in the production process.
Each line chooses its weights according to the kind of inputs it can convert into the production of a given kind of outputs. For instance, line 232 attributed all its input weights to fuel and all its output weights to distance traveled, while line 1 distributed its input weights between fuel, number of buses, and length, and its output weights between distance traveled and receipts. Lines choosing the same weighting schemes likely have the same speciality in converting specific inputs into the production of specific outputs.
In Figure 1, all the lines appear to be very close to the efficient frontier. This could be expected given the lines' efficiency scores. In order to visualize the distances between the inefficient lines and the efficient frontier, providing an idea of how inefficient these lines are, we have zoomed in on a part of Figure 1 in Figure 2. From this figure, it can be observed that the majority of the lines are concentrated around the same area, indicating the proximity of their operating activities.
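For reference, the multiplier model mentioned in the discussion of weights can be written out explicitly. The formulation below is the standard linear programming dual of the output-oriented envelopment model (1); the symbols $v_i$ (input weights) and $u_r$ (output weights) are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\min\quad & \sum_{i=1}^{m} v_i\, x_{i0} \\
\text{s.t.}\quad & \sum_{r=1}^{s} u_r\, y_{r0} = 1, \\
& \sum_{i=1}^{m} v_i\, x_{ij} - \sum_{r=1}^{s} u_r\, y_{rj} \ge 0, && j = 1, \dots, n \\
& u_r \ge 0,\ v_i \ge 0, && r = 1, \dots, s,\ i = 1, \dots, m
\end{aligned}
```

Each $v_i$ is the dual variable of the i-th input constraint of model (1) and each $u_r$ that of the r-th output constraint, so a positive slack on a constraint forces the matching weight to zero by complementary slackness.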
Figure 2. A zoomed part of the CCR efficient frontier

Bias Correction and Confidence Intervals
As mentioned above, DEA efficiency scores are upward biased and sensitive to sampling variations, among other things. To overcome this upward bias in efficiency estimation, we use the bootstrap method (Simar & Wilson, 1998; Simar & Wilson, 2000; Simar & Wilson, 2007; Bogetoft & Otto, 2011). The efficiency results from the original model and the bootstrap approach, along with the confidence intervals for the efficiency, are shown in Table 5. It can be observed from Table 5 that the bootstrapped efficiencies are lower than the original DEA efficiencies. This is due to the presence of upward bias. After bias correction, all lines turn out to be inefficient. The number of DMUs in this study, relative to the number of inputs and outputs, satisfies the rule of thumb (Ozbek et al., 2009) that the number of DMUs should be at least two times the product of the number of inputs and the number of outputs (here, 24 ≥ 2 × 3 × 3 = 18). Thus, the suspicion of bias due to the sample size is avoided.
The mean efficiency score moves from 0.9896 for the original DEA efficiency to 0.9836 for the bootstrapped efficiency. Since the variances are quite low for all the individual estimates, we consider our results individually and globally robust (see Table 5).

DEA-BCC
As mentioned above, when the efficiency score is measured under the CRS assumption, the results can be influenced by scale inefficiency. Thus, we now solve the DEA BCC model in order to identify scale inefficiency. Table 6 presents the resulting relative efficiency scores and the benchmarks for inefficient lines.
Under the BCC model, 16 lines are efficient and 8 are inefficient (see Table 6). We observe that lines gain in efficiency and that the average efficiency is relatively high under both assumptions. The inefficient lines are: line 4, line 8, line 11, line 12, line 18, line 23, line 217, and line 227. Hence, 66.67% of the lines are efficient. Among the inefficient lines, four are suburban (lines 11, 12, 217, 227) and four are urban (lines 4, 8, 18, 23). Line 2 serves, this time also, as a benchmark for all inefficient lines (see Table 6). It can be noted that the CCR model discriminates more than the BCC model. The number of times an efficient line serves as a benchmark for the inefficient ones is presented in Table 6. We can observe in Table 6 that lines 5, 15, 121, and 228 are efficient but do not serve as benchmarks for any inefficient line. Line 2 is at the top, being a benchmark for eight inefficient lines, followed by lines 6 and 16, each a benchmark for four. Lines 7, 10, 13, 218, and 232 are each benchmarks for three, and lines 1, 9, 20, and 219 each for one.
For the inefficient lines, Table 7 presents the potential increase in outputs and decrease in inputs, expressed as percentages. Line 4 could increase its receipts by 5.98%. Line 11 could decrease its length by 6.40% and increase its receipts and passengers by 5.78% and 7.54%, respectively. The potential increase for line 12 is in receipts, by 2.61%. Line 18 could increase its receipts by 0.78% and decrease its length by 13.23%; line 23 has the potential to increase its receipts by 1.18%. Line 217 could increase its receipts by 5.82% and decrease its length by 29.41%. Finally, line 227 has the potential to decrease both its number of buses and its length, by 2.7% and 10.11%, respectively. As in the DEA-CCR case, there is no increase in kilometers and no decrease in fuel consumption for any of the inefficient lines under the BCC model. Figure 3 depicts the BCC efficient frontier and the position of the lines relative to this frontier. As in the CCR case, the inefficient lines are not far from the efficient frontier and the efficient ones are on it. A part of Figure 3 is zoomed in Figure 4 in order to further visualize the position of the inefficient lines relative to the efficient frontier. It can be observed from the figure that some lines are concentrated around the same area. The CCR and BCC efficient frontiers have the same shape irrespective of the model used, as mentioned in Carlos et al. (2016). We have observed in both cases that the small-sized lines are near the origin, i.e. the (0,0) point, as predicted in Carlos et al. (2016), and the large-sized ones are far from that point.
Comparing the efficiency scores from the CCR and BCC models, some lines show significant differences between the two scores. This indicates scale inefficiencies of the lines.

Scale Efficiency (SE)
Scale efficiency is the ratio of the CRS efficiency score to the VRS efficiency score. Its value is greater than 0 and less than or equal to 1. When the CRS and VRS efficiency scores are identical, the line under evaluation is scale efficient and its scale efficiency is equal to 1. Hence, the efficiency score of that line is not influenced by moving from the CRS assumption to the VRS assumption. The results for DDD company indicate high levels of scale efficiency (see Table 8). Thus, the majority of the inefficiency detected under constant returns to scale is not caused by lines operating at too high or too low a scale.
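The returns to scale (RTS) information is important because it indicates the gains from adjusting the size of a line. The RTS are considered to be increasing if a proportional increase in all the inputs results in a more than proportional increase in the outputs. RTS are decreasing if a proportional increase in all the inputs results in a less than proportional increase in the outputs. Both cases suggest that lines do not operate at their optimal size. Hence, rescaling is possible. By contrast, under CRS a proportional increase in the inputs results in a proportional increase in the outputs, which suggests that lines operate at their optimal size. According to the results presented in Table 8, the majority of the lines (16) of DDD company are scale inefficient. To identify the direction of scale inefficiency, i.e. Decreasing Returns to Scale (DRS) or Increasing Returns to Scale (IRS), we consider another model called the Non-Increasing Returns to Scale (NIRS) model. This model is derived from the VRS (BCC) model by replacing the constraint $\sum_{j=1}^{n} \lambda_j = 1$ with $\sum_{j=1}^{n} \lambda_j \le 1$ (Coelli, Prasada Rao, O'Donnell, & Battese, 2005). The VRS and NIRS technical efficiency scores are compared in order to identify the nature of scale inefficiency for each line.
• If the NIRS technical efficiency and the VRS technical efficiency scores are unequal, then the returns to scale are increasing (IRS);
• If the NIRS technical efficiency score is equal to the VRS technical efficiency score (but is not equal to the CRS technical efficiency score), then the returns to scale are decreasing (DRS);
• Finally, if the NIRS, the VRS and the CRS technical efficiency scores are equal, then the returns to scale are constant (CRS).
The scale efficiency as well as the nature of returns to scale are presented in Table 8. The majority (fifteen (15)) of the lines of DDD company are operating under IRS, one line operates under DRS and only eight (8) lines operate under CRS (see Table 8). This confirms what is said in the literature, namely that increasing returns to scale prevail for small-sized companies (Von Hirschhausen & Cullmann, 2010). Correcting the inefficiencies in the lines may improve the overall efficiency of DDD company. This will enable DDD company to be competitive and to provide a satisfactory service to the population.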
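The scale efficiency ratio and the comparison of CRS, VRS and NIRS scores described in this section can be expressed as a small helper. This is a sketch only: the function names, the numerical tolerance used to decide whether two scores are "equal", and the example scores are assumptions, not values from Table 8.

```python
def scale_efficiency(crs_score, vrs_score):
    """Scale efficiency: ratio of the CRS score to the VRS score."""
    return crs_score / vrs_score


def returns_to_scale(crs_score, vrs_score, nirs_score, tol=1e-6):
    """Classify returns to scale by comparing CRS, VRS and NIRS efficiency scores."""
    if abs(crs_score - vrs_score) <= tol and abs(nirs_score - vrs_score) <= tol:
        return "CRS"  # all three scores coincide
    if abs(nirs_score - vrs_score) <= tol:
        return "DRS"  # NIRS equals VRS but differs from CRS
    return "IRS"      # NIRS differs from VRS


# Hypothetical (CRS, VRS, NIRS) efficiency scores for three lines.
examples = {
    "line A": (0.90, 0.90, 0.90),
    "line B": (0.80, 1.00, 1.00),
    "line C": (0.72, 0.90, 0.72),
}

for name, (crs, vrs, nirs) in examples.items():
    print(name, round(scale_efficiency(crs, vrs), 4), returns_to_scale(crs, vrs, nirs))
```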

Concluding Remarks
In this paper, we have applied DEA and bootstrapping methods to investigate the line efficiencies of DDD company, and we provide ways of improving its overall performance. We have used an innovative approach to represent the BCC and CCR efficient frontiers in a bi-dimensional graph. These graphical representations can support decision makers by giving a global view of the lines with respect to the efficient frontier. We have shown that, among the 24 lines of DDD company, only 7 lines are efficient under the CRS assumption and 16 under the VRS assumption. Under both assumptions, the suburban lines are found to be more efficient than the urban ones. This is due to the organization of the population's activities. The majority of the population lives in the suburban areas and works in the urban areas. Hence, people travel every day from the suburban areas to the urban areas and return at the end of their activities. It was also shown that the majority of the lines (fifteen (15)) are operating under IRS, one (1) under DRS and eight (8) under CRS. The information on returns to scale can be used to improve the structure of the lines of DDD company. With a view to maximizing profits and providing a satisfactory public service, we suggest increasing the number of buses on the lines operating under IRS. As a consequence, this will increase the number of passengers, receipts and kilometers. This will also improve the quality of the service, because it will decrease overloads, waiting times and irregularities. The same can be done for lines operating under CRS. In order to improve the efficiency of line 227, we suggest decreasing its number of buses and its length, because it operates under DRS. In addition, the analysis indicated that the inefficient lines have opportunities for improvement. Information on the benchmarks for inefficient lines can be used to improve their performance. The analysis of the slacks showed that there is potential for an increase in both receipts generation and passenger transportation, and for a decrease in both number of buses and length for the lines. However, there is no reduction in fuel consumption and no increase in kilometers. Correcting the inefficiencies in the inefficient lines may improve the overall efficiency of DDD company.

Figure 1 presents the graphical representation of the CCR efficient frontier along with the position of the lines in relation to this frontier. The efficient lines are on the efficient frontier and the inefficient ones are below it.

Figure 1. Graphical representation of the CCR efficient frontier and the lines

Figure 3. Graphical representation of the BCC efficient frontier and the lines

Table 1. Studies applying DEA and Bootstrapping in transport

Table 2. Descriptive statistics of variables for the period 2014

Table 3. CRS efficiency scores and benchmarks

Table 5. Results of bootstrapping CRS efficiency scores and confidence intervals

Table 6. VRS efficiency scores

Table 8. Scale efficiency