Examining the determinants of efficiency using a latent class stochastic frontier model

Abstract In this study, we combine the latent class stochastic frontier model with the complex time decay model to form a single-stage approach that accounts for unobserved technological differences when estimating efficiency and the determinants of efficiency. In this way, we contribute to the literature by estimating “pure” efficiency and its determinants for productive units based on the class structure. An application of the proposed model is presented using data on the Ghanaian banking system. Our results show that inefficiency effects on the productive unit are specific to the class structure of the productive unit, and therefore assuming a common technology for all productive units, as in the popular Battese and Coelli model used extensively in the literature, may be misleading. The study therefore provides useful empirical evidence on the importance of accounting for unobserved technological differences across productive units. A policy based on the identified classes of the productive unit enables more accurate and effective measures to address efficiency challenges within the banking industry, thereby promoting financial sector development and economic growth.


ABOUT THE AUTHORS
Michael Danquah is an economist and lecturer at the Department of Economics, University of Ghana, Legon. He holds a PhD in Economics from Swansea University (UK). His research interests include informality, inclusive growth and stochastic frontier modelling. He has published in journals such as Economic Modelling and Empirical Economics among others. In 2015, he was interviewed by BBC World Service on the Live 8, G8 and the making of Poverty History program.
Peter Quartey holds a PhD in Development Economics from the University of Manchester (UK). He is an associate professor in Development Economics and currently the head, Department of Economics, University of Ghana. He is also an economist with the Institute of Statistical, Social and Economic Research, University of Ghana. He was formerly the deputy director, Centre for Migration Studies. He has published extensively and his research interests are private sector development, development finance, migration and remittances and poverty analysis.

PUBLIC INTEREST STATEMENT
In this paper, we attempt to show that it is important to account for differences in the technology of productive units in order to accurately estimate their efficiency and the determinants of efficiency. To do this, we use a latent class stochastic frontier model that accounts for underlying technology differences to derive efficiency as well as its determinants. Applying the model to data on the Ghanaian banking system, our results show that inefficiency effects on the productive unit are specific to the class structure of the productive unit, and therefore assuming a common technology for all productive units, as is normally done in the literature, may be misleading. The study therefore provides useful empirical evidence on the importance of accounting for technological differences across productive units. A policy based on the identified classes of the productive unit enables more accurate and effective decision-making.

Introduction
Estimating the efficiency of productive units and its determinants within the stochastic frontier framework is widespread in the applied economic literature. There are many empirical applications of efficiency analysis in agriculture, banking, hospitals, education and municipal services, among others, owing to its importance to managers and policy-makers (see Fried, Lovell, & Schmidt, 1993). Generally, studies on the determinants of efficiency have employed the popular Battese and Coelli (1992, 1995) complex time decay model (hereinafter, BC). The Battese and Coelli (1995) model is preferred over other frontier techniques in that it overcomes the contradiction inherent in the two-stage approach and allows the simultaneous estimation of the parameters of the stochastic production frontier and the inefficiency effects model. The estimation of the stochastic production frontier functions in these Battese and Coelli (1992, 1995) models rests on the assumption that the underlying production technology is common to all productive units. However, productive units in a particular industry may use different technologies. In such a case, estimating a common frontier encompassing every sample observation may not be appropriate, in the sense that the estimates of the underlying technology may be biased. If the unobserved technological differences are not taken into account during estimation, the effects of these omitted differences might be inappropriately labelled as inefficiency (see Greene, 2005; O'Donnell & Griffiths, 2006; Orea & Kumbhakar, 2004). As a result, the estimated inefficiency is not likely to represent the "pure" inefficiency of the productive unit, and the estimated determinants of inefficiency may therefore also be biased.
In order to reduce the likelihood of this type of misspecification within the stochastic frontier framework, a few studies have combined the stochastic frontier approach with the latent class structure in order to estimate a mixture of frontier functions, i.e. the latent class stochastic frontier model, thereby accounting for differences in technology when measuring efficiency (see Barros, de Menezes, & Vieira, 2013; Caudill, 2003; Greene, 2005; O'Donnell & Griffiths, 2006; Orea & Kumbhakar, 2004). Nonetheless, the many articles on the determinants of efficiency using stochastic frontier methods have employed the BC model and therefore do not account for unobserved technological differences (see Apergis & Alevizopoulou, 2011; Isshaq & Bokpin, 2012; Tahir & Haron, 2008, among others). In this study, we combine the latent class stochastic frontier model with the Battese and Coelli (1995) complex time decay model to form a single-stage approach (hereinafter, latent class BC) that accounts for unobserved technological differences to measure efficiency and, more importantly, to examine the determinants of efficiency based on the class structure of the productive unit. In this way, we contribute to the literature by estimating "pure" efficiency and its determinants for productive units based on the class structure, to promote the formulation of cogent policies for efficient decision-making and management of resources.
An application of this proposed model is presented using data on the Ghanaian banking system. The Ghanaian banking industry consists of 27 banks under the supervision of the Central Bank. It comprises 15 foreign-owned and 12 domestic-owned banks. The foreign-owned banks account for about 51% of total industry assets, while the largest state-owned bank accounted for 11% of industry assets, 7% of total industry loans and advances and 11.3% of total industry deposits as of December 2012. Universal banks have operations extending into commerce and corporate lending, international trade financing, treasury financing and loan syndication, among several other services that were hitherto in the domain of specialized players such as development, merchant and commercial banks. This development was driven by the Banking Act (2004), which led to the abolition of the specialized banking regime. Under the Act, banks were required to raise their minimum capital to $US8m. This figure was subsequently increased to $US30m and $US60m in 2012 and 2013, respectively. The past decade also witnessed a drive towards the computerization of banking operations with the introduction of automated teller machines (ATMs). The number of ATMs across the country stood at 618 as of 2011. This has resulted in the introduction of telephone, SMS and internet banking products (Alhassan, 2015).
There are a number of studies on banking sector efficiency in Ghana; however, only a few have applied stochastic frontier methods. Most of these studies have employed non-parametric approaches such as Data Envelopment Analysis (DEA) and have concentrated mostly on the efficiency analysis of banks (see Adjei-Frimpong, Gan, & Hu, 2014; Saka, Aboagye, & Gemegah, 2012), with the exception of Isshaq and Bokpin (2012). As indicated earlier, these studies on Ghana, among others, do not account for unobserved technological differences in the estimation of efficiency or of its determinants.
Given the importance of investment in technology in the Ghanaian banking industry, there are significant technological differences among banks, primarily with respect to electronic banking, mobile banking services and Point of Sale systems. For instance, internet banking is not as popular, and the few banks that offer internet banking services have only now enabled their systems to allow customers to complete transactions such as online funds transfer (PwC Ghana Banking Survey, 2014). Using inefficiency covariates such as inflation, size of bank, bank concentration and banks' return on assets (ROA), which have been used extensively as determinants of bank efficiency (see Alhassan, 2015; Casu, Girardone, & Molyneux, 2004; Darrat, Topuz, & Youzef, 2002; Isshaq & Bokpin, 2012; Saka et al., 2012, among others), we employ both the BC and the latent class BC models to examine the effects of these covariates on bank efficiency. In this way, we are able to show that the inefficiency effects on the productive unit (in this case, banks) are specific to the class structure of the bank.
The rest of the study is organized as follows. Section 2 discusses the methodology. Sections 3 and 4 present the data and discuss the empirical results, respectively. Section 5 concludes.

Methodology
In this section, we present a stochastic frontier model that combines the latent class stochastic frontier model with the Battese and Coelli (1995) complex time decay model. Because the proposed model is applied to banks, the presentation is framed in terms of banks. The performance or technical efficiency of a bank is defined as its ability to transform (multiple) resources into (multiple) financial resources (see Bhattacharyya, Lovell, & Sahay, 1997). We follow the banking literature and use the intermediation approach proposed by Sealey and Lindley (1977) to define inputs and outputs. In this study, we include loans and advances as output, and deposits, staff cost (proxy for labour input) and fixed assets (proxy for capital input) as inputs. In other words, loans and advances of bank i at time t are given by:

$$Y_{it} = f(D_{it}, L_{it}, K_{it}) \qquad (1)$$

where $Y_{it}$ is loans and advances of bank i at time t, $f(\cdot)$ is a suitable functional form, and $D_{it}$, $L_{it}$ and $K_{it}$ are deposits, labour and capital for bank i at time t, respectively. We assume that some banks may lack the ability to employ existing inputs as efficiently as possible and consequently produce less than the optimal output. Therefore, the actual observable output produced by each bank i at time t ($Y_{it}$) is better described by the following stochastic frontier production function:

$$Y_{it} = f(D_{it}, L_{it}, K_{it}, T; \beta)\, TE_{it}\, \exp(v_{it}) \qquad (2)$$

where T is a time trend common to all banks and $\beta$ is a vector of unknown parameters to be estimated. $TE_{it}$ represents technical efficiency and is defined as $\exp(-u_{it})$, where $u_{it} > 0$ is a measure of the shortfall of output from the frontier (technical inefficiency) for each bank in the sample. $v_{it}$ embodies measurement errors, any statistical noise and random variations of the frontier across banks.
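Writing Equation (2) in logarithmic form, we have:

$$\ln Y_{it} = \ln f(D_{it}, L_{it}, K_{it}, T; \beta) + \ln TE_{it} + v_{it} \qquad (3)$$

Replacing $TE_{it}$ with $\exp(-u_{it})$, Equation (3) can be reformulated as:

$$\ln Y_{it} = \ln f(D_{it}, L_{it}, K_{it}, T; \beta) - u_{it} + v_{it} \qquad (4)$$

Here $u_{it} > 0$ is one-sided and is assumed to follow a half-normal distribution, whereas the noise term $v_{it}$ is two-sided and may take any value.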
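The decomposition of log output into a frontier value, a noise term and an inefficiency term can be illustrated with a minimal simulation sketch. It assumes a normal draw for the noise and a half-normal draw for the inefficiency term; the function name `simulate_observation` and all parameter values are hypothetical and are not taken from the study.

```python
import math
import random


def simulate_observation(log_frontier, sigma_v, sigma_u, rng):
    """Draw one log output around a given log frontier value.

    Illustrative assumptions only: v ~ N(0, sigma_v^2) and
    u = |N(0, sigma_u^2)|, i.e. a half-normal inefficiency term.
    """
    v = rng.gauss(0.0, sigma_v)       # two-sided statistical noise
    u = abs(rng.gauss(0.0, sigma_u))  # one-sided shortfall from the frontier
    te = math.exp(-u)                 # technical efficiency, between 0 and 1
    log_y = log_frontier - u + v      # log form of the stochastic frontier
    return {"log_y": log_y, "v": v, "u": u, "te": te}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [simulate_observation(5.0, 0.1, 0.3, rng) for _ in range(1000)]
mean_te = sum(d["te"] for d in draws) / len(draws)
print(f"mean simulated technical efficiency: {mean_te:.3f}")
```

Because the inefficiency term is non-negative, every simulated efficiency score lies in the unit interval, which is what makes the exponential transformation a convenient way to report efficiency.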
We apply the Cobb-Douglas specification to characterize the stochastic production frontier as:

$$y_{it} = \beta_0 + \sum_{n} \beta_n x_{nit} + \beta_t T + v_{it} - u_{it} \qquad (5)$$

where $y_{it}$ represents the logarithm of $Y_{it}$ and $x_{nit}$ denotes the n-th input variable (in logarithms). For convenience, the Cobb-Douglas production frontier function in Equation (5) can be rewritten as:

$$y_{it} = \beta_0 + x_{it}'\beta + v_{it} - u_{it} \qquad (6)$$

where the proxy for technical change T is included in $x_{it}$. Following from the base model in Equation (6), the popular version of the Battese and Coelli model can be specified as:

$$y_{it} = \beta_0 + x_{it}'\beta + v_{it} - u_{it}, \qquad u_{it} = g(z_{it})\,|U_i|, \qquad g(z_{it}) = \exp(\eta' z_{it}) \qquad (7)$$

where $z_{it}$ is a vector of explanatory variables, in addition to time, that may affect inefficiency, $\eta$ is a vector of parameters, t is the period and T is the last period (so that the simple time decay form $\exp[-\eta(t - T)]$ arises when time is the only covariate). With respect to the $z_{it}$ explanatory variables that affect inefficiency, although there is increasing concern about endogeneity in the empirical literature, in practice, as Greene (2011) points out, dealing with the issue in non-linear models such as the stochastic frontier model is rather complicated. Moreover, as Mutter, Greene, Spector, Rosko, and Mukamel (2013) stress, the literature on stochastic frontier analysis does not offer clear guidance on how to tackle the endogeneity problem. In this paper, we address the problem of endogeneity by using lags of our explanatory variables (see Iyer, Rambaldi, & Tang, 2008).
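To make the estimation idea concrete, the sketch below evaluates the log-likelihood of a log-linear (Cobb-Douglas) frontier under the standard normal/half-normal composed-error density. This is a deliberately simpler stand-in and not the Battese and Coelli panel likelihood used in the study; the function names and the numbers in the usage lines are hypothetical.

```python
import math


def std_normal_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def composed_error_logpdf(eps, sigma_v, sigma_u):
    """Log density of eps = v - u with v normal and u half-normal.

    f(eps) = (2 / sigma) * phi(eps / sigma) * Phi(-eps * lam / sigma),
    where sigma^2 = sigma_v^2 + sigma_u^2 and lam = sigma_u / sigma_v.
    """
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    lam = sigma_u / sigma_v
    cdf_term = max(std_normal_cdf(-eps * lam / sigma), 1e-300)  # avoid log(0)
    return math.log(2.0 / sigma) + std_normal_logpdf(eps / sigma) + math.log(cdf_term)


def cobb_douglas_loglik(beta0, betas, sigma_v, sigma_u, log_y, log_x):
    """Sum the composed-error log density over all observations."""
    total = 0.0
    for y, x_row in zip(log_y, log_x):
        frontier = beta0 + sum(b * x for b, x in zip(betas, x_row))
        total += composed_error_logpdf(y - frontier, sigma_v, sigma_u)
    return total


# Hypothetical log output and log inputs for three observations.
log_y = [2.1, 2.6, 3.0]
log_x = [[1.0, 0.5], [1.5, 0.7], [2.0, 0.9]]
print(cobb_douglas_loglik(1.0, [0.8, 0.3], 0.2, 0.4, log_y, log_x))
```

In practice this function would be passed to a numerical optimizer to obtain maximum likelihood estimates; the negative skew of the density is what allows the one-sided inefficiency term to be separated from symmetric noise.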
As indicated, in the Battese and Coelli model and other standard stochastic frontier approaches, the frontier function is the same for every firm; therefore, inefficiency is estimated relative to a single frontier for all observations. In the latent class stochastic frontier model, however, we simultaneously estimate as many frontiers as there are classes. The latent class stochastic frontier model extended to the Battese and Coelli specification is presented as follows:

$$y_{it} = \beta_{0|j} + x_{it}'\beta_j + v_{it|j} - u_{it|j}, \qquad u_{it|j} = \exp(\eta_j' z_{it})\,|U_{i|j}| \qquad (8)$$

where j indicates class j.
Following Orea and Kumbhakar (2004) and Greene (2005), the latent class stochastic frontier model in Equation (8) is estimated using maximum likelihood methods. In a latent class model, the unconditional likelihood for productive unit i (bank i) is obtained as a weighted sum of its j-class likelihood functions, where the weights are the probabilities of class membership. The probabilities reflect the uncertainty that we might have about the true partitioning in the sample. That is,

$$LF_i = \sum_{j=1}^{J} \Pi(i, j)\, LF_{ij} \qquad (9)$$

where $\Pi(i, j)$ is the prior probability attached to membership in class j. The class probabilities are parameterized as a multinomial logit model,

$$\Pi(i, j) = \frac{\exp(\delta_j' z_i)}{\sum_{m=1}^{J} \exp(\delta_m' z_i)} \qquad (10)$$

where $z_i$ is a vector of productive unit (bank) specific variables and the coefficient vector of one class is normalized to zero for identification. The class membership is estimated by j*, the class with the largest posterior probability.
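The weighting of class likelihoods by class membership probabilities, and the assignment of a bank to the class with the largest posterior probability, can be sketched as follows. This is a minimal illustration assuming a multinomial logit for the prior probabilities; the function names, coefficient values and class likelihood values are hypothetical.

```python
import math


def prior_class_probs(z, deltas):
    """Multinomial logit prior probabilities of class membership.

    z is the vector of bank-specific variables; deltas holds one coefficient
    vector per class (one of them is typically fixed at zero).
    """
    scores = [sum(d_k * z_k for d_k, z_k in zip(delta, z)) for delta in deltas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_class_probs(priors, class_likelihoods):
    """Return posterior probabilities and the unconditional likelihood."""
    joint = [p * lf for p, lf in zip(priors, class_likelihoods)]
    unconditional = sum(joint)  # weighted sum of the class likelihoods
    return [j / unconditional for j in joint], unconditional


def assign_class(posteriors):
    """Index of the class with the largest posterior probability."""
    return max(range(len(posteriors)), key=lambda j: posteriors[j])


# Hypothetical two-class example for a single bank.
priors = prior_class_probs([1.0, 0.4], [[0.3, -0.5], [0.0, 0.0]])
posteriors, likelihood = posterior_class_probs(priors, [0.02, 0.05])
print(priors, posteriors, assign_class(posteriors))
```

The prior probabilities depend only on the bank-specific variables, whereas the posterior probabilities also use the bank's observed data through the class likelihoods, which is why classification is based on the posterior.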

Data and variables
The latent class stochastic frontier model extended to the Battese and Coelli framework is applied to a panel of 27 Ghanaian banks over the 2006-2010 period. Following the banking literature, we use the intermediation approach of Sealey and Lindley (1977) to define outputs and inputs. As indicated, the output variable in the stochastic production frontier is loans and advances, which comprises loans and advances to bank customers. The three inputs are deposits, made up of customers' deposits and deposits from other financial institutions; staff cost (proxy for labour input), consisting of expenditure on employees; and fixed assets (proxy for capital input), made up of the book values of fixed assets. The trend variable takes the values t = 1, 2, 3, 4 and 5 for the years 2006, 2007, 2008, 2009 and 2010, respectively. Following the review of the literature on banks (Alhassan, 2015; Casu et al., 2003; Isshaq & Bokpin, 2012; Saka et al., 2012, among others), the explanatory variables in the inefficiency model are inflation, total assets, bank concentration and the ROA of banks. The data-set is sourced from the Banking Supervision Department of the Bank of Ghana. Table 1 presents the summary statistics of the input and output variables.
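As a rough illustration of how such a panel might be prepared, the sketch below adds the trend index and the first lag of one inefficiency covariate within each bank. The record layout, the function name `add_trend_and_lag` and the ROA values are hypothetical and are not drawn from the Bank of Ghana data.

```python
def add_trend_and_lag(rows, first_year, covariate):
    """Return new rows with a trend index and the first lag of one covariate.

    rows is a list of dicts with keys "bank", "year" and the covariate name.
    The lag is taken within each bank; a year with no preceding year for
    that bank gets None.
    """
    out = []
    previous = {}  # bank -> (year, value) of that bank's preceding row
    for row in sorted(rows, key=lambda r: (r["bank"], r["year"])):
        new_row = dict(row)
        new_row["t"] = row["year"] - first_year + 1  # 2006 -> 1, ..., 2010 -> 5
        prev = previous.get(row["bank"])
        # Use the earlier value only when it is exactly one year before.
        has_lag = prev is not None and prev[0] == row["year"] - 1
        new_row[covariate + "_lag1"] = prev[1] if has_lag else None
        previous[row["bank"]] = (row["year"], row[covariate])
        out.append(new_row)
    return out


# Hypothetical records for two banks.
panel = [
    {"bank": "A", "year": 2007, "roa": 0.03},
    {"bank": "A", "year": 2006, "roa": 0.02},
    {"bank": "B", "year": 2006, "roa": 0.01},
]
for record in add_trend_and_lag(panel, 2006, "roa"):
    print(record)
```

Taking the lag within each bank, rather than across the stacked panel, avoids carrying one bank's last observation into another bank's first year.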

Empirical results
We first estimate the BC model before proceeding to estimate a latent class BC specification in order to compare our results. In estimating the latent class model, one has to address the problem of determining the number of classes. In this study, we use the testing down strategy suggested by Greene (2003). We test down from 4 to 3 and 3 to 2 classes. The preferred model is that with two classes.
The estimated prior class probabilities are on average 50%. The highest value is obtained for the second class, with a prior class probability of 54% (see Table 2). The classification resulting from these prior probabilities shows that the largest group (second class) is mainly formed by banks with larger total assets. The average total assets of these banks are much larger than those of the banks in the other class.
A comparison of the inefficiency estimates for the BC model and the latent class BC model using kernel density estimators shows that the inefficiency estimates of the BC specification are far larger than those of the latent class BC specification (see Figures 1 and 2). Both the mean and the variation of the distribution of estimated inefficiencies are larger for the Battese and Coelli specification, indicating that the latent class stochastic frontier extension of the Battese and Coelli specification performs better than the Battese and Coelli specification. Following from the latent class BC specification, the average efficiency of the Ghanaian banking sector as a whole is 0.56. There are, however, substantial differences in efficiency levels among classes. While the average efficiency in the first class is 0.479, it increases to 0.646 in the second class (see Table 3). A further disaggregation of the efficiency of banks across the classes is also presented in Table 4. Though the sample is made up of commercial banks (that is, banks that are largely engaged in retail banking by providing savings and loans services to individuals and commercial firms), the classification of these banks into classes can be connected to factors such as branch network capacity. For instance, we observed that banks in class 1 have an average of 36 branches, while banks in class 2 have an average of 23 branches. In addition, banks in class 2 have a relative concentration of branches in urban areas and leverage technology-driven banking services such as internet banking and SMS banking, among others, compared to their competitors.
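A kernel density comparison of this kind can be sketched with a small Gaussian kernel estimator. The bandwidth below uses the normal-reference rule of thumb, which is an assumption since the bandwidth choice is not reported here, and the two samples of inefficiency estimates are hypothetical values chosen only to show the mechanics.

```python
import math


def gaussian_kde(sample, bandwidth=None):
    """Return a function x -> estimated density using Gaussian kernels."""
    n = len(sample)
    if bandwidth is None:
        mean = sum(sample) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)  # normal-reference rule of thumb
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each sample point.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density


# Hypothetical inefficiency estimates from two competing specifications.
ineff_common = [0.55, 0.70, 0.82, 0.91, 1.10, 1.25]
ineff_latent = [0.30, 0.38, 0.45, 0.52, 0.60, 0.66]
kde_common = gaussian_kde(ineff_common)
kde_latent = gaussian_kde(ineff_latent)
for x in (0.25, 0.50, 0.75, 1.00):
    print(f"x={x:.2f}  common={kde_common(x):.3f}  latent={kde_latent(x):.3f}")
```

Plotting the two estimated densities on a common axis makes differences in both location and spread visible at a glance, which is the comparison the figures are intended to convey.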
With regard to the inefficiency effects, or the determinants of inefficiency, we examine the effects of these covariates on inefficiency in the BC model and in the latent class BC model, where efficiency effects are estimated separately for each class. Our results show that the effect of these covariates on inefficiency is not the same across the BC model and the classes of the latent class BC model (see Table 5). Inflation significantly reduces inefficiency in the BC model and in the first class of the latent class BC model, but the opposite holds in the second class. Increases in total assets also reduce inefficiency in the BC model, but the effect is not significant in the first class of the latent class BC model. Although ROA significantly decreases inefficiency in the BC model, its effect is not significant in either class of the latent class BC model. In effect, the determinants of inefficiency of banks in Ghana are specific to the class structure of the bank when we account for technological differences. This implies that the covariates of efficiency of a productive unit may be specific to the class structure, and therefore different policy measures need to be formulated for different productive units based on the class structure in order to ensure efficiency.

Conclusions
In explaining the inefficiency of productive units in the stochastic frontier framework, most studies have employed the popular Battese and Coelli (1995) complex time decay model. However, the estimation of the stochastic production frontier functions of the Battese and Coelli models rests on the assumption that the underlying production technology is common to all productive units. Nonetheless, productive units in a particular industry may differ in their technologies. In this study, we combined the latent class stochastic frontier model with the Battese and Coelli (1995) complex time decay model to form a single-stage approach that accounts for unobserved technological differences to measure efficiency and examine the determinants of efficiency based on the class structure of the productive unit. In this way, we contribute to the literature by showing that the determinants of efficiency of a productive unit are specific to the class structure if there are unobserved technological differences. However, it is important to point out that the literature on stochastic frontier analysis does not offer clear guidance on how to tackle endogeneity. This may be a limitation of the study. Using data on the Ghanaian banking system, our results show that the effects of the covariates of inefficiency on banks in Ghana are specific to the class structure of the bank when we account for technological differences.

[Notes to Table 5: The first lag of the inefficiency variables is used to minimize endogeneity problems. A negative sign on the coefficient of a $z_{it}$ variable represents a reduction in inefficiency. *, ** and *** denote statistical significance at the 10%, 5% and 1% levels, respectively.]
The findings indicate that inefficiency effects on the productive unit are specific to the class structure of the productive unit and therefore assuming a common technology for all productive units as in the Battese and Coelli model and in many empirical studies (if there are technology differences) may be misleading. Given the importance of accurate policies to promote efficiency and increased output, identifying the class structure of the productive unit would enable policy-makers to put in place appropriate measures to reduce inefficiency and boost productivity within the banking industry thereby enhancing financial development and promoting economic growth.