Out-of-sample predictability of the cost efficiency measures of US Banks: Evidence of capital requirements

In the evaluation of capital adequacy requirements on the economic efficiency measures of banks, the literature has presented a two-stage Data Envelopment Analysis approach. In the first stage, the efficiency measures are estimated. In the second stage, the relationship between capital adequacy requirements and the efficiency measures, treated as the dependent variable, is evaluated using the classical ordinary least squares (OLS) model. However, an assessment of the out-of-sample predictability of capital adequacy requirements has been lacking. Therefore, this paper contributes to the sparse literature on capital adequacy requirements by applying Support Vector Regressions (SVRs) alongside the routinely used OLS linear model benchmark. Using a total of 10,380 December quarterly observations of United States commercial and domestic banks from 2008 to 2019, we estimate the parameters of SVRs with Linear, Polynomial, and Radial basis function kernels, with tuning guided by the h-block cross-validation technique. The results reveal that the SVRs provide better benchmarking insights in the evaluation of banks' capital adequacy requirements than the benchmark OLS model.


Introduction
In the evaluation of capital adequacy requirements on the economic efficiency measures of financial banks, recent empirical studies, using the linear programming Data Envelopment Analysis (DEA) of Charnes et al. (1978) and Banker et al. (1984), have suggested that capital adequacy requirements are one of the most direct contributors to financial stability; see, for example, Pasiouras et al. (2009); Gorton and Metrick (2010); Chortareas et al. (2012); Barth et al. (2013); Dong et al. (2014); Pessarossi and Weill (2015); Sakouvogui and Shaik (2020); and Sakouvogui (2020a). 1 These contributions are motivated in part by the primary effect of capital adequacy requirements and their influence on banks' efficiency, failures, future problem loans, and risk-taking (Pessarossi and Weill, 2015).
Over recent years, the financial banking sector has undergone enormous changes and, consequently, the role of capital adequacy requirements has become more complex due to economic liberalization. Reflecting this phenomenon, various papers have concluded that financial banks operate in a heavily regulated sector (Barth et al., 2004; Pasiouras et al., 2009; and Barth et al., 2013), with two diverging views of capital adequacy requirements (Barth et al., 2004; Barth et al., 2013; Pessarossi and Weill, 2015; and Sakouvogui, 2020b). One strand of the literature has found a positive effect of capital adequacy requirements on banks' efficiency measures (Holmstrom and Tirole, 1997; Mehran and Thakor, 2011). In contrast, another strand has suggested a negative effect of capital adequacy requirements on banks' performance (Berger and Bonaccorsi, 2006; Pasiouras et al., 2009; Pessarossi and Weill, 2015; and Sakouvogui and Shaik, 2020). With these divergent effects, determining the impact of capital adequacy requirements has become a paramount and popular question in the literature on financial banks (Pessarossi and Weill, 2015; Sakouvogui and Shaik, 2020). Further investigation is therefore needed of the tools used to understand the mechanism through which capital adequacy requirements affect the economic efficiency measures of financial banks.
The literature has presented a two-stage DEA approach in the evaluation of capital adequacy requirements on the economic efficiency measures of financial banks, in which the specification of a cost function is required (Pessarossi and Weill, 2015; Sakouvogui and Shaik, 2020). In the first stage, the cost efficiency measures are estimated based on the selected input and output variables pertinent to the intermediation approach (cost function). In the second stage, the relationship between the exogenous variables and the cost efficiency measures, treated as the dependent variable, is evaluated using either the Tobit or the ordinary least squares (OLS) regression model (Barth et al., 2004; Hoff, 2007; Hwang and Kao, 2008; Pasiouras et al., 2009; Barth et al., 2013; and Sakouvogui and Shaik, 2020).
While the empirical technique of the second stage has traditionally been maximum likelihood estimation of a Tobit model, Hwang and Kao (2008) have concluded that linear models estimated via OLS are preferable. Therefore, according to Hoff (2007) and Hwang and Kao (2008), the OLS model can replace the Tobit regression model as a sufficient second-stage DEA model. 2 However, in applying the OLS regression model, the current empirical studies have so far failed to focus on the predictability of capital adequacy requirements in out-of-sample analysis.
An apparent limitation of the second stage in the two-stage DEA approach, namely the failure to assess the predictive performance of the OLS regression model, is present in the existing literature; see, for example, Barth et al. (2004); Pasiouras et al. (2009); Barth et al. (2013); and Sakouvogui and Shaik (2020). This critical step has been omitted in empirical studies, which often focus exclusively on hypothesis testing on the estimation sample, i.e., the evaluation of capital adequacy requirements on the economic efficiency measures of financial banks (Sakouvogui and Shaik, 2020). However, the out-of-sample predictability of capital adequacy requirements on the economic efficiency measures of financial banks depends greatly on the variables used.
With our interest lying in out-of-sample predictability, it is thus important to guarantee that any significant forecast performance of capital adequacy requirements on the economic efficiency measures is not due to omitted variable bias. Fortunately, recent papers by Barth et al. (2004); Pasiouras et al. (2009); Barth et al. (2013); and Sakouvogui and Shaik (2020) have concluded that, in addition to capital adequacy requirements, the Basel III Accord, bank size, the Dodd-Frank Act, and annual state gross domestic product are determining factors correlated with the cost efficiency measures, with the additional advantage of usually exhibiting considerable association with the banking cycle.
Henceforth, this paper contributes to the literature by addressing the out-of-sample predictability of capital adequacy requirements on the economic efficiency measures in four aspects. First, the economic efficiency measures of financial banks are estimated while accounting for the temporal (by year) variation. The temporal variation accounts for the heterogeneity effects present in the financial banking sector and thus could help banks and policy-makers make sound financial decisions.
Second, this paper adds to the sparse literature on capital adequacy requirements of financial banks by employing tools from machine learning in addition to the benchmark OLS regression model. Specifically, this paper uses Support Vector Regressions (SVRs) with Linear, Polynomial, and Radial Basis Function (RBF) kernels to investigate the predictability of capital adequacy requirements. According to Anderson and Audzeyeva (2019), "SVRs are data-driven, non-parametric models that (a) unlike linear models, do not require strong a priori assumptions about the relationship between the cost efficiency measures and exogenous variables, and (b) are, by design, better able to allay the issue of over-fitting inherent in the standard multivariate linear regression techniques." Third, this paper extends the comparison of the performance of the OLS and SVR models by: a) applying the Diebold-Mariano test for forecast evaluation of capital adequacy requirements; b) estimating the mean absolute deviation (MAD) and root mean square error (RMSE); and c) determining the best model using the Model Confidence Set (MCS) methodology of Hansen et al. (2011). 3 Our last contribution is empirical in nature. One prominent feature of this paper is the emphasis on a model's out-of-sample prediction performance using a sample of 10,380 December quarterly observations of U.S. commercial and domestic banks that spans from 2008 through 2019. In using this long sample period, this paper minimizes the correlation between the training and testing sets while allowing for more robust estimates. To the best of our knowledge, this is the first paper that applies the SVR methodology in the context of capital adequacy requirements.
The remainder of this paper is structured as follows: Section 2 introduces the DEA and SVRs models. Section 3 presents the empirical data set including input and output variables. Section 4 discusses results and implications. Section 5 summarizes the paper and provides additional discussion.
Theoretical framework

Data Envelopment Analysis
Dual cost theory assumes that the relationship between the multiple output quantities produced, y = (y_1, y_2, ..., y_J), and the inputs used to produce them is reflected by the concept of the cost function. The cost function of the i-th bank at time t can be expressed as:

TC_it = C(y_it, w_it),    (1)

where TC_it is the total cost of the i-th bank at time t, y_it is the vector of output quantities of the i-th bank at time t, and w_it is the vector of input prices of the i-th bank at time t associated with the input quantities, x_it. Equation (1) can be estimated using the linear programming Data Envelopment Analysis (DEA) while accounting for the temporal variation, t. In this paper, an input-oriented DEA model is used because financial banks have better control over their inputs than their outputs. Therefore, following Färe et al. (1985), the cost-minimization DEA model for the evaluated bank can be defined as:

minimize_{x*, λ}  Σ_{o=1}^{O} w_o x*_o
subject to  Σ_{i=1}^{n} λ_i y_ji ≥ y_j,  j = 1, ..., J,
            Σ_{i=1}^{n} λ_i x_oi ≤ x*_o,  o = 1, ..., O,
            λ_i ≥ 0,  i = 1, ..., n,    (2)

where i = 1, ..., n indexes the banks, o = 1, ..., O indexes the inputs used by the banks, and j = 1, ..., J indexes the outputs. x* = (x*_1, ..., x*_O) is the cost-minimizing vector of input quantities for the evaluated bank, given the vector of input prices, w_o, and output quantities, y_j. The cost efficiency measure is then the ratio of the minimum cost to the observed cost.
DEA efficiency measures differ depending on the scale assumptions that underpin the model (equation 2). In the literature, two scale assumptions are generally employed: constant returns-to-scale (output changes in the same proportion as inputs) and variable returns-to-scale (the production technology may exhibit increasing, constant, or decreasing returns-to-scale). Thus, if the variable returns-to-scale convexity condition on the weights, λ_i, namely Σ_{i=1}^{n} λ_i = 1, is included in the constraints of equation 2, the model is estimated under variable returns-to-scale; without the convexity constraint, equation 2 represents constant returns-to-scale. In the estimation of the cost efficiency measures (equation 2), and to avoid scale bias, the scale efficiency measures, computed as the ratio of the DEA efficiency measures estimated under constant returns-to-scale to the pure technical efficiency measures estimated under variable returns-to-scale, are also computed.
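The cost-minimization program in equation 2 can be solved bank-by-bank as a small linear program. The sketch below, in Python with `scipy` (an illustrative assumption; the paper's own implementation is in R), computes the cost efficiency of one bank under constant returns-to-scale or, with the convexity constraint, variable returns-to-scale. The toy arrays in the usage example are hypothetical stand-ins, not the call-report data.

```python
import numpy as np
from scipy.optimize import linprog

def cost_efficiency(X, Y, W, k, vrs=False):
    """Input-oriented cost-minimization DEA for bank k.

    X: (n, m) observed input quantities, Y: (n, s) output quantities,
    W: (n, m) input prices. Returns minimum cost / observed cost in (0, 1].
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector z = [x* (m entries), lambda (n entries)]; minimize w_k' x*.
    c = np.concatenate([W[k], np.zeros(n)])
    # Input constraints: X' lambda - x* <= 0.
    A_ub = [np.hstack([-np.eye(m), X.T])]
    b_ub = [np.zeros(m)]
    # Output constraints: Y' lambda >= y_k, i.e. -Y' lambda <= -y_k.
    A_ub.append(np.hstack([np.zeros((s, m)), -Y.T]))
    b_ub.append(-Y[k])
    A_eq, b_eq = None, None
    if vrs:  # convexity constraint sum(lambda) = 1 for variable returns-to-scale
        A_eq = np.hstack([np.zeros((1, m)), np.ones((1, n))])
        b_eq = np.array([1.0])
    res = linprog(c, A_ub=np.vstack(A_ub), b_ub=np.concatenate(b_ub),
                  A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (m + n))
    return res.fun / (W[k] @ X[k])  # minimum cost over observed cost
```

For instance, with two banks producing one unit of output at unit input price, where bank 0 uses one unit of input and bank 1 uses two, the function returns a cost efficiency of 1.0 for bank 0 and 0.5 for bank 1 under CRS.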
After the estimation of the DEA cost efficiency measures (equation 2), this paper evaluates the out-of-sample predictability of capital adequacy requirements. To derive the magnitude of the predictability of the different factors influencing the DEA efficiency measures, the SVR model of Cortes and Vapnik (1995) with Linear, Polynomial, and RBF kernels is used and compared to the benchmark OLS model widely used in the literature. We review the primal formulation of SVRs, its dual form for quadratic programming optimization, and the kernel functions in Section 2.2.

Support Vector Regression
For a training set, Z = {(x_1, θ_1), ..., (x_n, θ_n)}, with predictor variables x_i ∈ R^p and positive cost efficiency measures θ_i ∈ R_+, the objective of SVRs is to find a function that predicts the cost efficiency measure (target variable), θ. To fulfill this objective, SVRs consider a linear estimation function for q-step-ahead forecasts at time t, represented by:

f(x) = w^T Φ(x) + b_t,    (3)

where Φ : R^p → F is a kernel-induced mapping that takes the input data vector, x, into a higher-dimensional feature space, F, in which the target variable, θ, can be better predicted, and the weight vector, w, and constant, b_t, are the parameters learned from the training data.
SVRs implement structural risk minimization with the purpose of constructing models with reliable out-of-sample performance (Vapnik, 1995). Instead of empirical risk minimization, which in the OLS model minimizes the error on the observed data via the sum of squared errors, Σ_{i=1}^{n} (θ_i − f(x_i))^2, SVRs seek to minimize an upper bound on the generalization error. This is done by mapping the input, x, through the kernel function, Φ, into a higher-dimensional feature space and finding the optimal weight, w, under the ε-insensitive loss function. Furthermore, for predicted values outside the ε-tube, Cortes and Vapnik (1995) introduced the slack variables, ξ and ξ*, which allow for model complexity; the vector of weights minimizes the regularized empirical risk function:

minimize  (1/2) w^T w + C Σ_{i=1}^{n} (ξ_i + ξ*_i)
subject to  θ_i − w^T Φ(x_i) − b ≤ ε + ξ_i,
            w^T Φ(x_i) + b − θ_i ≤ ε + ξ*_i,
            ξ_i, ξ*_i ≥ 0,    (4)

where (1/2) w^T w accounts for the model complexity and C is the cost or regularization parameter that measures the trade-off between the model complexity, (1/2) w^T w, and the bandwidth, ε. The two positive slack variables, ξ and ξ*, represent the distance from the actual values to the corresponding boundary values of the ε-tube.
The predictions of the parameterized function can violate the constraints, but at a cost proportional to C. With this so-called "double-hinged" loss function, the loss is zero when |θ_i − w^T Φ(x_i)| < ε and increases linearly at the rate C for points whose predicted values fall outside the ε-insensitive region. Since the primary objective of SVRs is to minimize the training error 4 while providing robustness against parameter-driven model over-fitting through the judicious choice of the regularization and bandwidth parameters (Anderson and Audzeyeva, 2019), the optimization problem can be restated, using Lagrange multipliers, as the quadratic dual:

maximize  −(1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} (α_i − α*_i)(α_j − α*_j) Φ(x_i)^T Φ(x_j) − ε Σ_{i=1}^{n} (α_i + α*_i) + Σ_{i=1}^{n} θ_i (α_i − α*_i)
subject to  Σ_{i=1}^{n} (α_i − α*_i) = 0  and  0 ≤ α_i, α*_i ≤ C,  i = 1, ..., n,    (5)

where α_i and α*_i are the Lagrange multipliers. The coefficients α_i and α*_i have an intuitive interpretation as forces pushing and pulling the regression estimate towards the measurements, θ_i. The bandwidth parameter, ε, determines the ε-insensitive region for characterizing the empirical risk, and C is the regularization parameter.
As for nonlinear cases, the SVR algorithm can be made nonlinear by simply mapping the original problem into a high-dimensional space, in which the dot product manipulation can be substituted by a kernel function, Φ. Table 1 summarizes the kernel functions, Linear, Polynomial, and RBF, used in this paper with their associated parameters. 5 In Section 2.3, we show how the cross-validation technique is accommodated.
Each kernel function requires a choice of the cost parameter, C, and the bandwidth parameter, ε. For the RBF kernel, φ controls the radius of influence of individual observations; a larger φ reduces the radius of influence. For the polynomial kernel, d determines the degree of the polynomial response, a is the location of the point where the kernel function value is zero, and γ moderates the sensitivity to the nonlinear interaction terms.
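To make the three kernels concrete, the sketch below fits SVR-Linear, SVR-Polynomial, and SVR-RBF on simulated data with scikit-learn (an assumed toolchain; the paper's implementation is in R). In scikit-learn's parameterization, `C`, `epsilon`, `degree`, `gamma`, and `coef0` play the roles of the cost, bandwidth, degree, interaction-sensitivity, and zero-location parameters discussed above; all data here are hypothetical stand-ins.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # stand-ins for the exogenous variables
signal = 1.0 / (1.0 + np.exp(-X[:, 0] + 0.5 * X[:, 1]))
y = signal + rng.normal(0.0, 0.05, size=200)       # efficiency-like target near (0, 1)

models = {
    "SVR-Linear":     SVR(kernel="linear", C=1.0, epsilon=0.01),
    "SVR-Polynomial": SVR(kernel="poly", degree=2, coef0=1.0, C=1.0, epsilon=0.01),
    "SVR-RBF":        SVR(kernel="rbf", gamma="scale", C=1.0, epsilon=0.01),
}
# Standardize inputs, fit on the first 150 rows, evaluate RMSE on the rest.
fits = {name: make_pipeline(StandardScaler(), m).fit(X[:150], y[:150])
        for name, m in models.items()}
rmse = {name: float(np.sqrt(np.mean((f.predict(X[150:]) - y[150:]) ** 2)))
        for name, f in fits.items()}
```

The pipeline standardizes the predictors before fitting, which matters for the RBF and polynomial kernels since their response depends on the scale of x.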

Cross-Validation Technique
This section outlines the procedure used for the selection of tuning parameter candidates. The models' accuracy depends on the hyper-parameters: the cost parameter, C, and the bandwidth parameter, ε, along with any kernel-specific parameters. But how do we know that we have used the best possible values for the SVR-Linear, SVR-Polynomial, and SVR-RBF models to deal with over-fitting? A solution to the problem of identifying optimal values for the hyper-parameters is cross-validation.
Cross-validation is a way to test several combinations of hyper-parameters in order to identify the optimal values. This method works on the training set. However, if we tune the model on the entire training data, then over-fitting is inevitable. Thus, to guard against over-fitting, a hold-out set must be extracted from the training set. Furthermore, in order to conduct a useful and statistically valid assessment with cross-validation, care must be taken because of the correlation between a variable and its lags in the training and testing sets. Fortunately, according to Sakouvogui and Nganje (2019) and Anderson and Audzeyeva (2019), there are several straightforward techniques for conducting cross-validation under autocorrelation; see, for example, Burman et al. (1994); Hart and Yi (1998); Racine (2000); Hart and Lee (2005); and Carmack et al. (2009). Henceforth, the h-block cross-validation of Burman et al. (1994), a generalization of the well-known leave-one-out cross-validation approach, is conducted by withholding blocks of data when estimating the parameters associated with the SVRs. Burman et al. (1994) have shown that the technique can produce statistically consistent results for model and variable selection in the presence of autocorrelation.
Following Burman et al. (1994), assume that Z is a set of jointly dependent stationary observations and z_i the i-th row (observation) of Z; then the covariance between z_i and z_{i+j} depends only on j and approaches 0 as j → ∞ (Figure 1). 6 To apply the h-block cross-validation, for each observation, i, we remove the h observations on either side of the observation z_i, leading to a remaining local training set of size n_c = n − 2h − 1. 7 Hence, the algorithm is trained on the local training set of size n_c and evaluated on the held-out observation; the h-block cross-validation function averages the resulting prediction errors over all observations, i = 1, ..., n.

Figure 1: H-block cross-validation
The training data encompass the estimation and hold-out sets. The optimal parameters of the SVRs are found by applying the cross-validation technique to the training set.
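A minimal sketch of the h-block splitter, under the assumption of a single, evenly spaced series: for each test index i, the h observations on either side are withheld, so interior points are trained on n − 2h − 1 observations, matching the n_c above.

```python
import numpy as np

def h_block_splits(n, h):
    """Yield (train, test) index arrays for h-block cross-validation
    (Burman et al., 1994): leave-one-out with an h-observation buffer
    removed on each side of the test point to break serial correlation."""
    for i in range(n):
        blocked = set(range(max(0, i - h), min(n, i + h + 1)))
        train = np.array([j for j in range(n) if j not in blocked])
        yield train, np.array([i])
```

For n = 10 and h = 2, the split for i = 5 trains on indices {0, 1, 2, 8, 9}, i.e. 10 − 2·2 − 1 = 5 observations.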

Data and construction of the variables
In this paper, following Sakouvogui and Shaik (2020), a total of 10,380 December quarterly observations of United States commercial and domestic banks was selected from 2008 to 2019. The data are from the Federal Financial Institutions Examination Council, based on the Council Form 041 Report of Condition and Income of U.S. commercial and domestic banks that report to the Federal Reserve Board. In the selection of output quantities and input prices pertinent to the economic efficiency measures in the DEA model (equation 2), we follow the intermediation approach of economic theory presented in Pessarossi and Weill (2015) and Sakouvogui and Shaik (2020). 8 The intermediation approach suggests that banks collect deposits to transform them into loans and capital; two alternative approaches for the selection of input and output variables are the production approach of Sherman and Gold (1985), based on production theory, and the profitability approach of Drake et al. (2006), based on profit theory. A full description of the data and of the methodology underlying the construction of the aggregate data appears in Sakouvogui and Shaik (2020). Two output quantities are selected, total loans (y_1) and other earning assets
(y_2), with three input prices: the price of labor, w_1, calculated as the ratio of personnel expenses to total assets; the price of physical capital, w_2, calculated as the ratio of other operating expenses to premises and fixed assets; and the price of borrowed funds, w_3, calculated as the ratio of interest expenses to total deposits. 9 The dependent variable, total cost (TC), is calculated as the sum of interest expenses, personnel expenses, and other operating expenses. We additionally impose homogeneity conditions by normalizing TC, w_1, and w_2 by w_3.
Furthermore, in the predictive analysis of capital adequacy requirements, we employ exogenous variables, motivated by economic theory, that have been previously analyzed. Accordingly, the potential exogenous factors affecting the cost efficiency measures include capital adequacy requirements (Sol), bank size (size), state GDP (GDP), the Basel III Accord (Basel3), and the Dodd-Frank Act (Dodd). Equation (3), rewritten into equation (8) to account for the construction of the SVR predictive models for 3-step-ahead forecasts of the i-th bank at time, t, is:

θ_{i,t+3} = f(Sol_it, size_it, GDP_it, Basel3_t, Dodd_t),    (8)

where θ_it is the cost efficiency measure estimated using equation (2). 10 The variables are described in Table 3. As presented in Sakouvogui and Shaik (2020), the definitions of the variables are as follows. Total loans is the sum of loans secured by real estate; loans to finance agricultural production and other loans to farmers; loans to finance commercial real estate, construction, and land development activities; loans to individuals for household, family, and other personal expenditures, including credit cards; and other construction loans. Other earning assets consist of balances due from the bank, inter-bank loans, investments, and securities. Price of labor is the price associated with the sum of all wages paid to employees, as well as the price of employee benefits; personnel expenses include salaries and employee benefits. Total assets is the sum of total loans and leases, total securities (HTM), total securities (AFS), trading assets, total intangible assets, other real estate owned, and all other assets, minus the allowance for loan and lease losses. Price of physical capital is the price of maintaining buildings; other operating expenses is the sum of goodwill impairment losses, amortization expenses, and impairment losses for other intangible assets. Fixed assets are assets purchased for long-term use and unlikely to be quickly converted into cash.
Price of borrowed funds is the price associated with borrowing money; total interest expense is the sum of the interest expenses. Total deposits is the sum of all domestic deposits, including demand, savings, and fixed deposits, minus non-interest-bearing and interest-bearing deposits. Total equity is the total holding company or bank equity capital, including paid-up capital, share premiums, and reserves.
10 Capital adequacy requirement, defined as log(Total equity / Total assets), measures the capacity of banks to face difficulties during a downturn. Bank size is the natural logarithm of the total amount of assets owned by a bank. Basel Accord III (Basel3), the banking regulation agreements proposed in 2010 and implemented from 1 January 2013 through 1 January 2019, is a dummy variable equal to 1 from the 2013 implementation onward and 0 otherwise (https://www.federalreserve.gov/supervisionreg/basel/USImplementation.htm#BaselIIITools). State GDP, defined as the total market value of all finished goods and services produced within a state, enters as the natural logarithm of GDP (https://www.bea.gov/). Variations across commercial and domestic banks are, nevertheless, apparent; such variations primarily reflect differences in specific banks.

Empirical Estimation Framework
The empirical application of our paper is straightforward; it entails a four-step process and is completed as follows: 1. The first step involves the estimation of the DEA cost efficiency measures of the 10,380 December quarterly observations of United States commercial and domestic banks from 2008 to 2019 while accounting for the temporal variation.
2. The second step is concerned with the specification of the predictive SVRs with their kernel functions and tuning parameters. We therefore partition the 10,380 December quarterly observations of United States commercial and domestic banks into training and testing sets. For the out-of-sample predictability analysis, we follow the framework of Anderson and Audzeyeva (2019) and thus allocate to the training set the quarterly data from December 2008 through December 2015 (8,063 consecutive observations) available in the data sample period. The forecast accuracy of the SVR-Linear, SVR-Polynomial, SVR-RBF, and OLS models was determined using the remaining quarterly data, the testing set, from December 2016 through December 2019 (2,317 consecutive observations).
3. Using the guidelines of Burman et al. (1994), which support h = 76 for our training dataset of 7,912 observations, the tuning parameters of the SVR-Linear, SVR-Polynomial, and SVR-RBF models were estimated. The SVR models were trained using the quarterly observations from December 2008 through December 2013 (the estimation set) and, to find the optimal parameters, we used the remainder of the training data from December 2014 through December 2015 (the hold-out set). In doing so, our paper provides estimates of the optimal tuning parameters associated with the SVR-Linear, SVR-Polynomial, and SVR-RBF models using the grid search method.
4. Lastly, the out-of-sample forecasting accuracy is assessed, permitting the selection of a subset of best-performing models. The forecast accuracy of the SVR-Linear, SVR-Polynomial, SVR-RBF, and OLS models was determined using the testing set composed of the quarterly observations from December 2016 through December 2019.
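The four-step partition above can be sketched with simple year masks. The 865 banks per December cross-section below is a hypothetical round number chosen only so the panel has 10,380 rows like the paper's sample; the paper's actual per-year counts differ.

```python
import numpy as np

# Hypothetical panel: one December observation per bank-year, 2008-2019.
years = np.repeat(np.arange(2008, 2020), 865)       # 12 years x 865 = 10,380 rows

train_mask = (years >= 2008) & (years <= 2015)      # step 2: full training set
est_mask   = (years >= 2008) & (years <= 2013)      # step 3: estimation set (fit SVRs)
hold_mask  = (years >= 2014) & (years <= 2015)      # step 3: hold-out set (tune C, epsilon)
test_mask  = (years >= 2016) & (years <= 2019)      # step 4: untouched testing set
```

Keeping the testing years strictly after the training years preserves the temporal ordering that the h-block scheme and the out-of-sample comparison rely on.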

Empirical Application and Results
In the evaluation of the forecasting accuracy of the economic efficiency measures of commercial and domestic banks, two steps were completed. First, the DEA cost efficiency measures of commercial and domestic banks from 2008 through 2019 were estimated. Second, based on the estimated cost efficiency measures, SVRs with three kernels, Linear, Polynomial, and RBF, were estimated and compared to the forecasting accuracy of the OLS model. The implementations of the SVR and OLS models and the related functions used in this paper are scripted in the R statistical language.

Cost Efficiency Measures
In this paper, using the input-oriented DEA model based on the theoretical cost function in equation (2), the CRS, VRS, and scale efficiency measures are estimated while accounting for the yearly variability and, thus, for technological changes. Table 4 presents the annual summary statistics of the CRS, VRS, and scale efficiency measures of the December quarterly data from 2008 through 2019. In Table 4, four interesting results emerge. First, the minimum and maximum efficiency measures of the DEA models under CRS, VRS, and scale are high. This is expected due to the lack of random noise in the DEA model and suggests that banks are generally efficient. Second, the mean efficiency measures fluctuate slightly throughout the years. The yearly average of the efficiency measures ranges from 0.594 to 1.000 for CRS, from 0.657 to 1.000 for VRS, and from 0.631 to 1.000 for the scale efficiency measures. Third, we additionally note in column 3 of Table 4 that, for large standard deviations, there exists a large variation between the minimum efficiency measure in that year and the average efficiency measure. Fourth, a comparison of the ratios suggests that the VRS technology overestimates efficiency on average, as indicated by the scale efficiency measures.

Stationarity of cost efficiency measures
The analysis of the stationarity of a time series is essential in the out-of-sample analysis of the SVR and OLS models. This is primarily important because a non-stationary series can lead to spurious regression results with biased asymptotic properties. Among unit root tests for time series, the Augmented Dickey-Fuller (ADF) unit root test of Dickey and Fuller (1979), a widely used test in the literature, is applied. Thus, given the time series observations θ_1, ..., θ_n, the presence of a unit root in the data-generating mechanism is assessed by fitting the regression equation:

Δθ_t = (ρ − 1) θ_{t−1} + Σ_{p=1}^{5} δ_p Δθ_{t−p} + e_t,

where Δθ_t = θ_t − θ_{t−1}, p = 1, ..., 5 indexes the lagged-difference predictors, ρ is estimated by OLS, and e_t is independent and identically distributed with mean 0 and variance σ². Under the null hypothesis, H_0: ρ = 1, and with the estimator of ρ denoted ρ̂_n, Dickey and Fuller (1979) proposed the studentized statistic t_n = (ρ̂_n − 1) / Ŝtd(ρ̂_n), where Ŝtd(ρ̂_n) is the standard deviation of the estimator ρ̂_n. In applying the Dickey-Fuller test, there is significant statistical evidence to suggest that the cost efficiency measures are stationary at the 5% significance level. Additionally, Table 5 presents the Spearman rank correlations between the dependent and independent variables. From Table 5, significant and negative Spearman rank correlations are found between the cost efficiency measures and the exogenous variables.

Optimization of the tuning parameters
While the optimization of the SVR-Linear, SVR-Polynomial, and SVR-RBF models is theoretically sound and relatively straightforward to implement, the literature has emphasized that the performance of SVRs is highly sensitive to the selected kernel and tuning parameters (Song and Zen, 2008; Sakouvogui, 2015; and Anderson and Audzeyeva, 2019). Therefore, researchers tend to resort to a grid-search optimization technique, in conjunction with a metric characterizing the goodness of fit, typically based on the MAD and RMSE; see, for example, Min and Lee (2005) and Gunduz and Uhrig-Homburg. However, in the presence of serially correlated data, the standard SVR forecasting methodology may be unsuitable because standard cross-validation schemes interpret the serial correlation as a high-frequency relationship with small variance, leading to spurious parameter choices (Brabanter et al., 2011). Therefore, adopting the h-block cross-validation to deal with the serial correlation, we employ only the training set at this stage. To guard against over-fitting, we extract an estimation set and a hold-out set from the training data. The models are trained using the estimation set, composed of 5,280 consecutive observations, and the hold-out set, composed of the remaining 2,632 consecutive observations, is used to test the models. The hyper-parameter combination that gives the best model statistics, RMSE and MAD, on the hold-out set is considered optimal. Table 6 reports the optimal parameters for each SVR model, together with the associated MAD and RMSE performance criteria. The results of Table 6 suggest that the optimal values of the kernel parameters allow the SVRs to avoid over-fitting and under-fitting.
An appropriate value of the kernel parameters produces the minimum RMSE and MAD on the hold-out set and thus provides the best parameters for the subsequent forecasts.
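The grid search over (C, ε) with hold-out MAD/RMSE scoring can be sketched as follows; the grid values, the RBF-only kernel, and the simulated data are illustrative assumptions, not the paper's actual search space or panel.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 4))
y = 1.0 / (1.0 + np.exp(-X @ np.array([1.0, -0.5, 0.3, 0.0]))) \
    + rng.normal(0.0, 0.02, size=400)

X_est, y_est = X[:300], y[:300]        # estimation set: fit candidate models
X_hold, y_hold = X[300:], y[300:]      # hold-out set: score candidates

def mad_rmse(err):
    """Mean absolute deviation and root mean square error of a residual vector."""
    return float(np.mean(np.abs(err))), float(np.sqrt(np.mean(err ** 2)))

grid = [(C, eps) for C in (0.1, 1.0, 10.0) for eps in (0.001, 0.01, 0.1)]
scores = {}
for C, eps in grid:
    pred = SVR(kernel="rbf", C=C, epsilon=eps).fit(X_est, y_est).predict(X_hold)
    scores[(C, eps)] = mad_rmse(pred - y_hold)

best_C, best_eps = min(scores, key=lambda k: scores[k][1])   # minimize hold-out RMSE
```

Scoring only on the hold-out set, never on the estimation set, is what keeps the selected (C, ε) from over-fitting the data the model was trained on.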

Predictive Analysis
After optimizing the parameters of the SVR models (Table 6) based on the training set, our next step is to forecast based on the unseen testing set of quarterly data spanning December 2016 through December 2019. In the optimization of the SVR parameters and kernel functions, n is the size of the hold-out set, and θ_i and θ̂_i are, respectively, the true and predicted cost efficiency measures of the hold-out set. It is important to note that the cost, C, and the shape of the kernel function directly influence the values obtained by the SVRs and thus should be optimized. For this purpose, the training set is divided into two new sets: an estimation set, used to choose the optimal parameters, and a hold-out set, used to validate the smallest error possible. For the predictions in Section 4.4, n is the number of test observations, and θ_i and θ̂_i are, respectively, the true and predicted cost efficiency measures of the test samples. Table 7 presents
the results of prediction performance, MAD and RMSE, for OLS and SVRs models.
Compared with Table 6, the application of the SVR with Linear, Polynomial, and RBF kernels in Table 7 reveals that the MAD and RMSE performance criteria are smaller on the test data. This behavior is not always expected, as the models are optimized on the training set. Furthermore, to evaluate the performance accuracy of the OLS and SVR models, the Diebold-Mariano test (Diebold and Mariano, 1995) of forecast errors was used. Consider two forecasts, θ̂_it and θ̂_jt, of the time series θ_t, with associated forecast errors, ê_it and ê_jt; we wish to assess the expected loss associated with each of the forecasts. The Diebold-Mariano test is based on the loss differential, d_t = L(ê_it) − L(ê_jt). Under the null hypothesis of equal predictive accuracy, E[d_t] = 0, the Diebold-Mariano statistic for the 3-step-ahead forecasts, computed for t = t_1, ..., T for a total of T_0 forecasts, is:

DM = d̄ / sqrt(2π f̂_d(0) / T_0),

where d̄ is the sample mean of the loss differential and f̂_d(0) is a consistent estimate of the spectral density of d_t at frequency zero. In comparing the results of the SVR-Linear, SVR-Polynomial, and SVR-RBF models with the predictions obtained with the OLS model in Table 7, the SVR models regularly confirm their predictive power. In addition, Table 7 presents the results for the loss differential time series using the Diebold-Mariano test: at the horizon of three quarters ahead, the test statistically rejects the null hypothesis that the forecast errors are the same. To further illustrate the SVRs' prediction capabilities, Figure 7 presents a three-dimensional plot of the testing set spanning December 2017 through December 2019 for the actual DEA cost efficiency measures and the predictions of the DEA cost efficiency measures from the OLS, SVR-Linear, SVR-Polynomial, and SVR-RBF models.
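A compact sketch of the Diebold-Mariano statistic under squared-error loss, with h − 1 sample autocovariances in the long-run variance for an h-step horizon (a standard construction following Diebold and Mariano, 1995; the paper's exact loss function and variance estimator are assumptions here).

```python
import numpy as np

def diebold_mariano(e1, e2, h=3):
    """DM test of equal predictive accuracy under squared-error loss.

    e1, e2: forecast-error series of two competing models; h: forecast
    horizon. Returns a statistic that is approximately N(0, 1) under the
    null of equal expected loss; large positive values favor model 2.
    """
    d = e1 ** 2 - e2 ** 2              # loss differential d_t
    T = d.size
    dbar = d.mean()
    # Long-run variance of dbar using sample autocovariances up to lag h-1.
    gamma = [np.sum((d[k:] - dbar) * (d[:T - k] - dbar)) / T for k in range(h)]
    var_dbar = (gamma[0] + 2.0 * sum(gamma[1:])) / T
    return dbar / np.sqrt(var_dbar)
```

Feeding the OLS errors as `e1` and an SVR's errors as `e2` would yield a significantly positive statistic whenever the SVR's squared errors are systematically smaller.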

Using the performance criteria of MAD and RMSE, Table 7 further shows that there is no persuasive evidence for singling out the best model among OLS, SVR-Linear, SVR-Polynomial, and SVR-RBF for modeling the cost efficiency measures. That is, the MAD and RMSE values show that each model's forecasts deviate from the realized cost efficiency measures. Therefore, the most relevant question in analyzing the competing models, while guarding against data snooping, is: which model is the best? Effectively, comparing several models, SVR-Linear, SVR-Polynomial, and SVR-RBF, to a benchmark OLS model may result in spuriously identifying a superior model just by chance. To account for this data-snooping issue, we adopt the Model Confidence Set (MCS) proposed by Hansen et al. (2011), based on the seminal papers of Hansen (2005) and White (2000), for selecting a subset of a group of models whose members likely have the best forecasting accuracy. In the selection of the best model, the MCS seeks to find the group of models that are equally likely to be superior. The MCS test has a number of important advantages over the widely used alternative tests of Diebold and Mariano (1995), White (2000), Hansen (2005), Hansen and Lunde (2010), and Harvey and Liu (2019). First, the MCS test provides a measure of the uncertainty surrounding model selection. Second, the MCS expands as the level of confidence increases. Third, unlike alternative tests based on pair-wise model comparison, the MCS test requires less information about the optimal forecasting model.
Let the squared error loss function for model $j$'s prediction $\hat{y}_{j,t}$ of $y_t$ be given by $L_{j,t} = L(y_t, \hat{y}_{j,t}) = (\hat{y}_{j,t} - y_t)^2$. Henceforth, the best model is the model whose forecasts produce the minimum expected loss. Hansen et al. (2011) define the measure of relative model performance, $\nu_{j'j}$, finite and independent of $t$, as $\nu_{j'j} \equiv E(L_{j',t} - L_{j,t})$ for all $j', j \in M_0$, a finite initial collection of forecasting models. With model $j'$ preferred to model $j$ when $\nu_{j'j} < 0$, define the set of superior models as $M^* \equiv \{\, j' \in M_0 : \nu_{j'j} \leq 0 \;\; \forall j \in M_0 \,\}$.
With the objective of determining $M^*$, the test procedure constructs a set estimate, $\hat{M}^*_{1-\alpha}$, at confidence level $1-\alpha$ through a sequence of significance tests with null hypothesis $H_{0,M}: \nu_{j'j} = 0$ for all $j', j \in M$, where $M \subseteq M_0$. The alternative hypothesis is $H_{1,M}: \nu_{j'j} \neq 0$ for some $j', j \in M$. The selection process starts by setting $M = M_0$; if the null hypothesis that the models in $M$ are equally good is rejected, an elimination rule is employed to remove an inferior model from the set. This process is repeated until the null hypothesis is no longer rejected. Hence, when the null hypothesis is statistically not rejected at level $\alpha$, the remaining set of models is the MCS, $\hat{M}^*_{1-\alpha}$, which contains the best model with confidence level $1-\alpha$. Consequently, the MCS procedure deletes a model only if it is found to be significantly inferior to another model. Table 8 presents the results of the MCS for the collection of model candidates, OLS, SVR-Linear, SVR-Polynomial, and SVR-RBF, providing evidence that the out-of-sample forecasting exercise confidently identifies the superior predictive accuracy of the SVR-based forecasts over the OLS model. The MCS identifies SVR-RBF as the best model at the given level of confidence. Additionally, Table 8 presents the ranks of the model forecasts ordered by MCS p-value, with SVR-RBF, the model most likely to generate the most accurate forecasts, listed first, SVR-Polynomial second, SVR-Linear third, and finally the OLS model.
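The elimination logic of the MCS procedure can be illustrated with a deliberately simplified sketch. The real test derives its critical values by block bootstrap and uses Hansen et al.'s (2011) elimination rules; here a fixed normal critical value stands in and serial correlation in the losses is ignored, so this is a schematic of the loop, not the published implementation:

```python
import numpy as np

def mcs_sketch(losses, crit=1.96):
    """Schematic Model Confidence Set elimination loop.

    losses : dict mapping model name -> array of per-period losses L_{j,t}
    crit   : stand-in critical value for the max t-statistic; the real
             MCS test derives critical values by block bootstrap.
    Returns the set of surviving model names.
    """
    surviving = dict(losses)
    while len(surviving) > 1:
        names = list(surviving)
        L = np.column_stack([surviving[m] for m in names])
        d = L - L.mean(axis=1, keepdims=True)   # loss relative to set average
        d_bar = d.mean(axis=0)                  # mean relative loss per model
        se = d.std(axis=0, ddof=1) / np.sqrt(d.shape[0])
        t = d_bar / se
        if t.max() <= crit:                     # equal accuracy not rejected
            break
        surviving.pop(names[int(t.argmax())])   # eliminate the worst model
    return set(surviving)
```

The loop mirrors the sequential-testing description above: while the equal-accuracy null is rejected, the model with the largest relative loss is removed; the set remaining at the first non-rejection plays the role of $\hat{M}^*_{1-\alpha}$.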

Challenges and Conclusions
The evaluation of the capital adequacy requirements affecting the economic efficiency measures of U.S. commercial and domestic banks has been addressed in this paper using a two-step approach. First, the cost efficiency measures are estimated using the nonparametric DEA model. Second, this paper examines the application of OLS and SVR models to forecast the cost efficiency measures from capital adequacy requirements, in addition to regulatory, macroeconomic, and bank-characteristic variables. Our sample consists of 10,380 December quarterly observations of United States' commercial and domestic banks covering the period 2008 through 2019. A coherent framework for producing a set of highly accurate SVR models for forecasting the cost efficiency measures is presented in three steps. First, an approach for setting robust parameter values for SVR-Linear, SVR-Polynomial, and SVR-RBF while accounting for autocorrelation is presented. Second, the out-of-sample forecasts of the SVR-Linear, SVR-Polynomial, and SVR-RBF models are compared to the out-of-sample forecast of the OLS model. Finally, our paper adopts the MCS test to select the subset of the most accurate models among OLS, SVR-Linear, SVR-Polynomial, and SVR-RBF.
Our SVR approach produces better forecasts of the cost efficiency measures with a small set of predictors, limited to the capital adequacy requirements, state gross domestic products, Basel Accord III, and bank size, outperforming the linear-regression-based benchmark. The evaluation of the three-quarter-ahead out-of-sample performance of the SVR forecasts provides evidence that our approach identifies a relatively small set of SVR models with superior out-of-sample forecasting ability, in both economic and statistical terms, relative to the classical OLS model utilized in the literature. Hence, our results provide direct evidence that highly flexible SVR models offer a promising alternative and may be better suited than the linear models routinely employed in the literature. Furthermore, our approach accommodates novel economic applications characterized by serially correlated data.
There are, however, limitations that future research could address. First, although only the predictive models of SVRs and OLS are considered, SVR models allow for testing many kernel functions, while this study is limited to the three most common in the literature. Another limitation is the length of the historical period of bank data considered. Future studies may include a larger number of years and thus consider the impact of the 2007-2008 financial crisis; the resulting efficiency measures and the tuning parameters of the SVR kernel functions could vary. Finally, future studies could incorporate regional classification, as the forecasting results could vary across the twelve federal districts of the U.S.

Declarations
Availability of data and materials: The data are fully available from the Federal Financial Institutions Examination Council.