Efficacies of artificial neural networks ushering improvement in the prediction of extant credit risk models

Abstract The study's objective is to check whether Machine Learning Techniques have better predictive power than Logistic Regression in predicting the bankruptcy of firms, and whether that predictive power improves when a proxy for uncertainty is added to the model as a default driver. We considered the COVID-19 pandemic a black swan event that had caused ambiguity. A significant factor that has increased the probability of bankruptcy in recent times has been the large-scale supply chain disruptions and crippling lockdowns. Firms are trying to get back to pre-COVID utilization of plant capacity or pivot their business models to seize newer opportunities amidst the crisis. We considered the change in operating expenditure (primarily a decrease) as our proxy for uncertainty, as firms were forced to cut down majorly on their operations and thus incurred lower variable costs. In an economy showing inflationary trends, operating expenses will generally increase. But we found that operational costs had dipped for many of the firms during FY 20–21, and we attributed it to COVID-19 disruptions. Results show that Machine Learning Techniques are better than Logistic Regression in predicting the bankruptcy of firms and that the same predictive power improves when a proxy for uncertainty is added to the model.


Introduction
Bankruptcy is an economic condition of an entity when it cannot fulfill its debt obligation to its stakeholders. The inability to pay interest on the debt and overdrawn bank accounts are some initial signs of financial stress, eventually leading to bankruptcy. Coats and Fant (1993) found that bankruptcy is only one outcome of financial distress. Others include reorganization, liquidation, and acquisition by a viable firm. The prediction of business failure started with a simple evaluation of accounting ratios using univariate discriminant analysis. This step was followed by the multivariate linear discriminant analysis developed by Altman. The methods have evolved over time, with techniques such as probit/logit, reduced-form models, heuristics, and, more recently, machine learning algorithms gaining importance. The universe of data elements used in credit risk models, which are in a position to discriminate between defaulting and non-defaulting companies, is enormous. These measures can be categorized into variables from financial statements and statistical values, variables using information about the company and its environment, and variables using market data.
Many models have been developed with a combination of data elements but have failed to predict default accurately. The problem appears to lie in the design of the models: the technique used, the choice of variables used as default drivers, the weightage assigned to the variables, and the benchmark values have resulted in the models being ineffective. Recent studies have shown that the techniques used in the field of artificial intelligence can be a suitable alternative to traditional statistical methods because they do not impose prior assumptions on the distribution of the underlying data or the structure of the relationships between the variables involved. While statistical models require the researcher to specify the functional relationships between dependent and independent variables, non-parametric techniques enable the data to identify the functional relationships between the variables in the model. Today, businesses work in a VUCA world, exposed to Volatility, Uncertainty, Complexity, and Ambiguity. This world is constantly changing and challenging to predict, with multi-layered problems where nothing is black or white. Businesses often face distributions that are anything but normal, with fatter tails and black swan events. The world is yet to come out of Covid and its related disruptions. Many businesses face the risk of bankruptcy. In such a world, complexities and simplicities must co-exist.
In contrast, previous studies on models have revealed that none consider a proxy for uncertainty as a default driver. They assume that "Credit Risk Analysts" are rational and fully know the unknown future and its associated probabilities. This phenomenon leads us to believe that credit risk management is simple, and that portfolio risk can be calculated, priced, and hedged because of symmetry and normality.
In this study, we intend to contribute to the literature by suggesting that a proxy for uncertainty (the effect of COVID-19 ambiguity (as a temporary shock)) be considered an additional default driver for predicting bankruptcy. Also, we intend to suggest that machine learning techniques may be better than statistical techniques like logistic regression to predict bankruptcy.

Literature review
The literature on bankruptcy prediction dates to the 1930s, beginning with the initial studies concerning the use of ratio analysis to predict the future bankruptcy of firms. We found that research up till the mid-1960s focused on univariate analysis (single factor/ratio analysis). While analyzing the financial ratios of bankrupt and nonbankrupt firms, FitzPatrick (1932) found a significant difference between their turnover, liquidity, and debt ratios and suggested that less importance should be given to the Current and Quick ratios for firms with long-term liabilities. Analyzing the financial ratios in pairs of 183 failed firms from various industries, Smith and Winakor (1935) found that the Current ratio reduced as the firm approached bankruptcy. Merwin (1942) found that failing firms displayed signs of failure as early as 4 to 5 years before the bankruptcy.
In his comparison of the financial ratios of profitable and unprofitable firms, Jackendoff (1962) found the Current Ratio to be better for profitable firms. Beaver (1966), in his univariate study, compared the mean values of 30 ratios of 79 failed and 79 non-failed companies in 38 industries and found Net Income to Total Debt to be the best predictor. In his suggestions for future research, he indicated the possibility that multiple ratios considered simultaneously may have higher predictability than single ratios, and so began the evolution of bankruptcy prediction models. The first multivariate study was published by Altman (1968). He predicted bankruptcy in manufacturing firms using Discriminant Analysis. Since then, various methods such as Multivariate Discriminant Analysis (MDA), Probit analysis, Logit analysis, and neural networks have been used to develop models for predicting bankruptcy.
Analyzing the various bankruptcy studies from 1972 to 2001, Jodi Bellovary et al. (2007) found 165 studies involving thousands of firms. In all these studies, the factors considered ranged from 1 to 57. A total of 752 different elements are used in all these studies combined.
Theoretical arguments for using logistic regression with maximum likelihood estimation rather than linear discriminant analysis, both in the classification problem and in the problem of relating qualitative to explanatory variables, were presented by Press and Wilson (1978). Zmijewski (1984) examined two potential biases caused by the sample selection/data selection process used in most financial distress studies and derived that the potential benefit of using these approaches appeared to be in estimating the sample probability distribution. Frydman et al. (1985) used 20 financial ratios to present a new classification procedure, the Recursive Partitioning Algorithm (RPA), for economic analysis. They found that it had immense positive attributes in the case of corporate distress issues and that it compared well with discriminant analysis within the context of firm financial distress.
Using five financial states (financial stability, omitting or reducing dividend payments, technical default, default on loan payments, and protection under Chapter X or XI of the Bankruptcy Act), Lau (1987) approximated the continuum of corporate financial health and estimated the probabilities that a firm will enter each of the five states. Pacey and Pham (1990) used Multiple Discriminant Analysis (MDA) and Logit/Probit techniques to address three methodological problems in an earlier bankruptcy model. They used seventeen quantitative variables on sample data from 1966 to 1986 of Australian companies. They found no evidence of a significant difference in model explanatory power adduced by the choice of either factor analysis or the stepwise procedure.
Using Logistic Regression and Neural Network Computing, Bell et al. (1990) found that the neural network model performed as well as the logistic regression model. Coats and Fant (1993) attempted to improve the prediction rate of models by contrasting neural network predictive accuracy with that of discriminant analysis. They concluded that both models could not supplant human judgment in the credit decision process.
The utility of a neural network as a bankruptcy prediction system was discussed by Tsukuda and Baba (1994). They used the financial data of manufacturing companies three years before failure. They found the predictions matching accurately only for a small subset of companies. R. L. Wilson and Sharda (1994) and N. Wilson et al. (1995) compared the predictive performance of classical multivariate discriminant analysis to that of a neural network for a firm's bankruptcy. They developed a model using neural networks to predict three corporate outcomes: failure, non-failure, and distress. The results showed that the model correctly predicted 100% of non-failed firms and 97.5% of failed firms, with an overall accuracy of 98.5%. Rudorfer (1995) conducted similar research for Austrian companies with five balance sheet ratios for 82 companies and achieved 90% accuracy.
A comparative study on logit and neural networks for bankruptcy prediction using cash-based ratios for the oil and gas industry was conducted by El-Temtamy (1995). He found that neural networks outperformed logit models. Similarly, Leshner and Spector (1996) compared the performance of various neural network models differing in their data span, learning technique, and the number of iterations. The results showed that these neural models had better predictive power than Altman's discriminant analysis Z-model. Begley et al. (1996) compared Altman's and Ohlson's models and found that they could not perform well in recent periods. When the original models were compared with their respective re-estimated versions, Ohlson's original model displayed the most robust overall performance.
Using Kohonen maps/Self Organising Feature Maps (SOFM) with five financial ratios, Martin-Del-Brio and Serrano-Cinca (1995) achieved 81% prediction accuracy and showed that neural network systems could be integrated into broader decision-making. Refenes et al. (1996) constructed a model of implied volatility. They provided incremental value by extending the model to capture residual non-linear dependencies. They took data from Spain's principal stock exchange, Ibex, and applied neural applications in option pricing, cointegration, term structure of interest rates, and models of investor behavior. They found that the solutions required further rigorous statistical foundations. Zhang et al. (1999) attempted to bridge the gap between theoretical development and the real-world application of artificial neural networks (ANN). They concluded that ANN was the only known method that estimated posterior probabilities directly when underlying group population distributions were unknown.
These predictive studies continued to gain momentum, and the last two decades have seen models progressing more toward machine learning techniques. Platt and Platt (2002) deployed a logit model to predict financial distress among companies in the automobile supplier industry. The results showed that 98 percent of all firms in the population were correctly classified. Pohar et al. (2004) identified the two most widely used statistical techniques for analyzing categorical outcomes, i.e., linear discriminant analysis and logistic regression. They compared the techniques to set some guidelines to enable choosing between methods. Chava and Jarrow (2004) used a combination of financial and stock market-related variables in Shumway's (2001) model. They found that it had superior forecasting performance, as opposed to Altman (1968) and Zmijewski (1984), and demonstrated the importance of including industry effects in hazard rate estimation. Using ten ratios, Gepp and Kumar (2008) suggested that survival analysis techniques provide more information that can be used to further the understanding of the business failure process. Iazzolino et al. (2013) looked at both quantitative (solvency/liquidity/profitability/interest coverage/efficiency) and qualitative data (human/structural/relational capital). They showed the importance of taking into account some aspects of intangible assets in credit risk evaluation.
A study by Cerchiello and Nicola (2018) found that temporal dynamics and spatial differentiation matter in news contagion. They analyzed the contagion pattern in the information flow related to the characteristics and environment in which the entities of interest are operating. They used a modified version of the topic model Latent Dirichlet Allocation (LDA), the structural topic model (STM), to investigate the causal effect in the diffusion of the news.
Vochozka, Vrbka, et al. (2020) effectively combined traditional methods with advanced artificial intelligence techniques to improve the effectiveness of the models. They created a methodology for the identification of company bankruptcy. They used artificial neural networks with at least one long short-term memory (LSTM) layer. They found that the LSTM models produce excellent results in the area of dynamic counting and that neural networks as a tool could smoothen the time series data. Rowland et al. (2020) showed that multilayer perceptron networks are more efficient than radial basis function neural networks.
Giudici, with his co-authors, Giudici (2001) and Giudici et al. (2020), found that the Bayesian method coupled with Markov Chain Monte Carlo computational techniques can be successfully employed in the analysis of high dimensional complex data sets. They augmented traditional credit scoring methods with alternative data derived from similarity networks among borrowers, deduced from their financial ratios. The results showed the proposed approach to have better prediction accuracy. Bussmann et al. (2021) proposed an Artificial Intelligence (AI) model for measuring risks that arise when credit is borrowed using peer-to-peer lending platforms. This model applied correlation networks to Shapley values and grouped AI predictions according to the similarity in the underlying values.
The AI method based on Lorenz decompositions was provided by Giudici and Raffinetti (2021) to illustrate the prediction of bitcoin prices within the context of real financial problems. Fantazzini et al. (2008) demonstrated the usage of "Copula," a statistical tool used in finance and engineering to build flexible joint distribution to model many variables for managing operational risk.
Typically, one should consider defaults (a condition where the firm delays or misses a contractual debt repayment) when predictive models are being discussed. However, the default data is unavailable. Most studies have examined bankruptcies (where companies file under chapter 7 [eventual liquidation] or chapter 11 [when there is uncertainty in the timing and magnitude of final payments to the firm's claim holders]).

Data and methodology
The primary objective of this study is to compare the predictive performance of different machine learning models before and after adding a proxy for uncertainty as an additional default driver. In the exploratory analysis, we would like to observe if a neural network (along with other machine learning techniques) performs better than the traditional model while predicting bankruptcy. Based on the study's objectives, we consider companies listed on BSE or NSE as our data universe. Out of approximately 5000 actively listed companies, we ignore stocks of financial companies because of the differential treatment of the balance sheet of those companies. From the 3392 non-financial companies, we selected 1149 companies spread across various sectors based on their market share. The data set comprises information about both bankrupt companies and nonbankrupt ones. Next, we have selected the default drivers. From the historical studies, we chose 15 studies with high levels of accuracy. The financial default drivers used in these studies ranged from 2 to 21. These served as a base for our research. We then zeroed down on six studies (see Table 1), where the accuracy levels were close to 100 percent and listed the drivers used by them. A total of 89 financial default drivers were used in these studies. After filtering for duplicates, we were left with 63 drivers, as shown in Table 1.
We took the data for the past 12 years, i.e., from FY 2010 to FY 2021. Due to the non-availability of actual data for Q4 of FY 21 at the time of finalizing our report for this thesis paper, we extrapolated the financials of the entire financial year based on the actual financials of the previous three quarters to find out the relevant ratios for our analysis. This time frame is particularly of interest due to the 2008 global financial crisis and the recent pandemic. Both these were black swan events that disrupted the economies worldwide. With the collapse of Lehman Brothers in mid-September 2008, there was a full-blown meltdown of the global financial markets. It created a crisis of confidence that led to the seizure of the interbank market and had a trickledown effect on trade financing in emerging economies. Similarly, because of the sudden rise of COVID-19 cases, the Government of India enforced a nationwide lockdown, resulting in a supply-and demand-side shock to the Indian economy. Many companies started having financial problems due to the complete shutdown of their operations. Due to this crisis, many businesses are expected to go bankrupt in the coming years. None of the input ratios used in the earlier studies captures the effect of black swan events. Hence, to make this research more relevant, we have taken changes in Operating Expenses between FY 21 & FY 20 as a proxy for the COVID-related uncertainty.

COVID-19 and its impact on businesses
The outbreak of COVID-19 brought social and economic life to a standstill. Financial stress is rapidly growing. All the resources have been diverted to meeting a crisis of a magnitude never experienced before. Lockdown and social distancing measures result in productivity loss; they also cause massive supply-chain disruptions and sharp declines in consumer demand for goods and services, thus leading to a collapse in economic activity. As stated in a research paper in the Journal of Health Management, a 2019 joint report from the WHO and the World Bank estimated the impact of such a pandemic at 2.2 percent to 4.8 percent of global GDP. In another report titled "COVID-19 and the world of work: Impact and policy responses" by the International Labour Organization, it was explained that the crisis has already transformed into an economic and labor market shock, impacting not only supply (production of goods and services) but also demand (consumption and investment). As far as India is concerned, economists are pegging the cost of the COVID-19 lockdown at US$120 billion or 4 percent of the GDP. International and internal mobility is restricted, and the loss of revenues generated by travel and tourism, which contribute 9.2% to the GDP, will significantly affect the GDP growth rate. Aviation revenues will come down by USD 1.56 billion. Foreign Portfolio Investors (FPIs) have started withdrawing massive amounts from India. India's actual gross domestic product (GDP) is expected to grow by [...]. Many financial agencies have used different methods and sector-wise calculations to estimate the potential losses for India due to the lockdown. The Asian Development Bank, for example, has estimated that the GDP growth in India for FY 2020-21 will slip to 4%. The World Bank has stated that GDP growth will be as low as 1.5-2.8% (ibid). The IEA has estimated that each month of lockdown will have a pro-rata impact of a reduction in annual GDP by 2%.
Tejal Kanitkar (2020) and Chadha et al. (2020), in an input-output analysis, predicted the output losses due to lockdown implementation. The model results showed that the economic losses would range between 10 and 31% of the gross domestic product for 2020-21. Most of the other estimates deal with the direct impact of the lockdown on each sector. In contrast, the results from the model presented here also consider the indirect consequences of reduced final demand and multiplier effects that impact an industry because of changes in other sectors.
Considering COVID-19 as the black swan event, we then selected the change in operating expenses (excluding depreciation & amortization) as the proxy for uncertainty. We believe this will capture the effects of the COVID-19 pandemic-led disruptions, as many business operations were shut down. An operating expense is an expense a business incurs through its normal business operations. It includes rent, equipment, inventory costs, marketing, payroll, insurance, funds allocated for research and development, etc. When businesses do not work to their total capacity in a year (for any reason), it gets reflected in the operating expenses. Covid had disrupted firms across all sectors during the early half of the financial year 20-21 because of lockdowns. In an economy showing inflationary trends, the operating expenses will generally increase. But because of the lockdown, the operational costs have decreased in 20-21. This fact can only be attributed to covid disruption.

Data mining process
The data mining process began with importing and understanding the data for the 1149 companies. First, we checked whether the data contained any shortcomings/imbalances and fixed them. Then we trained models using various modeling techniques. We then evaluated the performance of the models using multiple metrics. Next, we introduced uncertainty (the Covid-19 impact) into the models and again assessed their performance. Thus, we captured the chosen models' incremental efficacies (if any). We then ranked the models based on their accuracies and graphically represented the results.
The dataset comprises information about both bankrupt firms and operating firms. Tobin's Q has been taken as a proxy of a firm's operating performance in many studies of corporate governance (Wernerfelt and Montgomery (1988), Fu et al. (2016), Hennessy and Whited (2005)). The coefficient of the variable Q (Tobin's Q) has also been considered as the primary proxy for an index of the probability of bankruptcy while modeling for the same. Hence, for the classification scheme, we have taken a Tobin's Q of greater than one as indicative of non-bankruptcy (Class 0) in a particular year and a Tobin's Q of less than one as indicative of bankruptcy (Class 1) in that year. Some observations in the dataset had missing data values for some variables. In general, missing values are more frequent for companies that go bankrupt than for the ones that survive. Class 1 is categorized as the minority class and Class 0 as the majority class.
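The classification scheme above can be sketched as a small labelling function. This is only an illustration: the function name, the sample firm-year keys, and the handling of a missing Q or a Q of exactly one (which the text does not specify) are assumptions.

```python
from typing import Optional


def label_from_tobins_q(q: Optional[float]) -> Optional[int]:
    """Map a firm-year Tobin's Q to the class label described in the text."""
    if q is None:
        return None  # assumption: a missing Q leaves the firm-year unlabelled
    if q > 1.0:
        return 0     # Class 0: non-bankruptcy
    if q < 1.0:
        return 1     # Class 1: bankruptcy
    return None      # assumption: Q exactly equal to one is not covered by the text


# Hypothetical firm-year observations, for illustration only.
sample_q = {"FirmA_FY2019": 1.8, "FirmB_FY2019": 0.6, "FirmC_FY2019": None}
labels = {key: label_from_tobins_q(q) for key, q in sample_q.items()}
```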
We found data imbalance with a severe skew in Class 0, which was treated using the "Undersampling" method. We randomly removed observations of the majority class to better balance the number of records of the two categories, given that our dataset has more than 10,000 total observations. On checking the remaining observations, we found missing values for some of the variables. Some researchers replace the missing data with "0". This change can again lead to the introduction of bias and make the results irrelevant. A method is needed to replace the missing value with a valid value.
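Random undersampling of the majority class can be sketched as follows. The target share of majority records is an assumed parameter (the text does not state the ratio that was used), and the record layout with a "label" key is hypothetical.

```python
import random


def undersample_majority(records, majority_share=0.5, label_key="label", seed=0):
    """Randomly drop Class 0 (majority) records until they make up
    roughly `majority_share` of the returned data set."""
    majority = [r for r in records if r[label_key] == 0]
    minority = [r for r in records if r[label_key] == 1]
    # Majority records needed so that they form `majority_share` of the result.
    target = round(len(minority) * majority_share / (1.0 - majority_share))
    target = min(target, len(majority))  # never ask for more than are available
    rng = random.Random(seed)            # fixed seed keeps the sketch reproducible
    balanced = rng.sample(majority, target) + minority
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 20 Class 0 records and 4 Class 1 records.
toy = [{"id": i, "label": 0} for i in range(20)]
toy += [{"id": 100 + i, "label": 1} for i in range(4)]
balanced = undersample_majority(toy, majority_share=0.5)
```

All minority records are kept; only majority records are discarded, which is what distinguishes undersampling from oversampling or synthetic approaches.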
Hence, we made use of model-based imputation as it measures the uncertainty of the missing values in a better way. The chained equations approach is also very flexible. It can handle different variables of different data types (i.e., continuous, or binary) and complexities such as bounds, or survey skip patterns.
Regarding the predictive modeling techniques, we used different statistical, data mining, and machine learning techniques which range from single classifier techniques (viz. Logistic regression) to ensemble techniques (viz. AdaBoost).

Results
The following hypotheses were formulated.
(1) The predictive power of Machine Learning Techniques is better than Logistic Regression in predicting the bankruptcy of firms.
(2) The same predictive power of ascertaining bankruptcy improves when Covid-19-based uncertainty is added to the model.
After cleaning the data, we arranged the 63 calculated ratios of our 1149 companies vertically year-wise from FY 2009 till FY 2021 (taking the respective years as controlling variables alongside the observations). Years as controlling variables have been skipped. Company names have been kept as metadata. All the attributes were used as features while fitting the data for our analysis in Orange.
We evaluated our models on the pre-processed data via the Test & Score module in the Orange data mining software, using six predictive modeling techniques: Classification Tree, Random Forest, Neural Network, Naïve Bayes, Logistic Regression, and AdaBoost. We set the Target Class to 1 and opted for 5-fold cross-validation to assure the generalizability of the model prediction results. The process involves dividing the dataset into five folds randomly. In each iteration, four folds are used for training and the remaining fold for validation. The results of each iteration are aggregated. The aggregate results are shown in Table 2(a).
Based on five evaluation parameters for each of those six techniques, namely Area Under the ROC Curve (AUC), Classification Accuracy (CA), F1 Score, Precision, and Recall (details discussed in Appendix 1), we found the Random Forest modeling technique to be the superior technique in terms of the number of correct predictions made out of the total number of predictions made and classifier separability. The next best technique is the Neural Network.
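The fold logic of 5-fold cross-validation can be sketched in plain Python. This is not Orange's implementation; the function names are hypothetical, the split is a simple unstratified shuffle (an assumption), and the scoring callback stands in for whatever model is being trained.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding spreads the remainder so that fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, fit_and_score, k=5, seed=0):
    """Average a caller-supplied score over the k validation folds."""
    scores = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


# 4136 is the total number of records reported in the Results section.
splits = list(k_fold_indices(4136, k=5))
```

Here `fit_and_score(train_idx, val_idx)` would train a model on the training rows and return a metric such as classification accuracy on the validation rows; averaging is one common way of aggregating the per-fold results.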

We computed the No Information Rate (NIR) = Majority Class Records / Total No. of Records = 2678 / 4136 = 0.65 in this case. Hence, all models apart from the Naïve Bayes model qualify for further analysis; for those, classification accuracy comes out to be greater than the NIR as it is considered a cut-off value (for Naïve Bayes, the classification accuracy is 0.636, which is less than 0.65).
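The No Information Rate check can be reproduced with a few lines of arithmetic. The record counts and the three accuracies below are the ones quoted in this section; the remaining models are omitted because their values appear only in Table 2(a). Variable names are illustrative.

```python
majority_records = 2678  # records in the majority class (from the text)
total_records = 4136     # total number of records (from the text)

# Accuracy of always predicting the majority class; about 0.6475, reported as 0.65.
nir = majority_records / total_records

# Classification accuracies quoted in the text (before adding the COVID-19 proxy).
accuracy = {"Random Forest": 0.753, "Neural Network": 0.751, "Naive Bayes": 0.636}

# A model qualifies for further analysis only if it beats the NIR cut-off.
passes_nir = {model: ca > nir for model, ca in accuracy.items()}
```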
In this scenario, we believe that a Type II error (false negative) is more critical, as it means that the model predicts a company to be nonbankrupt when the company is in fact bankrupt. Hence, we rank the models by their Recall value, as we want to reduce false negatives. On this criterion, Random Forest ranks highest, followed by Neural Network, Classification Tree, AdaBoost, and Logistic Regression.
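To make the link between false negatives and Recall concrete, the sketch below derives the evaluation metrics from a confusion matrix with bankruptcy (Class 1) as the positive class. The counts are hypothetical and are not taken from the study's tables.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute CA, Precision, Recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall falls as false negatives (bankrupt firms predicted as healthy) rise.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"CA": accuracy, "Precision": precision, "Recall": recall, "F1": f1}


# Hypothetical counts: 100 bankrupt firm-years, 20 of which the model misses.
metrics = classification_metrics(tp=80, fp=30, fn=20, tn=170)
```

With these counts, Recall is 80 / (80 + 20) = 0.8: every additional missed bankruptcy lowers Recall directly, which is why it is the ranking criterion when false negatives are the costlier error.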
When checked via ROC analysis, Neural Network and Random Forest also gave the best possible separability measures between the positive and negative classes, with the Area under the ROC Curve nearing 1.
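The separability that the AUC measures can be illustrated with its pairwise interpretation: the probability that a randomly chosen bankrupt firm receives a higher predicted score than a randomly chosen nonbankrupt one. The following is a minimal sketch with hypothetical labels and scores, not the routine used by Orange.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly;
    ties count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted bankruptcy probabilities for six firm-years.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1 means every bankrupt firm is scored above every nonbankrupt firm, while 0.5 corresponds to random ranking.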
To check for the improvement in prediction accuracy in the models after adding the COVID-19-related uncertainty index, we took the published Operating Expenditure data of the same companies for nine months of FY 2021 and extrapolated it to the entire financial year. Upon observing the increase (decrease) in the same metric, "OPEX", over FY 2020E, we standardized the difference using Z-scores (Z Score = [Individual Value - Mean Value] / Standard Deviation) to use it as our COVID-19 proxy. After that, we applied the dataset with 64 independent variables (the same 63 variables as used previously and this COVID-19 proxy variable) to the same models. The results are shown in Table 3.
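The construction of the COVID-19 proxy can be sketched as below. Several details are assumptions, since the text does not specify them: the nine-month figure is annualized pro rata (times 12/9), the difference is taken in absolute rather than percentage terms, and the population standard deviation is used. Firm names and figures are hypothetical.

```python
from statistics import mean, pstdev


def annualize_nine_months(opex_9m):
    """Scale nine months of operating expenses to a full year (pro-rata assumption)."""
    return opex_9m * 12.0 / 9.0


def covid_proxy(opex_fy20, opex_fy21_9m):
    """Z-score of each firm's change in operating expenses, FY 2021 over FY 2020."""
    change = {
        firm: annualize_nine_months(opex_fy21_9m[firm]) - opex_fy20[firm]
        for firm in opex_fy20
    }
    values = list(change.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all firms show the same change; Z-scores are undefined")
    # Z Score = [Individual Value - Mean Value] / Standard Deviation
    return {firm: (delta - mu) / sigma for firm, delta in change.items()}


# Hypothetical operating expenses for three firms.
opex_fy20 = {"FirmA": 120.0, "FirmB": 200.0, "FirmC": 90.0}
opex_fy21_9m = {"FirmA": 75.0, "FirmB": 165.0, "FirmC": 60.0}
proxy = covid_proxy(opex_fy20, opex_fy21_9m)
```

A negative Z-score marks a firm whose operating expenses fell by more than the sample average, which is the pattern the study attributes to lockdown disruptions.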
While predicting for test data, we again found the Random Forest modeling technique to be superior in predicting the target variable with the maximum classification accuracy, with Neural Network coming a close second.
The prediction accuracy thus seems to have improved for Random Forest (0.758 from 0.753), while for Neural Network it has decreased slightly (0.744 from 0.751). The accuracy of the other three relevant models (Logistic Regression, AdaBoost, and Classification Tree) has shown no change after adding the Covid-19-related uncertainty index.
Thus, we can observe from the models that the predictive power of the Machine Learning Techniques-Neural Network, AdaBoost, and Random Forest is superior to Logistic Regression in predicting the bankruptcy of firms. Subsequently, the same predictive power is also improved when Covid-19-related uncertainty is added to the model.

Conclusion
From the results, we conclude that the accuracy of machine learning algorithms could be improved by including a proxy for uncertainty as a default driver. We have considered the COVID-19 pandemic as a black swan event that had caused ambiguity and used the change in operating expenses as the said proxy. A significant factor that has increased the probability of bankruptcy in recent times has been the large-scale supply chain disruptions and crippling lockdowns. Firms are trying to get back to pre-COVID plant capacity utilization or having to pivot their business models to seize newer opportunities amidst the crisis. We considered the change in operating expenditure (primarily a decrease) as our COVID-19 proxy, as firms were forced to cut down majorly on their operations and thus incurred lower variable costs. In an economy showing inflationary trends, operating expenses will generally increase. But we found that operational costs had dipped for many of the firms during FY 20-21, and we attributed it to COVID-19 disruptions. This value captures individual firm-level idiosyncratic exposures to COVID-19 as per our model. We studied 1149 Indian firms from various sectors (other than financial ones) based on their market share (mainly large-cap and mid-cap companies). As depicted in the results above, the predictive power of the Machine Learning Techniques (Neural Network, AdaBoost, and Random Forest) was superior to that of Logistic Regression. Subsequently, the same predictive power also improved when we added Covid-19-related uncertainty to the model.
Future research could study the performance of models using other techniques after including a default driver for uncertainty. The scope of the model could be extended to different geographies as well.

Disclosure statement
No potential conflict of interest was reported by the authors.