Small- and medium-enterprises bankruptcy dataset

Bankruptcy prediction is a long-standing issue that receives significant attention from academic researchers and industry practitioners. Most papers on bankruptcy prediction focus on companies that are listed on the stock market, and only limited data are available for the remaining companies. These companies, not listed on any stock market, represent a significant part of the economy. The presented dataset consists of financial ratios of Slovak companies. There are 21 distinct financial ratios, available for the three consecutive years prior to the evaluation year, in which a company may or may not have filed for bankruptcy. The companies come from four different industries: agriculture, construction, manufacture, and retail. We provide data for four consecutive evaluation years, 2013–2016, for each industry. All companies are categorized as small and medium-sized enterprises according to the EU classification. Prediction performance results on this dataset are published in the research paper “Bankruptcy prediction for small- and medium-sized companies using severely imbalanced datasets” (Zoričák et al., 2019).

Keywords: Financial ratios, SME, Bankruptcy, Imbalanced data, Machine learning
© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
The dataset is accessible on Mendeley Data [2] and provides financial ratios of limited liability companies. There are three possible views of the data, as depicted in Table 1 [1].

Value of the data
- The dataset provides financial ratios of an exhaustive set of companies in four sectors: agriculture, construction, manufacture, and retail.
- The data can be used to propose or to benchmark statistical models or machine learning algorithms for bankruptcy prediction.
- The dataset can be used to investigate markers of upcoming bankruptcy.
- The data can be used to benchmark and validate methods for imbalanced learning, since distribution of the bankrupt and non-bankrupt companies is strongly imbalanced.

Financial ratios of companies are provided for three years prior to the year when the company is evaluated as bankrupt or non-bankrupt. In order to provide an overview of individual variables, we provide descriptive statistics in the form of boxplots in Fig. 2. All variables include outliers for almost all years. The interquartile range is relatively stable for all variables for all industries.
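
The boxplot quantities mentioned above (quartiles, interquartile range, outliers) can be reproduced with a minimal sketch like the following. It assumes the common boxplot convention of flagging values beyond 1.5 times the interquartile range from the quartiles; the source does not state which whisker rule was used for Fig. 2. The function name `iqr_summary` and the sample values are hypothetical and are not taken from the dataset.

```python
from statistics import quantiles


def iqr_summary(values, whisker=1.5):
    """Return quartiles, IQR, and values outside the boxplot whiskers.

    Assumption: whiskers at `whisker` * IQR beyond the quartiles, which is
    a common convention and not necessarily the one used in the paper.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical values of one financial ratio for one industry and one year.
sample = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 9.0]
summary = iqr_summary(sample)
print(summary["iqr"], summary["outliers"])
```

Applying such a summary per variable, industry, and year is one way to check whether the interquartile range stays stable while individual observations fall far outside it.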

Experimental design, materials, and methods
We extracted values from the financial statements of each company for all available years. Financial statements consist of a balance sheet and an income statement. The balance sheet provides detailed information regarding assets, equity, and liabilities. The income statement covers revenues, costs, and profit/loss for a given accounting period. Financial statements are publicly accessible in the Register of Financial Statements [3], a database of the financial statements of all business entities, operated by the Ministry of Finance of the Slovak Republic. We used the extracted values to calculate the financial ratios listed in Table 1 using Equations (1)–(21). Based on the available data, we identified four evaluation years: 2013, 2014, 2015, and 2016. Companies were evaluated and divided into two categories: bankrupt and non-bankrupt. The evaluation was based on [4], which defines two distinct proceedings for companies in financial difficulties: the bankruptcy procedure and restructuring. A company that begins the restructuring process may recover its financial health but nevertheless poses a risk to its creditors. Thus, we classify companies in both the bankruptcy procedure and the restructuring process as bankrupt. After classification, we selected only companies with available data for the three years prior to the evaluation year.
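
The labelling and selection rules described above can be illustrated with a short sketch. It is not the authors' code: the `Company` record, its field names, and the example companies are hypothetical, and it assumes that each company's proceeding status has already been determined for the evaluation year in question.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Both proceedings are treated as "bankrupt", as described in the text.
BANKRUPT_PROCEEDINGS = {"bankruptcy", "restructuring"}


@dataclass
class Company:
    company_id: str             # hypothetical identifier
    proceeding: Optional[str]   # "bankruptcy", "restructuring", or None
    years_with_data: Set[int]   # years with available financial ratios


def label(company: Company) -> int:
    """Return 1 for bankrupt (bankruptcy or restructuring), 0 otherwise."""
    return 1 if company.proceeding in BANKRUPT_PROCEEDINGS else 0


def has_full_history(company: Company, evaluation_year: int, n_years: int = 3) -> bool:
    """Check that data exist for each of the n_years before the evaluation year."""
    return all(
        evaluation_year - k in company.years_with_data
        for k in range(1, n_years + 1)
    )


def build_sample(companies, evaluation_year):
    """Keep only companies with a complete three-year history, with labels."""
    return [
        (c.company_id, label(c))
        for c in companies
        if has_full_history(c, evaluation_year)
    ]


# Hypothetical companies for evaluation year 2013.
companies = [
    Company("A", None, {2010, 2011, 2012}),
    Company("B", "restructuring", {2010, 2011, 2012}),
    Company("C", "bankruptcy", {2011, 2012}),  # 2010 missing, so excluded
]
sample_2013 = build_sample(companies, 2013)
print(sample_2013)
```

In this sketch, company B is labelled bankrupt even though it is only in restructuring, and company C is dropped because one of the three prior years is missing.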