Credit Risk Theoretical Model on the Base of DCC-GARCH in Time-Varying Parameters Framework

Moiseev, Nikita; Sorokin, Aleksander; Zvezdina, Natalya; Mikhaylov, Alexey; Khomyakova, Lyubov; Danish, Mir Sayed Shah

doi:10.3390/math9192423

¹ Department of Mathematical Methods in Economics, Plekhanov Russian University of Economics, 117997 Moscow, Russia

² Department of Statistics and Data Analysis, Faculty of Economic Sciences, National Research University Higher School of Economics, 101000 Moscow, Russia

³ Financial Research Institute of Ministry of Finance of the Russian Federation, 127006 Moscow, Russia

⁴ Institute for Research of International Economic Relations, Financial University under the Government of Russian Federation, 124167 Moscow, Russia

⁵ Strategic Research Projects Center, University of the Ryukyus, Nishihara, Okinawa 903-0213, Japan

* Author to whom correspondence should be addressed.

Mathematics 2021, 9(19), 2423; https://doi.org/10.3390/math9192423

Submission received: 20 August 2021 / Revised: 19 September 2021 / Accepted: 26 September 2021 / Published: 29 September 2021

(This article belongs to the Special Issue Mathematical-Statistical Models and Qualitative Theories for Economic and Social Sciences)

Abstract

This paper develops a mathematical approach for dealing with time-varying parameters in rolling-window logit models for credit risk assessment. Forecasting the coefficients yields better model accuracy than the trivial approach of applying parameters computed from past statistics to the next time period. A new method of dealing with time-varying parameters of scoring models is proposed, aimed at computing the default probability of a borrower. It is shown empirically that in a continuously changing economic environment the influence of factors on a target variable also changes. Therefore, forecasting coefficients yields a better financial result than simply applying parameters obtained from statistics accumulated over past time periods. The paper develops a new theoretical approach that combines an ARIMA class model, the DCC-GARCH model and the state–space model, and that is more accurate than using only the ARIMA model. Rigorous simulation testing is provided to confirm the efficiency of the proposed method.

Keywords:

default probability; scoring model; logistic regression; time-varying parameters; time series forecasting; ARIMA; DCC-GARCH; Kalman filter; state–space model

1. Introduction

Since lending is one of the primary operational activities of most banks around the world, banks need to classify their borrowers properly by risk of default. If a bank’s scoring system fails to reliably select solvent clients, it threatens the bank’s financial stability, damages the banking sector and, as a consequence, can lead to an economic crisis. Thus, a good scoring model can significantly improve a bank’s profit and prevent unnecessary losses. However, building a reliable and accurate scoring model in a dynamic and volatile economic environment is a challenging task. Since the end of the last century, researchers have tested a vast number of methods for predicting the probability of default (PD) by incorporating various factors associated with each particular borrower. Popular methods and their extensions include linear discriminant analysis (LDA), logistic regression, support vector machines (SVM), neural networks, the naïve Bayes classifier, classification trees, random forest, gradient boosting, ensembles of different models and others. However, it is not easy to establish which method provides the best model. Conventionally, the available statistics are split into training and testing datasets; the parameters of each analysed method are fitted on the training dataset, and the model’s accuracy is then verified on the testing dataset. Such an approach nevertheless has a flaw: the models are tested on borrowers who were already filtered by the initial scoring system. Thus, the tested methods fail to take into account the so-called “grey zone” (clients who either did not apply for a loan or were filtered out by the initial scoring system). Therefore, the obtained results can only be regarded as the efficiency of an additional filter on top of the existing scoring system.

The downside of stacking filters in this way is a significant shrinkage of the pool of borrowers, which jeopardises a bank’s profit. In order to compare scoring models objectively, one would require a non-filtered pool of borrowers to be assessed independently by the tested models. Besides that, due to the evolution of the economic environment, the true parameters of any model are subject to change: some factors become insignificant with time, while the influence of others increases.

The paper fills a gap in methods that adjust the obtained coefficients by means of a Kalman filter, since two independent estimates are available: the first from forecasting the parameter vector, the second from fitting parameters to accumulated statistics. Traditionally, implementing a scoring model in a credit organisation means fitting parameters to the available statistics and then applying the model to incoming borrowers. However, if the true parameters of a model change dynamically, such an approach applies parameter values that are already obsolete, which makes the scoring model significantly less efficient. The main contribution is a mathematical approach that accounts for time-varying parameters and thus helps improve the quality of the resulting scoring model and, consequently, contributes to the stability of the world financial system.

This paper has the following structure. Section 2 presents a quick literature review on existing scoring techniques. Section 3 presents a mathematical approach to adjusting parameters’ estimates, taking into account the volatile economic environment. In Section 4 an empirical reasoning for the proposed approach is presented. Section 5 is devoted to simulation testing of the proposed method. Section 6 sums up the key points of the paper.

2. Literature Review

Credit scoring is an essential part of a bank’s operational activity and guarantees the stability of a country’s financial system. There are numerous statistical methods used in credit scoring such as linear discriminant analysis (LDA) [1], decision trees [2], Markov chain analysis [3,4], profit analysis [5] and logistic regression [6].

Popular machine learning methods include artificial neural networks [7,8,9], genetic algorithms [10] and artificial immune systems [11].

Traditionally, LDA and logistic regression are the most widespread scoring techniques, although they have lately been actively challenged by machine learning methods and model ensembles. The earliest application of LDA to credit risk analysis, fraud detection and anti-money laundering systems dates back to 1941 [12]. The use of logistic regression models and of nonparametric models, such as classification trees, neural networks and k-nearest neighbours, has been thoroughly investigated [13,14,15]. At the very beginning of the 21st century, it became very popular to apply machine learning to credit risk analysis. Hurley and Adebayo consider credit risk analysis in the era of big data [16]. The most prominent papers on decision trees include [17,18,19]; those on neural networks include [20,21,22].

Ensemble methods have become extremely popular in credit risk analysis. The core of this approach is the integration of a set of individual models of different nature to capture the best features of each learner. Already at the end of the 20th century, it was shown that ensemble methods are more efficient than individual learners [23]. Huang et al. demonstrated a significant difference between statistical methods and machine learning approaches and, along with Zhu et al. and Opitz and Maclin, stated that ensemble methods are preferable to individual models in terms of prediction accuracy [24,25,26].

However, in scoring literature the topic of time-varying parameters is not highlighted, though it is very popular in time-series forecasting, see for example Bitto and Frühwirth-Schnatter, Chan and Eisenstat, and Kalli and Griffin [27,28,29]. Since most economic time series are subject to the volatility of parameters, such models usually provide better accuracy. In credit scoring, this approach, as will be shown further in the paper, can also increase the efficiency of previously constructed models [30,31,32].

3. Materials and Methods

Let $\{y_{t}, X_{t} : t = 1, \dots, T\}$ be the real-valued sample under consideration, where $y_{t} = {(y_{1 t}, y_{2 t}, \dots, y_{n t})}^{T}$ is a binary target variable, which equals one if a loan defaulted and zero otherwise, and $X_{t} = (1, x_{1 t}, x_{2 t}, \dots, x_{m t})$ is a countable longitudinal dataset of possible explanatory variables, where $x_{i t} = {(x_{1 i t}, x_{2 i t}, \dots, x_{n i t})}^{T}$. Suppose the probability of default can be modelled by the logistic function as follows:

$$p (1 | X_{t}) = \frac{1}{1 + e^{- X_{t} β_{t}}} \tag{1}$$

where $β_{t}$ is the vector of time-varying parameters.
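Equation (1) can be evaluated for a single borrower with a few lines of Python. This is only a minimal sketch: the function name `default_probability`, the input layout (factor values passed without the leading 1) and the numeric values are illustrative assumptions, not taken from the paper.

```python
import math

def default_probability(x_row, beta):
    """Equation (1) for one borrower.

    x_row: factor values WITHOUT the leading 1 (hypothetical input layout).
    beta:  [beta_0, beta_1, ..., beta_m], intercept first.
    """
    linear = beta[0] + sum(b * x for b, x in zip(beta[1:], x_row))
    return 1.0 / (1.0 + math.exp(-linear))

# Illustrative numbers only (not taken from the paper).
beta_t = [-2.0, -0.5, -0.4]
pd_value = default_probability([1.0, 0.5], beta_t)
print(f"probability of default: {pd_value:.4f}")
```

Because the parameter vector carries a time index, the same borrower can receive a different PD in each period simply by passing that period's coefficients.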

We can model the dynamics of each $β_{i t}$ either by a seasonal trend model or by some ARIMA class model, the classical representation of which is given below:

$${\hat{β}}_{i t} = α_{0} + α_{1} {\hat{β}}_{i (t - 1)} + \dots + α_{p} {\hat{β}}_{i (t - p)} + γ_{1} {\hat{ε}}_{i (t - 1)} + \dots + γ_{q} {\hat{ε}}_{i (t - q)} + {\hat{ε}}_{i t} \tag{2}$$

where the hat sign corresponds to estimators obtained from the analysed sample.
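To make the forecasting step concrete, the sketch below fits the simplest special case of Equation (2), an AR(1) model (p = 1, q = 0), by ordinary least squares and produces a one-step-ahead coefficient forecast. The reduction to AR(1), the least-squares fit and the helper names `fit_ar1` and `forecast_next` are assumptions made for illustration; the paper itself selects ARIMA models with `auto.arima` in R. The input series is the "Closed Credit Sum" coefficient column of Table 1.

```python
def fit_ar1(series):
    """Least-squares fit of beta_t = a0 + a1 * beta_{t-1} + eps_t."""
    x = series[:-1]          # lagged values
    y = series[1:]           # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1

def forecast_next(series, a0, a1):
    """One-step-ahead forecast from the last observed coefficient."""
    return a0 + a1 * series[-1]

# "Closed Credit Sum" coefficients for the ten half-year subsamples of Table 1.
closed_credit_sum = [-0.767, -0.449, -0.200, -0.569, -0.522,
                     -0.729, -0.687, -0.427, -0.306, -0.254]
a0_hat, a1_hat = fit_ar1(closed_credit_sum)
next_coef = forecast_next(closed_credit_sum, a0_hat, a1_hat)
print(f"a0={a0_hat:.3f}, a1={a1_hat:.3f}, forecast for next period={next_coef:.3f}")
```

With only ten observations such a fit is very noisy, so the output should be read as a demonstration of the mechanics rather than as a usable forecast.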

The claim is that the model becomes more efficient if the parameters forecast by Equation (2) are used instead of the most recently fitted ones:

$$p (1 | X_{t}) = \frac{1}{1 + e^{- X_{t} {\hat{β}}_{t}}} \tag{3}$$

Moreover, at each time period two independent estimates of the true parameter vector $β_{t}$ are obtained: one from the forecasting Equation (2) and the other from fitting the model of Equation (1). These two estimates can therefore be combined by a Kalman filter to obtain a more precise parameter vector. The probability density function of the combined parameters can be computed as shown below.

$$f (β_{t} | {\hat{β}}_{t - 1}, \dots, {\hat{β}}_{t - p}, X_{t}) = \frac{f_{1} (β_{t} | {\hat{β}}_{t - 1}, \dots, {\hat{β}}_{t - p}) f_{2} (β_{t} | X_{t})}{\int_{ℝ^{m + 1}} f_{1} (β_{t} | {\hat{β}}_{t - 1}, \dots, {\hat{β}}_{t - p}) f_{2} (β_{t} | X_{t}) d β_{t}} \tag{4}$$

where:

$$f_{1} (β_{t} | {\hat{β}}_{t - 1}, \dots, {\hat{β}}_{t - p}) = {(2 π)}^{- \frac{m + 1}{2}} \det {(Σ_{t})}^{- \frac{1}{2}} e^{- \frac{1}{2} {(β_{t} - {\hat{β}}_{t})}^{T} Σ_{t}^{- 1} (β_{t} - {\hat{β}}_{t})} \tag{5}$$

$$f_{2} (β_{t} | X_{t}) = {(2 π)}^{- \frac{m + 1}{2}} \det {(Ω_{t})}^{- \frac{1}{2}} e^{- \frac{1}{2} {(β_{t} - {\tilde{β}}_{t})}^{T} Ω_{t}^{- 1} (β_{t} - {\tilde{β}}_{t})} \tag{6}$$

Here ${\hat{β}}_{t}$ is the forecast from Equation (2) and ${\tilde{β}}_{t}$ is the estimate fitted to the sample of period $t$.

The covariance matrix $Ω_{t}$ of the parameter vector in Equation (6) is usually estimated by the inverse of the Fisher information matrix:

$$Ω_{t} = {(X_{t}^{T} \tilde{W} X_{t})}^{- 1} \tag{7}$$

where:

$$\tilde{W} = \operatorname{diag} \left( \frac{e^{\sum_{j = 0}^{m} {\tilde{β}}_{j t} x_{1 j t}}}{{(1 + e^{\sum_{j = 0}^{m} {\tilde{β}}_{j t} x_{1 j t}})}^{2}}, \dots, \frac{e^{\sum_{j = 0}^{m} {\tilde{β}}_{j t} x_{n j t}}}{{(1 + e^{\sum_{j = 0}^{m} {\tilde{β}}_{j t} x_{n j t}})}^{2}} \right) \tag{8}$$
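The sketch below illustrates how the weights of Equation (8) enter the Fisher information $X_{t}^{T} \tilde{W} X_{t}$, and how the parameter covariance follows as its inverse under the standard asymptotic theory for maximum likelihood estimates. It is restricted to an intercept plus one factor so that a hand-written 2×2 inverse suffices; the helper names and the toy data are assumptions made for illustration.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def fisher_information(rows, beta):
    """X^T W X for a logit model; each row already includes the leading 1."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for row in rows:
        p = logistic(sum(b * x for b, x in zip(beta, row)))
        w = p * (1.0 - p)  # equals e^eta / (1 + e^eta)^2, the diagonal entry of W
        for a in range(k):
            for b in range(k):
                info[a][b] += w * row[a] * row[b]
    return info

def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Toy design matrix (hypothetical): intercept column plus one factor.
rows = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta_fitted = [0.0, 0.0]          # at beta = 0 every weight is 0.25
info = fisher_information(rows, beta_fitted)
omega = invert_2x2(info)          # estimated covariance of the fitted coefficients
print(info, omega)
```

For more than two parameters a general linear-algebra routine would replace `invert_2x2`; the accumulation of the information matrix stays the same.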

The covariance matrix $Σ_{t}$ of the parameter vector in Equation (5) can be obtained by a slightly modified DCC-GARCH model.

$$Σ_{t} = D_{t} R_{t} D_{t} \tag{9}$$

$$D_{t} = \begin{bmatrix} \sqrt{h_{1 t}} & 0 & \dots & 0 \\ 0 & \sqrt{h_{2 t}} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & \sqrt{h_{m t}} \end{bmatrix} \tag{10}$$

where each $h_{i t}$ is the variance of the numerically generated probability density function $\mathrm{pdf}_{i t} (β_{i t})$, which is the average of $N$ probability density functions for the forecasted value of $β_{i t}$ obtained from model (2). The reason for numerical generation is that we never have the true values of $β_{i t}$, only their estimates with some uncertainty. That is why we generate realisations of $β_{t - 1}, \dots, β_{t - p}$ from their pdfs, obtained at previous steps, in order to get $\mathrm{pdf}_{i t} (β_{i t})$ as follows:

$$\mathrm{pdf}_{i t} (β_{i t}) = \frac{1}{N} \sum_{j = 1}^{N} \mathrm{pdf}_{i j t} (β_{i j t}) \tag{11}$$
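One way to read Equation (11) is as an equal-weight mixture of $N$ forecast densities, whose variance equals the average component variance plus the spread of the component means. The sketch below computes $h_{i t}$ this way. The AR(1) forecast rule, the constant residual variance, the normal draws for the lagged coefficient and all numeric values are illustrative assumptions; the paper does not spell out these details.

```python
import random

def mixture_moments(means, variances):
    """Mean and variance of an equal-weight mixture of component densities."""
    n = len(means)
    mean = sum(means) / n
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / n
    return mean, second_moment - mean ** 2

def forecast_variance(prev_mean, prev_var, a0, a1, resid_var,
                      n_draws=20000, seed=0):
    """Draw the lagged coefficient from its pdf, forecast with an AR(1) rule
    (an assumption), and average the resulting N forecast densities."""
    rng = random.Random(seed)
    prev_sd = prev_var ** 0.5
    means = [a0 + a1 * rng.gauss(prev_mean, prev_sd) for _ in range(n_draws)]
    variances = [resid_var] * n_draws
    return mixture_moments(means, variances)

mean_it, h_it = forecast_variance(prev_mean=-0.5, prev_var=0.04,
                                  a0=0.1, a1=0.8, resid_var=0.01)
print(f"forecast mean {mean_it:.3f}, variance h_it {h_it:.4f}")
```

In this linear Gaussian setting the simulated variance should be close to `resid_var + a1**2 * prev_var`, which gives a quick sanity check on the simulation.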

Since $R_{t}$ is the conditional correlation matrix for $β_{t}$, it has the following form:

$$R_{t} = \begin{bmatrix} 1 & ρ_{12 t} & \dots & ρ_{1 m t} \\ ρ_{21 t} & 1 & \ddots & ρ_{2 m t} \\ \vdots & \ddots & \ddots & \vdots \\ ρ_{m 1 t} & ρ_{m 2 t} & \dots & 1 \end{bmatrix} \tag{12}$$

Then, from Equations (10) and (12), each element of $Σ_{t}$ can be written as:

$${[Σ_{t}]}_{i j} = \sqrt{h_{i t} h_{j t}} ρ_{i j t} \tag{13}$$
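Equation (13) translates directly into code: each covariance entry is the product of two standard deviations and a correlation. The function name and the numbers below are hypothetical and serve only as a sketch.

```python
import math

def covariance_from_dcc(h, R):
    """Element-wise form of Sigma = D R D with D = diag(sqrt(h))."""
    m = len(h)
    return [[math.sqrt(h[i] * h[j]) * R[i][j] for j in range(m)]
            for i in range(m)]

# Illustrative variances and correlation (not taken from the paper).
h_t = [0.04, 0.09]
R_t = [[1.0, 0.5],
       [0.5, 1.0]]
sigma_t = covariance_from_dcc(h_t, R_t)
print(sigma_t)
```

The diagonal of the result reproduces the variances, because every diagonal correlation equals one.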

The evolution of the correlation matrix $R_{t}$ can be modelled as follows:

$$R_{t} = Q_{t}^{* - 1} Q_{t} Q_{t}^{* - 1} \tag{14}$$

$$Q_{t} = (1 - a) \bar{Ω} + a Ω_{t - 1} \tag{15}$$

where:

$$\bar{Ω} = \frac{1}{T} \sum_{t = 1}^{T} Ω_{t} \tag{16}$$

$$Q_{t}^{*} = \begin{bmatrix} \sqrt{q_{11 t}} & 0 & \dots & 0 \\ 0 & \sqrt{q_{22 t}} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & \sqrt{q_{m m t}} \end{bmatrix} \tag{17}$$

Here $R_{t}$ is decomposed into $Q_{t}^{* - 1}$ and $Q_{t}$ to ensure that the absolute values of all its entries are less than or equal to one. The parameters of the DCC-GARCH model are estimated by maximum likelihood.
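The update of Equation (15) and the rescaling of Equation (14) can be sketched as follows. Dividing each entry $q_{i j t}$ by $\sqrt{q_{i i t} q_{j j t}}$ is what multiplying by $Q_{t}^{* - 1}$ on both sides amounts to, and for a positive semi-definite $Q_{t}$ it bounds every entry by one in absolute value. The function names, the smoothing weight and the matrices are assumptions chosen for illustration.

```python
import math

def dcc_update(q_bar, omega_prev, a):
    """Equation (15): convex combination of the long-run and lagged matrices."""
    m = len(q_bar)
    return [[(1.0 - a) * q_bar[i][j] + a * omega_prev[i][j]
             for j in range(m)] for i in range(m)]

def to_correlation(Q):
    """Equation (14): R = Q*^{-1} Q Q*^{-1}, written element by element."""
    m = len(Q)
    return [[Q[i][j] / math.sqrt(Q[i][i] * Q[j][j]) for j in range(m)]
            for i in range(m)]

# Illustrative inputs (not taken from the paper).
q_bar = [[4.0, 2.0], [2.0, 9.0]]
omega_prev = [[4.0, 0.0], [0.0, 9.0]]
Q_t = dcc_update(q_bar, omega_prev, a=0.5)
R_t = to_correlation(Q_t)
print(Q_t, R_t)
```

In practice the weight `a` would come from the maximum likelihood fit mentioned above rather than being fixed by hand.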

Having obtained the pdf for $β_{t}$, the corrected estimates are computed by maximising the density of Equation (4):

$${\hat{β}}_{t} = \underset{β_{t}}{\operatorname{argmax}} f (β_{t} | {\hat{β}}_{t - 1}, \dots, {\hat{β}}_{t - p}, X_{t}) \tag{18}$$
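When $f_{1}$ and $f_{2}$ are the Gaussian densities of Equations (5) and (6), the maximiser in Equation (18) has a standard closed form, because the product of two Gaussian densities is proportional to a Gaussian density. The symbols $V_{t}$ and $β_{t}^{*}$ below are introduced here for illustration only and do not appear in the paper.

```latex
V_{t} = \left( Σ_{t}^{-1} + Ω_{t}^{-1} \right)^{-1}, \qquad
β_{t}^{*} = V_{t} \left( Σ_{t}^{-1} \hat{β}_{t} + Ω_{t}^{-1} \tilde{β}_{t} \right)
```

Here $β_{t}^{*}$ is both the mean and the mode of the combined density, so it coincides with the argmax of Equation (18), and $V_{t}$ is its covariance. With diagonal $Σ_{t}$ and $Ω_{t}$ these expressions reduce, component by component, to Equations (22) and (23).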

These corrected estimates and pdfs are then used in the forecasting Equation (2) to produce predictions for the next values of the true parameters $β_{t + 1}$. It is worth noting that if each $β_{i t}$ follows, or can be approximated by, the normal distribution and $Σ_{t}$, $Ω_{t}$ are diagonal, the corrected probability density is also normal and Equation (4) can be written explicitly. To show this, we introduce the following notation to simplify further calculations: $E (β_{i t} | {\hat{β}}_{i, t - 1}, \dots, {\hat{β}}_{i, t - p}) = μ_{1}$, $E (β_{i t} | X_{t}) = μ_{2}$, $\operatorname{var} (β_{i t} | {\hat{β}}_{i, t - 1}, \dots, {\hat{β}}_{i, t - p}) = σ_{1}^{2}$, $\operatorname{var} (β_{i t} | X_{t}) = σ_{2}^{2}$. Then the pdf for $β_{i t}$ from Equation (4) can be rewritten as below:

$$\mathrm{pdf} (β_{i t}) = \frac{\frac{1}{2 π σ_{1} σ_{2}} e^{- \frac{{(β_{i t} - μ_{1})}^{2}}{2 σ_{1}^{2}} - \frac{{(β_{i t} - μ_{2})}^{2}}{2 σ_{2}^{2}}}}{\int_{ℝ} \frac{1}{2 π σ_{1} σ_{2}} e^{- \frac{{(β_{i t} - μ_{1})}^{2}}{2 σ_{1}^{2}} - \frac{{(β_{i t} - μ_{2})}^{2}}{2 σ_{2}^{2}}} d β_{i t}} \tag{19}$$

To show that this pdf is normal, first consider only the numerator of this fraction. Expanding the squares and completing the square in $β_{i t}$ yields:

$$\begin{aligned} \frac{1}{2 π σ_{1} σ_{2}} e^{- \frac{{(β_{i t} - μ_{1})}^{2}}{2 σ_{1}^{2}} - \frac{{(β_{i t} - μ_{2})}^{2}}{2 σ_{2}^{2}}} &= \frac{1}{2 π σ_{1} σ_{2}} e^{- \frac{β_{i t}^{2} (σ_{1}^{2} + σ_{2}^{2}) - 2 β_{i t} (μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}) + μ_{1}^{2} σ_{2}^{2} + μ_{2}^{2} σ_{1}^{2}}{2 σ_{1}^{2} σ_{2}^{2}}} \\ &= \frac{1}{2 π σ_{1} σ_{2}} e^{- \frac{{(β_{i t} - \frac{μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}})}^{2} + \frac{μ_{1}^{2} σ_{2}^{2} + μ_{2}^{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}} - {(\frac{μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}})}^{2}}{2 \frac{σ_{1}^{2} σ_{2}^{2}}{σ_{1}^{2} + σ_{2}^{2}}}} \end{aligned} \tag{20}$$

After integrating the denominator of the fraction in Equation (19) and cancelling the factor that contains the term $\frac{μ_{1}^{2} σ_{2}^{2} + μ_{2}^{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}} - {(\frac{μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}})}^{2}$, which does not depend on $β_{i t}$, the following expression for the probability density function is obtained:

$$\mathrm{pdf} (β_{i t}) = \frac{\sqrt{σ_{1}^{2} + σ_{2}^{2}}}{\sqrt{2 π} σ_{1} σ_{2}} e^{- \frac{{(β_{i t} - \frac{μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}})}^{2}}{2 \frac{σ_{1}^{2} σ_{2}^{2}}{σ_{1}^{2} + σ_{2}^{2}}}} \tag{21}$$

This proves that the corrected pdf for $β_{i t}$ is also normal, with the mean shown in Equation (22) and the variance shown in Equation (23).

$$μ = \frac{μ_{1} σ_{2}^{2} + μ_{2} σ_{1}^{2}}{σ_{1}^{2} + σ_{2}^{2}} \tag{22}$$

$$σ^{2} = \frac{σ_{1}^{2} σ_{2}^{2}}{σ_{1}^{2} + σ_{2}^{2}} \tag{23}$$

From Equation (23) it follows directly that $σ^{2} < \min (σ_{1}^{2}, σ_{2}^{2})$, which means that applying this method yields a more precise estimate of $β_{i t}$. Thus, Equation (4) can be replaced by a direct and simpler-to-compute form, provided that the above-mentioned assumptions hold [33,34,35].
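Equations (22) and (23) amount to a precision-weighted average of the forecasted and the fitted coefficient. A minimal sketch with hypothetical inputs follows; the function name and the numbers are not from the paper.

```python
def combine_estimates(mu1, var1, mu2, var2):
    """Equations (22) and (23): combine two independent normal estimates."""
    mu = (mu1 * var2 + mu2 * var1) / (var1 + var2)
    var = var1 * var2 / (var1 + var2)
    return mu, var

# Forecast: mean -0.5, variance 0.04; fitted estimate: mean -0.3, variance 0.01.
mu_c, var_c = combine_estimates(-0.5, 0.04, -0.3, 0.01)
print(f"combined mean {mu_c:.3f}, combined variance {var_c:.4f}")
```

The combined mean is pulled towards the more precise of the two estimates (here the fitted one), and the combined variance is smaller than either input variance, as the inequality above requires.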

As will be shown in Section 5, forecasting parameters in this way provides a better financial result for a financial institution than the traditional way of fitting a model on the dataset available for period t-1 and simply applying the obtained parameters in period t.

4. Empirical Reasoning for the Method

To substantiate the above statement about time-dependent parameters in scoring models operating in an economic environment, we provide the results of an empirical investigation for a microfinance organisation.

The study uses data from 2015 to 2019 from Thomson Reuters. For the experiment only two factors were used, “Closed credit sum” and “Number of payments to loan term ratio”, to construct ten models over 5 consecutive years, each fitted on a dataset covering only 6 months. All in-sample results (2015–2019) are shown in Table 1.

As can be seen from Table 1, the subsample size grows because the considered organisation is expanding. The obtained coefficients are mostly significant, except for a couple of subsamples where the number of observations was relatively small.

Figure 1 and Figure 2 show the dynamics of the coefficients for the two above-mentioned factors and their polynomial approximations. The cyclicity of the tested parameters is easy to track, which indicates that the true parameters are time-dependent.

Such a simple empirical experiment shows that the methods developed in the paper can substantially boost the performance of logit models when developing a credit risk analysis system in a real-life economy.

5. Simulation Experiment

In our simulation experiment we consider a three-factor logit model:

$$p (1 | X_{t}) = \frac{1}{1 + e^{- β_{t 0} - β_{t 1} x_{t 1} - β_{t 2} x_{t 2} - β_{t 3} x_{t 3}}} \tag{24}$$

For each parameter, an autoregressive cyclical process with a different periodicity is generated. We then generate samples for each modelling period according to Equation (24). To test the proposed methods, we use generic maximum likelihood estimation (MLE) to fit parameter values to each generated sample and the auto.arima function in the R environment to build ARIMA class models for parameter prediction. Figure 3, Figure 4, Figure 5 and Figure 6 show the parameters estimated by MLE (circles) and their ARIMA predictions (solid line).
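The data-generating step can be sketched as below. The paper does not specify the exact form of its autoregressive cyclical processes, so this sketch assumes an AR(2) process with complex roots, whose deviations oscillate with a chosen period. The periods, levels, noise scale, standard normal factors and sample size are all hypothetical.

```python
import math
import random

def cyclical_ar2(n_periods, period, level, damping=1.0, noise_sd=0.0, rng=None):
    """Path level + d_t with d_t = phi1*d_{t-1} + phi2*d_{t-2} + eps_t.
    Without noise, d_t = damping**t * cos(2*pi*t/period)."""
    rng = rng or random.Random(0)
    omega = 2.0 * math.pi / period
    phi1 = 2.0 * damping * math.cos(omega)
    phi2 = -damping ** 2
    dev = [1.0, damping * math.cos(omega)]     # start on the cycle
    while len(dev) < n_periods:
        eps = rng.gauss(0.0, noise_sd)
        dev.append(phi1 * dev[-1] + phi2 * dev[-2] + eps)
    return [level + d for d in dev[:n_periods]]

def simulate_period(betas, n_borrowers, rng):
    """Draw (factors, default flag) pairs from the logit model of Equation (24)."""
    sample = []
    for _ in range(n_borrowers):
        x = [rng.gauss(0.0, 1.0) for _ in range(len(betas) - 1)]
        eta = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
        p = 1.0 / (1.0 + math.exp(-eta))
        sample.append((x, 1 if rng.random() < p else 0))
    return sample

rng = random.Random(42)
# One path per coefficient: (period, level) pairs are illustrative assumptions.
paths = [cyclical_ar2(30, period, level, noise_sd=0.05, rng=rng)
         for period, level in [(6, -3.0), (8, 0.5), (10, -0.5), (12, 0.8)]]
first_period = simulate_period([path[0] for path in paths], 1000, rng)
default_share = sum(y for _, y in first_period) / len(first_period)
print(f"share of defaults in the first simulated period: {default_share:.3f}")
```

Fitting a logit model to each simulated period and forecasting the fitted coefficients would then reproduce the structure of the experiment, though not its exact numbers.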

To demonstrate the efficiency of the proposed method, we compare it with several other popular classification techniques. Comparisons are based on out-of-sample performance, by analogy with [36], of (1) a generic logit model, where coefficients obtained at period t-1 are applied to borrowers arriving at period t, (2) the random forest algorithm from the R package “randomForest” (default settings), (3) the gradient boosting algorithm from the R package “gbm” (default settings), (4) a forecasted coefficients model, where parameter values are forecasted for period t and applied to borrowers arriving at period t, and (5) coefficients forecasted by a combination of the state–space approach and the DCC-GARCH model. The interest rate is set at 10%, which is also the cut-off rate for approving a loan. The methods are tested over 30 periods, each including 20,000 potential borrowers, each of whom borrows 1 money unit. The received profit is then calculated as follows:

$$\mathrm{profit} = (nl - nd) \times r - nd \tag{25}$$

where $nl$ is the number of issued loans, $nd$ is the number of defaulted loans and $r$ is the interest rate. The summary of this investigation is presented in Table 2 and Figure 7.
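Equation (25) is simple enough to check by hand against Table 2. In the sketch below the function and argument names are illustrative; the loan counts are taken from the "Generic logit" and "ARIMA" rows of Table 2, with the 10% interest rate stated above.

```python
def profit(n_issued, n_defaulted, rate):
    """Equation (25): interest earned on repaid loans minus the lost principal
    of defaulted loans (each loan is 1 money unit)."""
    return (n_issued - n_defaulted) * rate - n_defaulted

generic_logit = profit(301_291, 9_741, 0.10)   # about 19,414
arima = profit(300_904, 5_048, 0.10)           # about 24,537.6
print(round(generic_logit, 1), round(arima, 1))
```

Each default costs a full money unit while each repaid loan earns only 0.1, so one default wipes out the interest from ten good loans. This is why the number of defaulted loans dominates the profit comparison.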

As can be seen from Table 2, the number of issued loans is almost the same for all considered methods (the proposed methods issue slightly fewer loans than the logit model and the two machine learning methods). However, the difference in bad loans is dramatic: around nine and a half thousand for generic logit, random forest and gradient boosting, against five thousand for coefficients forecasted by ARIMA and only about one and a half thousand for the state–space DCC-GARCH model.

Consequently, the percentage of defaulted loans for the proposed methods is significantly lower. As a result, profit exceeds that of the generic logit model by 26% for coefficients forecasted by ARIMA and by 47% for the state–space DCC-GARCH model.
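The percentages quoted here follow from the figures reported in Table 2. A short check, with hypothetical helper names and the profit and loan counts copied from the table:

```python
def default_rate(n_defaulted, n_issued):
    """Share of issued loans that defaulted, in percent."""
    return 100.0 * n_defaulted / n_issued

def relative_gain(method_profit, baseline_profit):
    """Profit improvement over the baseline, in percent."""
    return 100.0 * (method_profit / baseline_profit - 1.0)

# (profit received, issued loans, defaulted loans) as reported in Table 2.
table2 = {
    "Generic logit": (19414.0, 301291, 9741),
    "ARIMA": (24537.6, 300904, 5048),
    "State space DCC-GARCH": (28531.8, 300831, 1405),
}
baseline = table2["Generic logit"][0]
for name, (method_profit, issued, defaulted) in table2.items():
    print(f"{name}: default rate {default_rate(defaulted, issued):.2f}%, "
          f"gain over generic logit {relative_gain(method_profit, baseline):.0f}%")
```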

Figure 7 clearly illustrates the superiority of the proposed methods over generic logit for almost all 30 considered time periods. Profit received by the trivial method has a distinct seasonality with sharp drops, whereas profit received by the proposed methods remains stable over time; the proposed methods can thus be considered a dominant strategy for credit risk management systems.

Figure 8 shows the efficiency of the machine learning methods over time against the proposed state–space DCC-GARCH model. The machine learning approaches display a pattern similar to that of the generic logit model, with slightly better efficiency. Gradient boosting outperforms random forest by a small margin. The proposed method significantly outperforms the tested machine learning methods in almost all considered periods and thus can be considered a preferable alternative.

Summarising the results of the experiments, in an economic environment where the assumption of constant true parameters is violated, it is more rational to apply forecasted coefficient estimates than estimates computed from the accumulated dataset, which reflects only past interconnections within the data.

6. Conclusions

In this paper, a new method of dealing with time-varying parameters of scoring models is proposed, aimed at computing the default probability of a borrower. It was shown empirically that in a continuously changing economic environment the influence of factors on a target variable also changes. Therefore, forecasting coefficients yields a better financial result than simply applying parameters obtained from statistics accumulated over past time periods. The paper fills a gap in methods that adjust the obtained coefficients by means of a Kalman filter, since two independent estimates are available: the first from forecasting the parameter vector, the second from fitting parameters to accumulated statistics. An extensive simulation experiment showed that the proposed method is significantly better than the traditional approach and thus can be considered a preferable alternative when dealing with credit risk assessment models in a volatile economic environment.

Author Contributions

Conceptualisation, A.S.; methodology, N.Z. and L.K.; validation, M.S.S.D.; writing (original draft preparation), N.M. and A.M.; writing (review and editing), A.M. All authors have read and agreed to the published version of the manuscript.

Funding

N.M. and A.S. performed the study in the framework of the state task in the field of scientific activity of the Ministry of Science and Higher Education of the Russian Federation, project “Development of the methodology and a software platform for the construction of digital twins, intellectual analysis and forecast of complex economic systems”, grant no. FSSW-2020-0008.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of Financial University under the Government of Russian Federation.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Bansal, G.; Sinha, A.P.; Zhao, H. Tuning data mining methods for cost-sensitive regression: A study in loan charge-off forecasting. J. Manag. Inf. Syst. 2008, 25, 315–336.
2. Zhang, H.; Legro, R.S.; Zhang, J.; Zhang, L.; Chen, X.; Huang, H.; Casson, P.R.; Schlaff, W.D.; Diamond, M.P.; Krawetz, S.A.; et al. Decision trees for identifying predictors of treatment effectiveness in clinical trials and its application to ovulation in a study of women with polycystic ovary syndrome. Hum. Reprod. 2010, 25, 2612–2621.
3. Smith, L.D.; Lawrence, E.C. Forecasting losses on a liquidating long-term loan portfolio. J. Bank. Financ. 1995, 19, 959–985.
4. Greenidge, K.; Grosvenor, T. Forecasting non-performing loans in Barbados. J. Bus. Financ. Econ. Emerg. Econ. 2010, 5, 80–108.
5. Abdou, H.A.; Pointon, J. Credit scoring, statistical techniques and evaluation criteria: A review of the literature. Intell. Syst. Acc. Financ. Manag. 2011, 18, 59–88.
6. Darroch, J.N.; Ratcliff, D. Generalized iterative scaling for log-linear models. Ann. Math. Stat. 1972, 43, 1470–1480.
7. Eddy, Y.L.; Engku Abu Bakar, E.M.N. Credit scoring models: Techniques and issues. J. Adv. Res. Bus. Manag. Stud. 2017, 7, 29–41.
8. Eletter, S.F.; Yaseen, S.G.; Elrefae, G.A. Neuro-based artificial intelligence model for loan decisions. Am. J. Econ. Bus. Adm. 2010, 2, 27.
9. Khashman, A. Neural networks for credit risk evaluation: Investigation of different neural models and learning schemes. Expert Syst. Appl. 2010, 37, 6233–6239.
10. Finlay, S. Are we modelling the right thing? The impact of incorrect problem specification in credit scoring. Expert Syst. Appl. 2009, 36, 9065–9071.
11. Kamalloo, E.; Saniee Abadeh, M. Credit risk prediction using fuzzy immune learning. Adv. Fuzzy Syst. 2014, 2014, 7.
12. Durand, D. Risk Elements in Consumer Installment Financing; National Bureau of Economic Research: New York, NY, USA, 1941.
13. Makowski, P. Credit scoring branches out. Credit World 1985, 75, 30–37.
14. Angelini, E.; Di Tollo, G.; Roli, A. A neural network approach for credit risk evaluation. Q. Rev. Econ. Financ. 2008, 48, 733–755.
15. Henley, W.; Hand, D.J. A k-Nearest-Neighbour Classifier for Assessing Consumer Credit Risk. J. R. Stat. Soc. Ser. D Stat. 1996, 45, 77–95.
16. Hurley, M.; Adebayo, J. Credit scoring in the era of big data. Yale J. Law Tech. 2016, 18, 148.
17. Davis, R.H.; Edelman, D.B.; Gammerman, A.J. Machine-learning algorithms for credit-card applications. IMA J. Manag. Math. 1992, 4, 43–51.
18. Frydman, H.; Altman, E.I.; Kao, D.L. Introducing recursive partitioning for financial classification: The case of financial distress. J. Financ. 1985, 40, 269–291.
19. Zhou, S.R.; Zhang, D.Y. A nearly neutral model of biodiversity. Ecology 2008, 89, 248–258.
20. Jensen, H.L. Using neural networks for credit scoring. Manag. Financ. 1992, 18, 15–26.
21. West, D. Neural network credit scoring models. Comput. Oper. Res. 2000, 27, 1131–1152.
22. West, D.; Dellana, S.; Qian, J. Neural network ensemble strategies for financial decision applications. Comput. Oper. Res. 2005, 32, 2543–2559.
23. Dietterich, T.G. Machine-learning research. AI Mag. 1997, 18, 97.
24. Huang, Z.; Chen, H.; Hsu, C.J.; Chen, W.H.; Wu, S. Credit rating analysis with support vector machines and neural networks: A market comparative study. Decis. Support Syst. 2004, 37, 543–558.
25. Zhu, Y.; Xie, C.; Wang, G.J.; Yan, X.G. Comparison of individual, ensemble and integrated ensemble machine learning methods to predict China’s SME credit risk in supply chain finance. Neural Comput. Appl. 2017, 28, 41–50.
26. Opitz, D.; Maclin, R. Popular ensemble methods: An empirical study. J. Artif. Intell. Res. 1999, 11, 169–198.
27. Bitto, A.; Frühwirth-Schnatter, S. Achieving shrinkage in a time-varying parameter model framework. J. Econ. 2019, 210, 75–97.
28. Chan, J.C.; Eisenstat, E. Bayesian model comparison for time-varying parameter VARs with stochastic volatility. J. Appl. Econ. 2018, 33, 509–532.
29. Kalli, M.; Griffin, J.E. Time-varying sparsity in dynamic regression models. J. Econ. 2014, 178, 779–793.
30. Orlando, G.; Pelosi, R. Non-performing loans for Italian companies: When time matters: An empirical research on estimating probability to default and loss given default. Int. J. Financ. Stud. 2020, 8, 68.
31. Aslan, A.; Poppe, L.; Posch, P. Are sustainable companies more likely to default? Evidence from the dynamics between credit and ESG ratings. Sustainability 2021, 13, 8568.
32. Orlova, E.V. Methodology and models for individuals’ creditworthiness management using digital footprint data and machine learning methods. Mathematics 2021, 9, 1820.
33. An, J.; Mikhaylov, A.; Jung, S.-U. A Linear Programming Approach for Robust Network Revenue Management in the Airline Industry. J. Air Transp. Manag. 2021, 91, 101979.
34. Mikhaylov, A. Development of Friedrich von Hayek’s theory of private money and economic implications for digital currencies. Terra Econ. 2021, 19, 53–62.
35. An, J.; Mikhaylov, A.; Richter, U.H. Trade War Effects: Evidence from Sectors of Energy and Resources in Africa. Heliyon 2020, 6, e05693.
36. Zhu, Y.; Zhou, L.; Xie, C.; Wang, G.-J.; Nguyen, T.V. Forecasting SMEs’ credit risk in supply chain finance with an enhanced hybrid ensemble machine learning approach. Int. J. Prod. Econ. 2019, 211, 22–33.

Figure 1. Coefficient value dynamics for closed credit sum.

Figure 2. Coefficient value dynamics for number of payments to loan term.

Figure 3. Coefficient value dynamics for $β_{t 0}$.

Figure 4. Coefficient value dynamics for $β_{t 1}$.

Figure 5. Coefficient value dynamics for $β_{t 2}$.

Figure 6. Coefficient value dynamics for $β_{t 3}$.

Figure 7. Amount of received profit for tested methods by time period.

Figure 8. Amount of received profit for tested methods by time period.

Table 1. Coefficients’ dynamics for a two-factor logit model.

Subsample Number	Time Range: From	Time Range: To	Closed Credit Sum: Coef	Closed Credit Sum: p-Value	Number of Payments to Loan Term Ratio: Coef	Number of Payments to Loan Term Ratio: p-Value
1	12.01.2015	29.06.2015	−0.767	0.039	−0.147	0.718
2	01.07.2015	30.12.2015	−0.449	0.061	−0.432	0.126
3	11.01.2016	30.06.2016	−0.200	0.479	−0.375	0.225
4	01.07.2016	31.12.2016	−0.569	0.002	−0.505	0.011
5	01.01.2017	30.06.2017	−0.522	0.003	−0.403	0.038
6	01.07.2017	31.12.2017	−0.729	0.000	−0.521	0.001
7	01.01.2018	30.06.2018	−0.687	0.000	−0.462	0.000
8	01.07.2018	31.12.2018	−0.427	0.000	−0.386	0.000
9	01.01.2019	30.06.2019	−0.306	0.000	−0.304	0.000
10	01.07.2019	31.12.2019	−0.254	0.000	−0.188	0.000

Table 2. The numerical forecasting results of proposed models and other ML approaches.

Method	Profit Received	Number of Issued Loans	Number of Defaulted Loans	Percent of Defaulted Loans	Mean Squared Forecast Error	Mean Absolute Error
Generic logit	19,414	301,291	9741	3.23%	0.030	0.035
Random Forest	19,627	301,232	9542	3.17%	0.030	0.035
Gradient boosting	19,735	301,168	9438	3.13%	0.030	0.035
ARIMA	24,537.6	300,904	5048	1.68%	0.024	0.029
State space DCC-GARCH	28,531.8	300,831	1405	0.47%	0.016	0.021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
