Combination of linear discriminant analysis and expert opinion for the construction of credit rating models: The case of SMEs

Abstract The construction of an internal rating model is the main task for a bank under the IRB-foundation approach, because the probability of default must be determined for each rating class. Several statistical approaches can be used for this purpose, such as logistic regression and linear discriminant analysis, to express the relationship between default and the financial, managerial and organizational characteristics of the enterprise. In this paper, we propose a new approach that combines linear discriminant analysis and expert opinion by means of the Bayesian approach. We build a rating model based on linear discriminant analysis and use Bayesian logic to determine the posterior probability of default by rating class. The reliability of experts' estimates depends on the information collection process. We have therefore defined an information collection approach, based on the Delphi method, that reduces the imprecision of the estimates. The empirical study uses a portfolio of SMEs from a Moroccan bank. This permitted the construction of the statistical rating model and the associated Bayesian models, and the comparison of the capital requirements determined by these models.


PUBLIC INTEREST STATEMENT
Customer rating is an important tool for determining credit pricing. Indeed, the bank must construct a model capable of determining the real profile of the customer. This article proposes an approach to the conception of a rating model that integrates the quantitative and qualitative data of the company. It can be of great utility for professionals, students, credit risk researchers and academics.
It also meets the need for a portfolio manager by combining their opinion with statistical estimation.
The method we have proposed to combine statistical estimation and expert opinion can be used in other risk management areas such as operational and market risk.

Introduction
Internal credit risk rating models are based on the modelling of the three risk components, which are probability of default (PD), loss given default (LGD) and exposure at default (EAD). As a result, the bank must estimate the three components for each customer exposure.
To model the probability of default (PD), a multitude of techniques can be used, such as linear discriminant analysis (LDA), intelligence techniques (neural networks and genetic algorithms), Bayesian networks and probabilistic models.
These techniques are based on different logics and have been the subject of a multitude of research and studies conducted by academics and professionals, such as:

-Multidimensional linear discriminant analysis
The prediction of default by linear discriminant analysis was developed by Altman (1968), Taffler (1982), Bardos (1998), Bel et al. (1990) and Grice et al. (2001).

-Intelligence techniques
Several studies have applied these techniques to predict corporate default, such as those conducted by Bell, Ribar, and Verchio (1990), Liang and Wu (2003), Bose and Pal (2006), Back et al. (1996) and Oreski et al. (2012).

-Bayesian Network
The Bayesian classifier (Friedman, Geiger, & Goldszmidt, 1997) is based on the calculation of a posterior probability. The opportunities of using probabilistic Bayesian networks in fundamental financial analysis are studied by Gemela (2001). Das, Fan, and Geng (2002) studied the changes in PDs related to changes in ratings, using a modified Bayesian model to calibrate the historical time series of probability of default changes to historical rating transition matrices. Dwyer (2006) used the Bayesian approach to propose techniques that facilitate probability of default assessment in the absence of sufficient historical default data. Gössl (2005) introduced a new Bayesian approach to the credit portfolio and deduced, within a Bayesian framework, the posterior law of the probabilities of default and of the correlation. Tasche (2013) described how to implement the uninformed and conservative Bayesian estimators in the dependent one- and multi-period default data cases and compared their estimates with the upper confidence bound estimates.
Our study differs from previous research in that we treated the conception of the rating models of the credit portfolio based first on multidimensional linear discriminant analysis (LDA), which permitted us to determine the probability of default by class. Then, we determined a mathematical passage that allows us to combine the probability of default from the statistical model with that estimated by the experts, using Bayesian logic. Finally, we developed an information gathering approach based on the Delphi method to ensure the reliability of the estimates. As a result, our approach tends to be more practical than theoretical and may be of interest to professionals in the field of credit risk management. In summary, in this article we propose a practical approach with a solid theoretical basis to combine the probability of default emanating from linear discriminant analysis and that emanating from the experts, using Bayesian logic and the Delphi method.
The rest of this paper is organized as follows. Section 2 is devoted to credit risk measurement. We first define the credit risk situation. We then define the approaches to credit risk measurement and the expected and unexpected credit loss. The third section is devoted to the modelling of the probability of default and the construction of the statistical rating model and the associated Bayesian models. The fourth section is devoted to the empirical study.

Credit risk measurement
The credit risk situation is composed of the following elements:
• Probability of default (PD): the probability that a counterparty defaults within a one-year horizon.
• Loss given default (LGD): the share, expressed as a percentage, of the amount that a bank loses when a borrower defaults on a credit.
• Exposure at default (EAD): the total value to which a bank is exposed when a credit is in default.
• Maturity (M): the effective maturity of the credit.

The IRB-foundation approach
The Basel Committee on Banking Supervision (1999, 2006, 2016) provides for three risk measurement approaches: the standard approach, the internal rating-based foundation approach (IRB-F) and the internal rating-based advanced approach (IRB-A). In our study, we measure the risk according to the internal rating-based foundation approach (IRB-F). Under this approach, the bank must model the probability of default, while the estimates of loss given default, exposure at default and maturity are provided by the Basel accords.
Indeed, for the loss given default (LGD) we use the standard estimate provided under the IRB-F approach, which is equal to 45%, while for the exposure at default we proceed as follows. Let $V_{efi}$ be the amount of the financing authorization granted by the bank to the customer. The exposure at default (EAD) is defined as the sum of:
• The value accounted for in the balance sheet ($VCB_0$).
• The value of the unused funding commitment, accounted for off-balance sheet ($VCHB_0$), multiplied by a credit conversion factor (CCF). The standard estimate of the CCF under the IRB-F approach is equal to 75%.
The mathematical formulation of the EAD is given by the following relationship:
$$EAD = VCB_0 + CCF \times VCHB_0$$

The expected loss (EL)
The Basel Committee (2015) has established the provision for the expected loss. Indeed, the amount of the expected loss is equal to the product of the three components PD, LGD and EAD:
$$EL = PD \times LGD \times EAD$$
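As a numerical illustration of the EAD and EL relationships, the following minimal Python sketch computes both quantities for one exposure. It assumes that the unused commitment $VCHB_0$ equals the authorization $V_{efi}$ minus the drawn balance-sheet amount $VCB_0$, which is an assumption and is not stated explicitly in the text. The function names and the sample figures are hypothetical.

```python
def exposure_at_default(v_efi, vcb0, ccf=0.75):
    """EAD = VCB0 + CCF * VCHB0, with VCHB0 assumed to be V_efi - VCB0."""
    vchb0 = max(v_efi - vcb0, 0.0)  # assumption: unused part of the authorization
    return vcb0 + ccf * vchb0


def expected_loss(pd, ead, lgd=0.45):
    """EL = PD * LGD * EAD, with the IRB-F standard LGD of 45% by default."""
    return pd * lgd * ead


# Hypothetical exposure: authorization of 1,000,000 MAD, of which 600,000 is drawn.
ead = exposure_at_default(v_efi=1_000_000, vcb0=600_000)
el = expected_loss(pd=0.02, ead=ead)
print(ead, el)
```

With these illustrative figures, the undrawn 400,000 MAD enters the exposure at 75%, and the expected loss is the 2% default probability applied to 45% of the resulting exposure.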

The unexpected loss (UL)
The unexpected loss (UL) and the risk-weighted assets (RWA) are defined by the Basel Accords as follows:
$$UL = K \times EAD, \qquad RWA = 12.5 \times K \times EAD$$
The parameter $K$, called the « capital requirement », is the weighting function calculated from the PD, the LGD, the correlation ($R$) and the effective maturity ($M$).
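The calculation of $K$ and of the risk-weighted assets can be sketched in Python as follows. This is a minimal sketch that assumes the standard Basel II IRB risk-weight function for corporate exposures with the firm-size adjustment for SMEs (annual sales $S$ in millions of euros, bounded between 5 and 50). The function names are hypothetical, and `statistics.NormalDist` from the Python standard library supplies the normal distribution function and its inverse.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def sme_correlation(pd, sales_meur=50.0):
    """Asset correlation R with the SME firm-size adjustment (sales in EUR millions)."""
    s = min(max(sales_meur, 5.0), 50.0)  # sales are floored at 5 and capped at 50
    w = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    return 0.12 * w + 0.24 * (1.0 - w) - 0.04 * (1.0 - (s - 5.0) / 45.0)


def capital_requirement(pd, lgd=0.45, maturity=2.5, sales_meur=50.0):
    """Capital requirement K for a non-defaulted exposure (0 < pd < 1)."""
    r = sme_correlation(pd, sales_meur)
    b = (0.11852 - 0.05478 * log(pd)) ** 2  # maturity adjustment
    z = (_N.inv_cdf(pd) + sqrt(r) * _N.inv_cdf(0.999)) / sqrt(1.0 - r)
    return (lgd * _N.cdf(z) - pd * lgd) * (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)


def risk_weighted_assets(k, ead):
    """RWA = 12.5 * K * EAD."""
    return 12.5 * k * ead


k_no_adjustment = capital_requirement(pd=0.01)          # S = 50: no size adjustment
k_sme = capital_requirement(pd=0.01, sales_meur=5.0)    # S = 5: full size adjustment
print(round(k_no_adjustment, 4), round(k_sme, 4))
```

The size adjustment lowers the correlation by up to 4 percentage points, so for the same PD the smallest firms receive a lower capital requirement than firms without the adjustment.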
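In this paper, we will calculate the risk-weighted assets of a portfolio of SMEs. Indeed, the Basel Committee defines the parameter $K$ relating to this segment by:
• Capital requirement ($K$):
$$K = \left[ LGD \times N\!\left( \frac{G(PD) + \sqrt{R}\, G(0.999)}{\sqrt{1 - R}} \right) - PD \times LGD \right] \times \frac{1 + (M - 2.5)\, b}{1 - 1.5\, b}, \qquad b = \left( 0.11852 - 0.05478 \ln(PD) \right)^2$$
where $N$ is the standard normal cumulative distribution function and $G$ is its inverse.
• The correlation $R$ is determined by the following model (note 1):
$$R = 0.12\, \frac{1 - e^{-50 PD}}{1 - e^{-50}} + 0.24 \left( 1 - \frac{1 - e^{-50 PD}}{1 - e^{-50}} \right) - 0.04 \left( 1 - \frac{S - 5}{45} \right)$$
where $S$ is the annual sales of the firm in millions of euros, with $5 \le S \le 50$.

3. The modelling of probability of default
3.1. Constitution and treatment of the database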

The constitution of the database (definition of variables)
In our study, we were able to identify 16 quantitative and 19 qualitative variables. The choice of variables is based on current financial analysis practices and the likely impact on business failure.
We present below the selected variables, the explanation and meaning of which are detailed in Appendix 1 (Table A1, Table A2).

[Table: the quantitative variables ($V_j$, $1 \le j \le 16$) by class. Surviving entries: "Net fixed assets + WCR", where WCR is the working capital requirement, and "Total purchases".]

Discretization of qualitative variables and their transformation into a score
For the discretization of the variables, we use the approach proposed in Benbachir and Habachi (2018), which is as follows:
-Discretization of qualitative variables
The qualitative variables ($q_m$), $1 \le m \le 19$, are discretized into modalities. The number of modalities can be equal to 3 or 5. The choice of modalities is based on the logical relationship between the modalities and default.
-Transformation of qualitative variables into a score
Let $M_{q_m, l}$, $l = 1, \dots, l_{q_m}$, be the modalities of the qualitative variable $q_m$, where $l_{q_m} \in \{3, 5\}$ is the number of modalities. For each variable, the score varies between 0 and 100 points, with a jump of 50 points per modality for the variables with three modalities and a jump of 25 points for the variables with five modalities. The scores taken by the modalities are:
• Variables with three modalities: [0, 50, 100]
Example: the modalities relating to the sector default rate are: 1-below average, 2-equal to average, 3-above average. In this case, the scores given are, respectively: 100, 50, 0.

• Variables with five modalities: [0, 25, 50, 75, 100]
Example: the modalities relating to natural risk are: 1-No risk, 2-Low risk with an adequate crisis plan, 3-High risk with an adequate crisis plan, 4-Low risk without crisis plan, 5-High risk without crisis plan. In this case, the scores are, respectively: 100, 75, 50, 25, 0.

[Table: the qualitative variables ($q_m$) by theme. Surviving entries:]
• $T_1$: The sector of activity
• $T_2$: The company's positioning and competition
  $q_4 = q_{(1T_2)}$: Competitive position and intensity
  $q_5 = q_{(2T_2)}$: Barriers and new entrants
• $T_3$: The concentration and position of the counterparty vis-à-vis its suppliers and customers
  Positions vis-à-vis suppliers and customers
• $T_4$: Quality and structure of management
  $q_{10} = q_{(1T_4)}$: Succession planning and business continuity
  Compliance with the accounting documents delivery schedule
• $T_5$: The company's history with banking
  $q_{17} = q_{(1T_5)}$: Number of payment incidents in the last 12 months
  $q_{18} = q_{(2T_5)}$: Percentage of unpaid bills over the last 12 months
• $T_6$: Relations with banks
  $q_{19} = q_{(1T_6)}$: Number of banks related to the company
The assessment of the logical relationship between the modalities of each variable and the default is determined on the basis of expert opinion.
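The transformation of modalities into scores can be illustrated with a short Python sketch. It assumes that the modalities are listed from the most favourable to the least favourable, as in the two examples given in the text. The function name `modality_scores` is hypothetical.

```python
def modality_scores(n_modalities, best_first=True):
    """Return the scores attached to the ordered modalities of a qualitative variable."""
    if n_modalities not in (3, 5):
        raise ValueError("a qualitative variable has 3 or 5 modalities")
    step = 100 // (n_modalities - 1)  # 50 points for 3 modalities, 25 for 5
    scores = [step * i for i in range(n_modalities)]  # ascending: 0 .. 100
    return scores[::-1] if best_first else scores


# Sector default rate (three modalities, most favourable first).
sector_default_rate = dict(zip(
    ["below average", "equal to average", "above average"],
    modality_scores(3),
))
# Natural risk (five modalities, most favourable first).
natural_risk_scores = modality_scores(5)
print(sector_default_rate, natural_risk_scores)
```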

Mathematical modelling of default
The default is modeled by a binary variable $Y$ that indicates whether the company is in default or healthy (formulation (1)). The relationship between the variable $Y$ to be explained and the explanatory variables $V_j$ and $q_m$ is determined by linear discriminant analysis.
To determine the explanatory variables to be used for modelling, we use a univariate analysis for each variable in the chosen list. The objective of this analysis is to determine the relationship between $Y$ and each of the quantitative and qualitative variables $V_j$ and $q_m$.

The linear discriminant analysis
Linear discriminant analysis provides a method for predicting the failure of an enterprise based on quantitative and qualitative discriminant variables.
In the case of the binary modeling given by the formulation (1), the classification function (score function) relating to a vector of characteristics $x$ is written $f(x)$ and is built from:
• $m_0$: the mean point of the group of failing companies.
• $m_1$: the mean point of the group of healthy firms.
• $S$: the within-groups variance-covariance matrix.
If $f(x) > s$, the firm is healthy; otherwise the firm is in default. The threshold $s$ is determined by the model. The classification function $f(x)$ can be written as a linear combination of the variables $X_i$, $i = 1, \dots, p$, which are the quantitative and qualitative discriminating variables, weighted by the discriminating coefficients $\beta_i$. The linear discriminant analysis is based on the following assumptions:
• The discriminating variables should not be overly correlated with one another.
• The discriminating variables derive from a population with a Gaussian distribution.
• The covariance matrices must be equal for each group.
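To make the role of $m_0$, $m_1$ and $S$ concrete, the following Python sketch computes a score function for two discriminating variables. It assumes the common Fisher form $f(x) = (m_1 - m_0)^T S^{-1} \left( x - \frac{m_0 + m_1}{2} \right)$, which is one standard way of writing the score and is not necessarily the exact expression used in this paper. All names and data are hypothetical, and the 2x2 inverse is written out by hand to keep the sketch free of external libraries.

```python
def mean_point(rows):
    """Mean point of a group described by two variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_covariance(group0, group1):
    """Pooled within-groups variance-covariance matrix S (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group0, group1):
        m = mean_point(rows)
        for r in rows:
            for a in range(2):
                for b in range(2):
                    s[a][b] += (r[a] - m[a]) * (r[b] - m[b])
    dof = len(group0) + len(group1) - 2
    return [[v / dof for v in row] for row in s]


def score_function(x, m0, m1, s):
    """Assumed Fisher score: (m1 - m0)' S^-1 (x - (m0 + m1) / 2)."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    beta = [inv[j][0] * d[0] + inv[j][1] * d[1] for j in range(2)]  # discriminating coefficients
    return sum(beta[j] * (x[j] - (m0[j] + m1[j]) / 2) for j in range(2))


# Hypothetical data: group 0 = failing companies, group 1 = healthy firms.
failing = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.0), (2.5, 2.0)]
healthy = [(4.0, 5.0), (5.0, 4.0), (4.5, 5.5), (5.5, 4.5)]
m0, m1 = mean_point(failing), mean_point(healthy)
S = within_covariance(failing, healthy)
print(score_function(m1, m0, m1, S), score_function(m0, m0, m1, S))
```

Under this form the score is positive at the healthy centroid and negative at the failing centroid, so a threshold $s = 0$ would separate the two centroids symmetrically. The model in the text determines its own threshold $s$.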

Choice of variables
The choice of the discriminant variables to be used is based on the univariate analysis. Indeed, a variable is retained as discriminant only if the hypothesis of equality of group means is rejected for it. The statistical test for equality of group means is given in Table 3.

Testing of the significance of the coefficients
The validation of the multivariate model depends on the following significance tests:
-The Box's M test (the group covariance matrices are all equal)
The Box's M test is used to check whether two or more covariance matrices are equal (homogeneity of variances). The null hypothesis $H_0$ is "The group covariance matrices are all equal", and the test statistic is defined using:
• $n$: the sum of the sizes of the two groups.
• $S_i$: the estimate of the covariance matrix of the variables in the group ($i$).
The decision of the test depends on the group sizes $n_i$ and the number of discriminating variables, because the statistic can follow a chi-square law or a Fisher law. If the p-value is less than 5%, $H_0$ is rejected.
-Tests relating to the predictive capacity of the classification function (score function)
To test the predictive capacity of the classification function, we use Wilks' lambda. The test statistic is defined in Table 4.

The confusion matrix
To ensure that the discriminant function provides a good classification of companies into subgroups, we use the confusion matrix defined in Table 5. This matrix makes it possible to determine the capacity of the model to correctly classify the firms. It is measured by the ratio $\frac{n_{10} + n_{21}}{n_1 + n_2}$. This capacity is confirmed by the $Q_{presse}$ test (Press's Q) defined in Giannelloni and Vernette (2001). The hypothesis $H_0$ is defined by « the number of individuals correctly classified by the discriminating function is equal to the number correctly classified by chance ».
The test statistic is:
$$Q_{presse} = \frac{(n - n_c \times k)^2}{n (k - 1)}$$
with: $n$ the total number of companies, $n_c$ the number of companies correctly classified and $k$ the number of groups.
The statistic $Q_{presse}$ follows a chi-square law ($\chi^2$) with 1 (one) degree of freedom. If the p-value is less than 5%, $H_0$ is rejected.
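The classification rate and the $Q_{presse}$ test can be sketched in Python as follows. The sketch assumes the usual form of Press's Q, $(n - n_c k)^2 / (n (k - 1))$, and compares it with 3.84, the 5% critical value of the $\chi^2$ law with one degree of freedom. The confusion matrix counts and the function names are hypothetical.

```python
def classification_rate(confusion):
    """Share of correctly classified firms: diagonal of the confusion matrix over its total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def press_q(n, n_correct, k):
    """Press's Q statistic for n firms, n_correct correct classifications and k groups."""
    return (n - n_correct * k) ** 2 / (n * (k - 1))


# Hypothetical confusion matrix: rows = observed group, columns = predicted group.
confusion = [[45, 5],
             [5, 45]]
rate = classification_rate(confusion)
q = press_q(n=100, n_correct=90, k=2)
reject_h0 = q > 3.84  # 5% critical value of the chi-square law with 1 degree of freedom
print(rate, q, reject_h0)
```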

The assignment threshold
The decision to assign a company to a group is based on the assignment threshold defined by the functions at group centroids. The separation of groups is defined in Table 6.
The optimal separation point is the weighted mean of the values of $\alpha$ and $\beta$, $\frac{n_1 \alpha + n_2 \beta}{n_1 + n_2}$. However, if both groups are the same size ($n_1 = n_2$), the separation point is the arithmetic mean of $\alpha$ and $\beta$, $\frac{\alpha + \beta}{2}$.

Discriminatory power (power stat)
The discriminatory power represents the model's ability to predict future situations. We use the ROC curve to determine the discriminatory power of the model. The ROC curve is determined from the classification table of the estimation sample of the variable $Y$, which is presented in Table 7.
The sensitivity (SV) is the proportion of healthy companies that are correctly classified, $SV = \frac{TH}{TH + FH}$, and the specificity (SP) is the proportion of companies in default that are correctly classified, $SP = \frac{FD}{FD + TD}$.
If one varies the "probability threshold" above which a company is regarded as healthy, the sensitivity and the specificity vary. The curve of the points ($1 - SP$, $SV$) is the ROC curve.
• Definition of the area under the ROC curve (AUC) and the accuracy ratio (AR)
The area under the ROC curve (AUC) provides an overall measure of model fit (Bewick, Cheek, and Ball, 2004). The AUC varies from 0.5 (no predictive capacity) to 1 (perfect predictive capacity).
• Accuracy ratio (AR)
The accuracy ratio is defined by the relationship:
$$AR = 2 \times AUC - 1$$
The AR takes values between 0 and 1.
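As an illustration, the AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen healthy firm receives a higher score than a randomly chosen firm in default (ties counted for one half). The Python sketch below uses this equivalence and assumes that a higher score indicates a healthier firm. The accuracy ratio is taken as $2 \times AUC - 1$. The scores and function names are hypothetical.

```python
def auc(healthy_scores, default_scores):
    """Probability that a healthy firm scores higher than a defaulted one (ties count 1/2)."""
    wins = 0.0
    for h in healthy_scores:
        for d in default_scores:
            if h > d:
                wins += 1.0
            elif h == d:
                wins += 0.5
    return wins / (len(healthy_scores) * len(default_scores))


def accuracy_ratio(auc_value):
    """AR = 2 * AUC - 1."""
    return 2.0 * auc_value - 1.0


# Hypothetical scores on the estimation sample.
healthy = [80, 70, 65, 50]
in_default = [60, 40, 30]
auc_value = auc(healthy, in_default)
ar_value = accuracy_ratio(auc_value)
print(round(auc_value, 4), round(ar_value, 4))
```

In this example only one of the twelve (healthy, default) pairs is ranked in the wrong order, so the AUC is 11/12.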

-The determination of explanatory variables
To determine the explanatory variables to be retained for modelling, we will carry out a univariate linear discriminant analysis for each variable in the chosen list.
After selecting the explanatory variables on the basis of the decision rules mentioned above, we study the correlation between the selected variables. The study of correlations makes it possible to eliminate strongly correlated variables. Indeed, if two or more variables have a correlation coefficient greater than or equal to 0.5 ($\rho \ge 0.5$), then the variable that has the greatest AUC is selected.
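The elimination rule can be sketched as a greedy filter in Python: variables are visited by decreasing AUC and a variable is dropped when it is too correlated with one already kept. The sketch assumes the Pearson coefficient and applies the 0.5 threshold to its absolute value, which is an assumption. The variable names, data and AUC values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def filter_correlated(columns, auc_by_name, threshold=0.5):
    """Keep, among strongly correlated variables, the one with the greatest AUC."""
    kept = []
    for name in sorted(columns, key=lambda v: auc_by_name[v], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical observations of three candidate variables and their univariate AUCs.
columns = {
    "V8": [1, 2, 3, 4, 5],
    "V9": [2, 4, 5, 4, 5],
    "V3": [5, 1, 4, 2, 3],
}
auc_by_name = {"V8": 0.70, "V9": 0.75, "V3": 0.65}
selected = filter_correlated(columns, auc_by_name)
print(selected)
```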

-The performance of the multivariate model
The discriminating capacity of the multivariate model is considered acceptable if the AUC is greater than 70%.

The canonical discriminant function
The canonical discriminant function, presented in Klecka (1980), is a linear combination of the discriminant variables $X_i$, in which the weights $u_i$ are the canonical coefficients.
The maximum number of canonical functions is equal to $\min(k - 1, p)$, where $k$ is the number of classes and $p$ is the number of discriminating variables.
The canonical coefficients are determined in such a way as to maximize the distance between the group centroids. The canonical discriminant functions can be used to predict the most probable class of membership of an unseen case.
Canonical discriminant analysis is detailed in Palm (1999) and Klecka (1980). It is similar to principal component analysis in that it replaces the initial discriminating variables with uncorrelated canonical variables that are linear combinations of the initial variables.

The construction of the rating model
The conception of the rating model is based on the score function, because the probability of default (PD) depends on the score attributed by the statistical model. The conception process is as follows:

The determination of the score function by linear discriminant analysis
The modeling of default by linear discriminant analysis is done by the simultaneous treatment of quantitative and qualitative variables. Let $X_j$, $j = 1, \dots, r$, and $T_i$, $i = 1, \dots, 6$, be, respectively, the quantitative and qualitative variables retained by the univariate linear discriminant analysis, denoted, respectively, $X$ and $T$.
The score function of the linear discriminant analysis is defined by the relationship:

Determination of the rating grid
The determination of the rating grid consists in determining the score interval for each class. The standardized score is defined over the interval [0, 100]. This interval is segmented into eight (8) classes to determine the rating classes.

The prediction of healthy firms by the linear discriminant analysis
The prediction of healthy firms by linear discriminant analysis is based on the functions at group centroids defined in Table 6. Let $x_j$, $j = 1, \dots, r$, and $t_i$, $i = 1, \dots, 6$, be the characteristics of the firm ($i$). This firm is considered healthy according to a condition on its score function relative to the group centroids.
For the firms with a score function between $\alpha$ and $\beta$, defined in Table 6, the healthy and default classifications overlap. The classification in this case is based on the point of separation, and the firm ($i$) is considered healthy according to the position of its score relative to this point.
These rules define the classification function $f_c(X, T)$ of the firms.

The rating grid
The classification of healthy companies is based on the score function. This classification gives rise to the rating grid composed of 8 rating classes.
Each company ($i$) is classified into a rating class; the classes range from A to H and are defined in Table 8. The score intervals of Table 8 that remain recoverable are [65-76[, [55-65[, [46-55[, [40-46[ and [30-40[.
The score intervals are semi-open on the right to guarantee the independence of the rating classes. Indeed, the lower bound belongs to the class "t" and the upper bound of the interval belongs to the class "t+1", with the exception of class A, which has a closed interval.
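The assignment of a standardized score to a rating class can be sketched in Python as a lookup over lower bounds. The sketch rests on several assumptions that are not confirmed by the text: class A is taken to be the closed interval [76, 100], the five recoverable intervals are attributed to classes B to F in decreasing order of score, and the lower bounds of classes G and H (15 and 0) are purely hypothetical placeholders.

```python
# (class, lower bound). The bounds of G and H are hypothetical placeholders;
# the attribution of the other intervals to letters is an assumption.
LOWER_BOUNDS = [
    ("A", 76), ("B", 65), ("C", 55), ("D", 46),
    ("E", 40), ("F", 30), ("G", 15), ("H", 0),
]


def rating_class(score):
    """Return the rating class of a standardized score in [0, 100]."""
    if not 0 <= score <= 100:
        raise ValueError("the standardized score must lie in [0, 100]")
    for label, lower in LOWER_BOUNDS:
        if score >= lower:  # each lower bound belongs to its own class
            return label


print(rating_class(76), rating_class(75.9), rating_class(46), rating_class(45.9))
```

Because every lower bound belongs to its own class and every upper bound to the neighbouring class, each score falls in exactly one class.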

Calculation of the rating score
The rating score $SN_i$ of the firm ($i$) is defined from $S_i$, the score function of the firm ($i$).

Calculation of the probability of default per rating class
The probability of default of the class $K$ ($PD_K$) is defined as the probability of default of the company ($i$) knowing that the company ($i$) belongs to the class $K$:
$$PD_K = P(\text{firm } i \text{ in default} \mid i \in K)$$
The probability of default can be determined by the probabilistic approach or by empirical calculation:

Theoretical calculation of the default probability of the rating class (K)
In the framework of linear discriminant analysis, the probability of default $PD_{Ki}$ of the firm ($i$) is expressed by Gurný et al. (2013) by a formula in which $\pi_{Y=0}$ is the prior probability of default of the sample and $\alpha$ is a constant defined from the following quantities:
• $X_{Y=1}$ and $X_{Y=0}$ are the vectors composed of the mean values of the independent variables of the linear discriminant analysis.
• $\gamma^T$ is the transpose of the vector $\gamma$, which is defined from $\Sigma$, the variance-covariance matrix of the discriminatory variables.
As a result, the probability of default $PD_K$ of the class $K$ is determined by the formula (11).

Empirical calculation of the probability of default
The probability of default ($PD_K$) can be calculated from empirical data. It represents the proportion of firms in default among the firms belonging to the rating class $K$. As a result, it is calculated by the following formula:
$$PD_K = \frac{R_k}{n_k}$$
where $R_k$ is the number of firms in default in the class $K$ and $n_k$ is the number of firms in the class $K$. To define this probability of default per rating class, we distribute the sample of healthy and default companies as in Table 9.
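The empirical calculation amounts to counting defaults per rating class, as in the following minimal Python sketch. The sample and the function name are hypothetical.

```python
from collections import Counter


def empirical_pd_by_class(firms):
    """firms: iterable of (rating_class, in_default) pairs; returns {class: defaults / size}."""
    sizes, defaults = Counter(), Counter()
    for rating, in_default in firms:
        sizes[rating] += 1
        defaults[rating] += int(in_default)
    return {rating: defaults[rating] / sizes[rating] for rating in sizes}


# Hypothetical sample: 100 firms in class A (1 default), 10 firms in class H (4 defaults).
sample = [("A", False)] * 99 + [("A", True)] + [("H", False)] * 6 + [("H", True)] * 4
pd_by_class = empirical_pd_by_class(sample)
print(pd_by_class)
```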

Definition of the Bayesian approach
Let $Y$ be a random variable and $\theta$ its parameter. According to the Bayesian approach, the parameter $\theta$ is considered as a random variable with density $\pi(\theta)$.
The joint probability density function $f(Y, \theta)$ of the random vector is defined by:
$$f(Y, \theta) = f(Y \mid \theta)\, \pi(\theta) = \pi(\theta \mid Y)\, f(Y)$$
where:
• $\pi(\theta)$ is the probability density of the parameter, called the prior density function;
• $\pi(\theta \mid Y)$ is the density of the parameter given the data $Y$, the so-called posterior density;
• $f(Y, \theta)$ is the joint density of the observed data and the parameter;
• $f(Y \mid \theta)$ is the density of the observations for a given parameter.
Bayes' theorem says that the posterior density can be calculated as follows:
$$\pi(\theta \mid Y) = \frac{f(Y \mid \theta)\, \pi(\theta)}{f(Y)}$$
Hence the posterior distribution can be viewed as a combination of the prior knowledge $\pi(\theta)$ with the likelihood function of the observed data $f(Y \mid \theta)$. Since $f(Y)$ is a normalisation constant, the posterior distribution is often written in the form
$$\pi(\theta \mid Y) \propto f(Y \mid \theta)\, \pi(\theta) \qquad (15)$$
where $\propto$ is the symbol for proportionality. The Bayesian estimator for the univariate case is defined as follows:
$$\theta_{bay} = E[\theta \mid Y] = \int \theta\, \pi(\theta \mid Y)\, d\theta \qquad (16)$$

Calculation of the probability of default by the Bayesian approach
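Let $Y_i$ be the random variable that models the failure of the company ($i$) and let $\theta$ be the random variable that models the parameter of $Y_i$. The variable $Y_i$ is expressed by the relationship:
$$Y_i = \begin{cases} 1 & \text{the company is in default, with a probability of } \theta \\ 0 & \text{the company is healthy, with a probability of } 1 - \theta \end{cases}$$
Let $p(y_i \mid \theta)$ be the probability of $Y_i$ knowing $\theta$; $p(y_i \mid \theta)$ is defined as follows:
$$p(y_i \mid \theta) = \theta^{y_i} (1 - \theta)^{1 - y_i}$$
Let $R_k$ be the number of companies in default in the class $K$ and $n_k$ the number of companies that belong to the class $K$; $R_k$ is expressed by the following formula:
$$R_k = \sum_{i=1}^{n_k} Y_i$$
Let $p(R_k \mid \theta)$ be the probability of $R_k$ knowing $\theta$. The variable $R_k$ is modelled by the binomial distribution, so the probability $p(R_k \mid \theta)$ is expressed by the following formula:
$$p(R_k \mid \theta) = \binom{n_k}{R_k} \theta^{R_k} (1 - \theta)^{n_k - R_k}$$

3.6.3. Definition of the prior law $\pi(\theta)$ of $\theta$
The prior law of $\theta$ for the binomial distribution is studied in Tasche (2013). The prior law for the distribution of $\theta$ that we use is the beta distribution with the parameters ($\alpha$, $\beta$), defined by the following formula:
$$\pi(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \qquad 0 \le \theta \le 1$$
where $\Gamma$ is the Euler gamma function defined by:
$$\Gamma(x) = \int_0^{+\infty} t^{x - 1} e^{-t}\, dt$$
The parameters $\alpha$ and $\beta$ are estimated by the experts. The mathematical expectation and the standard deviation of the beta law are determined according to $\alpha$ and $\beta$ as follows:
$$E(\theta) = \frac{\alpha}{\alpha + \beta}, \qquad \sigma(\theta) = \sqrt{\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}}$$
The posterior law conjugate to the prior law of $\theta$ is defined by the formula (15) as follows:
$$\pi(\theta \mid R_k) = C\, \theta^{\alpha + R_k - 1} (1 - \theta)^{\beta + n_k - R_k - 1}$$
where $C$ is a constant independent of $\theta$; thus the posterior law is the beta distribution with the parameters ($\alpha + R_k$, $\beta + n_k - R_k$).

3.6.5. Determination of the Bayesian estimator of the parameter $\theta$
The Bayesian estimator of the parameter $\theta$ is determined by the formula (16) as follows:
$$\theta_{bay} = E[\theta \mid R_k] = \frac{\alpha + R_k}{\alpha + \beta + n_k} \qquad (19)$$

3.6.6. Determination of the Bayesian default probability by class $K$
The formula (19) can be written as follows:
$$\theta_{bay} = \frac{\alpha + \beta}{\alpha + \beta + n_k} \times \frac{\alpha}{\alpha + \beta} + \frac{n_k}{\alpha + \beta + n_k} \times \frac{R_k}{n_k}$$
We set $\frac{\alpha + \beta}{\alpha + \beta + n_k} = \varepsilon$; thus $\theta_{bay}$ is written:
$$\theta_{bay} = \varepsilon \times \frac{\alpha}{\alpha + \beta} + (1 - \varepsilon) \times \frac{R_k}{n_k} \qquad (20)$$
with:
• $\frac{\alpha}{\alpha + \beta}$ is the prior probability of default defined by the experts ($PD_{e,K}$).
• $\frac{R_k}{n_k}$ is the empirical probability of default defined by the statistical model ($PD_K$).
• $\theta_{bay}$ is the posterior probability of default ($PD_{a,K}$).
The formula (20) makes it possible to express the posterior probability of default as follows:
$$PD_{a,K} = \varepsilon \times PD_{e,K} + (1 - \varepsilon) \times PD_K \qquad (21)$$
The formula (21) shows that the Bayesian probability of default is the weighted mean of the statistical probability of default and the probability of default of the experts.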
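The equivalence between the posterior mean of the beta-binomial model and the weighted mean of the expert and statistical probabilities of default can be checked numerically. The Python sketch below assumes a Beta($\alpha$, $\beta$) prior and a binomial likelihood, as described above. It also inverts the relations $\varepsilon = \frac{\alpha + \beta}{\alpha + \beta + n_k}$ and $PD_{e,K} = \frac{\alpha}{\alpha + \beta}$ to recover ($\alpha$, $\beta$) from an expert estimate and a weight. The function names and the figures are hypothetical.

```python
def bayesian_pd(alpha, beta, defaults, size):
    """Posterior mean of theta for a Beta(alpha, beta) prior and a binomial likelihood."""
    return (alpha + defaults) / (alpha + beta + size)


def expert_weight(alpha, beta, size):
    """epsilon = (alpha + beta) / (alpha + beta + n_k)."""
    return (alpha + beta) / (alpha + beta + size)


def beta_prior_from_expert(pd_expert, epsilon, size):
    """Recover (alpha, beta) from the expert PD and the weight epsilon (0 <= epsilon < 1)."""
    strength = epsilon * size / (1.0 - epsilon)  # alpha + beta
    alpha = pd_expert * strength
    return alpha, strength - alpha


def combined_pd(pd_expert, pd_stat, epsilon):
    """Weighted mean of the expert PD and the statistical PD."""
    return epsilon * pd_expert + (1.0 - epsilon) * pd_stat


# Hypothetical rating class: 200 firms, 6 defaults, expert PD of 5% weighted at 25%.
n_k, r_k = 200, 6
alpha, beta = beta_prior_from_expert(pd_expert=0.05, epsilon=0.25, size=n_k)
pd_posterior = bayesian_pd(alpha, beta, r_k, n_k)
pd_weighted = combined_pd(0.05, r_k / n_k, 0.25)
print(round(pd_posterior, 6), round(pd_weighted, 6))
```

Both routes give the same posterior probability of default, which illustrates that fixing the weight $\varepsilon$ is equivalent to fixing the strength $\alpha + \beta$ of the prior relative to the class size $n_k$.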

The implementation of the Bayesian approach
The implementation of the Bayesian approach to determine the probability of default that we propose in this paper is based on the Delphi technique defined in Helmer (1968) and Ayyub (2001). The information solicited from experts concerns the probability of default by class $PD_{e,K}$ and the weighting of the expert opinion $\varepsilon$.

Estimation by experts of the probability of default by class $PD_{e,K}$
The estimation by experts of the probability of default by class $PD_{e,K}$ is done in two different ways: it can be determined explicitly or implicitly.
3.7.1.1. Explicit estimation of the probability of default. The explicit method consists in directly soliciting expert opinion on the default rate of each rating class, on the basis of the classification criteria of the statistical model, the risk profile and the score of each class.
In this case, the expert must provide a probability of default per score class. To do so, the expert provides the mean number of defaults for theoretical samples with sizes equal, respectively, to 100, 1,000 and 10,000 companies. The probability of default is equal to the number estimated by the expert divided by the corresponding sample size.
3.7.1.2. Implicit estimation of the probability of default. In this case, we implicitly deduce the probability of default of the experts from the amount of the expected loss ($EL_M$) defined by the formula (1). The expert must estimate the amount of the expected loss as a percentage of the credit volume of each class.
The probability of default $PD_{e,K}$ is calculated using the estimate of the loss given default (LGD) fixed by the IRB-F approach. For the implicit estimate, the expert must estimate the mean loss per score interval. In our study, the expert must state the mean loss for an exposure of 100,000, 1,000,000 and 10,000,000 MAD.
The probability of default is determined by assuming that the expert provides the amount of the expected loss ($EL_M$), by using the IRB-F estimate of the loss given default (LGD), fixed at 45%, and by considering that the exposures, relating to a credit conversion factor (CCF) of 75%, are, respectively, equal to 100,000, 1,000,000 and 10,000,000 MAD.
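The two elicitation routes can be sketched in Python as follows. The explicit route divides the estimated number of defaults by the sample size. The implicit route inverts $EL = PD \times LGD \times EAD$ with the LGD fixed at 45%. The text does not say how the three answers of an expert are combined, so the simple average used here is an assumption, and the function names and expert answers are hypothetical.

```python
def explicit_pd(defaults_by_sample_size):
    """Average of (estimated defaults / sample size) over the theoretical samples."""
    ratios = [d / size for size, d in defaults_by_sample_size.items()]
    return sum(ratios) / len(ratios)  # assumption: simple average of the three answers


def implicit_pd(expected_loss_by_exposure, lgd=0.45):
    """Average of EL_M / (LGD * EAD) over the reference exposures."""
    ratios = [el / (lgd * ead) for ead, el in expected_loss_by_exposure.items()]
    return sum(ratios) / len(ratios)  # assumption: simple average of the three answers


# Hypothetical answers of one expert for one rating class.
pd_explicit = explicit_pd({100: 2, 1_000: 21, 10_000: 190})
pd_implicit = implicit_pd({100_000: 900, 1_000_000: 9_000, 10_000_000: 90_000})
print(round(pd_explicit, 4), round(pd_implicit, 4))
```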

Estimation of the weighting of the expert opinion $\varepsilon$
The weighting of the expert opinion $\varepsilon$ is carried out with the participation of the credit-file counter-study function, the permanent control function and the internal audit function. In our study, we used two weighting values, 25% and 50%. The selected experts must therefore correspond to these two weights.

Definition of the participants
The implicit and explicit estimation of the probability of default is made with the participation of the following parties:

Credit portfolio managers (experts)
The portfolio managers must provide an estimate of the number of defaults and of the mean expected loss per rating class.

Internal auditors and permanent controllers (evaluator)
The internal audit and permanent control functions take part in weighting the experts' opinions.
3.7.4. Choice of participants
3.7.4.1. Choice of experts. For the elaboration of the Bayesian models, we use two values for the weighting of the expert opinion, 25% and 50%. Therefore, we need to identify the experts whose opinions can be weighted, respectively, at 25% and 50%.
To do this, we drew up a list of credit portfolio managers and scored their profiles; we then retained only those whose estimates can be weighted at 25% and 50%.
The score of the credit portfolio managers is determined with their hierarchical managers and confirmed with the audit function and the permanent control function, on the basis of a grid composed of the following elements:
• Pertinent expertise, university education and professional experience.
• The size of the portfolio managed and the rate of companies in difficulty.
• Mastery of the process of control, recovery and financial management.
• Excellent communication abilities, flexibility, impartiality and capacity to generalize and simplify.
The score must give a value in the grid (10; 25; 50; 75). Each criterion must be qualified as low, medium or high. To calculate the score, we assigned the ratings given in Table 10 to these qualifications. The score is equal to the sum of the ratings assigned to all criteria, and the weighting is defined according to the score in Table 11.
3.7.4.2. Choice of evaluators. The choice of evaluators at the level of permanent control and internal audit is based on expertise in credits. Therefore, we have defined the criteria for selection as follows:
• Pertinent expertise, university education and professional experience.
• Expertise in credit risk.
• The number of control and audit missions of the credit activity.

The conduct of data collection from experts
Once we had selected the experts and evaluators, we organized evaluation meetings in which the following elements were taken into consideration:
• Definition of the objective of the study (modelling of probability of default, rating …).
• The presentation of the portfolio of SMEs and their characteristics.
• The default rate in the portfolio of SMEs.
• The need for information: number of defaults and mean loss per rating class.
• Data collection from credit portfolio managers and validation with the hierarchy.
• Selection and weighting of experts and presentation of criteria for weighting.
• Choice of evaluators and presentation of the criteria for choice.

Description of the database
In this study, we used a database of small and medium-sized companies from a Moroccan bank, composed of 1447 companies. The quantitative and qualitative information relates to 31-12-2017, and the situation of the companies (healthy or in default) is observed during 2018. In terms of default, the portfolio structure is presented in Table 12.
The financing authorizations ($V_{efi}$) and balance sheet values ($VCB_0$) of this portfolio by rating class are detailed in Table 13.
4.2. Choice of quantitative and qualitative variables

Univariate discriminant analysis
The univariate linear discriminant analysis of the quantitative and qualitative variables made it possible to determine the discriminant variables. The choice of variables is based on the Fisher ratio. The non-discriminatory variables are presented in Table 14.
The Fisher ratio test for the quantitative and qualitative variables presented in Table 14 shows that the p-value is greater than 0.05. Therefore, these variables are not discriminating and are not retained for the multivariate linear discriminant analysis. By contrast, the selected discriminant variables are given in Table 15.

Analysis of the correlation
To limit the impact of correlation between variables on the prediction of default, Table 16 reports the correlations between the quantitative variables. The analysis shows that the discriminating variables $V_8$ and $V_9$ are highly correlated, with a correlation coefficient equal to 0,6. Therefore, we retain only the variable $V_9$ for modelling.
The choice to retain $V_9$ is based on the ROC curve: the AUC of $V_9$ is greater than that of $V_8$. The performance of the two variables is compared in Figure 1.
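The selection rule between two correlated variables can be sketched as follows. This is only an illustration under stated assumptions: the AUC is computed as the probability that a defaulting firm has a higher value than a healthy one, the helper names are hypothetical, and the toy data are not the study's data.

```python
def auc(values, labels):
    """AUC of one variable: P(value of a defaulter > value of a healthy firm), ties count 1/2."""
    defaulters = [v for v, y in zip(values, labels) if y == 1]
    healthy = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(
        1.0 if d > h else 0.5 if d == h else 0.0
        for d in defaulters
        for h in healthy
    )
    return wins / (len(defaulters) * len(healthy))


def pick_by_auc(candidates, labels):
    """Return the name of the candidate variable with the highest AUC (hypothetical helper)."""
    # An AUC below 0.5 means the variable discriminates in the opposite direction,
    # so the comparison uses max(a, 1 - a).
    def strength(name):
        a = auc(candidates[name], labels)
        return max(a, 1.0 - a)

    return max(candidates, key=strength)


# Toy data: 1 = default, 0 = healthy.
labels = [0, 0, 0, 1, 1]
candidates = {
    "V8": [0.2, 0.5, 0.4, 0.45, 0.9],
    "V9": [0.1, 0.3, 0.2, 0.6, 0.8],
}
retained = pick_by_auc(candidates, labels)
```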

Multivariate analysis and determination of the classification function
Let $X_j$, $1 \le j \le 12$, be the quantitative discriminating variables and $T_i$, $1 \le i \le 6$, the qualitative discriminating variables, with:
• $(T_i, 1 \le i \le 6) = (q_1, q_6, q_8, q_{12}, q_{17}, q_{19})$.

Testing of the significance
The results of the tests of significance defined in the third section are presented below.

The Box's M test
The results of the test are presented in Table 17.
Since the calculated p-value is lower than the significance level α = 0,05, the null hypothesis $H_0$ must be rejected. This means that the model distinguishes between defaulting and healthy enterprises.

Tests relating to the predictive capacity of the score function (Wilks' lambda)
The results of the test are presented in Table 18. Since the calculated p-value is lower than the significance level α = 0,05, the null hypothesis $H_0$ must be rejected. This means that the model has an acceptable capacity to predict companies in default.

The Confusion matrix
The confusion matrix is presented in Table 19.
The model classifies about 93,7% of the companies correctly. This signifies that the model has an excellent capacity to correctly classify enterprises.
This capacity is confirmed using the Q test presented in the third section. The empirical value of the test statistic $Q_{press}$ is compared with the critical value of the χ² distribution with 1 degree of freedom, which is equal to 3,84. Since $Q_{press}$ is greater than the critical value, the null hypothesis $H_0$ must be rejected. This confirms that the model has an excellent capacity to correctly classify enterprises and that it offers a better classification than random classification.
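The decision rule of this test can be sketched as follows, assuming that the Q test of the third section is the usual Press's Q statistic, $Q = \frac{(N - nK)^2}{N(K - 1)}$, where N is the sample size, n the number of correctly classified observations and K the number of groups. The counts used below are hypothetical and are not the study's figures.

```python
CHI2_CRITICAL_1DF_5PCT = 3.84  # critical value quoted in the text


def press_q(n_total, n_correct, n_groups=2):
    """Press's Q statistic for the classification accuracy of a discriminant model."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


def better_than_chance(n_total, n_correct, n_groups=2):
    """True when Q exceeds the chi-square critical value, i.e. H0 (chance-level accuracy) is rejected."""
    return press_q(n_total, n_correct, n_groups) > CHI2_CRITICAL_1DF_5PCT


# Hypothetical example: 80 of 100 firms correctly classified into 2 groups.
q_example = press_q(100, 80)
```

With a 50% hit rate the statistic is zero, which is why values far above 3,84 indicate a classification clearly better than chance.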

The performance of the model
The AUC of the model is equal to 0,798, which represents an acceptable performance. The ROC curve is shown in Figure 2.

The canonical correlation
The canonical correlation of variables is given in Table 20.
The canonical correlation is equal to 0,4430. This value alone does not allow us to conclude on the discriminating capacity of the model. We therefore rely on the AUC to assess the discriminating capacity of the model.

The functions at group centroids
The functions at group centroids are presented in Table 21.

The canonical discriminant function
The number of canonical discriminant functions in the case of two groups is limited to a single function. In our study, this function is presented as follows:

The separation point is approximately equal to zero:

$$\frac{114 \times (-1{,}6880) + 1333 \times 0{,}1440}{1447} \approx 0$$

Thus, for the prediction of default, an enterprise is considered healthy if $f_c \ge 0$. The canonical discriminant function shows that the variables most correlated with the score are the return on equity ($X_4 = \frac{\text{Net profit}}{\text{Equity}}$), the seniority of the principal operational staff ($T_4$) and the number of payment incidents in the last 12 months ($T_5$).
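The separation point can be checked numerically as the size-weighted mean of the two group centroids. The sketch below assumes, from the sign of the decision rule, that the 114 firms with centroid −1,6880 are the defaulting group and the 1333 firms with centroid 0,1440 are the healthy group; the function names are hypothetical.

```python
def separation_point(n_default, centroid_default, n_healthy, centroid_healthy):
    """Size-weighted mean of the two group centroids of the discriminant function."""
    total = n_default + n_healthy
    return (n_default * centroid_default + n_healthy * centroid_healthy) / total


def classify(score, cutoff=0.0):
    """Decision rule from the text: healthy if the discriminant score is at least the cutoff."""
    return "healthy" if score >= cutoff else "default"


# Group sizes and centroids quoted in the text (decimal points instead of commas).
cutoff = separation_point(114, -1.6880, 1333, 0.1440)
```

The computed value is of the order of −0,0003, which is why the cutoff is taken as zero.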

Conception of the rating model by linear discriminant analysis
The results of the process of conception of the rating model are presented in Table 22.
The distribution of healthy and defaulting enterprises and the probability of default by rating class are presented in Figure 3.

Conception of the rating model using the Bayesian approach
For the conception of the Bayesian rating models, we opted for an aggregation of the opinions of the selected experts regardless of their weightings. The estimation is made with a working group that contains the two categories of experts. The expert results are then weighted by 25% and by 50% in turn to determine the two Bayesian models associated with the LDA model.

The explicit probability of default
The results of the assessment of the explicit default probability by rating class are presented as follows:

The implicit probability of default
The implicit probability of default is defined in Table 23.
The results of the assessment of the implicit probability of default by rating class are presented in Table 24.
The probability of default by experts ($PD_{e,K}$) retained is the mean of the explicit and implicit probabilities of default, with a minimum of 0,03% for class A. Table 25 summarizes the results.

Determination of Bayesian rating models
The Bayesian probability of default is a weighting of the experts' default probability and the default probability determined by the statistical model. The Bayesian models according to the weighting of expert opinion are presented in Table 26.

Figure 3. The distribution of healthy and defaulting enterprises and the probability of default by rating class.
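The weighting of expert and statistical estimates can be illustrated with a short sketch. The exact formula is not reproduced in this section, so the code assumes a simple convex combination with an expert weight w (25% or 50%), and it applies the 0,03% minimum as a general floor, although the text states it only for class A. All names and numerical inputs are hypothetical.

```python
PD_FLOOR = 0.0003  # 0.03%, the minimum stated in the text for class A


def expert_pd(explicit_pd, implicit_pd, floor=PD_FLOOR):
    """Expert PD of a class: mean of the explicit and implicit estimates, floored (assumption)."""
    return max((explicit_pd + implicit_pd) / 2.0, floor)


def bayesian_pd(pd_statistical, pd_expert, expert_weight):
    """Assumed convex combination of the expert PD and the statistical (LDA) PD."""
    if not 0.0 <= expert_weight <= 1.0:
        raise ValueError("expert_weight must lie in [0, 1]")
    return expert_weight * pd_expert + (1.0 - expert_weight) * pd_statistical


# Hypothetical class: statistical PD of 2%, expert estimates of 0.8% and 1.2%.
pd_e = expert_pd(0.008, 0.012)
pd_25 = bayesian_pd(0.02, pd_e, 0.25)
pd_50 = bayesian_pd(0.02, pd_e, 0.50)
```

Under this assumption, a larger expert weight moves the class probability of default further from the statistical estimate towards the expert estimate, while the assignment of firms to classes is unchanged.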

Impact of Bayesian modelling on the calculation of unexpected loss (UL)
In Morocco, the legislator has adjusted the formula (4) to take into consideration the characteristics of Moroccan SMEs (note 3). It provides the following correlation formula for each company (i):

For the IRB-F approach, the maturity M for companies is equal to 2,5. As a result, the formula (3) becomes:

The weighted assets for each enterprise (i) are expressed by:

The unexpected loss incurred with each company (i) is expressed as follows:

The weighted assets are determined by rating class for each model. Let $n_c$, $c = A, \ldots, H$, be the size of class (c); the unexpected loss $UL_c$ of class (c) is defined as follows:

Knowing that the loss given default (LGD) in the framework of the IRB-F approach is equal to 45% and that the probability of default ($PD_c$) of class (c) is the same for all enterprises in this class, the previous formula is written:

The total unexpected loss (UL) is equal to:

4.6.1. The determination of the unexpected loss by LDA
The unexpected loss ($UL_0$) by rating class according to the LDA model is presented in Table 27.
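To make the calculation by rating class concrete, the sketch below implements the standard Basel II IRB capital formula for corporate exposures with LGD = 45% and M = 2,5. It is an assumption-laden illustration: it uses the generic Basel correlation function, without the SME firm-size adjustment and without the Moroccan adjustment mentioned above, and it takes the unexpected loss of a class as the capital requirement K multiplied by the total exposure of the class. The function names and exposures are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def basel_correlation(pd):
    """Standard Basel II asset correlation for corporates (no size adjustment)."""
    w = (1.0 - exp(-50.0 * pd)) / (1.0 - exp(-50.0))
    return 0.12 * w + 0.24 * (1.0 - w)


def capital_requirement(pd, lgd=0.45, maturity=2.5):
    """Capital requirement K per unit of exposure under the Basel II IRB formula."""
    r = basel_correlation(pd)
    b = (0.11852 - 0.05478 * log(pd)) ** 2  # maturity adjustment coefficient
    # PD conditional on the 99.9% quantile of the systematic factor.
    stressed_pd = _STD_NORMAL.cdf(
        (_STD_NORMAL.inv_cdf(pd) + sqrt(r) * _STD_NORMAL.inv_cdf(0.999)) / sqrt(1.0 - r)
    )
    k = lgd * (stressed_pd - pd)
    # With maturity = 2.5 the factor (1 + (M - 2.5) * b) equals 1.
    return k * (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)


def class_unexpected_loss(pd_class, exposures, lgd=0.45):
    """Unexpected loss of a rating class: same PD for all firms, summed over exposures."""
    return capital_requirement(pd_class, lgd) * sum(exposures)


# Hypothetical class with a 1% PD and two exposures.
ul_class = class_unexpected_loss(0.01, [100.0, 200.0])
```

Because K is evaluated at the class probability of default, any change in that probability, such as the Bayesian adjustment, changes the unexpected loss of the class even though the exposures are the same.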

The Bayesian unexpected loss associated with linear discriminant analysis
The Bayesian unexpected loss associated with LDA is presented in Table 28.
The results show that the Bayesian approach, following the methodology presented above, reduces the unexpected loss by 4,7% and 10,1% for expert weightings of 25% and 50%, respectively.

Conclusion
The measurement of credit risk is a major concern for banks because they have to determine the expected loss to be covered by provisions in the framework of IFRS 9 and the unexpected loss that represents the regulatory capital requirement.
The conception of the rating model that classifies counterparties according to their risk profile is the core of the IRB approach. Several techniques can be used to model default and determine the different rating classes.
In this paper, we used linear discriminant analysis to construct a statistical rating model by determining the relationship between default and the quantitative and qualitative variables of the enterprises.
Then, we proposed a Bayesian approach that integrates the experts' estimates to calculate the posterior probability of default by rating class. The resulting rating model has the same performance as the statistical model because the adjustment only concerns the probability of default by class.
The proposed approach has the advantage of capturing, through expert opinion, events not taken into account by the statistical model, such as changes in the economic situation, an increase in the default rate, the various incidents giving rise to the termination of the banking relationship, and changes in the control and decision-making processes.
However, the effectiveness of this approach depends on the rigour of the information collection procedures, which must avoid an underestimation of default probabilities. Indeed, the major disadvantage of the Bayesian approach relates to the quality of the information provided by professional experts. As a result, we have presented an approach based on the Delphi technique, proposed tools for selecting experts and evaluators, and determined the steps needed to collect reliable information.
The calculation of the unexpected loss showed that the Bayesian approach reduces the capital requirement. In our empirical study, the reduction ranges from 4,7% to 10,1%.
In summary, the Bayesian approach permits the adjustment of statistical models so that they conform as closely as possible to the economic climate and to internal changes in terms of control and decision-making. However, the collection procedures must be very rigorous in order to reduce the risk that the information collected from experts is unreliable.
whose turnover is less than 10 million MAD are treated as equivalent to this amount.

Class
The quantitative variables ($V_j$, $1 \le j \le 16$)
$C_1$: Activity. This class determines the impact of the size of the company on the probability of default. The size is determined by the turnover, the number of employees, profit growth and the age of the company.
This ratio measures the return on equity. A low return can mean that the company is having difficulty establishing itself in the market, and a high return can mean that the company is taking unmeasured risks.
This ratio measures the proportion of net profit in turnover. A low ratio may mean that the company practices low margins dictated by market conditions or has difficulty mastering its production and marketing costs, whereas a high ratio means that sales turnover is high or that the company is dominant in its market, so that it can impose its prices while mastering costs.
$C_3$: Solvency. Solvency is measured by the proportion of financial expenses in turnover as well as the proportion of debt to equity in the company's financing. High financial costs can be a sign that the company is in difficulty, and a high proportion of debt can mean that the company is over-indebted.
This ratio measures the proportion of financial charges in turnover. A high ratio may mean that the company regularly uses expensive banking services because of long customer payment terms, a lack of liquidity or difficulties in debt collection.
$V_8 = \frac{\text{DLMT}}{\text{Equity}}$, where DLMT is the long- and medium-term debt. This ratio measures the proportion of long- and medium-term debt in the company's financing. A high level of credit financing can mean that the company is over-indebted or undercapitalized.

$C_4$: Liquidity. Liquidity represents the ability of current assets to cover current liabilities. Default depends on the company's ability to honor its short-term commitments.
$V_{10}$. Let A = accounts receivable; B = cash assets; C = cash liabilities; D = working capital requirement; F = inventory. This ratio represents the rate of coverage of current liabilities and cash liabilities by accounts receivable and cash assets. A coverage in excess of 100% is a sign of the company's solvency.
$C_5$: Financial structure. The financial structure represents the balance between long-term assets and long-term liabilities as well as the proportion of total assets financed by equity.
(Ratio denominator: current assets.) This ratio represents the proportion of current assets financed by the excess of permanent capital over fixed assets. The company can be in a comfortable position if the coverage rate is significant, without going so far as to hold an inactive cash position.
This ratio represents the proportion of equity in the financing of the company's total assets. A low ratio is synonymous with undercapitalization.
$C_6$: Turnover. This class is dedicated to the study of the rotation of certain aggregates such as enterprise value, inventory, accounts receivable and accounts payable. A high rotation is a sign of business dynamism and business continuity.
(Ratio denominator: net fixed assets + WCR, where WCR is the working capital requirement.) The sum of net fixed assets and WCR represents the economic value of the company. As a result, this ratio represents the number of times per year that turnover can repay the economic value of the company. It means that the company's activity can repay capital, medium- and long-term debts and the working capital requirement a number of times per year equal to the value of the ratio.
This ratio determines the velocity of stock rotation. A high rotation can be synonymous with a company that is dynamic in its market.
This ratio makes it possible to assess the payment terms granted to customers and the effectiveness of the collection policy. A low ratio shows that the company practices short payment terms, which can be synonymous with a strong market position and efficient recovery.
(Ratio denominator: total purchases.) This ratio measures the turnover velocity of supplier credit in relation to total purchases. It measures the company's ability to negotiate payment terms with its suppliers. Supplier credit is measured in number of days of purchases. A high ratio can be synonymous with a solid position of the company with regard to its suppliers.

Theme
The qualitative variables ($q_m$, $1 \le m \le 19$)
$T_1$: The sector of activity. This theme contains factors external to the company that may correlate with the company's failure, such as industry, regulation and natural risk.
This variable can explain the company's default. Indeed, companies in sectors with a high failure rate may have a higher probability of default than companies in sectors with a low failure rate.
$q_2$ = Regulatory impact next year. Changes in regulations, such as import or export restrictions, increases in taxes, payment delays and labour law, may have a negative impact on the company's business. Consequently, the assessment of the impact of the regulations on the company may influence the default prediction.
$q_3$ = Exposure to natural risk. Climatic conditions can affect the proper functioning of certain companies, particularly those in the agri-food sector. As a result, exposure to natural risks such as flooding and drought can have an impact on the probability of default.
$T_2$: The company's positioning and competition. This theme concerns the competitive environment of the company and the difficulties that the company may encounter when setting up. The company's position in the market, as well as barriers to entry and national and international competition, can have an impact on the company's failure.
$q_4$ = Competitive position and intensity. The company may be a monopoly, the market leader, among the top 5 or top 10, or in another position. The position may be synonymous with solidity or fragility; as a result, it may in some cases explain the default.
$q_5$ = Barriers and new entrants. The company may be a long-established or a new entrant, and the sector may be open access or may require entry conditions that represent barriers for the company, such as technical expertise, a given level of capital, regulatory authorizations, etc. The existence or absence of barriers may explain defaults in certain cases.
$q_6$ = International competition. Competition can be national or international. For SMEs, international competition can be a major constraint. Consequently, the existence of international competition can increase the probability of default.
$T_3$: The concentration and position of the counterparty vis-à-vis its suppliers and customers. The company's relationship and position with its customers and suppliers can be decisive for its survival and business continuity. The company can be a maker or a taker of prices and deadlines with its customers and suppliers, which affects its liquidity. It can also be dependent on or independent of its customers and suppliers. As a result, this theme can influence the probability of default.
$q_7$ = Customer concentration. Customer concentration can be crucial for the survival of the company. The company may have a single customer, a small number of customers or a diversified customer base. In the first case, the company is dependent on its customer because it may have to cease its activity following the failure of that customer, the non-conformity of its products and services with the customer's requirements, or a conflict with the customer. As a result, customer concentration may influence the probability of default.
$q_8$ = Supplier concentration. The concentration of suppliers can also be decisive for the survival of the company. The company may be in a position of strength or weakness vis-à-vis its suppliers; if there is only one supplier, it may suffer from that supplier's abuses in terms of price-fixing and payment terms. As a result, supplier concentration can influence the probability of default.

The company can be a maker or taker of prices and payment terms with its customers and suppliers, which defines a set of situations characterized by the pair (position with regard to the customer, position with regard to the supplier). Each situation can have an impact on liquidity and on the cost of products and services. As a result, this variable can influence the probability of default.
$T_4$: Quality and structure of management. This theme integrates all the variables relating to the structure and quality of the internal management adopted by the company. It covers the capital structure, the experience of the chairman and senior managers, the succession plan, the timely production of accounting and financial information, historical performance in the event of a crisis and the existence of insurance to cover corporate officers. These variables can influence the probability of default.
$q_{10}$ = Succession planning and business continuity. The existence of a continuity and succession plan is very important for the survival of the company in the short and long term. It ensures that the company can manage the transition in the event of the absence of the current managers. The absence of such a plan can affect the probability of default.
$q_{11}$ = Chairman's experience. The chairman's experience in managing the company, measured in number of years, is an important factor in the company's survival. Confirmed experience reassures partners about the company's future, particularly in terms of commercial conflicts, debt recovery, banking relations and market mastery. As a result, it can have an impact on the probability of default.
$q_{12}$ = Seniority of the principal operational staff. This variable includes the top managers in the assessment of the probability of default. Their confirmed experience can reassure partners about the future of the company, notably in the absence of the chairman and of a succession plan.
$q_{13}$ = Capital distribution. The concentration of capital in a single hand or in a family represents a high risk because the company will be dependent on the person who holds a majority share of the capital. A division of capital reduces this risk. As a result, concentration may increase the probability of default.
$q_{14}$ = Compliance with the accounting documents delivery schedule. The timely production of accounting and financial information and its submission to the banker can be synonymous with transparency and better monitoring of the company's financial situation. A failure to do so can increase partners' uncertainty about the company's future, which can lead to an increase in the probability of default.
$q_{15}$ = Performance during the last crisis. Experience in crisis management is an important indicator of the company's capacity to overcome difficulties, notably in terms of liquidity and financing. As a result, this variable makes it possible to adjust the probability of default.
$q_{16}$ = Existence of insurance for corporate officers. Such insurance covers the loss caused by these officers in the event of mismanagement. As a result, it can reduce the probability of default.
$T_5$: The company's history with banking. This theme includes non-payment incidents registered by the bank. These incidents are signs of the deterioration of the company's risk profile. The existence of recent incidents may increase the probability of default.
$q_{17}$ = Number of payment incidents in the last 12 months.

This variable represents the number of payment incidents. It shows that the company has liquidity problems.
$q_{18}$ = Percentage of unpaid bills over the last 12 months. This variable complements the previous one by including commercial paper issued by the customer and rejected by the bank. It shows that the company has liquidity problems.
$T_6$: Relations with banks. This theme concerns the number of banks in relation with the company. For small and medium-sized companies, it is preferable for the relationship to be limited to a single bank; this allows the company to benefit from the bank's advice and to limit the management burden to a single bank. For big companies with a significant financing requirement, it is preferable for banks to form a consortium to finance them in order to prevent concentration risk.
$q_{19}$ = Number of banks related to the company. The number of banks working with the company must be proportional to its size. A multitude of banks for small and medium-sized structures can be a sign of a lack of liquidity and of financial mismanagement. As a result, this variable can impact the probability of default.
$$PD_K = \frac{\text{Number of firms in default belonging to class } K}{\text{Total number of firms belonging to class } K} \quad (12)$$

Figure 1. Comparison of the performance of correlated variables $V_8$ and $V_9$.

Figure 2. Performance of the linear discriminant analysis model.

© 2019 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.

Cogent Business & Management (ISSN: 2331-1975) is published by Cogent OA, part of Taylor & Francis Group.

Table 3. Univariate analysis and choice of discriminant variables
Table 8. The rating grid
Table 9. The probability of default by class
Table 10. Rating of the scoring criteria for credit portfolio managers
Table 12. The portfolio structure in terms of default
Table 11. Weighting of credit portfolio managers as a function of the score
Table 15. The quantitative and qualitative discriminating variables
Table 17. Results of the Box's M test
Table 19. The confusion matrix
Table 21. The functions at group centroids
Table 22. Rating model based on linear discriminant analysis
Table 20. The canonical correlation of variables
Table 23. Explicit estimation of the probability of default by class according to experts
Table 25. The probability of default of the experts retained for the modelling
Table 26. Bayesian rating models
Table 24. Estimate of expected losses by class according to experts
Table 27. The unexpected loss according to the LDA model

Table 28. The Bayesian unexpected loss

Debt / Equity. Net debt includes long- and medium-term debt (DLMT) and short-term debt (DST). As a result, this ratio is an extension of the $V_8$ ratio: it measures the company's dependence on debt in the short term as well as in the medium and long term.