A Modified Support Vector Machine model for Credit Scoring

This paper presents a novel quantitative credit scoring model, gr-GA-SVM, based on a support vector machine (SVM) with an adaptive genetic algorithm. Two real-world credit datasets from the University of California Irvine Machine Learning Repository are selected for the numerical experiments. SVM, GA-SVM and gr-GA-SVM are employed for credit scoring on the two datasets, and their prediction accuracy is compared. Numerical results indicate that gr-GA-SVM is more accurate and efficient than SVM and GA-SVM.


Introduction
In the past, banks used credit reports, personal histories and judgment to make credit decisions. But over the past 25 years, credit scoring has become widely used in issuing credit cards and in other types of consumer lending, such as auto loans and home equity loans. Although some models have been developed to estimate the default probabilities of large firms, they have been based on the performance of corporate bonds of publicly traded companies. It is not at all clear that these models would accurately predict the default performance of bank loans to these or other companies. To develop a more accurate loan scoring model for larger businesses, a necessary first step would be the collection of a vast array of data on many different types of businesses along with the performance of loans made to these businesses; the data would have to include a large number of bad, as well as good, loans. Since the typical default rate on business loans is in the range of 1 percent to 3 percent annually, banks would have to pool their data. Such data-collection efforts are currently under way. But the fact that loans to large businesses vary in so many dimensions will make the development of a credit scoring model for these types of loans very difficult.
In credit scoring models, there are two kinds of methods: statistical methods and machine learning methods. Several statistical methods are used to develop credit scoring systems, including linear probability models, logistic models and probit models. They are standard statistical techniques for estimating the probability of default based on historical data on loan performance and characteristics of the borrower. These techniques differ in that the linear probability model assumes there is a linear relationship between the probability of default and the factors; the logistic model assumes that the probability of default is logistically distributed; and the probit model assumes that the probability of default has a cumulative normal distribution [1]. Several financial decision-making methods based on machine learning use the multi-layer perceptron (MLP) as classifier (examples are Atiya [3]; Huang, Chen, Hsu, Chen, & Wu [4]; Lee, Chiu, Chou, & Lu [5]). Other tested classifiers are the Decision Tree and the Support Vector Machine [6][7][8]. These studies show that machine learning based systems are better than the traditional (statistical) methods for bankruptcy prediction and credit scoring problems (Huang et al. [4]; Ong, Huang, & Tzeng [9]; Vellido, Lisboa, & Vaughan [10]; Wong & Selvi [11]). In Tsai and Wu [12] the authors compare a single MLP classifier with multiple classifiers and diversified multiple classifiers on three datasets. However, they conclude that there is no clear winner.
The structure of this paper is as follows. Section 2 introduces related credit scoring models and the novel model, gr-GA-SVM, which is presented in this study. Section 3 gives the results of different models on two real credit datasets from the University of California Irvine Machine Learning Repository. Finally, conclusions are presented in Section 4.

Methods and Materials

SVM
Support vector machines (SVM) [13][14][15] are a set of related supervised learning methods used for classification and regression. A support vector machine constructs a hyperplane or set of hyperplanes in a high-dimensional space, which can be used for classification, regression or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (so-called functional margin), since in general the larger the margin the lower the generalization error of the classifier.
In order to extend the SVM methodology to handle data that is not fully linearly separable, we relax the constraints slightly to allow for misclassified points. This is done by introducing a positive slack variable. The formulation is given by (1.1) and (1.2), which can be combined into a single constraint.
The kernel functions considered are the Polynomial Kernel, the Radial Basis Kernel [11] and the Sigmoidal Kernel, where a and b are parameters defining the kernel's behavior.
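For reference, the soft-margin constraints are conventionally written as below. This is the standard textbook form and is assumed, rather than confirmed by the text, to be what (1.1) and (1.2) denote. The symbols $x_i$ (training vector), $y_i \in \{-1, +1\}$ (class label), $w$ and $b$ (hyperplane parameters) and $\xi_i$ (positive slack variable) follow the usual convention and are likewise assumptions.

```latex
\begin{align}
x_i \cdot w + b &\ge +1 - \xi_i && \text{for } y_i = +1 \tag{1.1}\\
x_i \cdot w + b &\le -1 + \xi_i && \text{for } y_i = -1 \tag{1.2}
\end{align}
% Multiplying each constraint by its label y_i gives a single combined form:
\begin{equation*}
y_i \,(x_i \cdot w + b) - 1 + \xi_i \ge 0, \qquad \xi_i \ge 0 \quad \forall i
\end{equation*}
```

Setting every $\xi_i = 0$ recovers the constraints for fully linearly separable data.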
In order to use SVM to solve a classification or regression problem on a dataset that is not linearly separable, we first need to choose a kernel and relevant parameters which we expect might map the data into a feature space where it is linearly separable. This is more of an art than an exact science and can be achieved empirically, e.g. by trial and error. Sensible kernels to start with are the Polynomial, Radial Basis and Sigmoid kernels.
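The three kernels named above can be sketched in plain Python. The functional forms below are the commonly used ones and are an assumption here. The names `polynomial_kernel`, `rbf_kernel`, `sigmoid_kernel` and the parameter `sigma` are illustrative, while `a` and `b` play the role of the kernel parameters mentioned in the text.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def polynomial_kernel(x, y, a=1.0, b=2):
    """Polynomial kernel (x . y + a) ** b; a is an offset, b the degree."""
    return (dot(x, y) + a) ** b


def rbf_kernel(x, y, sigma=1.0):
    """Radial basis kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, a=1.0, b=0.0):
    """Sigmoidal kernel tanh(a * x . y - b)."""
    return math.tanh(a * dot(x, y) - b)
```

Trial and error then amounts to training the classifier once per kernel and parameter setting and keeping the combination with the best held-out accuracy.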
For classification, we need to choose a value for the parameter C and find α so that the resulting SVM optimization problem is solved.
Genetic algorithms [17][18] are implemented in a computer simulation in which a population of abstract representations (called chromosomes or the genotype of the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations. Variable length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in genetic programming and graph-form representations are explored in evolutionary programming.
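The generational loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `run_ga`, the default rates, fitness-proportional selection and one-point crossover are all choices made for the example, and the fitness function is assumed to return non-negative values.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=50,
           p_crossover=0.8, p_mutation=0.02, seed=0):
    """Evolve bit-string individuals and return the best one seen."""
    rng = random.Random(seed)
    # Start from a population of randomly generated individuals.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # terminate after a fixed number of generations
        # Evaluate fitness; the small offset keeps all selection weights positive.
        weights = [fitness(ind) + 1e-9 for ind in population]
        new_population = []
        while len(new_population) < pop_size:
            # Stochastic selection based on fitness.
            p1, p2 = rng.choices(population, weights=weights, k=2)
            c1, c2 = p1[:], p2[:]
            if rng.random() < p_crossover:
                # One-point crossover; fixed-size parts align trivially.
                cut = rng.randint(1, n_bits - 1)
                c1 = p1[:cut] + p2[cut:]
                c2 = p2[:cut] + p1[cut:]
            for child in (c1, c2):
                # Bit-flip mutation.
                for i in range(n_bits):
                    if rng.random() < p_mutation:
                        child[i] = 1 - child[i]
                new_population.append(child)
        population = new_population[:pop_size]
        gen_best = max(population, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best
```

For instance, `run_ga(sum, n_bits=16)` searches for the bit string with the most ones; a satisfactory-fitness stopping rule could be added next to the generation limit.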
The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise. In some problems, it is hard or even impossible to define the fitness expression; in these cases, interactive genetic algorithms are used.
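The knapsack fitness just described is short enough to write out directly. The function name `knapsack_fitness` and the example values are illustrative assumptions.

```python
def knapsack_fitness(bits, values, sizes, capacity):
    """Sum of the values of the chosen objects, or 0 if they do not fit."""
    total_size = sum(size for bit, size in zip(bits, sizes) if bit)
    if total_size > capacity:
        return 0  # invalid representation: objects exceed the capacity
    return sum(value for bit, value in zip(bits, values) if bit)


# Three hypothetical objects with values 10, 40, 30 and sizes 5, 4, 6.
values = [10, 40, 30]
sizes = [5, 4, 6]
print(knapsack_fitness([1, 1, 0], values, sizes, capacity=10))  # 50
print(knapsack_fitness([1, 1, 1], values, sizes, capacity=10))  # 0
```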
Once we have the genetic representation and the fitness function defined, GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators.
Here $z_i$ or $z'_j$ is an example, or one feature vector for SVM. An intuitive idea is to find the important examples that affect the classification results greatly: if these feature vectors are removed, the separating boundary changes the most.
The key question is how to find these important training data among all the examples with GA.
2) Defining parameters [21][22]
The values of the parameters in Support Vector Machines are important to the algorithm's performance. Ángel Kuri-Morales and Iván Mejía-Guevara presented a methodology to train SVM where the regularization parameter (C) was determined automatically via an efficient Genetic Algorithm in order to solve multiple category classification problems.
In previous works, the support vectors have been determined from the application of Lagrange Multipliers (LM), but this approach is not applicable to the search for "C". In fact, GA are used to solve the constrained QP. One advantage of using GA for this kind of problem is that restrictions are not imposed on the form of the objective function: neither the objective function nor the constraints of the problem must be differentiable in order to solve it. In some cases, each individual represents a LM ($\alpha_i$, $i = 1, \dots, N$), where N is the number of points in the training set for the dual SVM problem.
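One way a GA can search for C is to let each chromosome encode a candidate value. The sketch below decodes a bit string to C on a logarithmic scale. The function name `decode_c` and the range from 2^-5 to 2^15 are hypothetical choices for illustration and are not taken from the cited methodology.

```python
def decode_c(bits, log2_min=-5.0, log2_max=15.0):
    """Map a bit-string chromosome to a value of C in [2**log2_min, 2**log2_max]."""
    as_int = int("".join(str(b) for b in bits), 2)
    max_int = 2 ** len(bits) - 1
    # Interpolate the exponent linearly, so C is spread evenly on a log scale.
    exponent = log2_min + (log2_max - log2_min) * as_int / max_int
    return 2.0 ** exponent
```

The fitness of such a chromosome would then be the classification accuracy of an SVM trained with the decoded C; that evaluation step is left out of this sketch.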
This algorithm combining GA and SVM has been applied in many fields, such as fault detection [23], protein sequences classification [24][25], network intrusion detection [26], daily flow forecasting [27], short-term load forecasting [28], evaluation of competitiveness of power plants [29], and stock index forecasting [30].

gr-GA-SVM
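Among the parameters of a genetic algorithm, the rate of crossover and the rate of mutation affect the speed of convergence. The rate of crossover affects the new population: the higher the rate of crossover, the faster new individuals are produced. A high rate of crossover will destroy the schemas of good individuals, while a low rate of crossover will postpone the production of new individuals. The rate of mutation is a key factor for the algorithm to step out from a local optimal solution. If the rate of mutation is too large, GA becomes a random search algorithm. If the rate of mutation is too small, it is difficult to produce new schemas for individuals. In this work, an adaptive technique is applied in a new model, gr-GA-SVM. In the new model, the rate of crossover and the rate of mutation are changeable: they depend on the fitness function, where $f_{avg}$ is the average of the fitness function during the processing and $f'$ is the value of the fitness function in one iteration.
The gr-GA-SVM algorithm can be shown in detail as follows.
Step 1: Parameter settings. Set parameters such as the population size PopSize, the crossover probability $P_c$ and the mutation probability $P_m$.
Step 2: Classify the dataset by SVM to obtain the value that will be the fitness function in the genetic algorithm.
Step 3: Set the fitness function.
The flowchart of the gr-GA-SVM algorithm is shown in Fig. 3 in detail.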
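To illustrate what fitness-dependent crossover and mutation rates can look like, here is a sketch of one widely used adaptive scheme. It is an assumption made for illustration and may differ from the formula used in gr-GA-SVM: the function name `adaptive_rates`, the maximum fitness `f_max` and the base rates `k_c` and `k_m` are all hypothetical.

```python
def adaptive_rates(f_prime, f_avg, f_max, k_c=0.9, k_m=0.1):
    """Return (crossover rate, mutation rate) for an individual with fitness f_prime.

    Below-average individuals keep the full base rates so they are disrupted
    often; above-average individuals get rates that shrink linearly to zero
    as f_prime approaches f_max, which protects good schemas.
    """
    if f_prime < f_avg or f_max == f_avg:
        return k_c, k_m
    scale = (f_max - f_prime) / (f_max - f_avg)
    return k_c * scale, k_m * scale


# Hypothetical population statistics: average fitness 0.70, best fitness 0.90.
print(adaptive_rates(0.60, 0.70, 0.90))  # below average: full base rates
print(adaptive_rates(0.90, 0.70, 0.90))  # best individual: rates drop to zero
```

In a scheme of this kind the rates are recomputed every generation from the current fitness statistics, instead of being fixed once in the parameter-setting step.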

Materials
In this study, the numerical experiments use two datasets, the German credit dataset and the Australian credit dataset, from the UCI Machine Learning Repository [31].
The German credit dataset has 1000 instances with twenty-four numerical attributes. The number of customers labeled "good" is 700, and the number labeled "bad" is 300. The Australian credit dataset has 690 instances, each with 14 attributes. This dataset is a good mix of attributes: continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are six numerical and eight categorical attributes in this dataset. There are 307 customers labeled "good" and 383 customers labeled "bad". In both datasets, the class label "0" denotes one of the two customer groups. We choose 800 and 500 instances as training datasets for the German and Australian datasets respectively; the rest of the instances are used for testing.

Results and discussion
To compare the performance of traditional SVM, GA-SVM and gr-GA-SVM, these models are run several times. Table 2 lists the appropriate values of the parameters in the three algorithms. In order to use SVM to solve credit scoring problems on the two datasets, which are not linearly separable, we first choose a radial basis kernel because we find that SVM based on the radial basis kernel is faster than SVM based on the polynomial kernel.
Table 3 shows the accuracy comparison of SVM, GA-SVM and gr-GA-SVM. For the German dataset, the accuracy of gr-GA-SVM is 75.50%, GA-SVM is 72.50% and SVM is 70.00%. For the Australian dataset, the accuracy of gr-GA-SVM is 86.84%, GA-SVM is 85.26% and SVM is 81.58%. The results of the empirical analysis show that the predictive ability of all the models is acceptable. However, gr-GA-SVM outperforms the other methods.
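The reported accuracies can be cross-checked against the test-set sizes implied by the split (1000 - 800 = 200 German and 690 - 500 = 190 Australian test instances). The counts of correct predictions below are inferred from the percentages and are an assumption; they are not reported in the source.

```python
def accuracy_percent(n_correct, n_test):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * n_correct / n_test, 2)


german_test = 1000 - 800      # instances left after taking 800 for training
australian_test = 690 - 500   # instances left after taking 500 for training

# Inferred (not reported) numbers of correctly classified test instances.
inferred_correct = {
    "German": {"gr-GA-SVM": 151, "GA-SVM": 145, "SVM": 140},
    "Australian": {"gr-GA-SVM": 165, "GA-SVM": 162, "SVM": 155},
}

for dataset, n_test in (("German", german_test), ("Australian", australian_test)):
    for model, n_correct in inferred_correct[dataset].items():
        print(dataset, model, accuracy_percent(n_correct, n_test))
```

Under these assumptions, gr-GA-SVM classifies six more German test instances correctly than GA-SVM, and three more Australian ones.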

Conclusion
In the last few decades, several credit scoring models have been developed for the credit granting decision. The objective of quantitative credit scoring models is to assign credit applicants to one of two groups: a "good credit" group that is likely to repay the financial obligation, or a "bad credit" group that should be denied credit because of a high likelihood of defaulting on the financial obligation.
Fig. 1. Optimal hyperplane