Integration of Iterative Dichotomizer 3 and Boosted Decision Tree to Form Credit Scoring Profile

Loans are becoming an essential need in modern life. Banks need to keep their non-performing loan (NPL) ratio low in order to maintain their financial health. One customer screening technique is credit scoring. This study builds a credit scoring profile by integrating Iterative Dichotomizer 3 (ID3) and Boosted Decision Tree. A decision tree is a simple method that classifies a condition into two different classes using a given classifier, and it is widely used for credit scoring in the financial industry. We integrated the Iterative Dichotomizer 3 and Boosted Decision Tree methods and used Microsoft Azure Machine Learning tools to perform credit score profiling. The study is cross-sectional in time and uses 600 instances of loan submission data from Tangerang, Indonesia. The model performs well, with accuracy, precision, recall, and F1 score of 0.85, 0.885, 0.793, and 0.836, respectively.
Keywords: Boosted Decision Tree, Credit Scoring, Iterative Dichotomizer 3


I. INTRODUCTION
Loans are becoming an essential need in modern life. We may apply for a loan for almost every financial need, for example a car loan, home loan, business loan, or student loan. In some cases, debtors cannot pay their loans back. Banks refer to such a loan as a non-performing loan (NPL). Indonesia's NPL ratio is one of the highest among ASEAN countries [1]. It was 2.73% as of October 2019, compared to 2.2% for the Philippines, 1.6% for Malaysia, 1.3% for Singapore, and 2% for Vietnam.
Banks need to keep their NPL ratio low in order to maintain their financial health. Screening and profile analysis of new customers are therefore mandatory. One screening technique is credit scoring. Credit scoring is an efficient method for measuring the systematic risk of financing individual customers as well as small and medium-sized enterprises (SMEs) [2].
In this study, we use the Iterative Dichotomizer 3 and two-class boosted decision tree techniques to develop a credit scoring method, and we analyze its advantages compared to other decision tree techniques.

II. LITERATURE STUDIES
In essence, a decision tree is a simple method that classifies a condition into two different classes using a given classifier. For example, suppose we classify balls into two classes called "big" and "small" using the rule "if the diameter is under 10 cm, the ball is small; if the diameter is 10 cm or above, it is big". The decision tree method has been developed and extended into many variants to classify specific conditions, including credit scoring in the lending and banking industries. A boosted decision tree method was used to develop a credit scoring model that helps lenders decide whether to grant or reject credit to applicants [3]. Boosted decision trees are a technique in which the resulting class from a decision tree is weighted and used to build a new classifier, which expands the tree and gives a more specific result based on more specific classifiers. Figure 2 illustrates this idea.

Figure 2. Boosted Decision Tree
Boosting is a procedure that aggregates many "weak" classifiers in order to build a new "strong" classifier. One boosting technique is AdaBoost, or Adaptive Boosting, proposed by Yoav Freund and Robert Schapire in 1996. The boosting process builds a model from the training data and then creates a second model that attempts to correct the errors of the first model. This process is repeated until the training data are predicted perfectly. Each instance of the training dataset is weighted. The initial weight is set to weight(x_i) = 1/n, where x_i is the i-th training instance and n is the number of training instances. According to Bastos, boosted decision trees outperformed the multilayer perceptron and support vector machines on two real-world credit card application datasets.
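The boosting loop described above can be sketched as follows. This is a minimal illustration, assuming one-dimensional inputs, labels in {-1, +1}, and single-threshold "stumps" as the weak classifiers. The function names, the toy data, and the cap on the number of rounds are assumptions made for the example; it is not the Azure implementation used in this study.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak classifier: predict `polarity` when x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def ensemble_predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

def adaboost(xs, ys, max_rounds=10):
    n = len(xs)
    weights = [1.0 / n] * n  # initial weight(x_i) = 1/n
    ensemble = []
    for _ in range(max_rounds):  # round cap is an assumption of this sketch
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified instances, lower the others.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Stop once the training data are predicted perfectly.
        if all(ensemble_predict(ensemble, x) == y for x, y in zip(xs, ys)):
            break
    return ensemble

# Hypothetical toy data: no single threshold separates the two classes.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys)
predictions = [ensemble_predict(ensemble, x) for x in xs]
```

On this toy data no single stump is correct on more than seven of the ten instances, yet the weighted vote of three stumps classifies all ten correctly, which is the sense in which weak classifiers are aggregated into a strong one.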
Another credit scoring analysis integrated decision tree and neural network techniques in a Decision Tree-Neuro Based Credit Risk Evaluation System [4]. The authors combined the advantages of decision trees, such as being easy to understand and fast to learn, with the advantages of neural networks, such as the ability to handle noisy training data. As shown in Figure 3, the decision tree technique was used to handle the bank's rules and criteria for granting loans to customers, and its output was further processed with the neural network technique to make the final loan approval decision. They found that the accuracy rate of the decision tree-neuro based algorithm was 0.88, higher than the decision tree's 0.68 and the neural network's 0.75.
The Iterative Dichotomizer 3 (ID3) decision tree has also been used to develop a credit scoring analyzer [5]. The ID3 algorithm, invented by John Ross Quinlan in 1986, aims to build the shallowest decision tree possible. Two values shape the tree: the entropy value and the information gain value. The entropy value determines whether a node will be split (entropy closer to 1) or not (entropy closer to 0). When the entropy is zero, the node determines the class and becomes a leaf of the tree. When the entropy is closer to one, the node should be split, and a new node is formed using the attribute with the highest information gain.
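A minimal sketch of how ID3 might pick the attribute to split on, assuming categorical attributes stored in dictionaries. The four rows below are hypothetical and only borrow two attribute names from the kind of data used in credit scoring; they are not taken from the study's dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(group) / len(rows) * entropy(group)
                    for group in groups.values())
    return entropy([row[target] for row in rows]) - remainder

# Hypothetical rows, for illustration only.
rows = [
    {"car": "yes", "mortgage": "no",  "approved": "yes"},
    {"car": "yes", "mortgage": "yes", "approved": "yes"},
    {"car": "no",  "mortgage": "no",  "approved": "no"},
    {"car": "no",  "mortgage": "yes", "approved": "no"},
]

gains = {attr: information_gain(rows, attr, "approved")
         for attr in ("car", "mortgage")}
best_attribute = max(gains, key=gains.get)  # attribute ID3 would split on
```

Here the target starts with an entropy of 1 (two approved, two rejected). Splitting on `car` yields two pure groups with zero entropy, so both become leaves, whereas splitting on `mortgage` leaves the entropy unchanged and gains nothing.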
Suppose we have the dataset shown in Table 1. We can build a decision tree using the ID3 algorithm, as shown in Figure 4. Other studies on credit scoring have used neural network techniques [6], segmentation techniques [7], and fuzzy techniques [2], [8], [9].

III. METHODOLOGY
This study is cross-sectional in time and uses 600 instances of loan submission data from Tangerang, Indonesia. The data were normalized and had the attributes CIFno, age, gender, region, income, marital status, number of children, car ownership, savings account ownership, checking account ownership, mortgage, and loan approval. The dataset was then processed using Microsoft Azure Machine Learning, with 480 of the 600 instances used as the training dataset. The accuracy and precision rates were then analyzed.

IV. RESULTS AND DISCUSSION
Microsoft Azure names its boosted decision tree feature the two-class boosted decision tree. The name distinguishes it from the multiclass boosted decision tree feature: the two-class boosted decision tree fits binary classification problems, whereas the multiclass boosted decision tree may handle more complex classification better.
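First, we calculate the entropy value of the data instances using the formula

\[ E(S) = -\sum_{i=1}^{c} p_i \log_2 p_i \]

where c is the number of values of the target attribute (classes) and p_i is the proportion of instances in S that belong to class i.

For an attribute with many outcomes, information gain tends to be biased: it prefers attributes with a large number of distinct values. Gain ratio handles this bias by normalizing the information gain using split information. For a particular attribute A that partitions S into v subsets S_1, ..., S_v, split information can be calculated using the formula

\[ \mathrm{SplitInfo}_A(S) = -\sum_{j=1}^{v} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|} \qquad (4) \]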
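The effect of this normalization can be sketched as follows, assuming that gain ratio is the information gain divided by the split information. The rows are hypothetical; `cif_no` stands in for a unique customer identifier such as CIFno, which takes a different value for every instance.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a list of categorical values."""
    total = len(values)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(values).values())

def gain_and_ratio(rows, attribute, target):
    """Return (information gain, split information, gain ratio) for one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(group) / len(rows) * entropy(group)
                    for group in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy([row[attribute] for row in rows])
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, split_info, ratio

# Hypothetical rows, for illustration only.
rows = [
    {"cif_no": 1, "car": "yes", "approved": "yes"},
    {"cif_no": 2, "car": "yes", "approved": "yes"},
    {"cif_no": 3, "car": "no",  "approved": "no"},
    {"cif_no": 4, "car": "no",  "approved": "no"},
]

id_gain, id_split, id_ratio = gain_and_ratio(rows, "cif_no", "approved")
car_gain, car_split, car_ratio = gain_and_ratio(rows, "car", "approved")
```

Both attributes reach the maximum information gain of 1, although the identifier does so only because every instance forms its own subset. Its split information is log2(4) = 2 against 1 for `car`, so its gain ratio drops to 0.5 while `car` keeps a gain ratio of 1.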
We used Microsoft Azure Machine Learning to build the trees with the parameters shown in Table 2. After a node is formed, we evaluate and boost the previous node to form the next node using the AdaBoost algorithm.

V. MODEL PERFORMANCE EVALUATION
We used four metrics to evaluate the performance of our model: accuracy, precision, recall, and F1 score. In the formulas below, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives.

Accuracy is the most intuitive performance measure; it is simply the ratio of correctly predicted observations to the total number of observations. Accuracy can be calculated using the formula

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5) \]

The second metric is precision, which is the ratio of correctly predicted positive observations to the total number of predicted positive observations. Precision can be calculated using the formula

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \qquad (6) \]

The third metric is recall (sensitivity), which is the ratio of correctly predicted positive observations to all observations in the actual class "YES". A good recall should have a value of 0.5 or above. Recall can be calculated as

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \qquad (7) \]

The fourth metric is the F1 score, which is the harmonic mean of precision and recall. This score therefore takes both false positives and false negatives into account. If the class distribution is uneven, the F1 score gives a better picture than accuracy, while accuracy works best when false positives and false negatives have similar costs. The F1 score can be calculated as

\[ F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (8) \]

Figure 6 shows the complete performance of our model. The results show that our mixed method yields high accuracy, precision, recall, and F1 score of 0.85, 0.885, 0.793, and 0.836, respectively.
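The four metrics can be computed from confusion-matrix counts, as in the short sketch below, where TP, FP, FN, and TN are the numbers of true positives, false positives, false negatives, and true negatives. The counts are hypothetical: the study does not report its confusion matrix, and these values were chosen only because, assuming the 120 instances not used for training formed the evaluation set, they reproduce the reported figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts (assumption, not reported in the study): 120 instances in total.
accuracy, precision, recall, f1 = classification_metrics(tp=46, fp=6, fn=12, tn=56)
rounded = [round(value, 3) for value in (accuracy, precision, recall, f1)]
```

With these assumed counts the rounded values are 0.85, 0.885, 0.793, and 0.836. Other confusion matrices could round to the same values, so the counts should not be read as the study's actual results.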
Compared to other methods such as neural networks, a decision tree is still a better fit for binary classification tasks such as credit scoring, since the final output has only two classes, approved and rejected. Although neural networks are capable of handling complex attributes and scenarios, they still have limitations, especially their black-box nature, which makes them difficult to explain [6].
A combination of decision tree and neural network may perform better, especially for cases with multi-class attributes [4]. The decision tree part may provide clear and distinct inputs to the perceptrons of the neural network part, which makes the model much more adaptive in handling complex decision-making cases.

VI. LIMITATION AND FUTURE WORK
We integrated the Iterative Dichotomizer 3 and Boosted Decision Tree methods to form a credit scoring profile, and the results show that this technique performs well. However, the boosted decision tree is one of the more memory-intensive learners, and the current implementation uses a relatively large amount of memory. We therefore suggest continuing this research by combining the approach with another method to reduce this limitation and to improve performance on larger datasets.