Bank Transaction Analyze for Recognize of Money Laundering Using Decision Tree Algorithm (DTA)

One of the major problems facing banking systems is illegal transactions based on fraud and money laundering, which can damage the economy and financial foundation of a country. Fraud and money laundering are used to evade taxes or to inject dirty money into the economic cycle. Offenders use large numbers of transactions to make their illegal funds appear legitimate. A key difficulty in recognizing fraud and money laundering in the banking system is their high complexity, which calls for knowledge discovery methods such as machine learning and data mining. Hidden patterns of illegal transactions can largely be recognized with machine learning or data mining methods such as the Bayesian Network, Decision Tree, Support Vector Machine or Artificial Neural Network. To recognize these patterns, machine learning methods classify banking transactions into two categories, Normal and Abnormal, and need suitable features to increase their accuracy. Feature selection is therefore very important in bank fraud recognition. It is also a kind of optimization problem, because selecting suitable features lowers the recognition error of techniques such as the Decision Tree. In this research we use the Decision Tree Algorithm, which makes decisions, discovers knowledge and detects bank fraud quickly, together with the selection of important features related to bank fraud. When the features of bank fraud are selected correctly and accurately, the accuracy of the Decision Tree increases, so the most important features must be selected to obtain better accuracy in recognizing fraud and money laundering.


Introduction
Nowadays, most financial and banking services are provided through the Internet; in other words, e-commerce depends heavily on web banking to advance its goals. In e-banking, money transfer services are carried out over the Internet with great ease [1]. This has led to a high level of acceptance of banking services on the Internet, and the volume of banking transactions is increasing day by day. Despite the variety of e-banking services [2], e-banking faces challenges such as fraud and money laundering that limit its use. These challenges can be addressed with different machine learning methods. One such method is the decision tree, which uses a tree structure to make decisions and discover knowledge [3] and detects bank fraud quickly. However, its accuracy depends on the choice of important features associated with banking transactions. In other words, the accuracy of the decision tree in detecting bank fraud increases [4] when its features are selected correctly and accurately and the more important features are used for learning. The purpose of this article is to increase the accuracy of knowledge discovery methods such as the decision tree using swarm intelligence methods, to increase trust in electronic banking and to combat tax evasion [5]. In this paper, it is shown that with the help of a technique such as the decision tree, the average error in distinguishing money laundering transactions from normal ones can be reduced.

Decision Tree
The decision tree is a supervised learning method used for classification and pattern recognition. In this data mining method, a tree structure is used to analyze information.
In its simplest form, a decision tree for classification defines a set of rules on its nodes. The tree is traversed according to these rules, and each data item or instance is assigned to a leaf, which carries a class label, based on the attributes tested along the way.
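The traversal described above can be illustrated with a minimal sketch. The feature names (amount, transfers_last_24h), the thresholds and the tree itself are hypothetical and chosen only for illustration; they are not rules learned from the data used in this paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class assigned to every instance that reaches this leaf


@dataclass
class Node:
    feature: str      # attribute tested at this node
    threshold: float  # rule: go left if value <= threshold, otherwise go right
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree, instance):
    """Walk from the root to a leaf and return that leaf's class label."""
    node = tree
    while isinstance(node, Node):
        if instance[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Hypothetical rules, for illustration only.
tree = Node(
    "amount", 10_000.0,
    left=Leaf("Normal"),
    right=Node("transfers_last_24h", 5.0,
               left=Leaf("Normal"),
               right=Leaf("Abnormal")),
)

print(classify(tree, {"amount": 25_000.0, "transfers_last_24h": 9}))  # Abnormal
```

Each internal node holds one rule and each leaf holds one class, so classifying an instance costs one comparison per level of the tree.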
An instance moves left or right down the tree according to the rule at each node until it reaches a leaf, which assigns it to its class. The steps of the recommended flowchart for selecting important features in detecting bank fraud and classifying transactions, based on the decision tree mechanism and the grasshopper optimization algorithm, are shown in Figure 1:
START
Use the optimal feature vector to model the decision tree algorithm for detecting fraud
Set the initial parameters, such as the size of the initial population and the number of iterations of the grasshopper algorithm
Form an initial population of grasshoppers in which each grasshopper is a subset of the features
Use the grasshoppers, with their corresponding feature vectors, for machine learning
Evaluate each grasshopper, through its corresponding feature vector, with the objective function of the problem
Determine the optimal grasshopper as the feature vector with the least error in classifying fraudulent transactions
Update each grasshopper and its corresponding feature vector with the grasshopper algorithm
Map each solution's feature vector to binary form
Repeat the above steps and finally select the optimal grasshopper feature vector of the population
END

Figure 1: Recommended flowchart for detecting fraud and money laundering in the banking system
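The flowchart step that maps each solution to binary form is not specified further in the text. One common choice, assumed here purely for illustration, is to pass each coordinate of a continuous position vector through a sigmoid and keep the feature when the result exceeds a threshold. The function name, the threshold of 0.5 and the feature names below are hypothetical.

```python
import math


def to_feature_mask(position, threshold=0.5):
    """Map a continuous position vector to a binary feature mask.

    Assumption: a sigmoid transfer function with a fixed threshold;
    the source does not state which mapping is used.
    """
    mask = []
    for x in position:
        squashed = 1.0 / (1.0 + math.exp(-x))  # value in (0, 1)
        mask.append(1 if squashed > threshold else 0)
    return mask


# Hypothetical feature names and position, for illustration only.
names = ["amount", "hour", "branch", "account_age"]
position = [2.3, -1.1, 0.4, -3.0]

mask = to_feature_mask(position)
selected = [name for name, keep in zip(names, mask) if keep]
print(mask)      # [1, 0, 1, 0]
print(selected)  # ['amount', 'branch']
```

With a threshold of 0.5 the rule reduces to keeping a feature exactly when its coordinate is positive, which makes the mapping deterministic and easy to check.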
In the recommended flowchart, a population of feature vectors is created from the banking transaction data. Each vector is treated as a grasshopper and evaluated by the objective function. The flowchart has two layers. In the first layer, feature selection is performed by the grasshopper optimization algorithm. The selected features are then passed to the second layer, the decision tree, which classifies the transactions and reports the error in distinguishing legal from illegal transactions. The grasshopper optimization algorithm then selects other features and the model error rate is recalculated. In this way the algorithm tries to reduce the error rate as much as possible by selecting important features.
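The two-layer loop can be sketched as follows, under several loudly stated assumptions. The position update is a simplified random stand-in and is not the grasshopper optimization update. A one-split decision tree (a stump) stands in for the full decision tree. The per-feature penalty, the toy data and all function names are hypothetical.

```python
import random


def stump_error(X, y, mask):
    """Layer 2 stand-in: training error of the best one-split tree on the selected features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # no feature selected: treat as the worst possible error
    best = 1.0
    for j in cols:
        for t in sorted({row[j] for row in X}):
            wrong = sum((row[j] > t) != bool(label) for row, label in zip(X, y))
            err = wrong / len(y)
            best = min(best, err, 1.0 - err)  # either leaf may carry either class
    return best


def objective(X, y, mask, penalty=0.01):
    # Assumption: a small per-feature penalty so that smaller subsets win ties.
    return stump_error(X, y, mask) + penalty * sum(mask)


def select_features(X, y, n_agents=8, n_iter=20, seed=0):
    rng = random.Random(seed)
    n_feat = len(X[0])
    # Layer 1 state: a population of binary feature masks.
    # The first agent keeps every feature; the others are random.
    pop = [[True] * n_feat]
    pop += [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(n_agents - 1)]
    best_mask, best_score = None, float("inf")
    for _ in range(n_iter):
        for mask in pop:
            score = objective(X, y, mask)  # classifier error for this feature subset
            if score < best_score:
                best_mask, best_score = list(mask), score
        # Simplified stand-in for the grasshopper position update:
        # each agent copies the best mask and re-draws each bit with probability 0.3.
        pop = [[bit if rng.random() < 0.7 else (rng.random() < 0.5) for bit in best_mask]
               for _ in range(n_agents)]
    return best_mask, best_score


# Toy data: column 0 (amount) separates the classes; columns 1 and 2 carry no signal.
X = [[100, 1, 7], [200, 2, 7], [300, 3, 8], [400, 4, 8],
     [9000, 1, 7], [9500, 2, 7], [9900, 3, 8], [9990, 4, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

mask, score = select_features(X, y)
print(mask, score)
```

The search only needs an objective that maps a feature mask to an error value, so the stump can be replaced by any classifier, and the stand-in update by the actual grasshopper update, without changing the outer loop.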

Simulation review results
To analyze the recommended method, its accuracy needs to be compared with that of several data mining methods for detecting bank fraud. The Bayesian Network, Decision Tree, Support Vector Machine and Artificial Neural Network are practical methods for classifying and detecting bank fraud that can be compared with the recommended method. The Weka software can apply classification methods to bank fraud data. Several outputs of these techniques for detecting bank fraud in Weka can be seen in Figures (11-4), (12-4), (13-4) and (14-4). The analysis of the graphs shows that the lowest sensitivity belongs to the Bayesian Network and the highest sensitivity belongs to the Support Vector Machine and the Artificial Neural Network.
In the next section, by selecting the appropriate features, an attempt is made to increase the sensitivity and accuracy of the recommended method in detecting bank fraud and to increase the efficiency of the decision tree for detecting fraud.
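The text does not define the accuracy and sensitivity indices. The sketch below assumes the standard definitions: accuracy is the share of all transactions classified correctly, and sensitivity is the share of fraudulent transactions that are detected. The labels are made-up values for illustration and are not results from the experiments.

```python
def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity), treating `positive` as the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # fraud, flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # fraud, missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # legal, passed
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Made-up labels: 1 = fraudulent, 0 = legal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(acc)   # 0.75
print(sens)  # 2 of the 3 fraudulent transactions are detected
```

Because fraudulent transactions are usually rare, a classifier can reach high accuracy while missing most fraud, which is why sensitivity is reported alongside accuracy.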
To analyze the recommended algorithm in distinguishing fake transactions from legal ones based on feature selection and error reduction, the variables and parameters of the grasshopper optimization algorithm must be adjusted and quantified. It is observed that increasing the population size leads to a greater reduction in the error rate, possibly because more optimal features are selected.
Figure 14: Comparison of the sensitivity index of the recommended method and other methods in detecting bank fraud

Conclusion
According to the experiments, the average error of the decision tree is about 0.542, while the error of the recommended method is 0.426. The recommended method was thus able to reduce the error by 18.70%.
The analysis of the accuracy and sensitivity indices also showed that the decision tree has an accuracy of 63.70, while the accuracy of the recommended method is 73.20 and its sensitivity is 77.20. In other words, the recommended method improves the accuracy index of the decision tree in detecting fraud by about 21,19% and its sensitivity index by 21%.
Our analysis and evaluation show that the recommended method has the highest accuracy, the Bayesian Network technique is in second place, and the worst value of the accuracy index belongs to the Support Vector Machine.
On the other hand, the sensitivity index analysis shows that the highest sensitivity belongs to the recommended method and the lowest to the Bayesian Network.

DECLARATIONS
• Ethics approval and consent to participate: The participation in and publication of this research and the knowledge within it have been approved on ethical grounds. No problem was found with the publication of this paper, and it is fully permitted.
• Availability of data and materials: All of the data and materials included and used in this research are available.