ABSTRACT
The rapid growth in data availability has brought new problems in consumer finance, including credit card fraud. There is an urgent need to study this fraudulent behavior and detect it reliably, which requires continuous analysis of raw transaction data [1]. Original datasets are hard to obtain because they contain personal information. In addition, credit card transaction data tend to be highly imbalanced, with fraudulent transactions forming a small minority. In this paper, the experiment is based on a single credit card dataset, and I assess the performance of three classification models that are commonly used on imbalanced data: logistic regression, random forests and boosted regression trees. Sampling methods are also applied to the data to improve the results. The experiment uses AUC, H-measure and detection rate as assessment metrics to identify the best classification model for credit card fraud detection.
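As a rough illustration of two of the metrics and of the sampling idea named above, the sketch below computes AUC and a detection rate on toy scores and balances a toy dataset by random oversampling. It is not the paper's code: the function names, the toy data, the 0.5 alert threshold, the reading of "detection rate" as the share of fraud cases flagged, and the choice of random oversampling (the abstract does not say which sampling methods were used) are all assumptions. The H-measure is not implemented here.

```python
import random

def auc(scores, labels):
    """Probability that a random fraud case outscores a random legitimate one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def detection_rate(scores, labels, threshold):
    """Assumed definition: share of fraud cases whose score reaches the alert threshold."""
    fraud_scores = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in fraud_scores) / len(fraud_scores)

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are the same size.

    Assumes binary 0/1 labels and at least one row in each class.
    """
    rng = random.Random(seed)
    minority = 1 if sum(labels) * 2 < len(labels) else 0
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    deficit = (len(labels) - len(minority_rows)) - len(minority_rows)
    extra = [rng.choice(minority_rows) for _ in range(deficit)]
    return rows + extra, labels + [minority] * deficit

# Hypothetical model scores for six transactions; label 1 marks fraud.
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1]

print(auc(toy_scores, toy_labels))                  # 0.875
print(detection_rate(toy_scores, toy_labels, 0.5))  # 0.5

balanced_rows, balanced_labels = random_oversample(toy_scores, toy_labels)
print(len(balanced_rows), sum(balanced_labels))     # 8 4
```

In practice, resampling would be applied to the training split only, so that the metrics are computed on data that keeps its original class ratio.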
- Richard J Bolton and David J Hand. Statistical fraud detection: A review. Statistical Science, pages 235–249, 2002.
- Niall M Adams, David J Hand, Giovanni Montana, David J Weston, and CW Whitrow. Fraud detection in consumer credit. EXPERT, 9(1):21, 2006.
- Gustavo EAPA Batista, Ronaldo C Prati, and Maria Carolina Monard. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1):20–29, 2004.
- Haibo He and Edwardo A Garcia. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263–1284, 2009.
- Nitesh V Chawla, Kevin W Bowyer, Lawrence O Hall, and W Philip Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357, 2002.
- Hui Han, Wen-Yuan Wang, and Bing-Huan Mao. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In International Conference on Intelligent Computing, pages 878–887. Springer, 2005.
- Trevor Fitzpatrick and Christophe Mues. An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market. European Journal of Operational Research, 249(2):427–439, 2016.
- Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, and J Christopher Westland. Data mining for credit card fraud: A comparative study. Decision Support Systems, 50(3):602–613, 2011.
- Gary M Weiss and Foster Provost. The effect of class distribution on classifier learning: an empirical study. 2001.
- David J Hand and Christoforos Anagnostopoulos. A better beta for the H measure of classification performance. Pattern Recognition Letters, 40:41–46, 2014.
- Foster Provost. Machine learning from imbalanced data sets 101. In Proceedings of the AAAI 2000 Workshop on Imbalanced Data Sets, volume 68, pages 1–3. AAAI Press, 2000.