Research article
DOI: 10.1145/3446922.3446925

A Comparison between Classifiers on Credit Card Fraud Detection Problem

Published: 19 March 2021

ABSTRACT

The rapid expansion of data availability gives rise to several consumer finance problems, including credit card fraud. There is an urgent need to examine this fraudulent behavior and detect it successfully, so continuous analysis of raw data is necessary [1]. It should be noted that original datasets are hard to obtain because they contain personal information. In addition, credit card transaction data tend to be highly imbalanced. In this paper, the experiment is based on one credit card dataset, and I mainly assess the performance of three classification models that are commonly used in imbalanced settings: logistic regression, random forests, and boosted regression trees. Sampling methods are also applied to the data to improve the results. The experiment adopts AUC, the H-measure, and detection rate as assessment metrics, aiming to identify the best classification model for credit card fraud detection.
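To make the experimental setup concrete, the sketch below shows how such a comparison could be run with scikit-learn and imbalanced-learn. It is a minimal illustration, not the paper's actual code: the file name creditcard.csv, the "Class" label, the split ratio, and all hyperparameters are assumptions, and the H-measure [10] is omitted because scikit-learn does not provide it.

# Hedged sketch of the classifier comparison described above.
# Assumptions: a CSV with anonymized features and a binary "Class" column
# (1 = fraud); illustrative hyperparameters only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, recall_score
from imblearn.over_sampling import SMOTE

data = pd.read_csv("creditcard.csv")
X, y = data.drop(columns=["Class"]), data["Class"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Oversample the minority (fraud) class in the training set only,
# so the test set keeps its original imbalance.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "boosted trees": GradientBoostingClassifier(random_state=42),
}

for name, model in models.items():
    model.fit(X_res, y_res)
    scores = model.predict_proba(X_test)[:, 1]
    auc = roc_auc_score(y_test, scores)
    # Detection rate: recall on the fraud class at a 0.5 threshold.
    detection = recall_score(y_test, (scores >= 0.5).astype(int))
    print(f"{name}: AUC = {auc:.3f}, detection rate = {detection:.3f}")

Fitting each model on the resampled training data while scoring on the untouched test split mirrors the usual practice for evaluating classifiers on imbalanced data [5, 11].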

References

  1. Richard J. Bolton and David J. Hand. Statistical fraud detection: A review. Statistical Science, pages 235–249, 2002.
  2. Niall M. Adams, David J. Hand, Giovanni Montana, David J. Weston, and C. W. Whitrow. Fraud detection in consumer credit. EXPERT, 9(1):21, 2006.
  3. Gustavo E. A. P. A. Batista, Ronaldo C. Prati, and Maria Carolina Monard. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1):20–29, 2004.
  4. Haibo He and Edwardo A. Garcia. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263–1284, 2009.
  5. Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357, 2002.
  6. Hui Han, Wen-Yuan Wang, and Bing-Huan Mao. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In International Conference on Intelligent Computing, pages 878–887. Springer, 2005.
  7. Trevor Fitzpatrick and Christophe Mues. An empirical comparison of classification algorithms for mortgage default prediction: Evidence from a distressed mortgage market. European Journal of Operational Research, 249(2):427–439, 2016.
  8. Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, and J. Christopher Westland. Data mining for credit card fraud: A comparative study. Decision Support Systems, 50(3):602–613, 2011.
  9. Gary M. Weiss and Foster Provost. The effect of class distribution on classifier learning: An empirical study. 2001.
  10. David J. Hand and Christoforos Anagnostopoulos. A better beta for the H measure of classification performance. Pattern Recognition Letters, 40:41–46, 2014.
  11. Foster Provost. Machine learning from imbalanced data sets 101. In Proceedings of the AAAI 2000 Workshop on Imbalanced Data Sets, volume 68, pages 1–3. AAAI Press, 2000.

Published in

EBEE '20: Proceedings of the 2020 2nd International Conference on E-Business and E-commerce Engineering
December 2020
79 pages
ISBN: 9781450388924
DOI: 10.1145/3446922

Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


