Modeling Insurance Fraud Detection Using Imbalanced Data Classification

Hassan, Amira Kamil Ibrahim; Abraham, Ajith

doi:10.1007/978-3-319-27400-3_11

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 419)

Abstract

This paper proposes an insurance fraud detection method designed to cope with imbalanced class distributions. Fraud detection models are built with a decision tree (DT), a support vector machine (SVM) and an artificial neural network (ANN) on data partitions obtained by under-sampling the majority class (with and without replacement) and merging each sample with the minority class. Ten-fold cross-validation is used for testing throughout the paper. The originality of the approach lies in trying several partitioning under-sampling approaches and choosing the best one. Results on a publicly available automobile insurance fraud detection data set show that DT performs slightly better than the other algorithms, so the DT model was used to compare the different partitioning under-sampling approaches. Empirical results indicate that the proposed model gave better results.
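The partitioning step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `undersample_partitions`, the choice to make each majority sample the same size as the minority class, the number of partitions, and the toy records are all assumptions made for illustration. Training the DT, SVM or ANN on each partition and the ten-fold cross-validation are left out.

```python
import random

def undersample_partitions(majority, minority, n_partitions, with_replacement, seed=0):
    """Build balanced training partitions from an imbalanced data set.

    Each partition is a sample of the majority class merged with all
    minority records. Sample size = len(minority) is an assumption of
    this sketch, not a detail taken from the paper.
    """
    rng = random.Random(seed)
    size = len(minority)
    if with_replacement:
        # Each partition is drawn independently from the full majority
        # class, so a record may recur within or across partitions.
        samples = [rng.choices(majority, k=size) for _ in range(n_partitions)]
    else:
        # Shuffle once and cut disjoint slices, so no majority record
        # is used in more than one partition.
        if n_partitions * size > len(majority):
            raise ValueError("not enough majority records for disjoint partitions")
        shuffled = majority[:]
        rng.shuffle(shuffled)
        samples = [shuffled[i * size:(i + 1) * size] for i in range(n_partitions)]
    return [sample + minority for sample in samples]

# Hypothetical toy data: (record_id, label), where label 1 marks a fraudulent claim.
legit = [(i, 0) for i in range(94)]
fraud = [(i, 1) for i in range(94, 100)]

parts = undersample_partitions(legit, fraud, n_partitions=5, with_replacement=False)
parts_wr = undersample_partitions(legit, fraud, n_partitions=3, with_replacement=True)
```

In this sketch every partition holds equal numbers of legitimate and fraudulent records, so a classifier trained on one partition sees a balanced class distribution. Comparing approaches, as the abstract describes, would then amount to training the same model on partitions built each way and comparing cross-validated scores.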

Author information

Authors and Affiliations

Department of Computer Science, Sudan University of Science and Technology, Khartoum, Sudan
Amira Kamil Ibrahim Hassan & Ajith Abraham
Machine Intelligence Research Labs (MIR Labs), Auburn, WA, USA
Amira Kamil Ibrahim Hassan
IT4Innovations, VSB - Technical University of Ostrava, Ostrava, Czech Republic
Ajith Abraham

Corresponding author

Correspondence to Amira Kamil Ibrahim Hassan.

Editor information

Editors and Affiliations

School of Mathematics, Statistics and CS, University of KwaZulu-Natal, Pietermaritzburg, South Africa
Nelishia Pillay
Sch of IT, Dept of CS, University of Pretoria, Pretoria, South Africa
Andries P. Engelbrecht
Sci N/w for Innova and Research Exc, Machine Intelligence Research Labs, Auburn, Washington, USA
Ajith Abraham
Department of Computing Sciences, Nelson Mandela Metropolitan University, Port Elizabeth, South Africa
Mathys C. du Plessis
Faculty of Electrical Engineering and, Technical University of Ostrava, Ostrava-Poruba, Czech Republic
Václav Snášel
Fakulti Teknologi Maklumat dan Komunikas, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Malaysia
Azah Kamilah Muda

About this paper

Cite this paper

Hassan, A.K.I., Abraham, A. (2016). Modeling Insurance Fraud Detection Using Imbalanced Data Classification. In: Pillay, N., Engelbrecht, A., Abraham, A., du Plessis, M., Snášel, V., Muda, A. (eds) Advances in Nature and Biologically Inspired Computing. Advances in Intelligent Systems and Computing, vol 419. Springer, Cham. https://doi.org/10.1007/978-3-319-27400-3_11

DOI: https://doi.org/10.1007/978-3-319-27400-3_11
Published: 18 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27399-0
Online ISBN: 978-3-319-27400-3
eBook Packages: Computer Science, Computer Science (R0)