A support vector machine (SVM) approach to imbalanced datasets of customer responses: comparison with other customer response models

Kim, Gitae; Chae, Bongsug Kevin; Olson, David L.

doi:10.1007/s11628-012-0147-9

Published: 24 May 2012

Volume 7, pages 167–182, (2013)

Service Business

Abstract

Customer response is a crucial aspect of service business. The ability to accurately predict which customer profiles are productive has proven invaluable in customer relationship management. An area that has received little attention in the literature on direct marketing is the class imbalance problem (the very low response rate). We propose a customer response predictive model that combines recency, frequency, and monetary (RFM) variables with support vector machine analysis. We identified three sets of direct marketing data with different degrees of class imbalance (little, moderate, high) and used random undersampling to reduce the degree of imbalance. We report the empirical results in terms of gain values and prediction accuracy, as well as the impact of random undersampling on customer response model performance. We also discuss these empirical results in light of the findings of previous studies and the implications for industry practice and future research.
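
The two ideas named in the abstract, random undersampling and gain values, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_undersample` and `cumulative_gain`, the `ratio` and `seed` parameters, the binary 0/1 response labels, and the toy RFM tuples are all assumptions made for illustration, and the abstract does not state how the gain values were computed in the paper.

```python
import random
from collections import Counter


def random_undersample(rows, labels, ratio=1.0, seed=0):
    """Randomly drop majority-class rows (assumes binary labels).

    Keeps every minority-class row and at most ratio * (minority count)
    majority-class rows. `ratio` and `seed` are illustrative parameters.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    n_keep = min(int(ratio * len(minority_idx)), len(majority_idx))
    kept = rng.sample(majority_idx, n_keep)
    idx = sorted(minority_idx + kept)  # preserve the original row order
    return [rows[i] for i in idx], [labels[i] for i in idx]


def cumulative_gain(scores, labels, fraction):
    """Share of all responders (label 1) found in the top `fraction`
    of customers when ranked by model score, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_n = int(round(fraction * len(order)))
    captured = sum(labels[i] for i in order[:top_n])
    total = sum(labels)
    return captured / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical (recency, frequency, monetary) rows with a 5% response rate.
    rows = [(i % 30, i % 7, 1.5 * i) for i in range(100)]
    labels = [1 if i < 5 else 0 for i in range(100)]
    balanced_rows, balanced_labels = random_undersample(rows, labels)
    print(Counter(balanced_labels))
    print(cumulative_gain([0.9, 0.8, 0.1, 0.2], [1, 0, 0, 1], 0.5))
```

In a setup like this, the undersampled rows would be used only to train the classifier, while gain would be measured on an untouched hold-out set so that the reported response rates still reflect the original class imbalance.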

Author information

Authors and Affiliations

Department of Industrial and Manufacturing Systems Engineering, Kansas State University, Manhattan, KS, USA
Gitae Kim
Department of Management, Kansas State University, Manhattan, KS, USA
Bongsug Kevin Chae
Department of Management, University of Nebraska, Lincoln, NE, USA
David L. Olson

Corresponding author

Correspondence to Bongsug Kevin Chae.

About this article

Cite this article

Kim, G., Chae, B.K. & Olson, D.L. A support vector machine (SVM) approach to imbalanced datasets of customer responses: comparison with other customer response models. Serv Bus 7, 167–182 (2013). https://doi.org/10.1007/s11628-012-0147-9

Received: 27 April 2012
Accepted: 03 May 2012
Published: 24 May 2012
Issue Date: March 2013
DOI: https://doi.org/10.1007/s11628-012-0147-9
