Abstract
By diverting funds away from legitimate partners (a.k.a. publishers), click fraud drains advertising budgets and can seriously harm the viability of the internet advertising market. Fraud detection algorithms that can identify fraudulent behavior from user click patterns are therefore extremely valuable. Using the BuzzCity dataset, we propose a novel approach to click fraud detection built on a set of new features derived from existing attributes. The proposed model is evaluated in terms of precision, recall, and the area under the ROC curve. A final ensemble model based on six different learning algorithms proved stable with respect to all three performance indicators. Our final model shows improved results on the training, validation, and test datasets, thus demonstrating its generalizability to different datasets.
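As a rough illustration of the three performance indicators named in the abstract, the following minimal Python sketch computes precision, recall, and the area under the ROC curve for an ensemble's scores. It is not the authors' implementation: the abstract does not say how the member models are combined, so the simple score averaging in `average_scores` is an assumption, as are all function names, the 0.5 decision threshold, and the toy labels and scores (three hypothetical models rather than the six used in the paper).

```python
from typing import List, Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels, where 1 marks a fraudulent publisher."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive is scored
    above a randomly chosen negative, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_scores(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-model fraud scores by unweighted averaging (an assumed scheme)."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


if __name__ == "__main__":
    # Toy data: four publishers, three hypothetical member models.
    labels = [1, 0, 1, 0]
    member_scores = [
        [0.9, 0.2, 0.4, 0.6],
        [0.8, 0.4, 0.6, 0.2],
        [0.7, 0.3, 0.8, 0.1],
    ]
    ensemble = average_scores(member_scores)
    predictions = [1 if s >= 0.5 else 0 for s in ensemble]  # assumed threshold
    print(precision_recall(labels, predictions), roc_auc(labels, ensemble))
```

Averaging is only one way to combine members; voting or stacking would fit the same evaluation code, since the metrics depend only on the final scores and predictions.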
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Perera, K.S., Neupane, B., Faisal, M.A., Aung, Z., Woon, W.L. (2013). A Novel Ensemble Learning-Based Approach for Click Fraud Detection in Mobile Advertising. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_38
DOI: https://doi.org/10.1007/978-3-319-03844-5_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03843-8
Online ISBN: 978-3-319-03844-5
eBook Packages: Computer Science, Computer Science (R0)