Abstract
Classification (Carvalho et al., 2005) is an important data mining task in which the value of a discrete (dependent) variable is predicted from the values of several independent variables. A classification model should provide correct predictions on new, unseen data instances, and this accuracy measure is often the only performance requirement applied. However, comprehensibility of the model is a key requirement as well in any domain where the model must be validated before it can be implemented. Whenever comprehensibility is needed, justifiability is required as well: the model should be in line with existing domain knowledge. Although recent academic research has acknowledged the importance of comprehensibility, justifiability is often neglected. By providing comprehensible, justifiable classification models, such models become acceptable in domains where they were previously deemed too theoretical and incomprehensible, and new opportunities for data mining emerge. A classification model that is accurate, comprehensible, and intuitive is defined as acceptable for implementation.
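As a minimal illustration of the comprehensibility and justifiability requirements described above (all attribute names and thresholds here are hypothetical, invented for this sketch and not taken from the chapter), consider a rule-based credit decision model. Unlike a black-box score, its rules can be read and validated by a domain expert, and its monotone treatment of income can be checked against domain intuition:

```python
# Hypothetical sketch of a comprehensible, justifiable classifier:
# explicit rules a credit analyst can inspect, plus a check that the
# decision is monotone in income (higher income should never flip an
# "accept" back to a "reject"). Names and thresholds are invented.

def score_applicant(income, debt_ratio):
    """Rule-based credit decision: returns 'accept' or 'reject'."""
    if debt_ratio > 0.6:
        return "reject"   # heavy debt load dominates all other factors
    if income >= 30_000:
        return "accept"   # comfortable income, moderate debt
    if income >= 20_000 and debt_ratio <= 0.3:
        return "accept"   # lower income offset by low debt
    return "reject"

def monotone_in_income(debt_ratio, incomes):
    """Justifiability check: for a fixed debt ratio, once an income
    level is accepted, every higher income must also be accepted."""
    decisions = [score_applicant(i, debt_ratio) for i in sorted(incomes)]
    accepted = [d == "accept" for d in decisions]
    return all(b or not a for a, b in zip(accepted, accepted[1:]))
```

The monotonicity check corresponds to the justifiability constraints discussed in the monotone-classification literature cited below (e.g., Ben-David 1995; Feelders and Pardoel 2003): a violation would signal a model at odds with domain knowledge, even if its accuracy on test data were high.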
References
D. W. Aha, D. F. Kibler, and M. K. Albert. Instance-based learning algorithms. Machine Learning, 6:37–66, 1991.
E. Altendorf, E. Restificar, and T.G. Dietterich. Learning from sparse data by exploiting monotonicity constraints. In Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence, Edinburgh, Scotland, 2005.
I. Askira-Gelman. Knowledge discovery: Comprehensibility of the results. In HICSS ’98: Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences-Volume 5, p. 247, Washington, DC, USA, 1998. IEEE Computer Society.
B. Baesens. Developing intelligent systems for credit scoring using machine learning techniques. PhD thesis, K.U. Leuven, 2003.
B. Baesens, R. Setiono, C. Mues, and J. Vanthienen. Using neural network rule extraction and decision tables for credit-risk evaluation. Management Science, 49(3):312–329, 2003.
B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J. Suykens, and J. Vanthienen. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6):627–635, 2003.
A. Ben-David. Monotonicity maintenance in information-theoretic machine learning algorithms. Machine Learning, 19(1):29–43, 1995.
D. Billman and D. Davila. Consistency is the hobgoblin of human minds: People care but concept learning models do not. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, pp. 188–193, 1995.
C.M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK, 1996.
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Chapman & Hall, New York, 1984.
D. R. Carvalho, A. A. Freitas, and N. F. F. Ebecken. Evaluating the correlation between objective rule interestingness measures and real human interest. In Alìpio Jorge, Luís Torgo, Pavel Brazdil, Rui Camacho, and João Gama, editors, PKDD, volume 3721 of Lecture Notes in Computer Science, pp. 453–461. Springer, 2005.
P. Clark and T. Niblett. The CN2 induction algorithm. Machine Learning, 3(4):261–283, 1989.
W. W. Cohen. Fast effective rule induction. In Armand Prieditis and Stuart Russell, editors, Proc. of the 12th International Conference on Machine Learning, pp. 115–123, Tahoe City, CA, 1995. Morgan Kaufmann.
N. Cristianini and J. Shawe-Taylor. An introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, New York, 2000.
B. Cumps, D. Martens, M. De Backer, S. Viaene, G. Dedene, R. Haesen, M. Snoeck, and B. Baesens. Inferring rules for business/ICT alignment using ants. Information and Management, 46(2):116–124, 2009.
H. Daniels and M. Velikova. Derivation of monotone decision models from noisy data. IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, 36(5):705–710, 2006.
P. Domingos. The role of occam’s razor in knowledge discovery. Data Mining and Knowledge Discovery, 3(4):409–425, 1999.
R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification. John Wiley and Sons, New York, second edition, 2001.
Federal Trade Commission for the Consumer. Facts for consumers: Equal credit opportunity. Technical report, FTC, March 1998.
A.J. Feelders. Prior knowledge in economic applications of data mining. In Proceedings of the fourth European conference on principles and practice of knowledge discovery in data bases, volume 1910 of Lecture Notes in Computer Science, pp. 395–400. Springer, 2000.
A.J. Feelders and M. Pardoel. Pruning for monotone classification trees. In Advances in Intelligent Data Analysis V, volume 2810 of Lecture Notes in Computer Science, pp. 1–12. Springer, 2003.
D. Hand. Pattern detection and discovery. In D. Hand, N. Adams, and R. Bolton, editors, Pattern Detection and Discovery, volume 2447 of Lecture Notes in Computer Science, pp. 1–12. Springer, 2002.
D. Hand. Protection or privacy? Data mining and personal data. In Advances in Knowledge Discovery and Data Mining, 10th Pacific-Asia Conference, PAKDD 2006, Singapore, April 9-12, volume 3918 of Lecture Notes in Computer Science, pp. 1–10. Springer, 2006.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, Data Mining, Inference, and Prediction. Springer, New York, 2001.
J. Huysmans, B. Baesens, D. Martens, K. Denys, and J. Vanthienen. New trends in data mining. In Tijdschrift voor economie en Management, volume L, pp. 697–711, 2005.
J. Huysmans, C. Mues, B. Baesens, and J. Vanthienen. An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models. 2007.
D.G. Kleinbaum, L.L. Kupper, K. E. Muller, and A. Nizam. Applied Regression Analysis and Multivariable Methods. Duxbury Press, North Scituate, MA, 1997.
Y. Kodratoff. The comprehensibility manifesto. KDD Nuggets (94:9), 1994.
D. Martens, B. Baesens, and T. Van Gestel. Decompositional rule extraction from support vector machines by active learning. IEEE Transactions on Knowledge and Data Engineering, 21(2):178–191, 2009.
D. Martens, B. Baesens, T. Van Gestel, and J. Vanthienen. Comprehensible credit scoring models using rule extraction from support vector machines. European Journal of Operational Research, 183(3):1466–1476, 2007.
D. Martens, L. Bruynseels, B. Baesens, M. Willekens, and J. Vanthienen. Predicting going concern opinion with data mining. Decision Support Systems, 45(4):765–777, 2008.
D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen. Ant-based approach to the knowledge fusion problem. In Proceedings of the Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence, Lecture Notes in Computer Science, pp. 85–96. Springer, 2006.
D. Martens, M. De Backer, R. Haesen, M. Snoeck, J. Vanthienen, and B. Baesens. Classification with ant colony optimization. IEEE Transactions on Evolutionary Computation, 11(5):651–665, 2007.
R.S. Michalski. A theory and methodology of inductive learning. Artificial Intelligence, 20(2):111–161, 1983.
O.O. Maimon and L. Rokach. Decomposition Methodology For Knowledge Discovery And Data Mining: Theory And Applications (Machine Perception and Artificial Intelligence). World Scientific Publishing Company, July 2005.
M. Ohsaki, S. Kitaguchi, K. Okamoto, H. Yokoi, and T. Yamaguchi. Evaluation of rule interestingness measures with a clinical dataset on hepatitis. In PKDD ’04: Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 362–373, New York, NY, USA, 2004. Springer-Verlag New York, Inc.
L. Passmore, J. Goodside, L. Hamel, L. Gonzales, T. Silberstein, and J. Trimarchi. Assessing decision tree models for clinical in-vitro fertilization data. Technical Report TR03-296, Dept. of Computer Science and Statistics, University of Rhode Island, 2003.
M. Pazzani. Influence of prior knowledge on concept acquisition: Experimental and computational results. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(3):416–432, 1991.
M. Pazzani and S. Bay. The independent sign bias: Gaining insight from multiple linear regression. In Proceedings of the Twenty First Annual Conference of the Cognitive Science Society, pp. 525–530., 1999.
M. Pazzani, S. Mani, and W. Shankle. Acceptance by medical experts of rules generated by machine learning. Methods of Information in Medicine, 40(5):380–385, 2001.
M. Pazzani. Learning with globally predictive tests. In Discovery Science, pp. 220–231, 1998.
J. R. Quinlan. C4.5 Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, 1993.
J.W. Seifert. Data mining and homeland security: An overview. CRS Report for Congress, 2006.
R. Setiono, B. Baesens, and C. Mues. Risk management and regulatory compliance: A data mining framework based on neural network rule extraction. In Proceedings of the International Conference on Information Systems (ICIS 2006), 2006.
R. Setiono, B. Baesens, and C. Mues. Recursive neural network rule extraction for data with mixed attributes. IEEE Transactions on Neural Networks, Forthcoming.
A. Silberschatz and A. Tuzhilin. On subjective measures of interestingness in knowledge discovery. In KDD, pp. 275–281, 1995.
J. Sill. Monotonic networks. In Advances in Neural Information Processing Systems, volume 10. The MIT Press, Cambridge, MA, 1998.
E. Sommer. An approach to quantifying the quality of induced theories. In Claire Nedellec, editor, Proceedings of the IJCAI Workshop on Machine Learning and Comprehensibility, 1995.
J. A. K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, and J. Vandewalle. Least Squares Support Vector Machines. World Scientific, Singapore, 2002.
P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Pearson Education, Boston, MA, 2006.
L. Thomas, D. Edelman, and J. Crook, editors. Credit Scoring and its Applications. SIAM, Philadelphia, PA, 2002.
T. Van Gestel, B. Baesens, and L. Thomas. Introduction to Modern Credit Scoring. Oxford University Press, Oxford, Forthcoming.
T. Van Gestel, B. Baesens, P. Van Dijcke, J. Garcia, J.A.K. Suykens, and J. Vanthienen. A process model to develop an internal rating system: sovereign credit ratings. Decision Support Systems, 42(2):1131–1151, 2006.
T. Van Gestel, B. Baesens, P. Van Dijcke, J.A.K. Suykens, J. Garcia, and T. Alderweireld. Linear and nonlinear credit scoring by combining logistic regression and support vector machines. Journal of Credit Risk, 1(4), 2005.
T. Van Gestel, D. Martens, B. Baesens, D. Feremans, J Huysmans, and J. Vanthienen. Forecasting and analyzing insurance companies’ ratings. International Journal of Forecasting, 23(3):513–529, 2007.
T. Van Gestel, J.A.K. Suykens, B. Baesens, S. Viaene, J. Vanthienen, G. Dedene, B. De Moor, and J. Vandewalle. Benchmarking least squares support vector machine classifiers. Machine Learning, 54(1):5–32, 2004.
O. Vandecruys, D. Martens, B. Baesens, C. Mues, M. De Backer, and R. Haesen. Mining software repositories for comprehensible software fault prediction models. Journal of Systems and Software, 81(5):823–839, 2008.
J. Vanthienen, C. Mues, and A. Aerts. An illustration of verification and validation in the modelling phase of KBS development. Data and Knowledge Engineering, 27(3):337–352, 1998.
V. N. Vapnik. The nature of statistical learning theory. Springer-Verlag New York, Inc., New York, 1995.
M. Velikova and H. Daniels. Decision trees for monotone price models. Computational Management Science, 1(3–4):231–244, 2004.
M. Velikova, H. Daniels, and A. Feelders. Solving partially monotone problems with neural networks. In Proceedings of the International Conference on Neural Networks, Vienna, Austria, March 2006.
M. P. Wellman. Fundamental concepts of qualitative probabilistic networks. Artificial Intelligence, 44(3):257–303, 1990.
I. H. Witten and E. Frank. Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann Publishers Inc., San Francisco, CA, 2000.
Acknowledgments
We extend our gratitude to the guest editor and the anonymous reviewers, as their many constructive and detailed remarks certainly contributed much to the quality of this chapter. Further, we would like to thank the Flemish Research Council (FWO, Grant G.0615.05) for financial support.
Copyright information
© 2010 Springer Science+Business Media, LLC
Cite this chapter
Martens, D., Baesens, B. (2010). Building Acceptable Classification Models. In: Stahlbock, R., Crone, S., Lessmann, S. (eds) Data Mining. Annals of Information Systems, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1280-0_3
Print ISBN: 978-1-4419-1279-4
Online ISBN: 978-1-4419-1280-0