Abstract
Although the multiple criteria mathematical program (MCMP) has been used as an alternative classification method in various real-life data mining problems, the mathematical structure underlying its solvability remains an open challenge. This paper proposes a regularized multiple criteria linear program (RMCLP) for two-class classification problems. It first adds regularization terms to the objective function of the known multiple criteria linear program (MCLP) model so that the existence of a solution can be established. The paper then describes the mathematical framework of the solvability. Finally, a series of experimental tests compares the performance of the proposed RMCLP with that of existing methods: MCLP, multiple criteria quadratic program (MCQP), and support vector machine (SVM). The results on four publicly available datasets and a real-life credit dataset all show that RMCLP is a competitive classification method. Furthermore, this paper explores an ordinal RMCLP (ORMCLP) model for ordinal multi-group problems. In a comparison of ORMCLP with traditional methods such as One-Against-One and One-Against-the-Rest on a large-scale credit card dataset, the experimental results show that both ORMCLP and RMCLP perform well.
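As a rough sketch of what adding regularization terms to the MCLP objective can look like, the two-class model may be written in the general form below. The notation is an assumption made for illustration and is not taken from the abstract: $A_i$ is the attribute vector of record $i$, $x$ the weight vector, $b$ the boundary, $\alpha_i \ge 0$ the overlap of a misclassified record with the boundary, $\beta_i \ge 0$ the distance of a correctly classified record from the boundary, $G_1$ and $G_2$ the two classes, $w_\alpha, w_\beta > 0$ criterion weights, and $H$, $Q$ symmetric positive definite matrices.

```latex
% Weighted two-class MCLP (assumed notation): trade off total overlap against total distance
\begin{align*}
\min_{x,\,\alpha,\,\beta}\quad & w_\alpha \sum_i \alpha_i \;-\; w_\beta \sum_i \beta_i \\
\text{s.t.}\quad & A_i x = b + \alpha_i - \beta_i, \quad A_i \in G_1, \\
                 & A_i x = b - \alpha_i + \beta_i, \quad A_i \in G_2, \\
                 & \alpha_i \ge 0,\ \beta_i \ge 0 .
\end{align*}

% Regularized variant (sketch): same constraints, with quadratic terms added to the objective
\begin{align*}
\min_{x,\,\alpha,\,\beta}\quad & \tfrac{1}{2}\, x^{\top} H x \;+\; \tfrac{1}{2}\, \alpha^{\top} Q\, \alpha
  \;+\; w_\alpha \sum_i \alpha_i \;-\; w_\beta \sum_i \beta_i .
\end{align*}
```

In a formulation of this kind, the quadratic terms in $x$ and $\alpha$ turn the linear program into a quadratic program in $(x, \alpha, \beta)$, which is the setting in which the solvability of the model is studied; the exact choice of regularization terms and weights is given in the body of the paper, not here.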
Additional information
Supported by the National Natural Science Foundation of China (Grant Nos. 70621001, 70531040, 70501030, 10601064, 70472074), the Natural Science Foundation of Beijing (Grant No. 9073020), the National Basic Research Program of China (Grant No. 2004CB720103), Ministry of Science and Technology, China, the Research Grants Council of Hong Kong and BHP Billiton Co., Australia
Cite this article
Shi, Y., Tian, Y., Chen, X. et al. Regularized multiple criteria linear programs for classification. Sci. China Ser. F-Inf. Sci. 52, 1812–1820 (2009). https://doi.org/10.1007/s11432-009-0126-5
DOI: https://doi.org/10.1007/s11432-009-0126-5