Least squares recursive projection twin support vector machine for classification
Highlights
► We propose a least squares projection twin support vector machine (LSPTSVM). ► A regularization term is added in our LSPTSVM. ► The optimization problems of LSPTSVM are positive definite, resulting in better generalization ability. ► We only solve two systems of linear equations for LSPTSVM. ► LSPTSVM can easily handle large datasets.
Introduction
Support vector machines (SVMs) [1], [2] have gained a great deal of attention due to their generalization performance. In contrast with conventional artificial neural networks (ANNs), which aim at reducing the empirical risk, SVM is principled and implements structural risk minimization (SRM), which minimizes an upper bound on the generalization error [3], [4], [5]. As a powerful tool for supervised learning, SVMs have been successfully applied to a variety of real-world problems such as particle identification, text categorization, bioinformatics and financial applications [6], [7], [8].
Although SVM achieves better generalization performance than many other machine learning methods, its training stage involves the solution of a quadratic programming problem (QPP), whose computational complexity is $O(l^3)$, where $l$ is the total number of training samples. This drawback restricts the application of SVM to large-scale problems. To address this problem, many improved algorithms have been proposed, e.g. Chunking [9], SMO [10], SVMLight [11], Libsvm [12], and Liblinear [13]. Traditionally, these algorithms solve the optimization problem of SVM by optimizing a small subset of the variables in the dual during an iterative procedure. On the other hand, different from the aforementioned methods, which aim at solving the optimization problem of the standard SVM quickly, a number of new SVM models [14], [15], [16] have been proposed in recent years. Least squares SVM (LSSVM) [14], [15] replaces the QPP in SVM with a linear system by using a squared loss function instead of the hinge loss, resulting in an extremely fast training speed.
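To make the LSSVM idea concrete, the following is a minimal sketch of linear-kernel LSSVM training: with a squared loss, the KKT conditions reduce to one symmetric linear system instead of a QPP. Variable names and the regularization parameter `gamma` are our own choices, not the notation of [14], [15]:

```python
import numpy as np

def lssvm_train(X, y, gamma=10.0):
    """Train a linear-kernel LSSVM by solving a single linear system:
    [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega_ij = y_i * y_j * (x_i . x_j)."""
    m = X.shape[0]
    Omega = np.outer(y, y) * (X @ X.T)
    M = np.zeros((m + 1, m + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = Omega + np.eye(m) / gamma
    rhs = np.concatenate(([0.0], np.ones(m)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new):
    # decision function f(x) = sign(sum_i alpha_i y_i (x_i . x) + b)
    return np.sign((alpha * y_train) @ (X_train @ X_new.T) + b)

# toy separable data
X = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.], [6., 5.], [5., 6.]])
y = np.array([-1., -1., -1., 1., 1., 1.])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
```

The whole training cost is one dense solve, which is why the training speed is so much faster than iterating over a QPP.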
Recently, [17] proposed the generalized eigenvalue proximal support vector machine (GEPSVM), which aims at generating two nonparallel hyperplanes such that each hyperplane is closer to its own class and as far as possible from the other class. Subsequently, Jayadeva et al. [18] proposed the twin support vector machine (TWSVM) in the light of GEPSVM. However, instead of solving two generalized eigenvalue problems as in GEPSVM, TWSVM solves two related SVM-type problems to obtain the hyperplanes. Compared with the conventional SVM, TWSVM [18], [19] is competitive in terms of accuracy while being approximately four times faster in training. Least squares TWSVM (LSTSVM) [20] replaces the convex QPPs in TWSVM with convex linear systems by using a squared loss function instead of the hinge loss. LSTSVM possesses an extremely fast training speed since its separating hyperplanes are determined by solving two systems of linear equations. Different from TWSVM, which improves GEPSVM by seeking a hyperplane for each class using an SVM-type formulation, the multi-weight vector projection support vector machine (MVSVM) [21] was proposed to enhance the performance of GEPSVM [17] by seeking a weight vector such that the samples of one class are closest to their class mean while the samples of different classes are separated as far as possible [21]. The weight vectors of MVSVM can be found by solving a pair of eigenvalue problems. Later, [22] proposed the projection twin support vector machine (PTSVM) in the light of MVSVM. Instead of solving two generalized eigenvalue problems as in MVSVM, PTSVM solves two related SVM-type problems, i.e. two smaller QPPs similar to those of TWSVM, to obtain the two projection directions. Furthermore, PTSVM can generate multiple projection axes by using a recursive algorithm. Experimental results in [22] show the superiority of PTSVM over MVSVM.
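The LSTSVM closed forms mentioned above can be sketched as follows. This is an illustrative linear-kernel implementation following the usual LSTSVM formulation in [20] (our variable names; `E` and `F` augment the class matrices with a column of ones), not the authors' exact code:

```python
import numpy as np

def lstsvm_train(A, B, c1=1.0, c2=1.0):
    """Each nonparallel hyperplane w^T x + b = 0 comes from one
    regularized least-squares problem, i.e. one linear system."""
    e1 = np.ones((A.shape[0], 1))
    e2 = np.ones((B.shape[0], 1))
    E = np.hstack([A, e1])
    F = np.hstack([B, e2])
    # plane 1: close to class A, at unit distance from class B (LS sense)
    z1 = -np.linalg.solve(F.T @ F + (1.0 / c1) * E.T @ E, F.T @ e2)
    # plane 2: close to class B, at unit distance from class A (LS sense)
    z2 = np.linalg.solve(E.T @ E + (1.0 / c2) * F.T @ F, E.T @ e1)
    return z1.ravel(), z2.ravel()   # each is [w; b]

def lstsvm_predict(z1, z2, X):
    # assign each sample to the class of the nearer hyperplane
    Xe = np.hstack([X, np.ones((X.shape[0], 1))])
    d1 = np.abs(Xe @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xe @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)

A = np.array([[0., 0.], [1., 0.], [0., 1.]])   # Class +1
B = np.array([[5., 5.], [6., 5.], [5., 6.]])   # Class -1
z1, z2 = lstsvm_train(A, B)
labels = lstsvm_predict(z1, z2, np.vstack([A, B]))
```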
In this paper, in order to further enhance the performance of PTSVM, we propose a least squares version of PTSVM, called the least squares projection twin support vector machine (LSPTSVM), using the ideas in LSSVM [14], [15] and LSTSVM [20]. It should be pointed out that our LSPTSVM is not a direct least squares version of PTSVM. In fact, in the primal problems of PTSVM, only the empirical risk is minimized. In our LSPTSVM, however, we add an extra regularization term, which not only ensures that the optimization problems are positive definite but also results in better generalization ability. In addition, the problems of our LSPTSVM have only equality constraints, whereas inequality constraints appear in PTSVM. Thus the solution of our LSPTSVM follows directly from solving two systems of linear equations, as opposed to solving two QPPs and two systems of linear equations in PTSVM. Therefore our algorithm is able to handle large datasets accurately without any external optimizers. Computational comparisons of our LSPTSVM against PTSVM, TWSVM, LSTSVM and MVSVM, in terms of classification accuracy and computing time, have been made on 11 UCI datasets and several artificial datasets, showing its superiority and its ability to handle large datasets.
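The effect of the extra regularization term can be illustrated numerically: without it, the least-squares coefficient matrix may only be positive semidefinite (singular), while adding a small multiple of the identity makes it strictly positive definite and safely invertible. A toy illustration (our own example, not taken from the paper):

```python
import numpy as np

# A rank-deficient "data" matrix: the second column is twice the first,
# so H^T H is only positive semidefinite (it has a zero eigenvalue).
H = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])
G = H.T @ H
print(np.linalg.matrix_rank(G))          # 1: G is singular

# Adding a regularization term c*I makes the matrix positive definite,
# so the corresponding linear system has a unique, stable solution.
c = 0.1
G_reg = G + c * np.eye(2)
print(np.linalg.eigvalsh(G_reg).min())   # smallest eigenvalue is now > 0
```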
This paper is organized as follows. Section 2 briefly reviews GEPSVM, TWSVM, MVSVM and PTSVM. Section 3 proposes our LSPTSVM, and experimental results are described in Section 4. Finally, concluding remarks are given in Section 5.
Section snippets
Brief introduction of SVMs
Consider a binary classification problem in the n-dimensional real space $R^n$. The set of training data points is represented by $T=\{(x_j^{(i)}, y_j^{(i)})\}$, where $x_j^{(i)} \in R^n$ is the jth input belonging to class $i$ ($i = 1, 2$) and $y_j^{(i)} \in \{+1, -1\}$ are the corresponding outputs. We further organize the $m_1$ inputs of Class +1 into the matrix $A \in R^{m_1 \times n}$ and the $m_2$ inputs of Class −1 into the matrix $B \in R^{m_2 \times n}$.
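In code, with the samples stacked row-wise in a matrix `X` and the labels in a vector `y`, the class matrices `A` and `B` are simple row selections (variable names are our own):

```python
import numpy as np

# toy data: 5 samples in R^2 with labels +1 / -1
X = np.array([[0.1, 0.2],
              [0.4, 0.1],
              [0.3, 0.3],
              [2.0, 2.1],
              [2.2, 1.9]])
y = np.array([1, 1, 1, -1, -1])

A = X[y == 1]    # m1 x n matrix of Class +1 inputs
B = X[y == -1]   # m2 x n matrix of Class -1 inputs
print(A.shape, B.shape)   # (3, 2) (2, 2)
```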
Primal problems
Different from PTSVM, our decision function is obtained from the primal problems directly. The primal problems are modified versions of the primal problems (20), (21) of PTSVM in the least squares sense, constructed following the idea of PSVM proposed in [16]. Different from the primal problems (20), (21), which have inequality constraints, our primal problems have only equality constraints, as follows:
(LSPTSVM1)
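The equation body is truncated in this snippet; the following LaTeX is only a reconstruction from the verbal description above (the slack vector $\eta$, the penalty parameters $c_1$, $c_2$ and the sign convention are our assumptions, not the authors' exact formulation): class-1 samples are projected close to their own projected mean, class-2 samples are forced, in the least squares sense, to lie at unit distance from that mean, and a regularization term on $w_1$ is added:

```latex
% (LSPTSVM1), reconstructed sketch -- notation assumed
\min_{w_1,\,\eta}\;
  \frac{1}{2}\sum_{i=1}^{m_1}\Bigl(w_1^{\top}x_i^{(1)}
    - \frac{1}{m_1}\sum_{j=1}^{m_1} w_1^{\top}x_j^{(1)}\Bigr)^{2}
  + \frac{c_1}{2}\,\eta^{\top}\eta
  + \frac{c_2}{2}\,\lVert w_1\rVert^{2}
\quad\text{s.t.}\quad
  -\Bigl(w_1^{\top}x_k^{(2)}
    - \frac{1}{m_1}\sum_{j=1}^{m_1} w_1^{\top}x_j^{(1)}\Bigr)
  + \eta_k = 1,\qquad k=1,\dots,m_2.
```

Because the constraints are equalities, $\eta$ can be eliminated and the minimizer follows from one system of linear equations; LSPTSVM2 is symmetric with the roles of the two classes exchanged.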
Experimental results
In order to evaluate the proposed LSPTSVM, we investigate its classification accuracy and computational efficiency on three artificial datasets [17], [21], [22], 11 real-world UCI benchmark datasets [35] and David Musicant's NDC Data Generator [36] datasets. In the experiments, we focus on the comparison between the proposed algorithm and some state-of-the-art multiple-surface classification methods, including TWSVM, LSTSVM, MVSVM and PTSVM. All the classification methods are implemented in
Conclusions
In this paper, we have improved PTSVM and constructed a new algorithm, called least squares PTSVM (LSPTSVM), for binary classification. Our LSPTSVM is an extremely simple algorithm that uses two projection directions. Instead of solving two dual quadratic programming problems (QPPs) and two systems of linear equations as in PTSVM, LSPTSVM solves its two primal problems via just two systems of linear equations. This allows LSPTSVM to classify large datasets easily, while PTSVM requires
Acknowledgment
This work is supported by the National Natural Science Foundation of China (Nos. 10971223, 10926198 and 11071252).
References (42)
- Multi-weight vector projection support vector machines, Pattern Recognition Letters (2010)
- Quotient vs. difference: comparison between the two discriminant criteria, Neurocomputing (2010)
- Why can LDA be performed in PCA transformed space?, Pattern Recognition (2003)
- On minimum class locality preserving variance support vector machine, Pattern Recognition (2010)
- Distance difference and linear programming nonparallel plane classifier, Expert Systems with Applications (2011)
- Face recognition based on the uncorrelated discriminant transformation, Pattern Recognition (2001)
- What's wrong with the Fisher criterion?, Pattern Recognition (2002)
- Support vector networks, Machine Learning (1995)
- A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery (1998)
- Statistical Learning Theory (1998)
- An Introduction to Support Vector Machines: And Other Kernel-Based Learning Methods
- Kernel Methods in Computational Biology
- Support vector machine for regression and applications to financial forecasting
- Fast training of support vector machines using sequential minimal optimization
- Making Large-Scale Support Vector Machine Learning Practical, Advances in Kernel Methods: Support Vector Learning
- LIBLINEAR: a library for large linear classification, Journal of Machine Learning Research
- Least squares support vector machine classifiers: a large scale algorithm
Cited by (130)
- Regularized Least Squares Twin SVM for Multiclass Classification, Big Data Research (2022)
- Fuzzy least squares projection twin support vector machines for class imbalance learning, Applied Soft Computing (2021). Citation excerpt: "Unified form of fuzzy C-means and K-means algorithms and its partitional implementation [44] leads to better performance in the clustering problems. Motivated by robust fuzzy least squares twin SVMs (RFLSTSVM) [34] and least squares recursive projection twin SVM (LSRPTSVM) [39], we propose a novel fuzzy least squares projection twin support vector machines for class imbalance learning (FLSPTSVM-CIL). In fuzzy twin support vector machines (FTWSVM), existing fuzzy membership function [45–47] based on the distance of the data from the centroid is used in TSVM model [11]."
- Non-parallel hyperplanes ordinal regression machine, Knowledge-Based Systems (2021)
- Least squares projection twin support vector clustering (LSPTSVC), Information Sciences (2020)
- NPrSVM: Nonparallel sparse projection support vector machine with efficient algorithm, Applied Soft Computing (2020)
Yuan-Hai Shao received his Bachelor's degree from the College of Mathematics, Jilin University, and his Doctor's degree from the College of Science, China Agricultural University, China, in 2006 and 2011, respectively. Currently, he is a lecturer at the Zhijiang College, Zhejiang University of Technology. His research interests include optimization methods, machine learning and data mining.
Nai-Yang Deng received his MSc degree from the Department of Mathematics, Peking University, China, in 1967. Now, he is a professor in the College of Science, China Agricultural University. He is an honorary director of the China Operations Research Society, Managing Editor of the Journal of Operational Research, and Editor of International Operations Research Abstracts. His research interests mainly include operational research, optimization, machine learning and data mining. He has published over 100 papers.
Zhi-Min Yang received his Doctor's degree in College of Science from China Agricultural University, China, in 2005. Now, he is a professor in Zhijiang College, Zhejiang University of Technology. His research interests include uncertainty theory, optimization methods, machine learning and data mining.