Elsevier

Pattern Recognition

Volume 45, Issue 6, June 2012, Pages 2299-2307

Least squares recursive projection twin support vector machine for classification

https://doi.org/10.1016/j.patcog.2011.11.028

Abstract

In this paper we formulate a least squares version of the recently proposed projection twin support vector machine (PTSVM) for binary classification. This formulation leads to an extremely simple and fast algorithm, called the least squares projection twin support vector machine (LSPTSVM), for generating binary classifiers. Different from PTSVM, we add a regularization term, ensuring that the optimization problems in our LSPTSVM are positive definite and yield better generalization ability. Instead of solving two dual problems, as is usual, we solve two modified primal problems via two systems of linear equations, whereas PTSVM needs to solve two quadratic programming problems along with two systems of linear equations. Our experiments on publicly available datasets indicate that LSPTSVM has classification accuracy comparable to that of PTSVM but requires considerably less computational time.

Highlights

► We propose a least squares projection twin support vector machine (LSPTSVM).
► A regularization term is added in our LSPTSVM.
► The optimization problems of LSPTSVM are positive definite, resulting in better generalization ability.
► LSPTSVM requires solving only two systems of linear equations.
► LSPTSVM can easily handle large datasets.

Introduction

Support vector machines (SVMs) [1], [2] have gained a great deal of attention due to their generalization performance. In contrast with conventional artificial neural networks (ANNs), which aim at reducing the empirical risk, SVM implements the structural risk minimization (SRM) principle, which minimizes an upper bound on the generalization error [3], [4], [5]. As a powerful tool for supervised learning, SVMs have been successfully applied to a variety of real-world problems such as particle identification, text categorization, bioinformatics and financial applications [6], [7], [8].

Although SVM offers better generalization performance than many other machine learning methods, its training stage involves the solution of a quadratic programming problem (QPP) with computational complexity O(l³), where l is the number of training samples. This drawback restricts the application of SVM to large-scale problems. To address it, many improved algorithms have been proposed, e.g. chunking [9], SMO [10], SVMLight [11], LIBSVM [12], and LIBLINEAR [13]. These algorithms solve the optimization problem of SVM by optimizing a small subset of the variables in the dual during an iterative procedure. On the other hand, rather than solving the optimization problem of the standard SVM more quickly, a number of new SVM models [14], [15], [16] have been proposed in recent years. Least squares SVM (LSSVM) [14], [15] replaces the QPP in SVM with a linear system by using a squared loss function instead of the hinge loss, resulting in an extremely fast training speed.
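To illustrate how a squared loss turns training into a single linear system, here is a minimal sketch of the LSSVM classifier of [14], [15] with a linear kernel; the function names and the regularization parameter `gamma` are our own illustrative choices, not the authors' code:

```python
import numpy as np

def lssvm_train(X, y, gamma=1.0):
    """Minimal LSSVM sketch: training reduces to one linear system.

    Solves [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega_ij = y_i y_j K(x_i, x_j), here with a linear kernel.
    """
    m = X.shape[0]
    K = X @ X.T                         # linear kernel Gram matrix
    Omega = np.outer(y, y) * K
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(m) / gamma
    rhs = np.concatenate(([0.0], np.ones(m)))
    sol = np.linalg.solve(A, rhs)       # one linear solve, no QPP
    b, alpha = sol[0], sol[1:]
    return alpha, b

def lssvm_predict(X_train, y_train, alpha, b, X_new):
    # f(x) = sum_i alpha_i y_i K(x_i, x) + b
    return np.sign((alpha * y_train) @ (X_train @ X_new.T) + b)
```

The entire training cost is one (m+1)-dimensional linear solve, which is the source of the speed advantage over iteratively solving the SVM dual.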

Recently, [17] proposed the generalized eigenvalue proximal support vector machine (GEPSVM), which generates two nonparallel hyperplanes such that each hyperplane is closer to its own class and as far as possible from the other class. Subsequently, Jayadeva et al. [18] proposed the twin support vector machine (TWSVM) in the light of GEPSVM. Instead of solving two generalized eigenvalue problems as in GEPSVM, TWSVM solves two related SVM-type problems to obtain the hyperplanes. Compared with the conventional SVM, TWSVM [18], [19] is competitive in terms of performance while being around four times faster. Least squares TWSVM (LSTSVM) [20] replaces the convex QPPs in TWSVM with convex linear systems by using a squared loss function instead of the hinge loss; LSTSVM possesses extremely fast training speed since its two separating hyperplanes are determined by solving two systems of linear equations, as sketched below. Different from TWSVM, which improves GEPSVM by seeking a hyperplane for each class using an SVM-type formulation, the multi-weight vector projection support vector machine (MVSVM) [21] was proposed to enhance the performance of GEPSVM [17] by seeking one weight vector per class, such that the samples of one class are closest to their class mean while the samples of different classes are separated as far as possible [21]. The weight vectors of MVSVM can be found by solving a pair of eigenvalue problems. Later, [22] proposed the projection twin support vector machine (PTSVM) in the light of MVSVM. Instead of solving two generalized eigenvalue problems as in MVSVM, PTSVM solves two related SVM-type problems to obtain the two projection directions; it is implemented by solving two smaller QPPs, similar to TWSVM. Furthermore, PTSVM can generate multiple projection axes by using a recursive algorithm. Experimental results in [22] show the superiority of PTSVM over MVSVM.
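To make the least squares mechanism concrete, the following sketch computes one LSTSVM hyperplane in closed form. This is our own derivation from the description of [20], not the authors' code; the function name and the parameter `c1` are illustrative:

```python
import numpy as np

def lstsvm_plane(A, B, c1=1.0):
    """One LSTSVM hyperplane (w, b), proximal to the class stored in A.

    With E = [A e], F = [B e] and z = (w, b), minimizing
    (1/2)||E z||^2 + (c1/2)||F z + e||^2 has first-order condition
    (E^T E + c1 F^T F) z = -c1 F^T e, a single linear system.
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])
    F = np.hstack([B, np.ones((B.shape[0], 1))])
    # A small ridge term can be added on the left if the matrix is singular.
    z = np.linalg.solve(E.T @ E + c1 * (F.T @ F),
                        -c1 * F.T @ np.ones(B.shape[0]))
    return z[:-1], z[-1]  # weight vector w and bias b
```

The second hyperplane follows by exchanging the roles of A and B.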

In this paper, in order to further enhance the performance of PTSVM, we propose a least squares version of PTSVM, called the least squares projection twin support vector machine (LSPTSVM), using the idea in LSSVM [14], [15] and LSTSVM [20]. It should be pointed out that our LSPTSVM is not a direct least squares version of PTSVM. In the primal problems of PTSVM, only the empirical risk is minimized; in our LSPTSVM we add an extra regularization term, which not only ensures that the QPPs are positive definite but also results in better generalization ability. In addition, the QPPs of our LSPTSVM have only equality constraints, whereas inequality constraints appear in PTSVM. Thus the solution of our LSPTSVM follows directly from solving two systems of linear equations, as opposed to solving two QPPs and two systems of linear equations in PTSVM. Therefore our algorithm is able to handle large datasets accurately without any external optimizers. Computational comparisons of our LSPTSVM against PTSVM, TWSVM, LSTSVM and MVSVM, in terms of classification accuracy and computing time, have been made on 11 UCI datasets and several artificial datasets, showing its superiority and its ability to handle large datasets.

This paper is organized as follows. Section 2 briefly reviews GEPSVM, TWSVM, MVSVM and PTSVM. Section 3 proposes our LSPTSVM, and experimental results are described in Section 4. Finally, concluding remarks are given in Section 5.

Section snippets

Brief introduction to SVMs

Consider a binary classification problem in the n-dimensional real space $\mathbb{R}^n$. The set of training data points is represented by $T=\{(x_j^{(i)}, y_j)\mid i=1,2,\ j=1,\dots,m_i\}$, where $x_j^{(i)}\in\mathbb{R}^n$ is the $j$th input belonging to class $i$, $m=m_1+m_2$, and $y_j\in\{+1,-1\}$ are the corresponding outputs. We further organize the $m_1$ inputs of class $+1$ into a matrix $A\in\mathbb{R}^{m_1\times n}$ and the $m_2$ inputs of class $-1$ into a matrix $B\in\mathbb{R}^{m_2\times n}$.
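In code, this bookkeeping is just a split of the inputs by label (a minimal sketch; the function name is ours):

```python
import numpy as np

def split_by_class(X, y):
    """Organize training inputs into the matrices A (class +1, m1 x n)
    and B (class -1, m2 x n) used throughout the paper."""
    A = X[y == +1]
    B = X[y == -1]
    return A, B
```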

Primal problems

Different from PTSVM, our decision function is obtained from the primal problems directly. The primal problems are modified versions of the primal problems (20), (21) of PTSVM in a least squares sense, constructed following the idea of PSVM proposed in [16]. Different from the primal problems (20), (21), which have inequality constraints, our primal problems have only equality constraints, as follows:

$$(\mathrm{LSPTSVM1})\qquad \min_{w_1}\ \frac{1}{2}\sum_{i=1}^{m_1}\left(w_1^{\top}x_i^{(1)} - w_1^{\top}\frac{1}{m_1}\sum_{j=1}^{m_1}x_j^{(1)}\right)^{2} + \frac{c_1}{2}\sum_{k=1}^{m_2}\xi_k^{2} + \frac{c_3}{2}\|w_1\|^{2}$$
$$\text{s.t.}\qquad w_1^{\top}x_k^{(2)} - w_1^{\top}\frac{1}{m_1}\sum_{j=1}^{m_1}x_j^{(1)} = -1 + \xi_k,\qquad k = 1,\dots,m_2.$$
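Because the constraints are equalities, each $\xi_k$ can be eliminated by substitution, which reduces (LSPTSVM1) to an unconstrained quadratic in $w_1$ whose minimizer solves one linear system. The sketch below is our own closed-form reading of this step, with illustrative names and default parameter values, not the authors' code:

```python
import numpy as np

def lsptsvm_direction(A, B, c1=1.0, c3=1.0):
    """Projection direction w1 for class +1 from (LSPTSVM1).

    Substituting xi_k = w1^T x_k^(2) - w1^T mean(A) + 1 into the objective
    and setting the gradient to zero yields
        (S1 + c1 * M2^T M2 + c3 * I) w1 = -c1 * M2^T e2,
    where S1 is the within-class scatter of A and M2 = B - e2 mean(A)^T.
    """
    mean_A = A.mean(axis=0)
    Ac = A - mean_A                     # class +1 samples centred on their mean
    S1 = Ac.T @ Ac                      # within-class scatter matrix
    M2 = B - mean_A                     # class -1 samples shifted by the +1 mean
    n = A.shape[1]
    lhs = S1 + c1 * (M2.T @ M2) + c3 * np.eye(n)   # positive definite for c3 > 0
    rhs = -c1 * M2.T @ np.ones(B.shape[0])
    return np.linalg.solve(lhs, rhs)

def lsptsvm_predict(A, B, w1, w2, X_new):
    # Hedged reading of the PTSVM decision rule: assign each point to the
    # class whose projected class mean is nearer along that class's direction.
    d1 = np.abs(X_new @ w1 - A.mean(axis=0) @ w1)
    d2 = np.abs(X_new @ w2 - B.mean(axis=0) @ w2)
    return np.where(d1 <= d2, +1, -1)
```

By symmetry, the second direction is `w2 = lsptsvm_direction(B, A, c1, c3)` up to a sign, and the sign is immaterial because the decision rule compares absolute projected distances. Note that the regularization term $\frac{c_3}{2}\|w_1\|^2$ is exactly what makes the left-hand matrix positive definite and the solve well posed.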

Experimental results

In order to evaluate the proposed LSPTSVM, we investigate its classification accuracy and computational efficiency on three artificial datasets [17], [21], [22], 11 real-world UCI benchmark datasets [35] and datasets generated with David Musicant's NDC Data Generator [36]. In the experiments, we focus on the comparison between the proposed algorithm and some state-of-the-art multiple-surface classification methods, including TWSVM, LSTSVM, MVSVM and PTSVM. All the classification methods are implemented in

Conclusions

In this paper, we have improved PTSVM and constructed a new algorithm, called least squares PTSVM (LSPTSVM), for binary classification. Our LSPTSVM is an extremely simple algorithm that uses two projection directions. Different from PTSVM, which solves two dual quadratic programming problems (QPPs) and two systems of linear equations, LSPTSVM solves two primal problems by solving just two systems of linear equations. This allows LSPTSVM to classify large datasets easily, whereas for such datasets PTSVM requires

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Nos. 10971223, 10926198 and 11071252).

References (42)

  • N. Cristianini et al., An Introduction to Support Vector Machines: And Other Kernel-Based Learning Methods (2000)
  • N.Y. Deng, Y.J. Tian, C.H. Zhang, Support Vector Machines: Theory, Algorithms and Extensions, CRC Press, ...
  • W.S. Noble, Kernel Methods in Computational Biology
  • S. Lee, A. Verri, Pattern recognition with support vector machines, in: First International Workshop, Springer, Niagara ...
  • H. Ince et al., Support vector machine for regression and applications to financial forecasting
  • C. Cortes et al., Support vector networks, Machine Learning (1995)
  • J. Platt, Fast training of support vector machines using sequential minimal optimization
  • T. Joachims, Making large-scale support vector machine learning practical, in: Advances in Kernel Methods: Support Vector Learning (1999)
  • C. Chang, C. Lin, LIBSVM: a library for support vector machines, Technical Report, Department of Computer Science and ...
  • R.E. Fan et al., LIBLINEAR: a library for large linear classification, Journal of Machine Learning Research (2008)
  • J.A.K. Suykens et al., Least squares support vector machine classifiers: a large scale algorithm


Yuan-Hai Shao received his Bachelor's degree from the College of Mathematics, Jilin University, and his Doctor's degree from the College of Science, China Agricultural University, China, in 2006 and 2011, respectively. Currently, he is a lecturer at Zhijiang College, Zhejiang University of Technology. His research interests include optimization methods, machine learning and data mining.

Nai-Yang Deng received the MSc degree from the Department of Mathematics, Peking University, China, in 1967. He is now a professor in the College of Science, China Agricultural University, an honorary director of the China Operations Research Society, a managing editor of the Journal of Operational Research, and an editor of International Operations Research Abstracts. His research interests mainly include operational research, optimization, machine learning and data mining. He has published over 100 papers.

Zhi-Min Yang received his Doctor's degree from the College of Science, China Agricultural University, China, in 2005. He is now a professor at Zhijiang College, Zhejiang University of Technology. His research interests include uncertainty theory, optimization methods, machine learning and data mining.
