Least squares support vector machines with fast leave-one-out AUC optimization on imbalanced prostate cancer data

Wang, Guanjin; Teoh, Jeremy Yuen-Chun; Lu, Jie; Choi, Kup-Sze

doi:10.1007/s13042-020-01081-y

Original Article
Published: 27 February 2020

Volume 11, pages 1909–1922, (2020)
International Journal of Machine Learning and Cybernetics

Guanjin Wang (ORCID: 0000-0002-5258-0532)¹,², Jeremy Yuen-Chun Teoh³, Jie Lu⁴ & Kup-Sze Choi²

Abstract

The pre-biopsy data available for early prostate cancer detection are often imbalanced. When least squares support vector machines (LS-SVMs) are applied in such scenarios, it is natural to introduce the well-known area under the ROC curve (AUC) performance index into the LS-SVM framework to avoid bias towards the majority class. However, doing so may incur high computational complexity when searching for the minimal leave-one-out error. In this paper, by introducing a parameter $\lambda $, a generalized AUC performance index $R_{AUCLS}$ is developed that is theoretically guaranteed to depend linearly on the classical AUC performance index $R_{AUC}$. Based on both $R_{AUCLS}$ and the classical LS-SVM, a new AUC-based least squares support vector machine, called AUC-LS-SVMs, is proposed for directly and effectively classifying imbalanced prostate cancer data. The distinctive advantage of AUC-LS-SVMs is that it can achieve the minimal leave-one-out error by quickly optimizing the parameter $\lambda $ in $R_{AUCLS}$ using the proposed fast leave-one-out cross-validation (LOOCV) strategy. The proposed classifier is first evaluated on generic public datasets. Further experiments are then conducted on a real-world prostate cancer dataset to demonstrate its efficacy for early prostate cancer detection.
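For readers less familiar with the classical AUC index mentioned in the abstract, the following minimal sketch computes the empirical AUC as the fraction of positive-negative pairs that a scoring function ranks correctly (the Wilcoxon-Mann-Whitney form, with ties counted as one half). The function name `pairwise_auc` and the label coding $\{+1,-1\}$ are assumptions made for illustration and are not taken from the article.

```python
import numpy as np

def pairwise_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Assumes labels are coded as +1 (positive) and -1 (negative).
    Ties between a positive and a negative score count as 0.5.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # diff[k, l] = score of positive k minus score of negative l
    diff = pos[:, None] - neg[None, :]
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(correct / (pos.size * neg.size))

# Small imbalanced toy example: 2 positives, 3 negatives.
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 1, -1, -1, -1]
print(pairwise_auc(scores, labels))
```

Because the index only counts pairs across the two classes, it does not change when the majority class is enlarged with samples that are ranked in the same way, which is why it is commonly preferred to accuracy on imbalanced data.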

References

[1] Cancer stat facts: prostate cancer. https://seer.cancer.gov/statfacts/html/prost.html. Accessed 30 Apr 2018
[2] From development to use in clinical practice - ERSPC prostate cancer risk calculator. http://www.prostatecancer-riskcalculator.com/from-development-to-use-in-clinical-practice-erspc-prostate-cancer-risk-calculator. Accessed 30 Apr 2018
[3] LIBSVM data: classification (binary class). https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html. Accessed 30 Apr 2018
[4] UCI machine learning repository. https://archive.ics.uci.edu/ml/datasets.html. Accessed 30 Apr 2018
[5] (2004) Optimising area under the ROC curve using gradient descent. In: Proceedings of the twenty-first international conference on machine learning, ACM, p 49
[6] Ablin R, Pfeiffer L, Gonder M, Soanes W (1968) Precipitating antibody in the sera of patients treated cryosurgically for carcinoma of the prostate. Exp Med Surg 27(4):406–410
[7] Artan Y, Haider MA, Langer DL, Van der Kwast TH, Evans AJ, Yang Y, Wernick MN, Trachtenberg J, Yetik IS (2010) Prostate cancer localization with multispectral MRI using cost-sensitive support vector machines and conditional random fields. IEEE Trans Image Process 19(9):2444–2455
[8] Brefeld U, Scheffer T (2005) AUC maximizing support vector learning. In: Proceedings of the international conference on machine learning (ICML) 2005 workshop on ROC analysis in machine learning
[9] Calders T, Jaroszewicz S (2007) Efficient AUC optimization for classification. In: European conference on principles of data mining and knowledge discovery, Springer, pp 42–53
[10] Catalona W, Hudson M, Scardino P, Richie J, Ahmann F, Flanigan R, DeKernion J, Ratliff T, Kavoussi L, Dalkin B (1994) Selection of optimal prostate specific antigen cutoffs for early detection of prostate cancer: receiver operating characteristic curves. J Urol 152(6 Pt 1):2037–2042
[11] Catalona W, Richie J, Ahmann F, Hudson M, Scardino P, Flanigan R, DeKernion J, Ratliff T, Kavoussi L, Dalkin B (1994) Comparison of digital rectal examination and serum prostate specific antigen in the early detection of prostate cancer: results of a multicenter clinical trial of 6,630 men. J Urol 151(5):1283–1290
[12] Cawley GC (2006) Leave-one-out cross-validation based model selection criteria for weighted LS-SVMs. In: The 2006 IEEE international joint conference on neural network proceedings, IEEE, pp 1661–1668
[13] Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):27
[14] Chawla NV, Japkowicz N, Kotcz A (2004) Special issue on learning from imbalanced data sets. ACM SIGKDD Explor Newsl 6(1):1–6
[15] Çınar M, Engin M, Engin EZ, Ateşçi YZ (2009) Early prostate cancer diagnosis by using artificial neural networks and support vector machines. Expert Syst Appl 36(3):6357–6361
[16] Cortes C, Mohri M (2004) AUC optimization vs. error rate minimization. In: Advances in neural information processing systems, pp 313–320
[17] Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
[18] Elkan C (2001) The foundations of cost-sensitive learning. In: International joint conference on artificial intelligence, Lawrence Erlbaum Associates Ltd, vol 17, pp 973–978
[19] Gao W, Jin R, Zhu S, Zhou ZH (2013) One-pass AUC optimization. In: International conference on machine learning, pp 906–914
[20] Gao W, Zhou ZH (2015) On the consistency of AUC pairwise optimization. In: International joint conference on artificial intelligence (IJCAI), pp 939–945
[21] Ghazikhani A, Monsefi R, Yazdi HS (2014) Online neural network model for non-stationary and imbalanced data stream classification. Int J Mach Learn Cybern 5(1):51–62
[22] Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1):29–36
[23] Holst A et al (2008) Efficient AUC maximization with regularized least-squares. In: Tenth Scandinavian conference on artificial intelligence: SCAI 2008, IOS Press, vol 173, p 12
[24] Joachims T (2005) A support vector method for multivariate performance measures. In: Proceedings of the 22nd international conference on machine learning, ACM, pp 377–384
[25] Krawczyk B (2016) Learning from imbalanced data: open challenges and future directions. Prog Artif Intell 5(4):221–232
[26] Lee W, Jun CH, Lee JS (2017) Instance categorization by support vector machines to adjust weights in AdaBoost for imbalanced data classification. Inf Sci 381:92–103
[27] Li S, Zhang Y, Xu J, Li L, Zeng Q, Lin L, Guo Z, Liu Z, Xiong H, Liu S (2014) Noninvasive prostate cancer screening based on serum surface-enhanced Raman spectroscopy and support vector machine. Appl Phys Lett 105(9):091104
[28] Liu Y (2004) Active learning with support vector machine applied to gene expression data for cancer classification. J Chem Inf Comput Sci 44(6):1936–1941
[29] Mao W, Wang J, Xue Z (2017) An ELM-based model with sparse-weighting strategy for sequential data imbalance problem. Int J Mach Learn Cybern 8(4):1333–1345
[30] Nadji M, Tabei SZ, Castro A, Chu TM, Murphy GP, Wang MC, Morales AR (1981) Prostatic-specific antigen: an immunohistologic marker for prostatic neoplasms. Cancer 48(5):1229–1232
[31] Rakotomamonjy A (2004) Optimizing area under ROC curve with SVMs. In: ROCAI, pp 71–80
[32] Rezvani S, Wang X, Pourpanah F (2019) Intuitionistic fuzzy twin support vector machines. IEEE Trans Fuzzy Syst 27(11):2140–2151
[33] Riedel KS (1992) A Sherman-Morrison-Woodbury identity for rank augmenting matrices with application to centering. SIAM J Matrix Anal Appl 13(2):659–662
[34] Suykens J, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J (2002) Least squares support vector machine classifiers. World Scientific, Singapore
[35] Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999
[36] Wang G, Lu J, Choi KS, Zhang G (2018) A transfer-based additive LS-SVM classifier for handling missing data. IEEE Trans Cybern 50(2):739–752
[37] Wang G, Zhang G, Choi K, Lu J (2019) Deep additive least squares support vector machines for classification with model transfer. IEEE Trans Syst Man Cybern Syst 49(7):1527–1540
[38] Ye J, Xiong T (2007) SVM versus least squares SVM. In: Artificial intelligence and statistics, pp 644–651
[39] Ying Y, Wen L, Lyu S (2016) Stochastic online AUC maximization. In: Advances in neural information processing systems, pp 451–459
[40] Zhang C, Zhou Y, Guo J, Wang G, Wang X (2018) Research on classification method of high-dimensional class-imbalanced datasets based on SVM. Int J Mach Learn Cybern. https://doi.org/10.1007/s13042-018-0853-2
[41] Zhang K, Kwok JT (2010) Simplifying mixture models through function approximation. IEEE Trans Neural Netw 21(4):644–658
[42] Zhao P, Hoi SC, Jin R, Yang T (2011) Online AUC maximization. In: Proceedings of the 28th international conference on machine learning (ICML). International Machine Learning Society
[43] Zhou ZH, Liu XY (2006) Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans Knowl Data Eng 18(1):63–77
[44] Zhu Z, Wang Z, Li D, Du W (2019) Multiple empirical kernel learning with majority projection for imbalanced problems. Appl Soft Comput 76:221–236

Acknowledgements

This work was supported by the Innovation and Technology Commission of the Government of the Hong Kong SAR under the ITF-MRP project (MRP/015/18) and by the Australian Research Council (ARC) under Discovery Grant DP170101632. G. Wang is supported by a Murdoch New Staff Startup Grant (SEIT NSSG).

Author information

Authors and Affiliations

Discipline of Information Technology, Mathematics and Statistics, Murdoch University, Perth, Australia
Guanjin Wang
Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong, China
Guanjin Wang & Kup-Sze Choi
Division of Urology, Department of Surgery, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong, China
Jeremy Yuen-Chun Teoh
Centre for Artificial Intelligence, School of Software, Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, NSW, 2007, Australia
Jie Lu

Corresponding author

Correspondence to Guanjin Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Equation (10) can be reformulated as

$$\begin{aligned}&\begin{array}{ll} \min \limits _{\varvec{w},b,{\varvec{\xi }}} &{} \frac{1}{2}\varvec{w}^T\varvec{w} +\frac{\gamma }{2}\sum \limits _{i=1}^{N}\xi _{i}^2\\ &{} +\frac{C}{2}\sum \limits _{k\in N^+}\sum \limits _{l\in N^-} \frac{\left( \lambda -\varvec{w}^T(\varphi (\varvec{x}_k) -\varphi (\varvec{x}_l))\right) ^2}{n^+n^-} \end{array}\nonumber \\&\begin{array}{ll} \text {s.t.} &{} y_{i}=\varvec{w}^T\varphi (\varvec{x}_{i}) +b+\xi _{i},\quad i=1,2,\ldots ,N\\ &{} (N=n^+ + n^-) \end{array} \end{aligned}$$

(23)
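To make the three terms of Eq. (23) concrete, the sketch below evaluates the objective for a given $(\varvec{w}, b)$. It is only an illustration under stated assumptions: the feature map is taken to be linear, $\varphi (\varvec{x})=\varvec{x}$, labels are coded as $\pm 1$, and the function name `primal_objective` is hypothetical. The slack $\xi _i$ is obtained directly from the equality constraint.

```python
import numpy as np

def primal_objective(w, b, X, y, gamma, C, lam):
    """Objective of Eq. (23), assuming a linear feature map phi(x) = x.

    X: (N, d) samples, y: (N,) labels in {+1, -1}.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    pos = X[y == 1]
    neg = X[y == -1]
    # Slack from the equality constraint y_i = w^T x_i + b + xi_i
    xi = y - X @ w - b
    # margins[k, l] = w^T (x_k - x_l) for every positive k and negative l
    margins = (pos @ w)[:, None] - (neg @ w)[None, :]
    auc_term = np.sum((lam - margins) ** 2) / (pos.shape[0] * neg.shape[0])
    return 0.5 * (w @ w) + 0.5 * gamma * np.sum(xi ** 2) + 0.5 * C * auc_term

# Two-sample toy check: one positive, one negative.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, -1]
print(primal_objective([1.0, 0.0], 0.0, X, y, gamma=2.0, C=4.0, lam=1.0))
```

The pairwise term penalizes the squared deviation of each positive-negative score difference from $\lambda $, so it is averaged over $n^+n^-$ pairs rather than over the $N$ samples.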

To derive the dual problem, we construct the Lagrangian $J$ for Eq. (23):

$$\begin{aligned} J&=\frac{1}{2}\varvec{w}^T\varvec{w} + \frac{C}{2}\sum _{k\in N^+} \sum _{l\in N^-}\frac{\left( \lambda -\varvec{w}^T(\varphi (\varvec{x}_k) -\varphi (\varvec{x}_l))\right) ^2}{n^+n^-}\nonumber \\&\quad +\frac{\gamma }{2}\sum _{i=1}^{N}\xi _{i}^2 +\sum _{i=1}^{N}\alpha _{i}(y_{i}-\varvec{w}^T \varphi (\varvec{x}_{i})-b-\xi _{i}) \end{aligned}$$

(24)

where ${\varvec{\alpha }}=(\alpha _1,\alpha _2,\ldots ,\alpha _{N})$ is the vector of Lagrange multipliers. The conditions for optimality are given by

$$\begin{aligned} \frac{\partial J}{\partial \varvec{w}}&=0 \Rightarrow \varvec{w}+C\sum _{k\in N^+}\sum _{l\in N^-} \frac{\left( \lambda -\varvec{w}^T\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \right) }{n^+n^-}\nonumber \\&\quad \left( -\left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \right) -\sum _{i=1}^{N}\alpha _i\varphi (\varvec{x}_i)=0\nonumber \\&\quad \Rightarrow \varvec{w}+\frac{C}{n^+n^-} \sum _{k\in N^+}\sum _{l\in N^-}\left( \varphi (\varvec{x}_l) -\varphi (\varvec{x}_k)\right) \nonumber \\&\quad \left( \lambda -\varvec{w}^T\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \right) -\sum _{i=1}^{N}\alpha _i \varphi (\varvec{x}_i)=0 \end{aligned}$$

(25)

Since $\varvec{w}^T\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) $ is a scalar, $\varvec{w}^T \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) =\left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) ^T\varvec{w}$. We can therefore rewrite Eq. (25) as

$$\begin{aligned} \frac{\partial J}{\partial \varvec{w}}&=0 \Rightarrow \varvec{w}+\frac{\lambda C}{n^+ n^-} \sum _{k \in N^+} \sum _{l \in N^-}\left( \varphi (\varvec{x}_l) -\varphi (\varvec{x}_k)\right) \nonumber \\&\quad +\frac{C}{n^+ n^-}\sum _{k \in N^+} \sum _{l \in N^-} \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) ^T\varvec{w} -\sum _{i=1}^{N}\alpha _{i}\varphi (\varvec{x}_{i})=0\nonumber \\&\quad \Rightarrow \varvec{w}=\mathbf{H} \left( \sum _{i=1}^{N} \alpha _{i}\varphi (\varvec{x}_{i})+\frac{\lambda C}{n^+n^-} \sum _{k\in N^+}\sum _{l\in N^-} \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \right) \end{aligned}$$

(26)

where $\mathbf{H} =\left[ \varvec{I}+\frac{C}{n^+ n^-}\sum _{k\in N^+} \sum _{l\in N^-}\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) ^T\right] ^{-1}$, $\varvec{I}$ is the identity matrix whose dimension equals that of the feature space induced by $\varphi $, and each outer product $\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) ^T$ is a square matrix of the same dimension.

$$\begin{aligned} \frac{\partial J}{\partial b}= & {} 0 \Rightarrow \sum _{i=1}^{N}\alpha _{i}=0 \end{aligned}$$

(27)

$$\begin{aligned} \frac{\partial J}{\partial \xi _{i}}= & {} 0 \Rightarrow \alpha _{i}=\gamma \xi _{i} \end{aligned}$$

(28)

$$\begin{aligned} \frac{\partial J}{\partial \alpha _{i}}= & {} 0 \Rightarrow y_{i}=\varvec{w}^T\varphi {(\varvec{x}_{i})}+b+\xi _{i} \end{aligned}$$

(29)

According to the Sherman-Morrison-Woodbury formula [33], given an invertible (nonsingular) matrix $\mathbf{A} $ and column vectors $\varvec{u}$ and $\varvec{v}$ such that $1+\varvec{v}^{T}\mathbf{A} ^{-1} \varvec{u}\ne 0$, we have

$$\begin{aligned} (\mathbf{A }+{\varvec{uv}}^{T})^{-1}=\mathbf{A }^{-1} -\frac{\mathbf{A }^{-1}{\varvec{uv}}^{T}\mathbf{A }^{-1}}{1+{\varvec{v}}^{T}\mathbf{A }^{-1}{\varvec{u}}} \end{aligned}$$

(30)
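The rank-one identity in Eq. (30) is easy to verify numerically. The following sketch compares both sides for a small hand-picked example; the helper name `sherman_morrison_inverse` and the test matrices are assumptions chosen for illustration only.

```python
import numpy as np

def sherman_morrison_inverse(A_inv, u, v):
    """Return (A + u v^T)^{-1} from A^{-1} via Eq. (30).

    A_inv: (n, n) inverse of A; u, v: (n,) vectors.
    Raises if 1 + v^T A^{-1} u is numerically zero.
    """
    A_inv = np.asarray(A_inv, dtype=float)
    u = np.asarray(u, dtype=float).reshape(-1, 1)
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    denom = 1.0 + (v.T @ A_inv @ u).item()
    if abs(denom) < 1e-12:
        raise ValueError("1 + v^T A^{-1} u is zero; the update is singular")
    return A_inv - (A_inv @ u @ v.T @ A_inv) / denom

A = np.diag([2.0, 3.0, 4.0])
u = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 0.0, 1.0])

direct = np.linalg.inv(A + np.outer(u, v))          # left-hand side of Eq. (30)
updated = sherman_morrison_inverse(np.linalg.inv(A), u, v)  # right-hand side
print(np.allclose(direct, updated))
```

The practical appeal of the identity is that, once $\mathbf{A} ^{-1}$ is known, the inverse after a rank-one change costs only matrix-vector products instead of a fresh inversion.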

In particular, if $\mathbf{A} =\varvec{I}$, we immediately have $(\varvec{I}+\varvec{u}\varvec{v}^T)^{-1}=\varvec{I} -\frac{\varvec{u}\varvec{v}^T}{1+\varvec{v}^T\varvec{u}}$. By applying this formula to $\mathbf{H} $, we can rewrite $\mathbf{H} $ as

$$\begin{aligned} \mathbf{H}&= \varvec{I} -\frac{C}{n^+ n^-}\sum _{k \in N^+}\sum _{l \in N^-}\nonumber \\&\quad \frac{\left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) ^T}{\left[ 1+\frac{C}{n^+ n^-}\sum _{k \in N^+}\sum _{l \in N^-} \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) ^T \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \right] }\nonumber \\&= \varvec{I} \nonumber \\&\quad -\frac{\sum _{k \in N^+}\sum _{l\in N^-}\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) ^T}{\frac{n^+n^-}{C}+\sum _{k \in N^+} \sum _{l \in N^-}\left( k(\varvec{x}_k,\varvec{x}_k) +k(\varvec{x}_l,\varvec{x}_l)-2 k(\varvec{x}_k,\varvec{x}_l)\right) } \end{aligned}$$

(31)

Note that the denominator in Eq. (31) is a scalar. Denoting it by $M$, Eq. (31) can be simplified into

$$\begin{aligned} \mathbf{H}&=\varvec{I}-\frac{\sum _{k \in N^+} \sum _{l \in N^-}\left( \varphi {(\varvec{x}_k)} -\varphi {(\varvec{x}_l)}\right) \left( \varphi {(\varvec{x}_k)} -\varphi (\varvec{x}_l)\right) ^T}{M} \end{aligned}$$

(32)

and accordingly Eq. (26) can be simplified into

$$\begin{aligned} \varvec{w}&=\left( \varvec{I}-\frac{1}{M}\sum _{k \in N^+} \sum _{l \in N^-}\left( \varphi {(\varvec{x}_k)}-\varphi {(\varvec{x}_l)}\right) \left( \varphi {(\varvec{x}_k)}-\varphi {(\varvec{x}_l)}\right) ^T\right) \nonumber \\&\quad \left( \sum _{i=1}^{N}\alpha _i\varphi {(\varvec{x}_i)}+\frac{\lambda C}{n^+ n^-} \sum _{k \in N^+}\sum _{l \in N^-}\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \right) \end{aligned}$$

(33)

By eliminating $\varvec{w}$ and $\xi _i$, we can get the following solution

$$\begin{aligned} y_i&=\varphi ^T(\varvec{x}_i)\left( \varvec{I} -\frac{1}{M}\sum _{k \in N^+}\sum _{l \in N^-} \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) ^T\right) \nonumber \\&\quad \left( \sum _{j=1}^{N}\alpha _j\varphi (\varvec{x}_j) +\frac{\lambda C}{n^+n^-}\sum _{k \in N^+}\sum _{l \in N^-} \left( \varphi (\varvec{x}_k)-\varphi (\varvec{x}_l)\right) \right) +b+\frac{\alpha _i}{\gamma }\nonumber \\&=\left( \varphi ^T(\varvec{x}_i)-\frac{1}{M}\sum _{k \in N^+}\sum _{l \in N^-} \left( k(\varvec{x}_i,\varvec{x}_k)-k(\varvec{x}_i,\varvec{x}_l)\right) \left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) ^T\right) \nonumber \\&\quad \left( \sum _{j=1}^{N}\alpha _j\varphi (\varvec{x}_j)+\frac{\lambda C}{n^+n^-} \sum _{k \in N^+}\sum _{l \in N^-}\left( \varphi (\varvec{x}_k) -\varphi (\varvec{x}_l)\right) \right) +b+\frac{\alpha _i}{\gamma }\nonumber \\&=\sum _{k=1}^{N}\alpha _k\Bigg [ k(\varvec{x}_i,\varvec{x}_k) -\frac{1}{M}\sum _{p \in N^+}\sum _{l \in N^-} \left( k(\varvec{x}_i,\varvec{x}_p) -k(\varvec{x}_i,\varvec{x}_l)\right) \left( k(\varvec{x}_p,\varvec{x}_k)-k(\varvec{x}_l,\varvec{x}_k) \right) \Bigg ]\nonumber \\&\quad +\frac{\lambda C}{n^+ n^-}\Bigg \{ \sum _{k \in N^+}\sum _{l \in N^-} \left( k(\varvec{x}_i,\varvec{x}_k) -k(\varvec{x}_i,\varvec{x}_l)\right) -\frac{1}{M}\sum _{p\in N^+} \sum _{q\in N^-}\left( k(\varvec{x}_i,\varvec{x}_p) -k(\varvec{x}_i,\varvec{x}_q)\right) \nonumber \\&\quad \sum _{k\in N^+}\sum _{l \in N^-} \left( k(\varvec{x}_p,\varvec{x}_k)-k(\varvec{x}_p,\varvec{x}_l) -k(\varvec{x}_q,\varvec{x}_k)+k(\varvec{x}_q,\varvec{x}_l)\right) \Bigg \} +b+\frac{\alpha _i}{\gamma } \end{aligned}$$

(34)

We denote $k(\varvec{x}_i,\varvec{x}_k)-\frac{1}{M}\sum _{p \in N^+}\sum _{l \in N^-}\left( k(\varvec{x}_i,\varvec{x}_p) -k(\varvec{x}_i,\varvec{x}_l)\right) \left( k(\varvec{x}_p, \varvec{x}_k)-k(\varvec{x}_l,\varvec{x}_k)\right) $ as $\tilde{k}(\varvec{x}_i,\varvec{x}_k)$, and $\frac{C}{n^+ n^-} \left\{ \sum _{k \in N^+}\sum _{l \in N^-}\left( k(\varvec{x}_i, \varvec{x}_k)-k(\varvec{x}_i,\varvec{x}_l)\right) -\frac{1}{M}\sum _{p\in N^+} \sum _{q\in N^-} \left( k(\varvec{x}_i, \varvec{x}_p)-k(\varvec{x}_i,\varvec{x}_q)\right) \sum _{k\in N^+}\sum _{l \in N^-}\left( k(\varvec{x}_p,\varvec{x}_k) -k(\varvec{x}_p,\varvec{x}_l) -k(\varvec{x}_q, \varvec{x}_k)+k(\varvec{x}_q,\varvec{x}_l)\right) \right\} $ as $f(\varvec{x}_i)$. We can therefore rewrite Eq. (34) as

$$\begin{aligned} y_i=\sum _{k=1}^{N}\alpha _k \tilde{k}(\varvec{x}_i,\varvec{x}_k) +\lambda f(\varvec{x}_i)+b+\frac{\alpha _i}{\gamma } \end{aligned}$$

(35)

We can further write the above linear equation in the matrix form

$$\begin{aligned} \begin{bmatrix} \tilde{\mathbf{K }}+\frac{\varvec{I}}{\gamma } &{} \varvec{1} \\ \varvec{1}^T &{} 0 \end{bmatrix} \begin{bmatrix} {\varvec{\alpha }}\\ b \end{bmatrix} = \begin{bmatrix} \varvec{y}-\lambda \varvec{f}\\ 0 \end{bmatrix} \end{aligned}$$

(36)

where $\varvec{y}=[y_1,\ldots ,y_N]^T$, $\varvec{1}=[1,\ldots ,1]^T$, $\varvec{f} =[f(\varvec{x}_1),\ldots ,f(\varvec{x}_N)]^T$, and $\tilde{\mathbf{K }}=(\tilde{k}(\varvec{x}_i,\varvec{x}_k))_{N \times N}$.
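As a rough illustration of how the system in Eq. (36) could be assembled from a precomputed kernel matrix, consider the sketch below. It is not the authors' implementation. It assumes a symmetric kernel matrix `K`, labels coded as $\pm 1$, and reads $f(\varvec{x}_i)$ as $\frac{C}{n^+n^-}\varphi ^T(\varvec{x}_i)\mathbf{H} \sum _{k,l}(\varphi (\varvec{x}_k)-\varphi (\varvec{x}_l))$ with $\mathbf{H} $ taken from Eq. (32). The function name `fit_auc_ls_svm` and all variable names are hypothetical.

```python
import numpy as np

def fit_auc_ls_svm(K, y, gamma, C, lam):
    """Solve the linear system of Eq. (36) for (alpha, b).

    K: (N, N) symmetric kernel matrix, y: (N,) labels in {+1, -1}.
    Returns alpha, b, the modified kernel K_tilde and the vector f.
    """
    K = np.asarray(K, dtype=float)
    y = np.asarray(y, dtype=float)
    N = y.size
    P = np.flatnonzero(y == 1)    # indices of positive samples
    Q = np.flatnonzero(y == -1)   # indices of negative samples
    n_pos, n_neg = P.size, Q.size

    # Scalar M: the denominator of Eq. (31)
    diag = np.diag(K)
    M = n_pos * n_neg / C + np.sum(
        diag[P][:, None] + diag[Q][None, :] - 2.0 * K[np.ix_(P, Q)]
    )

    # D[i, (p, l)] = k(x_i, x_p) - k(x_i, x_l) for each positive p, negative l
    D = (K[:, P][:, :, None] - K[:, Q][:, None, :]).reshape(N, n_pos * n_neg)

    # Modified kernel k~(x_i, x_k); uses the symmetry of K
    K_tilde = K - D @ D.T / M

    # s[i] = sum over pairs of (k(x_i, x_k) - k(x_i, x_l))
    s = D.sum(axis=1)
    # g[(p, q)] = sum over pairs of (k_pk - k_pl - k_qk + k_ql) = s[p] - s[q]
    g = (s[P][:, None] - s[Q][None, :]).ravel()
    f = (C / (n_pos * n_neg)) * (s - D @ g / M)

    # Bordered system of Eq. (36)
    lhs = np.zeros((N + 1, N + 1))
    lhs[:N, :N] = K_tilde + np.eye(N) / gamma
    lhs[:N, N] = 1.0
    lhs[N, :N] = 1.0
    rhs = np.append(y - lam * f, 0.0)
    sol = np.linalg.solve(lhs, rhs)
    return sol[:N], sol[N], K_tilde, f

# Toy usage with a linear kernel on random data (3 positives, 5 negatives).
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 2))
y = np.array([1, 1, 1, -1, -1, -1, -1, -1])
alpha, b, K_tilde, f = fit_auc_ls_svm(X @ X.T, y, gamma=10.0, C=1.0, lam=1.0)
fitted = K_tilde @ alpha + 1.0 * f + b   # training outputs, per Eq. (35) without the slack term
```

Note that the pairwise difference matrix `D` has $N \times n^+n^-$ entries, so this direct construction is only practical for small datasets. Since $\lambda $ enters Eq. (36) only through the right-hand side, the coefficient matrix can be factorized once and reused for every candidate $\lambda $.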

About this article

Cite this article

Wang, G., Teoh, J.YC., Lu, J. et al. Least squares support vector machines with fast leave-one-out AUC optimization on imbalanced prostate cancer data. Int. J. Mach. Learn. & Cyber. 11, 1909–1922 (2020). https://doi.org/10.1007/s13042-020-01081-y

Received: 27 August 2019
Accepted: 01 February 2020
Published: 27 February 2020
Issue Date: August 2020
DOI: https://doi.org/10.1007/s13042-020-01081-y
