A new privacy-preserving proximal support vector machine for classification of vertically partitioned data

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

A new privacy-preserving proximal support vector machine (P3SVM) is formulated for classification of vertically partitioned data. Our classifier is based on the concept of a global random reduced kernel, which is composed of local reduced kernels. Each local kernel is computed using a local reduced matrix with Gaussian perturbation, which is privately generated by only one of the parties and never made public. This formulation leads to an extremely simple and fast privacy-preserving algorithm for generating a linear or nonlinear classifier that merely requires the solution of a single system of linear equations. Comprehensive experiments are conducted on multiple publicly available benchmark datasets to evaluate the performance of the proposed algorithms, and the results indicate that: (a) our P3SVM achieves better performance than the recently proposed privacy-preserving SVM via random kernels in terms of both classification accuracy and computational time; (b) our P3SVM attains a significant improvement in accuracy compared with classifiers generated using only each party’s own data; and (c) the generated classifier has accuracy comparable to that of an ordinary PSVM classifier trained on the entire dataset, without releasing any private data.
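
The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch makes several assumptions. It assumes the standard proximal SVM linear system (I/ν + HᵀH) z = Hᵀd with H = [K, −e]; it assumes each party's local reduced matrix is a random subset of its own rows plus Gaussian noise; it assumes all parties use the same row indices; and it assumes the global kernel is the sum of the local Gaussian kernels. All function names and parameter values (`mu`, `sigma`, `nu`, `k`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, B, mu):
    """K[i, j] = exp(-mu * ||X[i] - B[j]||^2)."""
    sq = (X ** 2).sum(axis=1)[:, None] - 2.0 * X @ B.T + (B ** 2).sum(axis=1)[None, :]
    return np.exp(-mu * np.maximum(sq, 0.0))

def local_reduced_kernel(A_j, rows, sigma, mu, rng):
    """Run by one party on its own columns A_j.

    Assumption: the local reduced matrix B_j is a row subset of A_j with
    Gaussian noise added. B_j stays private; only the kernel block is shared.
    """
    B_j = A_j[rows] + sigma * rng.standard_normal((len(rows), A_j.shape[1]))
    return gaussian_kernel(A_j, B_j, mu), B_j

def psvm_fit(K, d, nu):
    """Proximal SVM: solve the single system (I/nu + H'H) z = H'd, H = [K, -e]."""
    m = K.shape[0]
    H = np.hstack([K, -np.ones((m, 1))])
    z = np.linalg.solve(np.eye(H.shape[1]) / nu + H.T @ H, H.T @ d)
    return z[:-1], z[-1]  # kernel weights v and threshold gamma

def psvm_predict(K, v, gamma):
    """Classify by the sign of K v - gamma."""
    return np.sign(K @ v - gamma)

def run_demo(seed=0, m=200, k=20, nu=10.0, mu=0.1, sigma=0.1):
    rng = np.random.default_rng(seed)
    # Synthetic two-class data with labels in {+1, -1}.
    d = np.where(np.arange(m) < m // 2, 1.0, -1.0)
    A = 2.0 * d[:, None] + rng.standard_normal((m, 4))
    # Vertical partition: party 1 holds columns 0-1, party 2 holds columns 2-3.
    parts = [A[:, :2], A[:, 2:]]
    # Assumption: the parties agree on which k rows form the reduced set.
    rows = rng.choice(m, size=k, replace=False)
    # Assumption: the global reduced kernel is the sum of the local kernels.
    K = sum(local_reduced_kernel(A_j, rows, sigma, mu, rng)[0] for A_j in parts)
    v, gamma = psvm_fit(K, d, nu)
    acc = float(np.mean(psvm_predict(K, v, gamma) == d))
    return K, d, v, gamma, acc

if __name__ == "__main__":
    K, d, v, gamma, acc = run_demo()
    print("global kernel shape:", K.shape)
    print("training accuracy:", acc)
```

In this sketch each party reveals only its m × k kernel block, never its raw columns or its perturbed reduced matrix, and training reduces to one (k + 1) × (k + 1) linear solve. The paper's actual kernel composition and perturbation scheme may differ.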

Acknowledgments

This work is supported by the China Agricultural Research System (CARS-30).

Author information

Corresponding author

Correspondence to Zhi-Jian Zhou.

About this article

Cite this article

Sun, L., Mu, WS., Qi, B. et al. A new privacy-preserving proximal support vector machine for classification of vertically partitioned data. Int. J. Mach. Learn. & Cyber. 6, 109–118 (2015). https://doi.org/10.1007/s13042-014-0245-1

  • DOI: https://doi.org/10.1007/s13042-014-0245-1
