A Novel Anti-classification Approach for Knowledge Protection

Journal of Medical Systems
  • Transactional Processing Systems

Abstract

Classification is the problem of identifying which of a set of categories new data belong to, on the basis of training data whose category membership is known. It is widely applied, for example in medical science. Because of the popularity of cloud environments, protecting classification knowledge has attracted increasing attention in recent years. In this paper, we propose a Shaking Sorted-Sampling (triple-S) algorithm for protecting the classification knowledge of a dataset. The triple-S algorithm sorts the records of the original dataset by their projections from principal component analysis (PCA), so that adjacent records have similar features. It then generates noise data with incorrect class labels and adds them to the original dataset. In addition, we develop an effective positioning strategy that determines where the noise data are inserted, which ensures that the original dataset can be restored by removing the noise data. The experimental results show that the disturbance effect of the triple-S algorithm on the CLC, MySVM, and LibSVM classifiers increases as the noise data ratio increases. Moreover, compared with existing methods, the triple-S algorithm disturbs MySVM and LibSVM more strongly once the amount of noise data added to the original dataset reaches a certain level.
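The pipeline described in the abstract (sort by PCA projection, add mislabelled noise, remember where it went) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the noise records are generated or how their positions are chosen, so everything below is an assumption made for illustration. In particular, the function names `first_pc_scores`, `protect`, and `restore` are hypothetical, only the first principal component is used for sorting, each noise record simply copies the features of an adjacent sorted record, and the positioning strategy is replaced by a keyed pseudo-random choice of positions that the data owner can regenerate from a secret `key`.

```python
import random


def first_pc_scores(rows, iters=200):
    """Project mean-centred rows onto the first principal component.

    Uses plain power iteration on the covariance matrix; this can fail to
    converge to the top component if the start vector is orthogonal to it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centred]


def protect(rows, labels, noise_ratio, key):
    """Sort records by PC score and interleave mislabelled noise records.

    Assumes at least two distinct class labels. Returns the mixed rows,
    the mixed labels, and the number of noise records added.
    """
    scores = first_pc_scores(rows)
    order = sorted(range(len(rows)), key=lambda i: scores[i])
    srows = [rows[i] for i in order]
    slabels = [labels[i] for i in order]

    n_noise = int(round(noise_ratio * len(rows)))
    total = len(rows) + n_noise
    rng = random.Random(key)
    # Assumed positioning strategy: positions depend only on key and sizes.
    noise_pos = set(rng.sample(range(total), n_noise))
    classes = sorted(set(labels))

    mixed_rows, mixed_labels, src = [], [], 0
    for pos in range(total):
        if pos in noise_pos:
            # Assumed noise model: copy an adjacent real record's features
            # and attach a class label that differs from that record's label.
            j = min(src, len(srows) - 1)
            wrong = [c for c in classes if c != slabels[j]]
            mixed_rows.append(list(srows[j]))
            mixed_labels.append(rng.choice(wrong))
        else:
            mixed_rows.append(list(srows[src]))
            mixed_labels.append(slabels[src])
            src += 1
    return mixed_rows, mixed_labels, n_noise


def restore(mixed_rows, mixed_labels, n_noise, key):
    """Regenerate the noise positions from the key and drop those records."""
    total = len(mixed_rows)
    noise_pos = set(random.Random(key).sample(range(total), n_noise))
    keep = [i for i in range(total) if i not in noise_pos]
    return [mixed_rows[i] for i in keep], [mixed_labels[i] for i in keep]
```

Under these assumptions, calling `protect(rows, labels, 0.5, key=42)` on a dataset of six records yields nine records, and `restore(..., key=42)` returns the six original records in PCA-sorted order. Recovering the original row order as well would additionally require storing the sort permutation, which this sketch does not do.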




Acknowledgments

This work was supported in part by the Ministry of Science and Technology of the Republic of China under grants MOST 103-2221-E-025-007 and 102-2218-E-025-001.

Corresponding author

Correspondence to Yuan-Hung Kao.

Additional information

This article is part of the Topical Collection on Transactional Processing Systems


Cite this article

Lin, CY., Chen, TS., Tsai, HF. et al. A Novel Anti-classification Approach for Knowledge Protection. J Med Syst 39, 113 (2015). https://doi.org/10.1007/s10916-015-0305-4


  • DOI: https://doi.org/10.1007/s10916-015-0305-4
