Data Privacy in Hadoop Using Anonymization and T-Closeness

Kaushik, Praveen; Tayde, Varsha Dipak

doi:10.1007/978-981-10-8660-1_35


Part of the book series: Communications in Computer and Information Science (CCIS, volume 828)

Included in the following conference series:

International Conference on Next Generation Computing Technologies


Abstract

With internet use now widespread, modern technology generates a huge amount of data every fraction of a second. Data comes from social networking sites, private and government organizations, hospitals, and educational institutes. Because people want to stay in touch with friends and others, they maintain accounts on many social sites, which results in a huge volume of data. In addition, organizations with large numbers of employees hold personal employee data that is important and must be secured against any kind of misuse. Similarly, in hospitals, patient data is important for patient analysis and future use. All such data needs privacy and security. Hadoop is used to store and analyze data at this scale, and various tools work on top of the Hadoop stack to provide privacy and security. One existing method combines slicing with l-diversity to protect data from attribute disclosure, but it cannot prevent the skewness attack. This paper proposes an algorithm for avoiding the skewness attack.
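To make the skewness attack concrete, the following minimal sketch shows a table whose groups each satisfy distinct 2-diversity while one group's sensitive-value distribution differs sharply from the overall distribution. It is an illustration only and is not the algorithm proposed in the paper. The data, the function names, and the thresholds are all hypothetical, and the distance used is the variational distance, which coincides with the Earth Mover's Distance when all categorical values are treated as equally far apart.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each sensitive value."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def distinct_l(group):
    """Number of distinct sensitive values in a group (distinct l-diversity)."""
    return len(set(group))


def satisfies_t_closeness(groups, t):
    """True if every group's distribution is within t of the overall one."""
    overall = distribution([value for group in groups for value in group])
    return all(
        variational_distance(distribution(group), overall) <= t
        for group in groups
    )


# Hypothetical sensitive-attribute values for two anonymized groups.
group_a = ["hiv"] * 5 + ["flu"] * 5    # 50% "hiv" inside this group
group_b = ["hiv"] * 1 + ["flu"] * 89   # about 1% "hiv" inside this group
groups = [group_a, group_b]

overall = distribution(group_a + group_b)  # 6% "hiv" in the whole table
distance_a = variational_distance(distribution(group_a), overall)

print([distinct_l(g) for g in groups])     # both groups are 2-diverse
print(round(distance_a, 2))                # distance of group_a from overall
print(satisfies_t_closeness(groups, 0.2))  # assumed threshold t = 0.2
```

In this hypothetical table, an attacker who knows that a person falls in the first group can infer a 50% chance of the rare value against a 6% baseline, even though the group is 2-diverse. A t-closeness check with a small threshold flags exactly this gap.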



Author information

Authors and Affiliations

Maulana Azad National Institute of Technology, Bhopal, 462003, India
Praveen Kaushik & Varsha Dipak Tayde


Corresponding author

Correspondence to Praveen Kaushik.

Editor information

Editors and Affiliations

Indian Institute of Technology Patna, Patna, Bihar, India
Pushpak Bhattacharyya
University of Petroleum and Energy Studies, Dehradun, India
Hanumat G. Sastry
University of Petroleum and Energy Studies, Dehradun, India
Venkatadri Marriboyina
University of Petroleum and Energy Studies, Dehradun, India
Rashmi Sharma


About this paper

Cite this paper

Kaushik, P., Tayde, V.D. (2018). Data Privacy in Hadoop Using Anonymization and T-Closeness. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_35


DOI: https://doi.org/10.1007/978-981-10-8660-1_35
Published: 09 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8659-5
Online ISBN: 978-981-10-8660-1
eBook Packages: Computer Science, Computer Science (R0)
