Abstract
Because internet use is now so widespread, modern technology generates huge amounts of data every fraction of a second. This data comes from social networking sites, private and government organizations, hospitals, and educational institutes. People want to stay in touch with friends and others, so they keep accounts on many social sites, which results in huge volumes of data. Organizations with large numbers of employees hold personal data about those employees; this data is important and must be secured against any kind of misuse. Similarly, in hospitals, patient data is important for patient analysis and future use. All such data needs privacy and security protection. Hadoop is used for storing and analyzing data at this scale, and various tools work on top of the Hadoop stack to provide privacy and security for data. One existing method combines slicing with l-diversity to protect data from attribute disclosure, but it cannot prevent the skewness attack. This paper proposes an algorithm for avoiding the skewness attack.
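To illustrate why l-diversity alone leaves room for a skewness attack, the following is a minimal sketch and not the paper's algorithm. It assumes a single categorical sensitive attribute and uses the variational distance as the closeness measure, in the spirit of the t-closeness named in the paper's title. The data, the threshold t = 0.2, and all function names are hypothetical.

```python
from collections import Counter


def distinct_l(values):
    """Count distinct sensitive values in one bucket (distinct l-diversity)."""
    return len(set(values))


def distribution(values):
    """Map each sensitive value to its relative frequency."""
    total = len(values)
    return {value: count / total for value, count in Counter(values).items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(bucket, table, t):
    """True if the bucket's distribution is within t of the whole table's."""
    return variational_distance(distribution(bucket), distribution(table)) <= t


# Hypothetical data: 10% of all records are "positive".
table = ["negative"] * 90 + ["positive"] * 10
# Hypothetical published bucket: half of its records are "positive".
bucket = ["negative"] * 5 + ["positive"] * 5

print(distinct_l(bucket))  # 2, so distinct 2-diversity holds
d = variational_distance(distribution(bucket), distribution(table))
print(round(d, 3))  # 0.4
print(satisfies_t_closeness(bucket, table, t=0.2))  # False
```

The bucket passes a 2-diversity check, yet anyone known to be in it has a 50% chance of being "positive" against a 10% baseline. A distribution-based check such as the one sketched here flags that gap.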
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kaushik, P., Tayde, V.D. (2018). Data Privacy in Hadoop Using Anonymization and T-Closeness. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_35
DOI: https://doi.org/10.1007/978-981-10-8660-1_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8659-5
Online ISBN: 978-981-10-8660-1
eBook Packages: Computer Science (R0)