Data Privacy in Hadoop Using Anonymization and T-Closeness

  • Conference paper
Smart and Innovative Trends in Next Generation Computing Technologies (NGCT 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 828)

Abstract

Because nearly everyone uses the internet today, modern technology generates a huge amount of data every fraction of a second. Data is generated by social networking sites, private and government organizations, hospitals, and educational institutes. People who want to stay in touch with companions and others maintain accounts on many social sites, which results in the generation of huge volumes of data. In addition, organizations with large numbers of employees hold personal employee data that is very important and should be secured against any kind of misuse. Similarly, in hospitals, patient data is very important for patient analysis and future use. There is a need to provide privacy and security for all such data. Hadoop is used for storing and analyzing such huge data, and various tools work on top of the Hadoop stack to provide privacy and security. One existing method combines slicing with l-diversity to protect data from attribute disclosure, but it cannot prevent the skewness attack. This paper proposes an algorithm for avoiding the skewness attack.
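The gap between l-diversity and the skewness attack can be illustrated with a small sketch. It is not the algorithm proposed in the paper, which the abstract does not describe; it only shows, on made-up data, how a group can be l-diverse yet still leak information, and how a t-closeness check flags it. The function names, the two-group table, and the threshold values are all assumptions made for illustration. The distance used is the variational distance, which matches the Earth Mover's Distance for a categorical attribute only when every pair of distinct values is treated as equally far apart.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each sensitive value."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def is_l_diverse(groups, l):
    """Distinct l-diversity: every group has at least l distinct values."""
    return all(len(set(group)) >= l for group in groups)


def satisfies_t_closeness(groups, t):
    """Every group's distribution is within distance t of the overall one."""
    overall = distribution([v for group in groups for v in group])
    return all(
        variational_distance(distribution(group), overall) <= t
        for group in groups
    )


# Hypothetical table: 100 records, 10 of them "positive" overall.
group_a = ["positive"] * 5 + ["negative"] * 5    # 50% positive
group_b = ["positive"] * 5 + ["negative"] * 85   # about 5.6% positive
groups = [group_a, group_b]

overall = distribution(group_a + group_b)
distance_a = variational_distance(distribution(group_a), overall)

print(is_l_diverse(groups, 2))             # both groups hold 2 distinct values
print(satisfies_t_closeness(groups, 0.2))  # group_a is too far from overall
```

In this made-up table both groups pass distinct 2-diversity, yet anyone known to be in `group_a` has a 50% chance of being "positive" against a 10% baseline. That shift in belief is the skewness attack, and the t-closeness check with the assumed threshold of 0.2 rejects the grouping because `group_a` sits at distance 0.4 from the overall distribution.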

Author information

Corresponding author

Correspondence to Praveen Kaushik.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kaushik, P., Tayde, V.D. (2018). Data Privacy in Hadoop Using Anonymization and T-Closeness. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_35

  • DOI: https://doi.org/10.1007/978-981-10-8660-1_35

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8659-5

  • Online ISBN: 978-981-10-8660-1

  • eBook Packages: Computer Science, Computer Science (R0)
