A Behavioral Modeling Approach to Prevent Unauthorized Large-Scale Documents Copying from Digital Libraries

Chapter in: Behavior Computing

Abstract

Digital libraries face many information security issues. Beyond the traditional information security problems, some are specific to digital libraries. In this work we consider a behavioral modeling approach to detecting unauthorized copying of a large number of documents from a digital library. Assuming that a regular user is interested in semantically related documents, we treat requests for semantically unrelated documents as anomalous behavior that may indicate an attempt at unauthorized large-scale copying. We use an adapted anomaly detection approach to discover such attempts. We propose a method, based on Markov chains, for constructing classifiers and profiles of regular users’ behavior. We also present the results of experiments conducted during the development of a prototype digital library protection system. Finally, examples of a normal profile and of an automatically detected anomalous session, both derived from the real log data of a digital library, illustrate the suggested approach.
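
The idea of profiling regular behavior with a Markov chain and flagging sessions that jump between unrelated documents can be illustrated with a minimal sketch. This is not the chapter's actual method, which is not given in the abstract. It assumes that every requested document has already been mapped to a topic label, that a first-order chain over those labels is enough, and that a session counts as anomalous when the mean log-probability of its transitions falls below a hand-picked threshold. All names, the toy sessions, and the threshold value are hypothetical.

```python
import math
from collections import defaultdict


def train_profile(sessions, smoothing=1.0):
    """Estimate first-order transition probabilities between topic labels.

    Additive smoothing (an assumption of this sketch) keeps unseen
    transitions from getting probability zero.
    """
    states = sorted({topic for session in sessions for topic in session})
    counts = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            counts[prev][nxt] += 1.0
    profile = {}
    for prev in states:
        total = sum(counts[prev].values()) + smoothing * len(states)
        profile[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in states
        }
    return profile


def session_score(profile, session, floor=1e-6):
    """Mean log-probability of the session's transitions under the profile."""
    transitions = list(zip(session, session[1:]))
    if not transitions:
        return 0.0
    log_prob = 0.0
    for prev, nxt in transitions:
        # Topics never seen in training fall back to a small floor probability.
        log_prob += math.log(profile.get(prev, {}).get(nxt, floor))
    return log_prob / len(transitions)


def is_anomalous(profile, session, threshold):
    """Flag a session whose transitions are unlikely under the profile."""
    return session_score(profile, session) < threshold


# Hypothetical training sessions: regular users stay within related topics.
training = [
    ["history", "history", "folklore", "history"],
    ["geology", "geology", "geology"],
    ["folklore", "folklore", "history", "folklore"],
    ["geology", "geology", "geology", "geology"],
]
profile = train_profile(training)

regular = ["history", "folklore", "history"]
hopping = ["history", "geology", "folklore", "geology"]
THRESHOLD = -1.5  # hypothetical cut-off, to be tuned on real logs

print(is_anomalous(profile, regular, THRESHOLD))
print(is_anomalous(profile, hopping, THRESHOLD))
```

Averaging the log-probability over transitions, rather than summing it, keeps long but coherent sessions from being penalized merely for their length. In practice the threshold would have to be calibrated against sessions known to be regular.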

This work was supported by a Kaspersky Lab grant as part of the “Program of support for innovative projects”.

Author information

Corresponding author

Correspondence to Evgeny E. Ivashko.

Copyright information

© 2012 Springer-Verlag London

About this chapter

Cite this chapter

Ivashko, E.E., Nikitina, N.N. (2012). A Behavioral Modeling Approach to Prevent Unauthorized Large-Scale Documents Copying from Digital Libraries. In: Cao, L., Yu, P. (eds) Behavior Computing. Springer, London. https://doi.org/10.1007/978-1-4471-2969-1_16

  • DOI: https://doi.org/10.1007/978-1-4471-2969-1_16

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-2968-4

  • Online ISBN: 978-1-4471-2969-1

  • eBook Packages: Computer Science; Computer Science (R0)
