DOI: 10.1145/1571941.1572010
research-article

Automatic video tagging using content redundancy

Published: 19 July 2009

ABSTRACT

An analysis of YouTube, the leading social video sharing platform, reveals a high degree of redundancy in the form of videos with overlapping or duplicated content. In this paper, we show that this redundancy can provide useful information about connections between videos. We reveal these links using robust content-based video analysis techniques and exploit them to generate new tag assignments. To this end, we propose several tag propagation methods for automatically obtaining richer video annotations. Our techniques provide the user with additional information about videos and lead to enhanced feature representations for applications such as automatic data organization and search. Experiments on video clustering and classification, as well as a user evaluation, demonstrate the viability of our approach.
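The abstract does not spell out the propagation methods, so the following is only an illustrative sketch of the general idea: once content analysis has linked overlapping videos, tags can flow along those links. The function name `propagate_tags`, the link weights in [0, 1], and the `threshold` parameter are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def propagate_tags(tags, links, threshold=0.5):
    """Suggest new tags for each linked video from its overlapping neighbours.

    tags:      dict mapping video id -> set of existing tags
    links:     iterable of (video_a, video_b, weight); weight is a
               hypothetical overlap strength in [0, 1]
    threshold: assumed minimum link weight for a tag to be propagated
    """
    # Overlap is treated as symmetric, so store each link in both directions.
    neighbours = defaultdict(dict)
    for a, b, weight in links:
        neighbours[a][b] = weight
        neighbours[b][a] = weight

    suggestions = {}
    for video, linked in neighbours.items():
        # Score each candidate tag by the strongest link that carries it.
        scores = defaultdict(float)
        for other, weight in linked.items():
            for tag in tags.get(other, set()):
                scores[tag] = max(scores[tag], weight)
        own = tags.get(video, set())
        # Keep only sufficiently supported tags the video does not already have.
        suggestions[video] = {
            tag for tag, score in scores.items()
            if score >= threshold and tag not in own
        }
    return suggestions


# Toy data, invented for illustration only.
example_tags = {"v1": {"concert", "live"}, "v2": {"concert"}, "v3": {"cat"}}
example_links = [("v1", "v2", 0.9), ("v2", "v3", 0.2)]
example_result = propagate_tags(example_tags, example_links)
```

In this toy example, "v2" overlaps strongly with "v1" and so would be offered the tag "live", while the weak link between "v2" and "v3" falls below the assumed threshold and propagates nothing. Taking the maximum link weight per tag is one simple scoring choice; summing weights over neighbours would be an equally plausible alternative.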

          Published in

          SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
          July 2009
          896 pages
          ISBN: 9781605584836
          DOI: 10.1145/1571941

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 792 of 3,983 submissions, 20%
