research-article

Automatic video tagging using content redundancy

Authors:
Stefan Siersdorfer, L3S Research Centre, Hannover, Germany
Jose San Pedro, University of Sheffield, Sheffield, United Kingdom
Mark Sanderson, University of Sheffield, Sheffield, United Kingdom

SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 2009, Pages 395–402
https://doi.org/10.1145/1571941.1572010

Published: 19 July 2009

ABSTRACT

An analysis of YouTube, the leading social video sharing platform, reveals a high degree of redundancy in the form of videos with overlapping or duplicated content. In this paper, we show that this redundancy can provide useful information about connections between videos. We reveal these links using robust content-based video analysis techniques and exploit them to generate new tag assignments. To this end, we propose several tag propagation methods that automatically produce richer video annotations. Our techniques give the user additional information about videos and yield enhanced feature representations for applications such as automatic data organization and search. Experiments on video clustering and classification, together with a user evaluation, demonstrate the viability of our approach.
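
The abstract does not spell out the propagation methods, so the following is only a minimal sketch of the general idea of neighbour-based tag propagation over content-overlap links. It is not the authors' algorithm. The function name `propagate_tags`, the input formats, the link weights, and the `threshold` parameter are all assumptions made for illustration: each video scores a candidate tag by summing the weights of the overlap links to neighbours that carry that tag, and keeps candidates whose score reaches the threshold.

```python
from collections import defaultdict


def propagate_tags(tags, links, threshold=0.5):
    """Suggest new tags for each video from its content-linked neighbours.

    tags:      dict mapping video id -> set of existing tags (assumed format)
    links:     iterable of (video_a, video_b, weight) undirected overlap links;
               the weight in [0, 1] is a hypothetical overlap strength
    threshold: hypothetical minimum score a candidate tag must reach
    """
    # Build an undirected adjacency list from the overlap links.
    neighbours = defaultdict(list)
    for a, b, weight in links:
        neighbours[a].append((b, weight))
        neighbours[b].append((a, weight))

    suggestions = {}
    for video, own_tags in tags.items():
        scores = defaultdict(float)
        for other, weight in neighbours[video]:
            for tag in tags.get(other, set()):
                if tag not in own_tags:
                    # Accumulate link weight as evidence for this tag.
                    scores[tag] += weight
        # Keep only candidates with enough accumulated evidence.
        suggestions[video] = {t: s for t, s in scores.items() if s >= threshold}
    return suggestions


# Toy data (invented for illustration, not taken from the paper).
example_tags = {
    "v1": {"goal", "football"},
    "v2": {"football", "worldcup"},
    "v3": {"cat"},
}
example_links = [("v1", "v2", 0.8), ("v2", "v3", 0.1)]
suggested = propagate_tags(example_tags, example_links)
```

With this toy data, the strong link between `v1` and `v2` lets each video inherit the other's missing tag, while the weak link to `v3` falls below the threshold and propagates nothing. Summing weights is only one possible scoring rule; the choice of weighting and threshold would need to be tuned against real data.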

Index Terms

Automatic video tagging using content redundancy
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing systems and tools
2. Information systems
  1. Information systems applications
  2. World Wide Web

Published in
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
896 pages
ISBN: 9781605584836
DOI: 10.1145/1571941
General Chairs: James Allan (University of Massachusetts Amherst, USA), Javed Aslam (Northeastern University, USA)
Program Chairs: Mark Sanderson (University of Sheffield, UK), ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA), Justin Zobel (University of Melbourne, Australia)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automatic tagging
content-based links
data organization
neighbor-based tagging
tag propagation
video duplicates
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%