RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing: a comprehensive review

Sikos, Leslie F.

doi:10.1007/s11042-016-3705-7

Published: 19 August 2016

Multimedia Tools and Applications, Volume 76, pages 14437–14460 (2017)

Abstract

Video annotation tools are often compared in the literature; however, most reviews mix unstructured and semi-structured annotation software with the very few structured tools. This paper is a comprehensive review of video annotation tools that generate structured data output for video clips, regions of interest, frames, and media fragments, with a focus on Linked Data support. The tools are compared in terms of supported input and output data formats, expressivity, annotation specificity, spatial and temporal fragmentation, the concept mapping sources used for Linked Open Data (LOD) interlinking, provenance data support, and standards alignment. Practicality and usability aspects of the tools' user interfaces are highlighted. Moreover, this review distinguishes extensively researched yet discontinued semantic video annotation software from promising state-of-the-art tools that show new directions in this increasingly important field.

Notes

There are also cross-media annotation tools, such as IMAS and YUMA, which provide annotations for multiple media types (see Section 3).
http://vitooki.sourceforge.net/components/muvino/code/index.html
http://www.exmaralda.org/en/tool/exmaralda/
http://www.research.ibm.com/VideoAnnEx/
https://tla.mpi.nl/tools/tla-tools/elan/
http://sourceforge.net/projects/via-tool/
http://www.joanneum.at/en/digital/productssolutions/sematic-video-annotation.html
https://www.dimis.fim.uni-passau.de/iris/index.php?view=vanalyzer
https://www.dimis.fim.uni-passau.de/MDPS/de/mitglieder/30-german-articles/forschung/projekte/33-svcat.html
http://www.anvil-software.org
http://schema.org/Clip
http://xmlns.com/foaf/spec/
It is a common practice to abbreviate terms using the namespace mechanism, which relies on a prefix to eliminate long (often symbolic) URIs, such that schema: abbreviates http://schema.org/ and foaf: abbreviates http://xmlns.com/foaf/0.1/. For example, foaf:depicts abbreviates http://xmlns.com/foaf/0.1/depicts.
http://wordnet-rdf.princeton.edu/ontology
https://sourceforge.net/projects/texai/files/open-cyc-rdf/1.1/
http://swrl.stanford.edu/ontologies/built-ins/3.3/temporal.owl
http://vidont.org/vidont.ttl
https://www.w3.org/TR/media-frags/
In the example, concept names are written in PascalCase, role names in camelCase, and individual names in ALL CAPS, as per description logic best practices.
http://dbpedia.org
http://lod-cloud.net
https://www.w3.org/2001/Annotea/
http://advene.org
http://www.ontomedia.de
http://annomation.open.ac.uk
http://tomayac.com/semwebvid/
https://www.youtube.com
https://github.com/paulweichhart/client-suite
http://www.geonames.org/ontology/
http://www.openannotation.org/spec/core/
https://www.wikidata.org
http://linkedtv.eurecom.fr/tv2rdf
http://editortoolv2.linkedtv.eu
http://www.openvideoannotation.org
http://videojs.com
http://annotatorjs.org
https://github.com/andreruffert/rangeslider.js
http://www.eclap.eu
http://vidont.org/semvidlod/
http://www.w3.org/TR/prov-o/
http://standards.iso.org/ittf/PubliclyAvailableStandards/c035641_ISO_IEC_16448_2002%28E%29.zip
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-267.pdf
http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=51140
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=34228
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=39478
https://www.ietf.org/rfc/rfc1738.txt
https://www.ietf.org/rfc/rfc5013.txt
http://www.iso.org/iso/catalogue_detail.htm?csnumber=52142
http://www.niso.org/apps/group_public/project/details.php?project_id=105
https://www.w3.org/TR/rdf11-concepts/
https://www.w3.org/TR/skos-reference/
https://vimeo.com
http://www.liveleak.com
http://dublincore.org/documents/dcmi-terms/
https://www.w3.org/TR/mediaont-10/
http://xmlns.com/foaf/spec/
http://www.openannotation.org/ns/
https://www.w3.org/2011/content
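
The namespace mechanism described in the note on prefix abbreviation can be sketched in a few lines of Python. This is a minimal illustration and is not taken from the article: the `PREFIXES` table and the `expand` and `abbreviate` helpers are hypothetical names, and only the two prefix bindings given in that note (`schema:` and `foaf:`) are assumed.

```python
# Prefix-to-namespace bindings; both entries come from the note on
# namespace abbreviation. The table name itself is hypothetical.
PREFIXES = {
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(term: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name such as 'foaf:depicts' into a full URI."""
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {term!r}")
    return prefixes[prefix] + local


def abbreviate(uri: str, prefixes: dict = PREFIXES) -> str:
    """Abbreviate a full URI with a known prefix; return it unchanged otherwise."""
    # Collect every namespace that the URI starts with.
    matches = [(ns, p) for p, ns in prefixes.items() if uri.startswith(ns)]
    if not matches:
        return uri
    # Prefer the longest matching namespace so nested namespaces resolve correctly.
    namespace, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(namespace):]}"


if __name__ == "__main__":
    print(expand("foaf:depicts"))
    print(abbreviate("http://schema.org/Clip"))
```

With these bindings, `expand("foaf:depicts")` yields the full URI quoted in the note, and `abbreviate` performs the reverse mapping. RDF serializations such as Turtle declare the same bindings once at the top of a file and then use the short forms throughout.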

Author information

Authors and Affiliations

Centre for Knowledge and Interaction Technologies, School of Computer Science, Engineering and Mathematics, Flinders University, GPO Box 2100, Adelaide, SA, 5001, Australia
Leslie F. Sikos

Corresponding author

Correspondence to Leslie F. Sikos.

About this article

Cite this article

Sikos, L.F. RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing: a comprehensive review. Multimed Tools Appl 76, 14437–14460 (2017). https://doi.org/10.1007/s11042-016-3705-7

Received: 05 December 2015
Revised: 24 April 2016
Accepted: 24 June 2016
Published: 19 August 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s11042-016-3705-7
