DOI: 10.1145/3617233.3617273 (CBMI conference proceedings)
short-paper
Open Access

Efficient Search with an Interactive Video Retrieval System for Novice Users in IVR4B

Published: 30 December 2023

ABSTRACT

In this paper, we present the second release of VideoCLIP, an interactive CLIP-based video retrieval system that participated in the Video Browser Showdown 2023. While we retain the underlying architecture, which maps content between image and text, we concentrate on improving the experience for novice users. Specifically, we have implemented three different query modalities and redesigned the user interface to suit the context of the Interactive Video Retrieval for Beginners (IVR4B) workshop. These modifications aim to give newcomers a simple and efficient way to locate the desired videos.
