DIME: An Online Tool for the Visual Comparison of Cross-modal Retrieval Models

  • Conference paper
MultiMedia Modeling (MMM 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11962)

Abstract

Cross-modal retrieval relies on accurate models to retrieve relevant results for queries across modalities such as image, text, and video. In this paper, we build upon previous work by tackling the difficulty of quickly evaluating models both quantitatively and qualitatively. We present DIME (Dataset, Index, Model, Embedding), a modality-agnostic tool that handles multimodal datasets, trained models, and data preprocessors to support straightforward model comparison through a web-browser graphical user interface. DIME inherently supports building modality-agnostic queryable indexes and extracting relevant feature embeddings, and thus doubles as an efficient cross-modal tool for exploring and searching through datasets.
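The idea of a modality-agnostic index used to compare models can be illustrated with a minimal sketch. It is not DIME's implementation: the class `BruteForceIndex`, the function `recall_at_k`, and the toy two-dimensional embeddings are all hypothetical, and a brute-force cosine search stands in for whatever index a real system would use. The sketch assumes each model maps images and text into its own shared embedding space, and scores each model by recall@k on text-to-image queries.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class BruteForceIndex:
    """Hypothetical index: stores (id, embedding) pairs from any modality."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Vector]] = []

    def add(self, item_id: str, embedding: Vector) -> None:
        self._items.append((item_id, embedding))

    def search(self, query: Vector, k: int) -> List[str]:
        """Return the ids of the k stored items most similar to the query."""
        ranked = sorted(
            self._items, key=lambda item: cosine(query, item[1]), reverse=True
        )
        return [item_id for item_id, _ in ranked[:k]]


def recall_at_k(
    index: BruteForceIndex,
    queries: Dict[str, Vector],
    relevant: Dict[str, str],
    k: int,
) -> float:
    """Fraction of queries whose relevant item appears in the top-k results."""
    hits = sum(relevant[name] in index.search(vec, k) for name, vec in queries.items())
    return hits / len(queries)


# Ground truth: which image each text query should retrieve (toy data).
relevant = {"a cat": "img_cat", "a dog": "img_dog"}

# Hypothetical model A: text embeddings land near the matching image.
index_a = BruteForceIndex()
index_a.add("img_cat", [1.0, 0.0])
index_a.add("img_dog", [0.0, 1.0])
queries_a = {"a cat": [0.9, 0.1], "a dog": [0.2, 0.8]}

# Hypothetical model B: the "a cat" text embedding lands nearer the dog image.
index_b = BruteForceIndex()
index_b.add("img_cat", [1.0, 0.0])
index_b.add("img_dog", [0.0, 1.0])
queries_b = {"a cat": [0.4, 0.6], "a dog": [0.3, 0.7]}

recall_a = recall_at_k(index_a, queries_a, relevant, k=1)
recall_b = recall_at_k(index_b, queries_b, relevant, k=1)
print(f"model A recall@1 = {recall_a}, model B recall@1 = {recall_b}")
```

Because the index only ever sees identifiers and vectors, the same code serves image, text, or video embeddings, which is the sense of "modality-agnostic" assumed here. The returned ranked lists are also what a qualitative, side-by-side view of two models would display.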

Acknowledgments

Parts of this work were performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 and were supported by the LLNL-LDRD Program under Project No. 17-SI-003. Computation resources used in this work were partially supported by AWS Cloud Credits for Research. Any findings and conclusions are those of the authors and do not necessarily represent the views of the funders.

Author information

Corresponding author

Correspondence to Tony Zhao.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, T., Choi, J., Friedland, G. (2020). DIME: An Online Tool for the Visual Comparison of Cross-modal Retrieval Models. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_61

  • DOI: https://doi.org/10.1007/978-3-030-37734-2_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37733-5

  • Online ISBN: 978-3-030-37734-2

  • eBook Packages: Computer Science, Computer Science (R0)
