10 November 2022
Coordinated and specific autoencoder for cross-modal retrieval
Menghan Xu, Bo Sun, Chong Wang, Fangxiang Feng
Proceedings Volume 12331, International Conference on Mechanisms and Robotics (ICMAR 2022); 123313M (2022) https://doi.org/10.1117/12.2652351
Event: International Conference on Mechanisms and Robotics (ICMAR 2022), 2022, Zhuhai, China
Abstract
This paper considers the problem of cross-modal retrieval, e.g., using a text query to search for images and vice versa. Existing approaches usually learn a common subspace in which the shared parts of different modalities can be directly compared. However, no previous work explicitly shows that the learned space contains only the common information and excludes the modality-specific information, even though separating these two types of information would benefit cross-modal retrieval. In this paper, we present a COordinated and Specific autoEncoder (COSE) that can distinguish the common part from the modality-specific part of different modalities. COSE consists of two subnetworks, each with two representation layers. The common representation layer learns the patterns shared across modalities, while the modality-specific representation layer learns the patterns owned by an individual modality. We evaluate our model on three publicly available real-world datasets on the task of cross-modal retrieval. Extensive experiments demonstrate the effectiveness of COSE.
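The structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, activations, or loss functions, so everything below is an assumption made for illustration. In particular, it assumes one subnetwork per modality (text and image), a single tanh layer for each representation, a linear decoder that reconstructs the input from both representations, and cosine similarity on the common representation for retrieval. The names `ModalitySubnetwork`, `retrieve`, and all dimensions are hypothetical.

```python
import math
import random


def linear(x, weights):
    """Apply a bias-free linear map; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def tanh_layer(x, weights):
    """Linear map followed by an element-wise tanh (assumed activation)."""
    return [math.tanh(v) for v in linear(x, weights)]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ModalitySubnetwork:
    """Hypothetical subnetwork for one modality with two representation layers."""

    def __init__(self, input_dim, common_dim, specific_dim, rng):
        self.w_common = random_matrix(common_dim, input_dim, rng)
        self.w_specific = random_matrix(specific_dim, input_dim, rng)
        self.w_decode = random_matrix(input_dim, common_dim + specific_dim, rng)

    def encode(self, x):
        """Return (common representation, modality-specific representation)."""
        return tanh_layer(x, self.w_common), tanh_layer(x, self.w_specific)

    def reconstruct(self, x):
        """Decode the input from the concatenation of both representations."""
        common, specific = self.encode(x)
        return linear(common + specific, self.w_decode)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_common, candidate_commons):
    """Rank candidate indices by decreasing similarity in the common space."""
    scores = [cosine(query_common, c) for c in candidate_commons]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    text_net = ModalitySubnetwork(input_dim=8, common_dim=4, specific_dim=2, rng=rng)
    image_net = ModalitySubnetwork(input_dim=16, common_dim=4, specific_dim=2, rng=rng)

    text_query = [rng.random() for _ in range(8)]
    images = [[rng.random() for _ in range(16)] for _ in range(3)]

    # Only the common representations are compared across modalities;
    # the modality-specific representations are left out of the ranking.
    query_common, _ = text_net.encode(text_query)
    image_commons = [image_net.encode(img)[0] for img in images]
    print(retrieve(query_common, image_commons))
```

With untrained random weights the ranking carries no meaning. The sketch only shows where the common and modality-specific representations sit and why retrieval would use the common one alone.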
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Menghan Xu, Bo Sun, Chong Wang, and Fangxiang Feng "Coordinated and specific autoencoder for cross-modal retrieval", Proc. SPIE 12331, International Conference on Mechanisms and Robotics (ICMAR 2022), 123313M (10 November 2022); https://doi.org/10.1117/12.2652351
KEYWORDS
Image restoration
Optimization (mathematics)
Image retrieval
Statistical modeling
Multimedia