Adversarial Cross-Modal Retrieval
Cross-modal retrieval aims to enable a flexible retrieval experience
across different modalities (e.g., texts vs. images). The core of cross-modal retrieval research is to learn a common subspace where the
items of different modalities can be directly compared to each other.
In this paper, we present a novel Adversarial Cross-Modal Retrieval
(ACMR) method, which seeks an effective common subspace based
on adversarial learning. Adversarial learning is implemented as
an interplay between two processes. The first process, a feature
projector, tries to generate a modality-invariant representation in
the common subspace and to confuse the other process, a modality
classifier, which tries to discriminate between different modalities
based on the generated representation. We further impose triplet
constraints on the feature projector in order to minimize the gap
among the representations of all items from different modalities
with the same semantic labels, while maximizing the distances among
semantically different images and texts. By jointly exploiting these mechanisms, the underlying cross-modal semantic structure
of multimedia data is better preserved when this data is projected
into the common subspace. Comprehensive experimental results
on four widely used benchmark datasets show that the proposed
ACMR method is superior in learning effective subspace representations and that it significantly outperforms state-of-the-art
cross-modal retrieval methods.
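To make the adversarial interplay concrete, here is a minimal PyTorch sketch of one ACMR-style training step. It is not the authors' implementation: the feature dimensions, network sizes, optimizers, loss weight (the 0.1 factor), and the pair-based triplet sampling are illustrative assumptions; the paper's formulation builds triplets from semantic labels and has additional terms in the embedding loss.

```python
# Sketch of an ACMR-style adversarial training step (illustrative, not the authors' code).
# Assumptions: 4096-d image features, 1000-d text features, a 200-d common subspace,
# matched image/text pairs plus one mismatched (negative) text per pair.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureProjector(nn.Module):
    """Projects a modality-specific feature into the common subspace."""
    def __init__(self, in_dim, common_dim=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, common_dim),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

class ModalityClassifier(nn.Module):
    """Tries to tell whether a common-subspace vector came from an image or a text."""
    def __init__(self, common_dim=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(common_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),  # class 0 = image, class 1 = text
        )
    def forward(self, z):
        return self.net(z)

img_proj, txt_proj = FeatureProjector(4096), FeatureProjector(1000)
modality_clf = ModalityClassifier()
triplet = nn.TripletMarginLoss(margin=0.3)

opt_proj = torch.optim.Adam(
    list(img_proj.parameters()) + list(txt_proj.parameters()), lr=1e-4)
opt_clf = torch.optim.Adam(modality_clf.parameters(), lr=1e-4)

def train_step(img_feat, txt_feat, txt_neg_feat):
    """img_feat: (B, 4096); txt_feat, txt_neg_feat: (B, 1000)."""
    z_img = img_proj(img_feat)
    z_txt = txt_proj(txt_feat)
    z_neg = txt_proj(txt_neg_feat)
    labels = torch.cat([torch.zeros(len(z_img)), torch.ones(len(z_txt))]).long()

    # 1) Modality classifier: learn to discriminate image vs. text embeddings.
    logits = modality_clf(torch.cat([z_img, z_txt]).detach())
    clf_loss = F.cross_entropy(logits, labels)
    opt_clf.zero_grad(); clf_loss.backward(); opt_clf.step()

    # 2) Feature projectors: pull matched pairs together (triplet constraint)
    #    while confusing the modality classifier (adversarial term).
    emb_loss = triplet(z_img, z_txt, z_neg)
    adv_logits = modality_clf(torch.cat([z_img, z_txt]))
    adv_loss = -F.cross_entropy(adv_logits, labels)  # maximize classifier error
    total = emb_loss + 0.1 * adv_loss
    opt_proj.zero_grad(); total.backward(); opt_proj.step()
    return clf_loss.item(), total.item()
```

The alternating update above is chosen for readability; the same min-max objective can also be realized in a single backward pass with a gradient-reversal layer between the projectors and the modality classifier.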