Multi-modal semantic autoencoder for cross-modal retrieval
Wu, Yiling [1,2]; Wang, Shuhui [1]; Huang, Qingming [2]
2019-02-28
Journal: NEUROCOMPUTING
ISSN: 0925-2312
Volume: 331; Pages: 165-175
Abstract: Cross-modal retrieval has gained much attention in recent years. Most existing approaches, which form the research mainstream, learn projections that map data from different modalities into a common space where the data can be compared directly. However, they neglect the preservation of feature and semantic information and therefore fail to obtain satisfactory results. In this paper, we propose a two-stage learning method that learns multi-modal mappings projecting multi-modal data to low-dimensional embeddings that preserve both feature and semantic information. In the first stage, we combine low-level feature and high-level semantic information to learn feature-aware semantic code vectors. In the second stage, we use an encoder-decoder paradigm to learn the projections: the encoder projects feature vectors to code vectors, and the decoder projects code vectors back to feature vectors. The encoder-decoder paradigm ensures that the embeddings preserve both feature and semantic information. An alternating minimization procedure is developed to solve the multi-modal semantic autoencoder optimization problem. Extensive experiments on three benchmark datasets demonstrate that the proposed method outperforms state-of-the-art cross-modal retrieval methods. (C) 2018 Elsevier B.V. All rights reserved.
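The encoder-decoder idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the objective function, so everything below is an assumption made for illustration: the projections are linear, each one is fitted independently by ridge regression in closed form (not by the paper's alternating minimization), and the code vectors `S` are hypothetical random vectors standing in for the output of the first stage. The names `ridge_map`, `fit_modality`, `rank_by_cosine`, and `demo` are likewise invented here.

```python
import numpy as np


def ridge_map(A, B, lam=1e-2):
    """Return M minimizing ||M A - B||_F^2 + lam * ||M||_F^2 (closed form)."""
    d = A.shape[0]
    # M = B A^T (A A^T + lam I)^{-1}; solve the symmetric system rather than invert.
    return np.linalg.solve(A @ A.T + lam * np.eye(d), A @ B.T).T


def fit_modality(X, S, lam=1e-2):
    """Fit a linear encoder (features -> codes) and decoder (codes -> features)."""
    encoder = ridge_map(X, S, lam)  # shape (k, d)
    decoder = ridge_map(S, X, lam)  # shape (d, k)
    return encoder, decoder


def rank_by_cosine(query_codes, gallery_codes):
    """For each query column, rank gallery columns by cosine similarity."""
    q = query_codes / np.linalg.norm(query_codes, axis=0, keepdims=True)
    g = gallery_codes / np.linalg.norm(gallery_codes, axis=0, keepdims=True)
    return np.argsort(-(q.T @ g), axis=1)


def demo(n=50, k=4, d_img=20, d_txt=10, seed=0):
    """Toy run: two synthetic modalities generated from shared code vectors."""
    rng = np.random.default_rng(seed)
    # Hypothetical code vectors; in the paper these come from the first stage.
    S = rng.standard_normal((k, n))
    # Synthetic "image" and "text" features: linear in S plus small noise.
    X_img = rng.standard_normal((d_img, k)) @ S + 0.01 * rng.standard_normal((d_img, n))
    X_txt = rng.standard_normal((d_txt, k)) @ S + 0.01 * rng.standard_normal((d_txt, n))

    enc_img, dec_img = fit_modality(X_img, S)
    enc_txt, _ = fit_modality(X_txt, S)

    # Image-to-text retrieval in the shared code space.
    ranking = rank_by_cosine(enc_img @ X_img, enc_txt @ X_txt)
    top1 = float(np.mean(ranking[:, 0] == np.arange(n)))

    # Relative error of decoding the image codes back to image features.
    recon = dec_img @ (enc_img @ X_img)
    recon_err = float(np.linalg.norm(recon - X_img) / np.linalg.norm(X_img))
    return top1, recon_err


if __name__ == "__main__":
    print(demo())
```

The sketch evaluates on the same synthetic items it was fitted on, so it only shows the mechanics (encode both modalities into one space, compare there, decode back to features) and says nothing about the accuracy reported in the paper.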
Keywords: Cross-modal retrieval; Multi-modal data; Autoencoder
DOI: 10.1016/j.neucom.2018.11.042
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China [61672497]; National Natural Science Foundation of China [61332016]; National Natural Science Foundation of China [61620106009]; National Natural Science Foundation of China [61650202]; National Natural Science Foundation of China [U1636214]; National Basic Research Program of China (973 Program) [2015CB351802]; Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-SYS013]
WOS research area: Computer Science
WOS category: Computer Science, Artificial Intelligence
WOS accession number: WOS:000455210900015
Publisher: ELSEVIER SCIENCE BV
Times cited: 31 [WOS]
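Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/3477
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Wang, Shuhui
Affiliations:
1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing 100049, Peoples R China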
Recommended citation
GB/T 7714: Wu, Yiling, Wang, Shuhui, Huang, Qingming. Multi-modal semantic autoencoder for cross-modal retrieval[J]. NEUROCOMPUTING, 2019, 331: 165-175.
APA: Wu, Yiling, Wang, Shuhui, & Huang, Qingming. (2019). Multi-modal semantic autoencoder for cross-modal retrieval. NEUROCOMPUTING, 331, 165-175.
MLA: Wu, Yiling, et al. "Multi-modal semantic autoencoder for cross-modal retrieval". NEUROCOMPUTING 331 (2019): 165-175.