CSpace
Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval
Wang, Yabing [1,2,3]; Wang, Shuhui [4]; Luo, Hao [5,6]; Dong, Jianfeng [3,7]; Wang, Fan [5]; Han, Meng [8]; Wang, Xun [3,7]; Wang, Meng [9]
2024
Journal: IEEE TRANSACTIONS ON IMAGE PROCESSING
ISSN: 1057-7149
Volume: 33; Pages: 1522-1533
Abstract: Current research on cross-modal retrieval is mostly English-oriented, owing to the availability of a large number of English-oriented human-labeled vision-language corpora. To overcome the scarcity of non-English labeled data, cross-lingual cross-modal retrieval (CCR) has attracted increasing attention. Most CCR methods construct pseudo-parallel vision-language corpora via Machine Translation (MT) to achieve cross-lingual transfer. However, the sentences translated by MT generally describe the corresponding visual content imperfectly. Improperly assuming that the pseudo-parallel data are correctly correlated makes the networks overfit to the noisy correspondence. Therefore, we propose Dual-view Curricular Optimal Transport (DCOT) to learn with noisy correspondence in CCR. In particular, we quantify the confidence of the sample pair correlation with optimal transport theory from both the cross-lingual and cross-modal views, and design dual-view curriculum learning to dynamically model the transportation costs according to the learning stage of the two views. Extensive experiments are conducted on two multilingual image-text datasets and one video-text dataset, and the results demonstrate the effectiveness and robustness of the proposed method. In addition, the proposed method extends well to cross-lingual image-text baselines and generalizes reasonably to out-of-domain data.
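As an illustration of the idea of scoring pair confidence with optimal transport, the following is a minimal sketch only, not the authors' DCOT implementation. It assumes a square similarity matrix whose diagonal holds the nominally matched (pseudo-parallel) pairs, turns it into a cost matrix, solves an entropy-regularized transport problem with plain Sinkhorn iterations, and reads the rescaled diagonal mass as a confidence score. The function names, the `1 - similarity` cost, the uniform marginals, and the `epsilon` value are all assumptions made for the example; the dual-view curriculum that adjusts the costs during training is not shown.

```python
import math


def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Entropy-regularized optimal transport between two uniform marginals.

    `cost` is a list of rows; returns the transport plan as a list of rows.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel K = exp(-cost / epsilon)
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    row_mass = [1.0 / n] * n  # uniform source marginal (assumption)
    col_mass = [1.0 / m] * m  # uniform target marginal (assumption)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def pair_confidence(similarity, epsilon=0.1):
    """Hypothetical confidence score for the i-th nominal pair (i, i).

    Uses cost = 1 - similarity and rescales the diagonal of the plan by n,
    so a pair that keeps all of its transport mass scores close to 1.
    """
    cost = [[1.0 - s for s in row] for row in similarity]
    plan = sinkhorn_plan(cost, epsilon)
    n = len(plan)
    return [plan[i][i] * n for i in range(n)]


if __name__ == "__main__":
    # Toy similarities: the third nominal pair is weakly correlated.
    sim = [[0.9, 0.1, 0.2],
           [0.2, 0.8, 0.1],
           [0.5, 0.6, 0.4]]
    print(pair_confidence(sim))
```

In this toy input the third pair receives the lowest score, since part of its mass is transported to other candidates. Because the marginal constraints force every sample to send and receive the same total mass, the scores are relative within a batch rather than absolute measures of correctness.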
Keywords: Visualization; Noise measurement; Estimation; Costs; Transportation; Training; Task analysis; Cross-modal retrieval; noise correspondence learning; cross-lingual transfer; optimal transport; machine translation
DOI: 10.1109/TIP.2024.3365248
Indexed by: SCI
Language: English
Funding: Pioneer and Leading Goose Research and Development Program of Zhejiang[2023C01212] ; Young Elite Scientists Sponsorship Program by China Association for Science and Technology (CAST)[2022QNRC001] ; Zhejiang Provincial Natural Science Foundation[LZ23F020004] ; National Natural Science Foundation of China[62236008] ; National Natural Science Foundation of China[62376246] ; Zhejiang Key Laboratory of Multidimensional Perception Technology Application and Cybersecurity[HIKKL-20230007]
WOS Research Area: Computer Science ; Engineering
WOS Category: Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS Accession Number: WOS:001177650300006
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/38717
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding Author: Dong, Jianfeng
Affiliations:
1.Xi An Jiao Tong Univ, Natl Key Lab Human Machine Hybrid Augmented Intell, Natl Engn Res Ctr Visual Informat & Applicat, Xian 710049, Shaanxi, Peoples R China
2.Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
3.Zhejiang Gongshang Univ, Coll Comp Sci & Technol, Hangzhou 310035, Peoples R China
4.Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
5.Alibaba Grp, Hangzhou 310052, Peoples R China
6.Hupan Lab, Hangzhou 310058, Zhejiang, Peoples R China
7.Zhejiang Key Lab E Commerce, Zhoushan 311121, Peoples R China
8.Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310058, Peoples R China
9.Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
Recommended Citation
GB/T 7714
Wang, Yabing,Wang, Shuhui,Luo, Hao,et al. Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING,2024,33:1522-1533.
APA Wang, Yabing.,Wang, Shuhui.,Luo, Hao.,Dong, Jianfeng.,Wang, Fan.,...&Wang, Meng.(2024).Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval.IEEE TRANSACTIONS ON IMAGE PROCESSING,33,1522-1533.
MLA Wang, Yabing,et al."Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval".IEEE TRANSACTIONS ON IMAGE PROCESSING 33(2024):1522-1533.
 

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.