CSpace
Neighborhood Contrastive Transformer for Change Captioning
Tu, Yunbin [1]; Li, Liang [2,3]; Su, Li [1]; Lu, Ke [4,5]; Huang, Qingming [1]
2023
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Volume: 25; Pages: 9518-9529
Abstract: Change captioning aims to describe, in natural language, the semantic change between a pair of similar images. It is more challenging than general image captioning because it requires capturing fine-grained change information while remaining immune to irrelevant viewpoint changes, and resolving syntactic ambiguity in change descriptions. In this paper, we propose a neighborhood contrastive transformer to improve the model's ability to perceive various changes under different scenes and to understand complex syntax structure. Concretely, we first design a neighboring feature aggregating module to integrate neighboring context into each feature, which helps quickly locate inconspicuous changes under the guidance of conspicuous referents. Then, we devise a common feature distilling module to compare two images at the neighborhood level and extract common properties from each image, so as to learn effective contrastive information between them. Finally, we introduce explicit dependencies between words to calibrate the transformer decoder, which helps it better understand complex syntax structure during training. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on three public datasets with different change scenarios.
Keywords: Change captioning; neighborhood contrastive transformer; syntax dependencies
DOI: 10.1109/TMM.2023.3254162
Indexed by: SCI
Language: English
Funding: National Key R&D Program of China
WOS Research Area: Computer Science; Telecommunications
WOS Category: Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications
WOS Accession Number: WOS:001133324200036
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
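Citation statistics
Times cited: 1 [WOS]
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/38416
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding authors: Li, Liang; Su, Li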
Author affiliations:
1. Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
3. Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Zhejiang, Peoples R China
4. Univ Chinese Acad Sci, Sch Engn Sci, Beijing 101408, Peoples R China
5. Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
Recommended citation
GB/T 7714
Tu, Yunbin,Li, Liang,Su, Li,et al. Neighborhood Contrastive Transformer for Change Captioning[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2023,25:9518-9529.
APA Tu, Yunbin,Li, Liang,Su, Li,Lu, Ke,&Huang, Qingming.(2023).Neighborhood Contrastive Transformer for Change Captioning.IEEE TRANSACTIONS ON MULTIMEDIA,25,9518-9529.
MLA Tu, Yunbin,et al."Neighborhood Contrastive Transformer for Change Captioning".IEEE TRANSACTIONS ON MULTIMEDIA 25(2023):9518-9529.