Institute of Computing Technology, Chinese Academy IR
Neighborhood Contrastive Transformer for Change Captioning | |
Tu, Yunbin1; Li, Liang2,3; Su, Li1; Lu, Ke4,5; Huang, Qingming1 | |
2023 | |
Journal | IEEE TRANSACTIONS ON MULTIMEDIA
ISSN | 1520-9210 |
Volume | 25, Pages: 9518-9529 |
Abstract | Change captioning is to describe the semantic change between a pair of similar images in natural language. It is more challenging than general image captioning, because it requires capturing fine-grained change information while being immune to irrelevant viewpoint changes, and solving syntax ambiguity in change descriptions. In this paper, we propose a neighborhood contrastive transformer to improve the model's perceiving ability for various changes under different scenes and cognition ability for complex syntax structure. Concretely, we first design a neighboring feature aggregating to integrate neighboring context into each feature, which helps quickly locate the inconspicuous changes under the guidance of conspicuous referents. Then, we devise a common feature distilling to compare two images at neighborhood level and extract common properties from each image, so as to learn effective contrastive information between them. Finally, we introduce the explicit dependencies between words to calibrate the transformer decoder, which helps better understand complex syntax structure during training. Extensive experimental results demonstrate that the proposed method achieves the state-of-the-art performance on three public datasets with different change scenarios. |
Keywords | Change captioning ; neighborhood contrastive transformer ; syntax dependencies |
DOI | 10.1109/TMM.2023.3254162 |
Indexed By | SCI |
Language | English |
Funding Project | National Key R&D Program of China |
WOS Research Area | Computer Science ; Telecommunications |
WOS Subject | Computer Science, Information Systems ; Computer Science, Software Engineering ; Telecommunications |
WOS ID | WOS:001133324200036 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
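Document Type | Journal article |
Identifier | http://119.78.100.204/handle/2XEOYT63/38416 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Li, Liang; Su, Li |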
Affiliation | 1. Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China ; 2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China ; 3. Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Zhejiang, Peoples R China ; 4. Univ Chinese Acad Sci, Sch Engn Sci, Beijing 101408, Peoples R China ; 5. Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China |
Recommended Citation (GB/T 7714) | Tu, Yunbin,Li, Liang,Su, Li,et al. Neighborhood Contrastive Transformer for Change Captioning[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2023,25:9518-9529. |
APA | Tu, Yunbin,Li, Liang,Su, Li,Lu, Ke,&Huang, Qingming.(2023).Neighborhood Contrastive Transformer for Change Captioning.IEEE TRANSACTIONS ON MULTIMEDIA,25,9518-9529. |
MLA | Tu, Yunbin,et al."Neighborhood Contrastive Transformer for Change Captioning".IEEE TRANSACTIONS ON MULTIMEDIA 25(2023):9518-9529. |