Integrating Scene Semantic Knowledge into Image Captioning
Wei, Haiyang [1]; Li, Zhixin [1]; Huang, Feicheng [1]; Zhang, Canlong [1]; Ma, Huifang [2]; Shi, Zhongzhi [3]
2021-06-01
Journal: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
ISSN: 1551-6857
Volume: 17; Issue: 2; Pages: 22
Abstract: Most existing image captioning methods use only the visual information of the image to guide caption generation and lack the guidance of effective scene semantic information. In addition, current visual attention mechanisms cannot adjust their focus intensity on the image. In this article, we first propose an improved visual attention model. At each timestep, we calculate the focus intensity coefficient of the attention mechanism from the context information of the model, then use this coefficient to automatically adjust the focus intensity of the attention mechanism and extract more accurate visual information. In addition, we represent the scene semantic knowledge of the image through topic words related to the image scene and add them to the language model. We use the attention mechanism to determine the visual information and scene semantic information that the model attends to at each timestep, and combine them so that the model generates more accurate and scene-specific captions. Finally, we evaluate our model on the Microsoft COCO (MSCOCO) and Flickr30k standard datasets. The experimental results show that our approach generates more accurate captions and outperforms many recent advanced models on various evaluation metrics.
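Illustrative sketch (not taken from the paper): the abstract does not give the formula for the focus intensity coefficient, so the snippet below assumes one plausible reading, in which the coefficient multiplies the attention scores before the softmax. A larger coefficient concentrates the weights on the best-scoring image region, and a smaller one spreads them out. The names softmax, attend, region_scores, and focus_intensity are hypothetical and chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_scores, focus_intensity):
    """Attention weights over image regions.

    Assumption: the focus intensity coefficient scales the raw scores
    before normalization; the paper's actual formulation may differ.
    """
    return softmax([focus_intensity * s for s in region_scores])


# Hypothetical relevance scores for three image regions.
region_scores = [2.0, 1.0, 0.5]

diffuse = attend(region_scores, 0.5)  # low coefficient: flatter weights
sharp = attend(region_scores, 4.0)    # high coefficient: peaked weights

print("diffuse:", [round(w, 3) for w in diffuse])
print("sharp:  ", [round(w, 3) for w in sharp])
```

In this reading, a model would predict focus_intensity from its decoder context at each timestep rather than fixing it by hand as done here.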
Keywords: image captioning; attention mechanism; scene semantics; encoder-decoder framework
DOI: 10.1145/3439734
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China[61966004] ; National Natural Science Foundation of China[61663004] ; National Natural Science Foundation of China[61866004] ; National Natural Science Foundation of China[61762078] ; Guangxi Natural Science Foundation[2019GXNSFDA245018] ; Guangxi Natural Science Foundation[2018GXNSFDA281009] ; Guangxi Bagui Scholar Teams for Innovation and Research Project ; Guangxi Talent Highland Project of Big Data Intelligence and Application ; Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing
WOS research area: Computer Science
WOS categories: Computer Science, Information Systems ; Computer Science, Software Engineering ; Computer Science, Theory & Methods
WOS accession number: WOS:000661037000017
Publisher: ASSOC COMPUTING MACHINERY
Times cited: 27 [WOS]
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/17625
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Li, Zhixin
Author affiliations:
1.Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, 15 Yucai Rd, Guilin 541004, Guangxi, Peoples R China
2.Northwest Normal Univ, Coll Comp Sci & Engn, 967 Anning East Rd, Lanzhou 730070, Gansu, Peoples R China
3.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714
Wei, Haiyang,Li, Zhixin,Huang, Feicheng,et al. Integrating Scene Semantic Knowledge into Image Captioning[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS,2021,17(2):22.
APA Wei, Haiyang,Li, Zhixin,Huang, Feicheng,Zhang, Canlong,Ma, Huifang,&Shi, Zhongzhi.(2021).Integrating Scene Semantic Knowledge into Image Captioning.ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS,17(2),22.
MLA Wei, Haiyang,et al."Integrating Scene Semantic Knowledge into Image Captioning".ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 17.2(2021):22.