GLA: Global-Local Attention for Image Description
Li, Linghui (1,2); Tang, Sheng (1,2); Zhang, Yongdong (1,2); Deng, Lixi (1,2); Tian, Qi (3)
2018-03-01
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Volume: 20, Issue: 3, Pages: 726-737
Abstract: In recent years, the task of automatically generating image descriptions has attracted a lot of attention in the field of artificial intelligence. Benefiting from the development of convolutional neural networks (CNNs) and recurrent neural networks (RNNs), many approaches based on the CNN-RNN framework have been proposed to solve this task and have achieved remarkable progress. However, two problems remain to be tackled, because most existing methods use only the image-level representation. One problem is object missing, in which some important objects may be missing when generating the image description; the other is misprediction, in which an object may be recognized as the wrong category. In this paper, to address these two problems, we propose a new method called global-local attention (GLA) for generating image descriptions. The proposed GLA model utilizes an attention mechanism to integrate object-level features with the image-level feature. In this manner, our model can selectively pay attention to objects and context information concurrently. Therefore, our proposed GLA method can generate more relevant image description sentences and achieves state-of-the-art performance on the well-known Microsoft COCO caption dataset with several popular evaluation metrics: CIDEr, METEOR, ROUGE-L, and BLEU-1, 2, 3, 4.
Keywords: Convolutional neural network; recurrent neural network; image description; natural language processing
DOI: 10.1109/TMM.2017.2751140
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China[2017YFB1002202] ; Beijing Natural Science Foundation[4152050] ; Beijing Advanced Innovation Center for Imaging Technology[BAICIT-2016009] ; ARO[W911NF-15-1-0290] ; National Natural Science Foundation of China[61525206] ; National Natural Science Foundation of China[61572472] ; National Natural Science Foundation of China[61429201]
WOS research area: Computer Science ; Telecommunications
WOS subject category: Computer Science, Information Systems ; Computer Science, Software Engineering ; Telecommunications
WOS accession number: WOS:000425397500017
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 90 [WOS]
Document type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/5631
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Tang, Sheng
Affiliations: 1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3. Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA
Recommended citation
GB/T 7714
Li, Linghui, Tang, Sheng, Zhang, Yongdong, et al. GLA: Global-Local Attention for Image Description[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20(3): 726-737.
APA: Li, Linghui, Tang, Sheng, Zhang, Yongdong, Deng, Lixi, & Tian, Qi. (2018). GLA: Global-Local Attention for Image Description. IEEE TRANSACTIONS ON MULTIMEDIA, 20(3), 726-737.
MLA: Li, Linghui, et al. "GLA: Global-Local Attention for Image Description". IEEE TRANSACTIONS ON MULTIMEDIA 20.3 (2018): 726-737.