Institute of Computing Technology, Chinese Academy IR
Perspective-Adaptive Convolutions for Scene Parsing
Zhang, Rui [1,2]; Tang, Sheng [1,2]; Zhang, Yongdong [1,2]; Li, Jintao [1,2]; Yan, Shuicheng [3]
2020-04-01
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE |
ISSN | 0162-8828 |
Volume | 42, Issue: 4, Pages: 909-924 |
Abstract | Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. Through adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches. Consequently, the receptive fields can be adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. Our proposed convolutions are differentiable to learn the convolutional parameters and perspective coefficients in an end-to-end way without any extra training supervision of object sizes. Furthermore, considering that the standard convolutions lack contextual information and spatial dependencies, we propose a context adaptive bias to capture both local and global contextual information through average pooling on the local feature patches and global feature maps, followed by flexible attentive summing to the convolutional results. The attentive weights are position-adaptive and context-aware, and can be learned through adding an additional context regression layer. Experiments on Cityscapes and ADE20K datasets well demonstrate the effectiveness of the proposed methods. |
Keywords | Shape; Standards; Strain; Proposals; Convolutional neural networks; Training; Task analysis; Scene parsing; convolutional neural networks; perspective-adaptive convolutions; context adaptive biases |
DOI | 10.1109/TPAMI.2018.2890637 |
Indexed by | SCI |
Language | English |
Funding project | National Natural Science Foundation of China[61525206] ; National Natural Science Foundation of China[61572472] |
WOS research area | Computer Science ; Engineering |
WOS category | Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic |
WOS accession number | WOS:000526541100009 |
Publisher | IEEE COMPUTER SOC |
Document type | Journal article |
Item identifier | http://119.78.100.204/handle/2XEOYT63/14203 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding author | Tang, Sheng; Zhang, Yongdong |
Author affiliations | 1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 3. AI Inst Qihoo 360, Beijing 100025, Peoples R China |
Recommended citation (GB/T 7714) | Zhang, Rui,Tang, Sheng,Zhang, Yongdong,et al. Perspective-Adaptive Convolutions for Scene Parsing[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2020,42(4):909-924. |
APA | Zhang, Rui,Tang, Sheng,Zhang, Yongdong,Li, Jintao,&Yan, Shuicheng.(2020).Perspective-Adaptive Convolutions for Scene Parsing.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,42(4),909-924. |
MLA | Zhang, Rui,et al."Perspective-Adaptive Convolutions for Scene Parsing".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 42.4(2020):909-924. |
Files in this item | There are no files associated with this item. |
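
The context adaptive bias described in the abstract (average pooling over local feature patches and over the global feature map, then attentive summing onto the convolutional results) might be sketched roughly as below. This is only an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the function names, the 3 x 3 local window, the edge padding, and the idea of passing the attentive weights in as precomputed arrays are all hypothetical. In the paper the weights come from a learned context regression layer, which is not modelled here.

```python
import numpy as np


def local_average_pool(features, k=3):
    """Mean over a k x k window around each position, per channel.

    features: array of shape (H, W, C). k is assumed odd; borders are
    edge-padded (an assumption made for this sketch).
    """
    h, w, _ = features.shape
    pad = k // 2
    padded = np.pad(features, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(features.shape, dtype=float)
    # Sum the k*k shifted views of the padded map, then divide by k*k.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (k * k)


def context_adaptive_bias(conv_out, features, attn_local, attn_global, k=3):
    """Add attention-weighted local and global context to conv results.

    conv_out, features: (H, W, C) arrays (same shape assumed here).
    attn_local, attn_global: (H, W) position-wise weights; hypothetical
    stand-ins for the output of a learned context regression layer.
    """
    local_ctx = local_average_pool(features, k)               # (H, W, C)
    global_ctx = features.mean(axis=(0, 1), keepdims=True)    # (1, 1, C)
    return (conv_out
            + attn_local[..., None] * local_ctx
            + attn_global[..., None] * global_ctx)
```

Because the weights are indexed by position, each pixel can mix local and global context in a different proportion, which is the "position-adaptive" property the abstract refers to.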