Perspective-Adaptive Convolutions for Scene Parsing

doi:10.1109/TPAMI.2018.2890637

	Zhang, Rui 1,2; Tang, Sheng 1,2; Zhang, Yongdong 1,2; Li, Jintao 1,2; Yan, Shuicheng 3
	2020-04-01
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
ISSN	0162-8828
Volume	42 Issue: 4 Pages: 909-924
Abstract	Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. Through adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches. Consequently, the receptive fields can be adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. Our proposed convolutions are differentiable to learn the convolutional parameters and perspective coefficients in an end-to-end way without any extra training supervision of object sizes. Furthermore, considering that the standard convolutions lack contextual information and spatial dependencies, we propose a context adaptive bias to capture both local and global contextual information through average pooling on the local feature patches and global feature maps, followed by flexible attentive summing to the convolutional results. The attentive weights are position-adaptive and context-aware, and can be learned through adding an additional context regression layer. Experiments on Cityscapes and ADE20K datasets well demonstrate the effectiveness of the proposed methods.
Keywords	Shape; Standards; Strain; Proposals; Convolutional neural networks; Training; Task analysis; Scene parsing; convolutional neural networks; perspective-adaptive convolutions; context adaptive biases
DOI	10.1109/TPAMI.2018.2890637
Indexed by	SCI
Language	English
WOS Research Area	Computer Science ; Engineering
WOS Category	Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS Accession Number	WOS:000526541100009
Publisher	IEEE COMPUTER SOC
Citation statistics	Times cited: 27 [WOS]
Document type	Journal article
Item identifier	http://119.78.100.204/handle/2XEOYT63/14203
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding authors	Tang, Sheng; Zhang, Yongdong
Author affiliations	1.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China 2.Univ Chinese Acad Sci, Beijing 100049, Peoples R China 3.AI Inst Qihoo 360, Beijing 100025, Peoples R China
Recommended citation (GB/T 7714)	Zhang, Rui,Tang, Sheng,Zhang, Yongdong,et al. Perspective-Adaptive Convolutions for Scene Parsing[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2020,42(4):909-924.
APA	Zhang, Rui,Tang, Sheng,Zhang, Yongdong,Li, Jintao,&Yan, Shuicheng.(2020).Perspective-Adaptive Convolutions for Scene Parsing.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,42(4),909-924.
MLA	Zhang, Rui,et al."Perspective-Adaptive Convolutions for Scene Parsing".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 42.4(2020):909-924.
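
Illustrative sketch (not the authors' code): the abstract describes a context adaptive bias that average-pools local feature patches and the global feature map, then adds both to the convolutional result using position-adaptive attentive weights from a context regression layer. The NumPy sketch below shows one way such a step might look. Several details are assumptions rather than facts from the paper: the function names `local_average` and `context_adaptive_bias`, the 3 x 3 pooling window with edge padding, the 1x1 linear "context regression" weights `w_ctx`, the sigmoid on the attentive weights, and the requirement that the features and the convolutional output share the same channel count.

```python
import numpy as np


def local_average(features, k=3):
    """Average-pool each channel over a k x k window (stride 1, edge padding)."""
    _, h, w = features.shape
    pad = k // 2
    padded = np.pad(features, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(features, dtype=float)
    # Sum the k*k shifted copies of the padded map, then normalise.
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / (k * k)


def context_adaptive_bias(features, conv_out, w_ctx, k=3):
    """Add attention-weighted local and global context to conv_out.

    features, conv_out: arrays of shape (C, H, W); equal C is an assumption.
    w_ctx: hypothetical (2, C) weights of a 1x1 context regression layer,
           one row for the local weight and one for the global weight.
    """
    local_ctx = local_average(features, k)                   # (C, H, W)
    global_ctx = features.mean(axis=(1, 2), keepdims=True)   # (C, 1, 1)
    # Per-position logits for the two context sources (assumed 1x1 linear map).
    logits = np.einsum("ac,chw->ahw", w_ctx, features)       # (2, H, W)
    attn = 1.0 / (1.0 + np.exp(-logits))                     # assumed sigmoid
    # Attentive summing: each position mixes in its own share of context.
    return conv_out + attn[0] * local_ctx + attn[1] * global_ctx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 8, 8))
    conv = rng.normal(size=(4, 8, 8))
    weights = rng.normal(size=(2, 4))
    print(context_adaptive_bias(feats, conv, weights).shape)
```

Because the weights are computed per position from the features themselves, each pixel can lean more on local or on global context; in the paper these weights are learned end to end, whereas here they are random placeholders.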