Institute of Computing Technology, Chinese Academy of Sciences IR
Title | Search-Free Inference Acceleration for Sparse Convolutional Neural Networks
Authors | Liu, Bosheng1,2,3; Chen, Xiaoming2,3; Han, Yinhe2,3; Wu, Jigang1; Chang, Liang4; Liu, Peng1; Xu, Haobo2,3
Date Issued | 2022-07-01
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN | 0278-0070 |
Volume/Issue/Pages | 41(7): 2156-2169
Abstract | Sparse convolution neural networks (CNNs) are promising in reducing both memory usage and computational complexity while still preserving high inference accuracy. State-of-the-art sparse CNN accelerators can deliver high throughput by skipping zero weights and/or activations. To operate on only nonzero weights and activations, sparse accelerators typically search pairs of nonzero weights and activations for multiplication-accumulation (MAC) operations. However, the conventional search operation results in a severe limitation in the processing element (PE) array scale because of the enormous demands of internal interconnection and memory bandwidth. In this article, we first provide a design principle to free the search process of sparse CNN accelerations. Specifically, the indexes of the static compressed weights access the dynamic activations directly to avoid the search process for MAC operations. We then develop two search-free inference accelerators, called Swan and Swan-flexible, for sparse CNN accelerations. Swan supports search-free sparse convolution accelerations for interconnection and bandwidth saving. Compared with Swan, Swan-flexible not only has the search-free capability but also comprises a configurable architecture for optimum throughput. We formulate a mathematical optimization problem by combining the configurable characterization with the compressive dataflow to optimize the overall throughput. Evaluations based on a place-and-route process show that the proposed designs, in a compact factor of 4096 PEs, achieve 1.5x-2.7x higher speedup and 6.0x-13.6x better energy efficiency than representative accelerator baselines with the same PE array scale. (An illustrative sketch of the search-free indexing principle follows the record below.)
Keywords | Internal interconnection; memory bandwidth; sparse accelerators; sparse convolution neural networks (CNNs)
DOI | 10.1109/TCAD.2021.3102191 |
Indexed In | SCI
Language | English
Funding Projects | Natural Science Foundation of Guangdong, China[2018B030311007] ; Key Research Program of Frontier Sciences, Chinese Academy of Sciences[ZDBS-LY-JSC012] ; Strategic Priority Research Program of Chinese Academy of Sciences[XDB44000000] ; National Natural Science Foundation of China[62072118] ; National Natural Science Foundation of China[U1811264] ; National Natural Science Foundation of China[61966009] ; National Natural Science Foundation of China[U1711263] ; Youth Innovation Promotion Association CAS ; Guangxi Key Laboratory of Trusted Software[kx202025] ; State Key Laboratory of Computer Architecture (ICT, CAS)[CARCH201907]
WOS Research Areas | Computer Science ; Engineering
WOS Categories | Computer Science, Hardware & Architecture ; Computer Science, Interdisciplinary Applications ; Engineering, Electrical & Electronic
WOS Record Number | WOS:000812532700018
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document Type | Journal article
Identifier | http://119.78.100.204/handle/2XEOYT63/19615
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English)
Corresponding Authors | Chen, Xiaoming; Wu, Jigang
Affiliations | 1. Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Peoples R China; 2. Chinese Acad Sci, Ctr Intelligent Comp Syst, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Beijing 100190, Peoples R China; 4. Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China
Recommended Citation (GB/T 7714) | Liu, Bosheng, Chen, Xiaoming, Han, Yinhe, et al. Search-Free Inference Acceleration for Sparse Convolutional Neural Networks[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(7): 2156-2169.
APA | Liu, Bosheng., Chen, Xiaoming., Han, Yinhe., Wu, Jigang., Chang, Liang., ... & Xu, Haobo. (2022). Search-Free Inference Acceleration for Sparse Convolutional Neural Networks. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(7), 2156-2169.
MLA | Liu, Bosheng, et al. "Search-Free Inference Acceleration for Sparse Convolutional Neural Networks". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 41.7 (2022): 2156-2169.
Files in This Item | No files are associated with this item.
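
The central idea in the abstract is that the indexes of the statically compressed weights address the dynamic activations directly, so no weight/activation pairing search is needed before each MAC. The following is a minimal Python sketch of that principle only, not the authors' Swan/Swan-flexible implementation; the function names and the assumption that activations remain in an uncompressed, directly addressable buffer are illustrative.

```python
import numpy as np

def compress_weights(weights):
    """Offline compression of a 1-D weight vector into (index, value) pairs."""
    idx = np.nonzero(weights)[0]
    return idx, weights[idx]

def search_free_dot(weight_idx, weight_val, activations):
    """Search-free MAC: the stored weight indexes gather activations directly,
    so no nonzero-pair search is performed at inference time."""
    return np.dot(weight_val, activations[weight_idx])

def search_based_dot(weights, activations):
    """Conventional sparse MAC: both operands are compressed, so the nonzero
    index pairs must first be matched (the search the paper avoids)."""
    w_idx = set(np.nonzero(weights)[0])
    a_idx = set(np.nonzero(activations)[0])
    common = w_idx & a_idx  # pairing search over nonzero indexes
    return sum(weights[i] * activations[i] for i in common)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.random(64) * (rng.random(64) > 0.7)   # ~30% nonzero weights
    a = rng.random(64) * (rng.random(64) > 0.5)   # ~50% nonzero activations
    w_idx, w_val = compress_weights(w)
    assert np.isclose(search_free_dot(w_idx, w_val, a), search_based_dot(w, a))
```

In hardware terms, the gather in search_free_dot corresponds to using the stored weight indexes as read addresses into the activation buffer, which is what removes the pairing search and its internal interconnection and memory bandwidth cost described in the abstract.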