Institute of Computing Technology, Chinese Academy of Sciences IR
Title | Search-Free Inference Acceleration for Sparse Convolutional Neural Networks
Authors | Liu, Bosheng1,2,3; Chen, Xiaoming2,3; Han, Yinhe2,3; Wu, Jigang1; Chang, Liang4; Liu, Peng1; Xu, Haobo2,3
Date Issued | 2022-07-01
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN | 0278-0070 |
Volume/Issue/Pages | 41(7): 2156-2169
Abstract | Sparse convolution neural networks (CNNs) are promising in reducing both memory usage and computational complexity while still preserving high inference accuracy. State-of-the-art sparse CNN accelerators can deliver high throughput by skipping zero weights and/or activations. To operate on only nonzero weights and activations, sparse accelerators typically search pairs of nonzero weights and activations for multiplication-accumulation (MAC) operations. However, the conventional search operation results in a severe limitation in the processing element (PE) array scale because of the enormous demands of internal interconnection and memory bandwidth. In this article, we first provide a design principle to free the search process of sparse CNN accelerations. Specifically, the indexes of the static compressed weights access the dynamic activations directly to avoid the search process for MAC operations. We then develop two search-free inference accelerators, called Swan and Swan-flexible, for sparse CNN accelerations. Swan supports search-free sparse convolution accelerations for interconnection and bandwidth saving. Compared with Swan, Swan-flexible not only has the search-free capability but also comprises a configurable architecture for optimum throughput. We formulate a mathematical optimization problem by combining the configurable characterization with the compressive dataflow to optimize the overall throughput. Evaluations based on a place-and-route process show that the proposed designs, in a compact factor of 4096 PEs, achieve 1.5x-2.7x higher speedup and 6.0x-13.6x better energy efficiency than representative accelerator baselines with the same PE array scale. (An illustrative sketch of the search-free indexing principle follows the record below.)
Keywords | Internal interconnection; memory bandwidth; sparse accelerators; sparse convolution neural networks (CNNs)
DOI | 10.1109/TCAD.2021.3102191 |
Indexed In | SCI
Language | English
Funding Projects | Natural Science Foundation of Guangdong, China[2018B030311007] ; Key Research Program of Frontier Sciences, Chinese Academy of Sciences[ZDBS-LY-JSC012] ; Strategic Priority Research Program of Chinese Academy of Sciences[XDB44000000] ; National Natural Science Foundation of China[62072118] ; National Natural Science Foundation of China[U1811264] ; National Natural Science Foundation of China[61966009] ; National Natural Science Foundation of China[U1711263] ; Youth Innovation Promotion Association CAS ; Guangxi Key Laboratory of Trusted Software[kx202025] ; State Key Laboratory of Computer Architecture (ICT, CAS)[CARCH201907]
WOS Research Areas | Computer Science ; Engineering
WOS Categories | Computer Science, Hardware & Architecture ; Computer Science, Interdisciplinary Applications ; Engineering, Electrical & Electronic
WOS Record Number | WOS:000812532700018
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document Type | Journal article
Identifier | http://119.78.100.204/handle/2XEOYT63/19615
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English)
Corresponding Authors | Chen, Xiaoming; Wu, Jigang
Affiliations | 1. Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Peoples R China; 2. Chinese Acad Sci, Ctr Intelligent Comp Syst, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Beijing 100190, Peoples R China; 4. Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China
Recommended Citation (GB/T 7714) | Liu, Bosheng, Chen, Xiaoming, Han, Yinhe, et al. Search-Free Inference Acceleration for Sparse Convolutional Neural Networks[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(7): 2156-2169.
APA | Liu, Bosheng., Chen, Xiaoming., Han, Yinhe., Wu, Jigang., Chang, Liang., ... & Xu, Haobo. (2022). Search-Free Inference Acceleration for Sparse Convolutional Neural Networks. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(7), 2156-2169.
MLA | Liu, Bosheng, et al. "Search-Free Inference Acceleration for Sparse Convolutional Neural Networks". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 41.7 (2022): 2156-2169.
Files in This Item | No files are associated with this item.
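
The central idea in the abstract is that the indexes of the statically compressed weights address the dynamic activations directly, so no weight/activation pairing search is needed before each MAC. The following is a minimal Python sketch of that principle only, not the authors' Swan/Swan-flexible implementation; the function names and the assumption that activations remain in an uncompressed, directly addressable buffer are illustrative.

```python
import numpy as np

def compress_weights(weights):
    """Offline compression of a 1-D weight vector into (index, value) pairs."""
    idx = np.nonzero(weights)[0]
    return idx, weights[idx]

def search_free_dot(weight_idx, weight_val, activations):
    """Search-free MAC: the stored weight indexes gather activations directly,
    so no nonzero-pair search is performed at inference time."""
    return np.dot(weight_val, activations[weight_idx])

def search_based_dot(weights, activations):
    """Conventional sparse MAC: both operands are compressed, so the nonzero
    index pairs must first be matched (the search the paper avoids)."""
    w_idx = set(np.nonzero(weights)[0])
    a_idx = set(np.nonzero(activations)[0])
    common = w_idx & a_idx  # pairing search over nonzero indexes
    return sum(weights[i] * activations[i] for i in common)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.random(64) * (rng.random(64) > 0.7)   # ~30% nonzero weights
    a = rng.random(64) * (rng.random(64) > 0.5)   # ~50% nonzero activations
    w_idx, w_val = compress_weights(w)
    assert np.isclose(search_free_dot(w_idx, w_val, a), search_based_dot(w, a))
```

In hardware terms, the gather in search_free_dot corresponds to using the stored weight indexes as read addresses into the activation buffer, which is what removes the pairing search and its internal interconnection and memory bandwidth cost described in the abstract.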