Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach
Zeng, Xi1,2,3; Zhi, Tian1,3; Zhou, Xuda1,2,3; Du, Zidong1,3; Guo, Qi1,3; Liu, Shaoli1,3; Wang, Bingrui1,3; Wen, Yuanbo2,3; Wang, Chao4; Zhou, Xuehai4; Li, Ling5; Chen, Tianshi1,3; Sun, Ninghui1; Chen, Yunji1,2,6
2020-07-01
Journal: IEEE TRANSACTIONS ON COMPUTERS
ISSN: 0018-9340
Volume: 69, Issue: 7, Pages: 968-985
Abstract: Neural networks have rapidly become the dominant algorithms as they achieve state-of-the-art performance in a broad range of applications such as image recognition, speech recognition, and natural language processing. However, neural networks keep moving toward deeper and larger architectures, posing a great challenge to hardware systems due to the huge amount of data and computation. Although sparsity has emerged as an effective way to directly reduce computation and memory accesses, the irregularity caused by sparsity (in both synapses and neurons) prevents accelerators from fully exploiting these benefits and introduces a costly indexing module into accelerators. In this article, we propose a cooperative software/hardware approach to address the irregularity of sparse neural networks efficiently. We first observe local convergence: larger weights tend to gather into small clusters during training. Based on this key observation, we propose a software-based coarse-grained pruning technique that drastically reduces the irregularity of sparse synapses. Together with local quantization, coarse-grained pruning significantly reduces the size of indexes and improves the network compression ratio. We further design a multi-core hardware accelerator, Cambricon-SE, to handle the remaining irregularity of sparse synapses and neurons efficiently. The accelerator has three key features: 1) selector modules that filter out unnecessary synapses and neurons; 2) compress/decompress modules that exploit sparsity in data transmission (which is rarely studied in previous work); and 3) a multi-core architecture with elevated throughput to meet real-time processing requirements. Compared against a state-of-the-art sparse neural network accelerator, our accelerator is 1.20x better in performance and 2.72x better in energy efficiency.
Moreover, for real-time video analysis tasks, Cambricon-SE can process 1080p video at 76.59 fps.
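The coarse-grained pruning idea summarized in the abstract (exploiting "local convergence" by pruning whole blocks of weights, so that one index covers a block rather than a single weight) can be sketched as follows. This is an illustrative sketch only, not the paper's implementation; the block size, keep ratio, and block scoring by mean absolute magnitude are assumptions.

```python
import numpy as np

def coarse_grained_prune(weights, block=4, keep_ratio=0.5):
    """Prune weights in contiguous blocks rather than individually.

    Because larger weights tend to gather into small clusters during
    training ("local convergence"), removing whole low-magnitude blocks
    can preserve accuracy while needing only one index entry per block
    instead of one per weight, shrinking the indexing overhead.
    """
    rows, cols = weights.shape
    assert cols % block == 0, "columns must divide evenly into blocks"
    # View the matrix as (rows, num_blocks, block) and score each block
    # by its mean absolute weight.
    blocks = weights.reshape(rows, cols // block, block)
    scores = np.abs(blocks).mean(axis=2)          # shape (rows, cols // block)
    # Keep the top `keep_ratio` fraction of blocks; zero out the rest.
    k = int(scores.size * keep_ratio)
    threshold = np.partition(scores.ravel(), -k)[-k]
    mask = (scores >= threshold)[:, :, None]      # broadcast over block dim
    pruned = (blocks * mask).reshape(rows, cols)
    return pruned, mask.squeeze(-1)               # mask: one flag per block

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
pruned, mask = coarse_grained_prune(w, block=4, keep_ratio=0.5)
```

With a block size of 4, the mask holds one flag per 4 weights, which is the kind of index-size reduction (relative to per-weight fine-grained pruning) that the abstract credits for the improved compression ratio.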
Keywords: Accelerator architecture; deep neural networks; sparsity
DOI: 10.1109/TC.2020.2978475
Indexed In: SCI
Language: English
Funding: National Key Research and Development Program of China [2017YFA0700900] ; National Key Research and Development Program of China [2017YFA0700902] ; National Key Research and Development Program of China [2017YFA0700901] ; National Key Research and Development Program of China [2017YFB1003101] ; National Key Research and Development Program of China [2018AAA0103300] ; NSF of China [61432016] ; NSF of China [61532016] ; NSF of China [61672491] ; NSF of China [61602441] ; NSF of China [61602446] ; NSF of China [61732002] ; NSF of China [61702478] ; NSF of China [61732007] ; NSF of China [61732020] ; Beijing Natural Science Foundation [JQ18013] ; 973 Program of China [2015CB358800] ; National Science and Technology Major Project [2018ZX01031102] ; Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences [KFJ-HGZX-013] ; Key Research Projects in Frontier Science of Chinese Academy of Sciences [QYZDB-SSW-JSC001] ; Strategic Priority Research Program of Chinese Academy of Sciences [XDB32050200] ; Strategic Priority Research Program of Chinese Academy of Sciences [XDC01020000] ; Standardization Research Project of Chinese Academy of Sciences [BZ201800001]
WoS Research Areas: Computer Science ; Engineering
WoS Categories: Computer Science, Hardware & Architecture ; Engineering, Electrical & Electronic
WoS Accession Number: WOS:000542950100005
Publisher: IEEE COMPUTER SOC
Citation Statistics: Times Cited (WoS): 6
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/15190
Collection: Journal Papers of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding Author: Chen, Yunji
Affiliations:
1.Chinese Acad Sci, Inst Comp Technol ICT, State Key Lab Comp Architecture, Beijing 100864, Peoples R China
2.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3.Cambricon Technol, Beijing, Peoples R China
4.Univ Sci & Technol China, Hefei 230052, Peoples R China
5.Chinese Acad Sci, Inst Software, Beijing 100864, Peoples R China
6.CAS Ctr Excellence Brain Sci & Intelligence Techn, Shanghai Res Ctr Brain Sci & Brain Inspired Intel, Inst Brain Intelligence Technol, Zhangjiang Lab BIT, ZJLab, Shanghai, Peoples R China
Recommended Citation:
GB/T 7714
Zeng, Xi, Zhi, Tian, Zhou, Xuda, et al. Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach[J]. IEEE TRANSACTIONS ON COMPUTERS, 2020, 69(7): 968-985.
APA: Zeng, Xi, Zhi, Tian, Zhou, Xuda, Du, Zidong, Guo, Qi, ... & Chen, Yunji. (2020). Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach. IEEE TRANSACTIONS ON COMPUTERS, 69(7), 968-985.
MLA: Zeng, Xi, et al. "Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach". IEEE TRANSACTIONS ON COMPUTERS 69.7 (2020): 968-985.
Files in This Item: No files associated with this item.