A Coordinated Model Pruning and Mapping Framework for RRAM-Based DNN Accelerators
Qu, Songyun1,2; Li, Bing3; Zhao, Shixin1,2; Zhang, Lei4; Wang, Ying1,2,5
2023-07-01
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN: 0278-0070
Volume: 42, Issue: 7, Pages: 2364-2376
Abstract: Network sparsity, or pruning, is a pivotal technology for edge intelligence. Resistive random access memory (RRAM)-based accelerators, featuring dense storage and processing-in-memory capability, have demonstrated superior computing performance and energy efficiency over traditional CMOS-based accelerators for neural network applications. Unfortunately, RRAM-based accelerators suffer performance or energy degradation when deploying pruned models, impairing their competitiveness in edge intelligence scenarios. We observe that the essential reason is that the pruning technique and the mapping strategy in prior RRAM-based accelerators are optimized individually. As a result, the random zeros in the pruned deep neural network are irregularly distributed across the crossbars, degrading the computation parallelism of the crossbars without reducing the number of crossbars required. In this work, we propose a coordinated model pruning and mapping framework to jointly optimize model accuracy and the efficiency of RRAM-based accelerators. For the mapping, we first decouple weight matrices in a bit-wise manner and map the bit matrices to different crossbars, where signed weights are represented in two's complement so that half of the otherwise required crossbars are saved. For the pruning, we prune weight bits at crossbar granularity so that the crossbars holding the pruned bits are freed. Furthermore, we employ a reinforcement learning (RL) approach to automatically select the optimal crossbar-aware bit-pruning strategy for any given neural network without laborious human effort. We conducted experiments on a set of representative neural networks and compared our framework with state-of-the-art (SOTA) bit-sparsity works. The results show that automatic structured bit-pruning achieves up to 89.64% energy reduction and 84.12% area savings compared to the existing PRIME-like architecture. Moreover, our framework outperforms the SOTA bit-sparsity design by 1.5x in terms of energy reduction on the RRAM-based accelerator.
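The mapping and pruning ideas summarized in the abstract can be illustrated with a short sketch. The following NumPy snippet is not the authors' implementation; it only shows, under simplified assumptions, two of the described steps: decomposing an int8 two's-complement weight matrix into per-bit binary planes (one plane per crossbar, with the sign plane carrying weight -2^7) and crossbar-granularity bit pruning that frees whole bit planes. The function names and the fixed keep-the-top-bits policy are illustrative placeholders; in the paper, the per-layer pruning choice is learned by an RL agent rather than set by a rule.

```python
import numpy as np

def bitwise_decompose(weights_int8):
    """Split an int8 (two's-complement) weight matrix into 8 binary bit planes.

    Plane k holds bit k of every weight; plane 7 is the sign plane with weight
    -2^7, the others carry +2^k, so W = -128*plane[7] + sum_{k<7} 2^k*plane[k].
    Each binary plane can be mapped to its own RRAM crossbar.
    """
    u = weights_int8.astype(np.int16) & 0xFF            # raw two's-complement bit pattern
    return [((u >> k) & 1).astype(np.int8) for k in range(8)]

def reconstruct(planes):
    """Rebuild signed weights from the bit planes (sanity check / inference model)."""
    acc = -128 * planes[7].astype(np.int32)
    for k in range(7):
        acc += (1 << k) * planes[k].astype(np.int32)
    return acc.astype(np.int8)

def prune_bit_planes(planes, keep_mask):
    """Crossbar-granularity bit pruning: drop whole planes.

    keep_mask[k] == False means the crossbar holding bit plane k is freed
    entirely and its contribution is skipped at inference time.
    """
    return [p if keep else np.zeros_like(p) for p, keep in zip(planes, keep_mask)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.integers(-128, 128, size=(4, 4), dtype=np.int8)
    planes = bitwise_decompose(W)
    assert np.array_equal(reconstruct(planes), W)

    # Placeholder policy: keep the sign plane and the 4 most significant magnitude bits.
    # In the paper this per-layer decision is made by the RL agent, not a fixed rule.
    keep = [k >= 3 for k in range(8)]
    W_approx = reconstruct(prune_bit_planes(planes, keep))
    print("max abs error after freeing low-order bit crossbars:",
          np.max(np.abs(W.astype(np.int16) - W_approx.astype(np.int16))))
```

Running the script prints the worst-case quantization error introduced by freeing the low-order bit crossbars, which is the accuracy-versus-crossbar trade-off that the paper's RL-based search navigates automatically for each layer.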
Keywords: AutoML; bit-pruning; deep neural networks (DNNs); resistive random access memory (RRAM)
DOI: 10.1109/TCAD.2022.3221906
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program [2018AAA0102505]; National Natural Science Foundation of China [62090024]; National Natural Science Foundation of China [62222411]; National Natural Science Foundation of China [61874124]; National Natural Science Foundation of China [62204164]; Zhejiang Lab [2021PC0AC01]
WOS research areas: Computer Science; Engineering
WOS categories: Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic
WOS record number: WOS:001017411600023
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics: cited 2 times (WOS)
Document type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/21276
Collection: Journal Papers of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding author: Wang, Ying
Affiliations:
1. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing 100190, Peoples R China
3. Capital Normal Univ, Acad Multidisciplinary Studies, Beijing 100037, Peoples R China
4. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
5. Chinese Acad Sci, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
Recommended citation:
GB/T 7714: Qu, Songyun, Li, Bing, Zhao, Shixin, et al. A Coordinated Model Pruning and Mapping Framework for RRAM-Based DNN Accelerators[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42(7): 2364-2376.
APA: Qu, Songyun, Li, Bing, Zhao, Shixin, Zhang, Lei, & Wang, Ying. (2023). A Coordinated Model Pruning and Mapping Framework for RRAM-Based DNN Accelerators. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 42(7), 2364-2376.
MLA: Qu, Songyun, et al. "A Coordinated Model Pruning and Mapping Framework for RRAM-Based DNN Accelerators". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 42.7 (2023): 2364-2376.
Files in this item: no related files.