An Automated Quantization Framework for High-Utilization RRAM-Based PIM
Li, Bing1; Qu, Songyun2; Wang, Ying3
2022-03-01
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN: 0278-0070
Volume: 41; Issue: 3; Pages: 583-596
Abstract: With the advancement of deep neural networks (DNNs), DNN-driven applications have spread from the cloud to the edge. However, the intensive computation and data movement in CNNs impede the adoption of DNNs on resource-constrained edge devices. Quantization, a common model compression method, has attracted considerable attention because it enables efficient inference by lowering the bit-width of CNN parameters. Owing to its massive storage capacity and computing-in-memory arrays, resistive memory (RRAM) enables energy-efficient, small-area processing-in-memory (PIM) for accelerating DNNs at the edge. However, when a network is deployed onto resistive-memory-based PIM (RRAM-based PIM), a large number of cells are left unused because the structure of the neural network layers does not match that of the memory arrays, resulting in resource under-utilization and low computation efficiency. In this work, we observe that prior quantization approaches fail to improve hardware resource utilization because they ignore the hardware structure information of RRAM. Combining information about the neural network model with information about the hardware is therefore essential for a high-utilization RRAM-based PIM design. Considering the vast number of model parameters and the heterogeneous RRAM crossbar structure, we develop RaQu, a novel quantization framework that leverages the AutoML technique to automatically generate, for any model, a fine-grained quantization strategy that fully utilizes the resources of RRAM-based PIM. The experimental results show that RaQu achieves up to 29.2%-37.4% and 1.8%-3.3% improvement in resource utilization and model accuracy, respectively, compared to prior coarse-grained quantization methods.
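The mismatch between layer shape and crossbar shape described in the abstract can be illustrated with a small calculation. The sketch below is not RaQu and is not taken from the paper: it assumes a simple mapping in which a convolution layer is unrolled into a weight matrix with `k * k * c_in` rows and `c_out` columns, each weight occupies `ceil(weight_bits / cell_bits)` adjacent cells, and the matrix is tiled over fixed-size crossbars. The function name, the 128x128 crossbar size, and the 2-bit cells are hypothetical choices made only for illustration.

```python
import math


def crossbar_utilization(k, c_in, c_out, weight_bits,
                         xbar_rows=128, xbar_cols=128, cell_bits=2):
    """Return (utilization, number of crossbars) for one conv layer.

    Assumed mapping (illustrative only, not the paper's method):
    rows hold the unrolled filter inputs, columns hold the output
    channels, and each weight is sliced across several cells.
    """
    rows_needed = k * k * c_in
    cells_per_weight = math.ceil(weight_bits / cell_bits)
    cols_needed = c_out * cells_per_weight

    # Crossbars are allocated whole, so partial tiles waste cells.
    n_xbars = (math.ceil(rows_needed / xbar_rows)
               * math.ceil(cols_needed / xbar_cols))

    used_cells = rows_needed * cols_needed
    allocated_cells = n_xbars * xbar_rows * xbar_cols
    return used_cells / allocated_cells, n_xbars


# A hypothetical 3x3 layer with 64 input and 96 output channels.
for bits in (8, 6):
    util, n = crossbar_utilization(k=3, c_in=64, c_out=96, weight_bits=bits)
    print(f"{bits}-bit weights: {n} crossbars, utilization {util:.3f}")
```

Under these assumptions, dropping this layer from 8-bit to 6-bit weights still occupies the same 15 crossbars, so the lower bit-width only leaves more cells idle. This is the kind of effect a hardware-aware bit-width choice would try to avoid.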
Keywords: Quantization (signal); Neural networks; Computational modeling; Data models; Hardware; Resource management; Arrays; AutoML; neural network; processing-in-memory (PIM); quantization; resistive memory (RRAM)
DOI: 10.1109/TCAD.2021.3061521
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China [61874124]; Youth Innovation Promotion Association, CAS [2018138]; State Key Laboratory of Computer Architecture [CARCH201913]
WOS research areas: Computer Science; Engineering
WOS categories: Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic
WOS accession number: WOS:000757852800018
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 6 [WOS]
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/18969
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Papers (English)
Corresponding author: Wang, Ying
Author affiliations:
1. Capital Normal Univ, Acad Multidisciplinary Studies, Beijing 100037, Peoples R China
2. Univ Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714: Li, Bing, Qu, Songyun, Wang, Ying. An Automated Quantization Framework for High-Utilization RRAM-Based PIM[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(3): 583-596.
APA: Li, Bing, Qu, Songyun, & Wang, Ying. (2022). An Automated Quantization Framework for High-Utilization RRAM-Based PIM. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(3), 583-596.
MLA: Li, Bing, et al. "An Automated Quantization Framework for High-Utilization RRAM-Based PIM." IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 41.3 (2022): 583-596.