Institute of Computing Technology, Chinese Academy of Sciences IR
An Automated Quantization Framework for High-Utilization RRAM-Based PIM
Li, Bing1; Qu, Songyun2; Wang, Ying3
2022-03-01
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
ISSN | 0278-0070
Volume | 41
Issue | 3
Pages | 583-596
Abstract | With the advancement of deep neural networks (DNNs), applications driven by DNNs have spread from the cloud to the edge. However, the intensive computations and data movements in CNNs impede the adoption of DNNs in resource-constrained edge devices. Quantization, a common model compression method, has attracted much attention because it enables efficient inference by lowering the data bit-width of CNN parameters. Owing to its dense storage and computing-in-memory arrays, resistive memory (RRAM) has enabled energy-efficient, small-area processing-in-memory (PIM) acceleration of DNNs at the edge. However, when a network is deployed onto RRAM-based PIM, a large number of cells remain unused because of the mismatch between the structure of the neural network layers and the memory arrays, resulting in resource under-utilization and low computation efficiency. In this work, we observe that prior quantization approaches fail to improve hardware resource utilization because they ignore the hardware structure information of RRAM. Combining neural network model information with hardware information is therefore essential for a high-utilization RRAM-based PIM design. Considering the vast number of model parameters and the heterogeneous RRAM crossbar structure, we develop a novel quantization framework, RaQu, that leverages AutoML techniques to automatically generate a fine-grained quantization strategy for any model so that the resources of RRAM-based PIM are fully utilized. Experimental results show that RaQu achieves up to 29.2%-37.4% and 1.8%-3.3% improvements in resource utilization and model accuracy, respectively, compared to prior coarse-grained quantization methods.
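To make the under-utilization issue described in the abstract concrete, the following is a minimal Python sketch of a utilization metric, assuming a generic bit-slicing mapping of weights onto 128x128 crossbars with 2-bit cells. The crossbar size, cell precision, layer shape, and the function itself are illustrative assumptions for this record, not RaQu's actual mapping or algorithm.

```python
import math

# Hypothetical utilization model: NOT RaQu's algorithm, just a generic
# bit-slicing mapping used to illustrate the utilization metric.
def crossbar_utilization(rows, cols, weight_bits, xbar_size=128, cell_bits=2):
    """Fraction of allocated RRAM cells that actually hold weight slices.

    Each weight is sliced into ceil(weight_bits / cell_bits) cells along a row,
    so the layer occupies a rows x (cols * slices) cell region, tiled onto
    xbar_size x xbar_size crossbar arrays.
    """
    slices = math.ceil(weight_bits / cell_bits)
    used_cells = rows * cols * slices
    xbars_needed = math.ceil(rows / xbar_size) * math.ceil(cols * slices / xbar_size)
    total_cells = xbars_needed * xbar_size * xbar_size
    return used_cells / total_cells

# Illustrative layer: 576 inputs (e.g. a flattened 3x3x64 kernel), 64 outputs.
for bits in (8, 6):
    print(f"{bits}-bit weights -> utilization {crossbar_utilization(576, 64, bits):.3f}")
```

Under these assumed numbers, the 8-bit mapping fills the allocated crossbar columns exactly while the 6-bit mapping leaves columns idle (0.900 vs. 0.675), which is the kind of hardware-structure effect a utilization-aware, per-layer bit-width search like the one described in the abstract would account for.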
Keywords | Quantization (signal); Neural networks; Computational modeling; Data models; Hardware; Resource management; Arrays; AutoML; neural network; processing-in-memory (PIM); quantization; resistive memory (RRAM)
DOI | 10.1109/TCAD.2021.3061521
Indexed By | SCI
Language | English
Funding Project | National Natural Science Foundation of China [61874124]; Youth Innovation Promotion Association, CAS [2018138]; State Key Laboratory of Computer Architecture [CARCH201913]
WOS Research Area | Computer Science; Engineering
WOS Subject | Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic
WOS ID | WOS:000757852800018
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document Type | Journal article
Identifier | http://119.78.100.204/handle/2XEOYT63/18969
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English)
Corresponding Author | Wang, Ying
Affiliations | 1. Capital Normal Univ, Acad Multidisciplinary Studies, Beijing 100037, Peoples R China; 2. Univ Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China; 3. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Recommended Citation (GB/T 7714) | Li, Bing, Qu, Songyun, Wang, Ying. An Automated Quantization Framework for High-Utilization RRAM-Based PIM[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(3): 583-596.
APA | Li, Bing, Qu, Songyun, & Wang, Ying. (2022). An Automated Quantization Framework for High-Utilization RRAM-Based PIM. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(3), 583-596.
MLA | Li, Bing, et al. "An Automated Quantization Framework for High-Utilization RRAM-Based PIM". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 41.3 (2022): 583-596.
Files in This Item | No files associated with this item.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.