Institute of Computing Technology, Chinese Academy IR
On-Line Fault Protection for ReRAM-Based Neural Networks | |
Li, Wen1,2; Wang, Ying1,2; Liu, Cheng1,2; He, Yintao1,2; Liu, Lian1,2; Li, Huawei1,3; Li, Xiaowei1,2 | |
2023-02-01 | |
Journal | IEEE TRANSACTIONS ON COMPUTERS |
ISSN | 0018-9340 |
Volume | 72, Issue: 2, Pages: 423-437 |
Abstract | The emerging Resistive RAM (ReRAM) technology significantly boosts the performance and the energy efficiency of the deep learning accelerators (DLAs) via the Computing-in-Memory (CiM) architecture. However, ReRAM-based DLA also suffers a high occurrence rate of memory faults. How to detect and protect against the faults in ReRAM devices poses great challenges to ReRAM-based DLA design. In this work, we propose RRAMedy, an in-situ fault detection and network remedy framework for ReRAM-based DLAs. With the proposed Adversarial Example Testing, which is a lifetime on-device and on-line fault detection technique, it achieves high detection coverage of both hard faults and soft faults at a low run-time cost. In addition, it employs an edge-cloud collaborative model retraining method to tolerate the detected faults by leveraging the inherent fault-adaptive capability of DNNs. Meanwhile, to enable in-situ model remedy when the cloud assistance is absent due to security or overhead issues, we propose to accelerate the fault-masking retraining process on edge devices with parallelized Knowledge Transfer. Our experimental results show that the proposed fault detection technique achieves high fault detection accuracy and delivers real-time testing performance. Meanwhile, the proposed retraining approach greatly alleviates the accuracy degradation problem and achieves excellent performance speedups over the baselines. |
Keywords | Training ; Fault detection ; Computational modeling ; Image edge detection ; Memristors ; Neural networks ; Kernel ; Deep neural network ; hard fault ; ReRAM ; reliability ; soft fault |
DOI | 10.1109/TC.2022.3160345 |
Indexed By | SCI |
Language | English |
Funding Project | National Key Research and Development Program of China[2020YFB1600201] ; National Natural Science Foundation of China (NSFC)[62090024] ; National Natural Science Foundation of China (NSFC)[61874124] ; National Natural Science Foundation of China (NSFC)[61876173] ; Zhejiang Lab[2021PC0AC01] |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Hardware & Architecture ; Engineering, Electrical & Electronic |
WOS ID | WOS:000917782600010 |
Publisher | IEEE COMPUTER SOC |
Document Type | Journal article |
Identifier | http://119.78.100.204/handle/2XEOYT63/19936 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles |
Corresponding Author | Wang, Ying |
Affiliation | 1.Chinese Acad Sci, Inst Comp Technol, SKLCA, Beijing 100190, Peoples R China 2.Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing 100049, Peoples R China 3.Peng Cheng Lab, Shenzhen 518066, Peoples R China |
Recommended Citation GB/T 7714 | Li, Wen,Wang, Ying,Liu, Cheng,et al. On-Line Fault Protection for ReRAM-Based Neural Networks[J]. IEEE TRANSACTIONS ON COMPUTERS,2023,72(2):423-437. |
APA | Li, Wen.,Wang, Ying.,Liu, Cheng.,He, Yintao.,Liu, Lian.,...&Li, Xiaowei.(2023).On-Line Fault Protection for ReRAM-Based Neural Networks.IEEE TRANSACTIONS ON COMPUTERS,72(2),423-437. |
MLA | Li, Wen,et al."On-Line Fault Protection for ReRAM-Based Neural Networks".IEEE TRANSACTIONS ON COMPUTERS 72.2(2023):423-437. |