Institute of Computing Technology, Chinese Academy of Sciences IR
General Purpose Deep Learning Accelerator Based on Bit Interleaving
Chang, Liang1; Lu, Hang2,3,4; Li, Chenglong1; Zhao, Xin1; Hu, Zhicheng1; Zhou, Jun1; Li, Xiaowei2,3,4
2024-05-01
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS |
ISSN | 0278-0070 |
Volume | 43 |
Issue | 5 |
Pages | 1470-1483 |
Abstract | Along with the rapid evolution of deep neural networks, their ever-increasing complexity imposes formidable computation intensity on the hardware accelerator. In this article, we propose a novel computing philosophy called "bit interleaving" and the associated accelerator pair, "Bitlet" and "Bitlet-X", to maximally exploit bit-level sparsity. Unlike existing bit-serial/parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to accelerate inference. Bitlet is versatile, supporting diverse precisions on a single platform, including 32-bit floating point and fixed point from 1b to 24b. This versatility makes Bitlet suitable for both efficient inference and training. In addition, by upgrading the key compute engine of the accelerator, Bitlet-X further improves peak power consumption and efficiency in the inference-only scenario while maintaining competitive accuracy. Empirical studies on 12 domain-specific deep learning applications highlight the following results: 1) up to 81x/21x energy efficiency improvement for training/inference over recent high-performance GPUs; 2) up to 15x/8x higher speedup/efficiency over state-of-the-art fixed-point accelerators; 3) 1.5 mm² area and scalable power consumption from 570 mW (fp32) to 432 mW (16b) and 365 mW (8b) in 28-nm TSMC technology; 4) 1.3x improvement in peak power efficiency of Bitlet-X over Bitlet; and 5) high configurability, as justified by ablation and sensitivity studies. |
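To make the notion of "bit-level sparsity" in the abstract concrete, the following is a minimal illustrative sketch, not taken from the paper: it quantizes toy Gaussian-distributed weights to signed fixed point and measures the fraction of zero bits, which is the slack that a bit-serial/bit-interleaving design such as Bitlet can skip. The function names, the toy weight distribution, and the idealized cycle estimate are assumptions for illustration only; the paper's actual hardware pipeline is not reproduced here.

```python
# Illustrative sketch (not the paper's implementation): estimate the bit-level
# sparsity available to a bit-serial / bit-interleaving accelerator. Only the
# nonzero ("essential") bits of each quantized weight magnitude would need to
# be processed; zero bits can be skipped.

import random


def quantize_fixed_point(w, bits=8):
    """Quantize a float in roughly [-1, 1) to a signed fixed-point integer."""
    scale = 2 ** (bits - 1)
    q = int(round(w * scale))
    return max(-scale, min(scale - 1, q))


def essential_bits(q, bits=8):
    """Count the nonzero bits in the magnitude of a quantized weight."""
    return bin(abs(q) & ((1 << bits) - 1)).count("1")


def bit_level_sparsity(weights, bits=8):
    """Fraction of zero bits across all quantized weight magnitudes."""
    total_bits = len(weights) * bits
    nonzero = sum(essential_bits(quantize_fixed_point(w, bits), bits)
                  for w in weights)
    return 1.0 - nonzero / total_bits


if __name__ == "__main__":
    random.seed(0)
    # Toy weight tensor: small-magnitude values dominate in trained DNN
    # layers, which is exactly why most high-order bits are zero.
    weights = [random.gauss(0.0, 0.1) for _ in range(10_000)]
    for bits in (8, 16, 24):
        s = bit_level_sparsity(weights, bits)
        print(f"{bits:2d}-bit fixed point: bit-level sparsity = {s:.1%}")
    # An idealized engine that skips zero bits would spend cycles roughly
    # proportional to (1 - sparsity) * bits per weight, instead of `bits`.
```

Under these assumptions, the sparsity grows with bit width because the extra high-order bits of small weights are almost always zero, which is the "sparsity parallelism" the abstract refers to.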
Keywords | Synchronization; Parallel processing; Computational modeling; Training; Pragmatics; Power demand; Hardware acceleration; Accelerator; bit-level sparsity; deep neural network (DNN) |
DOI | 10.1109/TCAD.2023.3342728 |
Indexed By | SCI |
Language | English |
Funding Project | National Natural Science Foundation of China |
WOS Research Area | Computer Science ; Engineering |
WOS Categories | Computer Science, Hardware & Architecture ; Computer Science, Interdisciplinary Applications ; Engineering, Electrical & Electronic |
WOS Accession Number | WOS:001225897600012 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Document Type | Journal Article |
Identifier | http://119.78.100.204/handle/2XEOYT63/40063 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences - Journal Papers (English) |
Corresponding Author | Lu, Hang |
Affiliations | 1. Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China; 3. Zhongguancun Lab, Beijing 100081, Peoples R China; 4. Shanghai Innovat Ctr Processor Technol, Shanghai 200120, Peoples R China |
Recommended Citation (GB/T 7714) | Chang, Liang, Lu, Hang, Li, Chenglong, et al. General Purpose Deep Learning Accelerator Based on Bit Interleaving[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43(5): 1470-1483. |
APA | Chang, Liang, Lu, Hang, Li, Chenglong, Zhao, Xin, Hu, Zhicheng, ... & Li, Xiaowei. (2024). General Purpose Deep Learning Accelerator Based on Bit Interleaving. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(5), 1470-1483. |
MLA | Chang, Liang, et al. "General Purpose Deep Learning Accelerator Based on Bit Interleaving." IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 43.5 (2024): 1470-1483. |
Files in This Item | No files associated with this item. |