Institutional Repository (IR) of the Institute of Computing Technology, Chinese Academy of Sciences
| Title | General Purpose Deep Learning Accelerator Based on Bit Interleaving |
| Authors | Chang, Liang (1); Lu, Hang (2,3,4); Li, Chenglong (1); Zhao, Xin (1); Hu, Zhicheng (1); Zhou, Jun (1); Li, Xiaowei (2,3,4) |
| Publication Date | 2024-05-01 |
| Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS |
| ISSN | 0278-0070 |
| Volume / Issue / Pages | 43 / 5 / 1470-1483 |
| Abstract | Along with the rapid evolution of deep neural networks, ever-increasing complexity imposes formidable computation intensity on the hardware accelerator. In this article, we propose a novel computing philosophy called "bit interleaving" and the associated accelerator pair, "Bitlet" and "Bitlet-X," to maximally exploit bit-level sparsity. Unlike existing bit-serial/bit-parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to accelerate inference. Bitlet is versatile, supporting diverse precisions on a single platform, including 32-bit floating point (fp32) and fixed point from 1 b to 24 b. This versatility makes Bitlet feasible for both efficient inference and training. In addition, by updating the key compute engine in the accelerator, Bitlet-X further improves peak power consumption and efficiency for the inference-only scenario, with competitive accuracy. Empirical studies on 12 domain-specific deep learning applications highlight the following results: 1) up to 81×/21× energy efficiency improvement for training/inference over recent high-performance GPUs; 2) up to 15×/8× higher speedup/efficiency over state-of-the-art fixed-point accelerators; 3) 1.5 mm² area and scalable power consumption from 570 mW (fp32) to 432 mW (16 b) and 365 mW (8 b) in a 28-nm TSMC process; 4) 1.3× improvement in peak power efficiency for Bitlet-X over Bitlet; and 5) high configurability, justified by ablation and sensitivity studies. (An illustrative sketch of the bit-level sparsity idea follows this record.) |
| Keywords | Synchronization; Parallel processing; Computational modeling; Training; Pragmatics; Power demand; Hardware acceleration; Accelerator; Bit-level sparsity; Deep neural network (DNN) |
| DOI | 10.1109/TCAD.2023.3342728 |
| Indexed By | SCI |
| Language | English |
| Funding Project | National Natural Science Foundation of China |
| WoS Research Areas | Computer Science; Engineering |
| WoS Categories | Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic |
| WoS Accession Number | WOS:001225897600012 |
| Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
| Document Type | Journal article |
| Identifier | http://119.78.100.204/handle/2XEOYT63/40063 |
| Collection | Institute of Computing Technology, CAS: Journal Papers (English) |
| Corresponding Author | Lu, Hang |
| Affiliations | 1. Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China; 3. Zhongguancun Lab, Beijing 100081, Peoples R China; 4. Shanghai Innovat Ctr Processor Technol, Shanghai 200120, Peoples R China |
| Recommended Citation (GB/T 7714) | Chang, Liang, Lu, Hang, Li, Chenglong, et al. General Purpose Deep Learning Accelerator Based on Bit Interleaving[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43(5): 1470-1483. |
| APA | Chang, Liang, Lu, Hang, Li, Chenglong, Zhao, Xin, Hu, Zhicheng, ... & Li, Xiaowei. (2024). General Purpose Deep Learning Accelerator Based on Bit Interleaving. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(5), 1470-1483. |
| MLA | Chang, Liang, et al. "General Purpose Deep Learning Accelerator Based on Bit Interleaving." IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 43.5 (2024): 1470-1483. |
| Files in This Item | No files are associated with this item. |
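
The abstract's central idea, exploiting bit-level sparsity so that compute scales with the number of nonzero weight bits rather than the full word width, can be illustrated with a minimal sketch. This is not the Bitlet microarchitecture or its bit-interleaving scheme; the function `bit_sparse_mac` and all parameters below are hypothetical, written only to show why skipping zero bits saves work.

```python
import numpy as np

# Minimal sketch of bit-level sparsity (assumed illustration, not the
# published Bitlet design): a fixed-point multiply decomposed into
# shift-adds over only the *set* bits of each weight, so work scales
# with the number of essential bits rather than the word width.

def bit_sparse_mac(activations, weights, bits=8):
    """Multiply-accumulate that visits only the nonzero weight bits."""
    acc = 0
    for x, w in zip(activations, weights):
        sign, mag = (-1, -int(w)) if w < 0 else (1, int(w))
        for pos in range(bits):        # scan the weight's bit positions
            if (mag >> pos) & 1:       # zero bits cost nothing: the sparsity win
                acc += sign * (int(x) << pos)
    return acc

rng = np.random.default_rng(0)
x = rng.integers(0, 128, size=16)      # unsigned 7-bit activations
w = rng.integers(-127, 128, size=16)   # signed 8-bit weights
assert bit_sparse_mac(x, w) == int(x.astype(np.int64) @ w.astype(np.int64))

dense_bit_ops = w.size * 8             # bit-serial baseline touches every bit
sparse_bit_ops = int(sum(bin(abs(int(v))).count("1") for v in w))
print(f"essential-bit ops: {sparse_bit_ops} vs dense {dense_bit_ops}")
```

In trained networks many weight bits are zero, which is the "sparsity parallelism" the abstract refers to; how Bitlet interleaves those essential bits across parameters in hardware is the paper's contribution and lies beyond this sketch.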