HEAT: Efficient Vision Transformer Accelerator With Hybrid-Precision Quantization
Zhao, Pan1; Xue, Donghui1; Wu, Licheng1; Chang, Liang1; Tan, Haining2; Han, Yinhe2; Zhou, Jun1
2025-05-01
Journal: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS
ISSN: 1549-7747
Volume: 72; Issue: 5; Pages: 758-762
Abstract: Quantization is an important technique for accelerating transformer-based neural networks. Prior related works mainly consider quantization at the algorithm level, and their hardware implementation is inefficient. In this brief, we propose an efficient vision transformer accelerator with retraining-free and finetuning-free hybrid-precision quantization. At the algorithm level, the features and weights are divided into two parts: normal values and outlier values. These two parts are quantized with different bit widths and scaling factors. We use matrix transformation and a group-wise quantization policy to improve hardware utilization. At the hardware level, we propose a two-stage FIFO group structure and a hierarchical interleaving data flow to further improve the utilization of the PE array. As a result, the input and weight matrices are quantized to 5.71 bits on average with 0.526% accuracy loss on Swin-T. The accelerator achieves a frame rate of 118.9 FPS and an energy efficiency of 43.58 GOPS/W on the ZCU102 FPGA board, better than state-of-the-art works.
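The split into normal and outlier values described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the magnitude threshold, the 4-bit and 8-bit widths, the symmetric uniform quantizer, and the function names `quantize_symmetric` and `hybrid_quantize` are all assumptions chosen for illustration. The paper's matrix transformation and group-wise policy are not modeled here.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], bits: int) -> Tuple[List[int], float]:
    """Uniform symmetric quantization: returns integer codes and the scale."""
    if not values:
        return [], 1.0
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4 bits, 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def hybrid_quantize(
    values: List[float],
    threshold: float,       # assumed magnitude cutoff separating outliers
    normal_bits: int = 4,   # assumed bit width for normal values
    outlier_bits: int = 8,  # assumed bit width for outlier values
) -> Tuple[List[float], float]:
    """Quantize normal and outlier values separately.

    Returns the dequantized values and the average bit width per element.
    """
    if not values:
        raise ValueError("values must not be empty")

    normal_idx = [i for i, v in enumerate(values) if abs(v) <= threshold]
    outlier_idx = [i for i, v in enumerate(values) if abs(v) > threshold]

    # Each part gets its own bit width and its own scaling factor.
    n_codes, n_scale = quantize_symmetric([values[i] for i in normal_idx], normal_bits)
    o_codes, o_scale = quantize_symmetric([values[i] for i in outlier_idx], outlier_bits)

    dequant = [0.0] * len(values)
    for i, c in zip(normal_idx, n_codes):
        dequant[i] = c * n_scale
    for i, c in zip(outlier_idx, o_codes):
        dequant[i] = c * o_scale

    total_bits = len(normal_idx) * normal_bits + len(outlier_idx) * outlier_bits
    return dequant, total_bits / len(values)


if __name__ == "__main__":
    # Seven small values and one large outlier (made-up data).
    data = [0.1, -0.2, 0.05, 0.3, -0.15, 0.25, 6.0, -0.1]
    restored, avg_bits = hybrid_quantize(data, threshold=1.0)
    print("average bits per value:", avg_bits)
    print("max abs error:", max(abs(r - d) for r, d in zip(restored, data)))
```

With a single 4-bit scale shared by all eight values, the outlier 6.0 would set the scale and every small value would round to zero. Giving the outlier its own scale keeps the small values resolvable while the average bit width stays close to the lower of the two widths.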
Keywords: Vision transformer; accelerator; hybrid-precision quantization; FPGA
DOI: 10.1109/TCSII.2025.3547340
Indexed by: SCI
Language: English
Funding: Open Project Program of Anhui Province Key Laboratory of Spintronic Chip Research and Manufacturing [WNKFKT-25-01]; National Science Foundation of China [62104025]; State Key Laboratory of Computer Architecture (ICT, CAS) [CARCHB202117]
WOS Research Area: Engineering
WOS Category: Engineering, Electrical & Electronic
WOS Accession Number: WOS:001480532800020
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
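Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/40647
Collection: Journal Articles of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding Authors: Chang, Liang; Tan, Haining
Affiliations:
1. Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China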
Recommended Citation
GB/T 7714: Zhao, Pan, Xue, Donghui, Wu, Licheng, et al. HEAT: Efficient Vision Transformer Accelerator With Hybrid-Precision Quantization[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2025, 72(5): 758-762.
APA: Zhao, Pan., Xue, Donghui., Wu, Licheng., Chang, Liang., Tan, Haining., ... & Zhou, Jun. (2025). HEAT: Efficient Vision Transformer Accelerator With Hybrid-Precision Quantization. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 72(5), 758-762.
MLA: Zhao, Pan, et al. "HEAT: Efficient Vision Transformer Accelerator With Hybrid-Precision Quantization." IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS 72.5 (2025): 758-762.