Attar: RRAM-based in-memory attention accelerator with software-hardware co-optimization
Li, Bing1; Qi, Ying1; Wang, Ying2; Han, Yinhe2
Publication Date: 2025-03-01
Journal: SCIENCE CHINA-INFORMATION SCIENCES
ISSN: 1674-733X
Volume: 68, Issue: 3, Pages: 17
Abstract: The attention mechanism has become a pivotal component in artificial intelligence, significantly enhancing the performance of deep learning applications. However, its quadratic computational complexity and intricate computations lead to substantial inefficiencies when processing long sequences. To address these challenges, we introduce Attar, a resistive random access memory (RRAM)-based in-memory accelerator designed to optimize attention mechanisms through software-hardware co-optimization. Attar leverages efficient Top-k pruning and quantization strategies to exploit the sparsity and redundancy of attention matrices, and incorporates an RRAM-based in-memory softmax engine by harnessing the versatility of the RRAM crossbar. Comprehensive evaluations demonstrate that Attar achieves a performance improvement of up to 4.88x and energy savings of 55.38% over previous computing-in-memory (CIM)-based accelerators across various models and datasets while maintaining comparable accuracy. This work underscores the potential of in-memory computing to enhance the efficiency of attention-based models without compromising their effectiveness.
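The Top-k pruning mentioned in the abstract can be pictured with a short software-level sketch: for each query, only the k largest attention scores are kept before the softmax, so the attention matrix becomes row-sparse. The NumPy code below is a minimal illustration under assumed shapes and a hypothetical function name (topk_pruned_attention); the paper itself maps these operations onto RRAM crossbars and an in-memory softmax engine, which this sketch does not model.

    import numpy as np

    def topk_pruned_attention(Q, K, V, k):
        # Illustrative sketch only: keep the k largest scores per query,
        # mask the rest, then apply a numerically stable softmax.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)                  # (n, n) attention scores
        kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
        masked = np.where(scores >= kth, scores, -np.inf)
        masked = masked - masked.max(axis=-1, keepdims=True)
        probs = np.exp(masked)
        probs /= probs.sum(axis=-1, keepdims=True)     # row-sparse attention weights
        return probs @ V                               # (n, d) attention output

    # Example with assumed sizes: 8 tokens, 16-dim heads, keep 3 keys per query.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((8, 16)) for _ in range(3))
    out = topk_pruned_attention(Q, K, V, k=3)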
Keywords: RRAM; computing-in-memory; attention; pruning; quantization
DOI: 10.1007/s11432-024-4247-4
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China [62204164]; National Natural Science Foundation of China [62222411]; National Natural Science Foundation of China [62025404]; National Key Research and Development Program of China [2023YFB4404400]; Hundred Talents Program of the Chinese Academy of Sciences [E4YB012]; Institute of Microelectronics, Chinese Academy of Sciences
WOS Research Areas: Computer Science; Engineering
WOS Categories: Computer Science, Information Systems; Engineering, Electrical & Electronic
WOS Record ID: WOS:001422340100004
Publisher: SCIENCE PRESS
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/40731
Collection: Journal Papers of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding Authors: Li, Bing; Han, Yinhe
Affiliations: 1. Capital Normal Univ, Informat Engn Coll, Beijing 100048, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Res Ctr Intelligent Comp Syst, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714
Li, Bing, Qi, Ying, Wang, Ying, et al. Attar: RRAM-based in-memory attention accelerator with software-hardware co-optimization[J]. SCIENCE CHINA-INFORMATION SCIENCES, 2025, 68(3): 17.
APA: Li, Bing, Qi, Ying, Wang, Ying, & Han, Yinhe. (2025). Attar: RRAM-based in-memory attention accelerator with software-hardware co-optimization. SCIENCE CHINA-INFORMATION SCIENCES, 68(3), 17.
MLA: Li, Bing, et al. "Attar: RRAM-based in-memory attention accelerator with software-hardware co-optimization". SCIENCE CHINA-INFORMATION SCIENCES 68.3 (2025): 17.