Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product
Lu, Xiaobo [1]; Fang, Jianbin [1]; Peng, Lin [1]; Huang, Chun [1]; Du, Zidong [2]; Zhao, Yongwei [3]; Wang, Zheng [4]
2024-11-01
Journal: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION
ISSN: 1544-3566
Volume: 21; Issue: 4; Pages: 25
Abstract: Sparse-dense matrix multiplication (SpMM) is the performance bottleneck of many high-performance and deep-learning applications, making it attractive to design specialized SpMM hardware accelerators. Unfortunately, existing hardware solutions either do not take full advantage of data reuse opportunities in the input and output matrices or suffer from irregular memory access patterns. These strategies increase off-chip memory traffic and bandwidth pressure, leaving much room for improvement. We present MENTOR, a new approach to designing SpMM accelerators. Our key insight is that column-wise dataflow, while rarely exploited in prior work, can address these issues in SpMM computations. MENTOR is a software-hardware co-design approach that leverages column-wise dataflow to improve data reuse and make memory accesses more regular. On the software level, MENTOR incorporates a novel streaming construction scheme that preprocesses the input matrix to enable a streaming access pattern. On the hardware level, it employs a fully pipelined design to further unlock the potential of column-wise dataflow. The design of MENTOR is underpinned by a carefully designed analytical model that finds the tradeoff between performance and hardware resources. We have implemented an FPGA prototype of MENTOR. Experimental results show that, compared with state-of-the-art hardware solutions, MENTOR achieves a geomean speedup of 2.05x (up to 3.98x), reduces memory traffic by a geomean of 2.92x (up to 4.93x), and improves bandwidth utilization by a geomean of 1.38x (up to 2.89x).
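To make the term "column-wise dataflow" concrete, the following is a minimal software sketch of a column-wise product for SpMM. It is not MENTOR's hardware design or its streaming construction scheme, neither of which is described in this record. It assumes the common definition in which each output column C[:, j] is built as a weighted sum of the columns of the sparse matrix A, and it assumes A is stored in CSC form. The function name `spmm_column_wise` and its parameters are hypothetical.

```python
def spmm_column_wise(n_rows, col_ptr, row_idx, vals, dense):
    """Compute C = A @ B, with A sparse (CSC arrays) and B dense (list of rows).

    Assumed loop order (illustrative only): one output column at a time,
    accumulating scaled columns of A. Not taken from the MENTOR paper.
    """
    n_inner = len(col_ptr) - 1              # number of columns of A
    n_cols = len(dense[0]) if dense else 0  # number of columns of B and C
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):                 # output column C[:, j]
        for k in range(n_inner):            # columns of A, visited in order
            b = dense[k][j]
            if b == 0.0:
                continue                    # a zero weight contributes nothing
            # Nonzeros of A[:, k] are contiguous in CSC storage.
            for p in range(col_ptr[k], col_ptr[k + 1]):
                out[row_idx[p]][j] += vals[p] * b
    return out


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSC form
col_ptr = [0, 1, 2, 3]
row_idx = [0, 1, 0]
vals = [1.0, 3.0, 2.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = spmm_column_wise(2, col_ptr, row_idx, vals, B)
print(C)  # [[11.0, 14.0], [9.0, 12.0]]
```

In this loop order the nonzeros of each column of A are read sequentially and all updates for a given j land in a single output column, which is one plausible reading of the reuse and regular-access benefits the abstract attributes to column-wise dataflow.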
Keywords: Hardware; Hardware accelerators; Computer systems organization; Architectures
DOI: 10.1145/3688612
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China [2023YFB3001503]
WOS research area: Computer Science
WOS category: Computer Science, Hardware & Architecture; Computer Science, Theory & Methods
WOS accession number: WOS:001386358100002
Publisher: ASSOC COMPUTING MACHINERY
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/40805
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Papers (English)
Corresponding author: Lu, Xiaobo
Affiliations: 1. Natl Univ Def Technol, Sch Comp Sci & Technol, Changsha, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
4. Northwest Univ, Xian, Shaanxi, Peoples R China
Recommended citation
GB/T 7714
Lu, Xiaobo,Fang, Jianbin,Peng, Lin,et al. Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,2024,21(4):25.
APA Lu, Xiaobo.,Fang, Jianbin.,Peng, Lin.,Huang, Chun.,Du, Zidong.,...&Wang, Zheng.(2024).Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product.ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,21(4),25.
MLA Lu, Xiaobo,et al."Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product".ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 21.4(2024):25.