A Cross-Platform SpMV Framework on Many-Core Architectures
Zhang, Yunquan (1); Li, Shigang (1); Yan, Shengen (2); Zhou, Huiyang (3)
2016-12-01
Journal: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION
ISSN: 1544-3566
Volume: 13; Issue: 4; Pages: 25
Abstract: Sparse Matrix-Vector multiplication (SpMV) is a key operation in engineering and scientific computing. Although the previous work has shown impressive progress in optimizing SpMV on many-core architectures, load imbalance and high memory bandwidth remain the critical performance bottlenecks. We present our novel solutions to these problems, for both GPUs and Intel MIC many-core architectures. First, we devise a new SpMV format, called Blocked Compressed Common Coordinate (BCCOO). BCCOO extends the blocked Common Coordinate (COO) by using bit flags to store the row indices to alleviate the bandwidth problem. We further improve this format by partitioning the matrix into vertical slices for better data locality. Then, to address the load imbalance problem, we propose a highly efficient matrix-based segmented sum/scan algorithm for SpMV, which eliminates global synchronization. At last, we introduce an autotuning framework to choose optimization parameters. Experimental results show that our proposed framework has a significant advantage over the existing SpMV libraries. In single precision, our proposed scheme outperforms clSpMV COCKTAIL format by 255% on average on AMD FirePro W8000, and outperforms CUSPARSE V7.0 by 73.7% on average and outperforms CSR5 by 53.6% on average on GeForce Titan X; in double precision, our proposed scheme outperforms CUSPARSE V7.0 by 34.0% on average and outperforms CSR5 by 16.2% on average on Tesla K20, and has equivalent performance compared with CSR5 on Intel MIC.
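The abstract's two central ideas, replacing explicit row indices with bit flags and computing SpMV through a segmented sum, can be illustrated with a minimal sketch. This is not the paper's BCCOO implementation: it is unblocked, sequential, and has no vertical slicing or autotuning. The function names and the flag convention (1 marks the last nonzero of a row) are assumptions made for illustration, and the sketch assumes the entries are sorted by row and that every row has at least one nonzero.

```python
def to_bitflag_coo(rows, cols, vals):
    """Convert row-sorted COO triples to (flags, cols, vals).

    flags[i] is 1 when entry i is the last nonzero of its row, else 0.
    Assumption: entries are sorted by row and no row is empty, so the
    row index can be recovered by counting flags.
    """
    n = len(rows)
    flags = [1 if i == n - 1 or rows[i + 1] != rows[i] else 0 for i in range(n)]
    return flags, list(cols), list(vals)


def spmv_segmented_sum(flags, cols, vals, x):
    """Compute y = A @ x as a sequential segmented sum over the products."""
    y = []
    acc = 0.0
    for flag, c, v in zip(flags, cols, vals):
        acc += v * x[c]
        if flag:  # row boundary: emit this segment's sum and start a new one
            y.append(acc)
            acc = 0.0
    return y


# Example: A = [[1, 0, 2], [0, 3, 0], [4, 5, 6]] stored as COO triples.
flags, cols, vals = to_bitflag_coo(
    [0, 0, 1, 2, 2, 2], [0, 2, 1, 0, 1, 2], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
)
y = spmv_segmented_sum(flags, cols, vals, [1.0, 2.0, 3.0])
```

Storing one bit per nonzero instead of a full integer row index is what reduces memory traffic in this sketch; a parallel version would replace the loop with a segmented scan over the products.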
Keywords: SpMV; segmented scan; BCCOO; OpenCL; CUDA; GPU; Intel MIC; parallel algorithms
DOI: 10.1145/2994148
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China[61502450] ; National Natural Science Foundation of China[61432018] ; National Natural Science Foundation of China[61521092] ; National Natural Science Foundation of China[61272136] ; National Key Research and Development Program of China[2016YFB0200803] ; NSF project[1216569] ; AMD Inc.
WOS Research Area: Computer Science
WOS Category: Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods
WOS Accession Number: WOS:000392416400002
Publisher: ASSOC COMPUTING MACHINERY
Citation statistics
Times cited: 8 [WOS]
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/7661
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding authors: Li, Shigang; Yan, Shengen
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
2. Chinese Univ Hong Kong, Dept Informat Engn, SenseTime Grp Ltd, Hong Kong, Hong Kong, Peoples R China
3. North Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
Recommended citation
GB/T 7714: Zhang, Yunquan,Li, Shigang,Yan, Shengen,et al. A Cross-Platform SpMV Framework on Many-Core Architectures[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,2016,13(4):25.
APA: Zhang, Yunquan,Li, Shigang,Yan, Shengen,&Zhou, Huiyang.(2016).A Cross-Platform SpMV Framework on Many-Core Architectures.ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,13(4),25.
MLA: Zhang, Yunquan,et al."A Cross-Platform SpMV Framework on Many-Core Architectures".ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 13.4(2016):25.