Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs
Authors: Wang, Xueying [1,5]; Li, Guangli [2,3,6,7]; Jia, Zhen [4,8]; Feng, Xiaobing [2,3,6,7]; Wang, Yida [4,8]
2024-03-01
Journal: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION
ISSN: 1544-3566
Volume: 21; Issue: 1; Pages: 26
Abstract: Low-precision computation has emerged as one of the most effective techniques for accelerating convolutional neural networks and has garnered widespread support on modern hardware. Despite its effectiveness in accelerating convolutional neural networks, low-precision computation has not been commonly applied to fast convolutions, such as the Winograd algorithm, due to numerical issues. In this article, we propose an effective quantized Winograd convolution, named LoWino, which employs an in-side quantization method in the Winograd domain to reduce the precision loss caused by transformations. Meanwhile, we present an efficient implementation that integrates well-designed optimization techniques, allowing us to fully exploit the capabilities of low-precision computation on modern CPUs. We evaluate LoWino on two Intel Xeon Scalable Processor platforms with representative convolutional layers and neural network models. The experimental results demonstrate that our approach can achieve an average of 1.84x and 1.91x operator speedups over state-of-the-art implementations in the vendor library while keeping accuracy loss at a reasonable level.
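To illustrate the general idea of quantizing inside the Winograd domain that the abstract describes, the following is a minimal sketch and not the LoWino implementation. It assumes the standard 1D F(2,3) transform matrices and a simple symmetric per-tensor 8-bit scheme; the function names (`matvec`, `quantize`, `direct_conv`, `winograd_quantized`) are hypothetical and chosen only for this example. The input and filter are transformed in floating point first, and only the transformed values are quantized, so the element-wise multiplication runs on integers.

```python
# Standard Winograd F(2,3) transform matrices (2 outputs, 3-tap filter).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def quantize(v, bits=8):
    """Symmetric per-tensor quantization (an assumed, simple scheme).

    Returns the integer values and the scale needed to dequantize them.
    """
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(x) for x in v)
    scale = peak / qmax if peak > 0 else 1.0
    ints = [max(-qmax, min(qmax, round(x / scale))) for x in v]
    return ints, scale


def direct_conv(d, g):
    """Reference: two outputs of a 3-tap sliding dot product over 4 inputs."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_quantized(d, g):
    """F(2,3) with quantization applied after the input/filter transforms."""
    u = matvec(G, g)    # filter transform, floating point
    v = matvec(BT, d)   # input transform, floating point
    uq, su = quantize(u)
    vq, sv = quantize(v)
    m = [a * b for a, b in zip(uq, vq)]      # integer element-wise products
    m_f = [x * su * sv for x in m]           # dequantize
    return matvec(AT, m_f)                   # output transform


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 0.25]
    print("direct   :", direct_conv(d, g))
    print("quantized:", winograd_quantized(d, g))
```

Comparing the two printed results gives a rough feel for the error introduced by 8-bit quantization of the transformed values; the scheme evaluated in the article may differ in granularity, tile size, and scaling strategy.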
Keywords: Deep learning; Winograd convolution; low-precision computation
DOI: 10.1145/3632956
Indexed by: SCI
Language: English
Funding: National Key R&D Program of China[2021ZD0110101] ; National Natural Science Foundation of China[62090024] ; National Natural Science Foundation of China[62232015] ; National Natural Science Foundation of China[62302479] ; China Postdoctoral Science Foundation[2023M733566] ; Innovation Funding of ICT, CAS[E361010]
WOS Research Area: Computer Science
WOS Category: Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods
WOS Accession Number: WOS:001193465400005
Publisher: ASSOC COMPUTING MACHINERY
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/38765
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding Author: Li, Guangli
Affiliations:
1. Beijing Univ Posts & Telecommun, Beijing, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Beijing, Peoples R China
4. Amazon Web Serv, Seattle, WA USA
5. Beijing Univ Posts & Telecommun, 10 Xitucheng Rd, Beijing 100876, Peoples R China
6. Chinese Acad Sci, Inst Comp Technol, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
7. Univ Chinese Acad Sci, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
8. Amazon Web Serv, 2795 Augustine Dr, Santa Clara, CA 95054 USA
Recommended Citation
GB/T 7714
Wang, Xueying,Li, Guangli,Jia, Zhen,et al. Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,2024,21(1):26.
APA Wang, Xueying,Li, Guangli,Jia, Zhen,Feng, Xiaobing,&Wang, Yida.(2024).Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs.ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,21(1),26.
MLA Wang, Xueying,et al."Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs".ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 21.1(2024):26.