Institute of Computing Technology, Chinese Academy IR
Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs
Wang, Xueying (1,5); Li, Guangli (2,3,6,7); Jia, Zhen (4,8); Feng, Xiaobing (2,3,6,7); Wang, Yida (4,8)
2024-03-01
Journal | ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION |
ISSN | 1544-3566 |
Volume | 21; Issue: 1; Pages: 26 |
Abstract | Low-precision computation has emerged as one of the most effective techniques for accelerating convolutional neural networks and has garnered widespread support on modern hardware. Despite its effectiveness in accelerating convolutional neural networks, low-precision computation has not been commonly applied to fast convolutions, such as the Winograd algorithm, due to numerical issues. In this article, we propose an effective quantized Winograd convolution, named LoWino, which employs an in-side quantization method in the Winograd domain to reduce the precision loss caused by transformations. Meanwhile, we present an efficient implementation that integrates well-designed optimization techniques, allowing us to fully exploit the capabilities of low-precision computation on modern CPUs. We evaluate LoWino on two Intel Xeon Scalable Processor platforms with representative convolutional layers and neural network models. The experimental results demonstrate that our approach can achieve an average of 1.84x and 1.91x operator speedups over state-of-the-art implementations in the vendor library while keeping accuracy loss at a reasonable level. |
Keywords | Deep learning; Winograd convolution; low-precision computation |
DOI | 10.1145/3632956 |
Indexed By | SCI |
Language | English |
Funding | National Key R&D Program of China[2021ZD0110101] ; National Natural Science Foundation of China[62090024] ; National Natural Science Foundation of China[62232015] ; National Natural Science Foundation of China[62302479] ; China Postdoctoral Science Foundation[2023M733566] ; Innovation Funding of ICT, CAS[E361010] |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods |
WOS Accession Number | WOS:001193465400005 |
Publisher | ASSOC COMPUTING MACHINERY |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/38765 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English) |
Corresponding Author | Li, Guangli |
Affiliations | 1. Beijing Univ Posts & Telecommun, Beijing, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China; 3. Univ Chinese Acad Sci, Beijing, Peoples R China; 4. Amazon Web Serv, Seattle, WA USA; 5. Beijing Univ Posts & Telecommun, 10 Xitucheng Rd, Beijing 100876, Peoples R China; 6. Chinese Acad Sci, Inst Comp Technol, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China; 7. Univ Chinese Acad Sci, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China; 8. Amazon Web Serv, 2795 Augustine Dr, Santa Clara, CA 95054 USA |
Recommended Citation (GB/T 7714) | Wang, Xueying,Li, Guangli,Jia, Zhen,et al. Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,2024,21(1):26. |
APA | Wang, Xueying,Li, Guangli,Jia, Zhen,Feng, Xiaobing,&Wang, Yida.(2024).Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs.ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION,21(1),26. |
MLA | Wang, Xueying,et al."Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs".ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 21.1(2024):26. |