Institute of Computing Technology, Chinese Academy of Sciences IR
| Title | OptiFX: Automatic Optimization for Convolutional Neural Networks with Aggressive Operator Fusion on GPUs |
| Authors | Wang, Xueying1; Li, Shigang1; Qian, Hao2; Luo, Fan3,4; Hao, Zhaoyang3,4; Wu, Tong1; Xu, Ruiyuan3,4; Cui, Huimin3,4; Feng, Xiaobing3,4; Li, Guangli2,3,4 |
| Date Issued | 2025-06-01 |
| Journal | ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION |
| ISSN | 1544-3566 |
| Volume | 22 |
| Issue | 2 |
| Pages | 27 |
| Abstract | Convolutional Neural Networks (CNNs) are fundamental to advancing computer vision technologies. As CNNs become more complex and larger, optimizing model inference remains a critical challenge in both industry and academia. On modern GPU platforms, CNN operators are typically memory-bound, leading to significant performance degradation due to memory wall effects. While recent advancements have utilized operator fusion (merging multiple operators into one) to enhance inference performance, the fusion of multiple region-based operators, such as convolution, is seldom addressed. This article introduces AFusioN, a novel operator fusion technique aimed at improving inference performance, and OptiFX, an automatic optimization framework based on this approach. OptiFX employs a cost-based backtracking search to identify optimal sub-graphs for fusion and utilizes template-based code generation to create efficient kernels for these fused sub-graphs. We evaluate OptiFX across seven prominent CNN architectures (GoogLeNet, ResNet, DenseNet, MobileNet, SqueezeNet, NasNet, and UNet) on Nvidia A6000 Ada, RTX 4090, and Jetson AGX Orin platforms. Our results demonstrate that OptiFX significantly outperforms existing methods, achieving average speedups of 2.91x, 3.30x, and 2.09x in accelerating inference performance on these platforms, respectively. |
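The abstract above describes a cost-based backtracking search that chooses which operators to fuse. As a purely illustrative sketch (not OptiFX's actual algorithm), the idea can be shown on a toy linear operator chain: every fused group pays one kernel-launch overhead, and backtracking over split points finds the cheapest partition. All cost values, the `MAX_FUSED` legality rule, and the operator names here are invented for illustration.

```python
# Toy sketch of a cost-based backtracking search for operator fusion.
# Hypothetical cost model: each fused group costs one kernel launch
# (LAUNCH_COST) plus the per-operator compute costs; at most MAX_FUSED
# consecutive operators are assumed to fuse legally. These numbers and
# rules are illustrative only, not OptiFX's real model.

LAUNCH_COST = 5   # fixed per-kernel launch overhead (arbitrary units)
MAX_FUSED = 3     # assumed fusion-legality limit (arbitrary)

def group_cost(group):
    # One launch for the whole fused group, plus each op's compute cost.
    return LAUNCH_COST + sum(op["cost"] for op in group)

def best_partition(ops):
    """Backtrack over split points; return (min_total_cost, partition)."""
    if not ops:
        return 0, []
    best_cost, best_part = float("inf"), None
    for k in range(1, min(MAX_FUSED, len(ops)) + 1):
        head, tail = ops[:k], ops[k:]
        rest_cost, rest_part = best_partition(tail)
        total = group_cost(head) + rest_cost
        if total < best_cost:
            best_cost, best_part = total, [head] + rest_part
    return best_cost, best_part

ops = [{"name": n, "cost": c}
       for n, c in [("conv1", 4), ("relu1", 1), ("conv2", 4), ("relu2", 1)]]
cost, part = best_partition(ops)
print(cost, [[op["name"] for op in g] for g in part])
```

With these toy costs, fusing the chain into two kernels instead of four saves two launch overheads; a real system like OptiFX would additionally account for memory traffic saved by keeping intermediates on-chip and would search over general sub-graphs, not just linear chains.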
| Keywords | Deep learning systems; convolutional neural networks; operator fusion |
| DOI | 10.1145/3716876 |
| Indexed By | SCI |
| Language | English |
| Funding Projects | National Science and Technology Major Project[2023ZD0120502] ; National Natural Science Foundation of China[62302479] ; National Natural Science Foundation of China[62232015] ; National Natural Science Foundation of China[62090024] ; National Natural Science Foundation of China[62372055] ; Fund of Laboratory for Advanced Computing and Intelligence Engineering, the China Postdoctoral Science Foundation[2024M750258] ; Fund of Laboratory for Advanced Computing and Intelligence Engineering, the China Postdoctoral Science Foundation[2023M733566] ; CCF-Tencent Rhino-Bird Open Research Fund ; State Key Lab of Processors, Institute of Computing Technology, CAS[CLQ202411] ; Innovation Funding of ICT, CAS[E361010] ; Innovation Funding of ICT, CAS[E261110] ; Australian Research Council (ARC) Grant[DP250104934] |
| WOS Research Area | Computer Science |
| WOS Subject Categories | Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods |
| WOS Record ID | WOS:001532815500004 |
| Publisher | Association for Computing Machinery (ACM) |
| Document Type | Journal article |
| Item Identifier | http://119.78.100.204/handle/2XEOYT63/42097 |
| Collection | Institute of Computing Technology, CAS: Journal Papers (English) |
| Corresponding Authors | Li, Shigang; Li, Guangli |
| Author Affiliations | 1. Beijing University of Posts and Telecommunications, Beijing, China; 2. University of New South Wales, Sydney, NSW, Australia; 3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 4. University of Chinese Academy of Sciences, Beijing, China |
| Recommended Citation (GB/T 7714) | Wang, Xueying, Li, Shigang, Qian, Hao, et al. OptiFX: Automatic Optimization for Convolutional Neural Networks with Aggressive Operator Fusion on GPUs[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, 22(2): 27. |
| APA | Wang, Xueying, Li, Shigang, Qian, Hao, Luo, Fan, Hao, Zhaoyang, ... & Li, Guangli. (2025). OptiFX: Automatic Optimization for Convolutional Neural Networks with Aggressive Operator Fusion on GPUs. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 22(2), 27. |
| MLA | Wang, Xueying, et al. "OptiFX: Automatic Optimization for Convolutional Neural Networks with Aggressive Operator Fusion on GPUs". ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION 22.2 (2025): 27. |
| Files in This Item | No files associated with this item. |
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.