Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning
Li, Guangli1,2; Ma, Xiu3; Wang, Xueying1; Yue, Hengshan3; Li, Jiansong1,2; Liu, Lei1; Feng, Xiaobing1,2; Xue, Jingling4
2022-03-01
Journal: JOURNAL OF SYSTEMS ARCHITECTURE
ISSN: 1383-7621
Volume: 124, Pages: 11
Abstract: While deep learning has shown superior performance in various intelligent tasks, deploying sophisticated models on resource-limited edge devices remains a challenging problem. Filter pruning performs a system-independent optimization that shrinks a neural network model into a thinner one, providing an attractive solution for efficient on-device inference. Prevailing approaches usually apply a fixed pruning rate to the whole neural network model to reduce the optimization space of filter pruning. However, the filters of different layers may differ in their sensitivity to model inference, so a flexible pruning-rate setting can potentially further increase the accuracy of compressed models. In this paper, we propose FlexPruner, a novel approach for compressing and accelerating neural network models via flexible-rate filter pruning. Our approach follows a greedy-based strategy to select the filters to be pruned and performs an iterative loss-aware pruning process, thereby achieving a remarkable accuracy improvement over existing methods when numerous filters are pruned. Evaluation with state-of-the-art residual neural networks on six representative intelligent edge accelerators demonstrates the effectiveness of FlexPruner, which decreases the accuracy degradation of pruned models by leveraging flexible pruning rates and achieves practical speedups for on-device inference.
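The abstract describes the approach only at a high level. As an illustrative sketch (not the paper's FlexPruner algorithm), greedily ranking filters across all layers by a simple importance score, here the L1 norm of each filter's weights, already produces flexible per-layer pruning rates: layers whose filters matter less lose more of them. The layer names and weights below are hypothetical.

```python
# Illustrative sketch of flexible-rate filter pruning (assumptions: importance is
# approximated by the L1 norm of a filter's weights; layer names are hypothetical).
# This is NOT the paper's exact greedy loss-aware procedure, only the core idea:
# a single global ranking yields different pruning rates per layer.

def greedy_flexible_prune(layers, total_to_prune):
    """layers: dict mapping layer name -> list of filters (each a list of weights).
    Greedily removes the globally least-important filters, so each layer ends up
    with its own (flexible) pruning rate rather than one fixed rate."""
    # Score every filter in every layer by the L1 norm of its weights.
    scored = []
    for name, filters in layers.items():
        for idx, f in enumerate(filters):
            scored.append((sum(abs(w) for w in f), name, idx))
    scored.sort()  # least important first

    # Prune the globally weakest filters, wherever they live.
    pruned = {name: set() for name in layers}
    for _, name, idx in scored[:total_to_prune]:
        pruned[name].add(idx)

    # Per-layer pruning rates differ -- the "flexible rate" effect.
    rates = {name: len(pruned[name]) / len(filters)
             for name, filters in layers.items()}
    return pruned, rates

# Hypothetical two-layer example: conv1's filters are smaller in magnitude,
# so it loses 2 of 3 filters while conv2 loses only 1 of 3.
layers = {
    "conv1": [[0.1, -0.2], [0.05, 0.02], [0.3, 0.1]],
    "conv2": [[1.0, 0.9], [0.8, -0.7], [0.02, 0.01]],
}
pruned, rates = greedy_flexible_prune(layers, total_to_prune=3)
```

A fixed-rate baseline would instead remove the same fraction from every layer, which the paper argues ignores the differing sensitivity of individual layers.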
Keywords: Edge intelligence; Deep learning; Neural network compression
DOI: 10.1016/j.sysarc.2022.102431
Indexed by: SCI
Language: English
Funding: National Key R&D Program of China [2017YFB1003103]; National Natural Science Foundation of China [61872043]; National Natural Science Foundation of China [61802368]; Science Fund for Creative Research Groups of the National Natural Science Foundation of China [61521092]
WOS Research Area: Computer Science
WOS Categories: Computer Science, Hardware & Architecture; Computer Science, Software Engineering
WOS Record ID: WOS:000782573200016
Publisher: ELSEVIER
Citation Statistics: Cited 24 times [WOS]
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/18875
Collection: Journal Articles of the Institute of Computing Technology, CAS (English)
Corresponding Author: Li, Guangli
Affiliations:
1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Beijing, Peoples R China
3. Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
4. Univ New S Wales, Sch Engn & Comp Sci, Sydney, NSW, Australia
Recommended Citation:
GB/T 7714
Li, Guangli,Ma, Xiu,Wang, Xueying,et al. Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning[J]. JOURNAL OF SYSTEMS ARCHITECTURE,2022,124:11.
APA Li, Guangli.,Ma, Xiu.,Wang, Xueying.,Yue, Hengshan.,Li, Jiansong.,...&Xue, Jingling.(2022).Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning.JOURNAL OF SYSTEMS ARCHITECTURE,124,11.
MLA Li, Guangli,et al."Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning".JOURNAL OF SYSTEMS ARCHITECTURE 124(2022):11.
Files in This Item:
No files associated with this item.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.