CSpace
DPQ: dynamic pseudo-mean mixed-precision quantization for pruned neural network
Pei, Songwen1,2,3; Wang, Jiyao1; Zhang, Bingxue1; Qin, Wei1; Xue, Hai1; Ye, Xiaochun2; Chen, Mingsong3
2024-01-31
Journal: MACHINE LEARNING
ISSN: 0885-6125
Pages: 14
Abstract: The number of layers and hyper-parameters of deep neural networks keeps growing, producing large-scale networks trained on huge masses of data. However, it is difficult to deploy deep neural networks on resource-constrained edge devices. Mixed-precision network quantization is a challenging way to prune and compress deep neural network models while discovering the optimal bit width for each layer. To address this challenge, we propose dynamic pseudo-mean mixed-precision quantization (DPQ), which introduces two-bit scaling factors to compensate for quantization errors. Furthermore, an activation quantization method named random parameters clipping (RPC) is proposed. RPC adopts partial activation quantization to reduce the loss of accuracy. DPQ can dynamically adjust the bit precision of weight quantization according to the distribution of weights. This results in a quantization scheme with strong robustness compared to previous methods. Extensive experiments demonstrate that DPQ achieves a 15.43× compression rate for ResNet20 on the CIFAR-10 dataset with a 0.22% increase in accuracy, and a 35.25× compression rate for ResNet56 on the SVHN dataset with a 0.12% increase in accuracy.
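For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of generic per-layer mixed-precision weight quantization and of how a compression rate can be computed from per-layer bit widths. It is not the DPQ or RPC algorithm from the paper: the function names `quantize_uniform` and `compression_rate`, the symmetric uniform quantizer, and the 32-bit full-precision baseline are all assumptions made for illustration, and the sketch ignores pruning and the storage cost of scaling factors.

```python
import numpy as np

def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to `bits` bits (bits >= 2).

    Generic illustration only; this is NOT the DPQ scheme from the paper.
    """
    levels = 2 ** (bits - 1) - 1              # largest integer code, e.g. 7 for 4 bits
    max_abs = float(np.max(np.abs(weights)))
    if max_abs == 0.0:
        return np.zeros_like(weights)
    scale = max_abs / levels                  # one floating-point scale per layer
    codes = np.clip(np.round(weights / scale), -levels, levels)
    return codes * scale                      # de-quantized weights

def compression_rate(layer_sizes, layer_bits, full_bits=32):
    """Full-precision storage divided by mixed-precision storage.

    Assumes a 32-bit baseline and ignores pruning and scale-factor overhead.
    """
    original = sum(n * full_bits for n in layer_sizes)
    compressed = sum(n * b for n, b in zip(layer_sizes, layer_bits))
    return original / compressed

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical layers quantized at different bit widths.
    layers = [rng.normal(size=1000), rng.normal(size=4000)]
    bit_widths = [4, 2]
    quantized = [quantize_uniform(w, b) for w, b in zip(layers, bit_widths)]
    sizes = [w.size for w in layers]
    print("compression rate:", compression_rate(sizes, bit_widths))
```

In this sketch, assigning fewer bits to larger layers raises the compression rate, which is why the per-layer choice of bit width matters in mixed-precision schemes.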
Keywords: Big data; Compression; Deep learning; Pseudo-mean mixed-precision quantization; Pruned neural network
DOI: 10.1007/s10994-023-06453-3
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China[61975124] ; National Natural Science Foundation of China[20ZR1438500] ; Shanghai Natural Science Foundation[CARCHA202111] ; State Key Laboratory of Computer Architecture (ICT, CAS)[OP202202] ; Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education, East China Normal University
WOS Research Area: Computer Science
WOS Category: Computer Science, Artificial Intelligence
WOS Accession Number: WOS:001151730300001
Publisher: SPRINGER
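Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/38402
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding Author: Pei, Songwen
Affiliations: 1.Univ Shanghai Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 200093, Peoples R China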
2.Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
3.East China Normal Univ, Software Hardware Codesign Technol & Applicat, Minist Educ, Engn Res Ctr, Shanghai 200062, Peoples R China
Recommended Citation
GB/T 7714
Pei, Songwen,Wang, Jiyao,Zhang, Bingxue,et al. DPQ: dynamic pseudo-mean mixed-precision quantization for pruned neural network[J]. MACHINE LEARNING,2024:14.
APA Pei, Songwen.,Wang, Jiyao.,Zhang, Bingxue.,Qin, Wei.,Xue, Hai.,...&Chen, Mingsong.(2024).DPQ: dynamic pseudo-mean mixed-precision quantization for pruned neural network.MACHINE LEARNING,14.
MLA Pei, Songwen,et al."DPQ: dynamic pseudo-mean mixed-precision quantization for pruned neural network".MACHINE LEARNING (2024):14.