CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures
Zou, Kaiwei [1,2]; Wang, Ying [1,2]; Cheng, Long [3]; Qu, Songyun [2,4]; Li, Huawei [1,2,5]; Li, Xiaowei [1,2]
2022-07-01
Journal: IEEE TRANSACTIONS ON COMPUTERS
ISSN: 0018-9340
Volume: 71; Issue: 7; Pages: 1626-1639
Abstract: Real-time inference of deep learning models on embedded and energy-efficient devices is becoming increasingly desirable with the rapid growth of artificial intelligence at the edge. Specifically, to achieve high energy efficiency and scalability, many time-sensitive applications urgently require efficient parallelization of single-pass deep neural network (DNN) inference on chip multiprocessor (CMP) architectures. However, as the number of processing cores scales up and per-core performance grows rapidly, on-chip inter-core data movement is prone to become a performance bottleneck. To remedy this problem and further improve the performance of network inference, we introduce a communication-aware DNN parallelization technique called CAP, which exploits the elasticity and noise tolerance of deep learning algorithms on CMPs. Moreover, in the hope that these studies can provide new design insights for real-time neural network inference on embedded chips, we evaluate the proposed approach on both multi-core Neural Network Accelerator (NNA) chips and general-purpose chip multiprocessors. Our experimental results show that, compared to baseline approaches, CAP achieves 1.12x-1.65x system speedups and 1.14x-2.70x energy-efficiency improvements for different neural networks while maintaining inference accuracy.
Keywords: Kernel ; Computer architecture ; Multicore processing ; Deep learning ; System-on-chip ; Parallel processing ; Real-time systems ; Neural networks ; parallel processing ; real-time and embedded systems ; single-chip multiprocessors ; reinforcement learning ; structured sparsity
DOI: 10.1109/TC.2021.3099688
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China[2020YFB1600201] ; National Natural Science Foundation of China (NSFC)[62090024] ; National Natural Science Foundation of China (NSFC)[61874124] ; National Natural Science Foundation of China (NSFC)[61876173] ; Fundamental Research Funds for the Central Universities[2021MS017]
WOS research area: Computer Science ; Engineering
WOS category: Computer Science, Hardware & Architecture ; Engineering, Electrical & Electronic
WOS accession number: WOS:000808068000011
Publisher: IEEE COMPUTER SOC
Citation statistics
Times cited: 1 [WOS]
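Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/19591
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Wang, Ying
Author affiliations:
1.Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China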
2.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3.North China Elect Power Univ, Sch Control & Comp Engn, Beijing 102206, Peoples R China
4.Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
5.Peng Cheng Lab, Shenzhen 518066, Peoples R China
Recommended citation
GB/T 7714
Zou, Kaiwei,Wang, Ying,Cheng, Long,et al. CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures[J]. IEEE TRANSACTIONS ON COMPUTERS,2022,71(7):1626-1639.
APA Zou, Kaiwei,Wang, Ying,Cheng, Long,Qu, Songyun,Li, Huawei,&Li, Xiaowei.(2022).CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures.IEEE TRANSACTIONS ON COMPUTERS,71(7),1626-1639.
MLA Zou, Kaiwei,et al."CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures".IEEE TRANSACTIONS ON COMPUTERS 71.7(2022):1626-1639.