An Instruction Set Architecture for Machine Learning
Chen, Yunji1,2,3,4; Lan, Huiying1; Du, Zidong1; Liu, Shaoli1; Tao, Jinhua1; Han, Dong1; Luo, Tao1; Guo, Qi1; Li, Ling2,5; Xie, Yuan6; Chen, Tianshi1
2019-08-01
Journal: ACM TRANSACTIONS ON COMPUTER SYSTEMS
ISSN: 0734-2071
Volume: 36; Issue: 3; Pages: 35
Abstract: Machine Learning (ML) is a family of models for learning from data to improve performance on a certain task. ML techniques, especially recently renewed neural networks (deep neural networks), have proven to be efficient for a broad range of applications. ML techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which usually are not energy efficient, since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of ML techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an ML technique (such as layers in neural networks) or even an ML technique as a whole. Although such instructions are straightforward and easy to implement for a limited set of similar ML techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different ML techniques with sufficient flexibility and efficiency. In this article, we first propose a novel domain-specific Instruction Set Architecture (ISA) for neural network (NN) accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. We then extend the application scope of Cambricon from NN to ML techniques. We also propose an assembly language, an assembler, and a runtime to support programming with Cambricon, especially targeting large-scale ML problems. Our evaluation over a total of 16 representative yet distinct ML techniques has demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of ML techniques and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao [7] (which can only accommodate three types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65nm technology incurs only negligible latency/power/area overheads, with a versatile coverage of 10 different NN benchmarks and 7 other ML benchmarks. Compared to the recent prevalent ML accelerator PuDianNao, our Cambricon-based accelerator is able to support all the ML techniques as well as the 10 NNs but with only approximately 5.1% performance loss.
DOI: 10.1145/3331469
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China[2017YFA0700900] ; National Key Research and Development Program of China[2017YFA0700902] ; National Key Research and Development Program of China[2017YFA0700901] ; National Key Research and Development Program of China[2017YFB1003101] ; NSF of China[61432016] ; NSF of China[61532016] ; NSF of China[61672491] ; NSF of China[61602441] ; NSF of China[61602446] ; NSF of China[61732002] ; NSF of China[61702478] ; NSF of China[61732007] ; NSF of China[61732020] ; Beijing Natural Science Foundation[JQ18013] ; 973 Program of China[2015CB358800] ; National Science and Technology Major Project[2018ZX01031102] ; Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences[KFJ-HGZX-013] ; Key Research Projects in Frontier Science of Chinese Academy of Sciences[QYZDB-SSW-JSC001] ; Strategic Priority Research Program of Chinese Academy of Science[XDB32050200] ; Strategic Priority Research Program of Chinese Academy of Science[XDC01020000] ; CAS Center for Excellence in Brain Science and Intelligence Technology (CEBSIT)
WOS Research Area: Computer Science
WOS Category: Computer Science, Theory & Methods
WOS Accession Number: WOS:000496739500003
Publisher: ASSOC COMPUTING MACHINERY
Citation statistics
Times cited: 7 [WOS]
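Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/14878
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Du, Zidong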
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, SKL Comp Architecture, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Beijing, Peoples R China
3. BIT, ZJLab, Inst BrainIntelligence Technol, Zhanjiang Lab, Beijing, Peoples R China
4. Shanghai Res Ctr Brain Sci & Brain Inspired Intel, Shanghai, Peoples R China
5. Chinese Acad Sci, Inst Software, Beijing, Peoples R China
6. UCSB, Dept Elect & Comp Engn, Santa Barbara, CA USA
Recommended citation
GB/T 7714: Chen, Yunji, Lan, Huiying, Du, Zidong, et al. An Instruction Set Architecture for Machine Learning[J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2019, 36(3): 35.
APA: Chen, Yunji., Lan, Huiying., Du, Zidong., Liu, Shaoli., Tao, Jinhua., ... & Chen, Tianshi. (2019). An Instruction Set Architecture for Machine Learning. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 36(3), 35.
MLA: Chen, Yunji, et al. "An Instruction Set Architecture for Machine Learning". ACM TRANSACTIONS ON COMPUTER SYSTEMS 36.3 (2019): 35.