Institute of Computing Technology, Chinese Academy IR
Sketch-fusion: A gradient compression method with multi-layer fusion for communication-efficient distributed training | |
Dai, Lingfei1,2; Gong, Luqi1; An, Zhulin1; Xu, Yongjun1; Diao, Boyu1 | |
2024-03-01 | |
Journal | JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING |
ISSN | 0743-7315 |
Volume | 185, Pages: 10 |
Abstract | Gradient compression is an effective technique for improving the efficiency of distributed training. However, introducing gradient compression can reduce model accuracy and training efficiency. Furthermore, we also find that using a layer-wise gradient compression algorithm would lead to significant compression and communication overhead, which can negatively impact the scaling efficiency of the distributed training system. To address these issues, we propose a new method called Sketch-Fusion SGD, which leverages the Count-Sketch data structure to enhance the scalability and training speed of distributed deep learning systems. Moreover, our method employs LayerFusion to optimize gradient compression algorithms' scalability and convergence efficiency by formulating an optimal multi-layer fusion strategy without introducing extra hyperparameters. We evaluate our method on a cluster of 16 GPUs and demonstrate that it can improve training efficiency by up to 18.6% without compromising the model's accuracy. In addition, we find that applying our LayerFusion algorithm to other gradient compression methods improved their scalability by up to 2.87x. |
Keywords | Gradient compression; Multi-layer fusion; Distributed stochastic gradient descent; Deep learning training |
DOI | 10.1016/j.jpdc.2023.104811 |
Indexed In | SCI |
Language | English |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Theory & Methods |
WOS Accession Number | WOS:001127654600001 |
Publisher | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/38454 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Diao, Boyu |
Affiliations | 1. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China 2. Univ Chinese Acad Sci, Coll Comp Sci, Beijing, Peoples R China |
Recommended Citation (GB/T 7714) | Dai, Lingfei,Gong, Luqi,An, Zhulin,et al. Sketch-fusion: A gradient compression method with multi-layer fusion for communication-efficient distributed training[J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING,2024,185:10. |
APA | Dai, Lingfei,Gong, Luqi,An, Zhulin,Xu, Yongjun,&Diao, Boyu.(2024).Sketch-fusion: A gradient compression method with multi-layer fusion for communication-efficient distributed training.JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING,185,10. |
MLA | Dai, Lingfei,et al."Sketch-fusion: A gradient compression method with multi-layer fusion for communication-efficient distributed training".JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 185(2024):10. |
Files in This Item | There are no files associated with this item. |
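
The abstract names the Count-Sketch data structure and multi-layer fusion without giving details, so the following is only a minimal sketch of the general idea, not the authors' Sketch-Fusion SGD or their LayerFusion strategy. It assumes a plain Count-Sketch (signed bucket sums, median estimate) and a naive fusion that simply concatenates per-layer gradients into one vector before sketching. The class `CountSketch`, the helper `fuse`, the table sizes, and the two example workers are all hypothetical names and values chosen for illustration.

```python
import random
from statistics import median


class CountSketch:
    """A rows x cols table of signed bucket sums over a vector of length dim."""

    def __init__(self, rows, cols, dim, seed=0):
        # Assumption: every worker uses the same seed, so all workers share
        # the same bucket and sign assignments and their tables can be added.
        rng = random.Random(seed)
        self.rows, self.cols, self.dim = rows, cols, dim
        self.bucket = [[rng.randrange(cols) for _ in range(dim)] for _ in range(rows)]
        self.sign = [[rng.choice((-1.0, 1.0)) for _ in range(dim)] for _ in range(rows)]
        self.table = [[0.0] * cols for _ in range(rows)]

    def accumulate(self, vec):
        """Add each coordinate, with its sign, into one bucket per row."""
        for r in range(self.rows):
            for i, v in enumerate(vec):
                self.table[r][self.bucket[r][i]] += self.sign[r][i] * v

    def merge(self, other):
        """Sketches are linear: adding tables sketches the summed vectors."""
        for r in range(self.rows):
            for c in range(self.cols):
                self.table[r][c] += other.table[r][c]

    def estimate(self, i):
        """Estimate coordinate i as the median of its signed bucket values."""
        return median(
            self.sign[r][i] * self.table[r][self.bucket[r][i]]
            for r in range(self.rows)
        )


def fuse(layers):
    """Naive fusion (an assumption): concatenate per-layer gradients."""
    return [g for layer in layers for g in layer]


# Two hypothetical workers, each holding gradients for two small "layers".
worker_a = fuse([[0.1, -0.05, 0.02, 100.0], [0.03] * 46])
worker_b = fuse([[0.0, 0.05, -0.02, 20.0], [-0.01] * 46])

# Each worker sketches its fused 50-element gradient into 3 x 8 = 24 numbers.
sk_a = CountSketch(rows=3, cols=8, dim=50, seed=42)
sk_b = CountSketch(rows=3, cols=8, dim=50, seed=42)
sk_a.accumulate(worker_a)
sk_b.accumulate(worker_b)

# Only the small tables need to be exchanged and summed.
sk_a.merge(sk_b)
est = sk_a.estimate(3)  # approximates the large summed coordinate at index 3
```

Because the sketch is linear, summing the workers' tables gives the same table as sketching the summed gradient, which is what makes this kind of structure compatible with an all-reduce style exchange. Sketching one fused vector instead of one sketch per layer is the simplest way to avoid per-layer overhead; how the paper chooses which layers to fuse is not described in this record.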