Institute of Computing Technology, Chinese Academy IR
BIRD plus: Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms
Wu, Donglei (1,2); Yang, Weihao (1,2); Zou, Xiangyu (1,2); Feng, Hao (3); Tao, Dingwen (4); Li, Shiyi (1,2); Xia, Wen (1,2); Fang, Binxing (1,2)
2024-11-01
Journal | IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS |
ISSN | 1045-9219 |
Volume | 35, Issue: 11, Pages: 2193-2207 |
Abstract | The Top-K sparsification-based compression framework has been extensively explored for reducing communication costs in distributed learning. However, we identify several issues with existing Top-K sparsification-based compression methods: (i) the limited compressibility of the Top-K parameters' indexes critically restricts the overall communication compression ratio; (ii) several time-consuming compression operations significantly offset the benefits of communication compression; (iii) the use of error feedback techniques to maintain model quality results in a high memory footprint. To solve these issues, we propose BIRD, a lightweight tensor-wise Bi-Random sampling strategy with an expectation invariance property. Specifically, BIRD applies a tensor-wise index sharing mechanism that reduces the index proportion by allowing multiple tensor elements to share a single index, thus improving the overall compression ratio. Additionally, BIRD replaces the time-consuming Top-K sorting with a faster Bi-Random sampling strategy based on the aforementioned index sharing mechanism, significantly reducing compression overheads. Moreover, BIRD builds an expectation invariance property into the Bi-Random sampling to ensure an approximately unbiased representation of the $L_1$-norm of the sampled tensors, effectively maintaining the model quality without incurring extra memory costs. We further optimize BIRD to BIRD+ by introducing uniform distribution-based sampling and Gamma correction into the tensor-wise sampling process, achieving a more flexible adjustment of the sparsity with better convergence performance. Experimental evaluations across multiple conventional distributed learning tasks demonstrate that, compared to state-of-the-art approaches, BIRD+ achieves higher communication compression ratios (up to 36.2x) and higher computation throughput (up to 149.6x) while maintaining the model quality without incurring extra memory costs. |
Keywords | Indexes ; Costs ; Computational modeling ; Distance learning ; Computer aided instruction ; Training ; Tensors ; Distributed learning ; communication compression ; random sampling ; neural network |
DOI | 10.1109/TPDS.2024.3447221 |
Indexed By | SCI |
Language | English |
Funding Project | Major Key Project of PCL[PCL2022A03] ; Shenzhen Science and Technology Program[RCYX20210609104510007] ; Shenzhen Science and Technology Program[KJZD20230923114610021] ; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies[2022B1212010005] ; Guangdong Basic and Applied Basic Research Foundation[2023A1515110072] ; National Natural Science Foundation of China[62472127] ; National Natural Science Foundation of China[62032023] ; National Natural Science Foundation of China[T2125013] ; Innovation Funding of ICT, CAS[E461050] |
WOS Research Area | Computer Science ; Engineering |
WOS Subject Category | Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS Accession Number | WOS:001320540600003 |
Publisher | IEEE COMPUTER SOC |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/39538 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Xia, Wen |
Author Affiliations | 1. Harbin Inst Technol, Guangdong Prov Key Lab Novel Secur Intelligence Te, Shenzhen 518055, Peoples R China ; 2. Peng Cheng Lab, Dept New Networks, Shenzhen 518055, Peoples R China ; 3. Indiana Univ, Bloomington, IN 47405 USA ; 4. Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China |
Recommended Citation, GB/T 7714 | Wu, Donglei,Yang, Weihao,Zou, Xiangyu,et al. BIRD plus : Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,2024,35(11):2193-2207. |
APA | Wu, Donglei.,Yang, Weihao.,Zou, Xiangyu.,Feng, Hao.,Tao, Dingwen.,...&Fang, Binxing.(2024).BIRD plus : Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms.IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,35(11),2193-2207. |
MLA | Wu, Donglei,et al."BIRD plus : Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms".IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 35.11(2024):2193-2207. |
Files in This Item | There are no files associated with this item. |
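
The abstract's two central ideas, several tensor elements sharing one index and random sampling that leaves the expected $L_1$-norm unchanged, can be illustrated with a generic block-wise random sparsifier. This is only a sketch under stated assumptions, not the BIRD or BIRD+ algorithm: the record does not give the Bi-Random sampling rule or the Gamma correction, so the fixed block size, the single keep probability, and the names `sparsify_blocks`, `densify`, and `l1` are all hypothetical.

```python
import random


def sparsify_blocks(values, block_size, keep_prob, rng):
    """Keep each block with probability keep_prob (assumed rule, not BIRD's).

    Kept blocks are rescaled by 1 / keep_prob so that every element, and
    therefore the L1-norm, is unchanged in expectation. Only one index is
    stored per kept block, so block_size elements share a single index.
    """
    kept = {}
    for start in range(0, len(values), block_size):
        if rng.random() < keep_prob:
            block = values[start:start + block_size]
            kept[start // block_size] = [v / keep_prob for v in block]
    return kept


def densify(kept, length, block_size):
    """Rebuild a dense list; blocks that were dropped become zeros."""
    out = [0.0] * length
    for block_id, block in kept.items():
        start = block_id * block_size
        out[start:start + len(block)] = block
    return out


def l1(values):
    """L1-norm: the sum of absolute values."""
    return sum(abs(v) for v in values)


if __name__ == "__main__":
    rng = random.Random(0)
    grad = [rng.uniform(-1.0, 1.0) for _ in range(32)]
    kept = sparsify_blocks(grad, block_size=8, keep_prob=0.5, rng=rng)
    restored = densify(kept, len(grad), block_size=8)
    print("indices sent:", len(kept), "values sent:", sum(map(len, kept.values())))
    print("L1 before:", l1(grad), "L1 after one draw:", l1(restored))
```

A single draw can overshoot or undershoot the original $L_1$-norm; only the average over many draws matches it. Because nothing is sorted and no residual is accumulated between steps, the sketch needs neither a Top-K selection nor an error feedback buffer, which is the trade-off the abstract describes.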