Institute of Computing Technology, Chinese Academy of Sciences IR
FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space
Li, Xiaqing (1,2); Guo, Qi (1,2); Zhang, Guangyan (1,2); Ye, Siwei (1,2); He, Guanhua (1,2); Yao, Yiheng (1,2); Zhang, Rui (1,2); Hao, Yifan (1,2); Du, Zidong (1,2); Zheng, Weimin (1,2)
2024-07-01
Journal | IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
ISSN | 1045-9219
Volume | 35
Issue | 7
Pages | 1174-1188
Abstract | Hyper-parameter tuning (HPT) for deep learning (DL) models is prohibitively expensive. Sequential model-based optimization (SMBO) has emerged as the state-of-the-art (SOTA) approach for automatically optimizing HPT performance thanks to its heuristic advantages. Unfortunately, existing SMBO-based approaches focus on algorithmic optimization rather than on a large-scale parallel HPT system, and thus cannot effectively remove SMBO's strong sequential nature, which causes two performance problems: (1) extremely low tuning speed and (2) sub-optimal model quality. In this paper, we propose FastTuning, a fast, scalable, and generic system that accelerates SMBO-based HPT for large DL/ML models in parallel. The key idea is to partition the highly complex search space into multiple smaller sub-spaces, each of which is assigned to and optimized by a different tuning worker in parallel. However, determining the right level of resource allocation to strike a balance between quality and cost remains a challenge. To address this, we further propose NIMBLE, a dynamic scheduling strategy designed specifically for FastTuning, comprising (1) a Dynamic Elimination Algorithm, (2) Sub-space Re-division, and (3) Posterior Information Sharing. Finally, we incorporate 6 SOTAs (i.e., 3 tuning algorithms and 3 parallel tuning tools) into FastTuning. Experimental results on ResNet18, VGG19, ResNet50, and ResNet152 show that FastTuning consistently offers much faster tuning (up to 80x speedup) with better accuracy (up to 4.7% improvement), thereby enabling the application of automatic HPT to real-life DL models.
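The partition-and-parallelize idea described in the abstract can be illustrated with a minimal sketch. The Python code below is a hypothetical illustration, not the paper's implementation: it splits a single hyper-parameter range into sub-spaces and searches each in a separate worker process. Names such as partition_space and tune_worker are invented for this example, and plain random search stands in for the SMBO optimizer that each FastTuning worker would actually run.

```python
# Minimal, hypothetical sketch of the partition-and-parallelize idea:
# split one hyper-parameter range into sub-spaces and tune each in its own
# worker process. This is NOT the paper's implementation; plain random search
# stands in for the per-worker SMBO optimizer, and all names are invented.
import random
from concurrent.futures import ProcessPoolExecutor

def partition_space(lo, hi, n_parts):
    """Split the range [lo, hi] into n_parts equal sub-ranges (sub-spaces)."""
    step = (hi - lo) / n_parts
    return [(lo + i * step, lo + (i + 1) * step) for i in range(n_parts)]

def tune_worker(sub_range, trials=20):
    """Search one sub-space; a real tuning worker would run SMBO here instead."""
    best_lr, best_score = None, float("-inf")
    for _ in range(trials):
        lr = random.uniform(*sub_range)
        # Stand-in objective: pretend validation accuracy peaks at lr = 0.01.
        score = -(lr - 0.01) ** 2
        if score > best_score:
            best_lr, best_score = lr, score
    return best_lr, best_score

if __name__ == "__main__":
    sub_spaces = partition_space(1e-4, 1e-1, n_parts=4)
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(tune_worker, sub_spaces))
    best_lr, best_score = max(results, key=lambda r: r[1])
    print(f"best lr found: {best_lr:.5f}")
```

In the paper's full system, NIMBLE's dynamic elimination, sub-space re-division, and posterior information sharing would sit on top of this parallel loop, pruning unpromising sub-spaces and reallocating their workers instead of running every sub-space to completion.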
Keywords | Deep learning; distributed hyper-parameter tuning (HPT) system; parallel computing
DOI | 10.1109/TPDS.2024.3386939
Indexed By | SCI
Language | English
Funding Project | National Key R&D Program of China
WOS Research Area | Computer Science; Engineering
WOS Subject | Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS Record No. | WOS:001224174400001
Publisher | IEEE COMPUTER SOC
Document Type | Journal article
Identifier | http://119.78.100.204/handle/2XEOYT63/40077
Collection | Institute of Computing Technology, CAS: Journal Articles (English)
Corresponding Author | Guo, Qi
Affiliations | 1. Chinese Acad Sci, Inst Comp Technol, Beijing 100045, Peoples R China; 2. Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Recommended Citation (GB/T 7714) | Li, Xiaqing, Guo, Qi, Zhang, Guangyan, et al. FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35(7): 1174-1188.
APA | Li, Xiaqing, Guo, Qi, Zhang, Guangyan, Ye, Siwei, He, Guanhua, ... & Zheng, Weimin. (2024). FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 35(7), 1174-1188.
MLA | Li, Xiaqing, et al. "FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space." IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 35.7 (2024): 1174-1188.
Files in This Item | No files are associated with this item.