Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild

doi:10.1109/TPAMI.2025.3602663


	Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild
	Xiang, Xiang 1,2,3; Ma, Jing 3; Wu, Dongrui 3; Zeng, Zhigang 3; Chen, Xilin 4
	2025-12-01
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
ISSN	0162-8828
Volume	47 Issue: 12 Pages: 11929-11945
Abstract	Black-Box Knowledge Distillation (B2KD) is a conservative task in cloud-to-edge model compression, emphasizing the protection of data privacy and model copyrights on both the cloud and edge. With invisible data and models hosted on the server, B2KD aims to utilize only the API queries of the teacher model's inference results in the cloud to effectively distill a lightweight student model deployed on edge devices. B2KD faces challenges such as limited Internet exchange and edge-cloud disparity in data distribution. To address these issues, we theoretically provide a new optimization direction from logits to cell boundary, different from direct logits alignment, and formalize a workflow comprising deprivatization, distillation, and adaptation at test time. Guided by this, we propose a method, Mapping-Emulation KD (MEKD), to enhance the robust prediction and anti-interference capabilities of the student model on edge devices for any unknown data distribution in real-world scenarios. Our method does not differentiate between treating soft or hard responses and consists of: 1) deprivatization: emulating the inverse mapping of the teacher function with a generator, 2) distillation: aligning low-dimensional logits of the teacher and student models by reducing the distance of high-dimensional image points, and 3) adaptation: correcting the student's online prediction bias through a graph propagation-based only-forward test-time adaptation algorithm. Our method demonstrates inspiring performance for edge model distillation and adaptation across different teacher-student pairs. We validate the effectiveness of our method on multiple image recognition benchmarks and various Deep Neural Network models, achieving state-of-the-art performance and showcasing its practical value in remote sensing image recognition applications.
Keywords	Data models; Adaptation models; Cloud computing; Training; Predictive models; Image edge detection; Generators; Computational modeling; Servers; Model compression; Cloud-to-edge model compression; knowledge distillation; generative adversarial network; test-time adaptation
DOI	10.1109/TPAMI.2025.3602663
Indexed by	SCI
Language	English
WOS research area	Computer Science ; Engineering
WOS subject category	Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS accession number	WOS:001609560700017
Publisher	IEEE COMPUTER SOC
Document type	Journal article
Item identifier	http://119.78.100.204/handle/2XEOYT63/42910
Collection	Institute of Computing Technology, Chinese Academy of Sciences
Corresponding author	Xiang, Xiang
Affiliations	1.Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China 2.Peng Cheng Lab, Shenzhen 51800, Peoples R China 3.Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China 4.Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Recommended citation, GB/T 7714	Xiang, Xiang,Ma, Jing,Wu, Dongrui,et al. Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2025,47(12):11929-11945.
APA	Xiang, Xiang,Ma, Jing,Wu, Dongrui,Zeng, Zhigang,&Chen, Xilin.(2025).Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,47(12),11929-11945.
MLA	Xiang, Xiang,et al."Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 47.12(2025):11929-11945.