DCHF_T: A multi-dimensional adaptive compression approach for transformer-based models
Yan, Yaoyao1; Wang, Da2,4; Ye, Jing2,4,5; Yu, Hui3; Lu, Dianjie1; Zhang, Yuang1; Xu, Weizhi1,4; Liu, Fang'ai1
2025-12-01
Journal: NEUROCOMPUTING
ISSN: 0925-2312
Volume: 656; Pages: 12
Abstract: In recent years, pre-trained language models based on the Transformer architecture have achieved significant results in many natural language processing tasks. However, their high computational cost limits their application in real-world scenarios. Previous Transformer compression methods typically focus on single-dimensional compression, which may cause over-compression and consequently degrade model performance. Additionally, these methods lack targeted optimization for specific downstream tasks. In this paper, we propose DCHF_T, a multi-dimensional adaptive compression approach that compresses Transformer models through token compression, attention head pruning, and a lightweight FFN. This approach selects the most informative tokens during training, prunes unimportant tokens, and retains their information in a compressed form, allowing the model to focus more on task-relevant inputs. Furthermore, DCHF_T combines attention head pruning and a lightweight FFN to reduce computation and parameter size across multiple dimensions. We employ multi-objective evolutionary search to optimize the trade-off between accuracy and efficiency under various computational budgets. Experimental results on the GLUE benchmark demonstrate that DCHF_T achieves the best compression-performance trade-off. While maintaining the highest accuracy, DCHF_T achieves 3.7x and 3.6x reductions in FLOPs on BERT-base and RoBERTa-base, respectively. By implementing adaptive multi-dimensional compression, DCHF_T provides a systematic solution for deploying Transformer models in resource-constrained scenarios.
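The record carries only the abstract, but the token-compression step it describes (keep the most informative tokens, prune the rest, and retain the pruned tokens' information in compressed form) is concrete enough to sketch. The PyTorch snippet below is a hypothetical illustration of that idea, not the authors' implementation: the attention-based importance score, the keep ratio, and the single fused summary token are all assumptions.

```python
import torch

def compress_tokens(hidden: torch.Tensor, attn: torch.Tensor,
                    keep_ratio: float = 0.5) -> torch.Tensor:
    """hidden: (batch, seq, dim) token states; attn: (batch, heads, seq, seq) attention weights."""
    b, s, d = hidden.shape
    k = max(1, int(s * keep_ratio))
    # Assumed importance score: attention each token *receives*, averaged
    # over heads and query positions (a common pruning heuristic).
    scores = attn.mean(dim=1).mean(dim=1)                      # (batch, seq)
    keep_idx = scores.topk(k, dim=-1).indices                  # (batch, k)
    kept = hidden.gather(1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    # Retain the pruned tokens' information as one score-weighted summary
    # token instead of discarding them outright.
    pruned = torch.ones(b, s, dtype=torch.bool, device=hidden.device)
    pruned.scatter_(1, keep_idx, False)
    w = (scores * pruned).unsqueeze(-1)                        # zero weight on kept tokens
    fused = (hidden * w).sum(1, keepdim=True) / w.sum(1, keepdim=True).clamp_min(1e-6)
    return torch.cat([kept, fused], dim=1)                     # (batch, k + 1, dim)

# Example: 128 tokens compressed to 32 kept + 1 fused summary token.
x = torch.randn(2, 128, 768)
a = torch.softmax(torch.randn(2, 12, 128, 128), dim=-1)
y = compress_tokens(x, a, keep_ratio=0.25)                     # (2, 33, 768)
```

Running subsequent layers on k + 1 tokens instead of s is where the FLOPs saving would come from; the head pruning and lightweight FFN described in the abstract would shrink each of those layers further.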
Keywords: Transformer; Dynamic token compression; Pruning; Multi-dimensional adaptive compression
DOI: 10.1016/j.neucom.2025.131071
Indexed By: SCI
Language: English
Funding: Natural Science Foundation of Shandong Province [ZR2022MF328, ZR2025MS1025, ZR2024MF073, ZR2019LZH014]; National Natural Science Foundation of China [92473203, 61602284, 61602285]; State Key Lab of Processors Open Fund Project [CLQ202409, CLQ202402]; CCF-Ricore Education Fund [CCF-Ricore OF 2024003]
WOS Research Area: Computer Science
WOS Category: Computer Science, Artificial Intelligence
WOS Accession Number: WOS:001584006500005
Publisher: ELSEVIER
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/41649
Collection: Journal Papers of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding Author: Xu, Weizhi
Affiliations:
1. Shandong Normal Univ, Sch Informat Sci & Engn, Jinan, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
3. Shandong Normal Univ, Business Sch, Jinan, Peoples R China
4. State Key Lab Processors, Beijing, Peoples R China
5. CASTEST Co Ltd, Beijing, Peoples R China
Recommended Citation:
GB/T 7714: Yan, Yaoyao, Wang, Da, Ye, Jing, et al. DCHF_T: A multi-dimensional adaptive compression approach for transformer-based models[J]. NEUROCOMPUTING, 2025, 656: 12.
APA: Yan, Yaoyao, Wang, Da, Ye, Jing, Yu, Hui, Lu, Dianjie, ... & Liu, Fang'ai. (2025). DCHF_T: A multi-dimensional adaptive compression approach for transformer-based models. NEUROCOMPUTING, 656, 12.
MLA: Yan, Yaoyao, et al. "DCHF_T: A multi-dimensional adaptive compression approach for transformer-based models". NEUROCOMPUTING 656 (2025): 12.
Files in This Item:
There are no files associated with this item.