Institute of Computing Technology, Chinese Academy IR
Modality-Consistent Prompt Tuning With Optimal Transport | |
Ren, Hairui1; Tang, Fan2; Zheng, Huangjie3; Zhao, He4; Guo, Dandan1; Chang, Yi5,6 | |
2025-03-01 | |
Journal | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
ISSN | 1051-8215 |
Volume | 35, Issue: 3, Pages: 2499-2512 |
Abstract | Prompt tuning has been successfully used to leverage the knowledge of large-scale Vision-Language Pre-trained (VLP) models on downstream tasks. Most existing prompt tuning approaches learn prompts by maximizing pairwise similarity. Although samples from different modalities may be relatively well aligned pairwise, such alignment does not fully utilize the information between samples and can be less consistent at the modality level. In this paper, we propose a novel prompt tuning strategy that matches different modalities distributionally. Specifically, we minimize the distribution-wise distance between the image and text modalities using optimal transport (OT) theory. Simultaneously, we add a constraint on the learned transport plan during modality matching to enhance the learning of vision and text prompts. As a plug-and-play method that generates modality-consistent representations, our approach can be applied to improve existing uni-modal and multi-modal prompt learning methods. Experiments on eleven public datasets demonstrate that the proposed method achieves substantial improvements over both uni-modal and multi-modal prompt tuning methods. |
Keywords | Prompt tuning; modality-consistent; optimal transport; distribution matching |
DOI | 10.1109/TCSVT.2024.3489024 |
Indexed By | SCI |
Language | English |
Funding Project | NSFC[62306125] ; NSFC[2023YFF0905400] ; NSFC[U2341229] |
WOS Research Area | Engineering |
WOS Category | Engineering, Electrical & Electronic |
WOS Accession Number | WOS:001439628600039 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/40719 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Guo, Dandan; Chang, Yi |
Author Affiliations | 1.Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Jilin, Peoples R China 2.Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China 3.Univ Texas Austin, Dept Stat & Data Sci, Austin, TX 78712 USA 4.CSIROs Data61, Eveleigh, NSW 2015, Australia 5.Jilin Univ, Sch Artificial Intelligence, Int Ctr Future Sci, Changchun 130012, Jilin, Peoples R China 6.Minist Educ MOE, Engn Res Ctr Knowledge Driven Human Machine Intell, Changchun 130000, Peoples R China |
Recommended Citation GB/T 7714 | Ren, Hairui,Tang, Fan,Zheng, Huangjie,et al. Modality-Consistent Prompt Tuning With Optimal Transport[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,2025,35(3):2499-2512. |
APA | Ren, Hairui,Tang, Fan,Zheng, Huangjie,Zhao, He,Guo, Dandan,&Chang, Yi.(2025).Modality-Consistent Prompt Tuning With Optimal Transport.IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,35(3),2499-2512. |
MLA | Ren, Hairui,et al."Modality-Consistent Prompt Tuning With Optimal Transport".IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 35.3(2025):2499-2512. |
Files in This Item | There are no files associated with this item. |
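
The abstract describes minimizing a distribution-wise distance between image and text features with optimal transport. The sketch below is not the authors' implementation; it only illustrates the general idea with entropy-regularized OT solved by Sinkhorn iterations. The cosine cost, the uniform marginals, the regularization strength `eps`, the toy feature vectors, and the function names `cosine_cost` and `sinkhorn` are all assumptions made for illustration. The constraint on the transport plan mentioned in the abstract is not specified in this record and is omitted here.

```python
import math


def cosine_cost(xs, ys):
    """Cost matrix C[i][j] = 1 - cosine similarity between xs[i] and ys[j]."""
    def norm(v):
        return math.sqrt(sum(a * a for a in v))

    return [
        [1.0 - sum(a * b for a, b in zip(x, y)) / (norm(x) * norm(y)) for y in ys]
        for x in xs
    ]


def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT between two uniform distributions (assumed marginals).

    Returns the transport plan and the transport cost sum_ij plan[i][j] * cost[i][j].
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass over image samples (assumption)
    b = [1.0 / m] * m  # uniform mass over text samples (assumption)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    distance = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, distance


# Toy 2-D "features" standing in for image and text embeddings (hypothetical data).
image_feats = [[1.0, 0.0], [0.0, 1.0]]
text_feats = [[0.9, 0.1], [0.1, 0.9]]

cost = cosine_cost(image_feats, text_feats)
plan, distance = sinkhorn(cost)
print("transport plan:", plan)
print("OT distance:", distance)
```

In a prompt tuning setting, a quantity like `distance` would presumably be computed on batches of encoder outputs with a differentiable tensor library and minimized with respect to the prompt parameters; the pure-Python version here only shows the computation itself.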