CSpace
OTRec: Cross-Modal Learning for Multimodal Recommendation via Optimal Transport
Cao, Zongsheng (1,2,3); Xu, Qianqian (4); Yang, Zhiyong (5); He, Yuan (6); Cao, Xiaochun (7); Huang, Qingming (4,8,9,10)
2025
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Volume: 27; Pages: 8603-8617
Abstract: In recent years, there has been a growing interest in multimodal recommendation systems due to the rapid growth of multimedia and the explosion of information. Despite notable advancements, current models often fuse multimodal embeddings with ID (name or concept) embeddings in a weighted or concatenated manner for items. In doing so, they may overlook the heterogeneity between different modalities and lack theoretical guarantees, potentially leading to suboptimal item representations. To overcome this challenge, we introduce a novel model named OTRec, which employs optimal transport (OT) to align heterogeneous multimodal embeddings with ID embeddings. Specifically, OTRec captures co-occurrence features across modalities and distinctive features within modalities, enabling the formation of a unified representation from both modal-invariant and modal-specific perspectives. This dual strategy ensures a comprehensive alignment of heterogeneous multimodal data, significantly improving the accuracy of capturing user preferences. Additionally, traditional recommendation models typically match an item's ID with its own multimodal data as positive samples for contrastive learning, neglecting the potentially complementary information in other items' multimodal data. To address this issue, we introduce a semantic-enhanced contrastive learning module, which learns latent semantic correlations across items through a semantic-similarity weighting matrix. It can be integrated as a plug-in for other models to effectively explore latent semantics. On top of this, we provide theoretical guarantees that demonstrate the effectiveness of OTRec in aligning multimodal and ID information and in enhancing the mutual information between them. Extensive evaluations on three public datasets demonstrate OTRec's effectiveness and show that it achieves state-of-the-art performance.
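The abstract states that OTRec aligns multimodal embeddings with ID embeddings via optimal transport but does not spell out the solver or the cost function. The following is only a minimal sketch of the general idea, assuming entropy-regularised OT solved with Sinkhorn iterations, uniform marginals, and a cosine-distance cost. The function names `cosine_cost` and `sinkhorn_plan`, the regularisation strength `eps`, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_cost(x, y):
    """Pairwise cosine distance (1 - cosine similarity) between rows of x and y."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularised OT plan between two uniform distributions.

    Assumption: uniform mass on both sides; eps and n_iters are illustrative.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # mass on the modality embeddings
    b = np.full(m, 1.0 / m)      # mass on the ID embeddings
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows towards marginal a
        v = b / (K.T @ u)        # rescale columns towards marginal b
    return u[:, None] * K * v[None, :]

# Toy data (hypothetical): 5 items, 8-dimensional embeddings.
rng = np.random.default_rng(0)
id_emb = rng.normal(size=(5, 8))
modal_emb = id_emb + 0.1 * rng.normal(size=(5, 8))  # noisy copy of the ID space

cost = cosine_cost(modal_emb, id_emb)
plan = sinkhorn_plan(cost)

# Transport cost under the plan; in a model this could serve as an alignment loss.
ot_loss = float((plan * cost).sum())
print(plan.shape, ot_loss)
```

In a trainable model, a quantity like `ot_loss` would typically be computed with a differentiable framework and added to the recommendation objective; how OTRec combines it with the modal-invariant and modal-specific terms is described in the paper itself, not here.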
Keywords: Semantics; Recommender systems; Contrastive learning; Electronic mail; Lattices; Data models; Data mining; Artificial intelligence; Accuracy; Visualization; Multimodal recommendation; optimal transport; modal-invariant; modal-specific
DOI: 10.1109/TMM.2025.3607735
Indexed by: SCI
Language: English
WOS Research Area: Computer Science; Telecommunications
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications
WOS Accession Number: WOS:001615548800007
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
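Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/43060
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding Author: Xu, Qianqian; Huang, Qingming
Affiliations:
1.Chinese Acad Sci, Inst Informat Engn, State Key Lab Informat Secur SKLOIS, Beijing 100093, Peoples R China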
2.Univ Chinese Acad Sci, Sch Cyber Secur, Beijing 100080, Peoples R China
3.Shanghai AI Lab, Shanghai 200232, Peoples R China
4.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
5.Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
6.Alibaba Grp, Secur Dept, Hangzhou 311121, Peoples R China
7.Sun Yat sen Univ, Sch Cyber Sci & Technol, Shenzhen Campus, Shenzhen 518107, Peoples R China
8.Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
9.Univ Chinese Acad Sci, Key Lab Big Data Min & Knowledge Management BDKM, Beijing 101408, Peoples R China
10.Peng Cheng Lab, Shenzhen 518055, Peoples R China
Recommended Citation
GB/T 7714
Cao, Zongsheng,Xu, Qianqian,Yang, Zhiyong,et al. OTRec: Cross-Modal Learning for Multimodal Recommendation via Optimal Transport[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2025,27:8603-8617.
APA Cao, Zongsheng,Xu, Qianqian,Yang, Zhiyong,He, Yuan,Cao, Xiaochun,&Huang, Qingming.(2025).OTRec: Cross-Modal Learning for Multimodal Recommendation via Optimal Transport.IEEE TRANSACTIONS ON MULTIMEDIA,27,8603-8617.
MLA Cao, Zongsheng,et al."OTRec: Cross-Modal Learning for Multimodal Recommendation via Optimal Transport".IEEE TRANSACTIONS ON MULTIMEDIA 27(2025):8603-8617.