Institute of Computing Technology, Chinese Academy IR
| Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition | |
| Liu, Yuanyuan1; Wei, Lin1; Liu, Kejun1; Chen, Zijing2; Chen, Zhe3; Tang, Chang1; Chen, Jingying4; Shan, Shiguang5,6 | |
| 2025-10-01 | |
| Journal | IEEE TRANSACTIONS ON AFFECTIVE COMPUTING |
| ISSN | 1949-3045 |
| Volume | 16, Issue: 4, Pages: 3404-3420 |
| Abstract | Video-based facial expression recognition (VFER) is challenging due to variations caused by cultural background and expression camouflage. To tackle these problems, researchers introduced eye movement signals to complement visual information. However, existing methods either require expensive devices to capture high-quality eye movements or can only extract low-quality eye movements visually, making them ineffective in the real world. To address this, we propose an eye movement-instructed VFER (EM-VFER) that leverages high-quality eye movements to instruct the visual learning, obtaining robust performance without requiring costly devices during inference. Specifically, our EM-VFER operates in two stages: the high-quality eye movement pre-training stage and the eye movement-instructed video fine-tuning stage. In the pre-training, we compile an Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset and use it to train a multimodal Transformer. During the fine-tuning, we propose a novel progressive eye movement-instructed learning to take better advantage of the prior knowledge about high-quality eye movement signals from EMER. The instructed fine-tuning model could then make more robust predictions on downstream facial expression datasets. We evaluate our approach on three macro-expression datasets (DFEW, MAFW and Aff-wild2) and two micro-expression datasets (CASME III and CASME II). The results demonstrate that EM-VFER significantly outperforms existing methods. |
| Keywords | Videos; Face recognition; Visualization; Emotion recognition; Transformers; Training; Accuracy; Data mining; Gaze tracking; Computational modeling; Video-based facial expression recognition; eye movement signals; pre-training; fine-tuning; instructed learning |
| DOI | 10.1109/TAFFC.2025.3599859 |
| Indexed By | SCI |
| Language | English |
| WOS Research Area | Computer Science |
| WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Cybernetics |
| WOS ID | WOS:001626710800006 |
| Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
| Document Type | Journal article |
| Identifier | http://119.78.100.204/handle/2XEOYT63/42816 |
| Collection | Institute of Computing Technology, Chinese Academy of Sciences |
| Corresponding Author | Chen, Zhe; Chen, Jingying; Shan, Shiguang |
| Affiliation | 1. China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China; 2. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia; 3. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Australian Ctr Artificial Intelligence Med Innovat, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia; 4. Cent China Normal Univ, Natl Engn Res Ctr E Learning, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China; 5. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 6. Univ Chinese Acad Sci, Beijing 100049, Peoples R China |
| Recommended Citation (GB/T 7714) | Liu, Yuanyuan,Wei, Lin,Liu, Kejun,et al. Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition[J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING,2025,16(4):3404-3420. |
| APA | Liu, Yuanyuan.,Wei, Lin.,Liu, Kejun.,Chen, Zijing.,Chen, Zhe.,...&Shan, Shiguang.(2025).Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition.IEEE TRANSACTIONS ON AFFECTIVE COMPUTING,16(4),3404-3420. |
| MLA | Liu, Yuanyuan,et al."Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition".IEEE TRANSACTIONS ON AFFECTIVE COMPUTING 16.4(2025):3404-3420. |
| Files in This Item | There are no files associated with this item. |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.