Learning Representations for Facial Actions From Unlabeled Videos

doi:10.1109/TPAMI.2020.3011063

Title	Learning Representations for Facial Actions From Unlabeled Videos
Authors	Li, Yong 1,2; Zeng, Jiabei 1; Shan, Shiguang 1,2,3
Year	2022
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
ISSN	0162-8828
Volume	44, Issue 1, Pages 302-317
Abstract	Facial actions are usually encoded as anatomy-based action units (AUs), whose labelling demands expertise and is therefore time-consuming and expensive. To alleviate the labelling demand, we propose a twin-cycle autoencoder (TAE) that leverages the large number of unlabelled videos to learn discriminative representations for facial actions. TAE is inspired by the fact that facial actions are embedded in the pixel-wise displacements between two sequential face images (hereinafter, source and target) in a video. Learning the representations of facial actions can therefore be achieved by learning the representations of the displacements. However, the displacements induced by facial actions are entangled with those induced by head motions. TAE is thus trained to disentangle the two kinds of movements by evaluating the quality of the images synthesized when either the facial actions or the head pose is changed, with the aim of reconstructing the target image. Experiments on AU detection show that TAE achieves accuracy comparable to existing AU detection methods, including some supervised ones, validating the discriminative capacity of the learned representations. TAE's ability to decouple action-induced and pose-induced movements is also validated by visualizing the generated images and by analyzing facial image retrieval results qualitatively and quantitatively.
Keywords	Facial action unit detection; self-supervised learning; representation learning; feature disentanglement; encoder-decoder structure
DOI	10.1109/TPAMI.2020.3011063
Indexed in	SCI
Language	English
WOS Research Areas	Computer Science ; Engineering
WOS Categories	Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS Accession Number	WOS:000728561300022
Publisher	IEEE COMPUTER SOC
Citation statistics	Times cited: 39 [WOS]
Document type	Journal article
Item identifier	http://119.78.100.204/handle/2XEOYT63/18045
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English)
Corresponding author	Shan, Shiguang
Author affiliations	1. Chinese Acad Sci, Inst Comp Technol, CAS, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 3. CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714)	Li, Yong, Zeng, Jiabei, Shan, Shiguang. Learning Representations for Facial Actions From Unlabeled Videos[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44(1): 302-317.
APA	Li, Yong, Zeng, Jiabei, & Shan, Shiguang. (2022). Learning Representations for Facial Actions From Unlabeled Videos. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 44(1), 302-317.
MLA	Li, Yong, et al. "Learning Representations for Facial Actions From Unlabeled Videos". IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 44.1 (2022): 302-317.