A Pyramid Semi-Autoregressive Transformer with Rich Semantics for Sign Language Production
Cui, Zhenchao [1]; Chen, Ziang [1]; Li, Zhaoxin [2]; Wang, Zhaoqi [2]
2022-12-01
Journal: SENSORS
Volume: 22; Issue: 24; Pages: 15
Abstract: As a typical sequence-to-sequence task, sign language production (SLP) aims to automatically translate spoken language sentences into the corresponding sign language sequences. Existing SLP methods fall into two categories: autoregressive and non-autoregressive. Autoregressive methods suffer from high latency and error accumulation caused by the long-term dependence of the current output on the previous poses, while non-autoregressive methods suffer from repetition and omission during parallel decoding. To remedy these issues, we propose a novel method named Pyramid Semi-Autoregressive Transformer with Rich Semantics (PSAT-RS). In PSAT-RS, we first introduce a pyramid semi-autoregressive mechanism that divides the target sequence into groups in a coarse-to-fine manner, keeping the autoregressive property globally (across groups) while generating the target frames within each group locally. Meanwhile, a relaxed masked attention mechanism lets the decoder attend not only to the pose sequences in the previous groups but also to the current group. Finally, considering the importance of spatial-temporal information, we design a Rich Semantics (RS) embedding module that encodes sequential information along both the time dimension and the spatial displacement into the same high-dimensional space. This significantly improves the coordination of joint motion, making the generated sign language videos more natural. Experiments on the RWTH-PHOENIX-Weather-2014T and CSL datasets show that the proposed PSAT-RS is competitive with state-of-the-art autoregressive and non-autoregressive SLP models, achieving a better trade-off between speed and accuracy.
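The relaxed masked attention described in the abstract can be illustrated with a small sketch. The abstract gives no implementation details, so everything here is an assumption: the function names `relaxed_group_mask` and `strict_causal_mask` are hypothetical, the group size is fixed rather than following the paper's coarse-to-fine pyramid schedule, and the mask is a plain boolean matrix rather than the authors' actual code.

```python
import numpy as np


def relaxed_group_mask(seq_len: int, group_size: int) -> np.ndarray:
    """Return a boolean mask M where M[i, j] is True if frame i may attend to frame j.

    Assumption: frames are split into consecutive groups of `group_size`.
    A frame may attend to every frame in earlier groups and in its own group.
    """
    if seq_len <= 0 or group_size <= 0:
        raise ValueError("seq_len and group_size must be positive")
    # Group index of each frame, e.g. [0, 0, 1, 1, 2, 2] for group_size=2.
    groups = np.arange(seq_len) // group_size
    # Allow attention when the key's group is not later than the query's group.
    return groups[None, :] <= groups[:, None]


def strict_causal_mask(seq_len: int) -> np.ndarray:
    """Standard autoregressive mask: frame i attends to frames 0..i only."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))


mask = relaxed_group_mask(seq_len=6, group_size=2)
print(mask.astype(int))
```

Under these assumptions, a group size of 1 reduces to the standard causal mask of a fully autoregressive decoder, and a group size equal to the sequence length lets every frame attend to every other frame, as in a non-autoregressive decoder.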
Keywords: human pose generation; sign language production; semi-autoregressive transformer; deep learning
DOI: 10.3390/s22249606
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China ; Post-graduate's Innovation Fund Project of Hebei University ; National Natural Science Foundation of China ; Scientific Research Foundation for Talented Scholars of Hebei University ; Scientific Research Foundation of Colleges and Universities in Hebei Province ; [2020YFC1523302] ; [HBU2022ss014] ; [62172392] ; [521100221081] ; [QN2022107]
WOS Research Areas: Chemistry ; Engineering ; Instruments & Instrumentation
WOS Categories: Chemistry, Analytical ; Engineering, Electrical & Electronic ; Instruments & Instrumentation
WOS Accession Number: WOS:000902932900001
Publisher: MDPI
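Document Type: Journal article
Item Identifier: http://119.78.100.204/handle/2XEOYT63/20180
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles
Corresponding Author: Li, Zhaoxin
Affiliations: 1. Hebei Univ, Hebei Machine Vis Engn Res Ctr, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Recommended Citation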
GB/T 7714
Cui, Zhenchao, Chen, Ziang, Li, Zhaoxin, et al. A Pyramid Semi-Autoregressive Transformer with Rich Semantics for Sign Language Production[J]. SENSORS, 2022, 22(24): 15.
APA Cui, Zhenchao, Chen, Ziang, Li, Zhaoxin, & Wang, Zhaoqi. (2022). A Pyramid Semi-Autoregressive Transformer with Rich Semantics for Sign Language Production. SENSORS, 22(24), 15.
MLA Cui, Zhenchao, et al. "A Pyramid Semi-Autoregressive Transformer with Rich Semantics for Sign Language Production". SENSORS 22.24 (2022): 15.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.