CSpace

浏览/检索结果: 共1条,第1-1条 帮助

已选(0)清除 条数/页:   排序方式:
Bridging Text and Video: A Universal Multimodal Transformer for Audio-Visual Scene-Aware Dialog 期刊论文
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 卷号: 29, 页码: 2476-2483
作者:  Li, Zekang;  Li, Zongjia;  Zhang, Jinchao;  Feng, Yang;  Zhou, Jie
收藏  |  浏览/下载:51/0  |  提交时间:2021/12/01
Task analysis  Feature extraction  Visualization  Speech processing  History  Social networking (online)  Pattern recognition  Dialogue System  Multimodal  Natural Language Processing  Video Understanding