CSpace

Browse/Search Results: 2 items in total, showing items 1-2

Bridging Text and Video: A Universal Multimodal Transformer for Audio-Visual Scene-Aware Dialog (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 2476-2483
Authors:  Li, Zekang;  Li, Zongjia;  Zhang, Jinchao;  Feng, Yang;  Zhou, Jie
Views/Downloads: 40/0  |  Submitted: 2021/12/01
Task analysis  Feature extraction  Visualization  Speech processing  History  Social networking (online)  Pattern recognition  Dialogue System  Multimodal  Natural Language Processing  Video Understanding  
Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, Volume: 28, Pages: 1466-1478
Authors:  Lin, Liwei;  Wang, Xiangdong;  Liu, Hong;  Qian, Yueliang
Views/Downloads: 41/0  |  Submitted: 2020/12/10
Sound event detection (SED)  machine learning  weakly-supervised learning  attention pooling