Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection
Lin, Liwei1,2; Wang, Xiangdong1; Liu, Hong1; Qian, Yueliang1
2020
Journal: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
ISSN: 2329-9290
Volume: 28; Pages: 1466-1478
Abstract: In this article, a special decision surface for weakly-supervised sound event detection (SED) and a disentangled feature (DF) for the multi-label problem in polyphonic SED are proposed. We approach SED as a multiple instance learning (MIL) problem and use a neural network framework with a pooling module to solve it. General MIL approaches are of two kinds: instance-level approaches and embedding-level approaches. We present a method of generating instance-level probabilities for embedding-level approaches, which tend to perform better than instance-level approaches in bag-level classification but, in their current form, cannot provide instance-level probabilities. Moreover, we propose a specialized decision surface (SDS) for embedding-level attention pooling. We analyze and explain, from the perspective of the high-level feature space, why an embedding-level attention module with SDS is better than other typical pooling modules. To address the unbalanced dataset and the co-occurrence of multiple categories in the polyphonic event detection task, we propose a DF to reduce interference among categories, which optimizes the high-level feature space by disentangling it based on class-wise identifiable information and obtaining multiple different subspaces. Experiments on the DCASE 2018 Task 4 dataset show that the proposed SDS and DF significantly improve the detection performance of the embedding-level MIL approach with an attention pooling module and outperform the first-place system in the challenge by 6.6 percentage points.
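As a rough illustration of the distinction the abstract draws between instance-level and embedding-level MIL approaches, the following minimal Python sketch treats an audio clip as a bag of frame feature vectors. It is not the authors' implementation and does not implement the proposed SDS or DF. The function names, the single linear classifier (`w`, `b`), the linear attention vector `v`, and the toy values are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Vector) -> Vector:
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def instance_level_max(frames: List[Vector], w: Vector, b: float) -> float:
    """Instance-level MIL: classify every frame, then pool the frame
    probabilities (here with max pooling) into one clip-level probability."""
    return max(sigmoid(dot(x, w) + b) for x in frames)


def embedding_level_attention(
    frames: List[Vector], w: Vector, b: float, v: Vector
) -> Tuple[float, Vector]:
    """Embedding-level MIL: pool frame features into one clip embedding
    using attention weights, then classify the embedding once."""
    weights = softmax([dot(x, v) for x in frames])
    dim = len(frames[0])
    embedding = [
        sum(a * x[d] for a, x in zip(weights, frames)) for d in range(dim)
    ]
    return sigmoid(dot(embedding, w) + b), weights


# Toy clip (hypothetical values): three frames, only the middle one is active.
frames = [[0.0, 0.0], [2.0, 1.0], [0.0, 0.0]]
w, b, v = [1.0, 1.0], -1.0, [1.0, 0.0]

clip_prob_instance = instance_level_max(frames, w, b)
clip_prob_embedding, attention = embedding_level_attention(frames, w, b, v)
print(clip_prob_instance, clip_prob_embedding, attention)
```

In this sketch the embedding-level path returns only a clip-level probability plus attention weights. The weights indicate which frames contributed most, but they are not per-frame event probabilities, which is the gap the abstract says the paper addresses.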
Keywords: Sound event detection (SED); machine learning; weakly-supervised learning; attention pooling
DOI: 10.1109/TASLP.2020.2989575
Indexed in: SCI
Language: English
Funding: Beijing Natural Science Foundation [4172058]
WOS research area: Acoustics; Engineering
WOS category: Acoustics; Engineering, Electrical & Electronic
WOS accession number: WOS:000538078300003
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 14 [WOS]
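Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/15263
Collection: Institute of Computing Technology, Chinese Academy of Sciences, journal articles (English)
Corresponding author: Wang, Xiangdong
Affiliations: 1. Chinese Acad Sci, Beijing Key Lab Mobile Comp & Pervas Device, Inst Comp Technol, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100190, Peoples R China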
Recommended citation
GB/T 7714
Lin, Liwei,Wang, Xiangdong,Liu, Hong,et al. Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,2020,28:1466-1478.
APA Lin, Liwei,Wang, Xiangdong,Liu, Hong,&Qian, Yueliang.(2020).Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection.IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,28,1466-1478.
MLA Lin, Liwei,et al."Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection".IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 28(2020):1466-1478.