CSpace
Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech
Pan, Yuchen1; Shang, Yuanyuan1,4; Wang, Wei1,2; Shao, Zhuhong1,5; Han, Zhuojin1; Liu, Tie1,5; Guo, Guodong3; Ding, Hui1,5
2024-03-01
Journal: BIOMEDICAL SIGNAL PROCESSING AND CONTROL
ISSN: 1746-8094
Volume: 89; Pages: 15
Abstract: Depression can induce a range of physiological effects, leading to notable distinctions in the acoustic characteristics exhibited by individuals with depression as opposed to those without. Designing efficient algorithms to accurately identify depression through speech poses a formidable challenge. In this paper, we propose the Multi-Feature Deep Supervised Voiceprint Adversarial Network (MFDS-VAN) for audio-based depression recognition. The MFDS-VAN takes extracted acoustic features and the audio waveform as input and generates a prediction of the depression score. To attain more robust and discriminative spatial-temporal features associated with depression, the Encoding Network module merges long-term and short-term acoustic features with the unprocessed audio waveform, while the Regression Network module predicts the depression score. The Deep Supervised Regression algorithm is designed by combining GE2E clustering and Huber regression for better network optimization. Furthermore, to enhance the representation of the MFDS-VAN while diminishing the influence of individual voiceprint information, we propose the Voiceprint Adversarial Network. Experiments conducted on the AVEC 2013, AVEC 2014, and AVEC 2017 datasets demonstrate that the MFDS-VAN significantly enhances robustness and performance in speech-based depression recognition. Our model achieves competitive results when compared to recent audio-based methodologies.
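The abstract mentions Huber regression as one component of the Deep Supervised Regression algorithm. As a rough illustration only, the sketch below computes the standard Huber loss between true and predicted scores. It is not the authors' implementation: the function name `huber_loss`, the default `delta=1.0`, and the example scores are assumptions made here, and the GE2E clustering term is not shown.

```python
def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss over paired sequences of scores.

    Quadratic for residuals up to `delta`, linear beyond it, so large
    errors are penalised less harshly than with squared error.
    `delta` is a hypothetical default, not a value from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)  # absolute residual
        if r <= delta:
            total += 0.5 * r * r                # quadratic region
        else:
            total += delta * (r - 0.5 * delta)  # linear region
    return total / len(y_true)


if __name__ == "__main__":
    # Illustrative scores only; not data from the paper.
    print(huber_loss([10.0, 20.0], [10.5, 23.0]))
```

The two branches meet at a residual of `delta`, which keeps the loss continuous and makes it less sensitive to outlying scores than plain squared error.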
Keywords: Adversarial learning; Audio processing; Attention mechanism; Deep neural network; Depression recognition; Feature enhancement
DOI: 10.1016/j.bspc.2023.105704
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China [61876112]; National Natural Science Foundation of China [61601311]; Natural Science Foundation of Beijing, China [L201022]
WOS research area: Engineering
WOS category: Engineering, Biomedical
WOS accession number: WOS:001116988300001
Publisher: ELSEVIER SCI LTD
Citation statistics
Times cited: 1 [WOS]
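Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/38490
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding authors: Shang, Yuanyuan; Wang, Wei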
Author affiliations:
1. Capital Normal Univ, Coll Informat Engn, Beijing 100048, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100000, Peoples R China
3. West Virginia Univ, Lane Dept Comp Sci & Elect Engn, Morgantown, WV 26506 USA
4. Beijing Adv Innovat Ctr Imaging Technol, Beijing 100048, Peoples R China
5. Beijing Key Lab Elect Syst Reliabil Technol, Beijing 100048, Peoples R China
Recommended citation
GB/T 7714
Pan, Yuchen,Shang, Yuanyuan,Wang, Wei,et al. Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech[J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL,2024,89:15.
APA Pan, Yuchen.,Shang, Yuanyuan.,Wang, Wei.,Shao, Zhuhong.,Han, Zhuojin.,...&Ding, Hui.(2024).Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech.BIOMEDICAL SIGNAL PROCESSING AND CONTROL,89,15.
MLA Pan, Yuchen,et al."Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech".BIOMEDICAL SIGNAL PROCESSING AND CONTROL 89(2024):15.