A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition | |
Liu, Yang [2]; Xia, Yuqi [2]; Sun, Haoqin [2]; Meng, Xiaolei [2]; Bai, Jianxiong [2]; Guan, Wenbo [2]; Zhao, Zhen [2]; Li, Yongwei [1] | |
Journal | IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES |
2023-06-01 | |
Volume | E106A ; Issue: 6 ; Pages: 876-885 |
Keywords | speech emotion recognition ; non-personalized features ; cascaded attention network ; multitask learning ; self-adaption loss |
ISSN | 0916-8508 |
DOI | 10.1587/transfun.2022EAP1091 |
Corresponding author | Zhao, Zhen (zzqust@126.com) |
Abstract | Speech emotion recognition (SER) has long been a difficult task because of the complexity of emotion. In this paper, we propose a multitask deep learning approach for SER based on a cascaded attention network and a self-adaption loss. First, non-personalized features are extracted to represent the process of emotion change while reducing the influence of external variables. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, in which spatial-temporal attention locates the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence of differences in gender and in human perception of external information is alleviated by a multitask learning strategy, in which a self-adaption loss is introduced to determine the weights of the different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains absolute improvements of 1.97% and 0.91% over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively. |
Funding projects | National Natural Science Foundation of China (NSFC)[62201314] ; National Natural Science Foundation of China (NSFC)[62201571] ; Natural Science Foundation of Shandong Province[ZR2020QF007] ; Key Technology Tackling and Industrialization Demonstration projects of Qingdao[23-1-2-qdjh-18-gx] |
WOS keywords | NEURAL-NETWORK |
WOS research areas | Computer Science ; Engineering |
Language | English |
Publisher | IEICE-INST ELECTRONICS INFORMATION COMMUNICATION ENGINEERS |
WOS accession number | WOS:001018846400001 |
Funding organizations | National Natural Science Foundation of China (NSFC) ; Natural Science Foundation of Shandong Province ; Key Technology Tackling and Industrialization Demonstration projects of Qingdao |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/53585] |
Collection | National Laboratory of Pattern Recognition_Intelligent Interaction |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100089, Peoples R China ; 2. Qingdao Univ Sci & Technol, Sch Informat Sci & Technol, Qingdao 266061, Peoples R China |
Recommended citation (GB/T 7714) | Liu, Yang,Xia, Yuqi,Sun, Haoqin,et al. A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition[J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES,2023,E106A(6):876-885. |
APA | Liu, Yang.,Xia, Yuqi.,Sun, Haoqin.,Meng, Xiaolei.,Bai, Jianxiong.,...&LI, Yongwei.(2023).A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition.IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES,E106A(6),876-885. |
MLA | Liu, Yang,et al."A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition".IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E106A.6(2023):876-885. |