A Joint Network Based on Interactive Attention for Speech Emotion Recognition

Ying Hu, Shijing Hou, Huamin Yang, Hao Huang, Liang He

2023 IEEE International Conference on Multimedia and Expo (ICME 2023)

Abstract
Speech emotion recognition (SER) plays a vital role in human-machine interaction. In this paper, we propose a standalone spectrum-based SER model and a joint network that combines a pre-trained model with the spectrum-based model. In the joint network, we design an interactive attention module to effectively fuse the intermediate features from the two models. The standalone spectrum-based model outperforms four compared spectrum-based methods under the speaker-dependent setting. To reflect application in real scenarios, we compare the joint network with six methods that use a pre-trained model under the speaker-independent setting. Experimental results show that the joint network achieves the best performance among four unimodal models, with an unweighted accuracy (UA) of 73.32% and a weighted accuracy (WA) of 72.48%.
Keywords
Speech emotion recognition, joint network, interactive attention, speaker-independent
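
The abstract reports UA and WA without defining them. As a minimal sketch, assuming the convention common in SER work (WA is overall accuracy across all utterances; UA is the mean of per-class recalls, so every emotion class counts equally regardless of size), the two metrics could be computed as below. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def weighted_accuracy(y_true, y_pred):
    # Assumed definition: fraction of all utterances classified correctly.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    # Assumed definition: recall computed per class, then averaged
    # with equal weight for each class.
    total = defaultdict(int)
    hit = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hit[t] += int(t == p)
    return sum(hit[c] / total[c] for c in total) / len(total)

# Hypothetical toy labels, for illustration only.
y_true = ["angry", "angry", "angry", "angry", "happy", "happy", "sad", "neutral"]
y_pred = ["angry", "angry", "angry", "happy", "happy", "sad", "sad", "neutral"]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"WA = {wa:.4f}, UA = {ua:.4f}")
```

With imbalanced classes the two values diverge, which is why SER results usually report both.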