DNN-based Emotion Recognition Based on Bottleneck Acoustic Features and Lexical Features

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

Abstract

In this paper, we propose a novel emotion recognition method that captures affect-salient information using acoustic and lexical features. The acoustic features are extracted from the speech signal by applying statistical functionals to high-level emotional features derived from a deep neural network (DNN). These acoustic features are fused early with two types of lexical features extracted from the text transcription of the speech signal: a distributed representation and affective lexicon-based dimensions. The fused features are fed to another DNN for utterance-level emotion classification. Experimental results on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) multimodal dataset reached 75.5% in unweighted accuracy recall, outperforming the best previously reported results in multimodal emotion recognition using acoustic and lexical features.
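
The feature pipeline described in the abstract (statistical functionals over frame-level DNN features, followed by early fusion with lexical features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which functionals are used, so mean, standard deviation, minimum, and maximum stand in for them, and the function names, feature dimensions, and random inputs are all hypothetical.

```python
import numpy as np


def statistical_functionals(frames: np.ndarray) -> np.ndarray:
    """Collapse frame-level features of shape (T, D) into one vector of shape (4 * D,).

    The choice of functionals (mean, std, min, max) is an assumption;
    the abstract does not list the ones actually used.
    """
    return np.concatenate([
        frames.mean(axis=0),
        frames.std(axis=0),
        frames.min(axis=0),
        frames.max(axis=0),
    ])


def early_fusion(acoustic: np.ndarray, *lexical: np.ndarray) -> np.ndarray:
    """Early fusion: concatenate all feature vectors before classification."""
    return np.concatenate([acoustic, *lexical])


# Hypothetical inputs standing in for real features of one utterance.
rng = np.random.default_rng(0)
bottleneck_frames = rng.normal(size=(120, 32))  # 120 frames, 32-dim DNN features
word_embedding = rng.normal(size=50)            # distributed text representation
lexicon_dims = rng.normal(size=3)               # affective lexicon-based dimensions

acoustic_vec = statistical_functionals(bottleneck_frames)
fused = early_fusion(acoustic_vec, word_embedding, lexicon_dims)
print(fused.shape)  # (181,) = 4 * 32 + 50 + 3
```

In this sketch, `fused` is the single utterance-level vector that would be passed to the second DNN for classification; the classifier itself is left out because the abstract gives no details of its architecture.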

Keywords

Multimodal emotion recognition, DNN-based emotion recognition, Acoustic feature, Lexical feature
