Speech emotion recognition based on multi‐feature and multi‐lingual fusion

Chunyi Wang, Ying Ren, Na Zhang, Fuwei Cui, Shiying Luo

Multimedia Tools and Applications (2021)

Abstract

A speech emotion recognition algorithm based on multi-feature and multi-lingual fusion is proposed to address the low recognition accuracy caused by the lack of large speech datasets and the low robustness of acoustic features in speech emotion recognition. First, handcrafted and deep automatic features are extracted from existing Chinese and English emotional speech data. Then, the features are fused separately for each language. Finally, the fused features of the different languages are fused again and used to train a classification model. Comparing the fused features with the unfused ones, the results show that the fused features significantly improve the accuracy of the speech emotion recognition algorithm. The proposed solution is evaluated on two Chinese corpora and two English corpora and is shown to provide more accurate predictions than the original solution. The study indicates that the multi-feature and multi-lingual fusion algorithm can significantly improve speech emotion recognition accuracy when the dataset is small.
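The two fusion stages described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the features are fused, so this sketch assumes feature-level fusion by concatenation within each language, followed by pooling the fused samples of both languages into one training set. The function names (`zscore_columns`, `fuse_features`, `fuse_languages`) and the toy numbers are hypothetical and are not taken from the paper. Feature extraction and the classifier are omitted.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse_features(handcrafted, deep):
    """Assumed multi-feature fusion: concatenate both vectors per utterance."""
    h, d = zscore_columns(handcrafted), zscore_columns(deep)
    return [hv + dv for hv, dv in zip(h, d)]


def fuse_languages(*datasets):
    """Assumed multi-lingual fusion: pool (features, labels) pairs."""
    features, labels = [], []
    for x, y in datasets:
        features.extend(x)
        labels.extend(y)
    return features, labels


# Toy placeholder values standing in for real extracted features.
zh_hand = [[0.1, 1.0], [0.3, 2.0], [0.5, 3.0]]  # handcrafted, Chinese
zh_deep = [[5.0], [7.0], [9.0]]                 # deep, Chinese
zh_labels = ["happy", "sad", "angry"]
en_hand = [[0.2, 4.0], [0.4, 6.0]]              # handcrafted, English
en_deep = [[1.0], [3.0]]                        # deep, English
en_labels = ["neutral", "happy"]

X, y = fuse_languages(
    (fuse_features(zh_hand, zh_deep), zh_labels),
    (fuse_features(en_hand, en_deep), en_labels),
)
print(len(X), len(X[0]), len(y))  # prints: 5 3 5
```

Pooling in this way requires the same feature dimensionality for every language, which is why the sketch standardizes and concatenates the features per language before combining them. The pooled `X` and `y` would then be passed to whatever classification model is chosen.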

Keywords

Speech emotion recognition, Feature extraction, Multi‐feature fusion, Multi‐lingual fusion, Deep neural networks (DNN)