Data Augmentation Using GANs for Speech Emotion Recognition

INTERSPEECH(2019)

引用 96|浏览162
暂无评分
摘要
In this work, we address the problem of data imbalance for the task of Speech Emotion Recognition (SER). We investigate conditioned data augmentation using Generative Adversarial Networks (GANs), in order to generate samples for underrepresented emotions. We adapt and improve a conditional GAN architecture to generate synthetic spectrograms for the minority class. For comparison purposes, we implement a series of signal-based data augmentation methods. The proposed GAN-based approach is evaluated on two datasets, namely IEMOCAP and FEEL-25k, a large multi-domain dataset. Results demonstrate a 10% relative performance improvement in IEMOCAP and 5% in FEEL- 25k, when augmenting the minority classes.
更多
查看译文
关键词
Generative Adversarial Networks, Speech Emotion Recognition, data augmentation, data imbalance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要