Semi-supervised Learning with Generative Adversarial Networks for Arabic Dialect Identification

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

Abstract
Dialect identification (DID) is the task of distinguishing between dialects of the same language. Compared with general language identification (LID), DID is more challenging because dialects are substantially similar to one another. For i-vector based LID/DID, prior studies have shown that deep neural networks (DNNs) improve on Gaussian mixture models (GMMs) for acoustic modeling. In this study, a novel i-vector representation based on unsupervised bottleneck features is examined as the feature for identifying dialects in Arabic broadcast speech. To make use of the unlabeled training data, semi-supervised learning with generative adversarial networks (GANs) is incorporated into the development of the back-end classifier. Experiments with the proposed method on the third release of the Multi-Genre Broadcast (MGB-3) Challenge yield the best single-system performance among all submitted systems. The overall classification accuracy of 73.8% is a +28.8% relative improvement over the MGB-3 baseline accuracy of 57.3% and is the state-of-the-art performance on this DID task. The fused system further achieves an improvement of +39.4% in accuracy.
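The relative-improvement figure quoted in the abstract can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function and variable names are illustrative, and the fused-system accuracy computed at the end rests on the assumption that the +39.4% figure is also a relative gain over the same 57.3% baseline, which the abstract does not state explicitly.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Relative gain of `system` over `baseline`, in percent."""
    return (system - baseline) / baseline * 100.0


# Accuracies (in percent) taken from the abstract.
baseline_acc = 57.3  # MGB-3 baseline
single_acc = 73.8    # best single system

single_gain = relative_improvement(baseline_acc, single_acc)
print(f"single system: +{single_gain:.1f}% relative")

# ASSUMPTION: the fused system's +39.4% is relative to the same baseline.
# Under that reading, the implied fused accuracy would be:
fused_gain = 39.4
implied_fused_acc = baseline_acc * (1.0 + fused_gain / 100.0)
print(f"implied fused accuracy: {implied_fused_acc:.1f}%")
```

The first value reproduces the +28.8% reported in the abstract; the second is an inference under the stated assumption, not a figure reported by the authors.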
Keywords
Semi-supervised learning, language identification, generative adversarial networks, i-vector, Arabic dialect identification