Semi-supervised Learning with Generative Adversarial Networks for Arabic Dialect Identification
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
Abstract
Dialect Identification (DID) refers to the process of identifying different dialects within the same language class. Compared with more general language identification (LID), DID is a more challenging task because of the substantial similarity between dialects. For i-vector based LID/DID, prior studies have shown that deep neural networks (DNNs) improve on Gaussian Mixture Models (GMMs) in acoustic modeling. In this study, a novel i-vector representation based on unsupervised bottleneck features is examined as the feature for identifying dialects in Arabic broadcast speech. To utilize the unlabeled training data, semi-supervised learning with generative adversarial networks (GANs) is incorporated in the back-end classifier development. Experiments with the proposed method on the third release of the Multi-Genre Broadcast (MGB-3) Challenge yield the best single-system performance among all submitted systems. An overall classification accuracy of 73.8% is a +28.8% relative improvement over the MGB-3 baseline accuracy of 57.3%, and is the state-of-the-art performance in this DID task. The fused system further achieves an improvement of +39.4% in accuracy.
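The relative improvement quoted in the abstract can be checked with a short calculation. The following is a minimal sketch: the two accuracies are taken from the abstract, while the function and variable names are illustrative and not from the paper.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Relative gain of `system` over `baseline`, in percent."""
    return (system - baseline) / baseline * 100.0


# Accuracies (in %) as stated in the abstract.
baseline_acc = 57.3  # MGB-3 baseline
single_acc = 73.8    # best single system

gain = relative_improvement(baseline_acc, single_acc)
print(f"Relative improvement: {gain:.1f}%")
```

The absolute gain is 16.5 percentage points; dividing by the baseline accuracy gives the relative figure.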
Keywords
Semi-supervised learning, language identification, generative adversarial networks, i-vector, Arabic dialect identification