Semi-supervised Learning with Generative Adversarial Networks for Arabic Dialect Identification

Chunlei Zhang, Qian Zhang, John H.L. Hansen

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019)

Abstract

Dialect identification (DID) is the task of distinguishing dialects within the same language. Compared with more general language identification (LID), DID is more challenging because dialects are substantially similar to one another. For i-vector based LID/DID, prior studies have shown that deep neural networks (DNNs) improve on Gaussian mixture models (GMMs) for acoustic modeling. In this study, a novel i-vector representation based on unsupervised bottleneck features is examined as the feature for identifying dialects in Arabic broadcast speech. To exploit unlabeled training data, semi-supervised learning with generative adversarial networks (GANs) is incorporated into the development of the back-end classifier. Experiments with the proposed method on the third release of the Multi-Genre Broadcast (MGB-3) Challenge yield the best single-system performance among all submitted systems. The overall classification accuracy of 73.8% is a +28.8% relative improvement over the MGB-3 baseline accuracy of 57.3% and is state-of-the-art performance on this DID task. The fused system further achieves an improvement of +39.4% in accuracy.
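
As a quick check of the arithmetic in the abstract, the sketch below recomputes the relative improvement of the single system over the MGB-3 baseline from the two accuracies quoted above. The function name `relative_improvement` is a hypothetical helper introduced here for illustration; it is not taken from the paper.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Return the relative gain of `system` over `baseline` as a fraction.

    Hypothetical helper for illustration only; both arguments are
    accuracies expressed in the same unit (here, percent).
    """
    if baseline <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (system - baseline) / baseline


# Accuracies quoted in the abstract (percent).
baseline_acc = 57.3   # MGB-3 baseline
single_acc = 73.8     # proposed single system

gain_percent = round(relative_improvement(baseline_acc, single_acc) * 100, 1)
print(f"Relative improvement: +{gain_percent}%")
```

With the figures from the abstract, (73.8 - 57.3) / 57.3 is approximately 0.288, which matches the reported +28.8%.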

Keywords

Semi-supervised learning, language identification, generative adversarial networks, i-vector, Arabic Dialect identification
