slimIPL: Language-Model-Free Iterative Pseudo-Labeling

Interspeech 2021

Abstract
Recent results in end-to-end ASR have demonstrated the efficacy of simple pseudo-labeling for semi-supervised models trained both with Connectionist Temporal Classification (CTC) and Sequence-to-Sequence (seq2seq) losses. Iterative Pseudo-Labeling (IPL), which continuously trains a single model using pseudo-labels iteratively re-generated as the model learns, has been shown to further increase performance in ASR. We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard-label assignments (the most probable tokens), that is, without a language model. We call this approach Language-Model-Free IPL (slimIPL) and give a resultant training setup for CTC and seq2seq models. At inference, our experiments show that decoding with a strong language model is more beneficial with slimIPL than with IPL, as IPL exhibits some language-model over-fitting issues. Compared to prior work on semi-supervised and unsupervised approaches, slimIPL not only simplifies the training process but also achieves competitive and state-of-the-art results on LibriSpeech test sets in both standard and low-resource settings.
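To make the abstract's training procedure concrete, here is a minimal sketch of one slimIPL-style step for the CTC case: pseudo-labels are re-generated with hard (argmax) assignments and no language model, then the model takes a gradient step on labeled plus pseudo-labeled data. The names model, ctc_loss, and the batch objects are hypothetical placeholders, and this is an illustration under those assumptions, not the authors' implementation.

import torch

def greedy_ctc_labels(log_probs, blank=0):
    # Hard-label transcription: per-frame argmax, collapse repeats,
    # drop blanks -- standard CTC greedy decoding, no language model.
    best = log_probs.argmax(dim=-1)  # (time,)
    tokens, prev = [], blank
    for t in best.tolist():
        if t != blank and t != prev:
            tokens.append(t)
        prev = t
    return tokens

def slimipl_step(model, optimizer, labeled_batch, unlabeled_audio, ctc_loss):
    # 1. Re-generate pseudo-labels with the current model state,
    #    using hard labels only (no language model, no beam search).
    model.eval()
    with torch.no_grad():
        log_probs = model(unlabeled_audio)  # (batch, time, vocab), hypothetical interface
        pseudo = [greedy_ctc_labels(lp) for lp in log_probs]
    # 2. One gradient step on labeled plus freshly pseudo-labeled data.
    model.train()
    optimizer.zero_grad()
    loss = ctc_loss(model, labeled_batch) + ctc_loss(model, (unlabeled_audio, pseudo))
    loss.backward()
    optimizer.step()
    return loss.item()

Repeating this step as training progresses gives the iterative re-labeling the abstract describes; because the labels are the model's own most probable tokens rather than LM-rescored hypotheses, no language model enters training at any point.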
Keywords
deep learning,semi-supervised learning,pseudo-labeling,self-training,speech recognition