High quality agreement-based semi-supervised training data for acoustic modeling

2016 IEEE Spoken Language Technology Workshop (SLT), 2016

Abstract
This paper describes a new technique to automatically obtain large high-quality training speech corpora for acoustic modeling. Traditional approaches select utterances based on confidence thresholds and other heuristics. We propose instead to use an ensemble approach: we transcribe each utterance using several recognizers, and only keep those on which they agree. The recognizers we use are trained on data from different dialects of the same language, and this diversity leads them to make different mistakes in transcribing speech utterances. In this work we show, however, that when they agree, this is an extremely strong signal that the transcript is correct. This allows us to produce automatically transcribed speech corpora that are superior in transcript correctness even to those manually transcribed by humans. Furthermore, we show that using the produced semi-supervised data sets, we can train new acoustic models which outperform those trained solely on previously available data sets.
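
The selection step described in the abstract (transcribe each utterance with several recognizers and keep only those on which all agree) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Recognizer` type, the `normalize` helper, and the function name `select_by_agreement` are hypothetical, and the choices to compare transcripts after lowercasing and whitespace normalization and to skip empty transcripts are assumptions not stated in the abstract.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: a recognizer maps an utterance ID to a transcript.
Recognizer = Callable[[str], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def select_by_agreement(
    utterances: Iterable[str],
    recognizers: List[Recognizer],
) -> List[Tuple[str, str]]:
    """Keep (utterance, transcript) pairs on which every recognizer agrees."""
    if len(recognizers) < 2:
        raise ValueError("agreement needs at least two recognizers")
    selected = []
    for utt in utterances:
        # Collect the distinct normalized hypotheses for this utterance.
        hypotheses = {normalize(rec(utt)) for rec in recognizers}
        if len(hypotheses) == 1:
            transcript = next(iter(hypotheses))
            # Assumption: an empty transcript is not useful training data.
            if transcript:
                selected.append((utt, transcript))
    return selected


# Toy stand-ins for recognizers trained on different dialects.
dialect_a = {"u1": "turn on the lights", "u2": "call mom"}
dialect_b = {"u1": "Turn on the  lights", "u2": "call tom"}
recognizers = [dialect_a.__getitem__, dialect_b.__getitem__]

agreed = select_by_agreement(["u1", "u2"], recognizers)
print(agreed)
```

In this toy run only `u1` survives, because the two stand-in recognizers disagree on `u2`. Requiring exact agreement discards many utterances, which is the trade-off the abstract accepts in exchange for transcript correctness.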
Keywords
semi-supervised, agreement-based, ensemble, data selection, acoustic modeling