Large Margin Training for Attention Based End-to-End Speech Recognition

INTERSPEECH(2019)

引用 7|浏览35
暂无评分
摘要
A method of attention-based end-to-end (E2E) automatic speech recognition (ASR) training, includes performing cross-entropy training of a model, based on one or more input features of a speech signal, performing beam searching of the model of which the cross-entropy training is performed, to generate an n-best hypotheses list of output hypotheses, and determining a one-best hypothesis among the generated n-best hypotheses list. The method further includes determining a character-based gradient and a word-based gradient, based on the model of which the cross-entropy training is performed and a loss function in which a distance between a reference sequence and the determined one-best hypothesis is maximized, and performing backpropagation of the determined character-based gradient and the determined word-based gradient to the model, to update the model.
更多
查看译文
关键词
large margin, attention based end-to-end speech recognition, MBR training, MAP evaluation, Switchboard
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要