Large Margin Training for Attention Based End-to-End Speech Recognition

INTERSPEECH（2019）

引用 7|浏览35

暂无评分

摘要

A method of attention-based end-to-end (E2E) automatic speech recognition (ASR) training, includes performing cross-entropy training of a model, based on one or more input features of a speech signal, performing beam searching of the model of which the cross-entropy training is performed, to generate an n-best hypotheses list of output hypotheses, and determining a one-best hypothesis among the generated n-best hypotheses list. The method further includes determining a character-based gradient and a word-based gradient, based on the model of which the cross-entropy training is performed and a loss function in which a distance between a reference sequence and the determined one-best hypothesis is maximized, and performing backpropagation of the determined character-based gradient and the determined word-based gradient to the model, to update the model.

查看译文

关键词

large margin, attention based end-to-end speech recognition, MBR training, MAP evaluation, Switchboard

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要