Lattice Generation in Attention-Based Speech Recognition Models

INTERSPEECH (2019)

Abstract
Attention-based neural speech recognition models are frequently decoded with beam search, which produces a tree of hypotheses. In many cases, such as when external language models are used, numerous decoding hypotheses need to be considered, requiring large beam sizes during decoding. We demonstrate that it is possible to merge certain nodes in the tree of hypotheses to obtain a decoding lattice, which increases the number of decoding hypotheses without increasing the number of candidates scored by the neural network. We propose a convolutional architecture that facilitates comparing states of the model at different positions. The experiments are carried out on the Wall Street Journal dataset, where the lattice decoder obtains lower word error rates with smaller beam sizes than an otherwise similar architecture with regular beam search.
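To illustrate the node-merging idea described in the abstract, below is a minimal sketch in Python of one beam-search pruning step that merges hypotheses with similar decoder states, so that several decoding paths share a single state to be scored by the network. The names `Hypothesis`, `states_similar`, and `merge_step` are hypothetical, and the cosine-similarity test is a stand-in for the learned convolutional state comparison proposed in the paper.

```python
import numpy as np

class Hypothesis:
    """A node in the decoding graph: a prefix, its score, and the decoder state."""
    def __init__(self, tokens, score, state):
        self.tokens = tokens          # token ids decoded so far
        self.score = score            # cumulative log-probability
        self.state = state            # decoder state vector (numpy array)
        self.merged_from = []         # lattice back-pointers to merged hypotheses

def states_similar(a, b, threshold=0.95):
    """Hypothetical similarity test between two decoder states.

    The paper compares states with a learned convolutional architecture;
    here a simple cosine similarity is substituted as an assumption."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return cos >= threshold

def merge_step(candidates, beam_size):
    """One pruning step: merge similar hypotheses into lattice nodes,
    then keep at most `beam_size` survivors.

    Merging keeps extra decoding paths alive as back-pointers without
    increasing the number of states the neural network must score."""
    survivors = []
    for hyp in sorted(candidates, key=lambda h: h.score, reverse=True):
        merged = False
        for kept in survivors:
            if states_similar(hyp.state, kept.state):
                # Record the alternative path instead of scoring it separately.
                kept.merged_from.append(hyp)
                merged = True
                break
        if not merged and len(survivors) < beam_size:
            survivors.append(hyp)
    return survivors
```

Following the `merged_from` back-pointers after decoding recovers the lattice of alternative paths, which is what allows more hypotheses to be considered at a given beam size.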
Keywords
speech recognition, beam search, artificial neural networks, attention-based models, lattice generation, decoding