A Hybrid Seq-2-Seq ASR Design for On-Device and Server Applications

Cyril Allauzen, Ehsan Variani, Michael Riley, David Rybach, Hao Zhang

Interspeech (2021)

Abstract

This paper proposes and evaluates alternative speech recognition design strategies using the hybrid autoregressive transducer (HAT) model. The different strategies are designed with special attention to the choice of modeling units and to the integration of different types of external language models during first-pass beam-search or second-pass re-scoring. These approaches are compared on a large-scale voice search task and the recognition quality over the head and tail of speech data is analyzed. Our experiments show decent improvements in WER over common speech phrases and significant gains on uncommon ones compared to the state-of-the-art approaches.

Keywords

speech recognition, modularity, sequence-to-sequence, tail distribution
