FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization
Abstract:
Streaming automatic speech recognition (ASR) aims to emit each hypothesized word as quickly and accurately as possible. However, emitting fast without degrading quality, as measured by word error rate (WER), is highly challenging. Existing approaches including Early and Late Penalties and Constrained Alignments penalize emission delay b...