Scaling Up Online Speech Recognition Using ConvNets
INTERSPEECH, pp. 3376-3380, 2020.
We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC). We improve the core TDS architecture to limit the future context and hence reduce latency while maintaining accuracy. The system has almost three times the throughput of a we...
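To make "limiting the future context" concrete, here is a minimal sketch of one common way to do it: asymmetric padding in a 1-D convolution, so that each output frame depends on only a fixed number of later input frames. The abstract above does not say how the paper implements this, so the approach, the function names (`limited_future_conv1d`, `total_future_context`), and the `future` parameter are illustrative assumptions, not the authors' code.

```python
def limited_future_conv1d(x, kernel, future):
    """Stride-1 1-D convolution (cross-correlation) with bounded lookahead.

    Hypothetical helper: output frame t sees at most `future` input frames
    after t; the remaining kernel taps look at past frames only.
    """
    k = len(kernel)
    if not 0 <= future < k:
        raise ValueError("future must be in [0, kernel size)")
    past = k - 1 - future
    # Asymmetric zero padding: `past` frames on the left, `future` on the right.
    padded = [0.0] * past + list(x) + [0.0] * future
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(x))
    ]


def total_future_context(futures):
    """Lookahead of a stack of stride-1 layers: per-layer lookaheads add up."""
    return sum(futures)


if __name__ == "__main__":
    frames = [1.0, 2.0, 3.0, 4.0]
    print(limited_future_conv1d(frames, [1.0, 1.0, 1.0], future=0))
    print(limited_future_conv1d(frames, [1.0, 1.0, 1.0], future=1))
    print(total_future_context([1, 0, 2]))
```

With `future=0` the layer is fully causal and adds no lookahead. Larger values give each frame more context but force a streaming system to wait for more audio before it can emit an output, which is the latency cost the abstract refers to.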