Multitask Training with Text Data for End-to-End Speech Recognition
Abstract:
We propose a multitask training method for attention-based end-to-end speech recognition models that better incorporates language-level information. We regularize the decoder of a sequence-to-sequence architecture by training it jointly on the speech recognition task and a next-token-prediction language modeling task. Trained on eit…
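The abstract describes combining a speech recognition loss with a next-token language modeling loss on the shared decoder. A minimal sketch of such a weighted multitask objective, assuming a simple cross-entropy formulation and a hypothetical interpolation weight `lam` (the paper's actual architecture and weighting are not specified here):

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean softmax cross-entropy over a sequence of time steps."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(targets)), targets])

def multitask_loss(asr_logits, asr_targets, lm_logits, lm_targets, lam=0.7):
    """Weighted sum of the ASR loss (paired speech-text data) and the
    next-token LM loss (text-only data). `lam` is a hypothetical
    interpolation weight, not a value taken from the paper."""
    return (lam * cross_entropy(asr_logits, asr_targets)
            + (1.0 - lam) * cross_entropy(lm_logits, lm_targets))
```

In this sketch the decoder's output logits are scored against the transcript for the ASR branch and against shifted text tokens for the LM branch; only the relative weighting ties the two tasks together.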