Contextualizing ASR Lattice Rescoring with Hybrid Pointer Network Language Model
INTERSPEECH, pp. 3650-3654, 2020.
Videos uploaded on social media are often accompanied by textual descriptions. In building automatic speech recognition (ASR) systems for videos, we can exploit the contextual information provided by such video metadata. In this paper, we explore ASR lattice rescoring by selectively attending to the video descriptions. We first use an...