Cambridge University transcription systems for the multi-genre broadcast challenge

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2015

Abstract
We describe the development of our speech-to-text transcription systems for the 2015 Multi-Genre Broadcast (MGB) challenge. Key features of the systems are: a segmentation system based on deep neural networks (DNNs); the use of HTK 3.5 for building DNN-based hybrid and tandem acoustic models and the use of these models in a joint decoding framework; techniques for adaptation of DNN based acoustic models including parameterised activation function adaptation; alternative acoustic models built using Kaldi; and recurrent neural network language models (RNNLMs) and RNNLM adaptation. The same language models were used with both HTK and Kaldi acoustic models and various combined systems built. The final systems had the lowest error rates on the evaluation data.
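To illustrate the parameterised activation function adaptation mentioned in the abstract, below is a minimal NumPy sketch of the general idea: a small set of per-speaker parameters rescales the hidden-unit activations while the speaker-independent DNN weights stay fixed. The parameter names, layer sizes, and initialisation here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def p_sigmoid(z, alpha=1.0, beta=1.0):
    """Parameterised sigmoid: alpha scales the output, beta scales the slope.

    In parameterised-activation adaptation, alpha/beta form a compact
    per-speaker parameter set estimated on adaptation data while the main
    DNN weights remain fixed. (Names are illustrative, not from the paper.)
    """
    return alpha / (1.0 + np.exp(-beta * z))

# Hypothetical example: one hidden layer applied to a single feature frame.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 40))   # speaker-independent weights (fixed)
b = np.zeros(512)                    # speaker-independent biases (fixed)
x = rng.standard_normal(40)          # one acoustic feature vector

# Per-speaker activation parameters, one pair per hidden unit, initialised
# to the identity transform and later re-estimated on adaptation data.
alpha = np.ones(512)
beta = np.ones(512)

h = p_sigmoid(W @ x + b, alpha, beta)
print(h.shape)  # (512,)
```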
Keywords
Speech recognition, broadcast transcription, deep neural networks, HTK, Kaldi