Hybrid HMM / Neural Network based Speech Recognition in Loquendo ASR
msra(2006)
Abstract
This paper describes hybrid Hidden Markov Model / Artificial Neural Network (HMM/ANN) models for speech recognition, and in particular Loquendo HMM/ANN, which is the core of Loquendo ASR. While Hidden Markov Models (HMMs) are the dominant approach in most state-of-the-art speaker-independent, continuous speech recognition systems (and commercial products), Artificial Neural Networks (ANNs) are universally known as one of the most powerful nonlinear methods for pattern recognition, time series prediction, optimization and forecasting. Hybrid HMM/ANN, introduced in the nineties for speech recognition, is presently a very competitive alternative to HMM, both in terms of performance and recognition accuracy. HMM/ANN combines the advantages of both approaches by using an ANN (a multilayer perceptron) to estimate the state-dependent observation probabilities of an HMM, instead of Gaussian mixtures, while the temporal aspects of speech are handled by left-to-right HMMs. HMM/ANN models can be trained discriminatively, can incorporate multiple input sources, and have a flexible architecture that easily accommodates contextual inputs and feedback. Furthermore, ANNs are typically highly parallel and regular structures, which makes them especially suited to high-performance architectures and optimized implementations.
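The division of labor described in the abstract can be illustrated with a minimal sketch. It is not Loquendo's implementation. It assumes the convention common in the hybrid HMM/ANN literature, in which the network outputs state posteriors P(state | frame) that are divided by state priors to obtain scaled likelihoods, and these are then decoded with Viterbi over a left-to-right HMM. The function names, the toy posteriors, the priors and the transition probabilities are all hypothetical.

```python
import math

NEG_INF = float("-inf")


def scaled_log_likelihoods(posteriors, priors):
    """Turn per-frame state posteriors P(state | frame) into scaled
    log-likelihoods log(P(state | frame) / P(state))."""
    return [
        [math.log(p) - math.log(prior) if p > 0.0 else NEG_INF
         for p, prior in zip(frame, priors)]
        for frame in posteriors
    ]


def viterbi_left_to_right(log_obs, log_stay, log_advance):
    """Best state path through a left-to-right HMM.

    The path starts in state 0, ends in the last state, and at each
    frame either stays in the current state or advances by one.
    """
    n_frames, n_states = len(log_obs), len(log_obs[0])
    score = [[NEG_INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_obs[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s] + log_stay[s]
            move = (score[t - 1][s - 1] + log_advance[s - 1]
                    if s > 0 else NEG_INF)
            if stay >= move:
                best, prev = stay, s
            else:
                best, prev = move, s - 1
            score[t][s] = best + log_obs[t][s]
            back[t][s] = prev
    # Trace back from the final state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, score[-1][-1]


# Hypothetical MLP outputs for 5 frames and 3 HMM states.
posteriors = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.6, 0.3],
    [0.1, 0.1, 0.8],
]
priors = [0.4, 0.3, 0.3]            # assumed state priors
log_stay = [math.log(0.5)] * 3      # assumed self-loop probabilities
log_advance = [math.log(0.5)] * 2   # assumed forward-transition probabilities

log_obs = scaled_log_likelihoods(posteriors, priors)
path, log_score = viterbi_left_to_right(log_obs, log_stay, log_advance)
print(path, log_score)
```

In this sketch the network is only a source of per-frame scores; the left-to-right topology alone constrains the order in which states may be visited, which mirrors the split between observation modeling and temporal modeling described above.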