Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden lay...
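As a rough illustration of the alternative described in the abstract, the following minimal Python sketch stacks several frames of coefficients into one input vector and passes it through a small feed-forward network whose softmax output can be read as posterior probabilities over HMM states. The layer sizes, the random untrained weights, the logistic hidden units, the edge-frame padding, and the names `stack_frames`, `init_layer`, `affine`, and `forward` are all assumptions chosen for illustration; none of them is taken from the paper.

```python
import math
import random


def stack_frames(frames, center, context):
    """Concatenate frames[center-context .. center+context] into one vector,
    repeating the edge frame when the window runs past the utterance."""
    last = len(frames) - 1
    window = []
    for t in range(center - context, center + context + 1):
        window.extend(frames[min(max(t, 0), last)])
    return window


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax: shift by the maximum before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """Logistic hidden layers followed by a softmax output layer."""
    for layer in layers[:-1]:
        x = [1.0 / (1.0 + math.exp(-v)) for v in affine(layer, x)]
    return softmax(affine(layers[-1], x))


rng = random.Random(0)
n_coeffs, context, n_states = 13, 5, 6  # assumed sizes, for illustration only

# Synthetic "utterance": 20 frames of random coefficients.
frames = [[rng.gauss(0.0, 1.0) for _ in range(n_coeffs)] for _ in range(20)]

# Input layer sees 2*context+1 stacked frames; two hidden layers; one output per state.
sizes = [n_coeffs * (2 * context + 1), 32, 32, n_states]
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

posteriors = forward(layers, stack_frames(frames, center=0, context=context))
```

Here `posteriors` holds one probability per hypothetical HMM state for the first frame, and the values sum to 1. Because the weights are random rather than trained, the numbers carry no acoustic meaning; the sketch only shows the shape of the computation.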
- Ignacio Lopez-Moreno, Javier Gonzalez-Dominguez, David Martinez, Oldrich Plchot, Joaquin Gonzalez-Rodriguez, Pedro J. Moreno. On the use of deep feedforward neural networks for automatic language identification. Computer Speech & Language, pp. 46-59, 2016.
- Siniscalchi, S.M.; Dong Yu; Li Deng; Chin-Hui Lee. Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model. Signal Processing Letters, IEEE, pp. 201-204, 2013.
- Abdel-Hamid, O.; Mohamed, A.-R.; Hui Jiang; Li Deng. Convolutional Neural Networks for Speech Recognition. Audio, Speech, and Language Processing, IEEE/ACM Transactions, pp. 1533-1545, 2014.
- Zhang, C.; Woodland, P.C. Standalone training of context-dependent deep neural network acoustic models. Acoustics, Speech and Signal Processing, pp. 5597-5601, 2014.
- Jianwei Niu, Yanmin Qian, Kai Yu. Acoustic emotion recognition using deep neural network. ISCSLP, pp. 128-132, 2014.
- Narayanan, A.; DeLiang Wang. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training. Audio, Speech, and Language Processing, IEEE/ACM Transactions, pp. 92-101, 2015.
- Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee. Exploiting deep neural networks for detection-based speech recognition. Neurocomputing, pp. 148-157, 2013.
- G. E. Dahl, Dong Yu, Li Deng, A. Acero. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Transactions on Audio, Speech & Language Processing, pp. 30-42, 2012.
IEEE Signal Process. Mag., pp. 82-97, 2012.