Hidden Conditional Random Fields for phone recognition

ASRU (2009)

Abstract
We apply Hidden Conditional Random Fields (HCRFs) to the task of TIMIT phone recognition. HCRFs are discriminatively trained sequence models that augment conditional random fields with hidden states that are capable of representing subphones and mixture components. We extend HCRFs, which had previously only been applied to phone classification with known boundaries, to recognize continuous phone sequences. We use an N-best inference algorithm in both learning (to approximate all competitor phone sequences) and decoding (to marginalize over hidden states). Our monophone HCRFs achieve 28.3% phone error rate, outperforming maximum likelihood trained HMMs by 3.6%, maximum mutual information trained HMMs by 2.5%, and minimum phone error trained HMMs by 2.2%. We show that this win is partially due to HCRFs' ability to simultaneously optimize discriminative language models and acoustic models, a powerful property that has important implications for speech recognition.
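
The decoding step mentioned in the abstract (marginalizing over hidden states with an N-best list) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the N-best list is already available as pairs of a phone sequence and a log score, one per hidden-state path, and the function name `marginalize_nbest` and the toy scores are hypothetical. Scores of paths that share a phone sequence are combined with log-sum-exp, and the phone sequence with the highest combined score is returned.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginalize_nbest(nbest):
    """Approximate marginalization over hidden states from an N-best list.

    nbest: iterable of (phone_sequence, log_score) pairs, one per
    hidden-state path (assumed input format, not taken from the paper).
    Returns the best phone sequence and the combined log score of each
    distinct phone sequence.
    """
    grouped = defaultdict(list)
    for phones, log_score in nbest:
        grouped[tuple(phones)].append(log_score)
    totals = {phones: logsumexp(scores) for phones, scores in grouped.items()}
    best = max(totals, key=totals.get)
    return best, totals


# Toy N-best list with made-up scores: two hidden-state paths map to
# "s ih t" and one, the single best path, maps to "s eh t".
nbest = [
    (("s", "ih", "t"), -10.0),
    (("s", "eh", "t"), -9.8),
    (("s", "ih", "t"), -10.1),
]
best, totals = marginalize_nbest(nbest)
```

In this toy list the single highest-scoring path belongs to "s eh t", but the two "s ih t" paths together carry more probability mass, so summing over hidden-state paths changes the decoded phone sequence.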
Keywords
telephone sets,speech recognition,acoustic models,hidden conditional random fields,phone recognition,n-best inference algorithm,discriminative language models,phone error rate,maximum likelihood,decoding,speech,data mining,acoustics,feature extraction,conditional random field,hidden markov models,error rate,language model