Language recognition using deep-structured conditional random fields
ICASSP (2010)
Abstract
We present a novel language identification technique using our recently developed deep-structured conditional random fields (CRFs). The deep-structured CRF is a multi-layer CRF model in which each higher layer's input observation sequence consists of the lower layer's observation sequence and the resulting lower layer's frame-level marginal probabilities. In this paper we extend the original deep-structured CRF by allowing for distinct state representations at different layers and demonstrate its benefits. We propose an unsupervised algorithm to pre-train the intermediate layers by casting it as a multi-objective programming problem that is aimed at minimizing the average frame-level conditional entropy while maximizing the state occupation entropy. Empirical evaluation on a seven-language/dialect voice mail routing task showed that our approach can achieve a routing accuracy (RA) of 86.4% and average equal error rate (EER) of 6.6%. These results are significantly better than the 82.5% RA and 7.5% average EER obtained using the Gaussian mixture model trained with the maximum mutual information criterion but slightly worse than the 87.7% RA and 6.4% EER achieved using the support vector machine with model pushing on the Gaussian super vector (GSV).
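The two quantities in the pre-training criterion can be illustrated with a small numerical sketch. The abstract does not give the exact formulation, so the code below makes these assumptions: the frame-level marginals are available as a `(T, S)` array, the state occupation is their average over frames, and the two objectives are combined with a simple weighted difference. The function names and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def frame_conditional_entropy(posteriors, eps=1e-12):
    """Average over frames of the entropy of each frame's state posterior."""
    p = np.asarray(posteriors, dtype=float)
    per_frame = -(p * np.log(np.clip(p, eps, None))).sum(axis=1)
    return float(per_frame.mean())


def state_occupation_entropy(posteriors, eps=1e-12):
    """Entropy of the state occupation, taken here as the mean posterior over frames."""
    occ = np.asarray(posteriors, dtype=float).mean(axis=0)
    return float(-(occ * np.log(np.clip(occ, eps, None))).sum())


def pretraining_objective(posteriors, weight=1.0):
    """Assumed scalarization: minimize conditional entropy, maximize occupation entropy."""
    return frame_conditional_entropy(posteriors) - weight * state_occupation_entropy(posteriors)


def next_layer_input(observations, posteriors):
    """Higher-layer input: lower-layer observations joined with its frame-level marginals."""
    return np.concatenate([observations, posteriors], axis=1)


if __name__ == "__main__":
    # Confident frames that use both states equally: low conditional entropy,
    # high occupation entropy, hence a low value of the assumed objective.
    sharp = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
    print(frame_conditional_entropy(sharp), state_occupation_entropy(sharp))
    print(next_layer_input(np.zeros((4, 3)), sharp).shape)
```

Under this assumed scalarization, a layer scores well when each frame is assigned confidently to one state while all states are used about equally often across the data. That is one reading of the criterion described above, not necessarily the authors' exact formulation.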
Keywords
observation sequence, speech recognition, random processes, language recognition, voice mail, language identification, Gaussian super vector, equal error rate, deep-structure, state occupation entropy, deep-structured conditional random fields, deep learning, support vector machine, conditional random field, voice mail routing task, multiobjective programming problem, language identification technique, frame-level conditional entropy, entropy, frame-level marginal probabilities, unsupervised learning, support vector machines, conditional entropy, mutual information, deep structure, routing, Gaussian mixture model, hidden Markov models, casting, accuracy, automatic speech recognition