Comparing Different Flavors Of Spectro-Temporal Features For Asr

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5(2011)

引用 47|浏览73
暂无评分
摘要
In the last decade, several studies have shown that the robustness of ASR systems can be increased when 2D Gabor filters are used to extract specific modulation frequencies from the input pattern. This paper analyzes important design parameters for spectro-temporal features based on a Gabor filter bank: We perform experiments with filters that exhibit different phase sensitivity. Further, we analyze if non-linear weighting with a multi-layer perceptron (MLP) and a subsequent concatenation with mel-frequency cepstral coefficients (MFCCs) has beneficial effects. For the Aurora2 noisy digit recognition task, the use of phase sensitive filters improved the MFCC baseline, whereas using filters that neglect phase information did not. While MLP processing alone did not have a large effect on the overall performance, the best results were obtained for MLP-processed phase sensitive filters and added MFCCs, with relative error reductions of over 40% for both noisy and clean training.
更多
查看译文
关键词
spectro-temporal features, automatic speech recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要