Performance evaluation of psycho-acoustically motivated front-end compensator for TIMIT phone recognition

Pattern Analysis and Applications (2019)

Abstract
Wavelet-based front-end processing has gained popularity for its noise-removal capability. This paper proposes a robust automatic speech recognition (ASR) system built on a psycho-acoustically motivated, wavelet-based front-end compensator. Within the compensator, a voice activity detector based on voiced speech probability (VSP) separates voiced from unvoiced frames and updates the noise statistics. The wavelet packet decomposition tree is designed according to the equivalent rectangular bandwidth (ERB) scale, because the center frequencies of the ERB distribution resemble the frequency response of the human cochlea. Voiced and unvoiced frames are decomposed separately into 24 sub-bands to estimate the average sub-band energy (ASE) of each frame, and the ASE is then used to calculate the threshold value. Finally, Wiener filtering reduces the residual noise before the final reconstruction stage. The proposed system is evaluated on the TIMIT database under various noise conditions. Its phoneme recognition accuracy (PRA) is compared with that of several baseline and robust features, as well as with existing front-end compensation techniques. The front-end compensator is additionally evaluated in terms of phoneme classification accuracy. Performance improves in all of these experiments.
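
The abstract does not give the ERB formula or the exact ASE definition, so the following is only a minimal sketch under stated assumptions. It uses the widely cited Glasberg and Moore ERB-rate approximation to place 24 ideal band edges up to 8 kHz (the Nyquist frequency of 16 kHz TIMIT audio), and it takes ASE to be the mean squared coefficient per sub-band. The function names, the 8 kHz upper limit, and the ASE definition are illustrative assumptions, not the authors' implementation; a real wavelet packet tree splits bands dyadically and can only approximate this spacing.

```python
import math

def hz_to_erb_rate(f_hz):
    """Glasberg-Moore ERB-rate approximation for a frequency in Hz."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437

def erb_band_edges(n_bands=24, f_max_hz=8000.0):
    """Band edges (Hz) spaced uniformly on the ERB-rate scale from 0 to f_max_hz.

    Assumption: 24 bands up to 8 kHz; the paper's actual tree layout is not
    stated in the abstract.
    """
    e_max = hz_to_erb_rate(f_max_hz)
    return [erb_rate_to_hz(e_max * i / n_bands) for i in range(n_bands + 1)]

def average_subband_energy(subbands):
    """Mean squared coefficient of each (non-empty) sub-band.

    Assumption: this is one plausible reading of "average sub-band energy".
    """
    return [sum(c * c for c in band) / len(band) for band in subbands]

if __name__ == "__main__":
    edges = erb_band_edges()
    for lo, hi in zip(edges[:-1], edges[1:]):
        print(f"{lo:8.1f} Hz to {hi:8.1f} Hz")
```

Bands produced this way are narrow at low frequencies and widen toward the top of the range, which is the property the abstract attributes to the cochlea-like ERB distribution.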
Keywords
VSP, Wavelet decomposition, ASR, Front-end compensator, PRA