Semi-Blind Speech Enhancement Based On Recurrent Neural Network For Source Separation And Dereverberation

Masaya Wake, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (2017)

Abstract

This paper describes a semi-blind speech enhancement method using a semi-blind recurrent neural network (SB-RNN) for human-robot speech interaction. When a robot interacts with a human through speech, it has access not only to the audio signals recorded by its own microphone but also to the speech signals it produces itself, which can be exploited for semi-blind speech enhancement. The SB-RNN consists of two cascaded modules: a semi-blind source separation module and a blind dereverberation module. Each module has a recurrent layer to capture the temporal correlations of speech signals. The SB-RNN is trained by multi-task learning: isolated echoic speech signals serve as teacher signals for the output of the separation module, and isolated anechoic signals serve as teacher signals for the output of the dereverberation module. Experimental results showed that the source-to-distortion ratio improved by 2.30 dB on average compared with a conventional method based on semi-blind independent component analysis. The results also showed the effectiveness of modularizing the network, multi-task learning, the recurrent structure, and semi-blind source separation.
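The cascaded structure and the multi-task objective described in the abstract can be illustrated with a rough forward-pass sketch. This is not the authors' implementation. The abstract does not specify the layer type, feature representation, layer sizes, or loss weighting, so the simple tanh recurrent layer, the per-frame feature vectors, the hidden size, the mean-squared-error loss, and the weight `alpha` below are all assumptions chosen for illustration, and the data are random placeholders. No training step is shown.

```python
import numpy as np

def init_rnn(rng, n_in, n_hidden, n_out):
    """Random weights for one simple recurrent module (sizes are hypothetical)."""
    scale = 0.1
    return {
        "W_in": rng.standard_normal((n_hidden, n_in)) * scale,
        "W_rec": rng.standard_normal((n_hidden, n_hidden)) * scale,
        "W_out": rng.standard_normal((n_out, n_hidden)) * scale,
    }

def run_rnn(params, x):
    """Run a tanh recurrent layer over x with shape (frames, n_in)."""
    h = np.zeros(params["W_rec"].shape[0])
    outputs = []
    for frame in x:
        # The hidden state carries information across frames (temporal correlation).
        h = np.tanh(params["W_in"] @ frame + params["W_rec"] @ h)
        outputs.append(params["W_out"] @ h)
    return np.stack(outputs)

def sb_rnn_forward(sep, derev, mixture, robot_speech):
    """Cascade: semi-blind separation module, then blind dereverberation module."""
    # Semi-blind: the separation module sees the mixture AND the known robot speech.
    echoic_est = run_rnn(sep, np.concatenate([mixture, robot_speech], axis=1))
    # Blind: the dereverberation module sees only the separated, still echoic speech.
    clean_est = run_rnn(derev, echoic_est)
    return echoic_est, clean_est

def multitask_loss(echoic_est, clean_est, echoic_target, clean_target, alpha=0.5):
    """Weighted sum of the two module losses (MSE and alpha are assumptions)."""
    sep_loss = np.mean((echoic_est - echoic_target) ** 2)
    derev_loss = np.mean((clean_est - clean_target) ** 2)
    return alpha * sep_loss + (1.0 - alpha) * derev_loss

# Placeholder data: 20 frames of 8-dimensional features (not real speech).
rng = np.random.default_rng(0)
frames, bins = 20, 8
mixture = rng.standard_normal((frames, bins))        # microphone recording
robot_speech = rng.standard_normal((frames, bins))   # known robot utterance
echoic_target = rng.standard_normal((frames, bins))  # isolated echoic speech
clean_target = rng.standard_normal((frames, bins))   # isolated reverberation-free speech

sep = init_rnn(rng, n_in=2 * bins, n_hidden=16, n_out=bins)
derev = init_rnn(rng, n_in=bins, n_hidden=16, n_out=bins)

echoic_est, clean_est = sb_rnn_forward(sep, derev, mixture, robot_speech)
loss = multitask_loss(echoic_est, clean_est, echoic_target, clean_target)
```

Supervising the intermediate output as well as the final one is what makes the objective multi-task: the separation module is pushed toward isolated echoic speech rather than being left free to learn any representation that happens to help the second module.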

Keywords

Semi-blind source separation, Blind dereverberation, Recurrent neural network
