A Fully Convolutional Neural Network For Complex Spectrogram Processing In Speech Enhancement

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

Abstract
In this paper, we propose a fully convolutional neural network (CNN) for complex spectrogram processing in speech enhancement. The proposed CNN consists of one-dimensional (1-d) convolution and frequency-dilated 2-d convolution, and incorporates residual learning and a skip-connection structure. Compared with state-of-the-art methods, the proposed CNN achieves better performance with fewer parameters. Experiments show that complex spectrogram processing is effective for phase estimation, which benefits the reconstruction of clean speech, especially for female speech. The model also yields convincing performance with a small memory footprint when the number of parameters is limited.
Keywords
speech denoising, complex spectrogram, phase processing, frequency dilation, fully convolutional neural network
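
Illustrative sketch
The abstract names two ingredients that are easier to picture in code: a complex spectrogram treated as a two-channel (real, imaginary) map, and a 2-d convolution dilated along the frequency axis only, wrapped in a residual connection. The NumPy sketch below is a minimal illustration under stated assumptions, not the network from the paper. The function names, frame length, hop size, kernel shapes, and the absence of a nonlinearity are all hypothetical choices made for brevity.

```python
import numpy as np


def complex_spectrogram(signal, frame_len=64, hop=32):
    """Return a (2, freq, time) array holding the real and imaginary STFT parts.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[k * hop : k * hop + frame_len] * window for k in range(n_frames)]
    )
    stft = np.fft.rfft(frames, axis=1).T  # shape: (freq, time)
    return np.stack([stft.real, stft.imag])


def freq_dilated_conv2d(x, kernel, freq_dilation=1):
    """'Same'-padded 2-d correlation on a (freq, time) map.

    Dilation is applied along the frequency axis only; the time axis is
    left undilated. Kernel sizes are assumed to be odd.
    """
    kf, kt = kernel.shape
    pad_f = freq_dilation * (kf - 1) // 2
    pad_t = (kt - 1) // 2
    xp = np.pad(x, ((pad_f, pad_f), (pad_t, pad_t)))
    n_freq, n_time = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(kf):
        for j in range(kt):
            f0 = i * freq_dilation  # taps are spaced freq_dilation bins apart
            out += kernel[i, j] * xp[f0 : f0 + n_freq, j : j + n_time]
    return out


def residual_block(spec, kernels, freq_dilation):
    """Add a frequency-dilated convolution of spec back onto spec.

    spec:    (2, freq, time) real/imaginary channels
    kernels: (2, 2, kf, kt) indexed as (out_channel, in_channel, freq, time)
    No activation is applied here; that is a simplification of this sketch.
    """
    out = np.zeros_like(spec)
    for o in range(spec.shape[0]):
        for c in range(spec.shape[0]):
            out[o] += freq_dilated_conv2d(spec[c], kernels[o, c], freq_dilation)
    return spec + out  # residual learning: the block predicts a correction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = np.sin(np.arange(512) * 0.1) + 0.1 * rng.standard_normal(512)
    spec = complex_spectrogram(noisy)
    kernels = 0.01 * rng.standard_normal((2, 2, 3, 3))
    enhanced = residual_block(spec, kernels, freq_dilation=2)
    print(spec.shape, enhanced.shape)
```

Because both real and imaginary channels pass through the block, the output carries an implicit phase estimate rather than a magnitude alone, which is the property the abstract credits for the improved reconstruction. Dilating only along frequency widens the receptive field across frequency bins without adding kernel weights.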