CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement

2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (2020)

Abstract
Speech enhancement has improved considerably in recent years, especially with the development of generative adversarial networks (GANs). However, prior methods simply adopt GAN architectures from computer vision tasks without designs specific to speech enhancement that account for audio characteristics (i.e., context at different granularities), which may leave noise in some segments or distort the content of the original audio. In this work, we make the first attempt to exploit global and local speech features for coarse-to-fine speech enhancement and introduce a Context Pyramid Generative Adversarial Network (CP-GAN), which contains a densely connected feature pyramid generator and a dynamic context granularity discriminator to eliminate audio noise hierarchically. Extensive experiments demonstrate that CP-GAN achieves state-of-the-art speech enhancement results and improves performance on higher-level speech tasks, including automatic speech recognition and speaker recognition.
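To make the idea of "context at different granularities" concrete, the following is a minimal illustrative sketch, not the CP-GAN implementation. It assumes a waveform stored as a plain list of samples and a simple halving scheme for segment length; the names `context_pyramid` and `segment_energy` are hypothetical and do not come from the paper.

```python
from typing import List


def context_pyramid(signal: List[float], levels: int = 3) -> List[List[List[float]]]:
    """Split a waveform into segments at progressively finer granularity.

    Level 0 holds the whole signal (global context). Each further level
    doubles the number of segments, so each segment covers a more local
    context. The halving scheme is an assumption made for illustration.
    """
    if levels < 1:
        raise ValueError("levels must be >= 1")
    pyramid = []
    for level in range(levels):
        n_segments = 2 ** level
        seg_len = len(signal) // n_segments
        if seg_len == 0:
            break  # signal too short for a finer level
        # Trailing samples are dropped when the length is not divisible.
        segments = [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]
        pyramid.append(segments)
    return pyramid


def segment_energy(segment: List[float]) -> float:
    """Mean squared amplitude of one segment (a simple per-segment statistic)."""
    return sum(x * x for x in segment) / len(segment)


if __name__ == "__main__":
    wave = [0.0, 0.1, -0.2, 0.3, 0.9, -0.8, 0.1, 0.0]
    for level, segments in enumerate(context_pyramid(wave)):
        print(level, [segment_energy(s) for s in segments])
```

In a setup like this, a coarse level summarizes the utterance as a whole while finer levels expose short, locally noisy stretches; how CP-GAN actually selects and scores such contexts is described in the paper itself, not here.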
Keywords
speech enhancement, generative adversarial network, context pyramid