An MVDR-Embedded U-Net Beamformer for Effective and Robust Multichannel Speech Enhancement

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
In multichannel speech enhancement (SE) systems, deep neural networks (DNNs) are often used to directly estimate the clean speech for effective beamforming. This approach, however, may not generalize adequately to new acoustic or noise conditions. Alternatively, DNNs can perform SE indirectly by predicting time-frequency masks of the speech and noise patterns to assist classic statistical beamformers. Although this indirect approach is robust, its effectiveness is constrained by the downstream statistical component, which relies on certain modeling assumptions, e.g., covariance-based modeling in the minimum-variance-distortionless-response (MVDR) beamformer. In this paper, we propose a novel integration of the two methodologies by embedding an intra-MVDR module in the U-Net beamformer, combining the merits of both, i.e., effectiveness and robustness. Experiments show that intra-MVDR leads to improvements that are not achievable by simply enlarging the baseline SE network.
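For orientation, the mask-assisted statistical pipeline mentioned in the abstract can be sketched as below. This is a minimal NumPy sketch of a conventional mask-based MVDR beamformer, not the intra-MVDR module proposed in the paper. The function names, array shapes, reference-channel formulation, and diagonal loading are all assumptions made for illustration.

```python
import numpy as np

def masked_covariance(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance per frequency bin (hypothetical helper).

    stft: complex array, shape (channels, freqs, frames)
    mask: real array in [0, 1], shape (freqs, frames)
    returns: complex array, shape (freqs, channels, channels)
    """
    weighted = stft * mask[None, :, :]
    # Sum over frames of mask * x x^H, separately for each frequency bin.
    cov = np.einsum("cft,dft->fcd", weighted, stft.conj())
    norm = mask.sum(axis=-1) + eps
    return cov / norm[:, None, None]

def mvdr_weights(cov_speech, cov_noise, ref_channel=0, diag_load=1e-6):
    """MVDR weights w = (Phi_n^{-1} Phi_s / trace(Phi_n^{-1} Phi_s)) u,
    where u is a one-hot vector selecting the reference channel."""
    n_ch = cov_noise.shape[-1]
    # Diagonal loading (an assumption) keeps the noise covariance invertible.
    loaded = cov_noise + diag_load * np.eye(n_ch)[None]
    numerator = np.linalg.solve(loaded, cov_speech)  # (freqs, ch, ch)
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    return numerator[..., ref_channel] / (trace[:, None] + 1e-8)

def apply_beamformer(weights, stft):
    """Beamformed output y[f, t] = w[f]^H x[f, t]."""
    return np.einsum("fc,cft->ft", weights.conj(), stft)
```

In a mask-based system, a DNN would supply the speech and noise masks passed to `masked_covariance`; the covariance model inside `mvdr_weights` is the kind of statistical assumption the abstract identifies as a limiting factor.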
Keywords
Multichannel speech enhancement, neural beamforming, MVDR, time-frequency mask, spatial filtering