Neural-Free Attention for Monaural Speech Enhancement Toward Voice User Interface for Consumer Electronics

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS (2023)

Abstract
The traditional graphical user interface in healthcare-oriented consumer electronics faces challenges such as high operational complexity, time-consuming operation, and a high risk of infection. Adopting a voice user interface (VUI) could promote network automation with enhanced efficiency, greater simplicity, and reduced operating expense in various applications. Given noisy operating environments, speech enhancement is an indispensable component of VUIs for consumer devices. Recently, attention mechanisms have been studied for speech enhancement and have shown promising potential. In this paper, we propose a novel and effective attention module for speech enhancement, called neural-free attention (NFA), a lightweight, plug-and-play module that enables the backbone network to capture the energy distribution of speech signals along frequency-wise channels. In particular, NFA adopts a learnable Gaussian function to perform the excitation operation and produce the attention weights for each frequency channel. NFA is comprehensively evaluated as part of a residual temporal convolution network (ResTCN) backbone on two commonly used training targets. Experimental results show that NFA substantially improves the ResTCN backbone in speech quality and intelligibility with extremely low parameter overhead. Moreover, ResTCN+NFA outperforms several recent baseline models, indicating strong potential for VUIs in consumer devices.
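To make the idea of a Gaussian excitation over frequency channels concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the per-channel statistic is computed or how the Gaussian is parameterized, so the squeeze step (mean energy over time), the function names `gaussian_attention_weights` and `apply_attention`, and the scalar parameters `mu` and `sigma` are all assumptions made for illustration. In a real model, `mu` and `sigma` would be trained along with the backbone.

```python
import math


def gaussian_attention_weights(spectrogram, mu, sigma):
    """Return one attention weight per frequency channel.

    spectrogram: list of frames, each a list of per-frequency magnitudes.
    mu, sigma:   hypothetical parameters of the Gaussian excitation; they
                 would be learned in training, here they are plain floats.
    """
    num_frames = len(spectrogram)
    num_bins = len(spectrogram[0])
    # Squeeze (assumed): mean energy of each frequency channel over time.
    energy = [
        sum(frame[f] ** 2 for frame in spectrogram) / num_frames
        for f in range(num_bins)
    ]
    # Excitation: a Gaussian maps each channel statistic to a weight in (0, 1].
    return [math.exp(-((e - mu) ** 2) / (2.0 * sigma ** 2)) for e in energy]


def apply_attention(spectrogram, weights):
    """Scale every frame by the per-frequency attention weights."""
    return [[w * x for w, x in zip(weights, frame)] for frame in spectrogram]


# Toy input: 2 frames, 2 frequency channels.
spec = [[1.0, 0.0], [1.0, 2.0]]
weights = gaussian_attention_weights(spec, mu=1.0, sigma=1.0)
scaled = apply_attention(spec, weights)
```

In this sketch the excitation adds only two scalar parameters and no dense layers, which is one way to read the "extremely low parameter overhead" claim, though the actual parameter count of NFA is not given in the abstract.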
Keywords
Speech enhancement, neural-free attention, temporal convolution network, consumer voice user interface