Time-and-Frequency Fusion based on Multi-scale convolution for Speech separation in intelligent manufacturing

Linfeng Jia, Liming Wu, Tengteng Wen, Yutao Liao, Li Wang, Zihao Gao

2022 International Conference on Intelligent Manufacturing, Advanced Sensing and Big Data (IMASBD) (2022)


Abstract

In intelligent manufacturing, machines with speech separation ability can make human-computer interaction more efficient, which supports the rapid development of the intelligent manufacturing industry. In single-channel speech separation based on deep learning, time-domain features perform better than frequency-domain features. However, current methods based on time-domain features are not robust in real noise environments, and relying on time-domain features alone limits the performance of the separation model. Therefore, we propose a Time-and-Frequency Fusion based on Multi-scale Convolution model (Tff-MscNet), which integrates time-domain and frequency-domain features to enrich the multidimensional information extracted from the data. To further improve the performance of the separation network, we introduce a multi-scale convolution block that strengthens the network's feature extraction ability. We compare the proposed model with the Conv-TasNet baseline and with the latest time-frequency fusion speech separation baseline on the GRID speech dataset. Experiments show that the proposed method greatly improves both performance and robustness in an experimental environment with real noise.
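The two ideas named in the abstract, multi-scale convolution and time-frequency feature fusion, can be illustrated with a minimal standard-library Python sketch. This is not the Tff-MscNet architecture, which the abstract does not specify. Everything here is an assumption made for illustration: the function names, the kernel sizes (3, 5, 7), the fixed averaging kernels standing in for learned filters, the magnitude DFT as the frequency-domain feature, and concatenation as the fusion step.

```python
import cmath
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (k - 1 - pad)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(3, 5, 7)):
    """One feature channel per kernel size (hypothetical averaging kernels,
    standing in for the learned filters a real network would use)."""
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


def frame_magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame, bins 0..n/2 (frequency-domain feature)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n)))
        for f in range(n // 2 + 1)
    ]


def fuse_time_frequency(frame, kernel_sizes=(3, 5, 7)):
    """Assumed fusion step: concatenate the multi-scale time-domain channels
    with the magnitude spectrum of the same frame."""
    time_feats = [
        value
        for channel in multi_scale_features(frame, kernel_sizes)
        for value in channel
    ]
    return time_feats + frame_magnitude_spectrum(frame)


# Toy input: a 16-sample sine wave completing two cycles per frame.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
spectrum = frame_magnitude_spectrum(frame)
fused = fuse_time_frequency(frame)
```

For the toy frame, the fused vector holds three time-domain channels of 16 samples each plus 9 frequency bins, and the spectrum peaks at bin 2, the frequency of the sine wave. A learned model would replace the averaging kernels with trained filters and would typically fuse feature maps inside the network rather than concatenating raw vectors.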

Keywords

Intelligent manufacturing, speech separation, feature fusion, multi-scale convolution, time-frequency domain features
