
WA-Transformer: Window Attention-based Transformer with Two-stage Strategy for Multi-task Audio Source Separation

Conference of the International Speech Communication Association (INTERSPEECH), 2022

Abstract
The standard Conformer adopts convolution layers to exploit local features. However, one-dimensional convolution ignores the correlation between adjacent time-frequency features. In this paper, we design a two-dimensional window attention block with dilation and propose a window attention-based Transformer network (WA-Transformer) for multi-task audio source separation. WA-Transformer adopts self-attention and window attention blocks to model global dependencies and local correlations in a parameter-efficient way. It also follows a two-stage pipeline: the first stage separates the mixture into three types of audio signals, and the second stage performs signal compensation. Experiments demonstrate the effectiveness of WA-Transformer: it achieves signal-to-distortion ratio improvements of 13.86 dB, 12.22 dB, and 11.21 dB on the speech, music, and noise tracks, respectively, and outperforms several well-known models.
Key words
multi-task audio source separation, WA-Transformer, two-dimensional window attention
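
As an illustration of the two-dimensional window attention with dilation described in the abstract, below is a minimal PyTorch sketch of such a block operating on time-frequency feature maps. The class name, window size, dilation, and head count are illustrative assumptions for the sketch, not the paper's actual implementation or reported settings.

```python
# Minimal sketch of a dilated 2D window attention block, assuming PyTorch
# and features of shape (batch, channels, time, freq). Window size,
# dilation, and head count are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DilatedWindowAttention2D(nn.Module):
    def __init__(self, dim, window=4, dilation=2, heads=4):
        super().__init__()
        self.window = window
        self.dilation = dilation
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (B, C, T, F) time-frequency features
        B, C, T, Fr = x.shape
        w, d = self.window, self.dilation
        span = w * d
        # Pad so both axes divide evenly into dilated window spans.
        pad_t = (span - T % span) % span
        pad_f = (span - Fr % span) % span
        x = F.pad(x, (0, pad_f, 0, pad_t))
        Tp, Fp = x.shape[2], x.shape[3]
        # Partition into dilated 2D windows: positions inside a window are
        # `dilation` steps apart along both the time and frequency axes.
        x = x.view(B, C, Tp // span, w, d, Fp // span, w, d)
        x = x.permute(0, 2, 4, 5, 7, 3, 6, 1)   # B, nT, d, nF, d, w, w, C
        x = x.reshape(-1, w * w, C)              # each window is a short sequence
        # Multi-head self-attention within each window.
        qkv = self.qkv(x).reshape(x.shape[0], w * w, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(x.shape[0], w * w, C)
        out = self.proj(out)
        # Undo the window partition and drop the padding.
        out = out.view(B, Tp // span, d, Fp // span, d, w, w, C)
        out = out.permute(0, 7, 1, 5, 2, 3, 6, 4).reshape(B, C, Tp, Fp)
        return out[:, :, :T, :Fr]


if __name__ == "__main__":
    block = DilatedWindowAttention2D(dim=32)
    feats = torch.randn(2, 32, 100, 257)   # e.g. an STFT-like feature map
    print(block(feats).shape)               # torch.Size([2, 32, 100, 257])
```

Restricting attention to dilated local windows keeps the cost quadratic only in the (small) window size while still covering a wider time-frequency neighbourhood than a plain window, which is the parameter-efficient local modelling role the abstract attributes to the window attention blocks.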