Separating musical sources with convolutional sparse coding
Proceedings of the 2nd International Conference on Applications of Intelligent Systems (2019)
Abstract
Separating multiple vocal and instrumental tracks from a single audio waveform is a problem the human auditory cortex solves naturally but that has yet to be effectively implemented computationally. In this paper, we demonstrate a neurally inspired approach to separating bass, drums, vocals, and other instruments from sparse encodings of phase-rich Fourier and Constant-Q representations of stereo musical data. Our sparse encodings are generated from learned features that are both spectrally and temporally convolutional, similar to the hemispheric lateralization of the human auditory cortex. We find that learning from neurally inspired Constant-Q representations provides better separation than Fourier spectrograms, because structure that is convolutional in log-frequency aids the differentiation of instruments.
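To make the core technique concrete, here is a minimal 1-D sketch of convolutional sparse coding via ISTA (iterative shrinkage-thresholding) against a fixed filter dictionary. This is an illustrative assumption, not the paper's method: the paper operates on 2-D spectro-temporal Constant-Q representations and learns its filters, whereas this sketch codes a 1-D signal with hand-built filters.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def csc_ista(x, filters, lam=0.05, step=0.05, n_iter=400):
    """Convolutional sparse coding of a 1-D signal x over fixed filters.

    Minimises 0.5*||x - sum_k conv(z_k, d_k)||^2 + lam*sum_k ||z_k||_1
    over activation maps z_k using ISTA. Hypothetical illustration of
    the general technique; names and parameters are not from the paper.
    """
    N = len(x)
    K, M = filters.shape
    Z = np.zeros((K, N))
    for _ in range(n_iter):
        # Reconstruct the signal as the sum of activations convolved with filters.
        recon = sum(np.convolve(Z[k], filters[k], mode="full")[:N] for k in range(K))
        res = x - recon
        res_pad = np.concatenate([res, np.zeros(M - 1)])
        for k in range(K):
            # Gradient of the data term w.r.t. z_k is minus the correlation
            # of the residual with filter d_k.
            grad = -np.correlate(res_pad, filters[k], mode="valid")  # length N
            Z[k] = soft_threshold(Z[k] - step * grad, step * lam)
    return Z
```

The sparse activation maps `Z` then play the role of the paper's sparse encodings: each unit-norm filter is "shift-invariant" along its axis, so a single learned feature can explain an event wherever it occurs in time (or, on a Constant-Q axis, wherever it occurs in pitch).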
Keywords
constant-Q transform, convolutional sparse coding, source separation