Exploiting Music Source Separation For Singing Voice Detection

Francesco Bonzi, Michele Mancusi, Simone Del Deo, Pierfrancesco Melucci, Maria Stella Tavella, Loreto Parisi, Emanuele Rodolà

2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP) (2023)

Abstract
Singing voice detection (SVD) is an essential task in many music information retrieval (MIR) applications. Deep learning methods have shown promising results for SVD, but further performance improvements are desirable because SVD underlies many other tasks. This work proposes a novel SVD system that combines a state-of-the-art music source separator (Demucs) with two downstream models: a Long-term Recurrent Convolutional Network (LRCN) and a Transformer network. Our work highlights two main aspects: the impact of a music source separation model, such as Demucs, and its zero-shot capabilities for the SVD task; and the potential for deep learning to improve the system’s performance further. We evaluate our approach on three datasets (Jamendo Corpus, MedleyDB, and MIR-1K) and compare the two models against a baseline root mean square (RMS) algorithm and the current state of the art on the Jamendo Corpus dataset.
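The abstract does not say how the RMS baseline is computed. One plausible reading, offered only as a minimal sketch, is to take the vocal stem already produced by a source separator, compute frame-wise RMS energy, and mark a frame as "singing" when that energy exceeds a threshold. The function name `rms_vocal_activity`, the frame and hop sizes, and the threshold value below are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rms_vocal_activity(vocals, frame_len=2048, hop_len=1024, threshold=0.01):
    """Frame-wise RMS thresholding on a mono vocal stem (hypothetical baseline).

    `vocals` is assumed to be the separated vocal track as a 1-D float array.
    Returns one boolean per frame: True where the frame RMS exceeds `threshold`.
    """
    vocals = np.asarray(vocals, dtype=np.float64)
    if vocals.ndim != 1:
        raise ValueError("expected a mono 1-D signal")
    # Pad very short inputs so that at least one full frame exists.
    if len(vocals) < frame_len:
        vocals = np.pad(vocals, (0, frame_len - len(vocals)))
    n_frames = 1 + (len(vocals) - frame_len) // hop_len
    starts = np.arange(n_frames) * hop_len
    frames = np.stack([vocals[s:s + frame_len] for s in starts])
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return rms > threshold
```

Applied to a stem that is silent for its first half and contains a tone in its second half, the early frames would come out False and the late frames True. In practice the threshold would need tuning per dataset, since residual accompaniment leaking into the separated stem raises the energy floor.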
Keywords
Singing Voice Detection, Music Source Separation, Demucs, Zero-shot Learning