Multi-Level Local Feature Coding Fusion for Music Genre Recognition

IEEE Access (2020)

Abstract
Music genre recognition (MGR) plays a fundamental role in music indexing and retrieval. Unlike images, music genres consist of immediate characteristics that are highly diversified, with abstractions at different levels. However, most representation learning methods for MGR focus on global features and make decisions from features at a single level. To remedy these defects, we integrate a convolutional neural network (CNN) with NetVLAD and self-attention to capture local information across levels and learn its long-term dependencies. A meta classifier makes the final MGR classification by learning from the aggregated high-level features produced by the different local feature coding networks. Experimental results show that the proposed approach yields higher accuracies than other state-of-the-art models on the GTZAN, ISMIR2004, and Extended Ballroom datasets.
Keywords
Music genre recognition, NetVLAD, self-attention, convolutional neural network, representation learning
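
To make the local feature coding step mentioned in the abstract more concrete, the following is a minimal sketch of generic NetVLAD pooling over a sequence of local descriptors (for example, the time frames of one CNN feature map). It is not the authors' implementation: the function name `netvlad_pool`, the helper names, and the toy parameter values are assumptions made for illustration, and in a trained network the cluster centers, assignment weights, and biases would be learned rather than fixed by hand. Self-attention and the meta classifier are not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def netvlad_pool(features, centers, weights, biases):
    """Aggregate T local descriptors (each of dimension D) into one K*D vector.

    features: T x D local descriptors
    centers:  K x D cluster centers (learned in practice; fixed here)
    weights:  K x D soft-assignment weights (hypothetical values here)
    biases:   K soft-assignment biases (hypothetical values here)
    """
    num_clusters, dim = len(centers), len(centers[0])
    vlad = [[0.0] * dim for _ in range(num_clusters)]
    for x in features:
        # Soft-assign the descriptor to every cluster.
        scores = [
            sum(w * v for w, v in zip(weights[k], x)) + biases[k]
            for k in range(num_clusters)
        ]
        assign = softmax(scores)
        # Accumulate assignment-weighted residuals to each cluster center.
        for k in range(num_clusters):
            for d in range(dim):
                vlad[k][d] += assign[k] * (x[d] - centers[k][d])
    # Intra-normalize each cluster's residual, then flatten and normalize again.
    vlad = [l2_normalize(row) for row in vlad]
    flat = [v for row in vlad for v in row]
    return l2_normalize(flat)


# Toy usage: three 2-dimensional descriptors pooled with two clusters.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
pooled = netvlad_pool(features, centers, weights, biases)
print(len(pooled))
```

The pooled vector has a fixed length of K times D regardless of how many descriptors go in, which is what lets feature maps taken from different CNN levels be encoded separately and then combined by a later classifier.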