NAS-DYMC: NAS-Based Dynamic Multi-Scale Convolutional Neural Network for Sound Event Detection

Jun Wang, Peng Yao, Feng Deng, Jianchao Tan, Chengru Song, Xiaorui Wang

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract

CNN+RNN models have become the mainstream approach to semi-supervised sound event detection, where the CNN part is typically a stack of 2D convolutional layers that extracts representations from time-frequency features. However, conventional 2D convolution has limited ability to capture detailed information about acoustic events. In this paper, to enhance the representation ability of the CNN, we propose NAS-DYMC, a NAS-based dynamic multi-scale convolutional neural network that extracts a more effective acoustic representation. Specifically, multi-scale convolution captures the characteristics of sound events with different time-frequency distributions, and dynamic convolution enhances the representation capability of conventional convolution by applying attention weights to a set of basis kernels. Furthermore, a neural architecture search (NAS) method is adopted to find the optimal network architecture for the DCASE 2021 Task 4 dataset from a search space consisting of various dynamic multi-scale convolutions. Experimental results demonstrate the superiority of our proposed method.
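
The abstract does not give the layer definitions, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: a generic dynamic convolution that mixes K basis kernels with softmax attention weights computed from the input, and a multi-scale wrapper that runs several such branches with different kernel shapes. The single-channel input, the scalar gating parameters `gate_w` and `gate_b`, and all function names are hypothetical and are not taken from the paper. The NAS step is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(x, gate_w, gate_b):
    """Assumed gating: global average pool, then one linear map to K logits."""
    pooled = sum(sum(row) for row in x) / (len(x) * len(x[0]))
    return softmax([w * pooled + b for w, b in zip(gate_w, gate_b)])


def aggregate_kernel(kernels, weights):
    """Attention-weighted sum of K basis kernels that share one shape."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    return [[sum(a * k[i][j] for a, k in zip(weights, kernels))
             for j in range(kw)] for i in range(kh)]


def conv2d_same(x, kernel):
    """Single-channel, zero-padded cross-correlation (odd kernel sizes assumed)."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    ii, jj = i + di - ph, j + dj - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += x[ii][jj] * kernel[di][dj]
            out[i][j] = acc
    return out


def dynamic_conv(x, kernels, gate_w, gate_b):
    """Convolve x with a kernel assembled from the basis kernels for this input."""
    weights = attention_weights(x, gate_w, gate_b)
    return conv2d_same(x, aggregate_kernel(kernels, weights)), weights


def dynamic_multi_scale(x, branches):
    """Run one dynamic convolution per kernel shape; outputs act as channels.

    branches: list of (kernels, gate_w, gate_b) tuples, one per scale.
    """
    return [dynamic_conv(x, k, gw, gb)[0] for k, gw, gb in branches]
```

In this sketch, a branch with 3x3 basis kernels and a branch with 1x5 basis kernels would respond to different time-frequency extents of the same spectrogram patch, and each branch's effective kernel changes with the input because the attention weights depend on the pooled input.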

Keywords

dynamic convolution, multi-scale convolution, semi-supervised sound event detection, neural architecture search
