NAS-DYMC: NAS-Based Dynamic Multi-Scale Convolutional Neural Network for Sound Event Detection

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract
CNN+RNN models have become the mainstream approach to semi-supervised sound event detection, where the CNN part is typically a stack of 2D convolutional layers that learns representations of the time-frequency features. However, conventional 2D convolution has limited ability to capture detailed information about acoustic events. In this paper, to enhance the representation ability of the CNN, we propose NAS-DYMC, a NAS-based dynamic multi-scale convolutional neural network that extracts a more effective acoustic representation. Specifically, multi-scale convolution captures the characteristics of sound events with different time-frequency distributions, and dynamic convolution enhances the representation capability of conventional convolution by applying attention weights to basis kernels. Furthermore, a neural architecture search (NAS) method is adopted to find the optimal network architecture for the DCASE 2021 Task 4 dataset from a search space consisting of various dynamic multi-scale convolutions. Experimental results demonstrate the superiority of our proposed method.
Keywords
dynamic convolution,multi-scale convolution,semi-supervised sound event detection,neural architecture search
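
The two building blocks named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it assumes a single-channel time-frequency input, attention computed from a global average of the input through a hypothetical per-kernel linear scoring (`proj`), and multi-scale branches whose outputs are simply stacked. All function names and parameters here are illustrative, and the NAS step is not shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(x, proj):
    """Assumed attention: global-average-pool x, then score each basis
    kernel with a hypothetical (weight, bias) pair and apply softmax."""
    mean = sum(sum(row) for row in x) / (len(x) * len(x[0]))
    return softmax([w * mean + b for w, b in proj])

def aggregate_kernels(kernels, weights):
    """Attention-weighted sum of same-sized basis kernels."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels))
             for j in range(cols)] for i in range(rows)]

def conv2d_same(x, kernel):
    """Zero-padded, stride-1 cross-correlation (odd kernel sizes)."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < h and 0 <= c < w:
                        acc += kernel[a][b] * x[r][c]
            out[i][j] = acc
    return out

def dynamic_conv(x, kernels, proj):
    """Dynamic convolution: one input-dependent kernel per input."""
    weights = attention_weights(x, proj)
    return conv2d_same(x, aggregate_kernels(kernels, weights))

def dynamic_multiscale_conv(x, branches):
    """Run one dynamic convolution per kernel size; stacking the branch
    outputs is an assumption, not the paper's stated fusion rule."""
    return [dynamic_conv(x, kernels, proj) for kernels, proj in branches]

# Usage: a 2x2 "spectrogram" with a 1x1 branch and a 3x3 branch.
x = [[1.0, 2.0], [3.0, 4.0]]
identity3 = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
branches = [
    ([[[2.0]], [[4.0]]], [(0.0, 0.0), (0.0, 0.0)]),    # uniform attention
    ([identity3, identity3], [(0.5, 0.0), (-0.5, 0.1)]),
]
features = dynamic_multiscale_conv(x, branches)
```

With zero scoring parameters the attention is uniform, so the first branch averages its two 1x1 kernels; in the second branch both basis kernels are the identity, so the output reproduces the input whatever the attention weights are.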