Pre-processing spectrogram parameters improve the accuracy of bioacoustic classification using convolutional neural networks

Bioacoustics - The International Journal of Animal Sound and its Recording (2020)

Abstract
A variety of automated classification approaches have been developed to extract species detection information from large bioacoustic datasets. Convolutional neural networks (CNNs) are an image classification technique that can be applied to the spectrogram of an audio recording. Using CNNs for bioacoustic classification removes the need for sophisticated feature extraction techniques; however, CNNs may be sensitive to the parameters used to create spectrograms. We used AlexNet to classify spectrograms of audio clips of birdsong from 19 species. We trained and tested AlexNet with the spectrograms and observed that mean classification accuracy ranged from 88.9% to 96.9% depending on the parameters used to create the spectrogram. Classification accuracy was highest when we used a composite of four spectrograms with different combinations of scales for frequency and amplitude. Classification accuracy also varied depending on the FFT window size of the spectrogram. Overall, our results suggest that optimal spectrogram parameters for CNN classification may differ from those used for human visualization or other classification approaches. We suggest that if spectrogram parameters are appropriately selected, classification accuracy similar to current state-of-the-art methods can be achieved using off-the-shelf software and without the need to extract domain-specific features.
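To make the spectrogram parameters mentioned in the abstract concrete, the following is a minimal NumPy sketch, not the authors' pipeline. It computes a power spectrogram for a chosen FFT window size and tiles four variants (linear or log frequency, combined with linear or log amplitude) into one composite image. The function names, the 2x2 tiling layout, the Hann window, the 50% overlap, and the 128-bin resampling are all assumptions made for illustration; the abstract does not specify how the composite was assembled.

```python
import numpy as np

def power_spectrogram(signal, n_fft, hop=None):
    """Power STFT with a Hann window; returns shape (n_fft // 2 + 1, n_frames)."""
    hop = hop or n_fft // 2  # assumed 50% overlap
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

def resample_frequency(spec, n_bins, scale):
    """Interpolate the frequency axis onto n_bins linear or log-spaced positions."""
    n_freq = spec.shape[0]
    src = np.arange(n_freq)
    if scale == "log":
        target = np.geomspace(1, n_freq - 1, n_bins)  # skip the DC bin
    else:
        target = np.linspace(0, n_freq - 1, n_bins)
    columns = [np.interp(target, src, spec[:, t]) for t in range(spec.shape[1])]
    return np.stack(columns, axis=1)

def normalize(image):
    """Rescale an array to the range [0, 1]."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo) if hi > lo else np.zeros_like(image)

def composite_spectrogram(signal, n_fft=512, n_bins=128):
    """Tile four frequency/amplitude scale combinations into a 2x2 image (assumed layout)."""
    spec = power_spectrogram(signal, n_fft)
    rows = []
    for freq_scale in ("linear", "log"):
        resampled = resample_frequency(spec, n_bins, freq_scale)
        linear_amp = normalize(resampled)
        log_amp = normalize(10.0 * np.log10(resampled + 1e-12))  # decibel scale
        rows.append(np.hstack([linear_amp, log_amp]))
    return np.vstack(rows)

if __name__ == "__main__":
    sr = 22050
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 2000 * t)  # synthetic 2 kHz tone, not real birdsong
    for n_fft in (256, 512, 1024):
        print(n_fft, composite_spectrogram(tone, n_fft=n_fft).shape)
```

Changing `n_fft` trades time resolution against frequency resolution, which is the kind of parameter the abstract reports as affecting accuracy. The composite would still need resizing to whatever input size a given CNN expects.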
Keywords
Autonomous recording unit, birdsong, classification, signal processing, machine learning, spectrogram