Sound Event Detection by Multitask Learning of Sound Events and Scenes with Soft Scene Labels

Keisuke Imoto,Noriyuki Tonami,Yuma Koizumi,Masahiro Yasuda,Ryosuke Yamanishi,Yoichi Yamashita

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING（2020）

引用 44|浏览20

暂无评分

摘要

Sound event detection (SED) and acoustic scene classification (ASC) are major tasks in environmental sound analysis. Considering that sound events and scenes are closely related to each other, some works have addressed joint analyses of sound events and acoustic scenes based on multitask learning (MTL), in which the knowledge of sound events and scenes can help in estimating them mutually. The conventional MTL-based methods utilize one-hot scene labels to train the relationship between sound events and scenes; thus, the conventional methods cannot model the extent to which sound events and scenes are related. However, in the real environment, common sound events may occur in some acoustic scenes; on the other hand, some sound events occur only in a limited acoustic scene. In this paper, we thus propose a new method for SED based on MTL of SED and ASC using the soft labels of acoustic scenes, which enable us to model the extent to which sound events and scenes are related. Experiments conducted using TUT Sound Events 2016/2017 and TUT Acoustic Scenes 2016 datasets show that the proposed method improves the SED performance by 3.80% in F-score compared with conventional MTL-based SED.

查看译文

关键词

Sound event detection, multitask learning, teacher-student learning, acoustic scene classification

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要