Evaluating The Robustness Of Privacy-Sensitive Audio Features For Speech Detection In Personal Audio Log Scenarios

Sree Hari Krishnan Parthasarathi, Mathew Magimai-Doss, Herve Bourlard, Daniel Gatica-Perez

Acoustics Speech and Signal Processing (2010)

Abstract

Personal audio logs are often recorded in multiple environments. This poses challenges for robust front-end processing, including speech/nonspeech detection (SND). Motivated by this, we investigate the robustness of four different privacy-sensitive features for SND, namely energy, zero crossing rate, spectral flatness, and kurtosis. We study early and late fusion of these features in conjunction with modeling temporal context. These combinations are evaluated in mismatched conditions on a dataset of nearly 450 hours. While both combinations yield improvements over individual features, feature combinations generally perform better. Comparisons with a state-of-the-art spectral-based feature set and a privacy-sensitive feature set are also provided.
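
To make the four features named in the abstract concrete, here is a minimal sketch that computes each one for a single audio frame. It uses common textbook definitions, which are an assumption: the abstract does not specify the paper's exact formulas, frame sizes, or which signal each feature is computed on. All function names are hypothetical, and the naive DFT is only meant for short demo frames of at least two samples.

```python
import cmath
import math


def energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum over bins 0..n//2 (demo only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum."""
    ps = [p + eps for p in power_spectrum(frame)]
    log_mean = sum(math.log(p) for p in ps) / len(ps)
    return math.exp(log_mean) / (sum(ps) / len(ps))


def kurtosis(frame, eps=1e-12):
    """Fourth standardized moment (about 3 for Gaussian samples)."""
    n = len(frame)
    mean = sum(frame) / n
    var = sum((x - mean) ** 2 for x in frame) / n
    fourth = sum((x - mean) ** 4 for x in frame) / n
    return fourth / (var * var + eps)


def frame_features(frame):
    """Assumed simplest form of early fusion: concatenate the four values."""
    return [energy(frame), zero_crossing_rate(frame),
            spectral_flatness(frame), kurtosis(frame)]
```

In this sketch, early fusion corresponds to feeding the concatenated vector from `frame_features` to one classifier, whereas late fusion would train one detector per feature and combine their outputs; how the paper realizes either scheme is not stated in the abstract.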

Keywords

Privacy Sensitive Features, Speech/nonspeech detection
