Convolutional networks for speech detection

Somsak Sukittanon, Arun C. Surendran, John C. Platt, Christopher J. C. Burges

INTERSPEECH (2004)

Abstract

In this paper, we introduce a new framework for speech detection using convolutional networks. We propose a network architecture that can incorporate long- and short-term temporal and spectral correlations of speech in the detection process. The proposed design is able to address many shortcomings of existing speech detectors in a unified new framework. First, it improves the robustness of the system to environmental variability while still being fast to evaluate. Second, it allows for a framework that is extendable to work under different time-scales for different applications. Finally, it is discriminative and produces reliable estimates of the probability of presence of speech in each frame for a wide variety of noise conditions. We propose that the inputs to the system be features that are measures of the true signal-to-noise ratio of a set of frequency bands of the signal. These can be easily and automatically generated by tracking the noise spectrum online. We present preliminary results on the AURORA database to demonstrate the effectiveness of the detector over conventional Gaussian detectors.
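The abstract describes the input features only at a high level: per-band signal-to-noise ratios obtained by tracking the noise spectrum online. As a rough illustration of that idea, the sketch below computes per-band SNR in dB with a very simple noise-floor tracker. The tracker itself (drop immediately, rise slowly), the function name `band_snr_features`, and the parameters `alpha` and `floor` are all assumptions made for this example; the abstract does not specify the paper's estimator, and this should not be read as the authors' method.

```python
import math


def band_snr_features(frames, alpha=0.05, floor=1e-10):
    """Estimate per-band SNR (dB) for a sequence of band-power frames.

    frames: list of frames, each a list of linear band powers.
    alpha:  hypothetical smoothing rate for upward noise-floor updates.
    floor:  small constant that keeps powers strictly positive.

    Assumption: the first frame is treated as noise-only to seed the tracker.
    """
    if not frames:
        return []

    # Seed the per-band noise estimate from the first frame.
    noise = [max(p, floor) for p in frames[0]]
    features = []

    for frame in frames:
        snr_db = []
        for b, power in enumerate(frame):
            power = max(power, floor)
            if power < noise[b]:
                # Follow decreases immediately: the floor cannot exceed the signal.
                noise[b] = power
            else:
                # Rise slowly so short speech bursts barely inflate the floor.
                noise[b] = (1.0 - alpha) * noise[b] + alpha * power
            snr_db.append(10.0 * math.log10(power / noise[b]))
        features.append(snr_db)

    return features
```

With this tracker, a band whose power jumps well above its running floor yields a large positive SNR for that frame, while a band sitting at the floor stays near 0 dB. Frame-by-frame vectors of this kind are the sort of input the abstract proposes feeding to the network.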

Keywords

network architecture, spectrum, signal-to-noise ratio, speech detection
