Speech/Music Discrimination Using Hybrid-Based Feature Extraction for Audio Data Indexing

Kun-Ching Wang, Yung-Ming Yang, Ying-Ru Yang

2017 International Conference on System Science and Engineering (ICSSE) (2017)

Abstract

In this paper, we present a speech/music discrimination (SMD) approach that uses hybrid feature extraction to separate a noisy audio signal into speech and music. The hybrid-based SMD combines 1D signal processing with 2D image processing to extract multiple features. In general, a noisy audio segment can be regarded as music, speech, or noise (silence). The proposed hybrid-based SMD approach has been applied to audio data indexing, where it classifies the noisy audio signal into speech, music, and noise. The approach comprises three main stages: pre-processing with voice activity detection (VAD), speech/music discrimination (SMD), and rule-based post-processing. Pre-processing and VAD together form the first stage, which divides the audio recording stream into noise-only segments and noisy audio segments. The hybrid-based SMD forms the second stage, which classifies the noisy audio segments into speech segments and music segments. In the third stage, a rule-based post-filtering method is applied to improve discrimination accuracy and to reflect the continuity of audio data in time. Experimental results show that the proposed hybrid-based SMD approach can be applied successfully to audio data indexing. The overall system accuracy is evaluated on radio recordings from various sources, and a comparison of classification performance with existing methods on the envisaged tasks is given.
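
The abstract does not specify the rules used in the third stage. As a rough illustration of how a post-filter can enforce temporal continuity and produce an index, the sketch below assumes that stages one and two have already emitted one label per fixed-length segment, and it substitutes a simple majority vote over a sliding window for the paper's rules. The names `smooth_labels` and `to_index`, the `window` size, and the `segment_seconds` duration are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import groupby


def smooth_labels(labels, window=3):
    """Majority-vote smoothing over a centered window.

    Assumption: this stands in for the paper's rule-based post-filter,
    whose actual rules are not given in the abstract.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        # Clip the window at the start and end of the recording.
        neighbours = labels[max(0, i - half):i + half + 1]
        counts = Counter(neighbours)
        best = max(counts.values())
        if counts[current] == best:
            # Keep the original label when it wins or ties the vote.
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


def to_index(labels, segment_seconds=1.0):
    """Merge runs of equal labels into (label, start, end) index entries.

    Assumption: every segment has the same duration, segment_seconds.
    """
    index = []
    position = 0
    for label, run in groupby(labels):
        length = len(list(run))
        start = position * segment_seconds
        end = (position + length) * segment_seconds
        index.append((label, start, end))
        position += length
    return index


if __name__ == "__main__":
    raw = ["speech", "speech", "music", "speech", "speech",
           "noise", "noise", "noise"]
    for entry in to_index(smooth_labels(raw, window=3)):
        print(entry)
```

With the sample labels in the block, the isolated `music` decision between speech segments is overruled by its neighbours, so the index contains one speech entry followed by one noise entry. A wider window smooths more aggressively at the cost of blurring genuine short transitions.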

Keywords

spectrogram image, speech/music classification, wavelet packet, support vector machine, hybrid-based feature extraction