Phonetic-Oriented Identification of Twin Speakers Using 4-Second Vowel Sounds and a Combination of a Shift-Invariant Phase Feature (NRD), MFCCs and F0 Information

2019 AES International Conference on Audio Forensics (2019)

Abstract
Automatic speaker identification typically relies on sophisticated statistical modeling and classification, which require large amounts of data for good performance. However, in actual audio forensics casework, often only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers, most of whom are twins. Using only 4.4 s of vowel-like sounds per speaker, we characterize the performance reached with individual features and describe simple yet effective ways of fusing classification scores. Insights for further research are also presented.
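
The abstract does not say which fusion rule is used, so the following is only a minimal sketch of one common form of score-level fusion, not the authors' method. It assumes each feature stream (phase, spectral magnitude, F0) yields one similarity score per candidate speaker, with higher meaning a better match. The function names, the z-normalization step, the equal default weights and all score values are assumptions made up for illustration.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Rescale one stream's scores to zero mean and unit variance.

    Assumption: this makes scores from different features comparable
    before they are added; the abstract does not state a normalization.
    """
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # All candidates scored the same: the stream carries no evidence.
        return {spk: 0.0 for spk in scores}
    return {spk: (s - mu) / sigma for spk, s in scores.items()}


def fuse_scores(streams, weights=None):
    """Weighted sum of z-normalized scores across feature streams.

    Every stream is expected to score the same set of candidate speakers.
    """
    if weights is None:
        weights = {name: 1.0 for name in streams}
    fused = {}
    for name, scores in streams.items():
        for spk, z in z_normalize(scores).items():
            fused[spk] = fused.get(spk, 0.0) + weights[name] * z
    return fused


def identify(streams, weights=None):
    """Return the candidate speaker with the highest fused score."""
    fused = fuse_scores(streams, weights)
    return max(fused, key=fused.get)


# Hypothetical scores for three candidates; spk_a and spk_b stand in for
# a twin pair that the phase stream alone ranks the other way round.
example_streams = {
    "phase": {"spk_a": 0.62, "spk_b": 0.58, "spk_c": 0.20},
    "mfcc": {"spk_a": -41.0, "spk_b": -38.5, "spk_c": -55.0},
    "f0": {"spk_a": 0.30, "spk_b": 0.45, "spk_c": 0.10},
}

best = identify(example_streams)
print(best)
```

With these made-up numbers, the phase stream on its own ranks `spk_a` first, while the fused score ranks `spk_b` first because the other two streams agree with each other. Whether such agreement helps on real twin data is exactly what the paper's experiments characterize; the sketch makes no claim about that.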