Spectral warping based data augmentation for low resource children’s speaker verification

MULTIMEDIA TOOLS AND APPLICATIONS(2023)

引用 0|浏览1
暂无评分
摘要
In this paper, we present our effort to develop an automatic speaker verification (ASV) system for low resources children’s data. For the children’s speakers, very limited amount of speech data is available in majority of the languages for training the ASV system. Developing an ASV system under low resource conditions is a very challenging problem. To develop the robust baseline system, we merged out of domain adults’ data with children’s data to train the ASV system and tested with children’s speech. This kind of system leads to acoustic mismatches between training and testing data. To overcome this issue, we have proposed spectral warping based data augmentation. We modified adult speech data using spectral warping method (to simulate like children’s speech) and added it to the training data to overcome data scarcity and mismatch between adults’ and children’s speech. The proposed data augmentation gives 20.46% and 52.52% relative improvement (in equal error rate) for Indian Punjabi and British English speech databases, respectively. We compared our proposed method with well known data augmentation methods: SpecAugment, speed perturbation (SP) and vocal tract length perturbation (VTLP), and found that the proposed method performed best. The proposed spectral warping method is publicly available at https://github.com/kathania/Speaker-Verification-spectral-warping .
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要