CI-Mix: cut instance mix for robust speaker verification

International Journal of Speech Technology (2023)

Abstract
Data augmentation is commonly used to build robust speaker verification systems, especially when resources are limited. In this paper, we generalize the idea of CutMix to cut instance mix (CI-Mix) for augmenting the training data of speaker verification. The augmentation policy cuts and pastes patches between instances within a minibatch, both in the time domain and in the deep feature embeddings, without increasing the training data size or the computational cost. We apply CI-Mix to the widely used ECAPA-TDNN and ECAPA-CNN-TDNN end-to-end speaker verification systems. Our experiments are performed on both the VoxCeleb and VoxMovies tasks. Extensive results show that the proposed CI-Mix outperforms state-of-the-art speech data augmentation methods such as SpecAugment and Mixup, and that it is strongly complementary to these methods. Combined with simple additive-noise augmentation, CI-Mix achieves relative EER reductions of up to 19.7% on VoxCeleb1-O, and 16.5%, 10.9%, 10.0%, 10.9% and 13.1% on the VoxMovies E-1 to E-5 test sets, compared with the baseline systems.
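The time-domain cut-and-paste step described in the abstract might look roughly like the following sketch. It is not the authors' implementation: the function name `cut_mix_waveforms`, the `max_ratio` parameter, the choice of one shared patch per minibatch, and the duration-weighted label mixing (borrowed from the original CutMix recipe) are all assumptions made for illustration. The embedding-level variant is not shown.

```python
import numpy as np


def cut_mix_waveforms(batch, labels, num_classes, rng, max_ratio=0.5):
    """CutMix-style augmentation on a minibatch of equal-length waveforms.

    Hypothetical helper, not taken from the paper.
    batch:  float array of shape (B, T), one waveform per row
    labels: int array of shape (B,), speaker indices
    Returns (mixed_batch, soft_labels); soft_labels has shape (B, num_classes).
    """
    batch = np.asarray(batch, dtype=np.float32)
    labels = np.asarray(labels)
    num_utts, num_samples = batch.shape

    # Pair each utterance with another utterance from the same minibatch.
    perm = rng.permutation(num_utts)

    # Assumed policy: one random patch shared by the whole minibatch,
    # covering at most max_ratio of the utterance.
    cut_len = int(rng.integers(0, int(max_ratio * num_samples) + 1))
    start = int(rng.integers(0, num_samples - cut_len + 1))

    # Paste the partner's patch over the same time span; the batch size
    # and utterance length stay unchanged.
    mixed = batch.copy()
    mixed[:, start:start + cut_len] = batch[perm, start:start + cut_len]

    # Assumed label rule: weights proportional to kept / pasted duration.
    lam = 1.0 - cut_len / num_samples
    one_hot = np.eye(num_classes, dtype=np.float32)
    soft = lam * one_hot[labels] + (1.0 - lam) * one_hot[labels[perm]]
    return mixed, soft
```

Because the patch is swapped between utterances already in the minibatch, the sketch produces exactly as many training examples as it receives, which matches the abstract's point that the training data size does not grow.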
Keywords
Speaker verification, Data augmentation, Cut instance mix