CI-Mix: cut instance mix for robust speaker verification

International Journal of Speech Technology (2023)

Abstract

Data augmentation is commonly used to build robust speaker verification systems, especially when resources are limited. In this paper, we generalize the idea of CutMix to cut instance mix (CI-Mix) for augmenting the training data of speaker verification. The augmentation policy cuts and pastes patches at the instance level, in both the time domain and the deep feature embedding, within a minibatch, without increasing the training data size or the computational cost. We apply CI-Mix to the widely used ECAPA-TDNN and ECAPA-CNN-TDNN end-to-end speaker verification systems. Our experiments are performed on both the VoxCeleb and VoxMovies tasks. Extensive results show that the proposed CI-Mix outperforms state-of-the-art speech data augmentation methods, such as SpecAugment and Mixup, and that it is significantly complementary to these methods. Combined with simple additive-noise augmentation, CI-Mix achieves a relative EER reduction of up to 19.7% on VoxCeleb1-O, and of 16.5%, 10.9%, 10.0%, 10.9% and 13.1% on the VoxMovies E-1 to E-5 test sets, respectively, compared with the baseline systems.
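The abstract does not give the details of the cut-and-paste policy, so the following is only a minimal sketch of a generic CutMix-style operation along the time axis of raw waveforms within one minibatch. It is not the paper's exact CI-Mix procedure: the function name `cut_mix_batch`, the `max_ratio` parameter, the single shared patch position, and the label-mixing weight `lam` are all assumptions made for illustration.

```python
import random


def cut_mix_batch(waveforms, labels, rng, max_ratio=0.5):
    """Illustrative CutMix-style cut-and-paste in the time domain.

    waveforms: list of equal-length lists of samples (one per utterance)
    labels:    list of speaker labels, same length as waveforms
    rng:       a random.Random instance
    max_ratio: assumed upper bound on the fraction of samples replaced

    Returns the mixed waveforms and, per instance, a tuple
    (own_label, partner_label, lam), where lam is the fraction of
    samples kept from the original instance.
    """
    batch_size = len(waveforms)
    num_samples = len(waveforms[0])

    # Pair every instance with a partner from the same minibatch,
    # so no extra training data is needed.
    partners = list(range(batch_size))
    rng.shuffle(partners)

    # Draw one patch (length and position) shared by the whole batch.
    cut_len = int(num_samples * rng.uniform(0.0, max_ratio))
    start = rng.randint(0, num_samples - cut_len)  # inclusive bounds
    end = start + cut_len

    # Paste the partner's patch into each instance; batch size is unchanged.
    mixed = [
        waveforms[i][:start] + waveforms[j][start:end] + waveforms[i][end:]
        for i, j in enumerate(partners)
    ]

    lam = 1.0 - cut_len / num_samples
    targets = [(labels[i], labels[j], lam) for i, j in enumerate(partners)]
    return mixed, targets
```

In standard CutMix training, the loss for each instance would then be weighted as `lam * loss(pred, own_label) + (1 - lam) * loss(pred, partner_label)`; whether CI-Mix uses the same weighting is not stated in the abstract.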

Keywords

Speaker verification, Data augmentation, Cut instance mix
