Discriminative Feature Extraction Based on Sequential Variational Autoencoder for Speaker Recognition.
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (2018)
Abstract
This paper presents an extended version of the variational autoencoder (VAE) for sequence modeling. In contrast to the original VAE, the proposed model can directly handle variable-length observation sequences. Furthermore, the discriminative model and the generative model are simultaneously learned in a unified framework. The network architecture of the proposed model is inspired by the i-vector/PLDA framework, whose effectiveness has been proven in sequence modeling tasks such as speaker recognition. Experimental results on the TIMIT database show that the proposed model outperforms the traditional i-vector/PLDA system.
Keywords
discriminative feature extraction, sequential variational autoencoder, speaker recognition, variable-length observation sequences, discriminative model, generative model, network architecture, i-vector/PLDA framework, sequence modeling tasks, i-vector/PLDA system, VAE