A Deep Scattering Spectrum - Deep Siamese Network Pipeline For Unsupervised Acoustic Modeling
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016
Abstract
Recent work has explored deep architectures for learning acoustic features in an unsupervised or weakly-supervised way for phone recognition. Here we investigate the role of the input features, and in particular we test whether standard mel-scaled filterbanks could be replaced by inherently richer representations, such as those derived from an analytic scattering spectrum. We train a Siamese network with lexical side information, similar to a well-performing architecture used in the Zero Resource Speech Challenge (2015), and show a substantial improvement when the filterbanks are replaced by scattering features, even though the two feature types yield similar performance when tested without training. This shows that unsupervised and weakly-supervised architectures can benefit from richer features than the traditional ones.
Keywords
speech recognition, scattering transform, siamese network, ABnet, ABX
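
To make the Siamese setup in the abstract concrete, the sketch below shows one pairwise loss of the kind used with ABnet-style models: embeddings of two frames from the same word are pulled together in cosine similarity, and embeddings from different words are pushed toward orthogonality. The abstract does not state the loss used in this paper, so the cosine-based form here is an assumption, and the function names `cosine` and `pair_loss` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(emb_a, emb_b, same):
    """Assumed cosine-based Siamese loss (not confirmed by the abstract).

    same=True  : (1 - cos) / 2, which is zero when the embeddings align.
    same=False : cos ** 2, which is zero when the embeddings are orthogonal.
    """
    c = cosine(emb_a, emb_b)
    if same:
        return (1.0 - c) / 2.0
    return c * c
```

In such a setup, both inputs would pass through the same network (shared weights), and the "same" or "different" label would come from the lexical side information, for example whether the two frames were aligned within matching word tokens.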