A Deep Scattering Spectrum - Deep Siamese Network Pipeline For Unsupervised Acoustic Modeling

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

Abstract
Recent work has explored deep architectures for learning acoustic features in an unsupervised or weakly-supervised way for phone recognition. Here we investigate the role of the input features, and in particular we test whether standard mel-scaled filterbanks could be replaced by inherently richer representations, such as those derived from an analytic scattering spectrum. We use a Siamese network trained with lexical side information, similar to a well-performing architecture used in the Zero Resource Speech Challenge (2015), and show a substantial improvement when the filterbanks are replaced by scattering features, even though the two feature types yield similar performance when tested without training. This shows that unsupervised and weakly-supervised architectures can benefit from richer features than the traditional ones.
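To make the Siamese setup with lexical side information more concrete, here is a minimal sketch of a pairwise loss over two embeddings. The abstract does not give the loss, so the cosine-based form below and the names `cosine` and `siamese_pair_loss` are illustrative assumptions, not the authors' implementation. The idea is that two speech frames labelled as coming from the same word should map to similar embeddings, and frames from different words to dissimilar ones.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def siamese_pair_loss(emb_a, emb_b, same_word):
    """Illustrative pair loss (an assumed form, not taken from the abstract).

    emb_a, emb_b: embeddings produced by the two weight-sharing branches.
    same_word:    lexical side information, True if both inputs come
                  from the same word type.
    """
    c = cosine(emb_a, emb_b)
    if same_word:
        # Same-word pairs: loss falls to 0 as the embeddings align.
        return (1.0 - c) / 2.0
    # Different-word pairs: loss falls to 0 as the embeddings become orthogonal.
    return c * c
```

In this sketch the input features (filterbanks or scattering coefficients) would only change what is fed to the shared network that produces `emb_a` and `emb_b`; the loss itself is the same for both.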
Keywords
speech recognition, scattering transform, siamese network, ABnet, ABX