Deep Vs. Wide: Depth On A Budget For Robust Speech Recognition

Proceedings of the 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013), Vols. 1-5, 2013

Cited by 30 | Views 34
Abstract
It has now been established that incorporating neural networks can be useful for speech recognition, and that machine learning methods make it practical to train a larger number of hidden layers in a "deep" structure. Here we add the constraint of a fixed number of parameters for a given task, which in many applications corresponds to practical limits on storage or computation. Under this constraint, we vary the size of each hidden layer as we change the number of layers, keeping the total number of parameters constant. In this way we determine, for a common noisy speech recognition task (Aurora2), that a large number of layers is not always optimal; for each noise level there is an optimal number of layers. We also use state-of-the-art optimization algorithms to better understand the effect of initialization and the convergence properties of such networks, and to obtain an efficient implementation that lets us run many experiments on a standard desktop machine with a single GPU.
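The fixed-budget constraint described in the abstract amounts to solving for the hidden-layer width given a layer count. Below is a minimal sketch of that computation, assuming a plain fully connected network with equal-width hidden layers and bias terms; the feature dimension, output-state count, and 2M-parameter budget are illustrative assumptions, not figures taken from the paper.

```python
import math

def hidden_width(n_layers, d_in, d_out, budget):
    """Solve for the hidden-layer width h (all hidden layers equal) that
    keeps a fully connected network at roughly `budget` parameters.

    Parameter count (weights + biases):
        d_in*h + h                  # input -> first hidden
      + (n_layers - 1)*(h*h + h)    # hidden -> hidden
      + h*d_out + d_out             # last hidden -> output
    which is the quadratic (n_layers-1)*h^2 + (d_in+d_out+n_layers)*h + d_out.
    """
    a = n_layers - 1
    b = d_in + d_out + n_layers
    c = d_out - budget
    if a == 0:
        # Single hidden layer: the equation is linear, b*h + c = 0.
        return max(1, round(-c / b))
    # Positive root of the quadratic a*h^2 + b*h + c = 0.
    h = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return max(1, round(h))

# Illustrative Aurora2-style setup (hypothetical numbers): 39-dim features
# with a 9-frame context window, 183 output states, 2M-parameter budget.
d_in, d_out, budget = 39 * 9, 183, 2_000_000
for n_layers in (1, 2, 3, 4, 5):
    print(f"{n_layers} hidden layer(s) -> width {hidden_width(n_layers, d_in, d_out, budget)}")
```

Under this scheme, adding layers forces each layer to be narrower, which is what allows depth itself, rather than raw parameter count, to be compared across noise levels.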
Keywords
deep learning,neural networks,robust speech recognition