Fast dropout training.

ICML (2013)

Abstract
Recently, improved classification performance has been achieved by encouraging independent contributions from input features, or equivalently, by preventing feature co-adaptation. In particular, the method proposed in [1], informally called “dropout”, did this by randomly dropping out (zeroing) hidden units and input features for training neural networks. However, sampling a random subset of input features during training makes training much slower. We look at the implied objective function of the dropout training method in the context of logistic regression. Then, instead of doing a Monte Carlo optimization of this objective as in [1], we show how to optimize it more directly by using a Gaussian approximation justified by the central limit theorem and empirical evidence, resulting in a 2-30 times speedup and more stability when it is applicable. We outline potential ways of extending the Gaussian approximation to neural networks and draw some connections to other methods in the literature. Finally, we empirically compare the performance of this method to previously published results, and to baselines. Code to replicate results in this paper will be made available.
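The sketch below illustrates the core idea for logistic regression as described in the abstract: by the central limit theorem, the dropped-out score w·(z⊙x) with Bernoulli(keep_prob) mask z is approximately Gaussian with mean keep_prob·(w·x) and variance keep_prob·(1−keep_prob)·Σᵢ wᵢ²xᵢ², so the expected loss can be estimated from a few reusable Gaussian draws instead of resampling dropout masks. This is a hypothetical minimal implementation, not the authors' code; the function name, arguments, and the choice of plain Monte Carlo over the Gaussian (rather than a closed-form integral) are assumptions for illustration.

```python
import numpy as np

def fast_dropout_logistic_loss(w, X, y, keep_prob=0.5, num_samples=20, seed=0):
    """Sketch: Gaussian approximation to the expected dropout logistic loss.

    For a dropout mask z_i ~ Bernoulli(keep_prob), the score s = w . (z * x) is
    approximately Gaussian (central limit theorem) with
        mean     mu     = keep_prob * (x . w)
        variance sigma2 = keep_prob * (1 - keep_prob) * sum_i w_i^2 x_i^2.
    Labels y are assumed to be +/-1. Hypothetical helper, not the paper's code.
    """
    mu = keep_prob * (X @ w)                                    # E[s] per example
    sigma2 = keep_prob * (1 - keep_prob) * ((X ** 2) @ (w ** 2))  # Var[s] per example
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(num_samples)                      # shared N(0, 1) draws
    s = mu[:, None] + np.sqrt(sigma2)[:, None] * eps[None, :]   # sampled scores
    # Expected logistic loss E[log(1 + exp(-y * s))], averaged over the Gaussian samples.
    return np.logaddexp(0.0, -y[:, None] * s).mean(axis=1).sum()

# Example usage on random data (shapes only; values are arbitrary).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 20))
    y = np.sign(rng.standard_normal(100))
    w = rng.standard_normal(20)
    print(fast_dropout_logistic_loss(w, X, y))
```

Because the Gaussian samples are shared across all examples and do not require redrawing per-feature masks, evaluating this objective is much cheaper than the Monte Carlo dropout objective it approximates.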