Locality defeats the curse of dimensionality in convolutional teacher-student scenarios

Annual Conference on Neural Information Processing Systems (2021)

Abstract
Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge. We study this problem within a teacher-student framework for kernel regression, using 'convolutional' kernels inspired by the neural tangent kernel of simple convolutional architectures of given filter size. Using heuristic methods from physics, we find in the ridgeless case that locality is key in determining the learning-curve exponent β (which relates the test error ε_t ∼ P^(−β) to the size P of the training set), whereas translational invariance is not. In particular, if the filter size of the teacher t is smaller than that of the student s, β is a function of s only and does not depend on the input dimension. We confirm our predictions on β empirically. We conclude by proving, under a natural universality assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to learning-curve exponents similar to those we obtain in the ridgeless case.
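The setup lends itself to a quick numerical check. Below is a minimal sketch of how one might estimate the learning-curve exponent β in a teacher-student kernel regression experiment. It uses a plain Laplace kernel on the sphere as a stand-in for the paper's convolutional NTK-inspired kernels; the input dimension, sample sizes, and the small jitter used to approximate the ridgeless limit are illustrative assumptions, not values from the paper.

```python
# Teacher-student kernel regression sketch: estimate the exponent beta in
# eps(P) ~ P^(-beta). The Laplace kernel below is an illustrative stand-in
# for the paper's convolutional kernels; all sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def laplace_kernel(X, Y, bandwidth=1.0):
    """Pairwise K(x, y) = exp(-||x - y|| / bandwidth)."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.sqrt(np.maximum(d2, 0.0)) / bandwidth)

def teacher_sample(X, kernel):
    """Draw the target function as a Gaussian-process sample with the teacher kernel."""
    K = kernel(X, X) + 1e-8 * np.eye(len(X))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(X))

d = 5                              # input dimension (assumption)
P_grid = [64, 128, 256, 512, 1024] # training-set sizes
n_test = 2000
errors = []
for P in P_grid:
    X = rng.standard_normal((P + n_test, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # data on the unit sphere
    y = teacher_sample(X, laplace_kernel)
    Xtr, ytr, Xte, yte = X[:P], y[:P], X[P:], y[P:]
    # Ridgeless kernel regression: f(x) = K(x, Xtr) K(Xtr, Xtr)^{-1} ytr,
    # with a tiny jitter approximating the zero-ridge limit.
    K = laplace_kernel(Xtr, Xtr) + 1e-8 * np.eye(P)
    alpha = np.linalg.solve(K, ytr)
    pred = laplace_kernel(Xte, Xtr) @ alpha
    errors.append(np.mean((pred - yte) ** 2))

# beta is minus the log-log slope of the learning curve eps(P) ~ P^(-beta).
beta = -np.polyfit(np.log(P_grid), np.log(errors), 1)[0]
print(f"estimated learning-curve exponent beta ≈ {beta:.2f}")
```

Fitting the log-log slope of the test error over a grid of training-set sizes recovers β; to probe the paper's actual claim one would substitute the convolutional teacher and student kernels for laplace_kernel and vary the filter sizes t and s.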
Keywords
analysis of algorithms, deep learning, learning theory, machine learning