The boundary of neural network trainability is fractal
CoRR (2024)
Abstract
Some fractals – for instance those associated with the Mandelbrot and
quadratic Julia sets – are computed by iterating a function, and identifying
the boundary between hyperparameters for which the resulting series diverges or
remains bounded. Neural network training similarly involves iterating an update
function (e.g. repeated steps of gradient descent), can result in convergent or
divergent behavior, and can be extremely sensitive to small changes in
hyperparameters. Motivated by these similarities, we experimentally examine the
boundary between neural network hyperparameters that lead to stable and
divergent training. We find that this boundary is fractal over more than ten
decades of scale in all tested configurations.
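The procedure the abstract describes can be sketched in a few lines: choose two hyperparameters, iterate gradient descent from a fixed initialization, and label each hyperparameter pair as convergent or divergent. The snippet below is a minimal NumPy sketch of that idea, not the authors' code; the choice of per-layer learning rates as the two swept hyperparameters follows the paper's setup, while the network width, toy dataset, step count, and divergence threshold are illustrative assumptions.

```python
import numpy as np

def trains_stably(lr0, lr1, steps=500, seed=0):
    """Run full-batch gradient descent on a tiny one-hidden-layer
    network and report whether the loss stays bounded.

    lr0 and lr1 are separate learning rates for the two weight
    matrices, mirroring the paper's two-hyperparameter sweeps;
    all other choices here are illustrative."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((8, 4))           # fixed toy dataset
    y = rng.standard_normal((8, 1))
    W0 = rng.standard_normal((4, 16)) / np.sqrt(4)
    W1 = rng.standard_normal((16, 1)) / np.sqrt(16)
    for _ in range(steps):
        h = np.tanh(X @ W0)                   # forward pass
        err = h @ W1 - y
        loss = np.mean(err ** 2)
        if not np.isfinite(loss) or loss > 1e6:
            return False                      # diverged
        # backward pass (mean-squared-error gradients)
        g1 = h.T @ (2 * err) / len(X)
        gh = (2 * err) @ W1.T * (1 - h ** 2)  # through tanh
        g0 = X.T @ gh / len(X)
        W0 -= lr0 * g0                        # per-layer step sizes
        W1 -= lr1 * g1
    return True

# Classify a grid of (lr0, lr1) pairs; the boundary between the
# stable and divergent regions is the object whose fractal
# structure the paper measures.
lrs = np.logspace(-2, 1, 64)
grid = np.array([[trains_stably(a, b) for b in lrs] for a in lrs])
```

Rendering `grid` as an image and repeatedly zooming into the stable/divergent boundary, at ever finer hyperparameter resolution, is what exposes structure across many decades of scale, analogous to zooming into the Mandelbrot set's boundary.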