A topological description of loss surfaces based on Betti Numbers
CoRR (2024)
Abstract
In the context of deep learning models, attention has recently been paid to
studying the surface of the loss function in order to better understand
training with methods based on gradient descent. This search for an appropriate
description, both analytical and topological, has led to numerous efforts to
identify spurious minima and characterize gradient dynamics. Our work aims to
contribute to this field by providing a topological measure to evaluate loss
complexity in the case of multilayer neural networks. We compare deep and
shallow architectures with common sigmoidal activation functions by deriving
upper and lower bounds on the complexity of their loss function and revealing
how that complexity is influenced by the number of hidden units, training
models, and the activation function used. Additionally, we found that certain
variations in the loss function or model architecture, such as adding an
ℓ_2 regularization term or implementing skip connections in a feedforward
network, do not affect loss topology in specific cases.
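The topological measure the abstract refers to is based on Betti numbers of the loss surface. As an illustrative sketch only (not the paper's actual construction), the zeroth Betti number b_0 of a sublevel set {θ : L(θ) ≤ c} counts its connected components, i.e. how many separate low-loss basins exist at level c. For a toy one-dimensional "loss" sampled on a grid, this reduces to counting maximal runs of grid points below the threshold; the function `sublevel_b0` and the example loss below are hypothetical choices for demonstration.

```python
import numpy as np

def sublevel_b0(values, c):
    """Zeroth Betti number (number of connected components) of the
    sublevel set {x : f(x) <= c} for a function sampled on a 1-D grid:
    count maximal runs of consecutive grid points whose value is <= c."""
    below = values <= c
    # A component starts wherever `below` switches from False to True
    # (or is True at the very first grid point).
    starts = below & ~np.concatenate(([False], below[:-1]))
    return int(starts.sum())

# Toy 1-D "loss" with several separate basins (illustrative only).
xs = np.linspace(-2.0, 2.0, 2001)
loss = np.sin(5 * xs) ** 2 + 0.05 * xs ** 2
print(sublevel_b0(loss, 0.2))  # number of distinct low-loss basins at level c = 0.2
```

A more complex loss surface yields more components (and, in higher dimensions, more higher-order Betti numbers) across sublevel sets, which is the intuition behind using these counts as a complexity measure.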