A topological description of loss surfaces based on Betti Numbers
CoRR (2024)
Abstract
In the context of deep learning models, attention has recently been paid to
studying the surface of the loss function in order to better understand
training with methods based on gradient descent. This search for an appropriate
description, both analytical and topological, has led to numerous efforts to
identify spurious minima and characterize gradient dynamics. Our work aims to
contribute to this field by providing a topological measure to evaluate loss
complexity in the case of multilayer neural networks. We compare deep and
shallow architectures with common sigmoidal activation functions by deriving
upper and lower bounds on the complexity of their loss function and revealing
how that complexity is influenced by the number of hidden units, training
models, and the activation function used. Additionally, we found that certain
variations in the loss function or model architecture, such as adding an
ℓ_2 regularization term or implementing skip connections in a feedforward
network, do not affect loss topology in specific cases.
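The topological measure the abstract refers to is based on Betti numbers of the loss surface. As an illustrative sketch only (not the paper's actual construction), the zeroth Betti number b_0 of a sublevel set {θ : L(θ) ≤ c} counts its connected components, i.e. how many separate low-loss basins exist at level c. For a toy one-dimensional "loss" sampled on a grid, this reduces to counting maximal runs of grid points below the threshold; the function `sublevel_b0` and the example loss below are hypothetical choices for demonstration.

```python
import numpy as np

def sublevel_b0(values, c):
    """Zeroth Betti number (number of connected components) of the
    sublevel set {x : f(x) <= c} for a function sampled on a 1-D grid:
    count maximal runs of consecutive grid points whose value is <= c."""
    below = values <= c
    # A component starts wherever `below` switches from False to True
    # (or is True at the very first grid point).
    starts = below & ~np.concatenate(([False], below[:-1]))
    return int(starts.sum())

# Toy 1-D "loss" with several separate basins (illustrative only).
xs = np.linspace(-2.0, 2.0, 2001)
loss = np.sin(5 * xs) ** 2 + 0.05 * xs ** 2
print(sublevel_b0(loss, 0.2))  # number of distinct low-loss basins at level c = 0.2
```

A more complex loss surface yields more components (and, in higher dimensions, more higher-order Betti numbers) across sublevel sets, which is the intuition behind using these counts as a complexity measure.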