Neural Rank Collapse: Weight Decay and Small Within-Class Variability Yield Low-Rank Bias
CoRR (2024)
Abstract
Recent work in deep learning has shown strong empirical and theoretical
evidence of an implicit low-rank bias: weight matrices in deep networks tend to
be approximately low-rank and removing relatively small singular values during
training or from available trained models may significantly reduce model size
while maintaining or even improving model performance. However, the majority of
the theoretical investigations around low-rank bias in neural networks deal
with oversimplified deep linear networks. In this work, we consider general
networks with nonlinear activations and the weight decay parameter, and we show
the presence of an intriguing neural rank collapse phenomenon, connecting the
low-rank bias of trained networks with networks' neural collapse properties: as
the weight decay parameter grows, the rank of each layer in the network
decreases proportionally to the within-class variability of the hidden-space
embeddings of the previous layers. Our theoretical findings are supported by a
range of experimental evaluations illustrating the phenomenon.
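
The two quantities the abstract relates can be made concrete with a small NumPy sketch. It is an illustration only, not the authors' code: the function names, the relative threshold `rel_tol`, and the averaging used for within-class variability are assumptions chosen for this example, and the paper's own definitions may differ.

```python
import numpy as np

def truncate_small_singular_values(W, rel_tol=1e-2):
    """Zero out singular values below rel_tol * (largest singular value).

    Returns the low-rank approximation and the number of singular values kept.
    rel_tol is a hypothetical threshold chosen for illustration.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = int((s > rel_tol * s[0]).sum())
    W_low = (U[:, :r] * s[:r]) @ Vt[:r]
    return W_low, r

def within_class_variability(H, labels):
    """Mean squared distance of each embedding (row of H) from its class mean.

    One simple way to quantify within-class variability (an assumption here).
    """
    total = 0.0
    for c in np.unique(labels):
        Hc = H[labels == c]
        total += ((Hc - Hc.mean(axis=0)) ** 2).sum()
    return total / H.shape[0]

rng = np.random.default_rng(0)

# Synthetic "weight matrix": a rank-3 signal plus small noise.
signal = rng.standard_normal((64, 3)) @ rng.standard_normal((3, 32))
W = signal + 1e-4 * rng.standard_normal((64, 32))
W_low, kept_rank = truncate_small_singular_values(W)
rel_err = np.linalg.norm(W - W_low) / np.linalg.norm(W)

# Synthetic embeddings: fully collapsed classes vs. noisy classes.
labels = rng.integers(0, 4, size=200)
class_means = rng.standard_normal((4, 16))
H_collapsed = class_means[labels]
H_noisy = H_collapsed + 0.5 * rng.standard_normal(H_collapsed.shape)

var_collapsed = within_class_variability(H_collapsed, labels)
var_noisy = within_class_variability(H_noisy, labels)
```

On this synthetic data the truncation keeps only the three dominant singular directions with a small relative error, and the variability measure is essentially zero when every embedding equals its class mean.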