Towards Redundancy-Free Sub-networks in Continual Learning
CoRR (2023)
Abstract
Catastrophic Forgetting (CF) is a prominent issue in continual learning.
Parameter isolation addresses this challenge by masking a sub-network for each
task to mitigate interference with old tasks. However, these sub-networks are
constructed based on weight magnitude, which does not necessarily correspond
to the importance of weights, so unimportant weights are retained and the
resulting sub-networks are redundant. To overcome this limitation, inspired by
information bottleneck, which removes redundancy between adjacent network
layers, we propose Information Bottleneck Masked sub-network (IBM) to
eliminate redundancy within
sub-networks. Specifically, IBM accumulates valuable information into essential
weights to construct redundancy-free sub-networks, not only effectively
mitigating CF by freezing the sub-networks but also facilitating the training
of new tasks through the transfer of valuable knowledge. Additionally, IBM
decomposes hidden representations to automate the construction process and make
it flexible. Extensive experiments demonstrate that IBM consistently
outperforms state-of-the-art methods. Notably, IBM surpasses the
state-of-the-art parameter isolation method with a 70% reduction in the number
of parameters within sub-networks and an 80% decrease in training time.
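
To make the baseline concrete, the following is a minimal sketch of the magnitude-based parameter isolation that the abstract contrasts IBM against. It is not the IBM method itself. The function names (`magnitude_mask`, `masked_update`), the `keep_ratio` parameter, and the toy weights are assumptions introduced for illustration only and do not come from the paper.

```python
def magnitude_mask(weights, free, keep_ratio):
    """Select the largest-magnitude weights among those still free.

    weights: list of floats (a flattened layer, hypothetical toy data)
    free: list of bools, True where no earlier task owns the weight
    keep_ratio: fraction of the free weights assigned to the new task
    """
    candidates = [i for i, is_free in enumerate(free) if is_free]
    k = int(len(candidates) * keep_ratio)
    # Rank free positions by |w|, largest first.
    ranked = sorted(candidates, key=lambda i: abs(weights[i]), reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(weights))]


def masked_update(weights, grads, free, lr):
    """Apply a gradient step only to free weights; frozen ones stay fixed."""
    return [w - lr * g if is_free else w
            for w, g, is_free in zip(weights, grads, free)]


weights = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3]
free = [True] * len(weights)

# Task 1 claims half of the free weights by magnitude, then freezes them.
task1_mask = magnitude_mask(weights, free, keep_ratio=0.5)
free = [f and not m for f, m in zip(free, task1_mask)]

# Task 2 trains: only the remaining free weights receive updates.
grads = [1.0] * len(weights)
weights = masked_update(weights, grads, free, lr=0.1)
```

In this sketch the mask is chosen purely by `abs(weight)`, which is the selection criterion the abstract argues can retain unimportant weights. Freezing the masked positions is what protects earlier tasks from interference.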