Author response: Can sleep protect memories from catastrophic forgetting?


Abstract
Continual learning remains an unsolved problem in artificial neural networks. The brain has evolved mechanisms to prevent catastrophic forgetting of old knowledge during new training. Building upon data suggesting the importance of sleep in learning and memory, we tested the hypothesis that sleep protects old memories from being forgotten after new learning. In the thalamocortical model, training a new memory interfered with previously learned old memories, leading to degradation and forgetting of the old memory traces. Simulating sleep after new learning reversed the damage and enhanced both old and new memories. We found that when a new memory competed for previously allocated neuronal/synaptic resources, sleep replay changed the synaptic footprint of the old memory to allow overlapping neuronal populations to store multiple memories. Our study predicts that memory storage is dynamic, and that sleep enables continual learning by combining consolidation of new memory traces with reconsolidation of old memory traces to minimize interference.

Introduction

Animals and humans are capable of continuous, sequential learning. In contrast, modern artificial neural networks suffer from the inability to perform continual learning (Ratcliff, 1990; French, 1999; Hassabis et al., 2017; Hasselmo, 2017; Kirkpatrick et al., 2017). Training a new task results in interference and catastrophic forgetting of old memories (Ratcliff, 1990; McClelland et al., 1995; French, 1999; Hasselmo, 2017). Several attempts have been made to overcome this problem, including (a) explicit retraining of all previously learned memories, that is, interleaved training (Hasselmo, 2017); (b) using generative models to reactivate previous inputs (Kemker and Kanan, 2017); and (c) artificially 'freezing' subsets of synapses important for the old memories (Kirkpatrick et al., 2017). These solutions help prevent new memories from interfering with previously stored old memories; however, they either require explicit retraining of all past memories using the original data or impose limitations on the types of trainable new memories and network architectures (Kemker and Kanan, 2017). How biological systems avoid catastrophic forgetting remains to be understood. In this paper, we propose a mechanism by which sleep modifies network synaptic connectivity to minimize interference between competing memory traces, enabling continual learning.

Sleep has been suggested to play an important role in learning and memory (Paller and Voss, 2004; Walker and Stickgold, 2004; Oudiette et al., 2013; Rasch and Born, 2013; Stickgold, 2013; Weigenand et al., 2016; Wei et al., 2018). Specifically, stages 2 (N2) and 3 (N3) of non-rapid eye movement (NREM) sleep have been shown to support consolidation of newly encoded memories (Paller and Voss, 2004; Walker and Stickgold, 2004; Rasch and Born, 2013; Stickgold, 2013). The mechanism by which sleep influences memory consolidation is still debated; however, a number of hypotheses have been put forward.
Sleep may enable memory consolidation through repeated reactivation, or replay, of specific memory traces during characteristic sleep rhythms such as spindles and slow oscillations (Paller and Voss, 2004; Clemens et al., 2005; Marshall et al., 2006; Oudiette et al., 2013; Rasch and Born, 2013; Weigenand et al., 2016; Ladenbauer et al., 2017; Wei et al., 2018; Xu et al., 2019). Memory replay during NREM sleep could help strengthen previously stored memories and map memory traces between brain structures. Previous work using electrical (Marshall et al., 2004; Marshall et al., 2006; Ladenbauer et al., 2017) or auditory (Ngo et al., 2013) stimulation showed that enhancing neocortical oscillations during NREM sleep improved consolidation of declarative memories. Similarly, spatial memory consolidation has been shown to improve following cued reactivation of memory traces during NREM sleep (Paller and Voss, 2004; Oudiette et al., 2013; Oudiette and Paller, 2013; Papalambros et al., 2017). Our recent computational studies found that sleep dynamics can lead to replay and strengthening of recently learned memory traces (Wei et al., 2016; Wei et al., 2018; Wei et al., 2020). These studies point to the critical role of sleep in memory consolidation.

Can neuroscience-inspired ideas help solve the catastrophic forgetting problem in artificial neural networks? The most common machine learning training algorithm, backpropagation (Rumelhart et al., 1986; Werbos, 1990; Kriegeskorte, 2015), is very different from the plasticity rules utilized by brain networks. Nevertheless, there have recently been a number of successful attempts to implement high-level principles of biological learning in artificial network designs, including implementation of ideas from 'complementary learning systems theory' (McClelland et al., 1995), according to which the hippocampus is responsible for the fast acquisition of new information, while the neocortex more gradually learns a generalized and distributed representation. These ideas led to interesting attempts to solve the catastrophic forgetting problem in artificial neural networks (Kemker and Kanan, 2017). While few attempts have been made to implement sleep in artificial networks, one study suggested that sleep-like activity can increase storage capacity in artificial networks (Fachechi et al., 2019). We recently found that implementing a sleep-like phase in artificial networks trained using backpropagation can dramatically reduce catastrophic forgetting, as well as improve generalization performance and transfer of knowledge (Krishnan et al., 2019; Tadros et al., 2020). Despite this progress, however, we still lack a basic understanding of the mechanisms by which sleep replay affects memories, especially when new learning interferes with old knowledge.

The ability to store and retrieve sequentially related information is arguably the foundation of intelligent behavior. It allows us to predict the outcomes of sensory situations, to achieve goals by generating sequences of motor actions, to 'mentally' explore the possible outcomes of different navigational or motor choices, and ultimately to communicate through complex verbal sequences generated by flexibly chaining simpler elemental sequences learned in childhood.
In our new study, we trained a network capable of transitioning between sleep-like and wake-like states to learn spike sequences, in order to identify mechanisms by which sleep allows consolidation of newly encoded memory sequences and prevents damage to old memories. Our study predicts that during a period of sleep following training of a new memory sequence in the awake state, both old and new memory traces are spontaneously replayed, preventing forgetting and increasing recall performance. We found that sleep replay fine-tunes the synaptic connectivity matrix encoding the interfering memory sequences to allow overlapping populations of neurons to store multiple competing memories.

Results

The network model used in our study represents a minimal thalamocortical architecture implementing one cortical layer (consisting of excitatory pyramidal (PY) and inhibitory (IN) neurons) and one thalamic layer (consisting of excitatory thalamic relay (TC) and inhibitory reticular thalamic (RE) neurons), with all neurons simulated as Hodgkin-Huxley models (Figure 1A). These models were built upon neuron models we used in our earlier work (Krishnan et al., 2016; Wei et al., 2016; Wei et al., 2018). The model exhibits the two primary dynamical states of the thalamocortical system: awake, characterized by random asynchronous firing of all cortical neurons, and slow-wave sleep (SWS), characterized by slow (<1 Hz) oscillations between Up (active) and Down (silent) states (Blake and Gerard, 1937; Steriade et al., 1993; Steriade et al., 2001). Transitions between sleep and awake states (Figure 1B/C) were simulated by changing network parameters to model the effect of neuromodulators (Krishnan et al., 2016). While the thalamic population was part of the network, its role was limited to helping simulate realistic Up and Down state activity (Bazhenov et al., 2002), as all synaptic changes occurred in the cortical population. The initial strengths of the synaptic connections between cortical PY neurons were Gaussian distributed (Figure 1D).

Figure 1. Network architecture and baseline dynamics. (A) Basic network architecture (PY: excitatory pyramidal neurons; IN: inhibitory interneurons; TC: excitatory thalamocortical neurons; RE: inhibitory thalamic reticular neurons). Excitatory synapses are represented by lines terminating in a dot; inhibitory synapses are represented by lines terminating in bars. Arrows indicate the direction of the connection. (B) Behavior of a control network exhibiting wake-sleep transitions. Cortical PY neurons are shown. Color represents the voltage of a neuron at a given time during the simulation (dark blue: hyperpolarized potential; light blue/yellow: depolarized potential; red: spike). (C) Zoom-in of a subset of neurons from the network in B (time is indicated by arrows). Left and right panels show spontaneous activity during the awake-like state before and after sleep, respectively. The middle panel shows an example of activity during sleep. (D) Left panel shows the initial weighted adjacency matrix for the network in B. Color represents the strength of the AMPA connections between PY neurons, with white indicating the lack of a synaptic connection. Right panel shows the initial weighted adjacency matrix for the subregion indicated on the left. We set probabilistic connectivity (p=0.6) between excitatory cortical neurons within a defined radius (R_AMPA(PY-PY) = 20).
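To make the connectivity setup concrete, below is a minimal sketch of the initial PY-PY weight matrix described above and shown in Figure 1D: probabilistic connections (p = 0.6) within a radius of 20 neurons, with Gaussian-distributed initial weights. The network size, weight mean and standard deviation, and clipping at zero are illustrative assumptions; the paper's exact values are given in Materials and methods.

```python
import numpy as np

def initial_py_py_weights(n_py=500, radius=20, p_connect=0.6,
                          w_mean=1.0, w_std=0.1, seed=0):
    """Weighted adjacency matrix w[pre, post] for PY->PY AMPA connections.

    NaN marks the absence of a synapse (white in Figure 1D).
    n_py, w_mean, w_std, and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    w = np.full((n_py, n_py), np.nan)
    for pre in range(n_py):
        for post in range(n_py):
            # Connect within the radius, excluding self-connections,
            # with probability p_connect (both values from the text).
            if pre != post and abs(pre - post) <= radius and rng.random() < p_connect:
                w[pre, post] = max(rng.normal(w_mean, w_std), 0.0)
    return w

W = initial_py_py_weights()
```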
Only cortical PY-PY connections were plastic, regulated by spike-timing-dependent plasticity (STDP). During initial training, STDP was biased toward potentiation to simulate elevated levels of acetylcholine (Blokland, 1995; Shinoe et al., 2005; Sugisaki et al., 2016). During testing/retrieval, STDP was balanced (LTD/LTP = 1). STDP remained balanced during both sleep and interleaved training (except for a few selected simulations where we tested the effect of unbalancing STDP) to allow side-by-side comparisons; a minimal sketch of this rule appears below. For details, please see Materials and methods.

Temporally structured sequences of events are a common type of information we learn, and they are believed to be represented in the brain by sequences of neuronal firing. Therefore, in this study we represent each memory pattern as an ordered sequence, S, of activations of populations of cortical neurons (e.g., A→B→…), where each 'letter' (e.g., A) labels a population of neurons, so each memory can be labeled by a unique 'word' of such 'letters'. We considered memory patterns represented by non-overlapping populations of neurons as well as memory patterns sharing neurons but with a different activation order, for example, A→B→C vs. C→B→A. This setup can mimic, for example, in vivo experiments with a rat learning a track, including: (a) running in one direction on a linear track (Mehta et al., 1997), equivalent to training a single sequence ('A→B→C', 'A→B→C', …); (b) running forwards and backwards on a linear track (Navratilova et al., 2012), equivalent to interleaved sequence training ('A→B→C', 'C→B→A', 'A→B→C', …); and (c) running on a belt track first in one direction only and then in the reverse direction (e.g., using a virtual reality (VR) apparatus), equivalent to first learning one sequence ('A→B→C', 'A→B→C', …) and then the opposite one ('C→B→A', 'C→B→A', …).

In our model, training always occurred in the awake state, and no input was delivered to the network in the sleep state. Testing was also done in the awake state; during test sessions, the model was presented with input to the first group only (e.g., A) to test for pattern completion of the trained sequence (e.g., A→B→C→…). Performance was calculated based on the distance between the trained pattern (the template) and the response during testing. The awake state included multiple testing sessions: before training, after training/before sleep, and after sleep. For details, please see Materials and methods.

The paper is organized as follows. We first consider the scenario of two memory sequences trained at different (non-overlapping) network locations. We show that SWS-like activity after training leads to sequence replay, synaptic weight changes, and performance increases during testing after sleep. Next, we focus on the case of two sequences trained in opposite directions over the same population of neurons. We show that in this case training a new sequence in the awake state would 'erase' the old memory. However, if a sleep phase is implemented before complete destruction of the old memory, both memory sequences are spontaneously replayed during sleep. As a result of replay, each sequence allocates its own subset of neurons/synapses, and performance increases for both sequences during testing after sleep. We complete the study with a detailed analysis of synaptic weight changes and replay dynamics during the sleep state to identify the mechanisms of memory consolidation and performance increase.
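The sketch referenced above: an STDP-like update with an adjustable LTD/LTP ratio, equal to 1 (balanced) during testing, sleep, and interleaved training, and reduced below 1 to bias the rule toward potentiation during initial training. The amplitude and time constant are illustrative assumptions, not the model's exact parameters.

```python
import numpy as np

def stdp_dw(dt_ms, a_ltp=0.002, ltd_ltp_ratio=1.0, tau_ms=20.0):
    """Weight change for one spike pair, dt_ms = t_post - t_pre.

    ltd_ltp_ratio = 1.0 reproduces the balanced rule (LTD/LTP = 1);
    ltd_ltp_ratio < 1.0 biases the rule toward potentiation, as during
    initial training. a_ltp and tau_ms are illustrative assumptions.
    """
    if dt_ms >= 0:
        # Pre fires before post: potentiation.
        return a_ltp * np.exp(-dt_ms / tau_ms)
    # Post fires before pre: depression, scaled by the LTD/LTP ratio.
    return -a_ltp * ltd_ltp_ratio * np.exp(dt_ms / tau_ms)

# During awake training the rule is biased for potentiation:
dw_train = stdp_dw(+5.0, ltd_ltp_ratio=0.5)
# During sleep and testing the rule is balanced:
dw_sleep = stdp_dw(-5.0, ltd_ltp_ratio=1.0)
```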
In supplementary figures, we compare sleep replay with interleaved training and show that sleep achieves similar or better performance, but without explicit access to the training data.

Training of spatially separated memory sequences does not lead to interference

First, we trained two memory patterns, S1 and S2, sequentially (first S1 and then S2) in spatially distinct regions of the network, as shown in Figure 2A. Each memory sequence was represented by a spatio-temporal pattern of five sequentially activated groups of 10 neurons per group, with a 5 ms delay between stimulations of subsequent groups within a sequence (a sketch of this protocol appears below). S1 was trained in the population of cortical neurons 200–249 (Figure 2B, top). Training S1 resulted in an increase of synaptic weights between participating neurons (Figure 2D, left) and an increase in performance on sequence completion (Figure 2B/C, top). As the strength of the synapses in the direction of S1 increased, synapses in the opposite direction weakened, consistent with the STDP rule (see Materials and methods). The second sequence, S2, was trained for an equal amount of time as S1, but in a different population of neurons, 350–399 (W-V-X-Y-Z, Figure 2B, bottom). Training of S2 also resulted in synaptic weight changes (Figure 2D, middle) and an improvement in performance (Figure 2B/C, bottom). Importantly, training of S2 did not interfere with the weight changes encoding S1, because the two sequences involved spatially distinct populations of neurons (compare Figure 2D, left and middle). It should be noted that although testing resulted in reactivation of memory traces, there was little change in synaptic weights during testing periods because of the relatively small number of pre/post spike events. (Simulations where STDP was explicitly turned off during all testing periods exhibited results similar to those presented here.)

Figure 2 (with 1 supplement). Two spatially separated memory sequences show no interference during training and both are strengthened by subsequent sleep. (A) Network activity during periods of testing (T), training of two spatially separated memory sequences (S1/S2), and sleep (N3). Cortical PY neurons are shown. Color indicates the voltage of neurons at a given time. (B) Left panels show an example of training sequence 1 (S1, top) and sequence 2 (S2, bottom). Middle panels show examples of testing both sequences prior to sleep. Right panels show examples of testing after sleep. Note that after sleep, both sequences show better completion. (C) Performance of S1 and S2 completion before any training (baseline), after S1 training, after S2 training, and after sleep (red). (D) Synaptic weight matrices showing changes of synaptic weights in the regions trained for S1 and S2. Left panel shows weights after training S1; middle panel shows weights after training S2; right panel shows weights after sleep. Color indicates the strength of AMPA synaptic connections. (E) Distributions of the net sum of synaptic weights each neuron receives from all the neurons belonging to its left neighboring group (S1 direction) vs. its right neighboring group (opposite direction, defined as the S1* direction below) within a trained region, at baseline (left), after S1 training (middle), and after sleep (right). (F) Synaptic weight-based directionality index before/after training (gray bars) and after sleep (red bar).
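The protocol sketch referenced above generates one trial of a sequence stimulus: five groups of 10 neurons activated in order with a 5 ms inter-group delay, with testing delivering input to the first group only. The function name and label scheme are illustrative; the group sizes, delay, and S1 location (neurons 200–249) are from the text.

```python
def sequence_stimulus(first_neuron, n_groups=5, group_size=10, delay_ms=5.0,
                      reverse=False):
    """One trial of a sequence: [(group_label, neuron_ids, onset_ms), ...]."""
    order = range(n_groups - 1, -1, -1) if reverse else range(n_groups)
    schedule = []
    for step, g in enumerate(order):
        ids = list(range(first_neuron + g * group_size,
                         first_neuron + (g + 1) * group_size))
        schedule.append((chr(ord('A') + g), ids, step * delay_ms))
    return schedule

s1_trial = sequence_stimulus(200)                 # S1: A->B->C->D->E over neurons 200-249
s1r_trial = sequence_stimulus(200, reverse=True)  # S1*: E->D->C->B->A (used later)
test_cue = s1_trial[:1]                           # testing cues the first group (A) only
```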
We next calculated the net sum of synaptic weights each neuron received from all neurons belonging to its left vs. right neighboring populations (e.g., the total input to a neuron Bi, belonging to group B, from all the neurons in group A vs. from all the neurons in group C) and analyzed the difference of these net weights. The initial distribution was symmetric, reflecting the initial state of the network (Figure 2E, left). After training, it became asymmetric, indicating stronger input from the left groups (i.e., the total input to Bi from all the neurons in group A was larger than that from all the neurons in group C) (Figure 2E, middle). These results are consistent with in vivo recordings from a rat running in one direction on a linear track (Mehta et al., 1997), where this phenomenon was called 'receptive field backwards expansion': neurons representing locations along the track became asymmetrically coupled, such that activity in one group of neurons (one location) led to activation of the next group of neurons (the next location) even before the corresponding input occurred (before the animal moved to the new location).

After successful training of both sequences, the network went through a period of sleep (N3 in Figure 2A) during which no stimulation was applied. After sleep, synaptic weights for both memory sequences revealed strong increases in the direction of their respective activation patterns and further decreases in the opposing directions (Figure 2D, right). In line with our previous work (Wei et al., 2018), these changes were a result of sequence replay during the Up states of the slow oscillation (see the next section for details). Synaptic strengthening increased performance on sequence completion after sleep (Figure 2B, right; 2C, red bar). Analysis of the net synaptic input to each neuron from its left vs. right neighboring groups revealed a further shift of the synaptic weight distribution (Figure 2E, right). This predicts that SWS following linear track training would lead to further receptive field backwards expansion in cortical neurons.

To quantify this asymmetry, we calculated a 'directionality index', I, for synaptic weights (similar to Navratilova et al., 2012, but using synaptic weights), based on the synaptic input to each neuron from its left vs. right neighboring populations (directionality index = 0 if every neuron receives the same input from its left and right neighboring groups; directionality index = 1 if every neuron receives input from one 'side' only; see Materials and methods for details). This analysis showed an increase in the directionality index from naive to trained cortical networks, and a further increase after sleep (Figure 2F). Note that the backwards expansion of the place fields was reset between sessions in CA1 (Mehta et al., 1997) but not in CA3 (Roth et al., 2012), where the backward shift gradually diminished across days, possibly as memories became hippocampus independent (see Discussion).

The goal of this study was to reveal basic mechanisms of replay, and we therefore focus on 'simple' linear (e.g., S1) memory sequences. Our results, however, can be generalized to much more complex non-linear sequences (see Figure 2—figure supplement 1). In the simulations of Figure 2—figure supplement 1, awake training of a sequence was not long enough to ensure reliable pattern completion; however, performance was significantly improved after replay during SWS.
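A minimal sketch of the left-vs-right input analysis and the weight-based directionality index described above follows. The exact formula is given in Materials and methods; the normalized-difference form used here is only one plausible formulation consistent with the description (0 for symmetric input, 1 when all input arrives from one side) and should be read as an assumption.

```python
import numpy as np

def net_group_input(w, neuron, group):
    """Net sum of synaptic weights a neuron receives from a group (w[pre, post])."""
    return np.nansum(w[list(group), neuron])

def directionality_index(w, groups):
    """Mean |left - right| / (left + right) over neurons in interior groups.

    0: every neuron receives equal input from its left and right neighboring
    groups; 1: every neuron receives input from one side only.
    """
    values = []
    for g in range(1, len(groups) - 1):  # interior groups have both neighbors
        for neuron in groups[g]:
            left = net_group_input(w, neuron, groups[g - 1])
            right = net_group_input(w, neuron, groups[g + 1])
            if left + right > 0:
                values.append(abs(left - right) / (left + right))
    return float(np.mean(values))

# Example: the five 10-neuron groups of S1 (neurons 200-249).
s1_groups = [list(range(200 + g * 10, 210 + g * 10)) for g in range(5)]
# I = directionality_index(W, s1_groups)
```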
Sleep replay improves pattern completion performance for memory sequences

Why do SWS dynamics lead to improvement in memory performance? Our hypothesis is that memory patterns trained in the awake state are spontaneously replayed during sleep. With this in mind, we next analyzed the network firing patterns during Up states of the slow oscillation to identify replay. We focused our analysis on pairs of neurons (as opposed to longer sequences) because (a) having different elementary units of a sequence (neuronal pairs) replayed independently would still be sufficient to strengthen the entire sequence; (b) in vivo data suggest that memory sequence replay often involves random subsets of the entire sequence (e.g., Euston et al., 2007; Roumis and Frank, 2015; Joo and Frank, 2018; Swanson et al., 2020); and (c) we wanted to compare the results in this section with the analysis of the overlapping opposite sequences in the following sections, where we could not reliably detect replay of the full sequences, possibly because of highly overlapping spiking between sequences.

For each synapse in the direction of S1 (referred to below as an S1 synapse) and each Up state, we (a) calculated the time delay between the nearest pre/post spikes; (b) transformed this time delay through an STDP-like function to obtain a value characterizing its effect on synaptic weight; and (c) calculated the total net effect of all such spike events. This gave us a net weight change for a given synapse during a given Up state. If we observed a net weight increase, we labeled this S1 synapse as being preferentially replayed during that Up state. Finally, we counted all the Up states in which a given synapse was replayed as defined above. This procedure is similar to off-line STDP; however, instead of the weight change over the entire sleep period, we obtained the number of Up states in which a synapse in the direction of S1 was (preferentially) replayed.

Figure 3A shows, for each synapse in the direction of S1, the total change of its synaptic strength across the entire sleep period (Y-axis) vs. the number of Up states in which that synapse was replayed (X-axis). As expected, it shows a strong positive correlation. Synaptic weight changes became negative when the number of Up states in which an S1 synapse was replayed dropped below half of the total number of Up states (blue vertical line in Figure 3A). In Figure 3B we plotted only those S1 synapses that were replayed reliably, that is, in more than 66% of all Up states (dotted line in Figure 3A). We found such synapses between all neuronal groups (gray boxes in Figure 3B) as well as between neurons within groups.

Figure 3. Sleep replay strengthens synapses to improve memory recall. (A) Change in synaptic weight over the entire sleep period as a function of the number of Up states in which a given synapse was replayed. Each star represents a synapse in the direction of S1. The dashed line indicates the threshold (66% of Up states) used to identify synapses that are replayed reliably for the analysis in B; the purple line indicates the maximum number of Up states; the blue line demarcates 50% of the total number of Up states. (B) Thresholded connectivity matrix indicating synaptic connections (blue) showing reliable replay in the trained region. Gray boxes highlight between-group connections. (C) Network graph showing between-group (top) and within-group (bottom) connections. Edges shown are those synapses that revealed reliable replay of S1, as shown in B. Nodes are colored blue if they receive at least one of the synapses identified in panel B.
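A minimal sketch of the Up-state replay measure described above: for each S1 synapse and each Up state, nearest pre/post spike delays are passed through an STDP-like function and summed, and the synapse is counted as preferentially replayed in that Up state if the net value is positive. The nearest-spike matching and the kernel parameters are simplified assumptions.

```python
import numpy as np

def stdp_like(dt_ms, a=0.002, tau_ms=20.0):
    # Balanced STDP-like kernel (LTD/LTP = 1), as during sleep.
    k = a * np.exp(-abs(dt_ms) / tau_ms)
    return k if dt_ms >= 0 else -k

def net_change_in_upstate(pre_spikes, post_spikes):
    """Net STDP-like effect for one synapse within one Up state (spike times in ms)."""
    total = 0.0
    for t_pre in pre_spikes:
        if len(post_spikes) == 0:
            break
        t_post = min(post_spikes, key=lambda t: abs(t - t_pre))  # nearest post spike
        total += stdp_like(t_post - t_pre)
    return total

def replayed_upstate_count(per_upstate_spikes):
    """per_upstate_spikes: [(pre_spikes, post_spikes), ...], one pair per Up state.

    Returns the number of Up states in which the synapse is preferentially
    replayed (net positive STDP-like change). A synapse is treated as
    reliably replayed if this count exceeds 66% of all Up states
    (the threshold used for Figure 3A/B).
    """
    return sum(net_change_in_upstate(pre, post) > 0
               for pre, post in per_upstate_spikes)
```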
In Figure 3C, we illustrate all the synapses identified in the analysis of Figure 3B, that is, synapses that were replayed reliably (in more than 66% of all Up states) in the direction of S1. We also colored in blue the neurons receiving at least one of these synapses, as identified in Figure 3B. We conclude that there were multiple direct and indirect synaptic pathways connecting the first (A) and last (E) groups of neurons that were replayed reliably during sleep. These synapses increased in strength, which explains the reliable memory recall during testing after sleep.

Sequential training of overlapping memory sequences results in interference

We next tested whether our network model shows interference in the awake state when a new sequence (S1*) (Figure 4A) is trained in the same population of neurons as the earlier, old sequence (S1). S1* included exactly the same groups of neurons as S1, but the order of activation was reversed, that is, the stimulation order was E-D-C-B-A (Figure 4B). S2 was once again trained in a spatially distinct region of the network (Figure 4A/B). Testing for sequence completion was performed immediately after each training period. This protocol can represent two somewhat different training scenarios: (a) two competing memory traces (S1 and S1*) are trained sequentially before sleep; or (b) the first (old) memory S1 is trained and then consolidated during sleep, followed by training of the second (new) memory S1*, followed by another episode of sleep. We explicitly tested both scenarios and they behaved similarly, so in the following we discuss the simpler case of two sequentially trained memories followed by sleep. This setup can simulate in vivo experiments with a rat running on a belt in a VR apparatus, first in one direction only (learning S1) and then in the opposite direction (learning S1*). An example of the second scenario is presented in Figure 5—figure supplement 1 and discussed below.

Figure 4 (with 1 supplement). Training of overlapping memory sequences results in catastrophic interference. (A) Network activity (PY neurons) during training and testing periods for three memory sequences in the awake-like state. Note that sequence 1 (S1) and sequence 1* (S1*) are trained over the same population of neurons. Color indicates the voltage of the neurons at a given time. (B) Examples of the sequence training protocol for S1 (left), S2 (middle), and S1* (right). (C) Performance for the three sequences at baseline and after S1, S2, and S1* training. Training of S1* leads to a reduction of S1 performance. (D) Performance of S1 (black) and S1* (red) as a function of S1* training duration. Note that longer S1* training increases the degradation of S1 performance.

In the model, training S1 increased performance on S1 completion (Figure 4C, top/left). It also decreased S1* performance below its baseline level in the 'naive' network (Figure 4C, bottom/left). (Note that even a naive network displayed some above-zero probability of completing a sequence, depending on the initial strength of synapses and spontaneous network activity.) Training S2 led to an increase in S2 performance (S1 performance also increased, most likely due to random reactivation of S1 in the awake state). Subsequent training of S1* resulted in both a significant increase in S1* performance and a significant reduction of S1 performance (Figure 4C).
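Performance throughout (Figures 2C, 4C, 4D) is based on the distance between the trained template and the network's response to cueing the first group. The exact measure is defined in Materials and methods; the ordered-matching sketch below is only one plausible stand-in for it and should be read as an assumption.

```python
def completion_performance(response_order, template=('A', 'B', 'C', 'D', 'E')):
    """Fraction of template groups activated in the correct relative order
    after only the first group is cued (an illustrative measure)."""
    response = list(response_order)
    pos = 0       # search position within the response
    matched = 0
    for group in template:
        try:
            pos = response.index(group, pos) + 1
            matched += 1
        except ValueError:   # group missing (or out of order): skip it
            continue
    return matched / len(template)

# Group C fails to activate -> 4 of 5 groups completed in order:
print(completion_performance(['A', 'B', 'D', 'E']))  # 0.8
```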
To evaluate the impact of S1* training on S1 performance, we varied the duration of S1* (later memory) training (Figure 4D). Increasing the duration of S1* training correlated with a reduction of S1 performance, up to the point where S1 performance was reduced to its baseline level (Figure 4D, 400 s of S1* training). This suggests that sequential training of two memories competing for the same population of neurons results in memory interference and catastrophic forgetting of the earlier memory sequence. The model predicts that in experiments with a rat running on a belt in a VR apparatus, training the backward direction after training the forward one would 'erase' the effect of the forward training. While we are not aware of such experiments, studies with a rat running forward and backward on a linear track (Navratilova et al., 2012), which would be equivalent to interleaved training S1→S1*→S1→S1*…, revealed that, in the hippocampus, spatial sequences of opposite direction are rapidly orthogonalized, largely on the basis of differential head direction system input, to accommodate both trainings. Thus, at each location, some neurons had their receptive fields expanded in one direction and others in the opposite direction (Navratilova et al., 2012). To compare our model with these data, we tested interleaved training of S1 and S1* (Figure 4—figure supplement 1) and found a performance increase for both sequences. Importantly, in agreement with the in vivo data, different neurons became specific for S1 vs. S1*, as reflected in the overall increase of the directionality index (Figure 4—figure supplement 1F). In the next section, we test whether sleep can achieve the same goal.

Sleep prevents interference and leads to performance improvement for overlapping memories

So far we have found that when a single sequence was trained, it replayed spontaneously during sleep, resulting in improved performance (Figures 2 and 3). For two opposite sequences trained at the same network location, we found competition and interference during sequential training in the awake state (Figure 4). However, when the same two sequences were trained using an alternating protocol (interleaved training), both increased in performance (Figure 4—figure supplement 1C). We next tested the effect of SWS following sequential training of two opposite sequences in the awake state. Two outcomes are possible: (a) the stronger sequence could dominate replay and eventually suppress the weaker one, or (b) both sequences could be replayed during sleep.
Keywords: catastrophic forgetting, protect memories, sleep