Information maximization explains state-dependent synaptic plasticity and memory reorganization during non-rapid eye movement sleep

Abstract Slow waves during non-rapid eye movement (NREM) sleep reflect the alternating up and down states of cortical neurons; global and local slow waves promote memory consolidation and forgetting, respectively. Furthermore, distinct spike-timing-dependent plasticity (STDP) operates in these up and down states. How these different plasticity rules contribute to neural information coding and memory reorganization remains unknown. Here, we show that optimal synaptic plasticity for information maximization in a cortical neuron model provides a unified explanation for these phenomena. The model indicates that the optimal synaptic plasticity is biased toward depression as the baseline firing rate increases. This property explains the distinct STDP observed in the up and down states. Furthermore, it explains how global and local slow waves predominantly potentiate and depress synapses, respectively, provided that the background firing rate of excitatory neurons declines with the spatial scale of the waves, as the model predicts. The model thus provides a unifying account of the role of NREM sleep, bridging neural information coding, synaptic plasticity, and memory reorganization.

Several studies have suggested that slow waves should be separated into distinct classes (10–16). Although different classification schemes have been used in previous studies, one class is more global, while the other is more local (10, 12–14). A recent study further suggested that these two classes of slow waves have opposite effects on memory reorganization: the global and local classes promote memory consolidation and forgetting, respectively (13). These studies suggest that memory reorganization depends on subtle sleep states, such as the up and down states of global and local slow waves.
One possible explanation for how these sleep states differentially modulate memory reorganization is that synaptic plasticity is modulated depending on the sleep state. Previous studies have shown that neuronal activity patterns from the awake state are reactivated within the slow waves during NREM sleep (17–19). Although the synaptic plasticity rule during NREM sleep is largely unknown, a recent experimental study using anesthetized young mice in vivo has suggested that spike-timing-dependent plasticity (STDP) during up states is biased toward depression compared with down states (20). Consistently, another experimental study using acute brain slices demonstrated that subthreshold inputs during up, but not down, states induce synaptic weakening (21). These findings suggest that neuronal reactivation can induce different synaptic plasticity in the up and down states. This difference might be key to understanding memory reorganization during NREM sleep, and it raises two further issues worth exploring theoretically. First, what is the benefit of modulating the synaptic plasticity rule depending on the up and down states? Because the nervous system has evolved to work efficiently, this modulation might enhance the efficiency of neuronal coding. Second, how does state-dependent synaptic plasticity reorganize memories during global and local slow waves?
To understand these issues, we adopted a normative approach based on the information maximization (infomax) principle (22,23) and derived a synaptic plasticity rule for a spiking neuron model (24,25) that achieves efficient information transmission. We found that the baseline firing rate is an important parameter of the infomax rule. An increased baseline firing rate biases the synaptic plasticity towards depression, consistent with the reported difference in STDP between the up and down states. We then constructed a neuronal network model exhibiting global and local slow waves and showed that four states (up and down states of global and local slow waves) have distinct STDP owing to different baseline firing rates. Finally, we suggest that the difference in synaptic plasticity in global and local slow waves can set a balance between memory consolidation and forgetting, consistent with the previous experimental findings (see Fig. 6 for a schematic summary).

Optimal synaptic plasticity is biased toward depression at high firing rates
To consider optimal synaptic plasticity in different sleep states, we first considered a feedforward network model with a postsynaptic neuron and multiple excitatory presynaptic neurons, which we call the single-neuron model. In this model, presynaptic spikes at synapse j evoked excitatory postsynaptic potentials (EPSPs) with amplitude w_j and exponential decay with a time constant of 25 ms. The membrane potential of the postsynaptic neuron was computed as u(t) = u_r + Σ_j w_j h_j(t), where u_r is the resting membrane potential and h_j is the EPSP time-course from presynaptic neuron j, with an instantaneous increment of 1 after each presynaptic spike. The postsynaptic neuron emits spikes with firing probability density g_E(u(t)) R(t), where g_E(u) is a softplus activation intensity function and the refractory factor R(t) models the transient suppression of the postsynaptic firing rate after a postsynaptic spike (see the "Methods" section). Following the infomax approach, we derived the optimal synaptic plasticity rule for maximizing information transmission while the synaptic weights are constrained by their cost. We assumed that the synaptic weights w_j change following the gradient of a utility function, defined as the mutual information between the presynaptic and postsynaptic spikes minus the synaptic weight cost (24, 25). Thus, the synaptic weight changes were described by dw_j/dt ∝ dI/dw_j − λ dC/dw_j, where I is the mutual information and C is the synaptic weight cost (see the "Methods" section). The cost term includes the square of the synaptic weight, which eliminates synaptic weights that do not contribute to information transmission. The coefficient λ controls the importance of the synaptic cost term relative to the information term. We omitted the homeostatic term assumed in previous studies because it did not contribute to our results, in which the postsynaptic firing rate was kept on average within the homeostatic range.
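The single-neuron dynamics described above can be sketched in a short simulation. A minimal sketch, assuming illustrative values for the softplus threshold, slope, intensity scale, and refractory recovery time (none of which are given in the text); only the 25 ms EPSP decay and the model structure (u(t) = u_r + Σ_j w_j h_j(t), spiking with density g_E(u) R(t)) come from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Assumed parameters (illustrative, not the paper's exact values) ---
u_r   = -70.0   # resting potential (mV)
tau   = 0.025   # EPSP decay time constant, 25 ms (from the text)
dt    = 0.001   # simulation step (s)
theta = -68.0   # softplus threshold (mV), assumed
sigma = 1.0     # softplus slope (mV), assumed
g_max = 20.0    # intensity scale (Hz), assumed
tau_R = 0.010   # recovery time of the refractory factor (s), assumed

def softplus_intensity(u):
    """Softplus activation intensity g_E(u) in Hz."""
    return g_max * np.log1p(np.exp((u - theta) / sigma))

def simulate(w, pre_rates, T=1.0):
    """Euler simulation of u(t) = u_r + sum_j w_j h_j(t), with postsynaptic
    spiking probability g_E(u(t)) * R(t) * dt per step."""
    n = len(w)
    h = np.zeros(n)          # EPSP traces, incremented by 1 per presynaptic spike
    R = 1.0                  # refractory factor in [0, 1]
    post_spikes = []
    for step in range(int(T / dt)):
        pre = rng.random(n) < pre_rates * dt      # Poisson presynaptic spikes
        h = h * np.exp(-dt / tau) + pre
        u = u_r + np.dot(w, h)
        R = min(1.0, R + dt / tau_R)              # relax back toward 1
        if rng.random() < softplus_intensity(u) * R * dt:
            post_spikes.append(step * dt)
            R = 0.0                               # transient suppression after a spike
    return post_spikes

# Up-state-like background: 100 inputs at 2 Hz with 0.5 mV EPSPs
spikes = simulate(w=np.full(100, 0.5), pre_rates=np.full(100, 2.0))
```

The refractory reset-to-zero is one simple choice for R(t); the paper's exact form is in its Methods.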
The gradient of mutual information dI/dw_j was explicitly derived and can be computed in real time using only the variables observable at synapse j, namely, the EPSP time-course h_j, postsynaptic spikes informed by a back-propagating action potential, the postsynaptic activation intensity g_E(u) as a function of postsynaptic membrane potential u, the refractory factor R, and the mean activation intensity ḡ (see the "Methods" section). In addition, the gradient of the cost term decreased the synaptic strength by λw_j for every presynaptic spike of neuron j.
We ran simulations imitating the in-vivo experimental STDP protocols (20) in the single-neuron model. To mimic the experimental setup, we divided the presynaptic neurons into stimulated and nonstimulated neurons (Fig. 1A). Twenty stimulated neurons synchronously emitted a spike upon external presynaptic stimulation, and their synaptic weights changed according to the infomax rule. One hundred nonstimulated neurons spontaneously emitted Poisson spikes at 2.0 and 0.1 Hz in the up and down states, respectively, while their synaptic weights were fixed for simplicity. The mean postsynaptic activation intensity, ḡ, was computed by taking the average of g_E(u) in each state. This yielded ḡ^(u) ≈ 5.9 Hz and ḡ^(d) ≈ 0.5 Hz for the up and down states, respectively.
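The state dependence of ḡ can be previewed with a mean-field shortcut: the average depolarization of a Poisson EPSP barrage is roughly N · rate · w · τ above rest. The softplus parameters and the 0.5 mV weight below are assumptions (the input count and rates are from the text), and this shortcut ignores membrane-potential fluctuations, so the numbers only roughly track the ḡ^(u) ≈ 5.9 Hz and ḡ^(d) ≈ 0.5 Hz obtained by averaging g_E(u) directly.

```python
import math

def g_E(u, theta=-68.0, sigma=1.0, g_max=20.0):
    """Assumed softplus activation intensity (illustrative parameters)."""
    return g_max * math.log1p(math.exp((u - theta) / sigma))

u_r, tau, w, n_pre = -70.0, 0.025, 0.5, 100  # rest (mV), EPSP decay (s), weight (mV), inputs

g_state = {}
for state, rate in [("up", 2.0), ("down", 0.1)]:
    # Mean depolarization of the EPSP barrage: N * rate * w * tau (mV)
    u_mean = u_r + n_pre * rate * w * tau
    g_state[state] = g_E(u_mean)

# g_state["up"] is an order of magnitude larger than g_state["down"],
# mirroring the up/down difference in the mean activation intensity.
```

Because g_E is convex, averaging the potential before applying g_E underestimates the true mean intensity (Jensen's inequality); the simulation average in the text does not have this bias.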
We first characterized synaptic changes induced by pre-post stimulation, where a presynaptic spike was induced 10 ms before the postsynaptic spike. Representative traces of a synaptic weight from a stimulated neuron in the up and down states are plotted in Fig. 1(B). The increase in g_E(u)/ḡ after presynaptic stimulation was greater when the mean firing rate was low, indicating that a postsynaptic spike can transmit a greater amount of information at a lower mean firing rate. Consequently, the information term caused greater synaptic potentiation in the down state than in the up state. The amount of synaptic potentiation due to the information term was roughly proportional to log(1 + g/ḡ)/(ḡ + g), where g = g_E(u) − ḡ represents the increment in activation intensity due to the presynaptic stimulation (see the "Methods" section for details). Intuitively, g measured the reliability of a synapse for transmitting the signal, and ḡ represented the noise level, quantifying the frequency of postsynaptic spikes in the background; thus, g/ḡ corresponds to the signal-to-noise ratio. By contrast, the change in the synaptic weight by the cost term was −λw_j after every presynaptic spike of neuron j, regardless of the mean firing rate. Fig. 1(B) displays representative traces of a synaptic weight; however, synaptic changes also depended on other postsynaptic spikes and on presynaptic spikes from nonstimulated neurons, which can occur randomly. Below, we quantify the average synaptic changes induced by three kinds of stimulation: pre-only stimulation and post-pre stimulation (a presynaptic spike induced 10 ms after an evoked postsynaptic spike), in addition to the pre-post stimulation explored above. We started by simulating the pre-only stimulation. Experimentally, pre-only stimulation in the down state did not significantly change the synaptic weights of stimulated neurons (20).
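The state dependence of the information term can be checked directly from the stated approximation. The increment g below is an assumed illustrative value; the two ḡ values are the up- and down-state intensities quoted in the text.

```python
import math

def info_potentiation(g_inc, g_bar):
    """Approximate potentiation by the information term,
    proportional to log(1 + g/g_bar) / (g_bar + g) (see text)."""
    return math.log1p(g_inc / g_bar) / (g_bar + g_inc)

g_inc = 4.0                             # intensity increment from the stimulus (Hz), assumed
up   = info_potentiation(g_inc, 5.9)    # up-state mean intensity (Hz, from the text)
down = info_potentiation(g_inc, 0.5)    # down-state mean intensity (Hz, from the text)

# The same presynaptic stimulus yields a roughly 9x larger information
# gain in the down state (low background noise) than in the up state,
# while the cost term -lambda * w_j is the same in both states.
```

Since the cost term does not depend on ḡ, raising the background intensity alone tips the net change from potentiation toward depression, which is the mechanism the text describes.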
To reproduce this experimental result, we set the coefficient of the cost term to λ = 0.32 (mV)^−2, so that the changes in the stimulated synapses were on average zero in the down state (Fig. 2A). This value of λ was used throughout this paper. Simulations of the up state showed an overall synaptic depression because the synaptic potentiation due to the information term decreased with the mean firing rate, for the reason described above. In addition, neither artificial postsynaptic depolarization to the up-state level in down states nor hyperpolarization to the down-state level in up states appreciably affected the synaptic changes in the current setup (SI Appendix, Fig. S2), consistent with the experimental results (20). Next, if a postsynaptic spike was evoked before the presynaptic stimulation (i.e., post-pre stimulation), the infomax rule caused synaptic depression in both the up and down states: the induced postsynaptic spike preceding the presynaptic stimulation reduced the value of R(t) and prevented the synaptic weights from increasing via the information term, whereas the cost term could still decrease these synapses (Fig. 2A).

[Fig. 1 caption: Representative traces of a stimulated neuron's activity, the postsynaptic neuron's activity, the ratio of the momentary and mean activation intensity g_E(u)/ḡ, and changes of a synaptic weight from a stimulated neuron in the down and up states. Synaptic changes by the infomax rule were computed by summing the effects of the information term dI/dw_j and the synaptic cost term. The synaptic increase by the information term was smaller in the up state than in the down state.]
To investigate how the STDP window of the infomax rule differs between the up and down states, the time difference between presynaptic and postsynaptic stimulation was systematically varied in the single-neuron model. Consistent with the observations above, the entire STDP curve was biased toward synaptic depression in the high mean-firing-rate condition (Fig. 2B). While the synaptic change caused by post-pre stimulation was relatively insensitive to the mean firing rate, high mean firing rates biased the synaptic changes toward depression for the pre-post and pre-only stimulations (Fig. 2C). These results were consistent with the corresponding in-vivo experimental results (20). In addition, while the experimental results are limited to a few representative time differences (within 10 ms and within −10 ms in the down states, and 10, 50, and −10 ms in the up states) (20), the infomax model predicted the whole STDP curve. Although this tendency toward synaptic depression at high mean firing rates did not qualitatively depend on the choice of the activation intensity function g_E(u), except for the pure exponential function (see SI Appendix, Fig. S1 and the "Methods" section), the exact position of the STDP curve and the activity threshold separating potentiation and depression under pre-only stimulation depended on several parameters (SI Appendix, Figs. S3 and S4). In particular, pre-only stimulation with small synaptic weights and a large number of stimulated neurons tended to induce potentiation even in up states, albeit to a lesser degree than in down states (SI Appendix, Fig. S4). In summary, the modulation of the infomax rule by the mean firing rate explained the synaptic plasticity during the up and down states of slow waves.
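The activity threshold separating potentiation from depression under pre-only stimulation can be located numerically, assuming (for illustration only) that the net change is the information term of the approximation above minus a ḡ-independent cost. The cost value and the increment g below are arbitrary, so only the existence and monotone behavior of the crossing, not its exact position, carries meaning.

```python
import math

def info_term(g_bar, g_inc=4.0):
    """Proportional to log(1 + g/g_bar)/(g_bar + g); the overall
    proportionality constant is absorbed into `cost` below."""
    return math.log1p(g_inc / g_bar) / (g_bar + g_inc)

def depression_threshold(cost, lo=0.05, hi=50.0):
    """Bisect for the background intensity g_bar* at which the information
    term exactly balances the (g_bar-independent) cost term. Valid because
    info_term decreases monotonically in g_bar."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if info_term(mid) > cost:
            lo = mid            # still net potentiation: threshold is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

g_star = depression_threshold(cost=0.05)   # cost value is arbitrary
# Below g_star, pre-only stimulation nets potentiation; above it, depression.
```

Raising the assumed cost (larger λw) lowers nothing about the shape of info_term but shifts the crossing to a smaller ḡ*, matching the text's point that the threshold depends on parameters such as synaptic weights.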

Firing rates of excitatory neurons are higher during local slow waves than during global slow waves
To study how the above-mentioned findings would apply to memory reorganization during NREM sleep, we constructed network models of cortical neurons that generate slow waves, which we call the slow-wave model. We started by constructing a spatially homogeneous model similar to that in ref. (26) and then introduced spatial heterogeneity to produce local and global slow waves. The slow-wave model consisted of recurrently connected spiking neurons, including excitatory and inhibitory neurons (see the "Methods" section). The spatially homogeneous model assumed no connections between two inhibitory neurons but all-to-all connections between two excitatory neurons and between excitatory and inhibitory neurons. Each excitatory neuron had adaptation currents that accumulated with spikes. Adaptation currents correspond to, for example, the potassium currents involved in generating slow waves (27–30). The activation functions of excitatory and inhibitory neurons were both modeled by softplus functions, but the threshold and slope were greater for the inhibitory neurons than for the excitatory neurons (Fig. 3B). In these settings, the excitatory and inhibitory activities showed bistability of the up and down states, and the transitions were caused by the slowly changing adaptation currents (SI Appendix, Fig. S5A and B). This was consistent with the previous theoretical study (26). The inhibitory population was mostly inactive in down states because of its high threshold (Fig. 3B), but was active in the up state and stabilized the recurrent excitatory activity.

[Fig. 2 caption: (A) In the post-pre stimulations, the postsynaptic neuron emitted an evoked spike at t = 40 ms and also responded to the presynaptic stimulation at t = 50 ms with some delay due to refractoriness. After the pre-only stimulations, the synaptic weight changed little in the down state but was depressed in the up state. In the post-pre stimulation, the synaptic weight was depressed in both the down and up states. The lines and shadows of the weight traces represent the means and SDs, respectively. (B) The synaptic changes by the STDP stimulations (blue points) and pre-only stimulations (orange dotted lines). As the value of t increased or decreased, the synaptic changes by the STDP stimulations converged to the change by the pre-only stimulations. Synaptic plasticity was biased towards depression in the up state. (C) Synaptic changes depended on mean activation intensity: the changes in pre-post stimulation with t = 10 ms and pre-only stimulation decreased with increasing mean activation intensity, whereas the changes in post-pre stimulation with t = −10 ms were less sensitive to mean activation intensity.]
To consider the difference between global and local slow waves, we extended the model by embedding four local networks, each as described above, within the overall network (Fig. 3A). Without between-network connections, each local network independently produced the up and down cycles of slow waves (Fig. 3C). Next, we introduced sparse long-range excitatory connections between different local networks. We assumed that the long-range connections project to both the excitatory and inhibitory neurons, as in previous modeling studies (31). In this setting, each local network showed transitions between the up and down states, some of which were local, whereas others were global, occurring in synchrony across the local networks (Fig. 3D). To objectively define global and local slow waves, we first classified the up and down states of each local network based on the mean membrane potential averaged across excitatory neurons (Fig. 3C and E). Transitions to the down and up states were detected when the mean membrane potential decreased below −69.75 mV or exceeded −68.25 mV, respectively (the choice of these thresholds did not affect the results; see SI Appendix, Fig. S6). We then classified each state as global or local by counting the number of up and down states across the local networks (Fig. 3D) (see the "Methods" section). In the global up/down states, either all or all but one of the local networks simultaneously occupied the same state.
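The state-detection and scale-classification logic just described can be sketched as follows. The −68.25/−69.75 mV thresholds and the "all or all but one" criterion are from the text; the function names and the per-sample representation are our own illustrative choices.

```python
UP_TH, DOWN_TH = -68.25, -69.75   # transition thresholds from the text (mV)

def detect_states(v_mean):
    """Label each sample of a local network's mean excitatory membrane
    potential as 'up' or 'down', with hysteresis between the thresholds."""
    state, labels = "down", []
    for v in v_mean:
        if state == "down" and v > UP_TH:
            state = "up"
        elif state == "up" and v < DOWN_TH:
            state = "down"
        labels.append(state)
    return labels

def classify_scale(own_state, all_states):
    """A network's current up/down state is 'global' if all, or all but
    one, of the local networks are simultaneously in that same state."""
    n_same = sum(s == own_state for s in all_states)
    return "global" if n_same >= len(all_states) - 1 else "local"

labels = detect_states([-70.0, -68.0, -69.0, -70.0])
# hysteresis: down -> up at -68.0 mV, holds up at -69.0 mV, down at -70.0 mV
scale = classify_scale("up", ["up", "up", "up", "down"])
```

The hysteresis gap between the two thresholds prevents rapid relabeling when the potential hovers near a single threshold.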
We then analyzed how the difference between global and local slow waves affected learning by the infomax rule. Since the outcome of the infomax rule depends on the mean firing rate, we examined the mean firing rates in the up and down states of the global and local slow waves. In the simulations, the mean firing rates of excitatory neurons followed global down < local down < global up < local up in ascending order (Fig. 3F). The difference between the global and local down states was simply explained by the strength of the long-range excitation from the surrounding networks to the local excitatory population. Because the surrounding excitatory populations had elevated activity in their up state, the long-range excitation was stronger in the local down states than in the global down states. Note that the local inhibitory population was mostly inactive in both the local and global down states and did not contribute significantly to this difference. By contrast, the difference between the global and local up states was mainly explained by the local inhibition of the excitatory population. While the local network was in the up state, its inhibitory activity was more sensitive to the long-range excitation than its excitatory activity, owing to the steeper inhibitory activation function at high membrane potentials (see Fig. 3B and the "Methods" section). Therefore, the strong long-range excitation from the surrounding networks in the global up state effectively reduced the local excitatory activity via local inhibition. To verify this, we performed a phase-plane analysis, assuming a large number of neurons (see SI Appendix Methods).
The phase planes showed that the firing rates at the two stable points (i.e., the up and down states) changed depending on the state of the surrounding networks (Fig. 3G; see SI Appendix Methods for parameter dependency). As expected, the membrane potential of the excitatory neurons was higher in the local down states, with their elevated long-range excitation, than in the global down states. In addition, the membrane potential of excitatory neurons was lower in the global up states than in the local up states. In this case, the long-range excitation shifted both the excitatory and inhibitory nullclines; since the shift of the inhibitory nullcline was much larger than that of the excitatory nullcline, the membrane potential of excitatory neurons decreased with the long-range excitation (Fig. 3G). The higher excitatory activity observed in the local up states relative to the global up states is a natural consequence of unstable recurrent excitatory dynamics being stabilized by strong inhibition. A similar response to external input was previously demonstrated both experimentally and theoretically as a property of inhibition-stabilized networks (ISNs) (31–33). Based on these results, we hypothesized that the infomax rule, which is sensitive to the baseline firing rate, would yield distinct learning outcomes in the global and local slow waves.
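The paradoxical effect invoked here, in which shared excitatory drive lowers the excitatory fixed point because the inhibitory gain is steeper, can be reproduced in a minimal two-population rate model. All gains, weights, and drives below are illustrative assumptions chosen so that inhibition responds more strongly than excitation; they are not the slow-wave model's parameters.

```python
# Minimal rate-model sketch of the inhibition-stabilized effect.
# Shared long-range drive s goes to both populations, but the steeper
# inhibitory gain (b > a) lets inhibition win, lowering the E fixed point.
a, b = 1.0, 2.0              # excitatory / inhibitory gains (assumed)
w_EE, w_EI, w_IE = 1.5, 1.0, 1.0
I_E = 5.0                    # background drive to E (assumed)

def fixed_point(s, dt=0.1, steps=5000):
    """Damped fixed-point iteration of the rectified-linear rate equations;
    converges for these parameters (Jacobian eigenvalues have negative
    real part in the continuous dynamics)."""
    r_E, r_I = 1.0, 1.0
    for _ in range(steps):
        f_E = max(0.0, a * (w_EE * r_E - w_EI * r_I + I_E + s))
        f_I = max(0.0, b * (w_IE * r_E + s))
        r_E += dt * (f_E - r_E)
        r_I += dt * (f_I - r_I)
    return r_E, r_I

rE0, rI0 = fixed_point(s=0.0)   # analytically (10/3, 20/3)
rE1, rI1 = fixed_point(s=1.0)   # analytically (8/3, 22/3)
# Extra shared drive s raises inhibition but LOWERS excitatory activity,
# as in the global up state relative to the local up state.
```

Solving the linear fixed-point equations by hand gives r_E = (I_E − s)/1.5 for these parameters, so the sign of dr_E/ds is negative whenever b·w_EI > 1, which is the ISN condition this sketch illustrates.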

Optimal synaptic plasticity in up and down states of global and local slow waves
To study the outcome of the infomax rule in global and local slow waves, we first explored the STDP window using the slow-wave model in the previous section. We introduced multiple presynaptic excitatory neurons that spike synchronously when externally stimulated, as shown in Fig. 2. These presynaptic neurons projected onto a randomly selected postsynaptic neuron in the first excitatory population, E1, of the slow-wave model (Fig. 4A). We assumed that these feedforward synaptic weights were updated by the infomax rule, whereas the recurrent synaptic weights were fixed. The changes in the feedforward synaptic weights averaged over random realizations of the model are shown below.
We first studied the STDP by evoking a postsynaptic spike at a fixed time before or after the presynaptic stimulation of 20 external input neurons. Here, we assumed strong synapses with 0.5 mV EPSPs from these neurons. In addition to this evoked spike, the postsynaptic neuron could generate other spikes triggered by the network activity. In the slow-wave model, the mean activation intensity ḡ was computed by averaging the activation intensity g_E(u_i) of neuron i over the neurons within the same excitatory population (see the "Discussion" section for possible biological implementations). As expected from Figs. 2 and 3, the STDP results depended on the sleep states of the slow-wave model (Fig. 4B). These results indicate two important points. First, synaptic plasticity in the up states is biased toward depression compared with the down states, which is consistent with the experimental findings (20). Second, synaptic plasticity in the local up and down states is biased toward depression compared with the global up and down states, respectively. This property results from the model prediction that the mean firing rates are higher in the local up and down states than in the corresponding global states. As expected from SI Appendix, Fig. S4, stimulating many weak synapses (40 synapses with 0.09 mV EPSPs) tended to induce more potentiation (Fig. 4C; note that the scale differs from Fig. 4B), although the difference between global and local up and down states described above still existed (Fig. 4D). In particular, when many weak synapses were stimulated, as in Fig. 4C, the pre-only stimulation during global up states caused synaptic potentiation, whereas that during local up states caused synaptic depression (Fig. 4D).

[Fig. 5 caption: (A) The schematic description of the neuronal networks related to the task. Presynaptic neurons were divided into two populations, G and L. Both populations emitted synchronous spikes ("task cue") during the task. The postsynaptic neuron ("task neuron") is an excitatory neuron in the E1 population targeted by the presynaptic neurons. Task performance is defined as the firing-rate increase of the task neuron during the task period. (B) As neuronal reactivation, the G and L populations emitted synchronous spikes during the global and local up states of the E1 population during post-learning NREM sleep, respectively, at firing rates decreasing from 7.5 Hz at the beginning of sleep to 5.0 Hz at the end of sleep. (C) The changes of the synaptic weights and the task-neuron reactivation strength during post-learning sleep. The synaptic changes of the G and L populations are shown in red and purple, respectively (upper). The synapses of population G were potentiated by reactivation during the global up states, whereas the synapses of population L were depressed by reactivation during the local up states. When synaptic plasticity during the local or global up states was blocked, the synaptic changes of the corresponding population were inhibited; hence, the sum of synaptic weights in the two populations was global-up blocked < control < local-up blocked in ascending order. The reactivation strengths of the task neuron in global and local up states are shown in red and purple, respectively (lower). The blue dotted line represents the assumed decrease in the task-cue reactivation rate. The reactivation strengths of the task neuron in global and local up states were preserved and diminished, respectively, reflecting the synaptic changes of the corresponding population. In the local-up-blocked and global-up-blocked plasticity conditions, the reactivation strength of the task neuron in local and global up states, respectively, followed the same time-course as the task-cue reactivation. The lines and shadows represent the means and SEMs over 600 trials, respectively. (D) The comparison of task performance before and after synaptic changes during sleep. This tendency is the same as that of the sum of synaptic weights in the G and L populations shown in Fig. 5C. Error bars represent SEM.]
To further investigate the impact of sleep states on memory reorganization, we simulated how synaptic weights that contribute to task performance change during subsequent sleep. This time, we simulated presynaptic neurons having small initial synaptic weights, under the assumption that relatively weak synapses are mainly involved in the in-vivo learning of a new task. In the awake condition, the presynaptic neurons emitted spikes synchronously upon the presentation of a task cue and projected to a postsynaptic neuron (task neuron) in the E1 population of the slow-wave model (Fig. 5A). The simulation was repeated over random realizations of the model parameters. Inspired by the brain-machine-interface task (13), we defined task performance as the increase in the task neuron's firing rate upon the presentation of a task cue. Then, we considered the synaptic changes during post-learning NREM sleep, assuming that the feedforward synaptic weights had already been potentiated to elevate the postsynaptic firing rate during the task. Experimentally, the triple coupling of slow waves, spindles, and reactivation is considered crucial for memory consolidation (2, 18, 34, 35), in which spindles are considered to promote synaptic plasticity by facilitating dendritic activities (2, 35–38). Therefore, we assumed that synaptic plasticity is induced by the reactivation inputs from presynaptic neurons in the presence of spindles. Although we did not explicitly model spindles in this study, we assumed that spindles are nested in slow waves when memory reactivation occurs in up states of the task neuron (see the "Discussion" section for details). Presynaptic neurons were divided into two populations, the global-up-reactivated neurons G and the local-up-reactivated neurons L, each consisting of 40 presynaptic neurons that synchronously emitted spikes with Poisson statistics during global and local up states, respectively. We assumed that the Poisson rate of the memory reactivation (17, 19) of the task cue decreased from 7.5 Hz at the beginning of an NREM sleep episode to 5.0 Hz at the end, to reproduce experimental results of task-neuron reactivation (Fig. 5C; see below for details). We updated the feedforward synaptic weights during post-learning NREM sleep according to the infomax rule whenever presynaptic reactivation happened. In Fig. 5, we restricted memory reactivation to occur during the up states of the local network only (Fig. 5B), because interventions in the up states, not the down states, mainly affected the performance of the brain-machine-interface task (13, 39). More generally, memory reactivation might also happen in down states, depending on the experimental setup (40).

[Fig. 6 caption: The proposed role of NREM sleep is to bridge neural information coding, synaptic plasticity, and memory reorganization. (A) The relationship between the mean firing rate and synaptic changes by the infomax rule. Synaptic potentiation by the information term was decreased at a high firing rate owing to many background spikes, whereas synaptic depression by the cost term was unaffected; therefore, the high firing rate induced synaptic depression. Because the mean firing rates follow global down < local down < global up < local up in ascending order, the amount of synaptic change follows the opposite order. (B) The possible distinct roles of global and local slow waves. Reactivated patterns during global slow waves induced synaptic potentiation, whereas those during local slow waves induced synaptic depression. This could cause selective memory consolidation and forgetting of the patterns reactivated during the global and local slow waves, respectively.]
This possibility was also investigated in SI Appendix, Fig. S7. Further simulation details are described in SI Appendix Methods.
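The reactivation schedule used in these simulations can be sketched as follows. The 7.5 → 5.0 Hz endpoints of the task-cue reactivation rate are from the text; the linear decay, episode length, and up-state intervals are illustrative assumptions, and the inhomogeneous-Poisson event times are drawn by thinning.

```python
import numpy as np

rng = np.random.default_rng(1)

T_sleep = 600.0             # assumed NREM episode length (s)
r_start, r_end = 7.5, 5.0   # reactivation rates from the text (Hz)

def reactivation_rate(t):
    """Task-cue reactivation rate, decaying (assumed linearly) across the episode."""
    return r_start + (r_end - r_start) * (t / T_sleep)

def sample_reactivations(up_intervals):
    """Poisson reactivation times emitted only inside up-state intervals,
    via thinning against the episode's maximal rate r_start."""
    times = []
    for t0, t1 in up_intervals:
        t = t0
        while True:
            t += rng.exponential(1.0 / r_start)   # candidate at the maximal rate
            if t >= t1:
                break
            if rng.random() < reactivation_rate(t) / r_start:
                times.append(t)                   # accept with rate ratio
    return times

# Example: up states at the start and the end of the episode
events = sample_reactivations([(0.0, 60.0), (540.0, 600.0)])
```

Restricting emission to up-state intervals mirrors the simulation choice above that reactivation occurs only during the local network's up states.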
The feedforward synaptic weights of populations G and L were further potentiated and depressed, respectively, in the simulation of post-learning NREM sleep (Fig. 5C), because global and local up states promote synaptic potentiation and depression, respectively, when many weak synapses are stimulated (Fig. 4D). Task performance increased during sleep, reflecting the increased sum of the synaptic weights of populations G and L, consistent with the experimentally suggested memory consolidation during NREM sleep (Fig. 5D). The reactivation strength of the task neuron (i.e., the increase of the task neuron's firing rate from baseline when the task cue is reactivated) was kept constant during global up states and decreased during local up states (Fig. 5C). Under the gradual decrease of the task-cue reactivation rate, synaptic potentiation of population G kept the reactivation strength of the task neuron during global up states constant, whereas synaptic depression of population L further decreased the reactivation strength during local up states. This was consistent with the experimental result that reactivation strengths in the spindles nested in global and local up states were preserved and weakened, respectively (13).
Next, we investigated the roles of global and local slow waves in memory reorganization separately by inhibiting synaptic plasticity during either global or local up states. Note that in the simulation, synaptic plasticity within 50 ms after global or local up states was also inhibited, to eliminate synaptic plasticity during the transition states, which is sensitive to arbitrary model assumptions (see SI Appendix Methods for details). The synaptic depression of population L was inhibited when synaptic plasticity was blocked during local up states. Likewise, the synaptic potentiation of population G was inhibited when synaptic plasticity was blocked during global up states (Fig. 5C). As a result, task performance showed a greater increase in the former case but a decrease in the latter case (Fig. 5D). These results are consistent with the experimental findings that global and local slow waves contribute to memory consolidation and forgetting, respectively (13). The reactivation strength of the task neuron changed monotonically with the synaptic weights: the decrease in the reactivation strength during local up states became slower when synaptic plasticity was blocked during local up states, and the reactivation strength during global up states decreased when synaptic plasticity was blocked during global up states. This was consistent with the experimental result that inhibition of global up states promoted a gradual decrease of reactivation strength in the spindles nested in global up states (13).
These results suggest that the balance of global and local slow waves with distinct information transfer capacities regulates the spectrum of memory consolidation and forgetting via the infomax synaptic plasticity rule.

Discussion
Using a top-down approach based on information theory, we provided a unified learning rule, the infomax rule, for the state-dependent synaptic plasticity during NREM sleep. The infomax rule comprises synaptic changes driven by the information term and synaptic depression driven by the synaptic cost term (Fig. 6A). A high firing rate condition biases the synaptic plasticity toward depression, because the signal-to-noise ratio of synaptic transmission declines with the background firing rate of the postsynaptic neuron, so the cost term dominates the information term under a high firing rate condition. The learning rule yields an information-theoretic interpretation of the different STDP observed in the up and down states (20). Moreover, it also accounts for the distinct STDP during global and local slow waves, suggesting a possible mechanism for balancing memory consolidation and forgetting. These properties are consistent with the role of neuronal reactivation in global and local slow waves (13).
The infomax rule not only reproduces the STDP biased toward depression during up states (20) but also provides a prediction of the entire STDP curve during up states; the original experiment measured the synaptic change at only a few representative time differences (10, 50, and −10 ms) between presynaptic and postsynaptic spikes. The infomax rule further predicts that the STDP curve is sensitive to the initial synaptic weights and the number of synchronous inputs (SI Appendix, Figs. S3 and S4). These predictions are experimentally testable using protocols similar to those of ref. (20), with various time differences and altered strengths of presynaptic stimulation. Importantly, the infomax rule also predicts that relatively weak synapses can be potentiated even in the up state, if the surrounding area is also in the up state and a large number of presynaptic neurons emit synchronous spikes (SI Appendix, Fig. S4). Consistent with this predicted potentiation of weak synapses during sleep, recent experimental findings suggest that synaptic potentiation and the formation of new spines play an essential role in memory consolidation (41)(42)(43)(44)(45). Therefore, the infomax rule provides a unifying view by reproducing both the state-dependent STDP (Fig. 4) and the wave-scale-dependent memory reorganization (Fig. 5). Note that the STDP curve during up states was measured in urethane-anesthetized young mice (20). Since animal age or anesthesia could affect synaptic plasticity (46,47), whether this STDP rule applies to physiological sleep in adult mice needs to be explored in future studies.
The infomax rule requires estimating the expected firing rate of each postsynaptic neuron in real time to set the activity threshold separating synaptic potentiation and depression. There are several biologically plausible implementations of this computation. The simplest estimate uses a temporally averaged firing rate; however, it tends to lag behind the true instantaneous firing rate in the presence of slow waves. This possibility is also at odds with the experimental observation that manipulating the postsynaptic membrane potential to be depolarized during down states or hyperpolarized during up states did not significantly affect the synaptic changes (20). This experimental result is reproduced if the expected instantaneous firing rate is accurately estimated from the average excitatory firing rate of the local network population, because the artificial single-neuron manipulation does not significantly change the local population activity. It is possible that inhibitory neurons receiving projections from nearby excitatory neurons compute the average firing rate, which is consistent with the observation that inhibitory input can modulate the balance of synaptic potentiation and depression (48). Alternatively, astrocytes may temporally and spatially integrate nearby synaptic inputs and regulate the activity threshold separating synaptic potentiation and depression (49).
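To illustrate why a temporally averaged single-neuron estimate lags behind the instantaneous rate while a local-population average does not, the following sketch simulates a down-to-up rate step. All parameters (averaging time constant, population size, window length) are hypothetical choices for illustration, not values from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1e-3            # 1 ms time step, in seconds
tau = 0.5            # single-neuron averaging time constant (hypothetical), s
t = np.arange(int(2.0 / dt)) * dt
rate = np.where(t < 1.0, 2.0, 20.0)   # down-to-up rate step at t = 1 s, Hz

# Single-neuron estimate: exponential moving average of the neuron's own spikes
spikes = rng.random(t.size) < rate * dt
ema = np.empty(t.size)
acc = rate[0]
for i, s in enumerate(spikes):
    acc += (dt / tau) * (s / dt - acc)   # low-pass filter of the spike train, Hz
    ema[i] = acc

# Population estimate: mean rate of 200 similar neurons in a 50 ms window
pop = (rng.random((200, t.size)) < rate * dt).mean(axis=0) / dt
pop_avg = np.convolve(pop, np.ones(50) / 50, mode="same")

probe = int(1.05 / dt)   # 50 ms after the up transition
# The population estimate tracks the new 20 Hz rate much sooner than the EMA
print(ema[probe], pop_avg[probe])
```

The design point is that the population average exchanges temporal smoothing for spatial averaging, so it can follow up/down transitions on the slow-wave timescale.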
The infomax rule suggests the potential importance of down states for memory consolidation. Although some studies have regarded down states as resting periods for cellular maintenance (50), it has been reported that hyperpolarization might be essential for slow waves to induce synaptic potentiation (44). A recent study further suggested that neuronal activities during down states ("delta spikes") coincide with hippocampal ripple activities and might be important for memory consolidation (40). The infomax rule suggests that down states, with fewer background spikes, promote synaptic potentiation more than up states, implying that delta spikes could effectively induce synaptic potentiation and memory consolidation. Many studies have also focused on the involvement of neuronal activities in up states or in the transition period from down to up states during memory consolidation (13)(51)(52)(53). Such distinct roles of neuronal reactivation during the up and down states warrant further study.
The infomax rule in our sleep model suggests the importance of the spatial scale of slow waves (12,54). Specifically, the present model suggests that the synaptic changes induced by global slow waves are dominated by potentiation compared with local slow waves because of the different baseline firing rates. Although we did not explicitly model spindles in this study, including a spindle-generating mechanism in the model is an important future direction. In the current study, we simply assumed that slow-wave-nested spindles would be generated upon memory reactivation in the task neuron's up states and that they are necessary for inducing synaptic plasticity. This assumption agrees with the argument that the triple coupling of slow waves, spindles, and reactivation is crucial for memory consolidation (2,18,34,35), in which spindles promote synaptic plasticity by enhancing dendritic activities (2,(35)(36)(37)(38). Under this assumption, our model behavior is consistent with the experimental results of Kim et al.: optogenetic inhibition of the primary motor cortex during global and local up states caused a decrease and increase, respectively, in the ratio of the number of spindles nested in global slow waves to that in local slow waves (13). This result is expected if cortical activity in up states is needed to facilitate thalamic spindle generation (2,15,55). Under the same assumption, the extreme condition in which nested spindles are entirely absent during global or local up states corresponds to the absence of synaptic plasticity during those states (as explored in Fig. 5). The suggested role of memory consolidation during global slow waves and forgetting during local slow waves raises the possibility that mechanisms exist that regulate the spatial scale of slow waves to select a subset of reactivation events for consolidation (Fig. 6B). One possibility is that some neuronal populations projecting broadly to cortical neurons promote global synchronization.
Previous studies have consistently suggested that the thalamus (56) and claustrum (57) play a role in synchronizing the down states of multiple cortical neurons. If such neuronal populations are co-active with reactivation patterns, these patterns could be selectively consolidated. Considering the high temporal correlation between the hippocampal sharp-wave ripples (SWRs) and cortical slow waves (58), specific neuronal populations may regulate both the generation of slow waves and neuronal reactivation. This possibility needs to be explored further.
Our theory could be tested experimentally on the following two points. First, the model predicts that excitatory firing rates during global up states are lower than during local up states. Although direct evidence supporting this prediction has not been reported to our knowledge, it is indirectly supported by theoretical and experimental studies outside sleep research. Our network operates as an ISN in its up states (31)(32)(33)(59). ISN models reproduce the experimentally observed surround suppression effect, i.e., the reduction in both local excitatory and inhibitory firing rates upon the activation of surrounding populations (31,33). This property is sufficient (but not necessary) for our model to exhibit lower excitatory firing rates in the up states of more global slow waves. Notably, a hallmark ISN property, the nonmonotonic inhibitory response, has been experimentally verified across cortical areas (60). Furthermore, recent technical advances in recording from large numbers of neurons (61) allow up and down states to be annotated in each local area more precisely. In the future, such recordings could directly test the model prediction that the mean firing rates of excitatory neurons decline with the spatial scale of slow waves. Second, the model predicts that global slow waves should permit more efficient information transmission than local slow waves. For example, one could optogenetically stimulate a small group of neurons and quantify the accuracy of stimulus encoding in neurons postsynaptic to the stimulated neurons during global and local slow waves.
Another interesting perspective is the qualitative reorganization of memories during sleep (62). While our model focuses on synaptic plasticity and quantitative memory reorganization (i.e., consolidation vs. forgetting), a recent theory proposes that a learning cycle mimicking wakefulness, NREM sleep, and REM sleep promotes the formation of new cortical representations, not just the strengthening or weakening of experiences (63). Bridging synaptic plasticity rules obtained mainly in rodent experiments with the qualitative memory reorganization proposed in cognitive studies is an interesting future direction.
In summary, the proposed theory bridges neuronal information coding, synaptic plasticity, and memory reorganization. Our normative framework provides a versatile learning rule for state-dependent synaptic plasticity and memory reorganization during NREM sleep.

Spiking neuron model
We introduced a stochastic spiking neuron model; each neuron was either excitatory (E) or inhibitory (I). The spikes of each neuron in population P (P = E, I) were generated probabilistically with density ρ(t) = g_P(u(t))R(t), where g_P(u) denoted a softplus activation intensity function, u(t) the membrane potential, and R(t) a refractory factor representing transient suppression of the instantaneous postsynaptic firing rate after a postsynaptic spike. The activation function was g_P(u) = r_0^P log(1 + exp((u − u_0^P)/Δu^P)) with r_0^E = 1.5 Hz, r_0^I = 6.0 Hz, u_0^E = −69.4 mV, u_0^I = −62.5 mV, and Δu^E = Δu^I = 0.5 mV. The refractory factor was modeled identically for excitatory and inhibitory neurons as R(t) = (t − t̂)^4 / ((t − t̂)^4 + τ_R^4), where t̂ denoted the last spike time of the postsynaptic neuron and the time constant was τ_R = 30 ms. The suppression of this factor after spiking may reflect several mechanisms, including classical refractoriness (64), afterhyperpolarization (AHP) (65), and EPSP suppression by back-propagating action potentials (66).
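A minimal Python sketch of this spike-generation scheme follows. The softplus intensity uses the parameters above; the fourth-power shape of the refractory recovery is an assumption of this sketch, reconstructed from the stated time constant, and should be checked against the original formulation.

```python
import numpy as np

def intensity(u, r0=1.5, u0=-69.4, du=0.5):
    """Softplus activation g_P(u) = r0 * log(1 + exp((u - u0)/du)), in Hz.
    Defaults are the excitatory-population parameters from the text."""
    return r0 * np.log1p(np.exp((u - u0) / du))

def refractory(t_since_spike, tau_r=30.0):
    """Refractory factor recovering from 0 toward 1 over ~tau_r = 30 ms.
    The fourth-power recovery shape is an assumption of this sketch."""
    s = (np.asarray(t_since_spike, dtype=float) / tau_r) ** 4
    return s / (1.0 + s)

def spike_prob(u, t_since_spike, dt=1.0):
    """Probability of a spike in one dt = 1 ms bin: rho(t) * dt (Hz -> 1/ms)."""
    return intensity(u) * refractory(t_since_spike) * dt * 1e-3

# At u = u0 the softplus gives r0 * log(2), i.e., about 1.04 Hz for E neurons
print(intensity(-69.4))
```

In a discrete-time simulation (1 ms steps, as in the paper), a spike would be drawn each step as a Bernoulli trial with probability `spike_prob(u, t - t_hat)`.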

Single-neuron model
We modeled a postsynaptic neuron that received feedforward inputs from N presynaptic excitatory neurons. Each spike of presynaptic neuron j evoked an EPSP of amplitude w_j and exponential time course ε(t) = exp(−t/τ_m^E)H(t), with time constant τ_m^E = 25 ms and Heaviside step function H(t). The EPSP time course from presynaptic neuron j was denoted by h_j(t) = Σ_f ε(t − t_j^f), summing the influence of presynaptic spikes at times t_j^f (f = 1, 2, …). The postsynaptic membrane potential was then u(t) = u_r + Σ_j w_j h_j(t), with resting membrane potential u_r = −70 mV. Postsynaptic spikes were generated probabilistically with density ρ(t) = g_E(u(t))R(t), as described in the previous section.
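The EPSP summation above can be sketched directly; this is a minimal illustrative implementation of the kernel and membrane-potential equations, not the paper's simulation code.

```python
import numpy as np

TAU_M_E = 25.0   # ms, EPSP decay time constant
U_REST = -70.0   # mV, resting membrane potential

def epsp(t):
    """Exponential EPSP kernel eps(t) = exp(-t / tau_m) * H(t)."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0.0, np.exp(-t / TAU_M_E), 0.0)

def membrane_potential(t, weights, spike_times):
    """u(t) = u_r + sum_j w_j * h_j(t), where h_j sums EPSP kernels
    over the spike times of presynaptic neuron j."""
    u = U_REST
    for w_j, times in zip(weights, spike_times):
        u += w_j * epsp(t - np.asarray(times, dtype=float)).sum()
    return u

# A single presynaptic spike at t = 0 with w = 1 mV depolarizes by 1 mV
print(membrane_potential(0.0, [1.0], [[0.0]]))  # -69.0
```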

Information maximizing learning rule
We used the infomax rule for synaptic plasticity (24,25). The objective function was L = I − λC, with mutual information term I, synaptic weight cost term C, and coefficient parameter λ = 0.32 [1/(mV)²]. I measures the mutual information between the presynaptic and postsynaptic spike trains. We omitted the homeostatic term included in previous studies because it did not contribute to our results. The terms were I = ⟨log(P(Y|X)/P(Y))⟩_{Y,X} and C = ⟨Σ_j n_j w_j²⟩_X, with presynaptic and postsynaptic spike trains X and Y, respectively, where n_j represents the number of presynaptic spikes at synapse j during duration T. The presynaptic and postsynaptic spike trains up to time t were formally denoted by X(t) = {t_j^f ≤ t} and Y(t) = {t_post^f ≤ t}, where t_post^f represents the f-th (f = 1, 2, …) postsynaptic spike timing. We specifically wrote X = X(T) and Y = Y(T) to represent the entire spike trains from time t = 0 to t = T. Note that the angular brackets ⟨·⟩_{Y,X} and ⟨·⟩_X represent the averages over all possible Y and X, and over all possible X, respectively.
The optimal synaptic weight change followed the gradient ascent algorithm, Δw_j = α ∂L/∂w_j, with learning rate α = 0.01 (mV)². By calculating the gradient (24,25), the infomax rule was expressed in terms of an eligibility trace C_j(t) with time constant τ_C = 100 ms, a postsynaptic factor B_post(t), and the expected firing rate ρ̄(t) = ⟨ρ(t)⟩_{X(t)|Y(t)} (explicit forms of C_j and B_post for a simplified setting are given in the STDP analysis below). The expected firing rate was further computed as ρ̄(t) = ḡ(t)R(t) using the expected intensity ḡ(t) = ⟨g_E(u(t))⟩_{X(t)|Y(t)}; however, this value can be difficult to calculate if ḡ is time dependent. We estimated ḡ using slightly different methods for the single-neuron and slow-wave models, as described in the corresponding sections (SI Appendix Methods).

Theoretical analysis of the STDP effect
For the theoretical analyses in this section, the activation function is not restricted to the softplus function. We evaluated the synaptic changes due to pre-post stimulation with synchronous presynaptic spikes at t = 0 and a postsynaptic spike immediately afterward (at t = Δt > 0 in the limit Δt → 0). To estimate the effect of STDP analytically and qualitatively, we made some simplifications. We approximated that the refractory factor R abruptly recovered from zero to one a duration τ_R after the postsynaptic spike, and that the spontaneous spiking of the nonstimulated presynaptic neurons yielded a constant baseline membrane potential u_0. We also assumed that, after presynaptic stimulation at t = 0, the membrane potential u(t) = u_0 + Σ_{j∈stim} w_j h_j(t) decayed back to the baseline u_0 by the time the postsynaptic neuron recovered from refractoriness, because τ_m^E was smaller than τ_R. Here, stim denotes the set of stimulated neurons. The baseline intensity is denoted by ḡ = g_E(u_0), and the peak intensity increase after presynaptic stimulation by Δg = g_E(u_0 + Δu) − ḡ, with Δu = Σ_{j∈stim} w_j. In this case, we found C_j(t) = H(t) [∂g_E(u)/∂u]_{u=u_0+Δu} / (ḡ + Δg) · exp(−t/τ_C) and B_post(t) = y(t) log(g_E(u(t))/ḡ), using the Heaviside step function H(t). Note that no extra postsynaptic spike was possible while h_j was significantly positive, because the refractory period was longer than the stimulus-evoked EPSP duration. Hence, the information-term change in the j-th synaptic weight due to the STDP stimulus was Δw_j = α [∂g_E(u)/∂u]_{u=u_0+Δu} / (ḡ + Δg) · log(1 + Δg/ḡ). In the case of a linear activation function, g_E(u) = g_0·(u − u_r), the synaptic change becomes Δw_j = α g_0/(ḡ + Δg) · log(1 + Δg/ḡ), where Δg does not depend on ḡ. This result indicates that the increase in synaptic weights due to the information term decreases with the mean firing intensity ḡ.
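The monotonic decrease of the information-term potentiation with the baseline intensity ḡ can be checked numerically for the linear-activation case. The parameter values (α, g_0, Δg, and the ḡ range) are arbitrary choices for illustration.

```python
import numpy as np

def info_term_dw(g_bar, delta_g, g0=1.0, alpha=1.0):
    """Information-term weight change for a linear activation function:
    dw = alpha * g0 / (g_bar + delta_g) * log(1 + delta_g / g_bar)."""
    return alpha * g0 / (g_bar + delta_g) * np.log1p(delta_g / g_bar)

g_bars = np.linspace(0.5, 20.0, 40)        # baseline intensities, Hz
dws = info_term_dw(g_bars, delta_g=5.0)    # fixed stimulus-evoked increase

# Potentiation by the information term shrinks as the baseline rate grows,
# so a fixed synaptic cost term tips the balance toward depression at high
# rates (up states) but not at low rates (down states).
print(dws[0] > dws[-1])  # True
```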

Slow-wave model
We considered N_E = 800 excitatory (E) neurons and N_I = 200 inhibitory (I) neurons, divided into four local networks. Each local network contained N_E/4 excitatory neurons and N_I/4 inhibitory neurons. Within each local network, no connections existed between two inhibitory neurons, and all-to-all connections existed between two excitatory neurons as well as between excitatory and inhibitory neurons. Between different local networks, the connection probability from E to E was p_EE and the connection probability from E to I was p_IE, whereas inhibitory neurons did not send long-range connections to the other local networks. The connection probabilities p_EE and p_IE were set to 0.05 and 0.3, respectively, except in Fig. 3C and SI Appendix, Fig. S5. There was no self-coupling. The dynamics of the membrane potential u_i^P of neuron i in population P (P = E, I) were described by du_i^E/dt = −(u_i^E − u_r)/τ_m^E + Σ_j w_ij^EE S_j^E(t) + Σ_j w_ij^EI S_j^I(t) + I_i^ext(t) + I_i^a(t) and du_i^I/dt = −(u_i^I − u_r)/τ_m^I + Σ_j w_ij^IE S_j^E(t), with recurrent synaptic weights w_ij^EE, w_ij^EI, and w_ij^IE from excitatory neuron j to excitatory neuron i, from inhibitory neuron j to excitatory neuron i, and from excitatory neuron j to inhibitory neuron i, respectively; membrane time constants τ_m^E = 25 ms for excitatory neurons and τ_m^I = 5 ms for inhibitory neurons; resting membrane potential u_r = −70 mV; spike trains S_j^E(t) = Σ_f δ(t − t_j^f) and S_j^I(t) = Σ_f δ(t − t_j^f) of excitatory and inhibitory neurons j with their f-th spike times t_j^f; external current I_i^ext(t); and adaptation current I_i^a(t). The spikes of neuron i in population P (P = E, I) were generated probabilistically with instantaneous firing rate g_P(u_i^P(t))R_i(t), as described in the previous section, with refractory factor R_i. The recurrent synaptic weights were fixed at w_ij^EE = 0.16 mV, w_ij^EI = −0.14 mV, and w_ij^IE = 0.66 mV where synaptic connections existed from neuron j to neuron i, and set to 0 mV otherwise.
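The connectivity rules above can be sketched as follows; this is a minimal construction of the connection masks and weight matrices (weights in mV as in the text), with an arbitrary random seed.

```python
import numpy as np

rng = np.random.default_rng(1)
N_E, N_I, M = 800, 200, 4          # excitatory, inhibitory, local networks
P_EE, P_IE = 0.05, 0.3             # long-range connection probabilities

net_e = np.repeat(np.arange(M), N_E // M)   # local-network label per E neuron
net_i = np.repeat(np.arange(M), N_I // M)   # local-network label per I neuron

same_ee = net_e[:, None] == net_e[None, :]  # E and E in the same network
same_ie = net_i[:, None] == net_e[None, :]  # I (row) and E (col) in the same network

# E->E: all-to-all within a network, probability P_EE between networks
conn_ee = np.where(same_ee, True, rng.random((N_E, N_E)) < P_EE)
np.fill_diagonal(conn_ee, False)            # no self-coupling
# E->I: all-to-all within a network, probability P_IE between networks
conn_ie = np.where(same_ie, True, rng.random((N_I, N_E)) < P_IE)
# I->E: all-to-all within a network only (no long-range inhibition, no I->I)
conn_ei = net_e[:, None] == net_i[None, :]

w_ee = 0.16 * conn_ee    # mV
w_ei = -0.14 * conn_ei   # mV
w_ie = 0.66 * conn_ie    # mV
```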
The external current was a feedforward input only to excitatory neuron 1, described as I_i^ext(t) = δ_i1 Σ_{j=1}^{N_ext} w_j^ext S_j^ext(t), with the Kronecker delta δ_i1, synaptic weights w_j^ext, and external presynaptic spike trains S_j^ext(t) (SI Appendix, Fig. S7); these external presynaptic neurons did not emit spontaneous spikes. The STDP and task simulations (Figs. 4 and 5 and SI Appendix, Fig. S7) changed the feedforward synaptic weights w_j according to the infomax rule. The dynamics of the adaptation current I_i^a(t) of neuron i were described as dI_i^a/dt = −I_i^a/τ_a − β S_i^E(t), with time constant τ_a = 1,500 ms and constant β = 0.0077 mV/ms.

Stage classification
First, we classified the up and down states of each local network using two transition thresholds. When the mean membrane potential of excitatory neurons in a local network exceeded the up-transition threshold θ_up = −68.25 mV, this moment was judged as a state transition to the up state. When the mean membrane potential fell below the down-transition threshold θ_down = −69.75 mV, this moment was judged as a state transition to the down state. We then classified each state as global or local. When a local network was in the down state, it was classified as a global down state if two or three of the other local networks were also in the down state, and as a local down state otherwise. Likewise, when a local network was in the up state, it was classified as a global up state if two or three of the other local networks were also in the up state, and as a local up state otherwise.
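The two-threshold (hysteresis) classification and the global/local labeling can be sketched as below; the example trace is illustrative, not simulation data.

```python
def classify_up_down(mean_u, theta_up=-68.25, theta_down=-69.75, initial="down"):
    """Label each time step of a local network's mean excitatory membrane
    potential (mV) as 'up' or 'down' using two hysteresis thresholds."""
    state, labels = initial, []
    for u in mean_u:
        if state == "down" and u > theta_up:
            state = "up"           # transition to the up state
        elif state == "up" and u < theta_down:
            state = "down"         # transition to the down state
        labels.append(state)
    return labels

def global_or_local(state, other_states):
    """A state is 'global' if two or three of the other local networks
    are in the same state, and 'local' otherwise."""
    return "global" if sum(s == state for s in other_states) >= 2 else "local"

trace = [-69.9, -69.0, -68.0, -69.0, -69.9]   # crosses both thresholds
print(classify_up_down(trace))   # ['down', 'down', 'up', 'up', 'down']
```

The hysteresis gap between θ_up and θ_down prevents rapid relabeling when the mean potential fluctuates near a single threshold.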

Simulation environment
All numerical calculations were performed using custom-written Python code. The model was simulated in discrete time with a time step of 1 ms. Synaptic weights did not change during the first 10,000 ms (200,000 ms in Fig. 5 and SI Appendix, Fig. S7) of all simulations to avoid effects of the initial values. The initial value of the last spike time was set to −10,000 ms for all neurons.