Adaptation of spontaneous activity in the developing visual cortex

Spontaneous activity drives the establishment of appropriate connectivity in different circuits during brain development. In the mouse primary visual cortex, two distinct patterns of spontaneous activity occur before vision onset: local low-synchronicity events originating in the retina and global high-synchronicity events originating in the cortex. We sought to determine the contribution of these activity patterns to jointly organize network connectivity through different activity-dependent plasticity rules. We postulated that local events shape cortical input selectivity and topography, while global events homeostatically regulate connection strength. However, to generate robust selectivity, we found that global events should adapt their amplitude to the history of preceding cortical activation. We confirmed this prediction by analyzing in vivo spontaneous cortical activity. The predicted adaptation leads to the sparsification of spontaneous activity on a slower timescale during development, demonstrating the remarkable capacity of the developing sensory cortex to acquire sensitivity to visual inputs after eye-opening.


Introduction
The impressive ability of the newborn brain to respond to its environment and generate coordinated output without any prior experience suggests that brain networks undergo substantial organization, tuning and coordination even as animals are still in the womb, driven by powerful developmental mechanisms. These broadly belong to two categories: activity-independent mechanisms, involving molecular guidance cues and chemoaffinity gradients which establish the initial coarse connectivity patterns at early developmental stages (Feldheim and O'Leary, 2010;Goodhill, 2016), and activitydependent plasticity mechanisms which continue with refinement of this initially imprecise connectivity into functional circuits that can execute diverse behaviors in adulthood Richter and Gjorgjieva, 2017;Thompson et al., 2017). Non-random patterns of spontaneous activity drive these refinements and act as training inputs to the immature circuits before the onset of sensory experience. Many neural circuits in the developing brain generate spontaneous activity, including the retina, hippocampus, cortex, and spinal cord (reviewed in Blankenship and Feller, 2010;Wang and Bergles, 2015). This activity regulates a plethora of developmental processes such as neuronal migration, ion channel maturation, and the establishment of precise connectivity (Huberman et al., 2008;Moody and Bosma, 2005;Kirkby et al., 2013;Godfrey and Swindale, 2014), and perturbing this activity impairs different aspects of functional organization and axonal refinement (Cang et al., 2005a;Xu et al., 2011;Burbridge et al., 2014). These studies firmly demonstrate that spontaneous activity is necessary and instructive for the emergence of specific and distinct patterns of neuronal connectivity in the developing nervous system. Recent in vivo recordings in the developing sensory cortex have found that the spatiotemporal properties of spontaneous activity, including frequency, synchronicity, amplitude and spatial spread, depend on the studied region and developmental age (Golshani et al., 2009;Rochefort et al., 2009;Gribizis et al., 2019). These studies have shown that the generation and propagation of spontaneous activity in the intact cortex depend on input from different brain areas. For instance, activity from the sensory periphery substantially contributes to the observed activity patterns in the developing cortex, but there are other independent sources of activity within the cortex itself (Ackman et al., 2012;Siegel et al., 2012;Hanganu et al., 2006;Gribizis et al., 2019). Two-photon imaging of spontaneous activity in the in vivo mouse primary visual cortex before eye-opening (postnatal days, P8-10) has demonstrated that there are two independently occurring patterns of spontaneous activity with different sources and spatiotemporal characteristics. Peripheral events driven by retinal waves (Feller et al., 1996;Blankenship and Feller, 2010) spread in the cortex as low-synchronicity local events (L-events), engaging a relatively small number of the recorded neurons. In contrast, events intrinsic to the cortex that are unaffected by manipulation of retinal waves spread as highly synchronous global events (H-events), activating a large proportion of the recorded neurons (Siegel et al., 2012).
We know relatively little about the information content of these local and global patterns of spontaneous cortical activity relevant for shaping local and brain-wide neural circuits. Specifically, it is unknown whether spontaneous activity from different sources affects distinct aspects of circuit organization, each providing an independent instructive signal, or if L-and H-events cooperate to synergistically guide circuit organization. Therefore, using experimentally characterized properties of spontaneous activity in the visual cortex in vivo at P8-10, we developed a biologically plausible, yet analytically tractable, theoretical framework to determine the implications of this activity on normal circuit development with a focus on the topographic refinement of connectivity and the emergence of stable receptive fields.
We postulated that peripheral L-events play a key role in topographically organizing receptive fields in the cortex, while H-events regulate connection strength homeostatically, operating in parallel to network refinements by L-events. We considered that H-events are ideally suited for this purpose because they maximally activate many neurons simultaneously, and hence lack topographic information that can be used for synaptic refinement. We studied two prominent activity-dependent plasticity rules to investigate the postulated homeostatic function of H-events, the Hebbian covariance rule (Miller et al., 1989;Miller, 1994;Lee et al., 2002;Sejnowski, 1977) and the Bienenstock-Cooper-Munro (BCM) rule (Bienenstock et al., 1982). In the Hebbian covariance rule, simultaneous pre-and postsynaptic activation (e.g. during L-events) triggers the selective potentiation of synaptic connections, while postsynaptic activation without presynaptic input (e.g. during H-events) leads to the unselective depression of all connections. In the BCM rule, H-events dynamically regulate potentiation and depression. However, both rules generate receptive fields that have either refinement or topography defects. Therefore, we proposed that H-events might be self-regulating, with amplitudes that adapt to the levels of recent cortical activity. Indeed, we found evidence of this adaptation in spontaneous activity recorded in the developing visual cortex (Siegel et al., 2012). Besides generating topographically refined receptive fields, this adaptation leads to the sparsification of cortical spontaneous activity over a prolonged timescale of development as in the visual and somatosensory cortex (Rochefort et al., 2009;Golshani et al., 2009). Therefore, our work proposes that global, cortically generated activity in the form of H-events rapidly adapts to ongoing network activity, supporting topographic organization of connectivity and maintaining synaptic strengths in an operating regime.

A network model for connectivity refinements driven by spontaneous activity
How spontaneous activity instructs network refinements between the sensory periphery and the visual cortex depends on two aspects: the properties of spontaneous activity and the activity-dependent learning rules that translate these properties into specific changes in connectivity. We first characterized spontaneous activity in the mouse primary visual cortex before eye-opening, and investigated two prominent learning rules to organize connectivity in a network model of the thalamus and visual cortex.
Spontaneous activity recorded in vivo using two-photon Ca 2+ imaging exhibits two independently occurring patterns: network events originating in the retina and propagating through the thalamus, and network events generated in the cortex (Siegel et al., 2012; Figure 1A). These two types of events were first identified by a cluster analysis based on event amplitude and jitter (a measure of synchrony; Siegel et al., 2012). The analysis identified a participation rate criterion to separate network events into local low-synchronicity (L-) events generated in the retina, where 20-80% of the neurons in the field of view are simultaneously active, and global high-synchronicity (H-) events intrinsic to the cortex, where nearly all (80-100%) cortical neurons are simultaneously active. This same 80% participation rate criterion was recently validated both at the single-cell and population levels (Leighton et al., 2020). We first confirmed differences in specific features of the recorded spontaneous events (Siegel et al., 2012), and also characterized novel aspects ( Figure 1B). In particular, L-events have a narrow distribution of amplitudes and inter-event intervals (IEI, the inverse of firing frequency) that follow an exponential-like distribution. H-events have a broader distribution of amplitudes with higher values and IEIs that follow a long-tailed distribution with higher values relative to L-events. We found that L-and H-events have similar durations.
Next, we built a model that incorporates these two different patterns of spontaneous activity to investigate the potentially different roles that L-and H-events might play in driving connectivity refinements between the thalamus and the visual cortex ( Figure 1C). We used a one-dimensional feedforward network model -a microcircuit motivated by the small region of cortex imaged experimentally -composed of two layers, an input (presynaptic) layer corresponding to the sensory periphery (the thalamus) and a target (postsynaptic) layer corresponding to the primary visual cortex ( Figure 2A). Cortical activity v in the model is generated by two sources ( Figure 2B; Table 1). First, L-events, u, activate a fraction between 20% and 80% of neighboring thalamic cells (also referred to as the L-event size) and drive the cortex through the weight matrix, W. Second, H-events, v spon , activate the majority of the cortical cells (a fraction between 80% and 100%, also referred to as the  H-event size). We used a rate-based unit with a membrane time constant t m and linear activation function consistent with the coarse temporal structure of spontaneous activity during development, carrying information on the order of hundreds of milliseconds (Gjorgjieva et al., 2009;Butts and Kanold, 2010;Richter and Gjorgjieva, 2017): To investigate the refinement of network connectivity during development, we studied the evolution of synaptic weights using plasticity rules operating over long timescales identified experimentally (Butts et al., 2007;Winnubst et al., 2015). First, we examined a classical Hebbian plasticity rule where coincident presynaptic thalamic activity and postsynaptic cortical activity in the form of L-events leads to synaptic potentiation. We postulated that H-events act homeostatically and maintain synaptic weights in an operating regime by depressing the majority of synaptic weights in the absence of peripheral drive. Because they activate most cortical neurons simultaneously, H-events lack the potential to drive topographical refinements. Their postulated homeostatic action resembles synaptic depression through downscaling, as observed in response to highly correlated network activity, for instance, upon blocking inhibition (Turrigiano and Nelson, 2004), or during slow-wave sleep (Tononi and Cirelli, 2006). Therefore, to the Hebbian rule we added a non-Hebbian term that depends only on the postsynaptic activity, with a proportionality constant that controls the relative amount of synaptic depression. This differs from other Hebbian covariance plasticity rules for the generation of weight selectivity, which include non-Hebbian terms that depend on both pre-and postsynaptic activity (Lee et al., 2002;Mackay and Miller, 1990) and is mathematically related to models of heterosynaptic plasticity (Chistiakova et al., 2014;Lynch et al., 1977;Zenke et al., 2015). Hence, the change in synaptic weight between cortical neuron j and thalamic neuron i is given by:  Figure 2. A network model of thalamocortical connectivity refinements. (A) A feedforward network with an input layer of thalamic neurons uðtÞ connected to an output layer of cortical neurons vðtÞ by synaptic weights WðtÞ. (B) Properties of L-and H-events in the model (amplitude L amp ; H amp , inter-event interval L int ; H int and duration L dur ; H dur ) follow probability distributions extracted from data (Siegel et al., 2012) (see Table 1). (C) Initially weak all-to-all connectivity with a small topographic bias along the diagonal (left) gets refined by the spontaneous activity events (right). (D) Evaluating network refinement through receptive field statistics (see Materials and methods). We quantify two properties: (1) the receptive field size and (2) the topography, which quantifies on average how far away the receptive field center of each cortical cell (red dot) is from the diagonal (dashed gray line).
where t w is the learning time constant and u the proportionality constant in the non-Hebbian term, which we refer to as the 'input threshold'. The activity time constant t m is much faster than the learning time constant, t m ( t w , which allows us to separate timescales and to study how network activity on average affects learning (see Appendix). Interestingly, in this Hebbian covariance rule, the input threshold together with H-events effectively implement a subtractive constraint (see Appendix: 'Normalization constraints'). Subtractive normalization preserves the sum of all weights by subtracting from each weight a constant amount independent of each weight strength and is known to generate selectivity and refined receptive fields (Miller and MacKay, 1994). This is in contrast to the alternative multiplicative normalization, which generates graded and unrefined receptive fields where most correlated inputs are represented (Miller and MacKay, 1994) and hence was not considered here. Additionally, we investigated the BCM learning rule, which can induce weight stability and competition without imposing constraints in the weights, and hence generate selectivity in postsynaptic neurons which experience patterned inputs (Bienenstock et al., 1982). For instance, the BCM framework can explain the emergence of ocular dominance (neurons in primary visual cortex being selective for input from one of the two eyes) and orientation selectivity in the visual system (Cooper et al., 2004). An important property of the BCM rule is its ability to homeostatically regulate the balance between potentiation and depression of all incoming inputs into a given neuron depending on how far away the activity of that neuron is from some target level. The change in synaptic weight between cortical neuron j and thalamic neuron i is given by: describes the threshold j v ðtÞ between depression and potentiation which slides as a function of postsynaptic activity, v 0 is the target rate of the cortical neurons and t the sliding threshold time constant. According to this rule, synaptic weight change is Hebbian in that it requires coincident preand postsynaptic activity, as is only the case during L-events. H-events induce no direct plasticity in the network because of the absence of presynaptic activation, but they still trigger synaptic depression indirectly by increasing the threshold between potentiation and depression.
Based on experimental measurements of the extent of thalamocortical connectivity at different developmental ages (Ló pez-Bendito, 2018), we assumed that initial network connectivity was weak and all-to-all, such that each cortical neuron was innervated by all thalamic neurons. To account for the activity-independent stage of development guided by molecular guidance cues and chemoaffinity gradients, a small bias was introduced to the initial weight matrix to generate a coarse topography in the network, where neighboring neurons in the thalamus project to neighboring neurons in the cortex and preserve spatial relationships ( Figure 2C, left). Following connectivity refinements through spontaneous activity and plasticity, a desired outcome is that the network achieves a stable topographic configuration ( Figure 2C, right) where each cortical neuron receives input only from a neighborhood of thalamic neurons.
To evaluate the success of this process, we quantified two properties. First, the receptive field size defined as the average number of thalamic neurons that strongly innervate a cortical cell ( Figure 2D). We normalized the receptive field size to the total number of thalamic cells, so that it ranges from 0 (no receptive field, all cortical cells decouple from the thalamus) to 1 (each cortical cell receives input from all the thalamic cells, all weights potentiate leading to no selectivity). We also quantified the topography of the final receptive field ( Figure 2D and Materials and methods), which evaluates how well the initial bias is preserved in the final network connectivity. The topography ranges from 0 (all cortical neurons connect to the same set of thalamic inputs) to 1 (perfect topography relative to the initial bias). We note that the lack of initial connectivity bias did not disrupt connectivity refinements and receptive field formation but could not on its own establish topography ( Spontaneous cortical H-events disrupt topographic connectivity refinement in the Hebbian covariance and BCM plasticity rules Both the Hebbian and the BCM learning rules are known to generate selectivity with patterned input stimuli (Mackay and Miller, 1990;Bienenstock et al., 1982), and we confirmed that L-events on their own can refine receptive fields in both scenarios (Figure 3-figure supplement 2). We found that including H-events in the Hebbian covariance rule requires that the parameters of the learning rule and the properties of H-events (the input threshold u and the inter-event interval H int ) follow a tight relationship to generate selective and refined receptive fields ( Figure 3A,C, left). For a narrow range of H int , weight selectivity emerges, but with some degree of decoupling between pre-and postsynaptic neurons ( Figure 3A, middle). Outside of this narrow functional range, individual cortical neurons are either non-selective ( Figure 3A, left) or decoupled from the thalamus ( Figure 3A, right). These results are robust to changes in the participation rates of L-and H-events. For instance, when H-events involve 70-100% of cortical neurons, the percent of outcomes with selective receptive fields increases slightly to 19.8% (compared to 14.0% when H-events involve 80-100% of cortical neurons), while the percent of outcomes with decoupled cortical neurons increases to 60.4% (compared to 43.6% when H-events involve 80-100% of cortical neurons), reinforcing the idea that H-events are detrimental to receptive field refinements. In comparison, including H-events in the BCM learning rule does not decouple pre-and postsynaptic neurons ( Figure 3B) and selectivity can be generated over a wider range of H-inter-event-intervals H int and target rates v 0 for the BCM rule ( Figure 3C, right).
Despite this apparent advantage of the BCM rule, it generates receptive fields with much worse topography than the Hebbian covariance rule ( Figure 3D). The underlying reason for this worse topography of the BCM rule is the sign of synaptic change evoked by L-events of different sizes corresponding to different participation rates. In particular, small L-events with low participation rates generate postsynaptic cortical activity smaller than the sliding threshold and promote long-term synaptic depression (LTD), while large L-events with high participation rates generate cortical activity larger than the sliding threshold and promote long-term synaptic potentiation (LTP) ( Figure 3E,F).   . Bottom: Percentage of simulation outcomes classified as 'selective' when the average receptive field size is smaller than one and larger than 0, 'non-selective' when the average receptive field size is equal to 1, and 'decoupled' when the average receptive field size is 0 for the two rules. (D) Topography of receptive fields classified as selective in C. Horizontal line indicates median, the box is drawn between the 25th and 75th percentile, whiskers extend above and below the box to the most extreme data points that are within a distance to the box equal to 1.5 times the interquartile range and points indicate all data points. Distributions are significantly different (***) as measured by a two-sample Kolmogorov-Smirnov test (n ¼ 70; 302 selective outcomes for each rule out of 500; p<10 À10 ; D = 0.45). (E) The response of a single cortical cell to L-events of different sizes (color) as a function of the sliding threshold for the BCM rule with H int ¼ 3:5 and v 0 ¼ 0:7. The cell's incoming synaptic weights from presynaptic thalamic neurons undergo LTP or LTD depending on L-event size. (F) Probability of L-event size contributing to LTD (left) and LTP (right) for the BCM rule with the same parameters as in E. The online version of this article includes the following figure supplement(s) for figure 3:  Therefore, the amount of information for connectivity refinements present in the small L-events is limited in the BCM learning rule resulting in poor topographic organization of receptive fields. Taken together, our results confirm that H-events can operate in parallel to network refinements by L-events and homeostatically regulate connection strength as postulated. However, the formation of receptive fields by the Hebbian covariance rule is very sensitive to small changes in event properties (e.g. inter-event intervals), which are common throughout development (Rochefort et al., 2009). In this case, H-events are disruptive and lead to the elimination of all thalamocortical synapses, effectively decoupling the cortex from the sensory periphery. In the BCM rule, including H-events prevents the decoupling of cortical cells from the periphery because the amount of LTD is dynamically regulated by the sliding threshold on cortical activity. However, L-events lose the ability to instruct topography because they generate LTP primarily when they are large. Therefore, neither learning rule seems suitable to organize network connectivity between the thalamus and cortex during development.

Adaptive H-events achieve robust selectivity
After comparing the distinct outcomes of the Hebbian and BCM learning rules in the presence of Land H-events, we proposed that a mechanism that regulates the amount of LTD during H-events based on cortical activity, similar to the sliding threshold of the BCM rule, could be a biologically plausible solution to mitigate the decoupling of cortical cells in the Hebbian covariance rule. This mechanism combined with the Hebbian learning rule could lead to refined receptive fields that also have good topographic organization. Hence, we postulated that H-events adapt by assuming that during H-events cortical cells scale their amplitude to the average amplitude of the preceding recent events. In particular, for each cortical cell j an activity trace h j integrates the cell's firing rate v j over a timescale t h slower than the membrane time constant: This activity trace h j then scales the intrinsic firing rate of the cortical cells during an H-event, H amp ! h j H amp , making it dependent on its recent activity. The activity trace h j might biophysically be implemented through a calcium-dependent signaling pathway that is activated upon sufficient burst depolarization and that is able to modulate a cell's excitability in the form of plasticity of intrinsic excitability (Desai et al., 1999;Daoudal and Debanne, 2003;Tien and Kerschensteiner, 2018). A fast, activity-dependent mechanism that decreases single-neuron excitability following a prolonged period of high network activity has been identified in spinal motor neurons of neonatal mice (Lombardo et al., 2018). However, there might be other ways to implement this adaptation (see Discussion).
Using adaptive H-events, we investigated the refinement of receptive fields in the network with the same Hebbian covariance rule ( Figure 4A). In sharp contrast to the Hebbian covariance rule with non-adaptive H-events ( Figure 3A), we observed that changing the average inter-event interval of H-events in a wider and more biologically realistic range (from the data, H int~3 L int ) yields selectivity and appropriately refined receptive fields ( Figure 4A). Increasing u or decreasing H int yields progressively smaller receptive fields while mitigating cortical decoupling ( Figure 4B). The refined receptive fields also have a very good topography because L-events in the Hebbian learning rule carry nearest-neighbor information for the topographic refinements ( Figure 4C). The proportion of selective receptive fields for adaptive H-events, however, is much higher than for their non-adaptive counterparts (390 vs. 70 out of 500 simulations). These results persist when the participation rates of L-and H-events change. For instance, when H-events involve 70-100% of cortical neurons, the percent of outcomes with selective receptive fields (75.0%) and the percent of outcomes with decoupled cortical neurons (0%) remain similar.
Next, we investigated how the proposed adaptive mechanism scales H-event amplitude by modulating the relative strength of H-events. For the Hebbian covariance rule, we calculated the analytical solution for weight development with L-and H-events by reducing the dimension of the system to two: one being the average of the weights that potentiate and form the receptive field, w RF , and the other being the average of the remaining weights, which we called 'complementary' to the receptive field, w C ( Figure 4D; see Appendix, Materials and methods). We calculated the phase  Percentage of simulation outcomes classified as 'selective' when the average receptive field size is smaller than one and larger than 0, 'non-selective' when the average receptive field size is equal to 1, and 'decoupled' when the average receptive field size is 0 for the two rules. (C) Topography of receptive fields classified as selective in B. Horizontal line indicates median, the box is drawn between the 25th and 75th percentile, whiskers extend above and below the box to the most extreme data points that are within a distance to the box equal to 1.5 times the interquartile range and points indicate all data points. Distributions are not significantly different (ns) as measured by a two-sample Kolmogorov-Smirnov test (n ¼ 70; 390 selective outcomes for each rule out of 500; p ¼ 0:41; D = 0.45). (D) Top: Reduction of the full weight dynamics into two dimensions. Two sets of weights were averaged: those which potentiate and form the receptive field, w RF , and the complementary set of weights that depress, w C . Bottom: Initial conditions in the reduced two-dimensional phase plane were classified into three outcomes: 'selective', 'non-selective', and 'decoupled'. We sampled 2500 initial conditions which evolved according to Equation 16 until the trajectories reached one of the selective fixed points, ðw max ; 0Þ and ð0; w max Þ, or resulted in no selectivity either because both weights depressed to ð0; 0Þ or potentiated to ðw max ; w max Þ. The normalized number of initial coordinates generating each region can be interpreted as the area of the phase plane that results in each outcome. (E) Top: Normalized area of the phase plane of the reduced two-dimensional system that resulted in 'selective', 'non-selective', and 'decoupled' outcomes for u ¼ 0:53 as a function of H-event strength.
The darker shading indicates ranges of non-adapted H-event strength where the selectivity area is maximized. Bottom: The corresponding adapted strength of H-events was calculated in simulations with adaptive H-events and plotted as a function of the nominal, non-adapted strength of H-events. Figure 4 continued on next page plane area of the reduced two-dimensional system with non-adaptive H-events (calculated as the proportion of initial conditions) that results in selectivity, potentiation or depression ( Figure 4D, bottom). We found that adaptively modulating the strength of H-events maximizes the area of the phase plane that results in selectivity ( Figure 4E, shaded region). The range of H-event strengths that maximizes the selective area for each input threshold in the reduced two-dimensional system can be related to the scaling of H-event amplitude in the simulations (Methods). In particular, the adaptation reliably shifts the H-event amplitude that would have occurred without adaptation, which we call 'non-adapted strength of H-events', into the regime of amplitudes that maximizes selectivity, which we call 'adapted strength of H-events' ( Figure 4E). Therefore, the adaptation of H-event amplitudes controls the selective refinement by peripheral L-events by modulating the cortical depression by adapted H-events.
In vivo spontaneous cortical activity shows a signature of adaptation To determine whether spontaneous cortical activity contains a signature of our postulated adaptation mechanism of H-event amplitudes, we reanalyzed published in vivo two-photon Ca 2+ imaging data recorded in the visual cortex of young mice (P8-10) (Siegel et al., 2012). We combined multiple consecutive~300 s long recordings for up to 40 mins of data from a given animal. First, we tested for long-term fluctuations in cortical excitability in the concatenated recordings of the same animal. We identified L-and H-events based on previously established criteria (Siegel et al., 2012). We found that the average amplitude of all (L and H) events is not significantly different across consecutive recordings of the same animal ( Next, we investigated the relationship between the amplitude of a given H-event and the average activity preceding it. For each detected H-event, we extracted all spontaneous (L-or H-) events that preceded this H-event up to T max ¼ 300 s before it. We then scaled the amplitude of each previous event multiplying it by an exponential kernel with a decay time constant of t decay ¼ 1000 s, which is sufficiently long to integrate many preceding spontaneous events (compared with the inter-event intervals in Figure 1B), and averaged these scaled amplitudes to get an aggregate quantity over amplitude and frequency (see Materials and methods).
We found that this aggregate amplitude of L-and H-events preceding a given H-event is significantly correlated (r ¼ 0:44, p<10 À10 ) to the amplitude of the selected H-event ( Figure 5B). Consequently, a strong (weak) H-event follows strong (weak) average preceding network activity ( Figure 5C), suggesting that cortical cells adapt their spontaneous firing rates as a function of their previous activity levels. The correlations are robust to variations in the inclusion criteria, maximum time T max to integrate activity and the exponential decay time constant t decay (Figure 5-figure supplement 3).

Modulating spontaneous activity properties affects receptive field refinements
Our results make relevant predictions for the refinement of receptive fields upon manipulating spontaneous activity. For example, H-event frequency can be experimentally reduced by a gap junction blocker (carbenoxolone) (Siegel et al., 2012). Our work demonstrates a trade-off between H-event frequency and the learning rule's threshold between potentiation and depression on receptive field size; hence, less frequent H-events will need a somewhat higher threshold to achieve the same receptive field size ( Figure 4). Similarly, L-events can also be experimentally manipulated, for instance, by altering inhibitory signaling (Leighton et al., 2020), or the properties of retinal waves which propagate as L-events into the cortex. We performed Monte Carlo simulations with a range of input thresholds u and variable participation rates of thalamic neurons in L-events, using the Hebbian covariance rule with adaptive H-events ( Figure 6A). Larger L-events in our model produce less refined, that is, larger receptive fields in the cortical network ( Figure 6B,C, left). This result is not surprising given the proposed role of L-events in guiding receptive field refinements, and is consistent with the imprecise and unrefined receptive fields observed in the visual cortex of animals where retinal wave properties have been modified. For instance, a prominent example of retinal wave manipulations are b2 knockout mice, which lack expression of the b2 subunit of the nicotinic acetylcholine receptor (b2-nAChR) that mediates spontaneous retinal waves in the first postnatal week. In these animals, retinal waves are consistently larger as characterized by the increased correlation with distance (Sun et al., 2008;Stafford et al., 2009;Cutts and Eglen, 2014), in addition to other features. As a result, there are measurable defects in the retinotopic map refinement of downstream targets (Grubb et al., 2003;Cang et al., 2005b;Burbridge et al., 2014). Smaller L-events also refine receptive fields with better topographic organization ( Figure 6C, right) and do not impair connectivity refinements. This result could be linked to experiments where the expression of b2-nAChR is limited to the ganglion cell layer of the retina, resulting in smaller retinal waves than those in wild-type and undisturbed retinotopy in the superior colliculus (Xu et al., 2011), although the effects in the cortex are unknown.
Therefore, we suggest that certain manipulations that modulate the size of sensory activity from the periphery have a profound impact on the precision of receptive field refinement in downstream targets, making predictions to be tested experimentally. In contrast to retinal wave manipulations, the effect of altered inhibitory signaling on receptive field refinements is still unknown. It is likely that such manipulations will also affect H-events (Leighton et al., 2020), as well as shape ongoing plasticity in the network (Wu et al., 2020), and hence have less predictable effects on receptive field size and topography.
Adaptive H-events promote the developmental event sparsification of cortical activity Thus far, we focused on the development of the network connectivity in our model driven by spontaneous activity based on properties measured during a few postnatal days (P8-10, Figure 4A).   However, in vivo spontaneous activity patterns are not static, but dynamically regulated during development by ongoing activity-dependent plasticity which continuously reshapes network connectivity that lasts several days (Rochefort et al., 2009;Golshani et al., 2009;Frye and MacLean, 2016). Moreover, it is unclear if the same criteria based on event participation rates and amplitude can be used to separate the spontaneous events into L and H at later developmental ages. Hence, we next asked how our observed modifications in network connectivity that are the result of receptive field refinement further modify spontaneous activity patterns on a much longer developmental timescale of several days in our model. Therefore, we analyzed all spontaneous events of simulated cortical neurons during the process of receptive field refinement in the presence of adaptive H-events ( Figure 4B). Since the input threshold u of the Hebbian learning rule is related to receptive field size ( Figure 4B), we used u as a proxy for time of development in the model: low u corresponds to earlier developmental stages when receptive fields are large, while high u corresponds to late developmental stages when receptive fields are refined. This assumption is also in line with the fact that the input resistance of neurons in V1 and S1 decreases during development (Etherington and Williams, 2011;Golshani et al., 2009), so that the depolarizing current necessary to trigger an action potential increases with age.
At an early developmental stage in the model ( u ¼ 0:45), the unrefined receptive fields of cortical neurons in our network model propagate thalamic activity into the cortex as very broad spontaneous events, while adaptive H-events remain intrinsic to the cortical layer. As in the data ( Figure 7A,C; Siegel et al., 2012), the amplitude of events with 20-80% participation rate is approximately half the amplitude of events with greater than 80% participation rate ( Figure 7B,D, left). Moreover, we also observed a high proportion of large events with greater than 80% participation rate ( and peripheral events activate fewer cortical neurons, our proposed adaptation of H-event amplitudes decreases the overall level of intrinsic activity in the cortical layer. This changes the relationship between effective amplitude and participation rate ( Figure 7D, middle), with large events decreasing their amplitudes and density (Figure 7-figure supplement 1A). Finally, at late developmental stages in the model ( u ¼ 0:60), the relationship between effective amplitude and participation rate is almost absent ( Figure 7D, right). Overall event amplitude is much lower resulting in far fewer large events with greater than 80% participation rate (Figure 7-figure supplement 1B). Therefore, due to the progressive receptive field refinements and the continued H-event adaptation in response to resulting activity changes, spontaneous events in our model progressively sparsify during ongoing development, whereby spontaneous events become smaller in size with fewer participating cells. This finding suggests that spontaneous events in the cortex at later developmental ages can no longer be separated into L and H using the same criteria of participation rate and amplitude as during early development. We also found that the mean pairwise correlation of all cortical neurons in the model decreases as a function of developmental age ( u ; Figure 7E,F), which further supports the trend of progressive sparsification already observed in the event sizes.
Interestingly, such event sparsification of spontaneous activity has been observed experimentally in the mouse barrel cortex during postnatal development from P4 to P26 (Golshani et al., 2009) and in the visual cortex from P8 to P79 (Rochefort et al., 2009). During this period, in the visual cortex, the size of spontaneous events decreases ( Figure 7G), the amplitude of the participating cells also decreases, while event frequency increases ( Figure 7H; Rochefort et al., 2009). This progressive event sparsification of cortical activity is generated by mechanisms intrinsic to the cortex, and does not seem to be sensory-driven (Rochefort et al., 2009;Golshani et al., 2009). We found the same relationships in our model using u as a proxy for developmental time ( Figure 7I,J).
In summary, our framework for activity-dependent plasticity and receptive field refinement between thalamus and cortex with adaptive H-events can tune the properties of cortical spontaneous activity and provide a substrate for the event sparsification of cortical activity during development on a much longer timescale than receptive field refinement. This sparsification has been found in different sensory cortices, including visual (Rochefort et al., 2009), somatosensory (Golshani et al., 2009), and auditory (Frye and MacLean, 2016), suggesting a general principle that underlies network refinement. However, the event sparsification we observe is different from sparse network activity implicated in sparse efficient coding, which interestingly seems to decrease during development (Berkes et al., 2009;Berkes et al., 2011). Our modeling predicts that cortical event sparsification is primarily due to the suppression of cortically-generated H-events in the Hebbian covariance rule, which switches cortical sensitivity to input from the sensory periphery after the onset of sensory experience.

Discussion
We examined the information content of spontaneous activity for refining local microcircuit connectivity during early postnatal development. In contrast to classical works on activity-dependent refinements, which used mathematically convenient formulations of spontaneous activity (Willshaw and von der Malsburg, 1976;Mackay and Miller, 1990), we used spontaneous activity patterns characterized in the mouse visual cortex in vivo before the onset of vision (P8-10), which revealed its rich structure. Specifically, we explored the joint contribution of two distinct patterns of spontaneous activity recorded in the mouse visual cortex before the onset of vision, local (L-events) and global (Hevents), on establishing topographically refined receptive fields between the thalamus and the cortex without decoupling in a model network with activity-dependent plasticity. Because of their spatially correlated activity, we proposed that peripherally generated L-events enable topographic refinement, while H-events regulate connection strength homeostatically. We investigated two Hebbian learning rules -the Hebbian covariance and the BCM rules -which use joint pre-and postsynaptic activity to trigger synaptic plasticity. First, we studied the Hebbian covariance rule that induces global synaptic depression in the presence of only postsynaptic activity (i.e. H-events). Second, we studied the BCM rule, which is known to establish the emergence of ocular dominance and orientation selectivity in the visual system. Although L-events successfully instruct topographic receptive field refinements in the Hebbian covariance rule, naively including H-events provides too much depression, eliminating selectivity in the network despite fine-tuning (Figure 3). In contrast, in the BCM rule, H-events are indeed homeostatic, regulating the threshold between depression and potentiation. However, small L-events, which carry precise information for topographic connectivity refinements, mostly cause long-term depression in the synaptic weights and disrupt topography. Inspired by the sliding threshold in the BCM rule, we proposed a similar adaptive mechanism operating at the single-cell level in the Hebbian covariance rule. This mechanism regulates the amplitude of the cortically generated H-events according to the preceding average activity in the network to homeostatically balance local increases and decreases in activity, and can successfully refine receptive fields with excellent topography (Figure 4). Without any additional fine-tuning, this mechanism can also explain the long-term event sparsification of cortical activity as the circuit matures and starts responding to visual input (Figure 7). Therefore, we propose that L-and adaptive H-events cooperate to synergistically guide circuit organization of thalamocortical synapses during postnatal development.

The origin of cortical event amplitude adaptation
After a re-examination of spontaneous activity recorded in the developing cortex in vivo between postnatal days 8 and 10 (Siegel et al., 2012), we found evidence for the proposed H-event amplitude adaptation ( Figure 5). This mechanism is sufficiently general in its formulation that it could be realized at the cellular, synaptic or network level. At the cellular level, the adaptation mechanism resembles the plasticity of intrinsic excitability. Typically, plasticity of intrinsic excitability has been reported in response to long-term perturbations in activity or persistent changes in synaptic plasticity like LTP and LTD, where the intrinsic properties of single neurons are adjusted in an activitydependent manner (Daoudal and Debanne, 2003;Desai et al., 1999). During plasticity of intrinsic excitability, neurons can alter the number and expression levels of ion channels to adjust their inputoutput function either by modifying their firing thresholds or response gains, which could represent the substrate for H-event amplitude regulation. Our adaptation mechanism is consistent with the fast plasticity of intrinsic excitability operating on the timescale of several spontaneous events supported by many experimental studies. For instance, intrinsic excitability of spine motoneurons is depressed after brief but sustained changes in spinal cord network activity in neonatal mice (Lombardo et al., 2018). Similarly, hippocampal pyramidal neurons also exhibit a rapid reduction of intrinsic excitability in response to sustained depolarizations lasting up to several minutes (Sánchez-Aguilera et al., 2014). In addition to reduced excitability, in the developing auditory system, enhanced intrinsic excitability has been reported in the cochlea followed by reduced synaptic excitatory input from hair cells in a model of deafness, although this change is slower than our proposed adaptation mechanism (Babola et al., 2018).
At the synaptic level, our adaptation mechanism can be implemented by synaptic scaling, a process whereby neurons regulate their activity by scaling incoming synaptic strengths in response to perturbations (Turrigiano et al., 1998). A second possibility is short-term depression, which appears to underlie the generation of spontaneous activity episodes in the chick developing spinal cord (Tabak et al., 2001;Tabak et al., 2010). Similarly, release probability suppression has been reported to strongly contribute to synaptic depression during weak activity at the calyx of Held (Xu and Wu, 2005), which is more pronounced at immature synapses where morphological development renders synaptic transmission less effective (Renden et al., 2005;Nakamura and Takahashi, 2007). This is also the case in the cortex, where short-term synaptic plasticity in young animals is stronger (Oswald and Reyes, 2008). Beyond chemical synapses, plasticity of gap junctions, which are particularly prevalent in development (Niculescu and Lohmann, 2014), could also be a contributing mechanism that adapts overall network activity (Cachope et al., 2007;Haas et al., 2011). Finally, at the network level, the development of inhibition could be a substrate for amplitude adaptation of cortically generated events. The main inhibitory neurotransmitter, GABA, is thought to act as a depolarizing neurotransmitter, excitatory in early postnatal days (Ben-Ari, 2002), although recent evidence argues that GABAergic neurons have an inhibitory effect on the cortical network already in the second postnatal week (Murata and Colonnese, 2020;Kirmse et al., 2015;Valeeva et al., 2016). Thus, the local maturation of inhibitory neurons -of which there are several types (Tremblay et al., 2016) -that gradually evolve to balance excitation and achieve E/I balance (Dorrn et al., 2010) could provide an alternative implementation of the proposed H-event adaptation.

Developmental sparsification of cortical activity
On a longer timescale than receptive field refinement, we demonstrated that the adaptation of H-event amplitude can also bring about the event sparsification of cortical activity, as global, cortically generated H-events are attenuated and become more localized (Figure 7). The notion of 'sparse neural activity' has received significant attention in experimental and theoretical studies of sensory processing in the cortex, including differing definitions and implementations (Field, 1994;Willmore and Tolhurst, 2001;Berkes et al., 2009;Olshausen and Field, 2004;Zylberberg and DeWeese, 2013). In particular, sparse activity in the mature cortex has been argued to be important for the efficient coding of sensory inputs of different modalities (Olshausen and Field, 2004;Field, 1994). Hence, the developmental process of receptive field refinement might be expected to produce sparser network activity over time. However, experiments directly testing this idea have found no, or even opposite, evidence for the developmental emergence of efficient sparse coding (Berkes et al., 2009;Berkes et al., 2011). In the context of our work, sparsification simply refers to an overall sparsification of network events (fewer active cells per event). Given that our data pertain to developmental spontaneous activity before eye-opening, in complete absence of stimulation, it is not straightforward to relate our event sparsification to the sparse efficient coding hypothesis.

Assumptions in the model
Our model is based on the assumption that L-and H-events have distinct roles during the development of the visual system. Retinal waves, the source of L-events, carry information downstream about the position and function of individual retinal ganglion cells (Stafford et al., 2009), hence they are ideally suited to serve as 'training patterns' to enable activity-dependent refinements based on spatiotemporal correlations (Ko et al., 2011;Ackman and Crair, 2014;Thompson et al., 2017). Since all cells are maximally active during H-events, these patterns likely do not carry much information that can be used for activity-dependent refinement of connectivity. In contrast, we assumed that H-events homeostatically control synaptic weights, operating in parallel to network refinements by L-events ( Figure 4). Indeed, highly correlated network activity can cause homeostatic down-regulation of synaptic weights via a process known as synaptic scaling (Turrigiano and Nelson, 2004). The homeostatic role of H-events is also consistent with synaptic downscaling driven by slow waves during sleep, a specific form of synchronous network activity (Tononi and Cirelli, 2006;Vyazovskiy et al., 2008). Since during development sleep patterns are not yet regular, we reasoned that refinement (by L-events) and homeostasis (by H-events) occur simultaneously instead of being separated into wake and sleep states.
We focused on the role of spontaneous activity in driving receptive field refinements rather than study how spontaneous activity is generated. While the statistical properties of spontaneous activity in the developing cortex are well-characterized, the cellular and network mechanisms generating this activity remain elusive. In particular, while H-event generation has been shown to rely on gap junctions (Siegel et al., 2012;Niculescu and Lohmann, 2014), which recurrently connect developing cortical cells, not much is known about how the size of cortical events is modulated and how an L-event is prevented from spreading and turning into an H-event. It is likely that cortical inhibition plays a critical role in localizing cortical activity and shaping receptive field refinements (Wood et al., 2017;Leighton et al., 2020), for instance, through the plasticity of inhibitory connections by regulating E/I balance (Dorrn et al., 2010). As new experiments are revealing more information about the cellular and synaptic mechanisms that generate spatiotemporally patterned spontaneous activity (Fujimoto et al., 2019), a full model of the generation and the effect of spontaneous activity might soon be feasible.
The threshold parameter in the Hebbian covariance rule in the presence of H-events implements an effective subtractive normalization that sharpens receptive fields (see Appendix). Despite the strong weight competition, subtractive normalization seems to be insufficient to stabilize receptive fields in the presence of non-adaptive H-events (Figure 3). Multiplicative normalization is an alternative normalization scheme, but it does not generate refined receptive fields (Miller and MacKay, 1994). Therefore, we also studied the BCM rule due to its ability to generate selectivity in postsynaptic neurons under patterned input. While the BCM rule successfully generates selectivity and receptive field refinement, the resulting topography is worse than in the Hebbian covariance rule (Figure 3). Both rules have an adaptive component: in the BCM rule it is the threshold between potentiation and depression that slides as a function of postsynaptic activity, while in the Hebbian covariance rule it is the adaptive amplitude of H-events, while the rule itself is fixed. Although experiments have shown the stereotypical activity dependence of the BCM rule (Kirkwood et al., 1996;Sjö strö m et al., 2001), whether a sliding threshold for potentiation vs. depression exists is still debated. Moreover, the timescale over which the threshold slides to prevent unbounded synaptic growth needs to be much faster than found experimentally (Zenke et al., 2017). Our proposed H-event amplitude adaptation operates on the fast timescale of several spontaneous events found experimentally (Siegel et al., 2012;Sánchez-Aguilera et al., 2014;Lombardo et al., 2018). Hence, together with the better topography and the resulting event sparsification as a function of developmental stage that the Hebbian covariance rule with adaptive H-events generates, we propose it as the more likely plasticity mechanism to refine receptive fields in the developing visual cortex.
Finally, we have focused here on the traditional view that molecular gradients set up a coarse map that activity-dependent mechanisms then refine (Goodhill and Xu, 2005). In our model, this was implemented as a weak bias in the initial connectivity, which did not affect our results regarding the refinement of receptive fields. Both activity and molecular gradients may work together in interesting ways to refine receptive fields (Grimbert and Cang, 2012;Godfrey and Swindale, 2014;Naoki, 2017), and future work should include both aspects.

Predictions of the model
Our model makes several experimentally testable predictions. First, we showed that changing the frequency of H-events can affect the size of the resulting receptive fields under both the BCM (Figure 3) and the Hebbian covariance rule with adaptive H-events ( Figure 4). The frequency of H-events can be experimentally manipulated using optogenetics or pharmacology. For instance, gap junction blocker (carbenoxolone) has been shown to specifically reduce the frequency of H-events (Siegel et al., 2012), hence in that scenario our results predict broader receptive fields.
Additionally, L-events can also be experimentally manipulated. Recently, reduced inhibitory signaling by suppressing somatostatin-positive interneurons have has been shown to increase the size of L-events in the developing visual cortex (Leighton et al., 2020). With the effect of altered inhibitory signaling on receptive field refinements still unknown, our work predicts larger receptive fields and worse topography upon reduction of inhibition. L-events can also be experimentally manipulated by changing the properties of retinal waves, which can significantly affect retinotopic map refinement of downstream targets (Grubb et al., 2003;Cang et al., 2005b;Burbridge et al., 2014). Indeed, b2 knockout mice discussed earlier have larger retinal waves and less refined receptive fields in the visual cortex (Sun et al., 2008;Stafford et al., 2009;Cutts and Eglen, 2014). If we assume that these larger retinal waves manifest as larger L-events in the visual cortex following Siegel et al., 2012, then these experimental observations are in agreement with our model results. Third, our model predicts that as a result of receptive field refinement during development, network events sparsify as global, cortically generated events are attenuated and become more localized. Interestingly, the properties of spontaneous activity measured experimentally in different sensory cortices (Rochefort et al., 2009;Frye and MacLean, 2016;Smith et al., 2015;Ikezoe et al., 2012;Shen and Colonnese, 2016;Golshani et al., 2009) and in the olfactory bulb (Fujimoto et al., 2019) change following a very similar timeline during development as predicted in our model. However, in many of these studies activity has not been segregated into peripherally driven L-events and cortically generated H-events. Therefore, our model predicts that the frequency of L-events would increase while the frequency of H-events would decrease over development.
Finally, we propose that for a Hebbian covariance rule to drive developmental refinements of receptive fields using spontaneous L-and H-event patterns recorded in vivo (Siegel et al., 2012), H-events need to adapt to ongoing network activity. Whether a fast adaptation mechanism like the one we propose operates in the cortex requires prolonged and detailed activity recordings in vivo, which are within reach of modern technology Ji, 2017;Gribizis et al., 2019). Our framework also predicts that manipulations that affect overall activity levels of the network, such as activity reduction by eye enucleation, would correspondingly affect the amplitude of ongoing H-events.

Conclusion
In summary, we studied the refinement of receptive fields in a developing cortex network model constrained by realistic patterns of experimentally recorded spontaneous activity. We proposed that adaptation of the amplitude of cortically generated spontaneous events achieves this refinement without additional assumptions on the type of plasticity in the network. Our model further predicts how cortical networks could transition from supporting highly synchronous activity modules in early development to sparser peripherally driven activity suppressing local amplification, which could be useful for preventing hyper-excitability and epilepsy in adulthood while enhancing the processing of sensory stimuli.

Network model
We studied a feedforward, rate-based network with two one-dimensional layers, one of N u thalamic neurons (u) and the other of N v cortical neurons (v), with periodic boundary conditions in each layer to avoid edge effects. The initial connectivity in the network was all-to-all with uniformly distributed weights in the range w ini ¼ ½a; b. In addition, a topographic bias was introduced by modifying the initially random connectivity matrix to have the strongest connections between neurons at the matched topographic location, and which decay with a Gaussian profile with increasing distance ( Figure 2C), with amplitude s and spread s s . During the evolution of the weights, soft bounds were applied on the interval ½0; w max . We studied weight evolution under two activity-dependent learning rules: the Hebbian (Equation 2) and the BCM (Equation 3) rules. Table 1 lists all parameters. Sample codes can be found at github.com/comp-neural-circuits/LH-events (Wosniack, 2021; copy archived at swh: 1:rev:b90e189a9e1a4d0cdda097d435fa91b1236f1866).

Generation of L-and H-events
We modeled two types of spontaneous events in the thalamic (L-events) and the cortical (H-events) layer of our model (Siegel et al., 2012). During L-events, the firing rates of a fraction (L pct ) of neighboring thalamic neurons were set to L amp ¼ 1 during a period L dur and were otherwise 0. Similarly, during H-events, the firing rates of a fraction (H pct ) of cortical cells were set to H amp during time H dur . As a result, cortical neuron activity was composed of H-events and L-events transmitted from the thalamus. For each H-event, H amp was independently sampled from a Gaussian distribution with mean H amp and standard deviation H amp =3. The inter-event intervals were L int and H int sampled from experimentally characterized distributions in Siegel et al., 2012, ( Table 1). The event durations and inter-event intervals were shortened by a factor of 10 compared to the values measured in the data ( Figure 1) to speed up our simulations, but the relationships observed in the data were preserved. We note that in the experiments, both L-and H-events were characterized in the primary visual cortex; in our model, we assume that L-events are generated in the retina and subsequently propagated through the thalamus to the cortex, where they manifest with the experimentally reported characteristics (see Figure 7B for example). This interpretation is supported by experimental evidence (Siegel et al., 2012), but we cannot exclude the possibility that the retina also generates H-events or that L-events are generated in the cortex.

Reduction of the weight dynamics to two dimensions
To reduce the full weight dynamics to a two-dimensional system, we averaged all the n weights belonging to the receptive field that are predicted to potentiate along the initial topological bias, as w RF , and all the N u À n remaining weights, which we call complementary to the receptive field, as w C . When all weights behaved the same, we arbitrarily split them into two groups of the same size. Details about the classification of weights as w RF or w C can be found in the Appendix.

Computing the strength of simulated H-events
To relate the reduced two-dimensional phase planes to the simulation results, we wrote down the steady state activity of neuron j (Equation 1), which contains the rate gain from H-events relative to L-events, hR H i (also called 'Strength of H-events' in Figure 4E): since hL dur i ¼ hH dur i and hL amp i = 1.
In the absence of adaptive H-events, for a fixed set of values for H amp and L int (as in Table 1) and a chosen hR H i which we called 'Non-adapted strength of H-events' in Figure 4E, we used Equation 6 to find the H int value that satisfies the equation. Next, we ran simulations with the same H int and L int parameters, but adaptive H amp . We fixed the inter-event intervals of both L-and H-events to their mean values L int and H int instead of sampling them from distributions in Figure 4E. Then we numerically estimated the average amplitude of H-events with adaptation, which we called 'Adapted strength of H-events' in Figure 4E at the end of the simulation (final 5% of the simulation time) when the dynamics were stationary.

Receptive field statistics
The following receptive field statistics were used to quantify properties of the weight matrix W after the developing weights became stable.

Receptive field size
The receptive field of a cortical neuron is the group of weights from thalamic cells for which w ij >w max =5. The lower threshold was chosen to make the measurement robust to small fluctuations around 0, which are present because of the soft bounds. Mathematically, we compute the receptive field size of cortical neuron j as: with the I I I I I vector given by: The normalized receptive field ranges from 0 corresponding to a total decoupling of the cortical cell from the input layer, to one corresponding to no selectivity due to the potentiation of all weights from the input layer to that neuron. To compute the average receptive field size of the network, we include only the cortical neurons (N Ã ) that have not decoupled: If all the cortical cells have decoupled from the thalamus, we set R F ðWÞ ¼ 0.

Topography T
The topography of the network is a measure of how much of the initially weak biased topography is preserved in the final receptive field. Due to our biased initial conditions, neighboring thalamic cells are expected to project to neighboring cortical cells, yielding a diagonal weight matrix. For each cortical neuron, we calculated how far the center of its receptive field is from the ideal diagonal. Mathematically, for each row j of W, we determined the center of the receptive field c j and calculated the smallest distance (while considering periodic boundary conditions) between the receptive field center and the diagonal element j. Then, we summed all the squared distances and calculated the average error of the topography: To normalize the topography, we compared x to the topography error Ä of a column receptive field (Figure 3-figure supplement 1A) where the centers of all cortical receptive fields were the same, c j ¼ c (c a constant). For such a column receptive field, Ä ¼ N 2 u 12 . Therefore, we define the topography score T as: The topography will be close to one if the weight matrix is perfectly diagonal and 0 if the final receptive field is a column ( ¼ Ä).

Proportion of cortical decoupling D
To quantify the cortical decoupling, we use Equation 8 to compute the fraction of decoupled neurons divided by the number of neurons, 1 Nu P Nu j¼1 Q Nu i¼1 ð1 À I ij Þ. If the decoupling is 0, no cortical neuron has decoupled from the thalamus, while decoupling of 1 means that all the cortical neurons are decoupled from the thalamus.

Quantifying adaptation in the data
We first investigated if fluctuations in the activity across recordings could generate significant correlations. We analyzed consecutive recordings (each~5 mins long) in the same animal of which we had between 3 and 14 in all 26 animals (separated by <5 mins due to experimental constraints on data collection) to identify possible fluctuations on a longer timescale. We found that the average amplitude of all (L and H) events is not significantly different across consecutive recordings of the same animal ( Figure 5-figure supplement 1A, one-way ANOVA tests, p>0.05 in 23 out of 26 animals). Across different animals and ages, individual event amplitudes remained uncorrelated between successive recordings at this timescale, which we confirmed by plotting the difference in event amplitude as a function of the time between recordings ( Figure 5-figure supplement 1B), Kruskal-Wallis test, p>0.05.
For our reanalysis of the spontaneous events ( Figure 5), we only included events that recruited at least 20% of the cells in the imaging field of view following Siegel et al., 2012. We computed the average amplitude of all events that occurred within a time window T max before an H-event (consecutive recordings were concatenated) and compared it to the amplitude of the H-event. We excluded animals that had fewer than 12 H-events preceded by spontaneous activity within the time window T max (nine animals remained after exclusion). Next, we computed the correlation coefficient of the relationship between H-event amplitude and the average amplitude of preceding activity within T max with a leaky accumulator of time constant t decay . To estimate the 95% confidence interval, we performed a bootstrap analysis in which we generated 1000 bootstrap datasets by drawing without replacement from the valid pairs of H-event amplitudes and average amplitude of preceding activity. We repeated this analysis with different thresholds for excluding data ( All data and analysis code can be found at github.com/comp-neural-circuits/LH-events.

Spontaneous events identification in the model
To quantify the properties of spontaneous activity in the cortical layer of our model, we used the time series of activity of all the simulated cortical neurons (after weight stabilization is achieved) sampled in a high time resolution (0.01 s, Figure 7B). We defined a global activity threshold n ¼ v max =r, where v max is the highest amplitude among the cortical cells in the recording and r is a fixed scaling constant (r ¼ 8 for all recordings). For each cortical cell j, we labeled the intervals where the cell was active (1) or inactive (0) based on: We then used the trace XðtÞ ¼ P N j¼1 x j ðtÞ to define the number of active cortical cells at each time step t, that is, the participation rate. For each identified event, we averaged the amplitude of the active cells to obtain the amplitude vs. participation rate relationship. The input correlation matrix The spontaneous L-events in our network model have useful mathematical properties that allowed us to derive an analytical expression for the elements of the input correlation matrix Q. The locality and periodicity of L-events and the long-term averaging of their stationary dynamics generate a symmetric and circulant matrix Q (Gray, 2005). In a circulant matrix, each row is rotated one element to the right relative to the preceding row, and thus Q can be completely defined by a vector. Let ' be the fixed size of an L-event between 0 and N u . By computing the correlations among all the possible L-events of size ', we can write the elements of the vector q 'k ¼ ½q '1 ; q '2 ; . . . ; q 'Nu , which completely defines Q, as: where minða; bÞ returns the minimum of a and b. Since we wanted to explore the dependence of refinement on the size of L-events, defined by the minimum (' min ) and maximum (' max ) number of cells they activate in the input layer, we average over the size of a given event: Finally, using the emergent symmetry of Q, we can simplify the elements of this vector as follows: Since Cð u Þ and C 0 ð u Þ are defined in terms of Q minus a constant, both are circulant matrices as well.

Fixed point of the weight dynamics
The fixed point of the weight dynamics under L-and H-events is obtained by setting Equation 16 to zero and solving the resulting equation for w Ã : where n is a vector of ones. To study the nature of the fixed point w Ã , we need to investigate the eigenvalues of the circulant matrix C 0 ð u Þ À1 (the inverse of a circulant matrix is also a circulant matrix). A circulant matrix has the property that its eigenvectors can be written in terms of roots of unity. The eigenvalues are real and can be written as the discrete Fourier transforms of any row of the matrix (Gray, 2005). In particular, one of the eigenvalues is the sum of the elements of any row of the matrix, with a corresponding eigenvector that is a constant. We call this special eigenvalue the 'row-sum eigenvalue'. All eigenvalues except for the row-sum are non-negative.
We can write the fixed point as w Ã ¼ ðl Ã Þ À1 u hR H in, where l Ã is the row-sum eigenvalue of C 0 ð u Þ. It is clear that if only L-events are present, hR H i ¼ 0, the fixed point is always at the origin, w Ã ¼ 0. Consequently, adding H-events in the cortical layer moves the location of the fixed point along the diagonal of the phase plane, that is, equally in all directions. The fixed point will be positive (negative) when l Ã is positive (negative). Furthermore, it will be an unstable fixed point for l Ã >0, since all the eigenvalues of the dynamical system are positive. The fixed point is a saddle node when l Ã <0 since the other eigenvalues are positive.

Calculation of receptive field size
The Hebbian covariance rule on its own does not have a mechanism to prevent the weights to grow infinitely large or negative. Thus, to generate receptive fields we imposed a lower bound at 0 and an upper bound at w max . This enabled us to calculate the size of the receptive field, n, as a function of L-event properties and the input threshold, u in the absence of H-events. Because the cortical cells are not recurrently connected, here we examine the weight vector onto a single cortical neuron. Assuming the dynamical system has reached steady state, to get a receptive field of size n<N u for a direct impact on receptive field size, with larger L-events resulting in larger receptive fields for a fixed input threshold (Appendix 1-figure 1B). Low input thresholds generate refined receptive fields only if the size of spontaneous events is small. The eigenspace of C 0 ð u Þ To gain intuition for the weight dynamics in Equation 16, we first investigated the eigenspace of C 0 ð u Þ, the vector space spanned by the eigenvectors of C 0 ð u Þ. Specifically, we focused on the conditions that enabled the robust formation of cortical receptive fields. Using the fact that Q and C 0 ð u Þ are circulant matrices (Appendix 1- figure 2A, B), we identified two input thresholds, Ã and ÃÃ , that define three dynamical regions that the row-sum eigenvalue of C 0 ð u Þ, l Ã , can occupy (Appendix 1- figure 2C). To obtain the first critical input threshold Ã that characterizes the transition from region (i) to region (ii), we set, for any row j, the row-sum eigenvalue to the largest (fixed) eigenvalue of C 0 ð u Þ: and for the input statistics of experimentally measured L-events (Table 1), we obtained Ã ¼ 0:414. Similarly, the transition from region (ii) to region (iii) is achieved when the row-sum eigenvalue is set to zero: and the second critical input threshold is obtained as ÃÃ ¼ 0:564. In region (i), 0< u < Ã and l Ã >0 is the dominant (largest) eigenvalue of C 0 ð u Þ. The eigenvector corresponding to it is a constant (Appendix 1- figure 2D, (i)). This predicts that all synaptic weights to a given cortical cell will potentiate preventing the formation of a localized receptive field. Since all eigenvalues are non-negative, the fixed point is an unstable node of the linear dynamical system (Equation 16).
In region (ii), Ã < u < ÃÃ and l Ã >0. However, l Ã is no longer the dominant eigenvalue. In this setting, there is a pair of dominant eigenvalues with the corresponding eigenvectors taking the form of out-of-sync sine waves with positive and negative elements. The sign of these elements predicts that some weights will potentiate while others depress, thus enabling the formation of receptive fields (Appendix 1-figure 2D, (ii)). All eigenvalues in this second region remain non-negative and the fixed point is still an unstable node of the dynamical system.
Finally, in region (iii), ÃÃ < u <1 and l Ã <0. While the dominant eigenvectors are similar to those in region (ii), enabling the formation of localized receptive fields (Appendix 1-figure 2D, (iii)), the dynamics of the dynamical system are different because the fixed point is now a saddle node.

Analysis of the weight dynamics in two dimensions
We reduced the dimension of the weight dynamics by defining two distinct sets of weights. The first set, w RF , corresponds to the n weights which correspond to the topographically biased locations of the receptive field. The complementary set w C contains the remaining weights. To classify the weights into w RF and w C , first we solved Equation 16 with the biased initial condition and limited the sum to the dominant eigenvalues and respective eigenvectors. In the case of selectivity, the eigenvectors have half the elements positive and half negative. Therefore, if the receptive field size is n<N u =2 we needed to subsample the potentiating weights to achieve the smaller receptive field size. We did it by keeping only the n largest positive elements in w RF and moving the remaining ones to w C . If n>N u =2, we downsampled w C by moving the n À N u =2 less-negative weights to w RF . Due to the topographically biased initial conditions, w RF always contained the weights potentiating along the diagonal w RF ¼ w C .
We then regularly sampled initial conditions in ½0; w max Â ½0; w max . For a given receptive field size n, we set u ¼ n u according to Equation 27. If w RF ð0Þ>w C ð0Þ, w RF contains the n weights that form the receptive field by potentiating to the upper bound w max , while w C contains the remaining N u À n weights that depress to 0. Similarly, if w RF ð0Þ<w C ð0Þ, w C contains the n weights that potentiate to the upper bound, while w RF contains the remaining N u À n weights that depress to 0. At each initial condition, we solved the weight evolution (Equation 16) for each weight, and averaged the weights in w RF and w C for a small time interval to obtain the direction of the phase plane arrows. We computed the evolution trajectory by solving Equation 16 with a topographically biased initial condition.
Analytical solution of the two-dimensional weight dynamics with only L-events in the Hebbian covariance rule We first examined the weight dynamics in the reduced two-dimensional phase plane w RF Â w C for only L-events (hR H i ¼ 0 in Equation 16). The phase plane is symmetric about the diagonal w RF ¼ w C due to the symmetry of the dominant eigenvectors (Appendix 1- figure 3A). As predicted by the eigenvectors (Appendix 1-figure 2D), in region (i) both w RF and w C converge to the upper bound and the fixed point, which is located in the origin, is an unstable node (Appendix 1- figure 3A, left). Therefore, all weights potentiate and no receptive field can be formed. In regions (ii) and (iii), the eigenvectors predict the formation of receptive fields with w RF ! w max and w C ! 0, respectively (Appendix 1-figure 3A, middle and right), although the dynamics are different in each case because the origin is an unstable node or a saddle node, respectively. The reduced two-dimensional weight dynamics in the phase plane with the same u as Appendix 1figure 2B and D. For each plane, the red trajectory depicts the weight evolution from an initial condition where w RF ð0Þ>w C ð0Þ until the weights' upper bound (w max ¼ 0:5). Left: w C ! w max and w RF ! w max , resulting in no selectivity. Middle and right: w C ! 0 and w RF ! w max , resulting in selectivity and receptive field refinement. (B) Simulation results for the same input thresholds of Appendix 1-figure 2B,D for the full 50-dimensional system.
Our analytical predictions of the reduced two-dimensional system with only L-events were confirmed in numerical simulations of the full N-dimensional system. In particular, in region (i) all weights potentiate with each cortical cell receiving input from all thalamic inputs, such that no receptive field forms (Appendix 1- figure 3B, left). In regions (ii) and (iii), receptive fields form with good topography (Appendix 1-figure 3B, middle and right). Therefore, consistent with the analytical prediction of receptive field size (Appendix 1-figure 1), higher input thresholds resulted in smaller receptive fields.
Thus, the Hebbian covariance rule can generate receptive fields of size depending on the input threshold u in the presence of only L-events originating from the sensory periphery. This result is in agreement with previous findings of the emergence of other aspects of development including topographic maps (Willshaw and von der Malsburg, 1976) and ocular dominance (Miller et al., 1989;Miller, 1994) or other selectivity (Mackay and Miller, 1990;Miller and MacKay, 1994;Lee et al., 2002) in the presence of correlated activity in the input layer of similar feedforward