Inhibitory top-down projections from zona incerta mediate neocortical memory


In brief
Schroeder et al. identify a key pathway that flexibly tunes neocortical computations according to the individual's experience. These long-range inhibitory afferents derive from the subthalamic zona incerta, preferentially target neocortical interneurons, and encode the learned top-down relevance of sensory information in a bidirectional and balanced fashion to enable memory.

INTRODUCTION
The sensory neocortex is a critical substrate for higher brain functions including perception and memory. The underlying computations require the integration of bottom-up sensory signals with internally generated top-down information representing the previously acquired relevance of stimuli and the individual's current aims. 1,2 Decades of work have elucidated how the sensory neocortex processes physical stimulus features. By contrast, the encoding of top-down information by brain-wide afferents and the mechanisms that enable these signals to converge with bottom-up representations are only starting to emerge. 3,4 This work has so far focused on a number of top-down pathways that derive from regions with established roles for memory, including other cortical areas, 5,6 the higher-order thalamus, 7,8 and the amygdala. 9,10 These projection systems display a number of commonalities in that they all establish excitatory afferents that are strongly enriched in the outermost layer of the neocortex (L1), where they provide input to the distal dendrites of pyramidal neurons (PNs) as well as to local interneurons (INs). 3,4,6,7,11 In addition to recruiting both excitation and inhibition, these afferents display a common regime for memory encoding through straightforward potentiation of their stimulus responses, 5-8 giving rise to the hypothesis that they collectively operate as a simple memory switch. 3 In parallel to these intensely investigated excitatory systems, the brain contains a sparser and much less understood complement of long-range inhibitory projections. 12,13 Whether such inhibitory systems might provide top-down control of sensory neocortex with potentially distinct signaling mechanisms, connectivity, and information content is unknown. To address this, we focus on the subthalamic zona incerta (ZI), a predominantly inhibitory area that is much less studied than the aforementioned sources of excitatory top-down signals.
Emerging work indicates that the ZI integrates a range of multisensory signals relating to arousal, motivation, and attention, 14,15 suggesting that this area may be ideally suited to supplying top-down input to the neocortex. However, although recent investigations have revealed a range of behaviors including sleep, 16 feeding, 17 pain, 18 novelty seeking, 19 and anxiety 20 that are orchestrated by the ZI via its widespread subcortical outputs to the thalamus, hypothalamus, and periaqueductal gray (PAG), the function of its projection to neocortex has remained elusive.

A disinhibitory incertocortical circuit
To address whether the ZI projects to the sensory neocortex, we performed triple retrograde tracing from the auditory cortex (ACx), visual cortex (VCx), and somatosensory cortex (SSCx) using counterbalanced combinations of cholera toxin B and FluoroGold (Figures 1A and S1A-S1D). These data revealed ZI afferents to all three sensory cortices. However, the vast majority of labeled neurons project to ACx, suggesting that this pathway may preferentially contribute to auditory behavior (Figure 1B). We therefore focused on the ACx, which in addition is suited for these analyses since it is critical for associative memory. 21-23 To determine the proportion of ACx-projecting ZI neurons that are inhibitory, we injected retrograde tracers in GAD2-nuclear-mCherry mice (Figure 1C). This showed that a small subset of overwhelmingly GABAergic neurons located in the ventral ZI projects to ACx (Figures 1D and 1E; see STAR Methods and Table S1 for all statistical tests and results in the study). In line with this, anterograde tracing from GABAergic neurons after injection of a conditional adeno-associated viral vector (AAV) into the ZI of GAD2-IRES-Cre mice revealed a robust projection to ACx (Figures 1F-1H). These afferents display a mediotemporal density gradient that matches the contribution of cortical areas to learning and memory. 22 Within the local circuit, incertocortical projections are strongly enriched in L1, a known hub for top-down signaling (Figures 1H and 1I). 3,5-9 By contrast, projections to VCx and SSCx were sparse with minimal innervation in L1 (Figures S1E-S1H). These data expand on previous work in other species and during development 24,25 by identifying the ZI as a major source of long-range inhibitory input to L1 of the ACx. The small number of retrogradely labeled ZI neurons together with the robust projection to ACx furthermore suggests a considerable degree of divergence in this system.
How these inputs control the local circuit depends on their targets. To assess physiological connectivity, we expressed channelrhodopsin-2 (ChR2) in GABAergic ZI neurons to perform functional circuit mapping (Figures 2A and 2B). 26 Whole-cell recordings from L1 INs, L2/3 PNs, and L5 PNs in ACx of acute brain slices indicated that light stimulation of ZI afferents in superficial layers caused inhibitory currents in L1 INs with greater amplitudes and probabilities than in L2/3 or L5 PNs (Figures 2C-2E and S1I-S1L). Similar results were obtained from recordings of neighboring L1 INs and L2/3 or L5 PNs (Figures S1M and S1N), ruling out experimental variability as the source of the observed differential connectivity. The sparse, weak innervation of PNs by this pathway is in line with previous observations in immature neocortex. 25 Incertocortical transmission was blocked by gabazine (GZ), identifying GABA-A receptors as the underlying substrate (Figure 2F). In addition, inhibitory currents displayed longer latencies in L2/3 and L5 PNs than in L1 INs, consistent with the innervation of distal PN dendrites in L1 (Figure 2G). These results reveal preferential targeting of INs over PNs by incertocortical projections.
Cortical interneurons display great morphological and functional diversity. To determine the identity of the postsynaptic partners in greater depth, we employed AAV1-mediated anterograde transsynaptic tracing from ZI neurons (Figure 2H). This technique has been validated and used in a number of brain areas from both glutamatergic and GABAergic afferents to both excitatory and inhibitory targets (see STAR Methods for full discussion). 27,28 Our experiments produced sparse transduction of neurons that were localized throughout the cortical depth (Figures S1O-S1X). To identify the labeled neurons, we employed fluorescent in situ hybridization (FISH) against well-established, non-overlapping molecular markers. The overwhelming proportion of postsynaptic neurons expressed the IN markers Pvalb, Sst, or Ndnf, whereas only very few excitatory Camk2a cells or disinhibitory Vip INs were found (Figures 2I and 2J). Collectively, these results uncover the striking specificity of incertocortical axons for IN types supplying direct inhibition to PNs, whereas direct inhibition of PN dendrites in L1 is considerably weaker. This organization differs fundamentally from excitatory top-down projections, which control cortical activity by recruitment of both PNs and INs. 3,6,7,11 Functionally, the net disinhibition of the local circuit evoked by ZI inputs has been identified as a conserved processing motif enabling network plasticity during learning and memory. 11,13,29,30

Transfer of integrated information is essential for learning
On top of its output connectivity, a major determinant of the in vivo function of the incertocortical pathway is the inputs it receives. To specifically identify the brain-wide inputs to ZI neurons with a direct projection to the ACx, we made these cells competent for retrograde transsynaptic tracing by injecting retroAAV 31 into ACx.
After the injection of helper AAV and subsequently rabies viral vector to transduce starter cells exclusively in the ZI (Figures 3A, 3B, and S2F), this approach uncovered a large number of input sources spanning across the neuroaxis (Figures 3C, 3D, and S2A-S2E; see Table S2 for definitions of all abbreviations), including the midbrain, striatum, thalamus, cortex, and cerebellum. These data are consistent with known afferents to the ZI as a whole. 14,15 In addition, this indicates that ZI sends integrated information from diverse upstream areas to ACx, consistent with a function for top-down signaling. Of note, several of these input regions such as the higher-order auditory thalamus (HO-MG), central amygdala (CEA), and PAG have been implicated in threat learning, 32 suggesting that ACx-projecting ZI neurons may play a central role in this form of memory.
A major behavioral capacity enabled by top-down projections is memory. 3,5-8 Based on the recent finding that the ZI is implicated in threat memory via its subcortical network, 33 we aimed to directly address whether incertocortical afferents impact learning. To this end, we employed a form of discriminative threat conditioning that critically depends on the function of the secondary auditory cortex (AuV) and adjacent temporal association cortex (TeA) due to the use of complex conditioned stimuli (CSs, trains of frequency-modulated sweeps, Figures 4A and 4B). 21,22 To achieve control over incertocortical transmission, we expressed the chemogenetic inhibitor hM4DGi 34 [...] infused just prior to the memory acquisition session during which one of two initially neutral CSs (CS+) was repeatedly paired with an unconditioned stimulus (US, a mild foot shock), whereas the CS− was left unpaired. In the recall session, the experimental animals displayed reduced freezing levels to both CSs relative to EYFP controls, indicating that silencing of incertocortical axons during learning results in a highly significant memory deficit (Figure 4C). By contrast, CS discrimination was not changed by the manipulation, suggesting that the strength rather than the specificity of the auditory memory was affected (Figure 4D). Notably, this differs from other systems that demonstrated an impact on memory specificity, 7,36 which may relate either to floor effects or to differential contributions to CS discrimination. No effect on acute freezing behavior during acquisition was observed, indicating that US perception and short-term memory were not perturbed (Figures S2P and S2Q). Moreover, silencing ZI afferents left contextual threat memory, which is independent of the neocortex, intact (Figure 4E).
These results therefore demonstrate that inhibitory ZI projections to ACx selectively mediate the formation of long-term auditory threat memory.
Encoding of auditory stimuli and primary reinforcers
How does information transmitted by ZI afferents contribute to ACx function and associative memory? To address this, we performed chronic two-photon imaging of ZI synaptic boutons expressing an axon-targeted calcium indicator 37 in L1 of awake, head-fixed GAD2-IRES-Cre mice after identification of area AuV using intrinsic imaging (Figure 5A). 7,22,38 This was combined with a novel threat conditioning paradigm in which all phases occurred under the microscope in head fixation (Figures 5B and S3A-S3K), enabling us to longitudinally track the responses of individual boutons prior to (habituation), during (acquisition), and after learning (recall). Note that only boutons that could be identified in all three sessions were analyzed (Figures 5C and S3L-S3N). To obtain an online readout of threat memory in head-fixed mice, we used changes in eye pupil dilation in response to the CSs instead of freezing behavior (Figure 5D), metrics that display a strong correlation. 7,22,38 Pupil responses increased across the paradigm for the CS+ but not the CS−, leading to behavioral CS discrimination during the recall session (Figures 5D-5F). By contrast, non-associative pseudoconditioning (PC) (auditory stimuli are termed CS1 and CS2 here) caused a reduction of pupil responses (Figures 5G, 5H, S3G, and S3H). We conclude that threat memory strength during recall is low for CS1/2, intermediate for CS−, and high for CS+. These data are in line with analogous results from freely behaving animals, 7,22,38 thus validating the head-fixed paradigm for longitudinal dissection of threat memory acquisition and expression.
Given that this pathway has not been investigated, our first objective was to characterize the information it transmits in naive animals. CSs elicited clear positive or negative responses (NRs) in a subset of boutons (Figures 6A-6C), indicating that this non-canonical pathway encodes auditory information. Training a linear decoder to predict stimulus identity from the bouton response patterns furthermore showed that ZI afferents are able to discriminate auditory stimuli above chance level at both the single-bouton (Figure 6D) and population level (Figures 6E and S4A-S4L). Given that the CSs largely overlap in frequency content, this suggests remarkably precise encoding of auditory information by this projection. In addition, ZI boutons respond robustly to the mild tail shock used as the US 39 with an increase in average activity (Figures 6F and S5A-S5E). These results uncover that inhibitory ZI afferents transmit information about both the auditory CSs and the primary reinforcer to ACx.
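The decoding analysis can be illustrated with a minimal sketch. It is not the study's implementation: the least-squares linear classifier, the leave-one-out cross-validation scheme, the function name `loo_linear_decoding_accuracy`, and the trials-by-boutons array layout are all assumptions made here for illustration.

```python
import numpy as np

def loo_linear_decoding_accuracy(responses, labels):
    """Leave-one-out accuracy of a least-squares linear classifier.

    responses: (n_trials, n_boutons) per-trial response amplitudes (hypothetical layout).
    labels:    (n_trials,) stimulus identity coded as 0 or 1.
    """
    responses = np.asarray(responses, dtype=float)
    labels = np.asarray(labels)
    targets = np.where(labels == 1, 1.0, -1.0)
    n_trials = responses.shape[0]
    # Append a constant column so the decoder can learn an offset.
    design = np.hstack([responses, np.ones((n_trials, 1))])
    correct = 0
    for held_out in range(n_trials):
        train = np.arange(n_trials) != held_out
        # Fit weights on all trials except the held-out one.
        weights, *_ = np.linalg.lstsq(design[train], targets[train], rcond=None)
        prediction = 1 if design[held_out] @ weights > 0 else 0
        correct += int(prediction == labels[held_out])
    return correct / n_trials
```

Passing a single column of `responses` corresponds to single-bouton decoding, and passing all columns to population decoding. With two equally frequent stimuli, chance level is 0.5.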

Plasticity of incertocortical signaling
A defining feature of top-down information is that it encodes acute and long-term experiences. In line with this, ZI boutons displayed pronounced plasticity of CS response patterns across the behavioral paradigm. These changes were bidirectional, with some boutons developing strong positive responses (PRs) (excitatory potentiation), whereas others started to display NRs (inhibitory potentiation). Plasticity emerged during memory acquisition and was even more robust in memory recall (Figures 6A-6C and S3O-S3R). It materialized in all possible rearrangements, with either positive or negative responding boutons undergoing either excitatory or inhibitory potentiation (Figures S3S-S3U). Moreover, the percentage of responsive boutons grew with learning (Figure 6G), driven by the emergence of strong NRs that were largely absent prior to learning (Figures S5H-S5L). These plastic changes occurred for the CS+ and to a lesser extent for the CS−, consistent with the strength of threat memory to these stimuli (Figures 5D, 5E, 5H, S3E, and S3H). To address whether the observed plasticity relates to associative memory or alternatively to spontaneous drift of the representation, we used control animals that underwent non-associative PC and thus did not form a threat memory (Figures 5G, 5H, S3G, and S3H). These data display smaller changes in CS responses between habituation and recall (Figures S3S-S3U), indicating less plasticity. In addition, the changes that did occur were opposite to the ones found after threat conditioning (Figures 6I, S3R, and S3U). Incertocortical signaling thus encodes associative memory. One prominent consequence of the bidirectional changes in threat-conditioned animals is an increase in the standard deviation of CS responses between boutons (Figures 6H and S3V), raising the question of how this affects the transmission of sensory information.
Single-bouton and population decoding analysis showed that stimulus discrimination by ZI boutons improves with threat learning (Figures 6D, 6E, and S4A-S4L), in line with the enhanced behavioral discrimination of CS+ and CS− (Figures 5F and 5H). Moreover, decoding accuracy correlates with CS discrimination at the behavioral level during memory recall but not during habituation (Figures S4J and S4K), highlighting the tight link between incertocortical information transfer and discriminative memory. Conversely, PC led to a reduction in both response standard deviation (Figure 6J) and stimulus discrimination at the neuronal level (Figures S4A-S4L), along with decreases in pupil responses for both CSs (Figure S3G). Interestingly, we observed a transient increase in decoding accuracy during PC (Figures S4D-S4G) that may relate to the isolated USs that the animal receives during that session. We next analyzed trial-by-trial activity correlations between boutons, which can limit the information content of sensory responses. 40 Threat memory acquisition resulted in a decrease in pairwise noise correlations between boutons, which was not observed after PC, suggesting a learning-related increase in the amount of information transferred by this pathway (Figures 6K and S5M-S5O). To elucidate how learning changes CS encoding in ZI boutons at the population level, we combined all bouton responses in high-dimensional space and computed the population vectors in habituation and recall (Figure 6L). 41 The angle between these vectors indicates that threat conditioning causes a robust change in the representation of both CS+ and CS− that is almost twice as large as in control mice after PC. Importantly, these changes enabled the boutons to represent memory strength: the absolute response to a given CS in the recall session correlates with pupil dilation as an online readout of memory, whereas this relationship is absent in habituation (Figures 6M and 6N).
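Two of the population measures described in this paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code of the study: the function names, the use of Pearson correlation for noise correlations, and the array layouts (one entry per bouton for population vectors, trials by boutons for single-trial responses) are hypothetical choices.

```python
import numpy as np

def population_vector_angle(session_a, session_b):
    """Angle in degrees between two population response vectors (one entry per bouton)."""
    a = np.asarray(session_a, dtype=float)
    b = np.asarray(session_b, dtype=float)
    cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against rounding errors slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

def mean_pairwise_noise_correlation(trial_responses):
    """Mean Pearson correlation of trial-to-trial fluctuations across bouton pairs.

    trial_responses: (n_trials, n_boutons) responses to repeats of ONE stimulus.
    """
    x = np.asarray(trial_responses, dtype=float)
    residuals = x - x.mean(axis=0)  # remove each bouton's mean response
    corr = np.corrcoef(residuals, rowvar=False)
    upper = np.triu_indices(corr.shape[0], k=1)  # each bouton pair counted once
    return float(corr[upper].mean())
```

An angle of 0 degrees means that the relative response pattern across boutons is unchanged between sessions, whereas larger angles indicate a reorganized representation regardless of overall response amplitude.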
To address whether memory specificity impacts these mechanisms, we segregated the mice into discriminators and generalizers based on the discrimination index and observed similar relationships between memory strength and bouton responses (Figures 5F and S6A-S6D). Finally, we asked whether, in addition to learned top-down relevance, incertocortical afferents also encode bottom-up salience related to the physical properties of sensory stimuli. 42 We therefore imaged ZI boutons in response to noise-train stimuli at different intensities (Figures S4M-S4P). This revealed a significant albeit weaker correlation between absolute response magnitude and the perceived salience of the stimulus compared with top-down encoding (Figure S4Q). Moreover, whereas increased top-down salience causes a reduction in the population responses of ZI boutons (Figure S3O), the opposite effect is observed for bottom-up stimuli (Figures S4R and S4S). In conclusion, bottom-up and top-down relevance are encoded by distinct yet partially overlapping mechanisms in incertocortical synapses. Furthermore, memory manifests as a balanced form of population plasticity in ZI afferents that improves CS discrimination and information transfer and encodes the strength of the memory trace, in line with the observed essential function of this pathway for learning (Figure 4C).

[...] that boutons with the same response type are spatially clustered and may therefore often be positioned on the same axon. Next, we directly addressed this by identifying bouton pairs that were physically connected by an axon. This revealed that the overwhelming majority of sister boutons display the same CS response type (Figure S6Z). Collectively, this is consistent with the working hypothesis that incertocortical axons derive from two subpopulations of ZI neurons that undergo either PR or NR potentiation due to threat learning. However, this does not preclude possible contributions from additional mechanisms, such as activity-dependent plasticity of ZI boutons in cortex or acute presynaptic modulation via G-protein-coupled receptors. 7,43 To define the potentially unique role of NR boutons for the encoding of top-down information, we addressed how they contribute to the learning-related plasticity of CS representation in population space. To this end, we computed the angle between the response vectors of individual acquisition trials and the average population vector during either habituation or recall (Figures 7G and S7). For NR boutons, the angle between habituation and acquisition responses successively increased over consecutive CS+/US pairings while it simultaneously decreased between acquisition and recall (Figures 7H, 7I, and S7B). These changes occur rapidly: within only five acquisition trials, the CS representation has become more similar to the final pattern that will emerge in recall than to the original one in habituation. By stark contrast, PR boutons show no change in CS encoding during learning (Figures 7H, 7I, and S7B).
In consequence, the encoding dynamics of the entire bouton population are similar to those observed for NR boutons alone (Figures S7C and S7D). These results are largely mirrored, albeit with lower magnitude, for the CS− (Figures S7E-S7I) and do not occur in pseudoconditioned mice (Figures S7J-S7N). Moreover, reanalysis of the population-level changes between habituation and recall (Figure 6L), this time for PR and NR boutons separately, reveals a greater contribution of NR boutons also to long-term memory (Figure S7O). In conclusion, rapid NR potentiation is the major driver converting population response patterns from encoding neutral sensory information during habituation to the representation of stimuli with learned top-down relevance during recall.

DISCUSSION
The ZI has recently emerged as a major regulator of diverse brain functions. 14-20,33 Moreover, deep brain stimulation of this area alleviates motor and potentially mood deficits in patients with Parkinson's disease. 44 [...] the ZI is therefore crucial, the role of its projection to neocortex has not been established. Our multidisciplinary dissection identifies ZI afferents as a major determinant of ACx function that display both similarities and important differences to classical excitatory top-down projections deriving from other cortical areas, the thalamus, and the amygdala: 3,5-9,11 the similarities include the preferred targeting of L1, broad integration of brain-wide information, necessity for memory, and highly experience-dependent signaling. Conversely, the incertocortical pathway exhibits several unique features that identify it as a distinct source of top-down input, including its inhibitory mode of transmission, disinhibitory connectivity within the neocortex, and encoding of primary reinforcers. Moreover, the manner in which learned top-down relevance is encoded by these afferents differs fundamentally from long-range excitatory projections for which only positive transients and excitatory potentiation have been observed during memory acquisition and expression. 3,5-8 By contrast, approximately half of the ZI boutons develop negative stimulus responses after just a few conditioning trials, which are the main carriers of top-down information at the population level. The resulting bidirectional changes encode the strength of the memory trace and improve stimulus discrimination in the absence of large effects on mean responses.
The bidirectional implementation of this plasticity may serve to improve both the dynamic range and the metabolic efficiency of information transfer to neocortical circuits.

[Figure legend fragment. Statistics: (C and D) Friedman test with Dunn's multiple comparisons test, (E and F) two-way ANOVA with Sidak's multiple comparisons test, and (H and I) two-way ANOVA with Sidak's multiple comparisons test or two-tailed paired t test.]
ZI projections likely contribute to associative learning via two distinct operations: first, acute disinhibition recruited by the US can instruct plasticity induction in the local circuit by boosting PN activity. 11,29,30 Such disinhibitory gating has been proposed as a signaling mode for long-range inhibition in the hippocampal formation, 12,13 and our data reveal that it is also a major factor governing neocortical function. In that respect, future work on the physiological properties of ZI inputs to IN types beyond L1 will be important. Second, short- and long-term changes in CS encoding by ZI afferents themselves are likely to contribute to the representational plasticity that has been observed in several ACx circuit elements in response to learning. 21-23,38 ZI projections preferentially target INs, and this disinhibitory connectivity provides a more flexible and dynamic substrate for circuit control than the direct excitation supplied by classical top-down afferents. This is the case since the effects on PNs depend not only on incertocortical signaling itself but also on the current activity patterns of the targeted IN types, which furthermore control different somatodendritic domains of PNs, are differentially modulated by a range of neuromodulators, and are in addition optimized for signaling in different frequency bands. 38,45,46 Moreover, on top of the net disinhibition of cortical PNs that is likely caused by positive potentiation of incertocortical synapses, approximately half of the boutons displayed robust negative-going responses that lead to an increase in PN inhibition via disinhibition of the INs. A subset of PNs also receives direct inputs from ZI projections, further enriching this circuit diagram. Together, our results therefore identify ZI afferents as a novel top-down pathway for the experience-dependent redistribution of inhibition, with likely rich computational consequences for neocortical processing.
In addition to the ACx, the ZI projects widely to several brain areas, 14,15 such as the higher-order thalamus. 47,48 At the brain-wide level, an attractive possibility for the mechanistic implementation of its effects is therefore that the ZI serves to temporally coordinate and organize the activity patterns within this large-scale network in a manner that enables memory acquisition and expression. 49 In addition to memory, top-down information is critical for a number of further functions including credit assignment 3,50 and predictive coding. 2 Since the ZI is tightly linked to motor function, 14,15,44 incertocortical afferents may be particularly central for the neocortical computation of sensorimotor predictions. 2,51 Importantly, given its bidirectional encoding of learned top-down relevance, the ZI may be able to contribute to the computation of both positive and negative prediction errors in the neocortex. Our study has begun to pinpoint the importance and unique attributes of inhibitory top-down projections in the neocortex and may thereby also enable future work on the contribution of additional long-range inhibitory systems 12,52 to neocortical function.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Animals
All animal procedures were executed in accordance with institutional guidelines and approved by the prescribed authorities (Regierungspräsidium Darmstadt, approval numbers F126/1027 and F126/2000). Adult (>P35) C57BL/6J mice (JAX stock #000664, The Jackson Laboratory), homozygous GAD2-IRES-Cre mice (Gad2 tm2(cre)Zjh/J, JAX stock #010802, The Jackson Laboratory) 55 and GAD2-T2a-NLS-mCherry mice (Gad2 tm1.1Ksvo, JAX stock #023140, The Jackson Laboratory) 56 were used. Animals were housed under a 12 h light/dark cycle and provided with food and water ad libitum, except for water restriction periods in mice trained for head-fixation (body weight loss <15%). After surgical procedures, mice were individually housed. In behavioral experiments, only male mice were used. Research was conducted following the ARRIVE guidelines.

METHOD DETAILS
The experiments were not randomized, sample sizes were not determined prior to experimentation, and the investigators were not blinded to allocation during experiments and outcome assessment.

Surgery
In all cases, mice were anesthetized with isoflurane (induction: 4%, maintenance: 2%) in oxygen-enriched air (Oxymat 3) and fixed in a stereotaxic frame (Kopf Instruments). Body temperature was maintained at 37.5 °C by a feedback-controlled heating pad (FHC). Analgesia was provided by local injection of ropivacaine (Naropin) under the scalp and i.p. injection of metamizol (200 mg/kg, Novalgin, Sanofi) and meloxicam (2 mg/kg, Metacam, Boehringer-Ingelheim). For surgeries involving chronic window implantation, buprenorphine (i.p. injection, 0.1 mg/kg, cp-pharma) was used instead of metamizol. Adeno-associated viral vectors (AAVs) or retrograde tracers were injected from glass pipettes (P0549, Sigma) connected to a pressure ejection system (PDES-02DE-LA-2, NPI). For ZI, 200 nl of AAV was injected at -1.94 mm posterior and ±1.7 mm lateral from bregma, and -4.15 mm from the cortical surface. We experimentally tested a range of volumes and determined that, with these coordinates, 200 nl of AAV allows for selective targeting of the ZI in GAD2-IRES-Cre mice for all viruses used. Evidence for selective labeling using this approach is provided in Figures S1F, S1K [...]
In surgeries for calcium imaging experiments, GAD2-IRES-Cre mice were injected in the ZI with a 5:1 mix of AAV2/5-hSynapsin1-Flex-axon-GCaMP6s (Addgene, #112010-AAV5) and AAV2/9-FLEX-tdTomato (Addgene, #28306-AAV9). For chronic window implantation, a craniotomy was performed over ACx with a biopsy punch (Integra Miltex) and covered by a custom-made window (a round cover glass glued with Norland optical adhesive #81 to a section of hypodermic tubing of outer diameter 3 mm). The window and a custom-made titanium head plate were fixed using cyanoacrylate glue (Ultra Gel, Henkel) and dental cement (Paladur, Heraeus). The glass window was protected with silicone (Kwik-Cast). Calcium imaging was performed >5 weeks after surgery.
Transsynaptic tagging with fluorescent in situ hybridization (FISH)
GAD2-IRES-Cre mice were injected with AAV2/1-pEF1a-DIO-FLPo-WPRE-hGHpA (Addgene #87306-AAV1) in ZI, and in ipsilateral ACx with AAV2/5-EF1a-fDIO-EYFP-WPRE (UNC). 6 weeks post-injection, mice were anesthetized with isoflurane and sacrificed, and the brains were then dissected, embedded in Tissue-Tek O.C.T. compound (Sakura) and frozen in isopentane at -55 to -60 °C. 16 μm-thick sections from these fresh frozen brains were prepared using a cryostat (Leica) and mounted on SuperFrost Plus microscope slides (Thermo Scientific). Sections were screened for fluorescence in auditory cortex (Zeiss Axio Zoom) and then stored at -80 °C until FISH was performed using the RNAscope Fluorescent Multiplex Reagent Kit (#320850, Advanced Cell Diagnostics). Heating steps were performed using the HybEZ™ oven (Advanced Cell Diagnostics). Tissue sections were treated with pretreatment solutions and then incubated with RNAscope probes (EGFP (which also labels EYFP), #400281; Mm-Camk2a-C2, #445231-C2; Mm-Vip-C2, #415961-C2; Mm-Sst-C2, #404631-C2; Mm-Sst-C3, #404631-C3; Mm-Ndnf-C3, #447471-C3; Mm-Pvalb-C3, #421931-C3), followed by amplifying hybridization processes. DAPI was used as a nuclear stain. Prolong Gold Antifade (Thermo Scientific) was used to mount slides. Images were acquired on a confocal microscope (Zeiss LSM 880). EYFP-expressing cells, their distance from the pia and overlap with markers were quantified using a custom-written MATLAB (MathWorks) script. Although the EGFP (anti-sense) probe labeled both mRNA and viral DNA, resulting in sparse puncta throughout the site of AAV2/5-EF1a-fDIO-EYFP-WPRE injection in ACx, cells that expressed mRNA (made possible via FLPo recombination) could be clearly distinguished from those that did not based on the intensity of expression (see Figure S1P).
Anterograde transsynaptic tracing using AAV1 has been employed in a number of brain areas from both glutamatergic and GABAergic afferents to both excitatory and inhibitory targets. 27,28,53,57-59 The approach depends on high viral titers and signal amplification via DNA recombinases. 27,28,53 Although the precise molecular mechanisms are not fully understood, the available evidence indicates that AAV1 is trafficked down the axon, is not released from fibers of passage, and that transneuronal spread is strongly dependent on synaptic transmitter release. 27,53 The efficacy of transsynaptic spread was estimated to be roughly equal for excitatory and inhibitory afferents and excitatory and inhibitory postsynaptic targets. 27 This method recapitulates established connectivity patterns, 27 and AAV1-labeled neurons are more likely to receive functional synaptic input from the afferents in question than their unlabeled neighbors. 27,28 While AAV1 can also spread retrogradely along axons, this caveat does not apply to the data presented here since we identify overwhelmingly postsynaptic INs, which do not extend their axons to long-range targets. These results indicate that the differences in neuronal labeling we observe using this technique are highly likely to derive from differences in synaptic connectivity. While our results therefore indicate which neuronal cell-types receive preferential innervation by ZI projections, we note that the proportion of connected neurons identified for each cell-type is likely influenced by the difference in their respective density in ACx. 38

Histology
Mice were anesthetized i.p. with 300 mg/kg ketamine and 20 mg/kg xylazine (WDT) and transcardially perfused with 4% paraformaldehyde (PFA) in PBS. Brains were post-fixed overnight in 4% PFA at 4 °C and then stored in PBS. Coronal sections (60-100 µm thick) were cut using a Leica vibratome (VT1000S) and washed in PBS.
Vibratome sections were permeabilized with 0.5% Triton (Sigma) and then blocked in PBS-0.2% gelatin with 10% normal goat serum (Sigma), 0.2 M glycine and 0.5% Triton either overnight at 4 °C or at room temperature for 4 h. In cases where mouse primary antibodies would be used, 1:50 goat anti-mouse IgG antigen-binding fragments were included in the blocking solution (Jackson ImmunoResearch). Sections were incubated with primary antibodies in PBS-0.2% gelatin with 5% normal goat serum and 0.5% Triton for 72 h at 4 °C. Primary antibodies used were the following: mouse anti-NeuN (1:500, RRID: AB_2298772, Merck Millipore) or rabbit anti-RFP (1:500, RRID: AB_591279, MBL). Sections were then washed in PBS with 0.5% Triton and incubated, either overnight at 4 °C or at room temperature for 4 h, with fluorophore-conjugated secondary antibodies (1:1000, goat, Thermo Fisher Scientific) in PBS-0.2% gelatin with 5% normal goat serum and 0.5% Triton. DAPI was used as a nuclear stain (5 nM in PBS). Sections were mounted in Mowiol 4-88 (Polysciences) and imaged on a Zeiss confocal microscope (LSM 880). Brain regions were assigned using the Paxinos and Franklin mouse brain atlas. 60 For ZI axon quantifications in cortex, A1 and AuV were defined as regions in 500 µm blocks moving medially along the pia away from the rhinal fissure. L1 depth was calculated separately for both regions based on the DAPI signal. The L1/L2 border was defined as the last bin before the DAPI fluorescence intensity (bin size 10 µm, as a function of depth from the pia) exceeded 1 standard deviation above the average of the first 80 µm for 2 consecutive bins. For retrograde labeling quantifications, ZI was identifiable from GAD2-nuclear-mCherry labeling. Labeled cells and overlap with mCherry signal were quantified manually from sections spanning the full anteroposterior extent of the ZI (from approximately -1.22 to -2.92 mm relative to bregma). For rabies tracing, mice were sacrificed 7 days post-RV injection.
Perfused brains were cut in 60 µm coronal sections and counterstained with DAPI (30 min in 0.5 mg/ml, D1306, Thermo Fisher Scientific). To quantify the cell number per animal, every third section of the entire brain was scanned using Zen software (Zeiss) and cells were counted using a custom-written MATLAB (MathWorks) script. To define the cell numbers in different brain regions, images were registered to the Allen Brain Atlas. 61 Only brain regions that revealed presynaptic cells in all mice are reported. Mice which had starter cells outside of the ZI were excluded from analysis. For dye infusion experiments, animals were briefly anesthetized with isoflurane and 200 nl of 1 mM Alexa Fluor 488 Hydrazide (#A10436, Thermo Fisher Scientific) was infused bilaterally via implanted cannulae in ACx using a Hamilton syringe (10 µl, 701N, Merck) and Nanoject stereotaxic syringe pump (Chemyx). 62,63 A high concentration of dye was required for these experiments to be able to visualize it in tissue sections. 15 min after recovery, animals were perfused and the brains were cut in 150 µm coronal sections. Sections were then imaged on a Leica fluorescence microscope (DFC7000 GT) and fluorescence intensity was quantified using Fiji. The following histological images are composites obtained by stitching different fields of view: Figures 1G, 1H, 3B, and 3C; Figures S1B, S1C, S1F, S1G, S1H, S1K, S1P, S1W, S1X, S2F, S2H, S2I, S2J, S2M, and S3C.

69 to assess associative threat memory to the CS- and CS+. Pupil dilation in response to the CSs was calculated as the integral of $\Delta D/D_0$ in a 15 s time window for habituation and recall, where $D_0$ is the mean pupil diameter within 2 s before sound-train onset and $\Delta D = D(t) - D_0$, where $D(t)$ is the diameter at time $t$. In the acquisition session, we used a 9.5 s time window since the tail-shock caused a rapid constriction of the pupil.
70,71 Pupil dilation in response to the shock was calculated the same way as for the CSs, but in a 7.5 s time window starting 3.5 s after the shock ended. To account for the different time windows between sessions, values were normalized to a time interval of 1 s. Discrimination index was calculated as CS+/(CS- + CS+). 72 In habituation and acquisition, mice were head-fixed in a body tube made of plastic and lined with electrical insulating tape, and a plastic dish filled with 70% ethanol was placed inside the microscope chamber. In order to change the behavioral context during recall, a body tube made of metal was used, and a plastic dish filled with 0.2% acetic acid was placed inside the microscope chamber. Pseudoconditioning (PC) was conducted like DTC except that during acquisition, CSs and USs were presented in an explicitly unpaired fashion (15 presentations each for CS1, CS2 and shock). In this case, none of the CSs elicited any associative memory, and they were therefore pooled for analysis. 7,22,38 Note that the only difference between the CS+, CS- and CS1/2 is the level of threat memory they evoke, since the experiments occur under otherwise identical conditions (i.e., up- and down-sweeps used in a counterbalanced fashion between animals for the CSs, with mice being randomly assigned to DTC or PC groups). For this reason we plot these CSs together in our correlation plots (Figures 6M, 6N, S6B, S6D, and S6E-S6H) in order to illustrate the neuronal representation of the level of threat memory regardless of the stimulus that evokes it.
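The pupil measures described above (baseline-normalized dilation integrated over a response window, rescaled to 1 s, and the discrimination index) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' code: the function names, the trapezoidal integration and the array layout are assumptions.

```python
import numpy as np

def pupil_response(diameter, t, onset, window, baseline=2.0):
    """Integral of dD/D0 over [onset, onset + window], normalized to 1 s.

    diameter, t : 1-D arrays of pupil diameter and time stamps (s).
    D0 is the mean diameter in the `baseline` seconds before `onset`.
    Trapezoidal integration is an assumption of this sketch.
    """
    diameter = np.asarray(diameter, dtype=float)
    t = np.asarray(t, dtype=float)
    d0 = diameter[(t >= onset - baseline) & (t < onset)].mean()
    rel = (diameter - d0) / d0                      # dD / D0
    mask = (t >= onset) & (t <= onset + window)
    tt, rr = t[mask], rel[mask]
    integral = np.sum(0.5 * (rr[1:] + rr[:-1]) * np.diff(tt))
    return integral / window                        # per-second value

def discrimination_index(cs_plus, cs_minus):
    """CS+ / (CS- + CS+), as defined in the text."""
    return cs_plus / (cs_minus + cs_plus)
```

With this convention, a discrimination index of 0.5 corresponds to equal responses to both stimuli, and the same `pupil_response` call can be reused with `window=15`, `9.5` or `7.5` for the different sessions.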

In vivo calcium imaging and noise-train stimulation
Secondary auditory cortex (AuV) was localized with intrinsic imaging under 1% isoflurane anesthesia as in previous work. 7,22,38 AuV was selected as a focus for these experiments since it is critical for FM-sweep threat memory. 7,22,38 In addition, technical limitations preclude imaging of the more temporal area TeA due to steric hindrance by the base of the zygomatic bone and the presence of the rhinal vein. Mice were water restricted and water delivery was used to facilitate habituation to handling and head-fixation on 6 consecutive days (4 days in the recording setup). Water was administered ad libitum before the experiments in head-fixation. Calcium imaging was performed with a resonant scanning microscope (Bruker Investigator). The femtosecond laser (Spectra Physics InSight) was tuned to 920 nm to excite axonGCaMP6s and tdTomato at an average excitation power under the objective (Nikon 16x, 0.8 N.A., 3 mm WD) of 20-25 mW. Images (512 x 512 pixels, 140 x 140 to 166 x 166 µm²) were acquired at 30 Hz in L1. Image acquisition, sound delivery and the camera for pupil tracking were controlled using custom-written software. 73,74 For image analysis, acquired time series were first corrected for motion, taking tdTomato images as a reference and using custom MATLAB code 75 in which data were temporally binned every 2 frames. Based on the average response in the red channel (tdTomato) to the CS+ in conditioning trials, or to the tail-shock alone in pseudoconditioning trials, a window of exclusion surrounding the tail-shock delivery was defined to account for the fact that some boutons could not be motion corrected during this period (exclusion window from 1 s before until 3.5 s after shock delivery, see Figure S5A). Regions of interest (ROIs) were selected in Fiji from the axonGCaMP6s fluorescence time average of the entire session.
The following pipeline in Fiji allowed selection of both small and large boutons in a semi-automated way (Figure S3L): first, the average axonGCaMP6s fluorescence image was filtered (maximum filter, 1 pixel radius); the resulting image was thresholded with the IsoData algorithm, keeping the ~5% highest-intensity pixels; the resulting binary image was segmented with the watershed command; ROIs were obtained by applying the particle analysis, excluding particles on edges. Finally, ROIs were corrected manually in cases where more than 1 bouton was included inside an ROI, or in cases where ROIs delineated an axon segment without any bouton. Small boutons missed by the thresholding were also added manually. To superimpose frames from all sessions, axonGCaMP6s images were first translated and cropped in Fiji. ROIs were then obtained for each session independently (Figure S3M). ROI sets from all three sessions were overlaid, and overlapping ROIs were assigned as paired boutons (Figure S3N). Only paired boutons were used in the analysis. Average GCaMP6 fluorescence was measured for each ROI and frame, and data were subsequently analyzed in MATLAB. ROIs that displayed flat fluorescence traces without any calcium transients in the entire session were discarded. Traces were normalized either as $\Delta F/F_0$ or as z-score $= \Delta F/\sigma$, where $\Delta F = F(t) - F_0$, with $F(t)$ being the fluorescence at a given time $t$, and $F_0$ and $\sigma$ the mean fluorescence before the sound-train in each trial and its standard deviation, respectively. Boutons were considered sound responsive if they displayed significant activity that started no later than 10 s after sound-train onset (averaged z-score $\geq 1.96$ or $\leq -1.645$). Responses during the FM-sweep sound-train were measured as the integral of $\Delta F/F_0$ from sound-train onset to the end of the sound-train (10 s sound-trains), except for conditioning, where only the first 9 s were measured to exclude shock-related motion artefacts or responses (Figure S5A).
To account for the different time windows between sessions, values were normalized to a time interval of 1 s. Responses following shock delivery in conditioning or pseudoconditioning sessions were measured as the integral of $\Delta F/F_0$ over a 5 s window starting 3.5 s after shock delivery. DTC boutons were segregated into two paired populations across sessions based on whether their mean CS response ($\Delta F/F_0 \cdot$ s) in recall was above ('positive response in recall', PR boutons) or below ('negative response in recall', NR boutons) zero. Latency to response peak was calculated as the time from CS onset to the global response peak within 9 s for all 3 DTC sessions, in consideration of the shock during the conditioning session. For correlation with memory strength, we compared pupil and bouton response in habituation or recall for each mouse and stimulus (CS+, CS-, CS1/2). This was done independently for PR and NR boutons, and additionally the absolute values of these changes were added together to obtain the 'absolute bouton response'. Viral targeting allows for efficient labeling of GAD2+ neurons with axonGCaMP6s and tdTomato throughout the mediolateral axis of the ZI (Figure S3C), making it unlikely that response plasticity could be influenced by bias or inter-individual differences in labeling.
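As an illustration of the trace normalization and responsiveness criterion described above, the following NumPy sketch computes ΔF/F0 and the z-score for a single trial and applies the thresholds (z ≥ 1.96 or z ≤ -1.645) within 10 s of sound-train onset. The original analysis was done in MATLAB; the function names, the use of the sample standard deviation, and the reading of "significant activity" as any threshold crossing in the window are assumptions of this sketch.

```python
import numpy as np

def normalize_trial(f, t, onset):
    """Return (dF/F0, z-score) for one trial of one bouton.

    F0 and sigma are the mean and standard deviation of the fluorescence
    before sound-train onset (sample SD, ddof=1, is an assumption).
    """
    f = np.asarray(f, dtype=float)
    t = np.asarray(t, dtype=float)
    base = f[t < onset]
    f0 = base.mean()
    sigma = base.std(ddof=1)
    df = f - f0
    return df / f0, df / sigma

def is_sound_responsive(z_mean, t, onset, max_latency=10.0,
                        pos_thr=1.96, neg_thr=-1.645):
    """True if the trial-averaged z-score crosses either threshold
    within `max_latency` seconds of sound-train onset."""
    z_mean = np.asarray(z_mean, dtype=float)
    t = np.asarray(t, dtype=float)
    win = (t >= onset) & (t <= onset + max_latency)
    return bool(np.any(z_mean[win] >= pos_thr) or np.any(z_mean[win] <= neg_thr))
```

In practice `z_mean` would be the average of the per-trial z-scores returned by `normalize_trial` for a given bouton and stimulus.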

Stimulus information in individual boutons
All computational analyses were performed in Python, using the NumPy, 76 SciPy, 77 and Scikit-learn 78 libraries. The amount of stimulus-specific information in the response of individual boutons was quantified using the sensitivity index, defined as:

$$d' = \frac{\mu_{CS+} - \mu_{CS-}}{\sqrt{\tfrac{1}{2}\left(\sigma_{CS+}^{2} + \sigma_{CS-}^{2}\right)}}$$

Here $\mu_{CS+}$ and $\mu_{CS-}$ are a bouton's mean response during the CS+ and CS- presentation, averaged over the duration of the stimulus and all trials in which the stimulus was presented. $\sigma_{CS+}$ and $\sigma_{CS-}$ denote the standard deviations of the responses across trials, after averaging across the stimulus window for each trial.
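A minimal sketch of this computation for a single bouton is given below. It assumes the standard form of the sensitivity index, d' = (mean CS+ minus mean CS-) divided by the square root of the average of the two across-trial variances; the function name, the input layout (trials by time points) and the use of the sample variance are assumptions.

```python
import numpy as np

def sensitivity_index(resp_plus, resp_minus):
    """d' for one bouton.

    resp_plus, resp_minus : arrays of shape (n_trials, n_timepoints)
    holding the response during CS+ and CS- presentations.
    Responses are first averaged over the stimulus window per trial;
    means and SDs are then taken across trials.
    """
    a = np.asarray(resp_plus, dtype=float).mean(axis=1)   # per-trial CS+ response
    b = np.asarray(resp_minus, dtype=float).mean(axis=1)  # per-trial CS- response
    pooled_sd = np.sqrt(0.5 * (a.var(ddof=1) + b.var(ddof=1)))
    return (a.mean() - b.mean()) / pooled_sd
```

The sign of the result indicates which stimulus evokes the larger response; its absolute value can be used when only the amount of stimulus information matters.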

Decoding from populations of boutons
To assess how much stimulus information is present on the population level, we trained linear decoders to predict stimulus identity from synaptic activity. 79 For each trial, we collected the time-varying population response during stimulus presentation, binned in time windows of 1 s. We excluded the last (shock) bin. Each stimulus was presented 15 times for 10 s, yielding 9 time bins for each of the 2 x 15 trials. We decoded the stimuli using L2-regularized logistic regression, using leave-two-trials-out cross-validation to avoid overfitting and balance the number of training samples per stimulus. The remaining 28 x 9 training samples were randomly divided into 10 cross-validation folds, which were used to determine the regularization strength of logistic regression. The candidate regularization strengths were 10 values, equally spaced on a logarithmic scale between 1e-4 and 1e4 (the Scikit-learn default). These values determine the inverse regularization strength, such that smaller values correspond to stronger regularization. Using the optimal regularization strength, we tested the classifier on the 2 left-out trials, repeating this cross-validation procedure by leaving out all pairs of consecutive trials (1x CS+, 1x CS- each): ({1, 2}, {3, 4}, ..., {29, 30}). The average accuracy (% of time frames correct) over all of these test trials is reported. To estimate chance level performance, we repeatedly trained classifiers after randomly permuting the stimulus labels of individual trials 1000 times. After each permutation, we ensured that both the training set and the test set consist of 50% CS+ trials and 50% CS- trials. Decoders were trained on data from individual mice and on pooled data, for which we concatenated the responses from all mice for corresponding trials in the protocol.
For the pooled data, we first used principal component analysis (PCA) to reduce the dimensionality of the data from the total number of boutons down to 200 dimensions, preserving approximately 90% of the variance. This sped up the analyses without qualitatively changing the results. We did not reduce the dimensionality before decoding from individual animals. To test if the higher decoding accuracy after conditioning compared to pseudoconditioning was due to a larger number of recorded boutons, we subsampled boutons from the conditioned mice. Specifically, decoders were trained on 100 random subsets of 535 boutons from conditioned mice, since this was the total number of boutons from all mice in our pseudoconditioning dataset. To assess whether differences in decoding accuracy were significant, bootstrapping was used to estimate the sampling distribution of the decoding accuracy. Specifically, we trained decoders on 1000 resamples (with replacement) of all boutons.
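The leave-two-trials-out decoding scheme described in this section could be sketched with Scikit-learn roughly as follows. This is not the authors' code: the function name and input layout are hypothetical, and the use of `LogisticRegressionCV` with a shuffled `StratifiedKFold` is one plausible way to implement the inner 10-fold search over 10 log-spaced regularization values. The sketch assumes trial indices are ordered so that each consecutive pair contains one CS+ and one CS- trial.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import StratifiedKFold

def decode_leave_two_out(X, y, trial_ids, seed=0):
    """Fraction of held-out time bins classified correctly.

    X         : (n_samples, n_boutons) population response per 1-s bin
    y         : (n_samples,) stimulus label per bin (0 = CS-, 1 = CS+)
    trial_ids : (n_samples,) index of the trial each bin belongs to
    """
    trials = np.unique(trial_ids)
    n_correct, n_total = 0, 0
    for i in range(0, len(trials), 2):
        # Hold out one pair of consecutive trials (one per stimulus).
        test = np.isin(trial_ids, trials[i:i + 2])
        inner = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        # Cs=10 gives 10 log-spaced values between 1e-4 and 1e4;
        # the default penalty is L2.
        clf = LogisticRegressionCV(Cs=10, cv=inner, max_iter=2000)
        clf.fit(X[~test], y[~test])
        n_correct += int(np.sum(clf.predict(X[test]) == y[test]))
        n_total += int(test.sum())
    return n_correct / n_total
```

A chance level can be estimated in the same framework by permuting the trial labels (keeping each held-out pair balanced) and re-running the function, and PCA or bouton subsampling can be applied to `X` beforehand for the pooled analyses.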

Dynamics of population vectors over learning
To assess how stimulus-evoked population activity changes over the course of learning, we computed angles between population vectors at different moments in time and for different stimulus conditions. 79 The change from habituation to recall was quantified as the 'learning angle' between the mean population vectors for each session. The mean vectors were computed as the average population response, averaged across all trials and stimulus bins (again excluding the shock window), for each stimulus separately. The dynamics of the population vector during acquisition was quantified by the angle between the average population vector during habituation (or recall) and the single-trial population vectors during acquisition. The single-trial vectors were computed as the time-averaged response during stimulus presentation. To assess whether the differences in the population vector from one day to the next (i.e., between habituation and the beginning of acquisition, and the end of acquisition and recall) are due to trial-to-trial fluctuations or due to representational changes, we quantified the trial-to-trial variability during habituation and recall. To this end, we computed the average of the angle between the mean population vector during habituation (or recall) and the single-trial habituation (recall) vectors. To avoid underestimating trial-to-trial variability, we computed the angle between the population vector of trial t and the average vector from all trials other than trial t.
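The angle measures described above can be illustrated with a short NumPy sketch. The function names and the (trials by boutons) input layout are assumptions; the leave-one-out average in `trial_variability` follows the description of comparing each trial with the mean of all other trials.

```python
import numpy as np

def angle_deg(u, v):
    """Angle (degrees) between two population vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def learning_angle(hab_trials, recall_trials):
    """Angle between mean population vectors of two sessions.

    Inputs have shape (n_trials, n_boutons) and hold time-averaged
    responses to one stimulus.
    """
    hab = np.asarray(hab_trials, dtype=float).mean(axis=0)
    rec = np.asarray(recall_trials, dtype=float).mean(axis=0)
    return angle_deg(hab, rec)

def trial_variability(trials):
    """Mean angle between each trial vector and the average of all
    other trials (leave-one-out, to avoid underestimating variability)."""
    trials = np.asarray(trials, dtype=float)
    n = trials.shape[0]
    total = trials.sum(axis=0)
    angles = [angle_deg(trials[i], (total - trials[i]) / (n - 1))
              for i in range(n)]
    return float(np.mean(angles))
```

Comparing `learning_angle` against `trial_variability` within a session gives a sense of whether a day-to-day change exceeds ordinary trial-to-trial fluctuation.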

Response variability and correlations across boutons
We observed that conditioning generated both positive and negative stimulus responses, thereby increasing the response variability across boutons. To quantify this effect, we computed the standard deviation across boutons of their trial-averaged responses, for each moment during stimulus presentation and for each mouse. 79 We also analyzed the correlation structure of trial variability, by computing noise correlations between all pairs of simultaneously imaged boutons. To this end, we first computed the time-averaged response for each bouton and trial, before computing Pearson correlation coefficients. Correlation coefficients were then pooled across mice. We left out the first two trials, since the initial stimulus response on those trials led to unusually high noise correlations (of almost 1) for many boutons. Please note that, in part due to the relatively low number of trials per stimulus per session (n = 15), noise correlation distributions observed for our data are wider than those which have been observed in previous studies which used 100s instead of 10s of trials. 80
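Both measures in this section reduce to a few array operations. The sketch below is illustrative only: the function names and the (trials, boutons, time points) layout for one stimulus in one mouse are assumptions, and the first two trials are skipped for the noise correlations as described above.

```python
import numpy as np

def response_sd_across_boutons(resp):
    """SD across boutons of the trial-averaged response, per time point.

    resp : array of shape (n_trials, n_boutons, n_timepoints).
    Returns an array of shape (n_timepoints,).
    """
    trial_avg = np.asarray(resp, dtype=float).mean(axis=0)  # (boutons, time)
    return trial_avg.std(axis=0, ddof=1)

def noise_correlations(resp, skip_trials=2):
    """Pearson correlations of time-averaged single-trial responses
    for all pairs of simultaneously imaged boutons.

    Returns a 1-D array with one coefficient per bouton pair.
    """
    r = np.asarray(resp, dtype=float)[skip_trials:].mean(axis=2)  # (trials, boutons)
    corr = np.corrcoef(r, rowvar=False)                           # (boutons, boutons)
    upper = np.triu_indices(corr.shape[0], k=1)
    return corr[upper]
```

Coefficients from different mice can then be concatenated to obtain the pooled distribution.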

QUANTIFICATION AND STATISTICAL ANALYSIS
This was performed using GraphPad Prism and MATLAB. Data were considered normally distributed if Shapiro-Wilk, D'Agostino & Pearson and KS tests were passed. According to this result, and depending on whether data was paired or not, comparisons were performed using the following parametric or non-parametric tests: For two-group comparisons, 2-tailed t-test (normal distribution, non-paired), 2-tailed paired t-test (normal distribution, paired), 2-tailed Mann-Whitney test (non-normal distribution, non-paired) and 2-tailed Wilcoxon test (non-normal distribution, paired). For three-group comparisons, One-way ANOVA (normal distribution, nonpaired) or One-way repeated measures (RM) ANOVA (normal distribution, paired) followed by Tukey's multiple comparisons test, and Kruskal-Wallis test (non-normal distribution, non-paired) or Friedman test (non-normal distribution, paired) followed by Dunn's multiple comparisons test. Two-way RM ANOVA followed by Sidak's and Tukey's multiple comparisons tests were used to compare groups across more than one factor. Correlations were computed as Pearson coefficients. Statistical tests used in each instance are indicated in the figure legends, and full details on all results in the study are provided in the supplementary materials. Results are reported as: n.s. (not significant) p>0.05, * p<0.05, ** p<0.01, *** p<0.001, **** p<0.0001.
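The two-group decision rule described above (normality check first, then a parametric or non-parametric test depending on pairing) can be mirrored in SciPy. The study itself used GraphPad Prism and MATLAB, so this is a swapped-in illustration rather than the authors' procedure; the function names, the 0.05 threshold for the normality tests, and the use of a plain KS test against a normal distribution with estimated parameters are assumptions.

```python
import numpy as np
from scipy import stats

def looks_normal(x, alpha=0.05):
    """True only if Shapiro-Wilk, D'Agostino & Pearson and KS tests all pass."""
    x = np.asarray(x, dtype=float)
    p_sw = stats.shapiro(x)[1]
    p_dp = stats.normaltest(x)[1]          # D'Agostino & Pearson omnibus test
    p_ks = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))[1]
    return min(p_sw, p_dp, p_ks) > alpha

def compare_two_groups(a, b, paired=False):
    """Pick and run a two-tailed test; returns (test name, p-value)."""
    normal = looks_normal(a) and looks_normal(b)
    if normal and paired:
        name, res = "paired t-test", stats.ttest_rel(a, b)
    elif normal:
        name, res = "t-test", stats.ttest_ind(a, b)
    elif paired:
        name, res = "Wilcoxon test", stats.wilcoxon(a, b)
    else:
        name, res = "Mann-Whitney test", stats.mannwhitneyu(a, b, alternative="two-sided")
    return name, float(res.pvalue)
```

The same pattern extends to three-group comparisons with `scipy.stats.f_oneway`, `scipy.stats.kruskal` and `scipy.stats.friedmanchisquare`, although post hoc tests would need additional tooling.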