EXPERIMENTAL MODEL AND SUBJECT DETAILS
Animals
All animal protocols and procedures were conducted with the approval of the Animal Care and Use Committee at the University of California, Irvine. Transgenic mouse lines expressing GCaMP6s in excitatory neurons were used to visualize excitatory neurons with two-photon imaging. For all hippocampal and some cortical imaging, the Thy1-GCaMP6s GP4.3 line (RRID:IMSR_JAX:024275) was used. The remaining cortical imaging was conducted on mice resulting from a cross between the CAMK2a-tTA driver line (RRID:IMSR_JAX:007004) and a line expressing the calcium indicator GCaMP6s under the control of the tetracycline-responsive regulatory element (tetO, RRID:IMSR_JAX:024742).66 Mice were group-housed until the headplate implantation surgery (>P40) and housed individually afterward. The mice were maintained on a 12-hour light/dark cycle in the vivarium. Animals of either sex were selected for experiments. The animals were habituated to head fixation over a few days, then trained to run and lick for hidden rewards in the visual VR in a series of steps that took 4-6 weeks. Mice were either water- or food-restricted to motivate behavior but given supplementary food or water to maintain 80% of baseline weight. Some mice were perfused after experimentation to allow for brain histology.
METHOD DETAILS
Surgical procedures
Mice underwent a headplate implantation and craniotomy in either the same or separate surgeries. First, the mice were implanted with custom-designed metal headplates. In preparation, connective tissue was cleared from the surface of the skull and a thin layer of Vetbond was applied. Then the headplate was affixed, at an angle parallel to the site of imaging, with black dental acrylic (Lang Dental). The second procedure was a craniotomy. For cortical imaging, a 4 mm diameter cranial window was drilled using methods described previously.67 The cranial window was centered either along the midline or 2 mm lateral to midline above the right hemisphere, 1.5 mm anterior to lambda. A 4 mm glass coverslip (World Precision Instruments) was placed over the exposed brain and sealed with Vetbond and black dental acrylic. Occasionally bone would grow underneath the coverslip, obscuring the field of view. An additional procedure would then follow to remove the current coverslip, delicately remove the bone growth and dura with a microscalpel, and replace the coverslip. For hippocampal imaging, tissue over the somatosensory cortex was aspirated and replaced with a 1.8 mm cylindrical micro-optic plastic (MOP). MOPs were formed by curing the optical polymer BIO-13368,69 with 395 nm light in a custom-built aluminum mold. During all procedures, mice were anesthetized with isoflurane in O2 (2% for induction, 1-1.5% for maintenance). Carprofen (5 mg/kg, s.c.) and topical lidocaine (2%, 20 mg/ml) were used as analgesics. Dexamethasone (4.8 mg/kg, i.m.) was administered 4 hours before surgery to control inflammation. Sterile eye ointment (Rugby) was used to keep the eyes hydrated during the procedure. Body temperature was stabilized at 37°C with a heating pad under the control of a rectal thermoprobe. The animals recovered on a warm heating pad post-surgery and were given daily injections of Carprofen (5 mg/kg, s.c.) for 3 days post-surgery.
Visual Virtual Reality Setup
The visual virtual reality (VR) system translated rotation of a 3D printed running wheel (37.7 cm circumference) into propulsion through a virtual circular track environment displayed on three tablets (T530NU Samsung). The animal was held by a head-fork over the wheel, and viewed the VR environment on tablets held at right angles 12 cm from the eyes (300° of visual field coverage along the azimuth). Rotations of the wheel were detected by a rotary encoder (Avago), processed by a data acquisition board (NIDAQ), and input into the computer. The animal’s licks in anticipation or during consumption of reward were detected by a capacitive lick sensor (Sparkfun), and also routed into the data acquisition board. A camera recorded the animal’s pupil (Allied Vision 1" GigE Vision). The data acquisition board output signals to open a solenoid valve for a specified amount of time, allowing water or diluted condensed milk to flow through to a reward spout placed in front of the animal’s mouth. Running speed, position along the track, licking, pupil size, and reward delivery were all recorded during the session.
Custom software written in MATLAB (Mathworks) managed the view of the VR environment. Based on movement of the wheel, the point of view along the circular track progressed forwards or backwards. The system updated at a 30 Hz refresh rate. Each VR environment was composed of a circular track with circumferences ranging from 314-502 cm. Distance in the real world was calibrated to match VR distance. Rewards were dispensed either automatically or in a lick-triggered manner at two locations within each environment (hidden reward sites). Each VR environment was also distinguished by its ceiling, floor, and wall images. Some environments had complex wall images (e.g., mountains) while others (including all the novel environments used in experiment two) had simple repeating patterns. Visual objects were 3D objects designed in Unity and positioned at various locations on either side of the circular track. The tunnel object was 30 cm long, while most other objects were 5-10 cm in width. Custom software (SmoothWalk) was used to add these attributes, move the camera (the mouse’s viewpoint), and wirelessly project the environments onto the tablets.20
The animal began each session with the wheel blocked for 10-30 minutes. When unblocked, the animal ran for a short distance (126 cm) with the tablets blacked out, then entered its first VR environment. The first VR environment the animal saw each session was typically its training (“familiar”) environment. The beginning of each lap was defined by the middle of the tunnel object. There were no inter-trial intervals, as laps progressed continuously.
The VR environment could be programmatically changed. If the mouse was scheduled to transport into a new environment, the switch occurred when the animal was halfway through the tunnel. The animal did, therefore, see the environment switch instantly through the opening in the tunnel (all the tablets flashed briefly and then the new environment appeared). VR environment and object positions were recorded and saved at the end of every session.
Environments
For experiment one, we designed three visually distinct environments (“Classroom, Landscape, and Sunset”), with complex backgrounds and densely populated with objects on either side of the track. Each environment had a “track” that was 10 cm wide, which was visually distinct from the rest of the floor. The mouse’s position remained within the center of that track. Two versions of each environment were created, one “small” (314 cm in circumference) and one “large” (503 cm in circumference). No mouse saw both the small and the large versions of a single environment. Each environment had two different reward locations, and some were traversed in different directions (Landscape clockwise, i.e. the curve of the track appeared rightward, and Classroom and Sunset counter-clockwise, i.e. the curve of the track appeared leftward). Mice underwent training in one of the environments (most commonly the small versions of Classroom or Sunset), and the other two environments were used for novel experiences. See Figures 1 and 2, and supplementary videos for images of the environments.
For experiments two and three, we designed five new environments (“Europa, Blue Room, Paw Room, Ornament Room, Dot Room”), each with a distinctly patterned background (walls and ceiling) and a distinct floor (with a different and finer pattern than the walls), and only eight objects each. The “track” on which the mouse moved within each environment was not distinguished from the rest of the floor and was 377 cm in circumference. Each object was distinct and located near (5-20 cm from) the track. All environments except Europa had cylindrical walls on the inside and outside of the track (inside walls ~20 cm from the track; outside walls ~100 cm from the track), such that the mouse could not see objects on the opposite side of the track (Europa only contained outer walls). The track was divided into two equal zones, A and B, each of which contained 6 equidistant locations where objects could be placed. Four objects were assigned to those 6 locations. The animal could sometimes see up to two upcoming objects. Zone B began at a different distance from the tunnel in each environment, but the tunnel object was always within zone A and was counted as one of the four zone A objects. In the “fixed” configuration, all objects within both zones were fixed within and between sessions. Sessions in which the mice ran in the fixed configuration of one of these environments were also used in experiment one. In the “shifting” or “destabilized” configurations, each of the four objects within zone B was randomly assigned to one of the 6 positions within that zone at the beginning of each lap. The objects always appeared on the same side of the track and were rotated to face the same direction relative to the mouse regardless of their position. There were 360 possible configurations of the four objects.
Because this object shift occurred at the start of a new lap (while the mouse was in the tunnel), the animal could not see the objects change locations from inside the tunnel. No flash occurred when objects were reconfigured within the same environment.
Behavior
After recovering from surgery, mice were habituated to head fixation on top of the wheel for several days, then taken through a 4-6 week training procedure familiarizing them with a single VR environment (“Familiar”). We developed a multi-stage training protocol that introduced mice to head fixation, liquid reward (water or milk) delivery through a metal spout, and running in VR, and then transitioned them from automatically delivered rewards to operant conditioning, in which the mouse had to lick in the correct location (a 15-25 cm region) in order for a reward to be delivered. The last phase gave us a behavioral read-out of how well the mouse had learned the location of reward delivery.
Once the animal regularly ran more than 10 laps and licked in anticipation of rewards, it was moved from the training setup to an identical VR setup underneath the microscope. On the imaging setup, mice were re-habituated to the familiar environment, then introduced on separate days to a series of novel environments. Each novel environment was typically re-introduced for several days, either after 10-20 laps in a previous environment or from the beginning of the session. Mice were imaged on consecutive days with occasional breaks. Animals encountered at least one novel environment during imaging, and some encountered more. Imaging of each mouse continued for 1-8 weeks, as long as the quality of the cranial window remained good and the animal exhibited good behavior.
To quantify licking behavior, we divided the track into 100 position bins and determined whether any licks occurred in each bin on each lap (this ensured that bursts of licks were not weighted more than single exploratory licks). The five bins prior to the beginning of each reward site were classified as anticipatory locations, the five bins following each reward site were post-reward locations, and the rest of the track contained non-specific licking. This allowed us to analyze the behavior in the same way regardless of whether the rewards were delivered automatically (as was the case in most novel-environment laps) or the mouse was required to lick (within a 10-bin zone starting at the auto-delivery location) to trigger the reward delivery (most familiar-environment laps). Lick precision was calculated as the ratio of anticipatory lick-bins to the difference between total lick-bins and post-reward lick-bins. The chance level for this calculation is 1/9 (5 anticipatory lick-bins for each of 2 rewards, divided by 100 total bins minus 10 post-reward bins).
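The lick-precision measure can be sketched in Python as follows. This is a minimal illustration rather than the actual analysis code; the boolean lap-by-bin array layout and the reward_starts input (the bin at which each reward site begins) are assumptions.

```python
import numpy as np

def lick_precision(lick_bins, reward_starts, n_bins=100, zone=5):
    # lick_bins: (laps, n_bins) boolean, True if any lick fell in that bin
    # reward_starts: bin index at which each reward site begins (assumed input)
    anticipatory = np.zeros(n_bins, dtype=bool)
    post_reward = np.zeros(n_bins, dtype=bool)
    for r in reward_starts:
        # five bins before the reward site are anticipatory;
        # five bins from the site onward are post-reward (circular track)
        anticipatory[np.arange(r - zone, r) % n_bins] = True
        post_reward[np.arange(r, r + zone) % n_bins] = True
    total = lick_bins.sum()
    antic = lick_bins[:, anticipatory].sum()
    post = lick_bins[:, post_reward].sum()
    # precision = anticipatory lick-bins / (total lick-bins - post-reward lick-bins)
    return antic / (total - post)
```

A mouse licking in every bin on every lap yields 10 / (100 - 10) = 1/9, the chance level quoted above.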
Two Photon Imaging
Calcium transients from GCaMP6s-expressing excitatory cells were recorded using a two-photon mesoscope (Neurolabware). Excitation from a laser tuned to 920 nm (Insight X3, SpectraPhysics) was phase modulated by a Pockels cell (Conoptics), then guided through table optics to a water-immersion 10 mm objective (numerical aperture 0.5). Brain regions were imaged through this objective by scanning the laser bidirectionally across a specified field of view using resonant and galvanometer mirrors (Cambridge Technology) and an electrically tunable lens (Optotune). Emissions were captured and amplified by a GaAsP PMT and filtered using a 510/84 nm BrightLine bandpass filter (Semrock).
The objective was lowered to focus at a depth between 100-300 μm below the pia in cortex, and 200-300 μm below the alveus in hippocampus. From an initial 4 mm panoramic field of view of the posterior cortex, one to three regions of interest (ROIs) were specified for fast imaging (each typically 1000 μm x 600 μm). These ROIs were placed over hippocampal CA1 or retrosplenial cortex based on vasculature, or over primary visual or somatosensory cortices based on widefield calcium imaging amplitude maps. Areas of bone growth were avoided over the course of imaging. The electrically tunable lens was used to switch depths between ROIs if necessary. ROIs were recorded at a frame rate of 6-8 Hz using the Scanbox acquisition software (Neurolabware).
QUANTIFICATION AND STATISTICAL ANALYSIS
Pre-Processing
The imaging data were converted into the TIFF file format using custom software, then run through the Python Suite2p pipeline for registration and segmentation.70 Generally the automatic curation was sufficient, but trained undergraduate technicians manually curated cells based on the morphology of the soma and the plausibility of activity traces. The time-varying fluorescence of each curated cell was taken as the average fluorescence of all pixels in its cell mask. These fluorescence traces then entered a MATLAB analysis pipeline. First, the fluorescence of each cell body was corrected for the 10-pixel-wide surrounding neuropil signal (F(t) = Fsoma(t) - 0.7*Fneuropil(t)).71 Next, the relative fluorescence change (ΔF/F0) was calculated as follows: the running baseline F0(t) was computed at each time t by smoothing the fluorescence trace with a running average over a time window t1, then taking the minimum of the smoothed trace within a time window t2 behind the current time point t. t1 was set to 1 second and t2 to 15 seconds. The deconvolved output from Suite2p72 was used for certain analyses (spatial information and Bayesian decoding), but most analyses used the ΔF/F.
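The running-baseline ΔF/F0 step can be sketched as follows. This is an illustrative Python version of the described procedure, not the MATLAB pipeline itself; the frame rate fs is an assumed input used to convert the t1 and t2 windows from seconds to frames.

```python
import numpy as np

def dff(f_soma, f_neuropil, fs, t1=1.0, t2=15.0):
    # neuropil correction, as in the text: F = Fsoma - 0.7 * Fneuropil
    f = f_soma - 0.7 * f_neuropil
    # smooth with a running average over a t1-second window
    w1 = max(1, int(round(t1 * fs)))
    smoothed = np.convolve(f, np.ones(w1) / w1, mode='same')
    # running baseline F0(t): minimum of the smoothed trace within the
    # t2-second window behind each time point
    w2 = max(1, int(round(t2 * fs)))
    f0 = np.array([smoothed[max(0, t - w2):t + 1].min() for t in range(len(f))])
    return (f - f0) / f0
```

The trailing-minimum baseline lets F0 track slow drift while ignoring transients, so a brief event on a flat baseline produces a positive ΔF/F deflection.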
Statistics
Parametric statistics were used for hypothesis testing. Linear mixed effects models were used wherever possible, to account for the hierarchical structure of the data. Signals coming from the same sessions or animals, which may influence the results but are not of importance to the experiment, were set as random effects in the models. Error bars indicate standard error of the mean.
Analyses
Analyses were conducted with MATLAB, Python 3 (primarily using the numpy and pandas packages for data manipulation; statsmodels, scipy, pingouin, and rpy2 for stats; and seaborn and statannotations for plotting), and GraphPad Prism.
Criteria for Inclusion of Data
There was substantial variability between mice in licking behavior, the training required to reach the imaging stage, and the number of laps run per session. To ensure quality data, we therefore removed datasets in which the mouse ran fewer than 10 laps in a single environment, fewer than 50 cells were detected, or the decoder error for the familiar environment was greater than 30 cm. In most cases, if mice did not reach the decoder error criterion within one week of imaging in the familiar environment, we stopped collecting data from that mouse, but in some cases the data were removed post-hoc. In experiment three, we used this same decoder error criterion on the pre-destabilized environment to remove both the pre-destabilized and destabilized datasets.
Position Binning
Certain signals, such as velocity and cell activity, were position binned and occupancy normalized. First, any time bins in which the velocity was less than 1 cm/s were eliminated, so that we did not analyze periods when the mouse was stopped or running backwards. Then we constructed an M x N matrix, where M is the total number of laps the animal ran in the environment, and N is the number of bins into which the circumference of the circular track is divided (100). The average signal at each bin is then calculated by summing over the time period the animal spent at that bin and dividing by the length of time spent at that bin. This results in a rate of activity for each entry of the position binned matrix. Note that the first and last bins of each lap are neighboring positions on the track, and are both inside the tunnel of each environment.
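The binning procedure above can be sketched for a single lap as follows. This is an illustrative Python version under stated assumptions: signal, position, and velocity are per-frame arrays, and dt (the frame duration) is an assumed input used to convert occupancy from frames to time.

```python
import numpy as np

def bin_lap(signal, position, velocity, dt, n_bins=100, track_len=377.0,
            v_thresh=1.0):
    # drop frames where the mouse was stopped or moving slower than 1 cm/s
    moving = velocity >= v_thresh
    bins = (position[moving] / track_len * n_bins).astype(int) % n_bins
    # sum the signal over the time spent in each bin, then divide by the
    # occupancy time in that bin to get a rate of activity
    summed = np.bincount(bins, weights=signal[moving], minlength=n_bins) * dt
    occupancy = np.bincount(bins, minlength=n_bins) * dt
    return np.where(occupancy > 0, summed / np.maximum(occupancy, dt), np.nan)
```

Stacking the output of each lap row-wise yields the M x N position-binned matrix described above; unvisited bins are left as NaN here rather than zero, so they can be excluded from averages.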
Visualization of Population Activity
To visualize position-correlated sequences (PCSs), the lap-averaged activity of all recorded (or just spatially tuned) cells was plotted as a function of position bin in the environment. Cells were ordered along the y-axis by the position of their most active bin (i.e., cells were ordered by how early along the track their activity peaked). To ensure that any observed PCS was due to cells consistently firing at the same positions, different sets of laps were used for determining the peak bin (a random subset of 5 of the final 10 laps) and for plotting (the remaining laps in that environment). To test whether the same sequence persisted across two environments A and B, the activity in environment B was plotted with cells sorted by their peak bin in A, and vice versa.
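The cross-validated sort can be sketched as follows (illustrative Python; the (cells, laps, bins) array layout is an assumption):

```python
import numpy as np

def sorted_sequence(act, rng):
    # act: (cells, laps, bins) position-binned activity for one environment
    n_cells, n_laps, n_bins = act.shape
    # peak bins are taken from a random 5 of the final 10 laps ...
    sort_laps = rng.choice(np.arange(n_laps - 10, n_laps), size=5, replace=False)
    # ... and the plot is averaged over all remaining laps (cross-validated)
    plot_laps = np.setdiff1d(np.arange(n_laps), sort_laps)
    peaks = act[:, sort_laps].mean(axis=1).argmax(axis=1)
    order = np.argsort(peaks)            # earliest peak along the track first
    return act[order][:, plot_laps].mean(axis=1)
```

Because the laps used for sorting are excluded from the plot, a diagonal band in the output reflects reliable spatial tuning rather than sorting noise.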
Classifying position-correlated cells
To identify cells that were spatially tuned in a particular environment, we used a three-step criterion. First, the position-binned activity of each cell was averaged across laps, smoothed with a Hanning window of 5 bins, and the local peaks of this average trace were found. Any peaks higher than 3.5 times the 50th percentile of the trace were considered candidate place fields. The boundaries of each candidate place field were set at the closest position at which the activity dropped below the 50th percentile, or the closest trough after which the trace dropped below 70% of the peak (whichever came first). Second, the peak activity on each lap within the boundary of each field was compared to the baseline activity (the 50th percentile minus the 5th percentile of activity across all bins and all laps). The in-field activity had to be 3.5x greater than the baseline in at least 1/3 of laps, or 5 laps (whichever was greater), for the field to remain under consideration. Finally, the size of each field was calculated from the previously set boundaries, and any fields smaller than 20 cm or larger than 150 cm were eliminated. Cells were considered spatially tuned if they had at least one place field. For some analyses, place fields were considered independently of the cell they belonged to, so that multiple fields belonging to one cell could be analyzed. The thresholds in this criterion (the 50th percentile and the 3.5x peak-to-baseline ratio) were determined by an experimenter visually inspecting the resulting place fields for two hippocampal and two retrosplenial datasets (after this, the criteria were fixed for all datasets regardless of brain region). These thresholds are somewhat arbitrary and not very stringent. Therefore, we considered all cells without using this criterion wherever possible.
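The first step of this criterion, finding candidate peaks, can be sketched as follows. This is a simplified Python illustration: the boundary-setting, per-lap reliability, and field-size checks described above, as well as wrap-around at the track's circular boundary, are omitted.

```python
import numpy as np

def candidate_peaks(mean_trace, mult=3.5):
    # local peaks of the lap-averaged, smoothed trace that exceed
    # 3.5x its 50th percentile (step one of the criterion only)
    med = np.percentile(mean_trace, 50)
    peaks = [i for i in range(1, len(mean_trace) - 1)
             if mean_trace[i] >= mean_trace[i - 1]
             and mean_trace[i] > mean_trace[i + 1]
             and mean_trace[i] > mult * med]
    return peaks
```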
The percentage of cells that pass the criterion in each dataset may reflect a combination of many factors, including the familiarity of the animal with the environment, the brain region, the size of the environment, imaging quality, and the number of environments visited. The last two are factors because cells that do not have place fields fire infrequently (especially in the hippocampus), and thus may be missed by our cell detection algorithm, especially when the images are dim or little imaging outside the behavior period is available. We attempted to limit this impact by imaging for 5-30 min before and after each running session (while the animal rested on the immobile wheel in the dark) and using the whole session for cell detection. Despite this, caution should be used when interpreting the percentage of spatially tuned cells.
Population vector correlations
The population vector is a list of the activities of all simultaneously imaged cells in a particular time or position bin. To compare activity between laps and between environments, we calculated the activity of all cells in each position bin in a single lap, or averaged across three consecutive laps. Then we correlated the population vector in each position bin with the population vector in each position bin of a different set of laps. This results in a matrix of correlations. The correlations along the diagonal represent the same position bins correlated between lap intervals. We averaged the correlations along the diagonal to get the average correlation of one lap interval with another lap interval, either in the same or in different environments. Because some environments were different sizes, we divided all environments into 100 bins, with bin size differing (from 3-5 cm) between environments, in order to obtain square population vector correlation matrices. This means that moving along the diagonal does not correspond to the same distance in different environments. However, if there were a correlation between environments at corresponding distances (instead of corresponding bins), it would appear as an increased correlation along, for example, y = 3/5x instead of the diagonal; this was not observed.
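The diagonal of the correlation matrix can be computed as sketched below (illustrative Python; inputs are assumed to be (cells, bins) activity matrices averaged over two different lap sets):

```python
import numpy as np

def pv_diagonal_corr(act_a, act_b):
    # act_a, act_b: (cells, bins) activity averaged over two lap sets;
    # correlate matching position bins, then average along the diagonal
    n_bins = act_a.shape[1]
    diag = [np.corrcoef(act_a[:, b], act_b[:, b])[0, 1] for b in range(n_bins)]
    return float(np.mean(diag))
```

Correlating every bin of one lap set against every bin of the other instead of only matching bins yields the full matrix described above; this sketch keeps only its diagonal.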
Spatial Information
Spatial information content (SI) quantifies the information available to locate the animal based on neuronal firing rate.73 We calculated the SI for each cell based on the lap-averaged deconvolved signal (because the deconvolved signal, unlike ∆F/F, cannot be negative at any point) using the following formula:

SI = Σi pi (fi/f) log2(fi/f)

N stands for the total number of bins (the index i of the sum runs from 1 to N), pi is the probability of occupying the ith bin, fi is the deconvolved activity in the ith bin, and f = Σi pi fi is the occupancy-weighted mean activity across all bins. The measure is quantified in bits. The deconvolved activity was shuffled 100 times (each lap was circularly shifted by a random integer) to obtain a null distribution of spatial information scores.
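As a minimal sketch of this Skaggs-style calculation (illustrative Python; inputs are the per-bin occupancy probabilities and lap-averaged deconvolved activity):

```python
import numpy as np

def spatial_information(p, f):
    # p: occupancy probability per bin; f: lap-averaged deconvolved
    # activity per bin; returns SI in bits
    fbar = np.sum(p * f)                 # occupancy-weighted mean activity
    ok = (p > 0) & (f > 0)               # treat 0 * log2(0) terms as 0
    return float(np.sum(p[ok] * (f[ok] / fbar) * np.log2(f[ok] / fbar)))
```

A cell active in exactly one of four equally occupied bins carries log2(4) = 2 bits, while spatially uniform activity carries 0 bits.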
Sparsity
We measured the lifetime sparsity of each neuron to determine whether the cell was narrowly or broadly active across position bins, using the following formula:

Sparsity = (Σi pi fi)² / Σi pi fi²

The variables are the same as those used for SI. Sparsity values closer to 0 indicate sparser representations, while values close to 1 indicate broad activity across many position bins.
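In Python, this is a one-line computation (illustrative sketch, same inputs as the SI function):

```python
import numpy as np

def lifetime_sparsity(p, f):
    # p: occupancy probability per bin; f: mean activity per bin
    # (sum p_i f_i)^2 / (sum p_i f_i^2); near 0 = sparse, near 1 = broad
    return float(np.sum(p * f) ** 2 / np.sum(p * f ** 2))
```

For uniform occupancy over 100 bins, a cell active in a single bin scores 0.01, while a uniformly active cell scores 1.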
Bayesian Decoding
A Bayesian decoder was used to estimate the animal’s position based on neural population activity.74 At every time point, the posterior probability of being at a particular location on the track, given the neural population activity, was calculated as the prior probability of being at that position multiplied by the likelihood of the observed neural activity being produced at that position, divided by a normalization term. The model was trained with leave-one-out cross-validation, so time points corresponding to one lap were decoded from a model trained on the activity in all other laps. The formula for this is expressed as follows:

P(x | n) = C(τ, n) P(x) (Πi fi(x)^ni) exp(−τ Σi fi(x))

fi(x) is the mean deconvolved, Gaussian-smoothed activity of the ith neuron at position x, and ni is the activity of the ith neuron within a time bin of length τ, which we empirically optimized to 3 seconds. The product and sum run over the N neurons, and the constant C(τ, n) normalizes the probability distribution to sum to 1 across all positions. The decoded position was defined as the position with the highest posterior probability in each time bin, and the absolute value of the difference between the true and decoded positions was defined as the Bayesian decoding error.
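Decoding one time bin can be sketched as follows. This illustrative Python version assumes a flat prior P(x) and evaluates the posterior in log space for numerical stability; the small epsilon guarding log(0) is an implementation assumption, not part of the formula.

```python
import numpy as np

def decode_bin(tuning, n, tau=3.0):
    # tuning: (neurons, positions) mean activity f_i(x) from training laps;
    # n: (neurons,) summed activity in one tau-second time bin
    n_pos = tuning.shape[1]
    prior = np.full(n_pos, 1.0 / n_pos)      # flat prior over positions
    log_post = (np.log(prior) + n @ np.log(tuning + 1e-9)
                - tau * tuning.sum(axis=0))
    post = np.exp(log_post - log_post.max())
    post /= post.sum()                        # C normalizes the distribution
    return post, int(np.argmax(post))         # posterior and decoded bin
```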
Tuning to Position or Object Reference Frames
In the object-tuning analysis, the position-binned neural activity in each session is aligned to three different reference frames over a window spanning 9 cm before each object (3 position bins) to 3 cm past it (1 position bin). First, activity is aligned to the actual position of each of the four objects in zone B. This position varies randomly from lap to lap in the shifting configuration but stays in one location in the fixed configuration. Second, activity is aligned to each of the six potential locations where objects could be placed. This is similar to the object alignment in the fixed configuration, but includes two spots not near any object. The potential positions are at least 1/12 of a track length away from one another. Third, to estimate what could occur by chance, activity is aligned to a set of possible locations randomly chosen on every lap. This randomization procedure is repeated 1000 times to obtain a shuffle distribution for each neuron.
Cells were deemed significantly tuned to an object or position: 1) if the lower bound of the trial-averaged activity (mean - SEM) exceeded the 97.5th percentile of the shuffled distribution at any position bin within the window; and 2) if the trial- and position-binned average activity within the window had a z-score over 2.3263 (p = 0.01) relative to the trial- and position-binned averages of the shuffled distribution. This double criterion ensures that tuned cells are significantly active at the same relative distance from a particular position or object on most trials.
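The double criterion can be sketched as follows. This is an illustrative Python reading of the criterion under stated assumptions: aligned is the trial-by-window-bin activity for one reference frame, and shuffled stacks the same array for each of the 1000 random alignments.

```python
import numpy as np

def significantly_tuned(aligned, shuffled, z_crit=2.3263):
    # aligned: (trials, window_bins) activity around one object/position;
    # shuffled: (n_shuffles, trials, window_bins) from random alignments
    mean = aligned.mean(axis=0)
    sem = aligned.std(axis=0, ddof=1) / np.sqrt(aligned.shape[0])
    shuf_means = shuffled.mean(axis=1)                # (n_shuffles, bins)
    # criterion 1: mean - SEM exceeds the shuffle 97.5th percentile somewhere
    crit1 = np.any(mean - sem > np.percentile(shuf_means, 97.5, axis=0))
    # criterion 2: window-averaged activity z-scored against shuffle averages
    shuf_overall = shuf_means.mean(axis=1)
    z = (aligned.mean() - shuf_overall.mean()) / shuf_overall.std(ddof=1)
    crit2 = z > z_crit                                # one-tailed p = 0.01
    return bool(crit1 and crit2)
```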
Activity variability across laps
In order to quantify possible rate remapping in destabilized environments, we measured the standard deviation of peak activity and average in-field activity across laps. Since there were 360 possible configurations of the four objects in zone B, no configuration occurred more than once in a session, and we could not assess rate remapping as traditionally done by comparing firing rates between configurations. However, if rate changes occurred in every configuration, then the variability of firing rates should be greater in the shifting configuration than in the fixed configuration. The activity of each cell was first z-scored, and then we calculated the average in-field activity and the peak activity on each lap for spatially tuned cells. To find the peak activity, the z-scored activity of each cell was first convolved with a Hanning window of 5 bins to reduce noise. Cells were classified as having a field in zone A or zone B depending on the zone in which their peak firing rate (across laps) occurred, and zone A and zone B cells were considered separately and compared. We also compared cells in the fixed version of the environment with cells in the shifting version of the same environment, using a two-way repeated-measures ANOVA. To determine whether running speed was a factor in firing-rate changes, we also found the mouse’s velocity in the position bin in which each cell’s lap peak occurred and ran the same analysis on these velocities.