Structuring Time in Human Lateral Entorhinal Cortex

Episodic memories consist of event information linked to spatio-temporal context. Notably, the hippocampus is involved in the encoding, representation and retrieval of temporal relations that comprise a context (Deuker et al., 2016; Tubridy and Davachi, 2011; DuBrow and Davachi, 2014; Ezzyat and Davachi, 2014; Hsieh et al., 2014; Jenkins and Ranganath, 2010, 2016; Kyle et al., 2015; Lositsky et al., 2016; Nielson et al., 2015; Copara et al., 2014), but it remains largely unclear how coding for elapsed time arises in the hippocampal-entorhinal region. The entorhinal cortex (EC), the main cortical input structure of the hippocampus, has been hypothesized to provide temporal tags for memories via contextual drift (Howard and Kahana, 2002; Howard et al., 2005). Recent evidence demonstrates that time can be decoded from population activity in the rodent lateral EC, putatively arising from the integration of experience (Tsao et al., 2018). Here, we asked how learning a temporal structure influences entorhinal event representations. Participants acquired knowledge about temporal and spatial relationships between object positions (dissociated via teleporters) along a fixed route through a virtual city. We analyzed fMRI multi-voxel pattern similarity change from before to after learning in the EC. Object representations specifically in the anterior-lateral EC (alEC), the human homologue of rodent lateral EC (Navarro Schröder et al., 2015; Maass et al., 2015), changed to reflect elapsed time between events. Holistic representations of the temporal structure in alEC were related to memory recall behavior, suggesting mental traversals of the route during retrieval. Furthermore, we reconstructed the temporal structure of object relationships from alEC pattern similarity change.
Our findings demonstrate that the experienced temporal structure of events shapes representations in the alEC, potentially via the reactivation of temporal context representations derived from slowly-varying population signals during learning. This provides novel evidence for the role of the human lateral EC in representing time for episodic memory.


Results
Here, we used representational similarity analysis of fMRI multi-voxel patterns in the hippocampal-entorhinal region to test the prediction that the anterior-lateral entorhinal cortex (alEC) maps the temporal structure of events. We examined the effect of learning temporal and spatial positions of objects along a route through a virtual city (Figure 1). Specifically, we presented object images in the same random order before and after learning and subsequently compared the change in neural pattern similarity between object representations to the temporal and spatial structure of the task [1]. Using this paradigm and data, we previously demonstrated that participants can successfully recall the subjective, remembered spatial and temporal relations between object pairs and that the change of hippocampal representations reflects an integrated event map of the remembered distance structure. Here, we demonstrate that the change of multi-voxel pattern similarity through learning in alEC (Figure 2A) reflects the objective temporal distance structure of the task, dissociated from spatial distances through the use of teleporters [1], resulting in a consistent relationship between similarity and time elapsed between object encounters. The change in multi-voxel pattern similarity in alEC between pre- and post-learning scans was negatively correlated with temporal distances between object pairs along the route (Figure 2B, T(25)=-3.75, p=0.001, alpha-level of 0.0125, Bonferroni-corrected for four comparisons). Objects encountered in temporal proximity changed to be represented more similarly compared to object pairs further separated in time (Figure 2C). Pattern similarity change in alEC did not correlate significantly with spatial distances (T(25)=0.81, p=0.420) and pattern similarity change in posterior-medial EC (pmEC) did not correlate with spatial (T(25)=0.58, p=0.583) or temporal (T(25)=1.73, p=0.089) distances.

Figure 1. Design and analysis logic. A. During the spatio-temporal learning task, which took place in between two identical runs of a picture viewing task (Supplemental Figure 1), participants repeatedly navigated a fixed route (blue line) through the virtual city along which they encountered objects hidden in chests (numbered circles) [1]. Temporal (median time elapsed) and spatial (Euclidean) distances between objects were dissociated through the use of three teleporters (lettered circles) along the route (Supplemental Figure 2), which instantaneously changed the participant's location to a different part of the city. B. In the picture viewing tasks, participants viewed randomly ordered images of the objects encountered along the route while fMRI data were acquired. We quantified multi-voxel pattern similarity change between pairwise object comparisons from before to after learning the temporal and spatial relationships between objects in subregions of the entorhinal cortex. We tested whether pattern similarity change reflected the structure of the task by correlating it with the time elapsed between object pairs (top right matrix shows median elapsed time between object encounters along the route averaged across participants). For each participant, we compared the correlation between pattern similarity change and the prediction matrix to a surrogate distribution obtained via bootstrapping and used the resulting z-statistic for group-level analysis (see Methods).

Can we reconstruct the timeline of events from pattern similarity change in alEC? Here, we used multidimensional scaling to extract coordinates along one dimension from pattern similarity change averaged across participants (Figure 2D-G). The reconstructed temporal coordinates, transformed into the original value range using Procrustes analysis (Figure 2D), mirrored the time points at which objects were encountered during the task (Figure 2E, Pearson correlation between reconstructed and true time points, r=0.56, p=0.023, bootstrapped 95% confidence interval: 0.21, 0.79). Further, we contrasted the fit of the coordinates from multidimensional scaling between the true and randomly shuffled timelines (Figure 2F). Specifically, we compared the deviance of the fit between the reconstructed and the true timeline, the Procrustes distance, to a surrogate distribution of Procrustes distances. This surrogate distribution was obtained by fitting the coordinates from multidimensional scaling to randomly shuffled timelines of events. The Procrustes distance from fitting to the true timeline was smaller than the 5th percentile of the surrogate distribution generated via 10000 random shuffles (Figure 2G, p=0.026). Taken together, these findings indicate that alEC representations change through learning to reflect the temporal structure of the acquired event memories and that we can recover the timeline of events from this representational change.
What is the nature of regional specificity within entorhinal cortex? In a next step, we compared temporal and spatial mapping between the subregions of the entorhinal cortex (EC). We conducted a permutation-based two-by-two repeated measures ANOVA (see Methods) with the factors entorhinal subregion (alEC vs. pmEC) and relationship type (time elapsed vs. spatial distance between events). Crucially, we observed a significant interaction between EC subregion and distance type (F(1,25)=7.40, p=0.011). Further, the main effect of EC subregion was significant (F(1,25)=5.18, p=0.029), while the main effect of distance type was not (F(1,25)=0.84, p=0.367). Based on the significant interaction, we conducted planned post-hoc comparisons, which revealed significant differences (Bonferroni-corrected alpha-level of 0.025) between the mapping of elapsed time and spatial distance in alEC (T(25)=-2.91, p=0.007) and a significant difference between temporal mapping in alEC compared to pmEC (T(25)=-3.52, p=0.001). Spatial and temporal signal-to-noise ratios did not differ between alEC and pmEC (Supplemental Figure 3), ruling out that differences in signal quality might explain the observed effects. Collectively, these findings demonstrate that, within the EC, only representations in the anterior-lateral subregion change to resemble the temporal structure of events and that this mapping was specific to the temporal rather than the spatial dimension.

Figure 2. A. Entorhinal subregion masks from [15] were moved into subject-space and intersected with participant-specific Freesurfer parcellations of entorhinal cortex. Color indicates probability of voxels to belong to the alEC (blue) or pmEC (green) subregion mask after subject-specific masks were transformed back to MNI template space for visualization. B. Pattern similarity change in the alEC reflected elapsed time between objects along the route as indicated by z-statistics significantly below 0. A permutation-based two-way repeated measures ANOVA further revealed a significant interaction highlighting a difference in temporal and spatial mapping between alEC and pmEC. Planned post-hoc comparisons indicated that the correlations between pattern similarity change and elapsed time in alEC were significantly more negative than correlations with spatial distances in alEC as well as significantly more negative than the correlations between elapsed time and pattern similarity change in pmEC. C. Pattern similarity change in alEC for objects encountered close together or far apart in time along the route. Lines connect data points from the same participant. D. To recover the temporal structure of events we performed multidimensional scaling on the average pattern similarity change matrix in alEC. The resulting coordinates, one for each object along the route, were subjected to Procrustes analysis, which applies translations, rotations and uniform scaling to superimpose the coordinates from multidimensional scaling on the true temporal coordinates along the route (see Methods). For visualization, we varied the positions resulting from multidimensional scaling and Procrustes analysis along the y-axis. E. The temporal coordinates of this reconstructed timeline were significantly correlated with the true temporal coordinates of object encounters along the route. Circles indicate time points of object encounters; solid line shows least squares line; dashed line and shaded region highlight bootstrapped confidence intervals of correlation coefficient. F. The goodness of fit of the reconstruction (the Procrustes distance) was compared to a surrogate distribution of Procrustes distances obtained from randomly shuffling the true coordinates against the coordinates obtained from multidimensional scaling and then performing Procrustes analysis for each of 10000 shuffles (left shows one randomly shuffled timeline for illustration). G. The Procrustes distance obtained from fitting to the true timeline of events (circle and dotted line) was smaller than the 5th percentile (dashed line) of the surrogate distribution (solid line), which constitutes the significance threshold at an alpha level of 0.05.
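The timeline reconstruction and shuffle test can be illustrated with a minimal sketch. It is not the original Matlab analysis: it swaps in classical (Torgerson) multidimensional scaling, converts similarity change to distances by min-max scaling (an assumption, since the conversion is not specified in this section), and implements the one-dimensional Procrustes fit by hand, where translation, uniform scaling and reflection reduce to a least-squares line. All function names are hypothetical.

```python
import numpy as np

def classical_mds_1d(dist):
    """Classical (Torgerson) MDS returning one coordinate per object."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argmax(vals)
    return vecs[:, top] * np.sqrt(max(vals[top], 0.0))

def procrustes_1d(target, coords):
    """Fit coords to target by translation, scaling and reflection.
    Returns the standardized Procrustes distance and the fitted coordinates."""
    t = target - target.mean()
    c = coords - coords.mean()
    scale = (c @ t) / (c @ c)  # a negative value corresponds to a reflection
    fitted = scale * c + target.mean()
    distance = np.sum((target - fitted) ** 2) / np.sum(t ** 2)
    return distance, fitted

def reconstruct_timeline(change, true_times, n_shuffles=10000, seed=0):
    """change: group-averaged pattern similarity change (objects x objects)."""
    rng = np.random.default_rng(seed)
    # Assumption: min-max scale similarity change, then use 1 - similarity as distance.
    sim = (change - change.min()) / (change.max() - change.min())
    dist = 1.0 - sim
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    coords = classical_mds_1d(dist)
    d_true, fitted = procrustes_1d(true_times, coords)
    # Surrogate distribution: fit the same coordinates to shuffled timelines.
    surrogate = np.array([
        procrustes_1d(rng.permutation(true_times), coords)[0]
        for _ in range(n_shuffles)
    ])
    p_value = np.mean(surrogate <= d_true)
    return fitted, d_true, p_value
```

In one dimension the standardized Procrustes distance equals one minus the squared Pearson correlation between reconstructed and true coordinates, which is why the shuffle test and the correlation reported for Figure 2E are closely related.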
In a next step, we sought to examine potential differences in temporal mapping between alEC and hippocampus. As previously described [1], hippocampal multi-voxel patterns changed to represent remembered temporal distances between object positions, which were assessed in pairwise judgments on a computer screen (r=0.64±0.29, mean±standard deviation of correlations between true and remembered temporal distances between object pairs). We conducted a two-by-two permutation-based repeated measures ANOVA with the factors region (alEC vs. hippocampus) and temporal distance type (objective time elapsed vs. subjectively remembered time). We observed no main effects of region (F(1,25)=0.001, p=0.966) or distance type (F(1,25)=0.17, p=0.681). Importantly, this analysis revealed a significant interaction effect (F(1,25)=11.64, p=0.002), indicating that the alEC and hippocampus represent elapsed and remembered time differently. While the alEC mapped objectively elapsed rather than subjectively remembered time (T(25)=-2.15, p=0.041), the opposite was true for the hippocampus (T(25)=2.25, p=0.034). However, we note that these post-hoc tests are reduced to trends when compared to a Bonferroni-corrected alpha level, likely due to the close relationship between objective and remembered time. While we observed differences in signal-to-noise ratio between alEC and hippocampus (Supplemental Figure 3), such differences cannot explain the specific interaction evident in our data. Overall, our findings indicate a difference between how alEC and hippocampus represent the temporal structure of experience, with alEC activity patterns reflecting the objectively experienced time and the hippocampus representing the subjectively remembered temporal distances between memories.

Discussion
We examined the similarity of multi-voxel patterns to demonstrate that alEC event representations change to reflect the elapsed time between memories. Despite being cued in random order after learning, these representations related to a holistic temporal map of the task structure. Moreover, we recovered the timeline of events during learning from the changes in representation. The alEC temporal map reflected objective time elapsed between memories, while hippocampal activity patterns resembled the remembered temporal distances [1].
Our hypothesis for temporal mapping in the alEC was based on a recent finding demonstrating that population activity in the rodent lateral EC carries information from which time can be decoded at different scales ranging from seconds to days [14]. This temporal information might arise from the integration of experience across different scales. During a structured task in which the animal ran repeated laps on a maze separated into different trials, neural trajectories through population activity space were similar across trials, illustrating that the dynamics of lateral EC neural signals were more stable than during free foraging [14]. Consistently, temporal coding was improved for time within a trial during the structured task compared to episodes of free foraging. These findings support the notion that temporal information in the lateral EC might inherently arise from the encoding of experience [14]. The long time scales of lateral EC temporal codes differ from the observation of time cells in the hippocampus, which fire during temporal delays in highly trained tasks [17][18][19][20][21]. While the ensemble of active cells changes over minutes and days [21], time cell firing has been investigated in the context of short temporal delays in the range of seconds, leaving open the question of whether time cells also encode longer temporal intervals.

Figure caption (beginning missing): ... [1] demonstrated significantly negative correlations of pattern similarity change in the hippocampus with subjectively remembered temporal distances between objects along the route. A permutation-based two-way repeated measures ANOVA revealed a significant interaction between pattern similarity change in alEC and hippocampus and the correlations with objectively elapsed and remembered temporal distances. While alEC pattern similarity change reflected objectively elapsed rather than subjectively remembered temporal distances, this effect was reversed for the hippocampus.

Slowly drifting activity patterns have been observed also in the human medial temporal lobe [22] and EC specifically [9]. A representation of time within a known trajectory in the alEC could underlie the encoding of temporal relationships between events in our task, where participants repeatedly navigated along the route to learn the positions of objects. Hence, temporal mapping in the alEC as we report here might help integrate hippocampal spatio-temporal event maps [1].
One possibility for why the similarity structure of alEC multi-voxel patterns resembles a holistic temporal map of the event memories after learning is the reactivation of temporal context information. If the alEC represents elapsed time along the route, the information about when an object is encountered might serve as a temporal context tag, which is associated with the object during learning. The visual object cues during the picture viewing task following the learning phase might lead to the reactivation of these temporal context tags. This might explain the observed pattern similarity structure with relatively increased similarity for objects encountered in temporal proximity during learning and decreased similarity for items encountered after longer delays. While this interpretation is in line with the framework proposed by the temporal context model [12,13], we cannot test the reinstatement of specific activity patterns from the learning phase directly since fMRI data were only collected during the picture viewing tasks in this study. The reactivation of temporal context representations might explain why the change in multi-voxel patterns in the alEC reflects the temporal relations between object representations after learning. Importantly, the highly controlled design of our study supports the interpretation that alEC representations change through learning to map time elapsed between events. The order of object presentations during the scanning sessions was randomized and thus did not reflect the order in which objects were encountered during the learning task. Since the assignment of objects to positions was randomized across participants and we analyzed pattern similarity change from a baseline scan, our findings cannot be attributed to prior associations between the objects, but reflect information learned over the course of the experiment.
Further, we presented the object images during the scanning sessions not only in the same random order, but also with the same presentation times and inter-stimulus intervals, thereby ruling out that the effects we observed are due to temporal autocorrelation of the BOLD signal. Taken together, the high degree of experimental control of our study supports the conclusion that alEC representations change to reflect the temporal structure of acquired memories.
Our assessment of temporal representations in the antero-lateral and posterior-medial subdivision of the EC was inspired by a recent report of temporal coding during free foraging and repetitive behavior in the rodent EC, which was most pronounced in the lateral EC [14]. In humans, local and global functional connectivity patterns suggest a preserved bipartite division of the EC, but along not only its medial-lateral, but also its anterior-posterior axis [15,16]. Via these entorhinal subdivisions, cortical inputs from the anterior-temporal and posterior-medial memory systems might converge onto the hippocampus [23,24]. The rodent medial EC hosts a variety of functionally defined cell types such as grid, head direction, speed and border cells [25]. In line with hexadirectional signals in pmEC during imagination [26,27], putatively related to grid cell population activity [28], one might expect the pmEC to map spatial distances between object positions in our task. However, we did not observe an association of pattern similarity change in pmEC with the Euclidean distances between object positions. One potential explanation for the absence of evidence for a spatial distance signal in pmEC might be the way in which we cued participants' memory during the picture viewing task. The presentation of isolated object images probed locations in their stored representation of the virtual city. Due to the periodic nature of grid-cell firing, different locations might not result in diverging patterns of grid-cell population activity. Hence, the design here was not optimized for the analysis of spatial representations in pmEC, if the object positions were encoded in grid-cell firing patterns as suggested by models of grid-cell function [29][30][31][32].
Interestingly, our findings suggest a differential role for the alEC and the hippocampus in processing temporal mnemonic information. Whereas alEC pattern similarity change mirrored objectively elapsed time between memories, hippocampal representational change more strongly reflected the remembered time between these memories. The observation of this dissociation is even more surprising in light of the high correlation between remembered and true temporal distances. Our findings are in line with the role of the hippocampus in the retrieval of temporal information from memory [3,8,10,11]. Here, hippocampal pattern similarity has been shown to scale with temporal distances between events [1,10] and evidence for the reinstatement of temporally associated items from memory has been reported in the hippocampus [3]. Already at the stage of encoding, hippocampal and entorhinal activity have been related to later temporal memory [2,3,4,6,7,9,33]. For example, increased pattern similarity has been reported for items remembered to be close together compared to items remembered to be far apart in time, despite the same time having elapsed between these items [4]. Similarly, changes in EC pattern similarity during the encoding of a narrative correlated with later duration estimates between events [9]. Complementing these reports, our findings demonstrate that entorhinal activity patterns carry information about the temporal structure of memories at retrieval. The central role of the hippocampus and entorhinal cortex in temporal memory (for review see [34][35][36][37]) dovetails with the involvement of these regions in learning sequences and statistical regularities in general [5,38,39,40,41,42,43].
In conclusion, our findings demonstrate that activity patterns in alEC, the human homologue region of the rodent lateral EC, carry information about the temporal structure of newly acquired memories. The observed effects might be related to the reactivation of temporal contextual tags, in line with the recent report of temporal information available in rodent lateral EC population activity and models of episodic memory.

Subjects
26 participants (mean±std. 24.88±2.21 years of age, 42.3% female) were recruited via the university's online recruitment system and participated in the study. As described in the original publication using this dataset [1], this sample size was based on a power-calculation (alpha-level of 0.001, power of 0.95, estimated effect size of d=1.03 based on a prior study [44]) using G*Power (http://www.gpower.hhu.de/). Participants with prior knowledge of the virtual city (see [1]) were recruited for the study. All procedures were approved by the local ethics committee (CMO Regio Arnhem Nijmegen) and all participants gave written informed consent prior to commencement of the study.

Overview
The experiment began with a 10-minute session during which participants freely navigated the virtual city [45] on a computer screen to re-familiarize themselves with its layout. Afterwards, participants were moved into the scanner and completed the first run of the picture viewing task during which they viewed pictures of everyday objects as described below (Supplemental Figure 1). After this baseline scan, participants learned a fixed route through the virtual city along which they encountered the objects at predefined positions (Figure 1 and Supplemental Figure 1). The use of teleporters, which instantaneously moved participants to a different part of the city, enabled us to dissociate temporal and spatial distances between object positions (Supplemental Figure 2). Subsequent to the spatio-temporal learning task, participants again underwent fMRI and completed the second run of the picture viewing task. Lastly, participants' memory was probed outside of the MRI scanner. Specifically, participants freely recalled the objects they encountered, estimated spatial and temporal distances between them on a subjective scale, and indicated their knowledge of the positions of the objects in the virtual city on a top-down map [1].

Spatio-temporal learning task
Participants learned the positions of everyday objects along a trajectory through the virtual city Donderstown [45]. This urban environment, surrounded by a range of mountains, consists of a complex street network, parks and buildings. Participants with prior knowledge of the virtual city (see [1]) were recruited for the study. After the baseline scan, participants navigated the fixed route through the city along which they encountered 16 wooden chests at specified positions ( Figure 1A). During the initial 6 laps the route was marked by traffic cones. In later laps, participants had to rely on their memory to navigate the route, but guidance in the form of traffic cones was available upon button press for laps 7-11. Participants completed 14 laps of the route in total (mean ± standard deviation of duration 71.63±13.75 minutes), which were separated by a black screen displayed for 15s before commencement of the next lap from the start position.
Participants were instructed to open the chests they encountered along the route by walking into them. They were then shown the object contained in that chest for 2 seconds on a black screen. A given chest always contained the same object for a participant, with the assignment of objects to chests randomized across participants. Therefore, each object was associated with a spatial position defined by its location in the virtual city and a temporal position described by its occurrence along the progression of the route. Importantly, we dissociated temporal relationships between object pairs (measured by time elapsed between their encounter) from the Euclidean distance between their positions in the city through the use of teleporters. Specifically, at three locations along the route participants encountered teleporters, which immediately transported them to a different position in the city where the route continued ( Figure 1A). This manipulation allows the otherwise impossible encounter of objects after only a short temporal delay, but with a large Euclidean distance between them in the virtual city [1]. Indeed, temporal and spatial distances across all comparisons of object pairs were uncorrelated (Pearson r=-0.068; bootstrapped 95% confidence interval: -0.24, 0.12; p=0.462; Supplemental Figure 2).

Picture viewing tasks
Before and after the spatio-temporal learning task participants completed the picture viewing tasks while undergoing fMRI [1]. During these picture viewing tasks, the 16 objects from the learning task as well as an additional target object were presented. Participants were instructed to attend to the objects and to respond via button press when the target object was presented. Every object was shown 12 times in 12 blocks, with every object being shown once in every block. In each block, the order of objects was randomized. Blocks were separated by a 30 second break without object presentation. Objects were presented for 2.5 seconds on a black background in each trial and trials were separated by two or three TRs. These intertrial intervals occurred equally often and were randomly assigned to the object presentations. The presentation of object images was locked to the onset of the new fMRI volume. For each participant, we generated a trial order adhering to the above constraints and used the identical trial order for the picture viewing tasks before and after learning the spatio-temporal arrangement of objects along the route. Using the exact same temporal structure of object presentations in both runs rules out potential effects of temporal autocorrelation of the BOLD signal on the results, since such a spurious influence on the representational structure would be present in both tasks similarly and therefore cannot drive the pattern similarity change we focussed our analysis on [1].
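The constraints on the presentation order can be sketched as follows. This is a minimal illustration, not the original stimulus code: the function name and the seed argument are hypothetical, and index 16 is assumed to stand for the target object.

```python
import random

def make_trial_order(n_objects=17, n_blocks=12, seed=0):
    """Return (object order, inter-trial intervals in TRs) for one participant.
    n_objects counts the 16 route objects plus the target object."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = list(range(n_objects))
        rng.shuffle(block)  # every object appears once per block
        order.extend(block)
    n_trials = len(order)
    # Two and three TR intervals occur equally often, randomly assigned to trials.
    itis = [2] * (n_trials // 2) + [3] * (n_trials - n_trials // 2)
    rng.shuffle(itis)
    return order, itis
```

Generating the sequence once per participant and reusing it for both runs, here by passing the same seed, mirrors the use of an identical trial order before and after learning.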

Data Analysis

Behavioral Data
Results from in-depth analysis of the behavioral data obtained during the spatio-temporal learning task as well as the memory tests conducted after fMRI scanning are reported in detail in [1]. Here, we used data from the spatio-temporal learning task as predictions for multi-voxel pattern similarity (see below). Specifically, we defined the temporal structure of pairwise relationships between object pairs as the median time elapsed between object encounters across the 14 laps of the route. These times differed between participants due to differences in navigation speed [1]. Figure 1B shows the temporal distance matrix averaged across participants for illustration. The spatial distances between object positions were defined as the Euclidean distances between the locations of the respective chests in the virtual city. Remembered temporal distances were obtained in a post-scan memory test in which participants indicated the remembered temporal (and, separately, spatial) relationships between object pairs on a subjective scale [1].
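As a minimal sketch of how the two prediction matrices could be computed, assume a hypothetical array of encounter times per lap and a hypothetical array of chest coordinates; neither the variable names nor the data layout are taken from the original analysis code.

```python
import numpy as np

def temporal_distances(encounter_times):
    """encounter_times: (n_laps, n_objects) time of each object encounter in a lap.
    Returns the median elapsed time between every pair of objects across laps."""
    per_lap = np.abs(encounter_times[:, :, None] - encounter_times[:, None, :])
    return np.median(per_lap, axis=0)

def spatial_distances(positions):
    """positions: (n_objects, 2) chest coordinates in the virtual city.
    Returns pairwise Euclidean distances."""
    diff = positions[:, None, :] - positions[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

Both functions return symmetric matrices with a zero diagonal, so only the upper triangle is needed when relating them to pattern similarity change.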

fMRI preprocessing
Preprocessing of fMRI data was carried out using FEAT (FMRI Expert Analysis Tool, version 6.00), part of FSL (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl, version 5.0.8), as described in [1]. Functional images were submitted to motion correction and high-pass filtering (cutoff 100s). Images were not smoothed. When available, distortion correction using the fieldmaps was applied. Using FLIRT [46,47], the functional images acquired during the picture viewing tasks were registered to the preprocessed whole-brain mean functional images, which were in turn registered to the participant's structural scan. The linear registration from this high-resolution structural to standard MNI space (1mm resolution) was then further refined using FNIRT nonlinear registration [48]. Representational similarity analysis of the functional images acquired during the picture viewing tasks was carried out in regions of interest co-registered to the space of the whole-brain functional images.

ROI definition
Based on functional connectivity patterns, the anterior-lateral and posterior-medial portions of human EC were identified as human homologue regions of the rodent lateral and medial EC in two independent studies [15,16]. Here, we focused on temporal coding in the alEC, building upon a recent report of temporal signals in rodent lateral EC during navigation [14]. Therefore, we used masks from [15] to perform ROI-based representational similarity analysis on our data. The ROI mask for the bilateral hippocampus was based on the probabilistic Harvard-Oxford atlas, thresholded at a probability level of 0.25 [1]. For each ROI, the mask was coregistered from standard MNI space (1mm) to each participant's functional space (number of voxels: alEC 126.7±46.3; pmEC 69.0±32.9; hippocampus 1062.3±101.9 mean ± standard deviation). To improve anatomical precision for the EC masks, the subregion masks from [15] were each intersected with participant-specific EC masks obtained from their structural scan using the automated segmentation implemented in Freesurfer (version 5.3).

Representational Similarity Analysis
As described in [1], we implemented representational similarity analysis (RSA, [49,50]) for the two picture viewing tasks individually and then analyzed changes in pattern similarity between the two picture viewing tasks, which were separated by the spatio-temporal learning phase. After preprocessing, analyses were conducted in Matlab (version 2017b, MathWorks). In a general linear model, we used the motion parameters obtained during preprocessing as predictors for the time series of each voxel in the respective ROI. Only the residuals of this GLM, i.e. the part of the data that could not be explained by head motion, were used for further analysis. Stimulus presentations during the picture viewing tasks were locked to the onset of fMRI volumes and the third volume after the onset of picture presentations, corresponding to the time 4.54 to 6.81 seconds after stimulus onset, was extracted for RSA.
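The motion regression and volume extraction can be sketched as follows. This is a NumPy illustration rather than the original Matlab code; the intercept column and the function names are assumptions. The reported window of 4.54 to 6.81 seconds implies a TR of about 2.27 seconds, so the third volume after stimulus onset sits two volumes after the onset volume.

```python
import numpy as np

def remove_motion(timeseries, motion):
    """timeseries: (n_volumes, n_voxels); motion: (n_volumes, n_params).
    Returns the residuals after regressing out the motion parameters."""
    # Assumption: an intercept column is included alongside the motion regressors.
    design = np.column_stack([np.ones(len(motion)), motion])
    beta, *_ = np.linalg.lstsq(design, timeseries, rcond=None)
    return timeseries - design @ beta

def extract_patterns(residuals, onset_volumes, lag=2):
    """Pick the third volume after each stimulus onset (onset volume + 2)."""
    return residuals[np.asarray(onset_volumes) + lag]
```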
For each ROI, we calculated Pearson correlation coefficients between all object presentations, excluding comparisons within the same block of the 12 blocks of each picture viewing task. For each pair of objects, we averaged the resulting correlation coefficients across comparisons, yielding a 16×16 matrix reflecting the average representational similarity of objects for each picture viewing task [1]. These matrices were Fisher z-transformed. Since the picture viewing task was conducted before and after spatio-temporal learning, the two cross-correlation matrices reflected representational similarity without and with knowledge of the spatial and temporal relationships between objects, respectively. Thus, the difference between the two matrices corresponds to the change in pattern similarity due to learning. Specifically, we subtracted the pattern similarity matrix obtained prior to learning from the pattern similarity matrix obtained after learning, resulting in a matrix of pattern similarity change for each ROI from each participant. This change in similarity of object representations was then compared to different predictions of how this effect of learning might be explained (Figure 1B).
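The computation of pattern similarity change described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used for the analyses. The data layout (`patterns[block][obj]` holding one voxel pattern per object per block) and all function names are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def similarity_matrix(patterns):
    """Average between-block similarity of object patterns.

    patterns[block][obj] is the voxel pattern of one object presentation
    (hypothetical layout: every object is shown once per block).
    Returns an n_obj x n_obj matrix of Fisher z-transformed mean correlations.
    """
    n_blocks, n_obj = len(patterns), len(patterns[0])
    sim = [[0.0] * n_obj for _ in range(n_obj)]
    for i in range(n_obj):
        for j in range(n_obj):
            # Skip comparisons of presentations from the same block.
            rs = [pearson(patterns[a][i], patterns[b][j])
                  for a in range(n_blocks)
                  for b in range(n_blocks) if a != b]
            # Average first, then apply the Fisher z-transform.
            sim[i][j] = math.atanh(sum(rs) / len(rs))
    return sim


def similarity_change(pre, post):
    """Post-learning minus pre-learning similarity, element by element."""
    return [[q - p for p, q in zip(row_pre, row_post)]
            for row_pre, row_post in zip(pre, post)]
```

Calling `similarity_matrix` once per picture viewing task and passing both results to `similarity_change` yields one change matrix per ROI and participant.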
To test the hypothesis that multi-voxel pattern similarity change reflects the temporal structure of the object encounters along the route, we correlated pattern similarity change with the temporal relationships between object pairs, defined as the participant-specific median time elapsed between object encounters while navigating the route. Likewise, we compared pattern similarity change to the Euclidean distances between object positions in the virtual city as well as to the temporal relations subjectively remembered by each participant. We calculated Spearman correlation coefficients to quantify the fit between pattern similarity change and each prediction. We expected negative correlations because pattern similarity should increase for objects separated by only a small distance relative to objects separated by large distances [1]. We compared these correlation coefficients to a surrogate distribution obtained from shuffling pattern similarity change against the respective prediction. For each of 10000 shuffles, the Spearman correlation coefficient between the two variables was calculated, yielding a surrogate distribution of correlation coefficients (Figure 1B). We quantified the size of the original correlation coefficient in comparison to the surrogate distribution. Specifically, we assessed the proportion of larger or equal correlation coefficients in the surrogate distribution and converted the resulting p-value into a z-statistic using the inverse of the normal cumulative distribution function [1,51,52]. Thus, for each participant, we obtained a z-statistic reflecting the fit of the prediction to pattern similarity change in that ROI. For visualization (Figure 2c), we averaged correlation coefficients quantifying pattern similarity change in alEC separately for comparisons of objects encountered close together or far apart in time based on the median elapsed time between object pairs.
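The shuffling procedure can be illustrated with a short Python sketch. It takes two vectors over the unique object pairs (pattern similarity change and one prediction). Several details are assumptions of this example and are not stated in the text: the function names, the continuity correction that keeps the p-value away from 0 and 1, and the sign convention, chosen here so that a negative correlation maps to a negative z-statistic.

```python
import math
import random
from statistics import NormalDist


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_z(change, prediction, n_shuffles=10000, seed=0):
    """Spearman correlation and its z-statistic from a shuffle distribution."""
    rng = random.Random(seed)
    rank_change, rank_pred = ranks(change), ranks(prediction)
    # Spearman correlation is the Pearson correlation of the ranks.
    observed = pearson(rank_change, rank_pred)
    shuffled = list(rank_change)
    larger_or_equal = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the pairing between the two variables
        if pearson(shuffled, rank_pred) >= observed:
            larger_or_equal += 1
    # Assumption: add 0.5 / 1 so that p lies strictly between 0 and 1.
    p = (larger_or_equal + 0.5) / (n_shuffles + 1)
    # Assumption: use 1 - p so that negative correlations give negative z.
    z = NormalDist().inv_cdf(1 - p)
    return observed, z
```

Shuffling the ranks instead of the raw values gives the same surrogate distribution, because ranks do not depend on the order of the observations.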
The z-statistics were tested on the group level using permutation-based procedures (10000 permutations) implemented in the Resampling Statistical Toolkit for Matlab (https://mathworks.com/matlabcentral/fileexchange/27960-resampling-statistical-toolkit). To test whether pattern similarity change in alEC reflected the temporal structure of object encounters, we tested the respective z-statistic against 0 using a permutation-based t-test and compared the resulting p-value against an alpha of 0.0125 (Bonferroni-corrected for 4 comparisons, Figure 2). Respecting within-subject dependencies, differences between the fit of temporal and spatial relationships between objects and pattern similarity change in the EC subregions were assessed using a permutation-based two-way repeated measures ANOVA with the factors EC subregion (alEC vs. pmEC) and relationship type (elapsed time vs. Euclidean distance). Planned post-hoc comparisons then included permutation-based t-tests of temporal against spatial mapping in alEC and temporal mapping between alEC and pmEC (Bonferroni-corrected alpha-level of 0.025). Likewise, we conducted a permutation-based two-way repeated measures ANOVA with the factors region (alEC vs. HPC) and temporal relationship type (objective time elapsed vs. remembered temporal relationship) to compare temporal mapping between the alEC and the hippocampus.
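As an illustration of a permutation-based test of the z-statistics against 0, the following Python sketch uses random sign flipping. This is one common way to build such a test. It is not the implementation of the Resampling Statistical Toolkit, and the function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def t_statistic(values):
    """One-sample t-statistic against 0."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        # Degenerate sample without variance.
        return math.copysign(math.inf, m) if m != 0 else 0.0
    return m / (sd / math.sqrt(len(values)))


def sign_flip_test(values, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a mean different from 0.

    Under the null hypothesis the sign of each participant's value is
    exchangeable, so each permutation flips signs at random.
    """
    rng = random.Random(seed)
    t_obs = t_statistic(values)
    extreme = 0
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if abs(t_statistic(flipped)) >= abs(t_obs):
            extreme += 1
    # Count the observed statistic as one member of the distribution.
    return t_obs, (extreme + 1) / (n_perm + 1)
```

The returned p-value would then be compared against the corrected alpha level, for example 0.0125 for four comparisons.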

Timeline reconstruction
To reconstruct the timeline of events from alEC pattern similarity change we combined multidimensional scaling with Procrustes analysis (Figure 2D). We first rescaled the pattern similarity matrix to a range from 0 to 1 and then converted it to a distance matrix (distance = 1 − similarity). We averaged the distance matrices across participants and subjected the resulting matrix to classical multidimensional scaling. Since we were aiming to recover the timeline of events, we extracted coordinates underlying the averaged pattern distance matrix along one dimension. In a next step, we fitted the resulting coordinates to the times of object encounters along the route, which were also averaged across participants, using Procrustes analysis. This analysis finds the linear transformation, allowing scaling and reflections, that minimizes the sum of squared errors between the two sets of temporal coordinates. To assess whether the reconstruction of the temporal relationships between memories was above chance, we correlated the reconstructed temporal coordinates with the true temporal coordinates using Pearson correlation (Figure 2E). 95% confidence intervals were bootstrapped using the Robust Correlation Toolbox [53]. Additionally, we compared the goodness of fit of the Procrustes transform (the Procrustes distance, which measures the deviance between true and reconstructed coordinates) to a surrogate distribution. Specifically, we randomly shuffled the true temporal coordinates and then mapped the coordinates from multidimensional scaling onto these shuffled timelines. We computed the Procrustes distance for each of 10000 iterations. We quantified the proportion of random fits in the surrogate distribution better than the fit to the true timeline (i.e. smaller Procrustes distances) and expressed it as a p-value to demonstrate that our reconstruction exceeds chance level (Figure 2F-G).
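The reconstruction steps can be sketched in Python for the one-dimensional case. The example starts from a distance matrix and rests on several assumptions: the function names are hypothetical, power iteration stands in for a full eigendecomposition (adequate only when the leading eigenvalue is positive and dominant), and the Procrustes fit includes a translation. In one dimension, a fit with scaling and reflection reduces to a least-squares line, and the standardized Procrustes distance equals 1 − r².

```python
import math
import random


def classical_mds_1d(dist, n_iter=500, seed=0):
    """First classical-MDS dimension of a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Double centering of the squared distances (matrix is symmetric).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(n_iter):  # power iteration for the leading eigenvector
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [math.sqrt(max(eigval, 0.0)) * x for x in v]


def procrustes_1d(target, coords):
    """Fit coords to target; return (standardized distance, fitted coords)."""
    n = len(target)
    mt, mc = sum(target) / n, sum(coords) / n
    stc = sum((t - mt) * (c - mc) for t, c in zip(target, coords))
    stt = sum((t - mt) ** 2 for t in target)
    scc = sum((c - mc) ** 2 for c in coords)
    slope = stc / scc  # a negative slope corresponds to a reflection
    fitted = [mt + slope * (c - mc) for c in coords]
    return 1.0 - stc * stc / (stt * scc), fitted


def procrustes_shuffle_p(target, coords, n_shuffles=10000, seed=0):
    """Proportion of shuffled timelines that are fitted better than the true one."""
    rng = random.Random(seed)
    observed, _ = procrustes_1d(target, coords)
    shuffled = list(target)
    better = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if procrustes_1d(shuffled, coords)[0] < observed:
            better += 1
    return better / n_shuffles
```

The sign of the MDS coordinates is arbitrary, which is why the fit has to allow reflections.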

Signal-to-noise ratio
We quantified the temporal and spatial signal-to-noise ratio for each ROI. Temporal signal-to-noise was calculated for each voxel as the temporal mean divided by the temporal standard deviation for both runs of the picture viewing task separately. Values were averaged across the two runs and across voxels in the ROIs. Spatial signal-to-noise ratio was calculated for each volume as the mean signal divided by the standard deviation across voxels in the ROI. The resulting values were averaged across volumes of the time series and averaged across the two runs. Signal-to-noise ratios were compared between ROIs using permutation-based t-tests.
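The two signal-to-noise measures can be written compactly in Python. This is an illustrative sketch that assumes each run is stored as `run[volume][voxel]` for the voxels of one ROI and that the sample standard deviation (n − 1 denominator) is used. The function names are hypothetical.

```python
from statistics import mean, stdev


def temporal_snr(run):
    """Mean over voxels of (temporal mean / temporal standard deviation)."""
    n_voxels = len(run[0])
    per_voxel = []
    for v in range(n_voxels):
        series = [volume[v] for volume in run]  # time series of one voxel
        per_voxel.append(mean(series) / stdev(series))
    return mean(per_voxel)


def spatial_snr(run):
    """Mean over volumes of (mean across voxels / SD across voxels)."""
    return mean([mean(volume) / stdev(volume) for volume in run])


def roi_snr(runs):
    """Average both measures across runs; returns (temporal, spatial)."""
    return (mean([temporal_snr(r) for r in runs]),
            mean([spatial_snr(r) for r in runs]))
```

For two runs of the picture viewing task, `roi_snr([run1, run2])` gives one temporal and one spatial value per ROI and participant.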

Supplemental Figures
Supplemental Figure 1. Overview of experimental design. Participants viewed object images in random order while undergoing fMRI before and after learning the temporal and spatial relationships between these objects. The order and timing of picture presentations were held identical in both sessions to assess changes in the similarity of object representations as measured by the difference in similarity of multi-voxel activity patterns (see Methods). In between the two picture viewing tasks, participants acquired knowledge about the spatial and temporal positions of objects along a route through the virtual city. Initially, the route was marked by traffic cones, but in later laps participants navigated the route without guidance. Participants encountered chests along the route and were instructed to open the chests by walking into them. Each chest contained a different object, which was displayed on a black screen upon opening the chest. Crucially, the route featured three teleporters that instantly teleported participants to a different part of the city where the route continued (Figure 1). This manipulation enabled us to dissociate the temporal and spatial distances between pairwise object comparisons (Supplemental Figure 2). After the second picture viewing task, participants' memory for temporal and spatial relationships between object pairs was assessed. Here, participants adjusted a slider to indicate whether they remembered object pairs to be close together or far apart. Temporal and spatial relations were judged in separate trials. The results of these memory tests are reported in detail in [1].
Supplemental Figure 2. Temporal and spatial distances are uncorrelated. Pairwise temporal and spatial distances between objects are uncorrelated (Pearson r=-0.068; bootstrapped 95% confidence interval: -0.24, 0.12; p=0.462). Median times elapsed between object encounters were z-scored and then averaged across participants. Spatial distances were defined as z-scored Euclidean distances between object positions. When correlating individual median times elapsed with spatial distances, the correlation between the dimensions was not significant in any of the participants (mean ± standard deviation of Pearson correlation coefficients r= -0.068±0.006, all p≥0.378).
Supplemental Figure 3. Signal-to-noise ratio in the entorhinal cortex and hippocampus. A,B. Temporal signal-to-noise ratio did not differ between entorhinal subregions (A), but differed between alEC and hippocampus (B). C,D. Spatial signal-to-noise ratio was comparable between entorhinal subregions (C), but differed between alEC and hippocampus (D).