Phase separation of competing memories along the human hippocampal theta rhythm

Competition between overlapping memories is considered one of the major causes of forgetting, and it is still unknown how the human brain resolves such mnemonic conflict. In the present magnetoencephalography (MEG) study, we empirically tested a computational model that leverages an oscillating inhibition algorithm to minimise overlap between memories. We used a proactive interference task, where a reminder word could be associated with either a single image (non-competitive condition) or two competing images, and participants were asked to always recall the most recently learned word–image association. Time-resolved pattern classifiers were trained to detect the reactivated content of target and competitor memories from MEG sensor patterns, and the timing of these neural reactivations was analysed relative to the phase of the dominant hippocampal 3 Hz theta oscillation. In line with our pre-registered hypotheses, target and competitor reactivations locked to different phases of the hippocampal theta rhythm after several repeated recalls. Participants who behaviourally experienced lower levels of interference also showed larger phase separation between the two overlapping memories. The findings provide evidence that the temporal segregation of memories, orchestrated by slow oscillations, plays a functional role in resolving mnemonic competition by separating and prioritising relevant memories under conditions of high interference.

Each day is a flow of events that take place at different times but often in overlapping contexts. Unavoidably, many of the stored memories share similar features, and this overlap poses a major challenge for our memory system (McClelland et al., 1995; Norman & O'Reilly, 2003). The present MEG study investigates the possibility that the human brain uses a temporal phase code to adaptively separate overlapping memories, enabling the targeted retrieval of goal-relevant information.

Competition between similar memories is considered one of the major causes of forgetting (Anderson & Neely, 1996; Underwood, 1957). A prominent case is proactive interference, where access to a target memory is impaired when overlapping information has been stored prior to target learning (Tulving & Watkins, 1974). This impairment is typically ascribed to the conflict arising from the co-activation of competing memories (Kliegl & Bäuml, 2021). On a neurophysiological level, mid-frontal theta oscillations (3-8 Hz) have been identified as a reliable marker of cognitive conflict in general (Cavanagh & Frank, 2014) [...] rhythm. These findings suggest that low-frequency oscillations provide time windows for the selective processing and readout of distinct units of information (Lisman & Jensen, 2013). The present study investigates phase separation as a potential mechanism to avoid interference between multiple competing memories that are simultaneously reactivated by a cue.

The computations by which the human brain achieves a separation of multiple overlapping memories are currently unknown. One computational model leverages different phases of a theta oscillation to iteratively differentiate target from competitor memories (Norman et al., 2006).
In this model, which we will henceforth refer to as the oscillating interference resolution model, a retrieval cue will activate associated units, representing target and competitor features, in a phase-dependent manner. In the most desirable output state, at medium levels of inhibition, the cue only activates the target units and no or few competitor features. When inhibition is raised towards the peak of the oscillation, only strong features of the target memory will remain active, and the model learns (via Contrastive Hebbian Learning) to strengthen through Long-Term Potentiation (LTP) the weaker target nodes that did not survive the higher inhibition levels. Conversely, during the transition from a medium to a lower inhibition state towards the trough of the oscillation, activation spreads to more weakly associated units, including some competitor features. This opposite phase is used to identify and punish overly strong features of the competing memory through Long-Term Depression (LTD). The mechanism (see Fig. 1) is repeated across several cycles of an oscillation, which changes the similarity structure of memories into a state in which they are less likely to interfere with each other. This model [...]

Participants (n = 24) completed an associative memory task including one proactive interference condition and two control conditions (Fig. 1A-B). In each learning trial, [...]

Figure 1. Paradigm and rationale for decoding analyses. A) At encoding, subjects were instructed to memorize the word-image associations using an imagery strategy, and to constantly update their memory with the most recent associate to each word. The experiment consisted of 3 different conditions. In NC1, subjects encoded the word together with one association; in NC2, subjects encoded the word with the same associate twice; and in the competitive condition (CC), subjects encoded the word together with two different associates (one scene and one object). B) At retrieval, participants were instructed to remember the most recently encoded associate when prompted with a word cue. C) Subordinate-category classifiers (animate/inanimate for objects, and indoor/outdoor for scenes) were used to obtain independent evidence for target and competitor reactivation at each sample point.

Note that the super-ordinate (object/scene) classifier cannot discriminate between evidence for the target and repetition; however, this is not a necessary assumption for finding consistent phase separation.

Behavioural indices of proactive interference

We first evaluated behavioural evidence for proactive interference, that is, the extent to which cued recall performance suffered from having encoded two different pictures with one word (CC) compared to only one picture (NC1). Only trials where participants correctly responded [...]

Figure 2. Behavioural results, time-frequency analysis of theta power, and decoding accuracies in the non-competitive and competitive conditions. A) As expected, we found that memory accuracy (on the two follow-up questions combined), averaged across the three recall repetitions, was significantly impaired when encoding a given word cue with two different images (CC) compared to just one image (NC1), indicative of proactive interference (red line indicating significant difference at p < .05). Recall performance also benefited from learning a cue word together with the same image twice (NC2) compared to once (NC1). B) The average intrusion score shows that errors were not random (50% black line), but instead were significantly biased towards the competitor's sub-category. The proportion of intrusions did not decrease significantly across repetitions. C) Contrasting oscillatory power elicited by the cue in the CC and the NC2 conditions resulted in a significant cluster (500-700ms, darkest red) in the theta frequency range (3-8 Hz), most prominently over right frontal electrodes (see right inlay). D) Results of an LDA-based classifier trained and tested on the non-competitive conditions (NC1 and NC2) at retrieval, showing a cluster of significantly above-chance decoding accuracy approximately 2500-3000ms post cue-onset, with an earlier decoding peak around 1-2 seconds not surviving cluster correction. E) Results from a classifier trained on the non-competitive conditions (NC1 and NC2) and tested on the competitive condition.

Separate classifiers were used to detect target (blue) and competitor (red) evidence at the level of sub-categories.

No significant cluster emerged when averaging across all repetitions. F) When the trials were realigned to the time of subjective recollection (i.e., response) instead of cue onset, significant target decoding was found in the competitive condition (CC) when averaging over all repetitions. G) Response-locked target decoding in the CC condition was also significantly higher on correct than incorrect trials. H) Contrasting decodability in the first and third retrieval repetition for target and competitor memories, respectively, yielded a significant interaction such that evidence for target memories increased as a function of repetition, whereas evidence for competitor memories [...]

[...] 3000ms post-cue onset, and separately for the target and competitor memories. The first 500ms were excluded to attenuate the influence of early, cue-elicited event-related potentials, and because no neural reinstatement is expected during this early cuing period (Staresina & Wimber, 2019). The analysis identified 3Hz as the frequency with the highest power for both target and competitor memories (same peak frequency was also observed for the first repetition; [...] chance distribution generated from surrogate-label classifiers (Fig. 3A-B, red line indicating significant deviation). For the NC condition, the peak frequency was also in the low theta range but slightly faster (4-5Hz; see Figure 3 - supplement 1C-D). Since our theta-locked analyses were focused on the CC condition, we used 3Hz as the modulating frequency in all subsequent analyses.

[...] to reconstruct the activity of virtual hippocampal channels in source space. To validate the specificity of the source localisation, we tested for distinct frequency profiles of the hippocampal region of interest compared with two non-memory control regions, as shown in [...]
The MI was calculated separately for each retrieval repetition, and 12 we contrasted phase modulation of the fidelity values from the real-label classifier to the 95th 13 percentile of an empirical chance distribution derived from surrogate-label classifiers. For both 14 target and competitor memories, we found that fidelity values were only significantly 15 modulated (p < .05) by the hippocampal 3Hz phase in the third and final retrieval repetition 16 ( Fig. 3D-F). Note that this observation partly deviates from our preregistered hypotheses, 17 where we expected phase modulation throughout the recall phase, accompanied by an increase 18 in phase separation of targets and competitors across the three repetitions (see next paragraph). 19 20 Target and competitor reactivations peak at distinct theta phases 21 The key hypothesis was that over time, the reactivated representations of target and competitor representations are overlapping initially, such that strong competitor nodes can (incorrectly) 2 activate during the high inhibition ("target") phase of theta, but gradually get weakened and 3 thus require lower levels of inhibition to become active. Meanwhile, the weakest target nodes 4 do not survive high inhibition initially and thus only activate during a lower inhibition phase 5 of the theta cycle, however, with repeated strengthening they will become active at an 6 increasingly early, higher-inhibition phase. Therefore, while early in time the target and humans, see: Kerrén et al., 2018). Taking this model into account, we thus hypothesised that 17 the target-competitor phase segregation would over time and repetitions become optimal within 18 the retrieval portion (e.g., half) of the theta cycle, rather than spread out across the entire cycle 19 (not shown in Fig. 1D). 20

The phase binning method used to calculate the modulation index (see above) allowed us to determine the theta phase bin at which target and competitor reinstatement was maximal in a given trial and condition. Importantly, the absolute phase of reinstatement in terms of its angle is likely to vary across participants (e.g., due to individual differences in anatomy). We therefore contrasted the phase of maximal target and competitor reinstatement within each individual participant, computing their phase distance as an index of phase separation. This phase distance is expected to be consistent across participants irrespective of the absolute angle of target and competitor reactivation, and can thus be subjected to group-level statistics. We used a Rayleigh test for non-uniformity (i.e., clustering) to test how coherent the phase separation angle was across participants, and a circular v-test to establish if the mean separation angle significantly deviated from zero. Note that without significant clustering, it is difficult to interpret the mean angle in such a phase analysis. On the other hand, significant clustering without a significant difference from zero would indicate that consistently across participants, targets and competitors reactivate at a similar theta phase.

[...] there was no significant difference in target-competitor phase separation between the first and third repetition when comparing the difference in mean angle (z(20) = 1.09, p = .29). Note, however, that there was no significant clustering around a stable mean angle in the first repetition (see above), and this statistical comparison is therefore inconclusive. Finding the separation between target and competitor memories only in the last repetition, rather than throughout the entire recall phase, deviates from our preregistered hypothesis, although paralleling the phase modulation results reported above.
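The phase-distance logic described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the default of 10 phase bins, and the use of the large-sample approximation for the Rayleigh p-value are assumptions made here for illustration. The sketch finds the preferred phase bin for one memory, takes the wrapped difference between two preferred phases, and then computes a Rayleigh test and a V-test statistic across participants.

```python
import cmath
import math


def circ_diff(a, b):
    """Signed circular difference a - b in radians, wrapped to [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * (a - b)))


def preferred_phase(phases, values, n_bins=10):
    """Centre (radians) of the phase bin with the highest mean value.

    n_bins is a hypothetical default; the source does not state it here.
    """
    width = 2 * math.pi / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ph, v in zip(phases, values):
        k = int((ph % (2 * math.pi)) / width) % n_bins
        sums[k] += v
        counts[k] += 1
    means = [s / c if c else float("-inf") for s, c in zip(sums, counts)]
    best = max(range(n_bins), key=means.__getitem__)
    return (best + 0.5) * width


def rayleigh_test(angles):
    """Return (mean angle, Rayleigh z, approximate p) for angles in radians."""
    n = len(angles)
    resultant = sum(cmath.exp(1j * a) for a in angles)
    big_r = abs(resultant)          # resultant length n * r
    z = big_r ** 2 / n
    # Widely used approximation for the Rayleigh p-value (an assumption here).
    p = math.exp(math.sqrt(1 + 4 * n + 4 * (n ** 2 - big_r ** 2)) - (1 + 2 * n))
    return cmath.phase(resultant), z, p


def v_test(angles, direction=0.0):
    """Return (V statistic, p) for clustering around a specified direction."""
    n = len(angles)
    resultant = sum(cmath.exp(1j * a) for a in angles)
    v = abs(resultant) * math.cos(cmath.phase(resultant) - direction)
    u = v * math.sqrt(2.0 / n)
    p = 0.5 * math.erfc(u / math.sqrt(2))  # upper tail of the standard normal
    return v, p
```

In use, each participant would contribute one value, `circ_diff(preferred_phase(theta, target_fidelity), preferred_phase(theta, competitor_fidelity))`, and the resulting list of distances would be passed to `rayleigh_test` and `v_test`. Because only the within-participant difference enters the group statistics, the absolute preferred phase of each participant drops out.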
Overall, the phase distance analysis thus partly confirms our hypotheses, indicating that competing memories become increasingly separated along the hippocampal theta rhythm over time, with significant phase separation emerging after several recall cycles.

The next set of analyses was not pre-registered, and tested for a relationship between target-competitor phase distance and behaviour. The sample was split according to each individual's intrusion score in the third repetition, dividing participants into a low intrusion and a high intrusion group (Fig. 4B). We found that the high intrusion group had a mean separation angle of 7 degrees that was not significantly different from zero (Rayleigh test for non-uniformity, z(9) = 1.77, p = .17). The low intrusion group, by contrast, showed a mean phase separation of 57 degrees that significantly clustered around this angle (Rayleigh test for non-uniformity, z(9) = 3.74, p = .02); however, we found no significant deviation from zero phase shift in this sub-group of participants (Rayleigh test for non-uniformity with a specified mean angle of 0, z(9) = 3.32, p = .069). A statistical comparison of the phase separation angle in the high and low intrusion groups (Wilcoxon signed-rank test in non-circular space, see Methods) did not indicate a difference between the mean of the high and low intrusion groups' separation angle (Z = 43, p = .85). Note that each sub-sample in this comparison only contains 10 participants, and the statistical comparison is thus likely underpowered. In sum, this analysis suggests that only low intrusion participants exhibit significant phase separation by the end of the task, which can be taken as an indication that more neural differentiation of overlapping memories along the theta cycle relates to more successful behavioural differentiation.

Two more analyses were conducted to corroborate the finding of a phase difference between target and competitor reactivation, complementing the pre-registered analysis reported above. First, we tested for a temporal difference (lag) between the timelines of target and competitor reactivation using a cross-correlation. To do so, we filtered the two fidelity timecourses from each participant at the dominant 3Hz frequency, in the third repetition of the competitive condition. We then quantified a temporal lag between timecourses by using a sliding window (allowing us to see phase lags that evolve over time) and calculating the cross-correlation for each 330ms time bin (approximately one cycle of the dominant 3Hz frequency, i.e., 1/3 s). Note that this analysis does not specifically take the phase of virtual hippocampal sensors into account, and instead simply cross-correlates the entire target and competitor fidelity timecourses derived from all MEG sensors. The analysis revealed a significant cluster starting approximately 1 sec after cue onset and lasting until the end of the trial, with a maximum lag of 30 degrees phase angle (Fig. 4C). This finding suggests that the reactivation of competitor memories is lagging 30 degrees behind the reactivation of target memories.
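To make the relation between a temporal lag and a phase angle concrete, the following Python sketch estimates the lag at which two 3Hz-filtered timecourses correlate best and converts it to degrees of the 3Hz cycle. It is a simplified illustration under stated assumptions: the function names are hypothetical, the filtering, sliding-window and cluster statistics of the actual analysis are omitted, and a positive lag is defined here as the competitor trailing the target.

```python
import numpy as np


def best_lag(target, competitor, max_lag):
    """Lag (in samples) maximising the Pearson correlation.

    Positive values mean the competitor timecourse trails the target.
    """
    best, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = target[: len(target) - lag], competitor[lag:]
        else:
            a, b = target[-lag:], competitor[: len(competitor) + lag]
        r = np.corrcoef(a, b)[0, 1]
        if r > best_r:
            best, best_r = lag, r
    return best


def lag_to_degrees(lag_samples, fs, freq):
    """Express a lag in samples as a phase angle of an oscillation at freq Hz."""
    return 360.0 * freq * lag_samples / fs


# Toy example: two 3 Hz sinusoids sampled at 1000 Hz, the second shifted by 30 degrees.
fs, freq = 1000, 3.0
t = np.arange(1000) / fs
target = np.sin(2 * np.pi * freq * t)
competitor = np.sin(2 * np.pi * freq * t - np.deg2rad(30))
lag = best_lag(target, competitor, max_lag=166)  # search within half a cycle
phase_lag = lag_to_degrees(lag, fs, freq)
```

At 3Hz one full cycle lasts about 333ms, so a lag of 30 degrees corresponds to roughly 28ms. Restricting the search to half a cycle in either direction avoids the ambiguity that arises because a periodic signal correlates equally well at lags one full cycle apart.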

The second additional analysis used the average target and competitor fidelity timecourse of each participant, filtered at 3Hz and shown individually in Figure 4 - supplement 1A. Visual inspection already suggests a temporal shift, and sometimes even phase opposition, between target and competitor decoding in most participants. To formally quantify the phase shift on a group level, we subtracted in complex space, for each time bin, the phase angle of the target [...]

To remember a specific experience, the respective memory often needs to be selected against overlapping, competing memories. The processes by which the brain achieves such prioritisation are still poorly understood. We here tested a number of predictions derived from the oscillating interference resolution model (Norman et al., 2006). In line with its predictions, the present results demonstrate that the neural signatures of two competing memories become increasingly phase separated over time along the theta rhythm. Furthermore, larger phase differences were associated with fewer intrusion errors from the interfering memories. These findings support the existence of a phase-coding mechanism along the cycle of a slow oscillation that adaptively separates competing mnemonic representations, minimising their temporal overlap (Lisman & Idiart, 1995; Norman et al., 2006).

[...] that target memories would be strengthened and become decodable at an earlier (higher inhibition) phase (see Fig. 1D), while competing associates would be weakened, with time windows of their reactivation gradually shifting to a later (low inhibition) phase. The phase separation between the two competing memories should thus increase the more often the competing memories are being reactivated, and we expected to find an observable difference in phase separation after a number of repeated recalls. To further test whether a large phase segregation is beneficial for memory performance, we related phase distance to the number of [...]

[...] attempts, target memories would be strengthened, and competitors become less intrusive. Though not tested behaviourally in the present study, repeated recall is known to induce enhancement of the retrieved memories (Karpicke & Roediger, 2008; Rowland, 2014) and forgetting of non-retrieved, competing memories (Anderson et al., 1994).
Evidence for the up- and down-regulation of neural target and competitor patterns, respectively, has previously been shown in fMRI studies (e.g., Kuhl et al., 2007; Wimber et al., 2015). In the present work, we found neural evidence for both target strengthening and competitor weakening within the time window of maximum memory reactivation (Fig. 2H). The increase in target evidence was more robust than the decrease in competitor evidence, with the latter not surviving multiple [...] our results suggest that dynamic changes in overlapping memories can be tracked using MEG.

Decoding of target and competitor memories in the high interference (CC) condition was generally less robust than decoding of target memories in the low interference (NC) conditions. More broadly, one challenge for time-resolved decoding of reactivated memory content is the considerable variance in the timing of memory recall across trials, conditions, and participants, likely affecting the timing in neural pattern reinstatement. Such variance can be rectified, at least to a degree, when aligning the timelines to the button press that indicates subjective recall, as shown by our response-locked analyses leading to more robust target decoding (see Fig. 2F and G). There are, however, several reasons why we pre-registered our analysis locked to the onset of the memory cue. Most importantly, previous work suggests that the phase of theta oscillations is reset by a memory cue and remains relatively stationary for a period of time (see: [...]

[...] fundamental assumption that slow oscillations regulate the excitation-inhibition balance of local neural assemblies (Buzsáki, 2006). However, they do differ in their theoretical scopes. Theta-gamma code models were developed to explain how the brain can handle ongoing states of high attentional or working memory load, where multiple distinct items need to be kept [...]

To our knowledge, no study in humans has explicitly tested for phase coding when several overlapping memories compete for retrieval. However, one intracranial study using a virtual navigation paradigm showed that the neural representations of potentially interfering spatial [...] temporally separate co-active, competing memory representations in the human brain. They additionally suggest that such a code adaptively evolves across repeated target retrievals, in line with the idea that oscillating excitation-inhibition can optimise learning with respect to future access to the relevant target memory.
[...] to the neural overlap in spatial patterns, and have been tested primarily in fMRI studies. One such prediction is the non-monotonic plasticity hypothesis (NMPH), derived from the same learning dynamics along the oscillatory cycle as described above. If a competitor is sufficiently strong to be co-activated in the high inhibition (target) phase, it will benefit from synaptic strengthening of common connections and thus be integrated with the target memory. Moderately co-active competitors, on the other hand, will be subject to synaptic depression and [...] to long-term depression. Our data support the idea that low temporal separation is related to integration, behaviourally. The high-intrusion group had a mean phase difference of 7 degrees between target and competitor memories, theoretically in line with a time window of strong co-firing and hence synaptic strengthening. Integration might thus have led to higher amounts of intrusions (Brunec et al., 2020). In the low-intrusion group, more pronounced phase separation between target and competitor memories (57 degrees on average) might have led to only moderate co-firing and hence promoted differentiation, resulting in lower levels of behavioural interference. Although speculative, temporal separation could be a prerequisite for spatial differentiation. Future studies, combining high temporal with high spatial resolution, and paradigms to track the representational distance of overlapping memories, are needed to fully understand these dynamics.

[...] comparing the frequency profiles of rhythmic memory reactivation between non-competitive and competitive conditions, where conditions in which only one associate needs to be remembered showed slightly faster rhythmic peaks (4-5Hz) than the condition in which two associates compete for retrieval (3-4Hz, see Figure 3 - supplement 1).
While offering a possible explanation why slower oscillations are often observed in tasks with high memory demands, this posthoc interpretation will need to be corroborated by empirical evidence in future studies.

[...] demonstrated that phase coding facilitates the separation of overlapping, associatively linked memories. Together with the computational model used to derive these predictions, these findings offer a possible mechanism utilized by the human brain to resolve competition between simultaneously active memories. More generally, they add to a growing literature showing that slow oscillations orchestrate the intricate timing of neural processing with direct, observable effects on behaviour.

Competing interests

The authors declare that no competing interests exist.

Participants received task instructions and first performed one short practice block. All participants (see exclusion above) then performed 6 experimental blocks (40 encoding trials and 72 (24x3) retrieval trials per block), each consisting of an associative learning phase, a distractor task, and a retrieval test with 3 repetitions per target item (Fig. 1). At encoding, participants were asked on each trial to encode a word together with an image associate. In the competitive condition (CC), a word was encoded together with two associates, separated by at least three intervening trials. The instruction was to always memorise the most recent associate that was presented together with a given word for the subsequent memory test. Therefore, the second associate in the CC always served as the target, with the previously learned first associate (i.e., competitor) assumed to elicit proactive interference. In the non-competitive single exposure condition (NC1), a word was encoded together with only one associate, and these associations were never repeated during encoding.
This condition served as the behavioural baseline for measuring the effect of proactive interference on memory performance (i.e., having previously encoded a competing associate compared with only one associate). In the non-competitive double exposure condition (NC2), participants also encoded a word together with only one associate, but these associations were presented twice. This condition served as the neurophysiological baseline and was specifically designed to control for neural effects induced by the repetition of the word cue (including but not limited to repetition [...]

In total this summed up to 40 trials for one block of learning (NC1 = 8 trials, NC2 = 16 trials, CC = 16 trials). The order of the trials belonging to the 3 conditions was pseudo-randomized such that the average serial position of each condition within a block was equal. Images were pseudo-randomly assigned to these conditions for each participant, with the constraint that in the CC, the associates needed to be from different image categories (one object and one scene, split such that on half of the CC trials, the target was an object and the competitor a scene image, and vice versa for the remaining half).

A learning trial consisted of a jittered fixation cross (between 500 and 1500ms), a unique action verb (1500ms), a fixation cross (1000ms), followed by a picture of an object or scene that was presented in the centre of the screen for 4 seconds. Participants were asked to come up with a vivid mental image that linked the image and the word presented in the current trial. As soon as they had a clear association in mind, they pressed the right-thumb key on the button box. Participants were aware of the later memory test.

A distractor task followed each learning phase. Here participants had to indicate if a given random number (between 1 and 99) presented on the screen was odd or even.
They were instructed to accomplish as many trials as they could in 45 seconds and received feedback about their accuracy at the end of each distractor block.

[...] followed by one of the words as a reminder for the association. Participants were asked to bring back to mind the most recent associate of this word as vividly as possible. The cue was presented on the screen for 500ms and thereafter a blank screen with a black empty frame was presented. To capture the particular moment when participants consciously recalled a specific association, they were asked to press the right-thumb key as soon as they had a vivid image of the associated memory in mind. If they did, the frame flashed once, and participants were presented for 4 seconds with a blank screen and asked to hold the image in mind. A question then appeared on the screen asking if the retrieved item was an object, a scene, or they were unable to remember. Across trials, the object and scene options randomly shifted between the left and right sides of the screen. If the participant did not remember the association, they were told to press the left-thumb button. If participants selected "object" or "scene", a follow-up question appeared (dependent on the response to the first question), asking if the retrieved associate was an inanimate or animate object, or whether it was an indoor or outdoor scene. The two follow-up questions were self-paced, and there was no feedback.

The general hypotheses and analysis steps were pre-registered and can be found on OSF (https://osf.io/pz4v2/?view_only=fbb676ccb2e74ccbb16d5d8aa8f9c58f).

After excluding participants based on the criteria stated in the pre-registration (more than 2 SD deviation from the mean accuracy for each condition separately), 24 participants (16 female, 8 male) remained, with an average age of 24.5 years (SD = 5.73). For the magnetoencephalography (MEG) analysis, a further 3 participants were excluded due to noisy data, which resulted in 21 participants included in the analyses. All participants performed all six blocks except for two, for whom time limit and button box errors occurred, resulting in only five blocks for these participants for MEG analysis.

The MEG was recorded at the Centre for Human Brain Health (CHBH), Birmingham, UK, using an Elekta Neuromag TRIUX system, with 306 channels (204 planar gradiometers and 102 magnetometers; only gradiometers are used for all analyses reported here), sampled at 1000Hz (Elekta, Stockholm, Sweden). EEG was recorded with a 64-channel electrode cap in the initial 10 participants as a sanity check, in order to verify that a strong mid-frontal theta signal can be observed in the CC > NC2 condition, and how its topography compares between MEG and EEG (Cavanagh & Frank, 2014). Since we found a comparably strong theta increase over EEG and MEG (gradiometer) sensors, the EEG data is not reported in any of the analyses presented here. The experiment was shown on a projector screen using a PROPixx projector (VPixx Technologies, Saint-Bruno, Canada) with a 1440Hz refresh rate, and participants' responses were collected using two button response boxes (fMRI Button Pad (2-Hand) System, NAtA Technologies, Coquitlam, Canada).

[...] Fahrenfort, 2021). Briefly, the data for this correction step were divided into epochs spanning 15 seconds before cue onset and 15 seconds after cue onset. The experimentally relevant events started 1000ms before cue onset until 4000ms after cue onset. This time-window was masked out from each trial.
To make sure all data would be included, the continuous data were symmetrically mirror-padded with 15 seconds prior to segmentation. To improve the fit of the higher order polynomial, a 1st order polynomial was used to detrend the entire epoch (in accordance with de Cheveigné and Arzounian, 2018). Thereafter, a 30th order polynomial was fitted and removed from the data. Note that the events of interest were not a part of the fitting procedure, making sure the fit was not being influenced by cognitively relevant processing. The method was partly implemented using the Noise Tools toolbox (http://audition.ens.fr/adc/NoiseTools) together with custom-written MATLAB code. After this step, the detrended data were cut into the experimentally relevant epochs (-1000 to 4000ms around cue onset at retrieval).

An automatized trial and component rejection was applied to the data. In a first step, to remove high-frequency bursts, data were high-pass filtered at 100Hz, and trials that exceeded 4 times the median absolute deviation of the amplitude distribution across trials were automatically removed. In a second step, independent component analysis (ICA) was used to detect artifacts to be removed in the data. To this end, MEG data were downsampled to 250Hz, and only the first 1.5 seconds after cue onset were used for ICA (to reduce computational load). The [...]

After these components were removed, an additional automatized artifact rejection was conducted, similar to the initial step. The same procedure as in the first rejection round was followed but was done on channels instead of trials. Again, data were band-pass filtered between 1 and 100Hz, and the channels that exceeded 3 times the median absolute deviation of the channel distribution were rejected. We chose three times the median absolute deviation for the last two rounds of rejection to be more conservative.
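The two preprocessing ideas described above, polynomial detrending that ignores a masked window of interest and rejection based on the median absolute deviation (MAD), can be sketched in Python as follows. This is an illustrative approximation rather than the authors' MATLAB/NoiseTools pipeline: it uses a plain least-squares Chebyshev fit instead of the robust, iteratively reweighted fit of de Cheveigné and Arzounian, it omits mirror-padding and filtering, and the function names as well as the reading of the criterion as "distance from the median greater than k times the MAD" are assumptions.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def masked_detrend(t, data, mask, orders=(1, 30)):
    """Remove slow drifts by successive polynomial fits.

    Samples where mask is True (the window of interest) are excluded from
    the fit but still detrended with the fitted polynomial.
    """
    keep = ~mask
    out = np.asarray(data, dtype=float).copy()
    for order in orders:
        fit = Chebyshev.fit(t[keep], out[keep], order,
                            domain=[t.min(), t.max()])
        out -= fit(t)
    return out


def mad_outliers(amplitudes, k=4.0):
    """Flag entries lying more than k MADs away from the median."""
    x = np.asarray(amplitudes, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) > k * mad


# Toy example: quadratic drift plus a brief "event" inside the masked window.
t = np.linspace(-15, 15, 3001)                      # seconds around cue onset
mask = (t >= -1) & (t <= 4)                         # window excluded from the fit
drift = 0.5 * t ** 2 + 2 * t + 1
event = np.exp(-0.5 * ((t - 1.5) / 0.3) ** 2)
cleaned = masked_detrend(t, drift + event, mask, orders=(1, 5))
```

Excluding the window of interest from the fit means the polynomial cannot absorb task-related activity, which is the rationale given in the text. The toy example uses a lower polynomial order than the 30th order reported above simply to keep the demonstration small.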
Bad channels were interpolated

(SD = 6.36) and 77 (SD = 7.29). At the end of preprocessing, a sanity ERF-check on occipital channels of the retrieval data was conducted and the average waveform can be found in Figure 3 - supplement 2A.

trained on the non-competitive ("pure") conditions (i.e., retrieval of objects and scenes in the NC1 and NC2 conditions), and were then tested on both the non-competitive and the competitive conditions. The time interval of interest for these multivariate analyses started 500ms before onset of the retrieval cue and lasted up to 3000ms post-cue. This time window was selected because no memory reactivation is expected earlier than 500ms (Staresina & Wimber, 2019), and previous work using a similar (though non-competitive) cued recall paradigm suggests that participants need approximately 3 sec to mentally reinstate an image

Training/testing in the non-competitive condition was done as a sanity check for classifier performance in a memory retrieval situation with no interference. Since the same data were used for training and testing in this case, a 10-fold cross-validation was used and repeated 5 times. To test for reactivation of target and competitor memories in the competitive condition, trials were split into those where the target was an object and the competitor was a scene, and vice versa, and the corresponding object and scene classifiers (trained on non-competitive retrieval) were then used to separately indicate evidence for target and competitor reactivation on each single trial. Only correct trials were used in the competitive condition. As the training and testing data came from different trials, cross-validation in the competitive condition was not necessary.
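The repeated 10-fold scheme used for the non-competitive condition amounts to index bookkeeping, which the following minimal Python sketch illustrates. It assumes that trials are reshuffled in every repetition and that folds are as equal in size as possible; the shuffling scheme, the seed and the function name `repeated_kfold` are illustrative assumptions, not details reported by the authors.

```python
import random


def repeated_kfold(n_trials, k=10, repeats=5, seed=0):
    """Yield (repetition, train_indices, test_indices) for repeated k-fold CV.

    Within each repetition every trial is held out exactly once, so
    training and test trials never overlap within a split.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_trials))
        rng.shuffle(order)  # new random partition in each repetition
        folds = [order[i::k] for i in range(k)]  # k near-equal folds
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield rep, train_idx, test_idx


# Example with a hypothetical 53 trials: 10 folds x 5 repetitions = 50 splits.
splits = list(repeated_kfold(53, k=10, repeats=5, seed=1))
```

In the competitive condition no such splitting is needed, because the classifier is trained on one set of trials (non-competitive) and applied to a different set (competitive).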
To avoid overfitting, the covariance matrix was regularized using shrinkage regularization, with the lambda set to automatic (Blankertz et

accuracy was derived by calculating the fraction of correctly predicted labels, whereas chance was defined as 50% for a binary classifier. More specifically, during training, the classifier found the decision boundary that could best separate the patterns of activity from the two classes (animate vs inanimate for objects, indoor vs outdoor for scenes) in a high-dimensional space. The classifier was then asked to estimate whether the unlabelled pattern of brain activity in any given retrieval trial and at each time point was more similar to one or the other class. This training-test procedure was repeated until every single retrieval trial had been classified.

As part of a (not preregistered) set of analyses conducted in response to peer review, we also analysed decoding performance after realigning the data to the button press that indicates subjective recollection of the target associate. These analyses followed the same approach as above, except that we excluded trials in which responses occurred in the first 500ms or last 200ms to be able to plot decoding accuracy from -500 to +200ms around the button press.

Determine peak frequency of fidelity values using IRASA

The Irregular-Resampling Auto-Spectral Analysis (IRASA) method has been shown to robustly find and separate the oscillatory from the fractal signal both in ECoG and MEG data, and was here used to quantify the oscillatory signal component of the fidelity values (Wen & Liu, 2016). More specifically, the brain produces task-related rhythmic (oscillatory) components, but also arrhythmic scale-free (fractal) components (Buzsáki & Draguhn, 2004). The rhythmic oscillatory components are regular across time, whereas the fractal components are irregular (Wen & Liu, 2016).
In short, IRASA resamples a time-series signal with pairs of non-integer factors (h and 1/h) and computes the geometric mean of the power spectra of each pair of resampled signals. The median of these geometric means across resampling factors is then used as the estimate of the fractal power spectrum. The difference between the original power spectrum and the fractal power spectrum is the estimate of the power spectrum of the oscillatory component of the signal (Wen & Liu, 2016). In the present study, IRASA (as implemented in the FieldTrip toolbox) was applied to the fidelity values, in a time window from 500 to 3000ms post-cue, padding each trial length up to the next power of 2. Apart from the reasons mentioned above, this large time window also assures that low frequencies can be properly estimated.
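The padding step and the low-frequency argument can be made concrete with a short Python sketch. The sampling rate of the fidelity time courses is not stated in this section, so the 1000 Hz used below is an assumption for illustration only, and `next_pow2` is a hypothetical helper (it returns the padded length itself, not the exponent).

```python
def next_pow2(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


fs = 1000.0           # assumed sampling rate in Hz (illustration only)
window_s = 3.0 - 0.5  # analysis window: 500 to 3000 ms post-cue
n_samples = int(round(window_s * fs))

n_padded = next_pow2(n_samples)   # trial length after zero-padding
bin_spacing_hz = fs / n_padded    # spacing of the frequency bins after padding
cycles_at_3hz = 3.0 * window_s    # full cycles of a 3 Hz rhythm in the window
```

Zero-padding changes the spacing of the frequency bins but not the amount of information in the window: how well a slow rhythm can be estimated depends on how many of its cycles the 2.5 s window contains.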

To test for significance in the frequency-transformed decoding timecourses, the LDA

as of the real-label classifier could then be contrasted against this chance distribution (Cohen, 2014). Note that due to the algorithm for separating the fractal components from the oscillatory components, the output from IRASA yields a higher frequency resolution than 1 point per frequency. However, our a priori frequency range was set to 3 to 8 Hz and we therefore tested for significant oscillatory components in each 1Hz frequency bin between 3 and 8 Hz, with the estimated chance distribution subtracted from the real value and subsequently divided by the standard deviation of the estimated chance distribution (Fig. 3A-B). This gave a z-value, which was compared to the critical threshold of z = 2.32 at p = .01, correcting for 5 multiple

was again done manually to check that the alignment worked, and adjusted if it did not. The realigned model was used to reslice and segment the brain to make the axes of the voxels consistent with the head position and subsequently to extract the brain compartments. The segmented brain was then used to create a forward model (head model). In our case, we used a semi-realistic forward model (Nolte, 2003). The forward model was used to create the source model (lead field), where for each grid point the source model matrix was calculated with a 1 cm resolution, and the virtual sensors were placed 10mm below the cortical surface, and subsequently warped into each brain. In total we modelled 3294 virtual sensors for each participant with whole-brain coverage. LCMV beamforming was used to reconstruct the activity of all virtual channels in source space (see below for selecting hippocampal channels) with twenty percent regularisation.
To confirm that the source localization provided reliable results, we checked sources for the early visually elicited response to the cue word in an early

Further source-level analyses were conducted to estimate the frequency profile of the raw trial data in the hippocampal region of interest, and compare it to control regions (superior occipital cortex and precentral gyrus), with the hypothesis that the hippocampus would show stronger power in the theta frequency range (3-8Hz) than other regions. We used IRASA on each trial and each virtual channel, following the same procedure as we did when calculating the frequency profile of the fidelity values. For statistical comparison of the hippocampal profile against each of the two control regions, we directly contrasted the average power in our predefined 3-8Hz frequency window of interest using two paired-samples t-tests. Tests were conducted one-tailed because we expected higher theta power in the hippocampus compared to each of the other two regions. Since we conducted two separate tests (one per control region), we thus set the Bonferroni-corrected p-threshold for each test to .05.

of the complex number (the real and the imaginary part) representing the average maximum reinstatement peak for each participant was obtained by taking the square root of the sum of the squares of the parts (using the Pythagorean Theorem). The amplitude was arbitrarily set to 1 because we were only interested in phase angle. The phase vectors were then point-wise divided by the complex modulus. And finally, to subtract the angles from each other, one vector was rotated by multiplying it with the other's complex conjugate. To obtain the angle, the inverse tangent of the ratio of the imaginary to the real part of this product was taken. In MATLAB the following code was used:

where θ is the vector of angle difference.
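The normalisation and conjugate-multiplication steps described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, and the direction of the subtraction (angle of the first vector minus angle of the second) is an assumption.

```python
import cmath
import math


def phase_difference(a, b):
    """Wrapped angle difference angle(a) - angle(b), in radians.

    a, b: non-zero complex numbers (e.g. the complex values at the target
    and competitor reactivation peaks). The result lies in [-pi, pi].
    """
    a_unit = a / abs(a)  # set amplitude to 1: divide by the complex modulus
    b_unit = b / abs(b)
    # Multiplying by the conjugate rotates a by -angle(b); cmath.phase
    # then returns atan2(imag, real) of the product.
    return cmath.phase(a_unit * b_unit.conjugate())


def circular_mean(angles):
    """Mean direction (radians) of a list of angles, via unit vectors."""
    resultant = sum(cmath.exp(1j * angle) for angle in angles)
    return cmath.phase(resultant)


# Toy example: vectors at 100 and 40 degrees with different amplitudes.
theta = phase_difference(cmath.rect(2.0, math.radians(100)),
                         cmath.rect(0.5, math.radians(40)))
```

Working in complex space rather than subtracting raw angles keeps the result wrapped to the circle, so that, for example, 170 degrees minus -170 degrees yields -20 degrees instead of 340 degrees.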
We used the CircStat toolbox (Berens, 2009), as implemented in MATLAB, to statistically test the phase difference between target and competitor memories. More specifically, to test the difference in each repetition, as well as the average distance collapsed across all repetitions, we used a Rayleigh test for non-uniformity (circ_rtest function in the toolbox). This test will show significant clustering if the phase differences between target and competitor memories are non-uniform, independent of their absolute mean angle. To test whether the mean phase difference we obtained was significantly different from zero phase shift, we used the circ_vtest function with a specified mean angle of 0. This test indicates whether the mean angle of the phase distance is significantly different from zero. Lastly, we wanted to test the difference between the high and low intrusion groups' mean angle. To do so, we calculated the mean phase angle of target and competitor reactivation for each participant. We then calculated the circular mean of the two vectors within each participant, where the length of this vector defines the similarity between target and competitor phase angles, independent of absolute angle. This vector length is non-circular and can be subjected to a Wilcoxon signed-rank test. We chose a Wilcoxon signed-rank test as the sample size was low in each sub-group (n = 10 in each group, removing one participant from the larger group randomly to equate group size).

Cross-correlation and maximum phase difference between fidelity timecourses of target and competitor

As another part of the set of new analyses conducted in response to peer review, we wanted to establish that the phase shift between target and competitor reactivation can also be observed when using the continuous fidelity timecourses, rather than zooming in on the reactivation peaks only.
Single-trial fidelity timecourses from the third recall repetition (correct trials only), as used for the above phase-to-classifier locking analyses, were filtered at 3Hz, and their lag was quantified using a cross-correlation. A sliding-window approach was used to preserve some degree of time information and investigate whether phase lag between signals evolves over time (within trials). Each sliding window had a length of 330ms (1 divided by 3Hz), allowing us to express the lag in phase angles from -180 to 180 degrees. For each participant, 25 surrogates were computed where in each iteration, decoding was repeated 25 times, trained on random labels. The same cross-correlation procedure was then conducted on these surrogate classifiers. To obtain a z-value, the surrogate data were subtracted from the real data and subsequently divided by the standard deviation of the surrogate data. Non-parametric cluster-based permutation testing, as implemented in the FieldTrip software, was used to account for multiple comparisons across time points (500-3000ms) and phase angle (lag: -180 to 180 degrees). The threshold for statistical testing was set to a cluster alpha level of 0.05. T-values exceeding this threshold were then summed up within a cluster and compared against a distribution where condition labels were randomly assigned 1000 times with the Monte-Carlo method, following the default method implemented in FieldTrip.

To further corroborate a phase shift between the average target and competitor fidelity timecourses, we filtered the single trials from the third repetition at the relevant 3Hz frequency. We then obtained the phase using a Hilbert transform.
To statistically quantify the phase shift on a group level, we subtracted, for each time bin, the phase angle of the target from the angle of the competitor (in complex space) for each participant's averaged fidelity value, using the same method as described in Phase-amplitude coupling between MEG data and fidelity values.

Code and data availability

All data and code supporting this study are available on OSF and Zenodo.

from -200 to -50ms before cue onset. The spatial pattern (i.e., sensor amplitudes) of all gradiometers at a given time point was used as feature vector to train the classifier. A 10-fold cross-validation repeated 5 times was used to minimize dependencies between training and testing data. This analysis revealed a cluster of significantly above-chance decoding performance peaking around 150ms after image onset. B) To identify an unbiased time window (i.e., not biased towards target or competitor decoding; see: Cohen, 2014) we collapsed classification accuracy over target and competitor memories and averaged across all retrieval attempts. We then tested for changes over repetitions in this selected time window. We found significant decoding (uncorrected p < .05, with cluster not surviving stringent cluster-correction, pcluster = .15) between 1.77 and 1.93 seconds after cue onset. C)

Contrasting target decoding between first and third repetition revealed significantly higher decoding (at pcluster < .05) in a similar time window as in B), in line with the pre-registered hypothesis. Testing the 3rd repetition against chance revealed a significant (puncorr < .05) peak of target decoding ranging between 1.81 and 2.08 seconds after cue onset, which did not, however, survive stringent cluster correction (pcluster = .07). D) When instead testing for a down-regulation of competitor memories across retrieval attempts, despite a qualitative difference in the expected direction around 2 sec, the contrast between first and third repetition did not reveal any statistically significant clusters (at pcluster < .05), contrary to our preregistered hypotheses. E) Stronger competitor reactivation was found when only analysing incorrect trials across all repetitions, with peaks of early reinstatement (puncorr < .05, not surviving cluster correction) around 500ms-1500ms.

Figure 3 - supplement 2. A) An ERF analysis was conducted after preprocessing to ensure that a clean visually-evoked potential was obtained from posterior sensors. The data were low-pass filtered at 30Hz, and the vertical and horizontal gradients were summed for occipital channels. Before averaging the trials, the data were baseline corrected based on the activity at each channel from -200 to -50ms before cue onset at retrieval. Results show a typical ERF with a maximum around 200ms after cue onset. B) A variance index was calculated to check the source distribution in response to a visual stimulus. The data were source localised using an LCMV beamforming approach (Gross et al., 2001). We then subtracted the temporal variance from -200 to 0 ms pre-cue from the 0 to 200 ms post-cue variance, and subsequently divided by the pre-cue variance. We found the expected source distribution in this early time window with an occipital maximum, thus providing a basic sanity check for our source analysis approach.
C) and D) We used IRASA to extract the frequency profile of our main hippocampal region of interest and two control regions, one in superior occipital cortex (C) and one in motor cortex (precentral