Individual-specific and shared representations during episodic memory encoding and retrieval

Although human memories seem unique to each individual, they are shared to a great extent across individuals. Previous studies have examined, separately, subject-specific and cross-subject shared representations during memory encoding and retrieval, but how shared memories are formed from individually encoded representations is not clearly understood. Using a unique fMRI design involving memory encoding and retrieval, and representational similarity analysis to link representations from different individuals, brain regions, and processing stages, the current study revealed that distributed brain regions showed both subject-specific and shared neural representations during both memory encoding and retrieval. Furthermore, different brain regions showed stage-specific representational strength: the visual cortex showed greater unique and shared representations during encoding, whereas the left angular gyrus showed greater unique and shared representations during retrieval. The neural representations during encoding were transformed during retrieval, as shown by smaller cross-subject encoding-retrieval similarity (ERS) than cross-subject similarity either during encoding or during retrieval. This cross-subject and cross-stage similarity was found both within and across regions, with strong pattern similarity between the encoded representation in the ventral visual cortex (VVC) and the retrieved representation in the angular gyrus. Simulation analysis further suggested that these patterns could be achieved by incorporating stage-specific representational strength and cross-region reinstatement from encoding to retrieval, but not by a common transformation from encoding to retrieval across subjects. Together, our results shed light on how memory representations are encoded and transformed to maintain individual characteristics and at the same time to create shared representations that facilitate interpersonal communication.


Introduction
Theoretical discussions and behavioral studies have long considered memory as personal experiences that are unique to each individual. This intuitive perspective is also supported by empirical evidence. Previous studies, including neuroimaging studies, have found that in response to the same stimuli and task instructions, subjects showed significant individual variability in how many items, which items, and which precise details of each item they could remember (Kirchhoff, 2009; Loftus et al., 1992; Miller et al., 2002; Munday, 1985; Shapiro and Penrod, 1986).
Nevertheless, memory is also shared across individuals, forming strong collective memory at different levels (Hasson et al., 2004, 2008; Jääskeläinen et al., 2008; Wilson et al., 2008). Supporting the notion of shared memory experience, existing studies revealed shared neural representations (i.e., the distributed pattern of neural activities associated with given stimuli (Vilarroya, 2017)) during memory retrieval. For example, Chen et al. (2017) reported that after viewing the same movie, the cross-subject similarity of the neural activity during free recall of the movie details was significantly above the baseline. In another study, Zadbood et al. (2017) examined how memories are transmitted from one person to another and found shared response patterns across participants when watching, recalling, and listening to spoken descriptions of movie scenes.
The above studies, however, did not simultaneously examine the unique and shared representations within and across processing stages and brain regions, leaving several important questions unaddressed. First, are the retrieved representations more likely to be shared than are the encoded representations, or vice versa? It is conceivable that the encoded representations might be shared to a greater extent because subjects perceive the same stimulus whereas the retrieved representations are more subjective and uncontrollable. Alternatively, it is also possible that the encoded representations are more unique because of their high fidelity, whereas the retrieved representations are more abstract and hence are probably more likely to be shared by subjects.
Second, although memory retrieval involves the reinstatement of perceived representations, this pattern of reinstatement is a constructive process (Schacter et al., 1998; for a review, see Xue, 2018). The question thus is how representations are transformed from encoding to retrieval and whether the transformation process is similar across subjects. Memory transformation can be characterized by the change in representational content or format, as reflected by the reduced pattern similarity within brain regions between encoding and retrieval, as well as by the shift in brain regions that show item-specific representations between the two memory stages, given that different brain regions contain different aspects of the representations. For example, in our previous study (Xiao et al., 2017), we found that during encoding the visual cortex showed strong item-specific representations, but during retrieval higher-level brain regions such as the angular gyrus showed strong item-specific representations. Existing studies have shown that the representation in the ventral visual cortex (VVC) contains perceptual details, whereas that in the angular gyrus (AG) is identity-specific and invariant to viewpoints (Jeong and Xu, 2016), and is modulated by semantic similarity (Ye et al., 2016). These results thus suggest a change in representational format between encoding and retrieval.
Furthermore, we compared the encoding-retrieval similarity (i.e., ERS) with pattern similarity during encoding (i.e., encoding-encoding similarity, EES) and retrieval (retrieval-retrieval similarity, RRS) (Xiao et al., 2017). The results revealed that the ERS was smaller than the EES in the ventral visual cortex (VVC) and smaller than the RRS in the angular gyrus (AG), suggesting that the neural representational patterns during retrieval were transformed from those during encoding. In light of these stage-specific representations, we further tested and verified the hypothesis that the encoded representation in the VVC could be transformed and reinstated in associative regions such as the angular gyrus, suggesting possible cross-region reinstatement of memory representations (Xiao et al., 2017).
Besides the within-subject transformation, another study by Chen et al. (2017) compared the cross-subject similarity in retrieval (i.e., RRS) with cross-subject encoding-retrieval similarity (ERS). They found greater cross-subject RRS than cross-subject ERS, again suggesting that the encoded representation was transformed into common representations shared by different individuals during retrieval. Their simulation analysis suggested that this pattern could be achieved via a common transformation from encoding to retrieval across subjects. Such a shared transformation could conceivably result in greater shared representation during retrieval than during encoding, but this possibility was not examined in that study. In addition, the mechanism for this shared representation during retrieval is not clear.
The present study examined subject-specific and shared episodic memory representations during encoding and retrieval. Furthermore, we examined whether the two types of memory transformation, i.e., within-region memory transformation and cross-region memory reinstatement, were shared by all subjects or were specific to each individual. We adopted a design where subjects were required to study and retrieve word-picture associations. Importantly, each picture was associated with two different cues in separate runs. To capture the individual-specific and shared episodic memory representations, we generated for each subject two representational similarity matrices (RSMs) (Kriegeskorte et al., 2008) for encoding, and two matrices for retrieval, each containing one repetition of the pictures. These RSMs allowed us to compare the representations across processing stages, brain regions, and subjects (Fig. 1), and at the same time to overcome the anatomical differences between subjects and brain regions. To examine the within-region memory transformation, we compared cross-subject ERS with cross-subject EES and cross-subject RRS. To examine the cross-region memory transformation, we examined whether there was significant above-chance cross-subject, cross-region ERS. Finally, we conducted simulations to investigate whether the shared representations during retrieval were a result of a common transformation, or through stage-specific representation and cross-region memory transformation.
Fig. 1. Experiment paradigm and diagram of analytical strategies. A. Each picture was used twice and paired with two different word cues (set 1 and set 2). Half of the pictures in set 1 were exchanged with the same pictures in set 2 to form two new picture sets containing the same pictures arranged in the same order. B. Calculation of representational space. For each subject, we generated the representational spaces separately for the two data sets by processing stage (encoding and retrieval). The representational space was calculated as the correlations between each target's neural spatial pattern and that for each of the other targets from the same data set. Only the correlation coefficients in the upper triangle were used, so only one correlation coefficient for each pair of materials was used. The similarity (Pearson correlation) for representational spaces was calculated within and across subjects. Using the averaged representational similarity matrix, we calculated the similarity across processing stages, brain regions, and individuals.

Subjects and experiment design
Detailed information regarding the subjects, experimental design, behavioral data analysis, and functional magnetic resonance imaging (fMRI) data preprocessing can be found in our previously published paper (Xiao et al., 2017). Briefly, 20 healthy subjects (11 males, mean age = 20.95 ± 1.96 years, range of 18-25 years) participated in this study. Informed written consent was obtained from the participants before the experiments. The fMRI study was approved by the institutional review board of Peking University and the State Key Laboratory of Cognitive Neuroscience and Learning at Beijing Normal University in China.
Subjects studied and retrieved 96 word-picture pairs. Pictures were 48 well-known scenes, including 32 architectures (half from China and the other half from abroad) and 16 natural landscapes (half depicting water landscapes and the other half depicting terrestrial landscapes). The cues were 96 two-character Chinese verbs. Each picture was associated with two different word cues (cue 1 and cue 2). Words and pictures were randomly paired across subjects.
One day before the fMRI scan, subjects were trained to be familiar with all pictures and then to memorize all 96 word-picture associations. The overtraining paradigm was used to make sure subjects could recall the visual details of the pictures during retrieval. It included three stages: picture familiarization, self-paced word-picture association learning, and recall. The training ended once subjects could correctly report the category and four details of the picture associated with each cue. Subjects spent about 2 h in this session.
During the fMRI scan, subjects were asked to restudy the word cue-picture pairs (the encoding run) and then to recall the visual details of the pictures associated with the word cues (the retrieval run) (Fig. 1A). A slow event-related design (16 s for each trial) was used to obtain better estimates of single-trial BOLD responses for both encoding and retrieval. During encoding, each trial started with a 4 s presentation of the word cue-picture association, and subjects were asked to try to remember as many details as they could (i.e., the encoding stage). The frame of the picture then turned green for 2 s (i.e., the category judgment stage), during which subjects were asked to judge the category of the picture but to hold their response until the frame turned red and the response labels appeared on the screen. The response labels representing the four possible picture categories, i.e., domestic architectures, foreign architectures, water landscapes, and terrestrial landscapes, were introduced to prevent subjects from planning a motor response during the category judgment stage. Specifically, each response key/button corresponded to one of the four label locations (lined up from left to right) instead of the picture category, and the order of the four category labels presented on the screen was randomized across trials. Subjects had another 2 s to make the response (i.e., the response stage) according to the response labels. To prevent further processing of the word cue-picture association, subjects were then asked to do a perceptual orientation judgment task for 8 s. During this task, an arrow pointing either left or right was presented on the screen, and subjects were asked to judge the orientation of the arrow as quickly as possible. A self-paced procedure was used to make the task engaging.
The retrieval stage was similar to the encoding stage, except that, for the first 4 s, only the retrieval cue was presented, and subjects were asked to retrieve the visual details of the associated picture (Fig. 1A). For both the encoding and retrieval stages, subjects were told explicitly to focus on the details rather than simply the category of the pictures. The 48 pictures were divided into two groups (to keep the scanning time of each run within 7 min). For each group, each picture was paired with cue set 1 in one encoding-retrieval session and then with cue set 2 in the next session. In each run, the word-picture pairs were presented in random order. In total, there were four encoding-retrieval sessions.
After the scan, subjects finished an oral test outside the scanner to report the details of the picture associated with each cue. Pictures correctly recalled with more than four different kinds of details (e.g., objects in the scene, color, structure, name, and so on) were scored as remembered with details.
Image preprocessing and statistical analyses were carried out using FEAT (FMRI Expert Analysis Tool) version 5.98. The first 10 images from each run were automatically discarded by the scanner to allow for scanner equilibration. Functional images were realigned and temporally filtered (nonlinear high-pass filter with a 90 s cut-off). The EPI images were first registered to the first volume of the fifth run and then registered to the MPRAGE structural volume using Advanced Normalization Tools (ANTs) (Avants et al., 2011). Registration from structural images to the standard space was further refined using the ANTs nonlinear registration SyN (Klein et al., 2009). All fMRI analyses were performed in each subject's native space and then transformed into standard space for group analysis.

Single-trial response estimate
The GLM models were separately created for each of the 96 encoding and retrieval trials to estimate the single-trial response. A least-square single method was used, where the target trial was modeled as one experimental variable (EV), and all other trials were modeled as another EV (Mumford et al., 2012). The trial was modeled at its presentation time, convolved with a canonical hemodynamic response function (double gamma). The whole first 4 s were modeled during both encoding and retrieval for each trial. The t-statistic was used for representation similarity analysis to increase the reliability by noise normalization (Walther et al., 2016).
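The least-squares-single idea can be illustrated with a minimal sketch. It is not the FEAT pipeline used in the study: the HRF parameters, the repetition time (`tr=2.0`), the helper names `double_gamma_hrf`, `lss_design`, and `t_stat`, and the ordinary least-squares fit without prewhitening are all assumptions made for illustration.

```python
import math
import numpy as np

def double_gamma_hrf(t):
    """Illustrative double-gamma HRF (assumed shape, not FSL's exact parameters)."""
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def lss_design(onsets, target, n_scans, tr=2.0, duration=4.0, dt=0.1):
    """Design matrix for one trial: [target trial, all other trials, intercept]."""
    step = int(round(tr / dt))          # fine-grid samples per scan
    n_fine = n_scans * step
    hrf = double_gamma_hrf(np.arange(0.0, 32.0, dt))
    target_box = np.zeros(n_fine)
    other_box = np.zeros(n_fine)
    for i, onset in enumerate(onsets):
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        box = target_box if i == target else other_box
        box[start:stop] = 1.0           # boxcar over the modeled 4 s
    # Convolve on the fine grid, then sample once per scan.
    columns = [np.convolve(b, hrf)[:n_fine:step] * dt
               for b in (target_box, other_box)]
    columns.append(np.ones(n_scans))
    return np.column_stack(columns)

def t_stat(y, X, col=0):
    """OLS t-statistic for one regressor of one voxel's time series."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    var_beta = sigma2 * np.linalg.inv(X.T @ X)[col, col]
    return beta[col] / np.sqrt(var_beta)
```

Repeating `lss_design` with each trial in turn as `target` yields one t-value per trial and voxel, which is the kind of single-trial pattern that enters the similarity analysis.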

Regions of interest (ROI) definition
Following previous studies and our previous findings, 8 ROIs were defined based on the Harvard-Oxford probabilistic atlas (thresholded at 25% probability), including the bilateral ventral visual cortex (VVC, containing ventral lateral occipital cortex, occipital fusiform, occipital temporal fusiform, and parahippocampus) (Danker et al., 2017; Wing et al., 2015), angular gyrus (AG) (Kuhl and Chun, 2014), and inferior frontal gyrus (IFG) (Ritchey et al., 2013), as well as the medial prefrontal cortex (mPFC) (Guise and Shapiro, 2017) and posterior medial cortex (PMC) (Fig. 2A). The hippocampus was not included, as we did not find item-specific representation in this region in our previous study (Xiao et al., 2017). The selected regions have been consistently implicated in memory encoding and retrieval. All ROIs were defined in MNI space and then realigned to each subject's native anatomical space.

Representational space construction
To compare the within- vs. cross-subject memory representations, it is not sufficient to directly contrast a stimulus's within-subject similarity with its cross-subject similarity, due to finer differences in anatomical structure as well as differences in structure-function mapping between subjects (i.e., the same function might be localized in different voxels within a region for different individuals). This issue can be ameliorated by comparing the similarity in the representational space, which relies on the second-order similarity. The representational space, namely the representational similarity matrix (RSM), is the set of pairwise pattern correlations for all stimuli in a given brain region. Using the RSM, we could calculate the representational similarities within and between memory stages (e.g., encoding-encoding similarity, retrieval-retrieval similarity, and encoding-retrieval similarity), between brain regions (e.g., cross-region similarity), and across individuals (e.g., cross-subject similarity).
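As a minimal sketch of this second-order comparison, the following NumPy code builds an RSM from item-by-voxel patterns and correlates the upper triangles of two RSMs. The function names and the toy data (a shared item geometry projected onto different random "voxels" for two hypothetical subjects) are illustrative assumptions, not the study's data or code.

```python
import numpy as np

def rsm(patterns):
    """patterns: (n_items, n_voxels) -> (n_items, n_items) Pearson correlations."""
    return np.corrcoef(patterns)

def upper_triangle(matrix):
    """One value per item pair, excluding the diagonal."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def rsm_similarity(rsm_a, rsm_b):
    """Second-order similarity: Pearson correlation between two RSMs."""
    return float(np.corrcoef(upper_triangle(rsm_a), upper_triangle(rsm_b))[0, 1])

# Toy data: the same item geometry expressed in different voxels,
# with a different number of voxels per hypothetical subject.
rng = np.random.default_rng(0)
latent = rng.standard_normal((48, 10))
subj_a = latent @ rng.standard_normal((10, 200))
subj_b = latent @ rng.standard_normal((10, 150))
second_order = rsm_similarity(rsm(subj_a), rsm(subj_b))
```

Because an RSM depends only on relations between items, it is unchanged by reordering voxels and can be compared between subjects or regions with different voxel counts, which is the property the analysis relies on.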
Since each of the 48 pictures was associated with two cues during encoding and retrieval, this design enabled us to construct, for each subject, two RSMs for encoding and two for retrieval. This in turn enabled us to compare the within-subject similarity and cross-subject similarity during encoding and retrieval (Fig. 1). Due to the autocorrelation of the BOLD signal, the similarity of activation patterns between two stimuli might be affected by their temporal distance within a scan session. As a result, the RSM for a given set would be affected by the temporal sequence of the stimuli, which varied between the two sets within a subject as well as across subjects. To overcome this issue, we used a resampling procedure that randomly switched half of the stimuli between set 1 and set 2. Specifically, each time we randomly selected half of the stimuli from set 1 and the other half from set 2 to form one RSM, and used the remaining stimuli to form another RSM. Each "exchanged" RSM included a full set of the pictures, which were arranged in the same order. The within-subject similarity was calculated by correlating the two exchanged RSMs from the same subject. The cross-subject similarity was calculated by correlating a given subject's RSM with the RSMs from each of the remaining 19 subjects. These 19 similarities were averaged to represent one subject's cross-subject similarity (Fig. 1). The procedure was repeated 1000 times and the results were averaged to obtain more reliable estimates. We then compared the within- vs. cross-subject pattern similarity.
Meanwhile, as each exchanged RSM was noisy, we also averaged 1000 RSMs to form an averaged RSM, separately for each individual at each memory stage (encoding and retrieval). As indicated by Fig. S2, within-subject similarity increased after averaging the original noisy RSMs, and reached the maximum similarity of 1 after averaging around 400 noisy RSMs. As a result, we could only use the unaveraged, noisy RSMs to compare within- and cross-subject similarity during encoding and retrieval. When examining the shared representation during encoding and retrieval, and when comparing the cross-stage and cross-region similarity within vs. across subjects, the averaged RSMs were used for three reasons. First, compared to the individual RSMs, the averaged RSMs were more reliable and less noisy and were less affected by stimulus sequence and BOLD autocorrelation. Second, the averaged RSMs had higher statistical power to detect the possible shared representation across subjects during encoding and retrieval, if there was any. Third, it was more feasible to do the permutation test on the averaged RSMs than on the individual RSMs, which would have required 1000 permutations on each of the 1000 shuffles.
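The set-exchange resampling can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: `exchange_sets` and `resample` are hypothetical names, and returning two averaged RSMs per subject (one per exchanged half) is an assumption about how the averaging was organized.

```python
import numpy as np

def exchange_sets(set1, set2, rng):
    """Swap a random half of the items between two (n_items, n_voxels) sets.

    Both inputs hold the same pictures in the same order, so each mixed set
    still contains every picture exactly once.
    """
    n_items = set1.shape[0]
    swap = np.zeros(n_items, dtype=bool)
    swap[rng.choice(n_items, size=n_items // 2, replace=False)] = True
    mix_a = np.where(swap[:, None], set2, set1)
    mix_b = np.where(swap[:, None], set1, set2)
    return np.corrcoef(mix_a), np.corrcoef(mix_b)

def resample(set1, set2, n_iter=1000, seed=0):
    """Return the mean within-subject similarity of the noisy exchanged RSMs
    and the two RSMs averaged over all exchanges."""
    rng = np.random.default_rng(seed)
    n_items = set1.shape[0]
    iu = np.triu_indices(n_items, k=1)
    sims = np.empty(n_iter)
    mean_a = np.zeros((n_items, n_items))
    mean_b = np.zeros((n_items, n_items))
    for it in range(n_iter):
        rsm_a, rsm_b = exchange_sets(set1, set2, rng)
        sims[it] = np.corrcoef(rsm_a[iu], rsm_b[iu])[0, 1]
        mean_a += rsm_a / n_iter
        mean_b += rsm_b / n_iter
    return float(sims.mean()), mean_a, mean_b
```

In this sketch the two averaged RSMs converge toward each other as the number of exchanges grows, which mirrors the saturation of within-subject similarity described above and is why only the unaveraged RSMs are informative for the within- vs. cross-subject contrast.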

Representational similarity analysis
Three types of similarity (Pearson correlation) in RSM were calculated within each subject: between the two sets of stimuli during encoding, between the same two sets of stimuli during retrieval, and between encoding and retrieval for both sets of stimuli (Fig. 1B). Corresponding to the three types of within-subject similarity in RSM, we calculated three types of cross-subject similarity in RSM, i.e., cross-subject similarity during encoding, during retrieval, and between encoding and retrieval (i.e., ERS). The cross-subject representational similarity for a given subject was obtained by calculating the similarity between the subject's RSM and each of the 19 other subjects' RSMs (Fig. 1B), which was then averaged. As stated above, to compare within- and cross-subject representational similarity, the correlation analysis was done on each of the 1000 unaveraged noisy RSMs, and the results were obtained by averaging the 1000 correlations. To examine the degree of shared representations, we used the averaged RSM from the 1000 shuffles. All r values were then transformed into Fisher's Z scores for further analysis.
Figure caption (fragment): Cross-subject representational similarity for encoding and retrieval, using averaged RSMs. Error bars indicate the within-subject error. ***q < 0.001, **q < 0.01, *q < 0.05.
To examine the cross-region and cross-stage pattern reinstatement (Xiao et al., 2017), we also calculated the similarity between RSM in VVC during encoding and RSM in the mPFC and AG during retrieval. This was done both within and across subjects, using the averaged RSM.
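A minimal sketch of these within- and cross-subject computations is given below. The names `similarity_matrix` and `within_and_cross` are hypothetical, and averaging the r values before the Fisher transform follows the order in which the steps are described above, which is an assumption about the exact implementation. The two inputs may come from different stages (e.g., encoding vs. retrieval) or different regions (e.g., VVC at encoding vs. AG at retrieval).

```python
import numpy as np

def similarity_matrix(rsms_a, rsms_b):
    """rsms_a, rsms_b: (n_subjects, n_items, n_items).

    Entry [s, t] is the correlation between subject s's RSM in rsms_a
    and subject t's RSM in rsms_b (upper triangles only).
    """
    rows, cols = np.triu_indices(rsms_a.shape[1], k=1)
    vec_a = rsms_a[:, rows, cols]
    vec_b = rsms_b[:, rows, cols]
    n_subj = vec_a.shape[0]
    return np.corrcoef(vec_a, vec_b)[:n_subj, n_subj:]

def within_and_cross(rsms_a, rsms_b):
    """Fisher-z within-subject and cross-subject similarity per subject."""
    sim = similarity_matrix(rsms_a, rsms_b)
    n_subj = sim.shape[0]
    within = np.diag(sim)
    off_diagonal = ~np.eye(n_subj, dtype=bool)
    # Average each subject's similarity to the other n_subj - 1 subjects.
    cross = (sim * off_diagonal).sum(axis=1) / (n_subj - 1)
    return np.arctanh(within), np.arctanh(cross)
```

The within-subject value is only meaningful when `rsms_a` and `rsms_b` are different matrices; passing the same array twice gives a within-subject correlation of 1, whose Fisher transform is infinite.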

Permutation test of significance
We used a permutation test to determine statistical significance. In particular, we examined (1) whether there were significant shared representations during encoding and retrieval, and between encoding and retrieval, and (2) whether there was significant within-subject pattern reinstatement during retrieval. To test the significance of shared representations (cross-subject similarity) during encoding or retrieval, we randomly shuffled the labels of the pictures and generated an RSM based on the shuffled picture labels, which was then correlated with the original RSM from each of the remaining 19 subjects. The results were averaged to generate one cross-subject similarity for each shuffle. This was done 1000 times, and the mean was used as the baseline for this subject. When this was done for all subjects, a paired sample t-test was conducted between the similarities of the original RSM and the baseline. To test the within-subject pattern reinstatement, we examined the significance of within-subject ERS. The baseline was obtained by the correlation between the shuffled RSM during encoding and the original RSM during retrieval from the same subject.
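The label-shuffling baseline for shared representations can be sketched as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, shuffling picture labels is implemented as a joint permutation of the rows and columns of one subject's RSM, and the paired t statistic is computed by hand rather than with a statistics package.

```python
import numpy as np

def mean_similarity_to_others(rsm_subject, rsms_others):
    """Average correlation between one RSM and each of the other subjects' RSMs."""
    rows, cols = np.triu_indices(rsm_subject.shape[0], k=1)
    own = rsm_subject[rows, cols]
    return float(np.mean([np.corrcoef(own, other[rows, cols])[0, 1]
                          for other in rsms_others]))

def shuffled_baseline(rsm_subject, rsms_others, n_perm=1000, seed=0):
    """Mean cross-subject similarity after shuffling this subject's picture labels."""
    rng = np.random.default_rng(seed)
    n_items = rsm_subject.shape[0]
    null = np.empty(n_perm)
    for p in range(n_perm):
        perm = rng.permutation(n_items)
        # Relabel pictures: permute rows and columns together.
        shuffled = rsm_subject[np.ix_(perm, perm)]
        null[p] = mean_similarity_to_others(shuffled, rsms_others)
    return float(null.mean())

def paired_t(observed, baseline):
    """Paired-samples t statistic across subjects (df = n - 1)."""
    diff = np.asarray(observed, dtype=float) - np.asarray(baseline, dtype=float)
    return float(diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size)))
```

Computing `mean_similarity_to_others` and `shuffled_baseline` for every subject gives the two paired vectors that enter `paired_t`. For the within-subject ERS test, the same shuffling would be applied to the encoding RSM before correlating it with the same subject's retrieval RSM.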
We also used the permutation test to examine (3) whether there was subject-specific representation, by comparing within- vs. cross-subject representations. In this analysis, we randomly permuted the labels of within/cross-subject similarity 1000 times. For each permutation, a difference value between within- and cross-subject similarity was calculated. We then constructed a null distribution of the differences based on the 1000 permutations, and used this distribution to determine the p-value of the within- vs. cross-subject comparison. The statistical significance was also corrected for multiple comparisons across all comparisons using FDR (threshold q = .05).
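One way to realize this label permutation and the FDR correction is sketched below. Two points are assumptions rather than details given in the text: the within/cross labels are exchanged separately for each subject (equivalent to flipping the sign of that subject's difference), and FDR is implemented with the Benjamini-Hochberg procedure. The function names are hypothetical.

```python
import numpy as np

def paired_label_permutation(within, cross, n_perm=1000, seed=0):
    """Two-sided permutation p-value for mean(within - cross)."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(within, dtype=float) - np.asarray(cross, dtype=float)
    observed = diff.mean()
    # Swapping the two labels of a subject flips the sign of its difference.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null = (signs * diff).mean(axis=1)
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return float(observed), float(p_value)

def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    ranked = p[order] * p.size / np.arange(1, p.size + 1)
    # Enforce monotonicity from the largest p-value downwards.
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty_like(p)
    q[order] = np.minimum(q_sorted, 1.0)
    return q
```

A comparison would then be declared significant when its q-value falls below the .05 threshold.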

Simulation study
To further investigate the nature of memory transformation, we used simulation to examine the relative contributions of the following factors to memory transformation: (1) stage-specific strength of representation, because different brain regions are involved to a different extent during encoding and retrieval (e.g., greater representational strength in the VVC during encoding and greater representational strength in the AG during retrieval) (Xiao et al., 2017); (2) shared memory transformation (by adding a common random pattern across subjects to their encoding patterns); and (3) cross-region reinstatement (by adding the encoded pattern from another brain region). Please note that this encoded pattern from another brain region was both idiosyncratic to each subject and shared across subjects according to our experimental results.
Following Chen et al. (2017), we simulated neural activation patterns (125 features each) for five subjects, each containing two brain regions, using the following equation:
P_encoding = (subject-shared pattern + subject-specific pattern * k_specific + region-specific pattern * k_region) * k_strength_encoding + random noise
Subject-shared patterns were constructed by creating five identical 125-element/voxel random vectors, whereas the subject-specific patterns were five unique 125-voxel random patterns. The strength of the subject-specific pattern was controlled by k_specific. The region-specific pattern was a 125-voxel random pattern shared by subjects, whose strength was controlled by k_region. The k_strength_encoding indicates stage-specific representational strength during encoding. The random noise is also a 125-voxel random pattern unique to each subject. The strength of the random noise was controlled by the "noise strength factor" (k_noise).
During retrieval, we assumed two mechanisms, a common transformation pattern shared by subjects and cross-region transformation, so the pattern during retrieval was:
P_retrieval = (P_encoding + common_pattern * k_trans_shared + P_encoding(other_region) * k_CR) * k_strength_retrieval + random noise
where k_trans_shared indicates the strength of the common transformation pattern shared by subjects and k_CR indicates the strength of cross-region transformation. k_strength_retrieval indicates stage-specific representational strength during retrieval.
We systematically varied the following parameters: (1) stage-specific strength of representation during encoding and retrieval (k_strength_encoding/retrieval, ranging from 0 to 1, 4 steps, with smaller values indicating weaker representation, e.g., the AG during encoding and VVC during retrieval); (2) the extent of common transformation shared by subjects (k_trans_shared, ranging from 0 to 1, 4 steps, with bigger values indicating stronger shared transformation and 0 representing no shared transformation); (3) the extent of cross-region reinstatement (k_CR, ranging from 0 to 1, 4 steps, with bigger values representing stronger cross-region reinstatement and 0 representing no cross-region reinstatement); (4) the strength of the random noise (k_noise, ranging from 1 to 4, 3 steps, with bigger values indicating stronger random noise).
Please note that for some parameters, such as subject-specific and region-specific representations and noise strength factor, we initially tried a wide range of values, and narrowed down to one value (i.e., 0.6 for subject-specific and region-specific representation, and 4 for noise strength factor) for each parameter to achieve the best match with the experimental data. To reduce the complexity of the results, we only reported results using that value to demonstrate the effect of these factors. As shown in supplementary results, the selection of different values only affected the absolute similarity values, not the overall pattern of results across conditions. For the four critical parameters, namely stage-specific strength of representation during encoding and retrieval (k_strength_encoding/retrieval) and the strength of common (k_trans_shared) and cross-region (k_CR) transformation, we chose relatively finer steps. The simulation was done 1000 times. For each simulation, we calculated within- and cross-subject similarity during encoding and retrieval, within- and cross-subject encoding-retrieval similarity (ERS), and cross-region ERS.
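The two generative equations can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' simulation code. Several choices are assumptions because the text does not specify them: the number of items (`N_ITEMS`), drawing the subject-shared and subject-specific patterns independently for each region and item, using a single common transformation pattern for all items, and measuring cross-subject similarity as the same-item pattern correlation between different subjects.

```python
import numpy as np

N_SUBJ, N_REGION, N_VOX = 5, 2, 125   # values given in the text
N_ITEMS = 48                          # assumption: not stated for the simulation

def simulate(k_enc, k_ret, k_trans_shared, k_cr,
             k_specific=0.6, k_region=0.6, k_noise=4.0, seed=0):
    """Return encoding and retrieval patterns, each (subject, region, item, voxel).

    k_enc, k_ret: per-region stage-specific strengths (length-2 sequences).
    """
    rng = np.random.default_rng(seed)
    k_enc = np.asarray(k_enc, dtype=float).reshape(1, N_REGION, 1, 1)
    k_ret = np.asarray(k_ret, dtype=float).reshape(1, N_REGION, 1, 1)
    full = (N_SUBJ, N_REGION, N_ITEMS, N_VOX)
    shared = rng.standard_normal((1, N_REGION, N_ITEMS, N_VOX))  # same for all subjects
    specific = rng.standard_normal(full)                         # unique per subject
    region = rng.standard_normal((1, N_REGION, 1, N_VOX))        # one pattern per region
    common = rng.standard_normal((1, 1, 1, N_VOX))               # shared transformation
    enc = (shared + k_specific * specific + k_region * region) * k_enc \
        + k_noise * rng.standard_normal(full)
    other_region = enc[:, ::-1]       # encoding pattern of the other region
    ret = (enc + k_trans_shared * common + k_cr * other_region) * k_ret \
        + k_noise * rng.standard_normal(full)
    return enc, ret

def cross_subject_similarity(x, y):
    """x, y: (n_subj, n_items, n_vox). Mean same-item pattern correlation
    over all ordered pairs of different subjects."""
    xz = (x - x.mean(-1, keepdims=True)) / x.std(-1, keepdims=True)
    yz = (y - y.mean(-1, keepdims=True)) / y.std(-1, keepdims=True)
    r = np.einsum('siv,tiv->sti', xz, yz) / x.shape[-1]
    off_diagonal = ~np.eye(x.shape[0], dtype=bool)
    return float(r[off_diagonal].mean())
```

For example, with region 0 standing in for the VVC and region 1 for the AG, `enc, ret = simulate((1.0, 0.33), (0.33, 1.0), k_trans_shared=0.0, k_cr=1.0)` followed by `cross_subject_similarity(enc[:, 0], ret[:, 1])` gives a cross-subject, cross-region ERS, and passing the same stage and region twice gives cross-subject EES or RRS.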
Code and data availability. The code and data supporting the findings of this study are available from the corresponding author upon request, with a formal data-sharing agreement.

Item-level vs. representational space level similarity
Subjects performed very well during the memory test in the scanner (hits 94.3 ± 5.1%). The post-scan test further showed that subjects could correctly report more than four details associated with each retrieved picture. Together, the behavioral results suggest that overtraining before the scan was effective.
In our previous study (Xiao et al., 2017), item-specific memory representation was calculated by examining the pattern similarity between each pair of stimuli sharing the same visual pictures but different cues (i.e., first-order similarity). In the current study, to facilitate within- vs. cross-subject comparison, we examined the item-specific representation by examining the similarity between two representational similarity matrices (RSMs) consisting of an identical set of pictures (i.e., second-order similarity). Due to the autocorrelation of the BOLD signal, the similarity of activation patterns between two stimuli was affected by their temporal distance within a scan session. We thus shuffled the data by randomly switching half of the stimuli between set 1 and set 2. Within- and cross-subject similarities were then calculated using these exchanged, noisy RSMs. As shown in Fig. S1, after around 800 exchanges, the averaged within-subject similarity became stable. To be conservative, we used the results based on the average of 1000 exchanges for subsequent statistical analyses. To validate this second-order analysis, we directly compared the results of first-order similarity and second-order similarity. We found that the overall patterns were quite similar, suggesting that both measures could reliably capture the item-specific neural representations (Fig. S3).
Although we found shared representations during both encoding and retrieval, different ROIs showed different patterns: the RAG and LAG only showed shared representations during encoding and retrieval, respectively, whereas the VVC, LIFG, RIFG, mPFC, and PMC showed shared representations during both encoding and retrieval. Direct comparisons showed that cross-subject similarity was significantly greater during encoding than during retrieval in the LVVC (F(1,19) = 39.298, q < 0.001) and RVVC (F(1,19) = 57.047, q < 0.001), whereas it was greater during retrieval than during encoding in the LAG (F(1,19) = 7.148, q = 0.024) and mPFC (F(1,19) = 7.231, q = 0.019) (Fig. 2B). These findings suggest that there were shared representations during both memory encoding and retrieval, but in different regions.

Comparing shared and individual-specific representations during encoding and retrieval
Having revealed both subject-specific and shared representations during encoding and retrieval, we then examined whether subject-specific representations were stronger than shared representations. In this analysis, within- and cross-subject similarities were calculated using the noisy, unaveraged RSMs. This was done on 1000 exchanged RSMs (between set 1 and set 2) and the results were averaged.

Subject-specific and shared pattern reinstatement during memory retrieval
The above analyses revealed both shared and subject-specific representations during encoding and retrieval. Existing studies suggest that the encoded representation might be reinstated during retrieval, showing significant item-specific encoding-retrieval similarity (ERS) (Xiao et al., 2017). This pattern reinstatement has also been found across subjects, as revealed by the significant cross-subject ERS . Nevertheless, it is still unknown whether there is a subject-specific pattern reinstatement.
To investigate this question, we calculated both within- and cross-subject ERS, using averaged RSMs. We found that all regions showed significant within- and cross-subject ERS compared with baseline (all qs < 0.024, corrected for multiple comparisons), except for the RVVC, LAG, RAG and RIFG, which did not show significant within-subject ERS (RVVC: t(19) ...). Interestingly, although within-subject ERS was numerically greater than cross-subject ERS, the direct comparison revealed no significant differences (all ps > .08, without correction for multiple comparisons) (Fig. 4). These results thus suggest strong shared pattern reinstatement across subjects, but no evidence for subject-specific reinstatement, which is consistent with a previous study showing that the within-subject ERS was only slightly greater than cross-subject ERS (with two voxels in the temporoparietal junction surviving correction).

Shared within-region memory transformation from encoding to retrieval
The lack of subject-specific reinstatement might be due to overall weak ERS, as a result of memory transformation (Xiao et al., 2017), as well as shared memory transformation across subjects. For example, there was greater pattern similarity during encoding (i.e., encoding-encoding similarity, EES) and retrieval (i.e., retrieval-retrieval similarity, RRS) than the similarity between encoding and retrieval (ERS) (Xiao et al., 2017). Furthermore, there was greater cross-subject RRS than cross-subject ERS.
To examine the shared representational transformation from encoding to retrieval in our study, we compared cross-subject ERS with cross-subject similarity during encoding and retrieval. We found that cross-subject ERS was smaller than cross-subject EES in the VVC (LVVC: F(1,19) = 42.097, q < 0.001; RVVC: F(1,19) = 46.876, q < 0.001). Meanwhile, cross-subject ERS was also marginally smaller than cross-subject RRS in the LAG (F(1,19) = 3.07, q = 0.096) (Fig. 4). This evidence indicates that the representational patterns might be transformed in a systematic way across subjects from encoding to retrieval in the VVC and LAG. Together, although using very different designs and materials, our results showed a pattern similar to that of previous studies (Zadbood et al., 2017).
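The logic of this comparison can be sketched as follows. The example assumes one vectorised RSM per subject and stage, Pearson correlation as the second-order metric, and a toy retrieval structure that only partly overlaps with the encoding structure; these choices and all names are illustrative assumptions rather than the analysis code used here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_subject_similarity(rsms_x, rsms_y):
    """Mean correlation between subject s in rsms_x and each other subject t in rsms_y."""
    n = len(rsms_x)
    vals = [pearson(rsms_x[s], rsms_y[t])
            for s in range(n) for t in range(n) if s != t]
    return sum(vals) / len(vals)


def stage_similarities(enc, ret):
    """Cross-subject EES, RRS and ERS from per-subject encoding/retrieval RSMs."""
    return {
        "EES": cross_subject_similarity(enc, enc),
        "RRS": cross_subject_similarity(ret, ret),
        "ERS": cross_subject_similarity(enc, ret),
    }


# Toy data (hypothetical): the retrieval structure keeps part of the encoding
# structure and adds a new component that is common to all subjects.
rng = random.Random(2)
n_subjects, n_pairs = 20, 45
enc_struct = [rng.gauss(0, 1) for _ in range(n_pairs)]
new_struct = [rng.gauss(0, 1) for _ in range(n_pairs)]
ret_struct = [0.5 * e + 0.5 * m for e, m in zip(enc_struct, new_struct)]
enc = [[v + rng.gauss(0, 0.5) for v in enc_struct] for _ in range(n_subjects)]
ret = [[v + rng.gauss(0, 0.5) for v in ret_struct] for _ in range(n_subjects)]
print({k: round(v, 3) for k, v in stage_similarities(enc, ret).items()})
```

A cross-subject ERS that falls below both the cross-subject EES and RRS is the signature interpreted above as a representational change between stages; passing encoding RSMs from one region and retrieval RSMs from another to the same function gives a cross-region variant.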

Subject-specific and shared cross-region pattern reinstatement
Representational reinstatement not only occurs within a brain region, but also across regions, such that the encoded representation in one region is reinstated in a different region during retrieval (Xiao et al., 2017), due to their differential roles in memory encoding and retrieval (See Introduction). It is still unclear whether cross-region reinstatement can be found across subjects.

Simulation analysis to examine the mechanisms of memory transformation
The above analysis revealed several important features regarding shared and subject-specific representations, as well as memory transformations across memory stages and brain regions. First, we found that the same brain regions contained both shared and subject-specific representations, and that there was greater within-subject similarity than cross-subject similarity during both encoding and retrieval, in both the visual cortex and higher-order brain regions, such as the LAG. Second, the representational strength was stage-specific across regions, with the VVC and LAG showing stronger representational strength during encoding and retrieval, respectively. Third, cross-subject ERS was smaller than cross-subject similarity during encoding and retrieval, suggesting that the representation was transformed. Fourth, within-subject ERS was numerically bigger but statistically comparable to cross-subject ERS, suggesting some common transformation shared by subjects. Finally, there existed cross-stage and cross-region reinstatement both within and across subjects, with stronger within-subject reinstatement in the LAG.
Fig. 4. Comparison of cross-subject ERS, encoding similarity, retrieval similarity, and within-subject ERS, using the averaged RSMs. Error bars indicate standard error. ***q < 0.001, **q < 0.01, *q < 0.05. Error bars indicate the within-subject error. ***q < 0.001, **q < 0.01, *q < 0.05.
Fig. 6. Simulation results. The top row (A) shows the result when no stage-specific representation strength was introduced for VVC and AG; in this case, the representation patterns of VVC and AG were the same. The bottom two rows (B & C) show the results when stage-specific representation strength was included (for AG: k_strength_encoding = 0.5, k_strength_retrieval = 1; for VVC: k_strength_encoding = 1, k_strength_retrieval = 0.5). Each plot shows pattern similarity as a function of shared within-region transformation across subjects (k_trans_shared, varying from 0 to 1, with 4 steps). The five panels from left to right show the results with different levels of cross-region reinstatement (k_CR, varying from 0 to 1, with 4 steps). It is clear that the stage-specific representation is necessary to generate greater pattern similarity during encoding than during retrieval (e.g., VVC, Fig. 6A vs. Fig. 6B), and stage-specific representation strength together with cross-region reinstatement could generate greater pattern similarity during retrieval than during encoding (e.g., AG, Fig. 6A vs. Fig. 6C). ERS: encoding-retrieval similarity; ENC: encoding; RET: retrieval; WS: within-subject; BS: between-subject.
A critical question then is how these memory transformations are achieved. One possibility, as suggested by Chen et al. (2017), is that a common transformation could be introduced from encoding to retrieval, by adding a common memory pattern to all subjects at retrieval. Alternatively, motivated by the findings of stage-specific representational strength, as well as cross-region transformation in the current study, the comparable cross- and within-subject ERS might be due to the overall small ERS as a result of memory transformation, and the common transformation might result from the cross-region transformation (e.g., from VVC to AG), given the shared representation in VVC during encoding.
In this section, we used simulation to adjudicate these hypotheses. In particular, we focused on three factors: (1) the stage-specific strength of memory representation in different brain regions (k_strength_encoding/retrieval), (2) the extent of shared cross-stage transformation (k_trans_shared), and (3) the extent of cross-region reinstatement (k_CR) (see Methods).
Our simulation showed that these three factors affected the pattern similarities in different ways. Specifically, the shared transformation alone increased retrieval similarities (blue lines) (Fig. 6A, left panel), consistent with Chen et al. (2017). A weaker stage-specific representation strength decreased the pattern similarities during encoding and retrieval, as well as the ERS (Fig. 6B and C). Finally, the cross-region reinstatement alone increased the retrieval similarities enough to surpass encoding similarities and ERS (Fig. 6A).
We then asked which mechanism(s) could better account for the empirical data. Our simulation showed that although the shared transformation mechanism increased cross-subject retrieval similarity (blue lines) (Fig. 6), it alone did not lead to greater cross-subject retrieval similarity than cross-subject encoding similarity in AG (Fig. 6A, left panel). Also, the strength of shared transformation did not affect the cross-region pattern similarity (Fig. 7, left panel). In contrast, with stage-specific representation (especially the weak representation during encoding for AG), we reproduced the greater pattern similarity during retrieval than during encoding (Fig. 6C, right panel). Combined with increasing strength of cross-region transformation, the cross-subject retrieval similarity surpassed cross-subject ERS (Fig. 6), and the between-region ERS (black lines) was comparable to the within-region ERS (red lines) (Fig. 7).
Taken together, the simulation results suggest that stage-specific representational strength and cross-region reinstatement can fit the observed data very well, whereas the shared transformation mechanism alone cannot.
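To make the roles of two of these factors concrete, the following toy simulation may be helpful. It is not the simulation reported here, whose generative model is specified in the Methods; it is a simplified sketch under loudly stated assumptions: each region's content is the sum of a shared and a subject-specific Gaussian component, stage-specific strength scales that content before unit-variance noise is added, and cross-region reinstatement adds the VVC content to the AG retrieval pattern with weight k_cr. The shared-transformation factor (k_trans_shared) is deliberately omitted, and the numbers of subjects and features are arbitrary.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_subject(xs, ys):
    """Mean correlation between subject s in xs and each other subject t in ys."""
    n = len(xs)
    vals = [pearson(xs[s], ys[t]) for s in range(n) for t in range(n) if s != t]
    return sum(vals) / len(vals)


def simulate(k_enc, k_ret, k_cr, n_subjects=6, n_features=1000, seed=0):
    """Toy model (assumed form): k_enc/k_ret map region -> strength."""
    rng = random.Random(seed)

    def vec():
        return [rng.gauss(0, 1) for _ in range(n_features)]

    def add_noise(v):
        return [a + rng.gauss(0, 1) for a in v]

    # Content per region: one shared plus one subject-specific component.
    content = {}
    for region in ("VVC", "AG"):
        shared = vec()
        content[region] = [[a + b for a, b in zip(shared, vec())]
                           for _ in range(n_subjects)]

    enc = {r: [add_noise([k_enc[r] * a for a in content[r][s]])
               for s in range(n_subjects)] for r in ("VVC", "AG")}
    ret_vvc = [add_noise([k_ret["VVC"] * a for a in content["VVC"][s]])
               for s in range(n_subjects)]
    # AG retrieval: own content plus VVC content reinstated with weight k_cr.
    ret_ag = [add_noise([k_ret["AG"] * a + k_cr * v
                         for a, v in zip(content["AG"][s], content["VVC"][s])])
              for s in range(n_subjects)]
    return {
        "VVC_EES": cross_subject(enc["VVC"], enc["VVC"]),
        "VVC_RRS": cross_subject(ret_vvc, ret_vvc),
        "AG_EES": cross_subject(enc["AG"], enc["AG"]),
        "AG_RRS": cross_subject(ret_ag, ret_ag),
        "AG_ERS": cross_subject(enc["AG"], ret_ag),
        "VVC_to_AG_ERS": cross_subject(enc["VVC"], ret_ag),
    }


# Strength values follow the Fig. 6 caption; everything else is hypothetical.
k_enc = {"VVC": 1.0, "AG": 0.5}
k_ret = {"VVC": 0.5, "AG": 1.0}
without_cr = simulate(k_enc, k_ret, k_cr=0.0)
with_cr = simulate(k_enc, k_ret, k_cr=1.0)
for name in with_cr:
    print(name, round(without_cr[name], 2), round(with_cr[name], 2))
```

In this simplified setting, weaker encoding strength in the AG lowers its cross-subject encoding similarity relative to retrieval, and a non-zero k_cr produces cross-subject similarity between the VVC encoding pattern and the AG retrieval pattern. The sketch is meant only to convey the direction of these effects, not their magnitude in Fig. 6 or Fig. 7.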

Discussion
Combining a unique experimental design with cross-subject representational similarity analysis, the current study for the first time systematically examined within- vs. cross-subject representations during memory encoding and retrieval, as well as the transformation of representations between the two processing stages. Our results revealed both individual-specific and shared memory representations in distributed brain regions, with the visual cortex showing greater item-specific representation (both subject-specific and shared) during encoding and the angular gyrus showing greater item-specific representation (both subject-specific and shared) during retrieval. Importantly, we found that memory representation was transformed systematically across regions and stages. Our simulation captured the response pattern well by incorporating stage-specific representational strength and cross-region pattern reinstatement. Together, these results help to advance our understanding of the nature of common and unique memory representations among different individuals and their dynamic changes.
Given the personalized nature of individuals' memories, it is of great surprise and significance to find common neural representations during memory retrieval (Charest et al., 2014; Chen et al., 2017). However, the degree of commonality and uniqueness of memory representations across individuals remains unclear. One previous study found that individuals' semantic representation patterns could predict their false memory responses better than could the group-averaged responses (Chadwick et al., 2016). Our study represents a major extension of existing studies by providing a comprehensive picture of the unique and shared representations during memory encoding and retrieval. In particular, we found that almost all regions showed significant shared representations across subjects, but there was also much greater within- than cross-subject similarity, during both encoding and retrieval. This provides clear neural evidence to support the behavioral observations that each brain has distinct representational patterns in perceiving, remembering, and recalling an event, although there are also cross-subject similarities.
Fig. 7. Simulation of cross-region pattern similarity for VVC to AG. Each plot shows pattern similarity as a function of shared within-region transformation across subjects (k_trans_shared, varying from 0 to 1, with 4 steps). The five panels from left to right show the results with different levels of cross-region reinstatement (k_CR, varying from 0 to 1, with 4 steps). With increasing strength of cross-region transformation, cross-region similarity (black lines) increased to match and even surpass within-region similarity (red lines). The strength of shared within-region transformation had no significant effect on the cross-region pattern similarity. WR: within-region; BR: between-region; WS: within-subject; BS: between-subject.
Interestingly, we found that the degree of shared representations was modulated by processing stages: whereas the bilateral VVC showed greater shared representations during encoding, the AG and mPFC showed greater shared representations during retrieval. This is consistent with a previous study of shared representations during movie watching and recall. Using different analytical approaches, the current study and a previous study (Xiao et al., 2017) found the same pattern of results that emphasizes the role of VVC and AG in representing high-fidelity item-level information during encoding and retrieval, respectively. Consistently, another study only found category-level but not item-level reinstatement in the VVC during retrieval (Kuhl and Chun, 2014), and it was not possible to reconstruct the perceived images from the occipitotemporal representations during working memory (Lee and Kuhl, 2016). Given these regions' distinct roles in memory encoding and retrieval, it might be reasonable to argue that the shared memory representations during recall in the high-level cortical regions are mainly due to their representational role in memory retrieval. Our simulation analysis further emphasizes the importance of stage-specific memory representations.
These results suggest that retrieval might not involve a faithful reinstatement of encoded representations, but rather may engage additional abstraction processes to enable the formation of conceptual knowledge from perceptual experience (Binder and Desai, 2011). Indeed, the AG and VVC have been posited to represent distinct information: the representation in the VVC contains perceptual details, whereas the representation in the AG is identity-specific and invariant to viewpoints (Jeong and Xu, 2016), and is modulated by semantic similarity (Ye et al., 2016). In addition, the cross-subject encoding-retrieval similarity was smaller than the cross-subject similarity during encoding or retrieval, suggesting that the representation in the same regions might be altered from encoding to retrieval. Combining both lines of evidence, we further hypothesized and verified that this abstraction might be achieved by cross-region reinstatement, i.e., the encoded representation in VVC was reinstated in the mPFC and AG. In particular, we found that the VVC-LAG ERS was comparable to the LAG-LAG ERS, suggesting some cross-region information transformation.
It should be noted that encoding and retrieval involve different task structures and visual inputs, which might have contributed to the different activation patterns in encoding and retrieval. For example, we found a significant shift in brain regions that showed shared and subject-specific representations during encoding (e.g., VVC) and retrieval (e.g., AG), which could be due to the task differences between the two memory stages. Nevertheless, by focusing on the neural representations, our data also revealed changes in representational format between encoding and retrieval. Furthermore, the second-order pattern analysis, like the first-order analysis of item-specific representation used in the previous study (Xiao et al., 2017), could control the effect of common cognitive processes shared by items on the representation patterns. As a result, the differences between encoding and retrieval within regions mainly reflected the change of item-specific representations. One potential caveat is that this approach could not account for the possible task by stimuli interactions, which could affect the representational structure. Future studies should further examine the nature of representations during encoding and retrieval, such as the representational format, content, and dimension, to further elucidate the transformative nature of memory retrieval.
The current study revealed no significant differences between within- and cross-subject ERS, although the former was numerically greater. This is also quantitatively consistent with a previous observation that only a few voxels showed greater within- than cross-subject ERS. Inconsistent with the previous study, which found significantly greater cross-subject retrieval similarity than cross-subject ERS in the PMC, the current study only found a marginally significant difference in the angular gyrus but not the PMC. This could be due to the obvious differences between the two studies in their experimental design, materials and analytical approaches. First, the materials used in the current study were rather homogeneous, as compared to various events in a movie in Chen et al. (2017). This could reduce the overall statistical power in the representational similarity analysis. Second, we did not ask subjects to verbally describe the visual scenes during memory retrieval, which probably reduced the encoding-to-retrieval transformation. Third, the visual representation might be less likely to be shared across individuals than the abstract verbal description, reducing the cross-subject similarity during retrieval. Future studies should examine the shared and unique memory representations using different memory paradigms.
What could lead to the stronger cross-subject similarity during retrieval than ERS, and also the comparable within- and cross-subject ERS? As suggested by Chen and colleagues, this observation could reflect a common memory transformation. Our simulation results suggest that adding a strong new pattern shared by all subjects during retrieval could simulate the greater retrieval similarity than cross-subject ERS, but this common transformation alone could not generate the stage-specific representation strength, the comparable within- vs. cross-subject ERS, or the comparable cross- vs. within-region ERS in the LAG. Still, it is unclear how this common transformation could be achieved. Unlike other studies (Hasson et al., 2012; Zadbood et al., 2017), the experimental paradigm used in the current study and Chen et al. (2017) did not engage social interaction or social exchange.
In contrast, by incorporating stage-specific representational strength and cross-region reinstatement, as revealed by our data, we were able to simulate the greater cross-subject retrieval similarity than cross-subject ERS. Since the encoded representation in VVC contained both shared and subject-specific features, this cross-region transformation is thus both subject-specific and shared, which could well account for the subject-specific cross-region transformation. It should be noted that in Chen et al.'s study, a subject-specific random pattern was added to the common transformation pattern, which was termed noise in memory transformation. Without this noise, it is impossible to obtain greater within- than cross-subject similarity during retrieval. Together, our results suggest that the cross-region transformation mechanism could well account for the empirical data, as well as provide a good mechanism for the origin of subject-specific and shared patterns of cross-stage memory transformation.
How the cross-region pattern reinstatement is achieved in the brain is still unclear. One possibility is that different brain regions contain different aspects of representations of the stimuli, which share the same representational structures, allowing the consistent mapping from one brain region to another brain region during information transformation. It has been shown that neural information flow is reversed between object perception and object reconstruction from memory (Linde-Domingo et al., 2019). In particular, when seeing an object, low-level perceptual features were discriminated faster and could be decoded from brain activity earlier than high-level conceptual features. During retrieval, the encoded representation was actively reconstructed and represented in the angular gyrus. This is achieved through the hippocampal pattern reinstatement mechanisms (Estefan et al., 2019; Lohnas et al., 2018; Xiao et al., 2017), by integrating the information from the visual cortex as well as long-term knowledge. Future studies should examine the retrieval processes using measures of brain activation patterns with higher spatiotemporal resolution.
The capacity to share memories is essential for our ability to interact with others and form social groups, and the current study provides a practical methodological framework to study cross-subject representation. Social interaction has been primarily studied using neural synchronization across subjects in the temporal domain (Nummenmaa et al., 2018), but cross-subject pattern similarity analysis could provide a more direct measure of the shared mental representation. So far, this type of analysis has been mainly conducted on fMRI data in terms of spatial patterns, but it could be readily extended to the spatiotemporal domain (Feng et al., 2019; Lu et al., 2015). Several methodological caveats should be mentioned here. First, as emphasized in our analysis, the autocorrelation of the BOLD signal would affect the activation pattern, which could create artificial patterns of similarities when the trial orders are matched between two representational similarity matrices (RSMs), or decrease the similarity results if the orders are not matched. The random exchange could help to solve this issue, but it increases the computational demand and makes the whole-brain searchlight analysis less feasible. Second, the similarity between two RSMs might not suggest the same representational features, but rather some higher-order similarity. Third, the regions were defined based on the group-based anatomical template. Recent studies have shown with dense sampling of resting-state data that we can achieve precise parcellation of individual brains that are very well aligned with the functional architecture (Braga and Buckner, 2017; Choe et al., 2015; Gordon et al., 2017; Laumann et al., 2015; O'Connor et al., 2017). It remains to be seen whether we can reveal stronger shared representations using individualized brain parcellation.
Finally, the current study overtrained the subjects to achieve high memory performance during retrieval. This is critical for us to examine the representation during retrieval both within and across subjects. This overtraining, however, might have changed the encoding processes, stabilized the representation (Huang et al., 2013; Wiestler and Diedrichsen, 2013), and made the encoding and retrieval processes more similar (i.e., the restudy of the same item might involve the retrieval of a prior representation, termed study-phase retrieval). Although these factors might reduce the differences between encoding and retrieval, the fact that we still found low encoding-retrieval similarity provides strong evidence that memory is transformed. We would expect even greater transformation for one-shot learning, an interesting hypothesis that could be examined in future studies.
Taken together, this study tested the nature of the individual-specific and shared memory representations simultaneously, and also tested their transformation patterns from encoding to retrieval. Our results underscore the stage-specific representation and reconstructive nature of memory. Future research should further investigate how social interaction shapes the subject-specific and shared memory representation.

Declaration of competing interest
The authors declare no conflict of interest.