Repetition suppression to faces in the fusiform face area: A personal and dynamic journey

I review a number of fMRI studies that investigate the effects of repeating faces on responses in the fusiform face area (FFA). These studies show that repetition suppression (RS), as well as repetition enhancement (RE), are sensitive to multiple factors, including pre-existing stimulus representations, cognitive task, lag between repetitions and spatial attention. Parallel EEG studies provide additional constraints on the timing of these repetition effects. Together, the results suggest that RS is not a unitary phenomenon, but likely subsumes multiple mechanisms that operate under different conditions. These mechanisms of course need to relate to single-cell data and known physiological mechanisms; but to make further progress, I believe we need dynamical neural network models that relate these mechanisms to the properties of neural populations that are measured by fMRI and EEG data. One example model is sketched, in which RS reflects an acceleration of neural dynamics, owing to reduced prediction error within a recurrent visual processing hierarchy.


Empirical review
This review is a highly personalised one, but with the advantage that data can be directly related across experiments by virtue of using the same stimulus sets and analysis methods. For simplicity, the review will focus on a right mid-fusiform region that consistently appeared across experiments, and likely corresponds to what has been functionally defined as the Fusiform Face Area (FFA, Kanwisher, McDermott, & Chun, 1997), though it should be remembered that several other brain regions also show face repetition effects under various conditions. Where relevant, repetition effects on event-related potentials (ERPs) recorded in the same paradigm will also be discussed.
or not the next stimulus will be a repeat, minimising confounding effects of expectancy. This contrasts with designs that compare blocks with different frequencies of repetition (e.g., Grill-Spector & Malach, 2001), for which expectation is likely to affect the results (e.g., Summerfield, Trittschuh, Monti, Mesulam, & Egner, 2008). 2. Except where indicated otherwise, the experiments involved many different faces intervening between repetition of any one face. This might be called "long-lag" or "delayed" repetition, and avoids low-level effects of sensory adaptation/habituation/iconic memory, which likely affect immediate repetition of the same visual stimulus. The associated temporal lag between presentations was typically several minutes. This choice is not because shorter-lived repetition effects are not interesting or important, but a consequence of the original theoretical interest in implicit memory (priming) that can operate over much longer time-scales. 3. With the exception of the masked priming experiment below, faces were presented for several hundred msec (<1 sec), and the brain's response modelled as a brief impulse. It is possible that interesting neural dynamics occur during the period that a face is displayed (including sensory adaptation; e.g., Kar & Krekelberg, 2016), but these could not be distinguished in the present experiments.
These boundary conditions are important, because they mean that the RS effects observed below may have quite different properties and underlying mechanisms to those observed in other paradigms, particularly those employing rapid presentations of the same face (for which the term "fMR adaptation", Grill-Spector & Malach, 2001, might be better reserved).
A final point about fMRI analysis: this was done in a voxelwise analysis across individuals after normalising their brains to a common space defined by anatomy. This contrasts with the alternative approach of defining individual FFAs functionally using a localiser scan (see Friston, Rotshtein, Geng, Sterzer, & Henson, 2006 vs Saxe, Brett, & Kanwisher, 2006, for further discussion of pros and cons of localisers). Thus the present use of the term "FFA" is not strictly correct, and it is possible that some repetition effects were missed because different individuals' FFAs had different anatomical locations. Nonetheless, the repetition effects that were found were clearly in a mid-fusiform region that responds strongly to faces, and whose peak MNI coordinates were very close to the modal FFA coordinates across individuals.

1.2. The excitement of the early years: priming
My journey began because of an interest in implicit memory, specifically the behavioural phenomenon of priming, whereby people typically respond faster or more accurately to repeated stimuli, even if repetition is not relevant to their task, and (arguably) even if they are unaware of the repetition. In particular, I was interested in the role of pre-existing representations in perceptual priming. This is because some theories assume that priming reflects a reduced threshold, or residual activity, for re-activating an existing stimulus representation ("abstractionist" theories, Tenpenny, 1995). This is consistent with claims that priming is found for familiar faces (e.g., famous ones), but not unfamiliar (novel) faces, which has been attributed to modification of "face representation units" (Ellis, Young, & Flude, 1990). Other theories however assume that even the first presentation of a novel stimulus can leave some form of trace (new representation) that can affect responses to that stimulus when it is repeated ("episodic" theories, Tenpenny, 1995). Episodic theories can explain why priming is sometimes found for novel stimuli. This abstractionist/episodic distinction breaks down on closer inspection, for example when one considers that novel stimuli can consist of new combinations of familiar features (Henson, 2003); nonetheless, I wanted to see if the brain's response differed for the repetition of faces presumed to have pre-existing representations (familiar/famous faces) versus those without (unfamiliar/novel faces). The paradigm used in our first study (Henson, Shallice, & Dolan, 2000) is shown in Fig. 1A. The paradigm was taken from the ERP literature on repetition effects, in which the participant's task was to respond only to pre-specified, infrequent targets (in this case, an inverted face).
This task ensures a certain level of attention is required on each trial, but gives no reason for differential attention to familiar versus unfamiliar faces, or to initial versus repeated faces.
A right mid-fusiform region showed a significant interaction between repetition and familiarity, with RS for familiar faces but the opposite pattern of RE for unfamiliar faces (Fig. 1B). Both RS and RE decreased with lag between repetitions, and also persisted for up to five presentations (suggesting that repetition of the same image is not sufficient to make an unfamiliar face equivalent to a famous one; see also Bonner, Burton, & Bruce, 2003). This pattern was replicated, and also found for familiar versus unfamiliar symbols (Henson et al., 2000), as well as words versus nonwords in other brain regions (Henson, 2001). Regardless of detailed explanations, this cross-over interaction suggested that the effect of repetition is sensitive to the presence or absence of pre-existing representations. However, this initial excitement was tempered by subsequent experiments, considered next, where the task was manipulated.

1.3. The sobering effects of task
In Henson, Shallice, Gorno-Tempini, and Dolan (2002), we used the same procedure as above, except that participants performed two different tasks in two different sessions (using distinct stimulus sets). In the fame-detection (implicit) task, they decided whether each face was famous or not, regardless of whether it was repeated, whereas in the repetition-detection (explicit) task, they decided whether it was the first or second time they had seen that face in the experiment, regardless of whether it was famous. These two tasks therefore orthogonally oriented participants towards either the familiarity or repetition dimension (Fig. 1C). The type of task had a dramatic effect on the pattern of repetition effects across the brain, including the peak FFA voxel taken from Henson et al. (2000). In the fame-detection task, RS was observed for familiar faces, but RE was no longer observed, whereas in the repetition-detection task, no repetition effects were significant (Fig. 1D). Again, several detailed explanations were considered, but the important lesson was that face repetition effects in this paradigm, at least as measured by fMRI, are not automatic "bottom-up" effects, but depend on the task-relevance of the faces.
[Fig. 1 caption, partial: … Henson et al. (2002). Trial procedure during Phase 1 (E) and Phase 2 (F) and FFA fMRI results (G) from Henson et al. (2003), together with ERP results from right occipitotemporal (ROT) sensor in Phase 1 (H) and right prefrontal (RPF) sensor in Phase 2 (I). * = significant difference.]
One potential explanation that deserves special mention is the possibility that RS was the consequence of stimulus-response (S-R) bindings (Henson, Eckstein, Waszak, Frings, & Horner, 2014). According to this account, an association is made between a particular stimulus and a particular response (e.g., right finger press) after the first presentation, such that when that stimulus is repeated, the response can be retrieved directly, without requiring detailed perceptual processing. It is this curtailment of processing that is hypothesised to lead to RS in perceptual regions. Using a similar paradigm with visual objects, Dobbins, Schnyer, Verfaellie, and Schacter (2004) found support for this account by showing that RS in fusiform cortex was abolished simply by reversing the yes/no assignment of responses, which prevents the use of S-R bindings. For the case of Henson et al. (2002), the repetition-detection task, but not the fame-detection task, prevents use of S-R bindings by virtue of requiring different responses on first and second presentations. This could therefore explain the lack of RS in the repetition-detection task, and presence of RS in the fame-detection task, at least for famous faces.
This possibility is countered by the experiment described in Henson et al. (2003), in which the task was switched between the first and second time each face was seen. This experiment involved two phases. In Phase 1 (Fig. 1E), half of the familiar and unfamiliar faces were presented for the first time (together with phase-scrambled faces that allowed separate assessment of face perception). In Phase 2 (Fig. 1F), these faces were repeated, together with faces not seen in Phase 1. Importantly, the task in Phase 1 (symmetry judgment) was largely orthogonal to the task in Phase 2 (male/female judgment), such that approximately one half of repetitions involved the same yes or no response, while the other half involved the opposite response. Thus any effects of S-R bindings should average out (though see Henson et al., 2014, for a more nuanced perspective). The FFA results of comparing repeated versus nonrepeated faces in Phase 2 are shown in Fig. 1G. The pattern resembled that in the implicit task of Henson et al. (2002), in that RS was seen for familiar faces, but no repetition effect reached significance for unfamiliar faces. Thus S-R bindings do not appear to explain RS in FFA, at least for famous faces.

The need for temporal information
The complex pattern of repetition effects across the above three studies raised the question of whether the sluggish nature of the BOLD response hides a mixture of distinct neural repetition effects, operating at different times during the first few hundred msec after face onset. For example, an early, "bottom-up" RS effect may be swamped by a later, "top-down", task-dependent RE effect (e.g., increased attention that occurs when repetitions are task-relevant). This prompted me to record brain activity with EEG as well. Fig. 1H-I shows the ERPs for the same paradigm shown in Fig. 1E-F. The earliest difference between faces (familiar plus unfamiliar) and scrambled faces ("face perception") started around 150 ms over right occipitotemporal sensors (the "N170", Bentin, Allison, Puce, Perez, & McCarthy, 1996). However, the difference between familiar and unfamiliar faces ("face recognition") only emerged later, onsetting around 500 ms, and maximal over frontal sensors. More importantly, in Phase 2, there was no effect of repetition on the N170 (for either familiar faces, shown in Fig. 1I, or unfamiliar faces, not shown). Rather, a repetition effect was only found for familiar faces (as in the fMRI data), which onset around 300 ms, again over frontal sensors. The latter most likely reflected more rapid recognition of familiar faces when primed (seen in Phase 1) than unprimed.
The relationship between maximal ERP differences over the scalp and their underlying cortical generators is always difficult to determine, though a number of methodological studies (e.g., Henson, Mouchlianitis, & Friston, 2009) suggest that right FFA is at least one of the generators of the scalp N170 (others being right occipital face area, OFA, and right superior temporal sulcus, STS). Therefore the lack of repetition effects on the N170 suggests that the RS effects in FFA that were seen (for familiar faces) by fMRI arise later, possibly after recurrent input from higher regions in the visual processing hierarchy, such as anterior temporal or even prefrontal regions (consistent with the more frontal distribution of the ERP repetition effect from 300 ms onwards). The issue of reentrant effects is discussed later, but first we consider the effect of lag between repetitions.

The dramatic effects of lag
To directly test the effects of lag, we compared immediate repetitions (with 2.4 sec between face onsets) with delayed repetitions (with more than 94 intervening faces; over 225 sec) that were randomly intermixed within a single session of a gender-judgment task (Henson, Ross, Rylands, Vuilleumier, & Rugg, 2004). For delayed repetition, we again replicated the interaction between familiarity and repetition in FFA, with RS for familiar faces, but no significant repetition effect for unfamiliar faces (if anything, a trend for RE; Fig. 2A). For immediate repetition however, we found RS for both familiar and unfamiliar faces. This can explain why many other studies found RS (or fMR-adaptation) for unfamiliar faces: they mainly used short lags. Interestingly, ERP data on the same paradigm again showed no effect of repetition on the N170, even for immediate repetition (though see Caharel, d'Arripe, Ramon, Jacques, & Rossion, 2009; Ewbank, Smith, Hancock, & Andrews, 2008). Instead, immediate repetition produced a modulation that peaked around 250 ms (Fig. 2B), corresponding to the "N250r" discovered by Schweinberger, Huddy, and Burton (2004). The N250r has also been associated with FFA, and is generally bigger for famous faces, which is numerically consistent with fMRI results in Fig. 2A. The earliest effect of delayed repetition, on the other hand, was a small increase in a parietal P600-like component from 400 to 600 ms (which may reflect the same broad positivity maximal over frontal sensors in Fig. 1I). Though the relationship between the P600/frontal positivity and FFA is unclear, these findings further support the general idea that face repetition effects, even from immediate repetition, arise after initial category-specific responses (N170).

1.6. The importance of attention but not awareness
One important determinant of the size of FFA responses is attention, and some of the above effects of task may relate to differential attention to the dimensions of repetition and/or familiarity. Henson and Mouchlianitis (2007) examined the role of spatial attention in repetition effects. Participants fixated centrally while faces and houses were presented on both sides of fixation (Fig. 2C). They were told to attend left or right (in different blocks) in order to make a face/house decision on the attended stimulus. Repetition was manipulated such that faces that were attended or ignored on first presentation were crossed with being attended or ignored on second presentation. Note that these were unfamiliar faces, with a relatively short lag of 2-16 intervening trials. Only when faces were on the attended side for both initial and repeated presentations was RS observed in FFA. RS was not significant in any of the other three conditions (nor was RS observed for houses in FFA, even when houses were attended on both presentations; Fig. 2D). These data suggest that attention is necessary to observe RS to faces in FFA.
One can attend to a point in space but still not be aware of a stimulus presented there, for example when it is presented briefly between forward and backward masks. In the final experiment reviewed here, Kouider, Eger, Dolan, and Henson (2009) used such a sandwich-masked paradigm. This involved a prime face being presented for 33-50 ms, followed by a backward mask for 33-50 ms, and then a probe face for 700 ms, for which a fame judgment was required (Fig. 2E). Separate discrimination tests suggested that participants' awareness of the brief prime was minimal, and repetition effects remained even when awareness was extrapolated to zero. The prime and probe were either both famous or both nonfamous, and either the same image of the same person, different images of the same person, or two different people (all previous studies above used the same image of faces). FFA RS was found for both famous and nonfamous faces, and for both same and different images of the same person (relative to two different people), suggesting some degree of abstraction across low-level image properties (Fig. 2F). ERP data on the same paradigm (Henson, Mouchlianitis, Matthews, & Kouider, 2008) showed a priming-related modulation as early as 100-150 ms post-probe onset, which likely reflects a modulation of the N250r component to the prime, given the 100 ms between prime and probe onset (Fig. 2G). The main message of this study however is that, unlike attention, awareness is not necessary to obtain RS in FFA.

Summary
The above results suggest that the magnitude of FFA repetition effects depends on face familiarity, in that RS is consistently greater for familiar than unfamiliar faces in all experiments, but whether one sees RE or RS for unfamiliar faces depends on the lag between repetitions. Moreover, repetition effects are likely to depend on the precise processes performed on the faces, as normally determined by the task: When repetition is task-relevant, for example, repetition effects are likely to be modulated by increased attention to repeated relative to initial presentations (since repetitions are likely to be perceived as the "targets"). It is possible that repetition effects are also modulated by explicit (conscious) memory for repeats, though the masked priming experiment showed that awareness in general is not necessary to see FFA RS. Both initial and repeated presentations of faces do need to be attended, however, in order to see FFA repetition effects. These findings have implications for comparing results across other fMRI studies of RS. For example, the findings of studies using frequent repetition of faces (e.g., within the blocked designs often used to investigate perception) are likely to reflect different mechanisms from those using less frequent, longer-lag repetitions (as often used to investigate memory). One is also likely to see different repetition effects depending on whether repetitions are relevant (e.g., in the repetition-detection, "1-back" tasks often used in studies of perception) versus irrelevant (as in studies of implicit memory). Finally, RS effects for different types of stimuli, such as faces versus words, may differ because of different levels of pre-experimental familiarity, rather than different stimulus categories per se (Kovács, Kaiser, Kaliukhovich, Vidnyánszky, & Vogels, 2013).
Furthermore, the ERP data on the same paradigms above remind us that fMRI will conflate multiple, temporally-distinct processes that operate within a few hundred msec of each other. (Likewise, EEG will conflate multiple, spatially-distinct processes within a few millimetres of each other.) This consideration is relevant to more dynamic perspectives on repetition effects, as discussed later. First though, we consider other fMRI studies of RS in FFA to face repetitions.

Related fMRI studies
The studies reviewed above have used repetition of the same face image, which raises the possibility that RS arises from low-level visual representations, e.g., view-specific rather than identity-specific representations. (Note that low-level visual adaptation is ruled out by the presence of intervening faces in most of the above studies, though such adaptation may contribute to RS for immediate repetitions.) This issue of view-specificity is particularly relevant to the distinction between familiar and unfamiliar faces, since an extensive behavioural literature suggests that people must be familiar with faces before they can easily extrapolate over different views (e.g., Hancock, Bruce, & Burton, 2000). At least three other fMRI studies have examined this question of image-dependence of FFA RS, though all using immediate repetition. Eger, Schweinberger, Dolan, and Henson (2005) showed that FFA RS does occur across different views of immediately repeated faces (for both familiar and unfamiliar faces during a gender-judgment task). Nonetheless, this RS was smaller than for repetition of the same view, particularly for famous faces (contrary to expectations of greater generalisation over views for famous faces, though some suggestion of this generalisation was found in more anterior fusiform regions). These results suggest either a mixture of view-independent and view-specific face representations in FFA, or that RS (for immediate repetition) is modulated continuously by the degree of low-level visual overlap (since different views of the same face still entail some visual overlap). Winston, Henson, Fine-Goulden, and Dolan (2004) compared RS for pairs of faces that were either the same or a different person, which had either the same or different emotional expression (while participants monitored for a rare non-face target). FFA RS was sensitive to identity but not to expression; i.e., FFA showed reduced responses to the same person even with a different expression.
Nonetheless, one could argue that images of the same face with two different expressions are still visually more similar to each other than are images of two different faces. Rotshtein, Henson, Treves, Driver, and Dolan (2005) addressed this concern by morphing between two famous faces. They found that FFA RS was sensitive to perceived identity but showed no evidence that it was modulated continuously by degree of visual overlap (on the morph continuum), suggesting that FFA RS involves higher-level visual representations (unlike the occipital face area, which showed sensitivity to morph distance instead).
Cortex 80 (2016) 174-184
More generally, a PubMed search for "fMRI repetition suppression faces fusiform" revealed 11 papers (beyond those reviewed above) that used intermixed designs (where repetition is unpredictable) on healthy volunteers. Many of these studies used trials containing two stimuli that are either the same face or two different faces, and compared these to trial pairs consisting of a face and non-face stimulus, or to a single face trial. Soon, Venkatraman, and Chee (2003), for example, explored the SOA between pairs of unfamiliar faces during a gender-judgment task. Reduced FFA responses were found for pairs of two different faces relative to single face trials, a category-level RS effect that decreased as SOA increased from 3 sec to 6 sec. Repetition of the same face however showed even greater reductions, which did not interact with SOA, suggesting additional exemplar-specific RS. Kaiser, Walther, Schweinberger, and Kovács (2013) used famous faces in a gender-judgment task, where the first face was either of the same category (e.g., female) or the same person (and same image). FFA showed reduced responses to same category trials (compared to trials when the first stimulus was a scrambled face) and further reductions still for same person trials. They called the former effect "adaptation" and the latter effect "priming", but these data again suggest that FFA shows RS to both the face category and specific face exemplar. Podrebarac, Goodale, Van Der Zwan, and Snow (2013) reported RS in left FFA (and a right collateral sulcus region) to pairs of two different, consecutive unfamiliar faces that were of the same versus different gender during a facial-attractiveness task, again supporting the idea that FFA RS can operate at the level of face categories too.
Other studies have explored the paradigm originally introduced by Summerfield et al. (2008), where trial-pairs like those above are presented in blocks in which the proportion of trial-pairs that contain repetitions is either high or low (thereby manipulating the probability, and hence expectation, of repetition). These authors found that FFA RS was greater when repetitions were more probable (the theoretical implications of this finding are discussed later). Using this paradigm, Kovács, Iffland, Vidnyánszky, and Greenlee (2012) found that RS and its modulation by repetition probability were invariant to retinal position, arguing against low-level visual contributions to these effects (that use the same face image). De Gardelle, Stokes, Johnen, Wyart, and Summerfield (2013) used the same paradigm (though with an explicit repetition-detection task, rather than the original indirect target monitoring task used by Summerfield et al., 2008) and examined responses of individual voxels. They found that while some FFA voxels showed RS, others showed RE, and the voxels showing these effects (at least in left FFA) were i) consistent across runs (i.e., unlikely to reflect random noise), ii) correlated with each other, and iii) showed correlated effects of repetition probability (expectation). They argued that RS and RE (at the level of voxels) may reflect two types of expectation signal (an issue returned to later). Ishai, Pessoa, Bikle, and Ungerleider (2004) used a sample-matching (working memory) task for unfamiliar faces that were either fearful or neutral, and that either matched the sample (targets) or did not (repeated distractors). For targets, FFA RS was greater for fearful than neutral faces, while for repeated distractors, RS was negligible.
Though the paradigm (based on animal studies) differs somewhat from the ones reviewed above, the results reinforce the importance of top-down effects like task-relevance, as well as stimulus-dependent effects like emotional valence, possibly mediated by attention. Suzuki et al. (2011) showed that the FFA RS for immediate repetitions of unfamiliar neutral or angry faces was attenuated for repetition of happy faces, suggesting that prolonged emotional/attentional processing of happy faces counter-acts RS (though the task appeared unconstrained in this study, increasing possible attentional differences across conditions). Bunzeck, Schütze, and Düzel (2006) reported that the size of FFA RS to unfamiliar faces repeated at short lags did not correlate across participants with the amount of RT priming in a gender-judgment task (rather it was RS in prefrontal cortex that correlated with this RT priming effect), reinforcing the robustness of FFA RS to response factors, but leaving uncertain the contribution of FFA RS to behaviour. Xue et al. (2011) compared four consecutive repetitions with four spaced repetitions of unfamiliar faces during an intentional memorization task and found that FFA RS was reduced in the spaced condition (which is likely to simply reflect the larger repetition lag, but could also reflect expectation). More interestingly, this RS was also smaller for faces later remembered, suggesting that RS impairs encoding into episodic memory. Finally, Kremers et al. (2014) showed RS in FFA to short-lag repeated presentations of an unfamiliar face superimposed on a scene during a task that required associating the face and scene, and that this RS correlated with hippocampal RS. They suggested that FFA RS for these (associative) repetitions reflected modulation of a top-down signal from hippocampus.
Overall, it is difficult to see a clear pattern across these studies, mainly because of the large range of stimuli, tasks and lags employed. Nonetheless, they do suggest that FFA RS reflects more than low-level visual overlap, but at the same time, that it operates at the level of both exemplars and category (faces vs nonfaces). The studies also reinforce the importance of modulations by visual attention, which likely depend on the task and stimulus properties (e.g., emotional valence), and the possible top-down influences of other brain regions.

2. Theoretical review
I am not aware of a theory that explains all of the results reviewed above, let alone those in the larger literature on face repetition effects. Nonetheless, it is worth considering a few features that such a theory might have.

Facilitation (dynamical) models
Foremost is the need for a dynamic perspective. The brain is clearly a dynamical system, in which neural activity reflects transient responses to new sensory input, before the system settles on a more stable (less energetic) "attractor" state, corresponding to the final interpretation/significance of that stimulus. If this process leads to a certain degree of synaptic change, such that the attractor for that stimulus is widened/deepened, then stabilisation of the network is likely to occur faster when that stimulus is repeated. This corresponds to the "Facilitation" account discussed by Grill-Spector, Henson, and Martin (2006). There is some indirect evidence for this account. For example, because the fMRI BOLD response integrates over several sec of neural activity, a shorter duration of activity, following repetition, will result not only in a reduced amplitude of BOLD response (i.e., RS), but also an earlier peaking of that response (under linear convolution assumptions; Fig. 3A). Henson and Rugg (2001) binned every sec the trial-averaged FFA fMRI data from the famous conditions of the fame-detection task of Henson et al. (2002), and fit a haemodynamic response function (HRF) that was explicitly parameterized by its amplitude, peak delay and onset delay (Fig. 3B). Across participants, there was evidence that repeated presentations had both a smaller peak amplitude and an earlier peak latency than initial presentations, but no difference in onset latency (Fig. 3B). This is consistent with repetition causing a shorter duration of neural activity.
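The linear-convolution argument can be illustrated with a minimal sketch in Python. It assumes a single-gamma HRF (shape 6, scale 1 sec) and a unit-height boxcar of neural activity lasting 1.0 sec (initial) versus 0.5 sec (repeated); these values and all function names are illustrative assumptions, not the parameterisation used by Henson and Rugg (2001).

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Single-gamma haemodynamic response function (an assumed, simplified shape)."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)

def bold_response(neural_duration, dt=0.1, total=30.0):
    """Convolve a unit-height boxcar of neural activity with the HRF."""
    n = int(round(total / dt))
    n_on = int(round(neural_duration / dt))  # number of samples with activity "on"
    hrf = [gamma_hrf(i * dt) for i in range(n)]
    bold = []
    for i in range(n):
        # Discrete convolution: sum the HRF over the lags at which the boxcar was on
        first = max(0, i - n_on + 1)
        bold.append(sum(hrf[first:i + 1]) * dt)
    return bold

def peak(bold, dt=0.1):
    """Return (peak amplitude, peak latency in sec) of a simulated BOLD response."""
    amp = max(bold)
    return amp, bold.index(amp) * dt

# Hypothetical durations: repetition is assumed to halve the duration of neural activity
initial_amp, initial_lat = peak(bold_response(neural_duration=1.0))
repeated_amp, repeated_lat = peak(bold_response(neural_duration=0.5))
print(f"initial:  amplitude={initial_amp:.3f}, peak latency={initial_lat:.1f} sec")
print(f"repeated: amplitude={repeated_amp:.3f}, peak latency={repeated_lat:.1f} sec")
```

Under these assumptions, the shorter boxcar yields both a smaller and an earlier BOLD peak, even though the height and onset of the neural activity are unchanged.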
Another piece of evidence for facilitation was reported by Henson (2012). This involved a singular-value decomposition of the ERP data in Henson, Wakeman, Litvak, and Friston (2011), in which faces were repeated immediately (with an SOA of approximately 3 sec). The first spatial mode resembled the topography of the N170, broadly consistent with a fusiform source, while the first temporal mode suggested that the evoked response for repeated presentations was a compressed version of that for initial presentations (Fig. 3C). Formal analysis, which involved stretching the time-axis of the initial response until it best fit that of the repeated response, revealed a "stretch factor" that was significantly less than 100% across participants. This is again consistent with repetition accelerating the neural dynamics. Note also that an empirical consequence of this continuous dynamical perspective is that the effects of repetition will be smaller and therefore harder to detect the earlier they occur with respect to stimulus onset, which may explain why many ERP studies fail to detect repetition effects on early responses like the N170.
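The time-axis stretching analysis can be sketched as follows. This is a simplified illustration rather than the procedure of Henson (2012): it assumes a synthetic Gaussian "evoked response", a repeated response that is an exact compressed copy of the initial one, and a grid search over candidate stretch factors; all names and values are hypothetical.

```python
import math

def evoked(t, centre=0.17, width=0.03):
    """Hypothetical evoked response: a single Gaussian deflection (illustration only)."""
    return math.exp(-0.5 * ((t - centre) / width) ** 2)

def interpolate(times, values, t):
    """Linear interpolation of a regularly sampled signal; zero outside its range."""
    dt = times[1] - times[0]
    pos = (t - times[0]) / dt
    i = int(math.floor(pos))
    if i < 0 or i >= len(values) - 1:
        return 0.0
    frac = pos - i
    return (1 - frac) * values[i] + frac * values[i + 1]

def fit_stretch(times, initial, repeated, candidates):
    """Grid search for the stretch s that minimises the squared error
    between the stretched initial response, initial(t / s), and repeated(t)."""
    best_s, best_err = None, float("inf")
    for s in candidates:
        err = sum((interpolate(times, initial, t / s) - r) ** 2
                  for t, r in zip(times, repeated))
        if err < best_err:
            best_s, best_err = s, err
    return best_s

times = [i * 0.002 for i in range(250)]           # 0 to 0.498 sec, sampled at 500 Hz
initial = [evoked(t) for t in times]
true_stretch = 0.9                                # assumed compression for the repeat
repeated = [evoked(t / true_stretch) for t in times]
candidates = [0.70 + 0.01 * k for k in range(61)] # stretch factors from 0.70 to 1.30
estimated = fit_stretch(times, initial, repeated, candidates)
print(f"estimated stretch factor: {estimated:.2f}")
```

A stretch factor below 1 (i.e., below 100%) indicates that the repeated response is a temporally compressed version of the initial response; with real data, one such estimate per participant could then be tested against 1 across participants.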
One problem with this dynamic account is that little evidence has been reported for repetition affecting the latency of single-cell responses, e.g., in terms of the onset of firing-rate histograms (Kar & Krekelberg, 2016; Vogels, 2016). It is possible that analyses of the duration of such histograms might reveal repetition effects; or that the human extracranial ERP results reflect the summation of local-field potentials across multiple neurons, including ones less selective than those typically selected for recording. Note also that there are other mechanisms, apart from accelerated dynamics, that are likely to contribute to the RS recorded by fMRI, such as the fatigue and sharpening mechanisms of Grill-Spector et al. (2006). Indeed, impressive work has tried to separate the fatigue and sharpening models in terms of the tuning curves of the underlying neuronal populations, both with fMRI (Weiner, Sayres, Vinberg, & Grill-Spector, 2010) and single-cell recording (Verhoef, Kayaert, Franko, Vangeneugden, & Vogels, 2008). Nevertheless, these mechanisms are not incompatible with concurrent dynamic changes; indeed, future modelling work may reveal that fatigue, sharpening, facilitation (and synchrony, Gotts, Chow, & Martin, 2012) are all consequences of synaptic change within recurrent neural networks.

2.2. A specific predictive coding model
One specific example of a recurrent neural network model is the predictive coding model developed by Friston (2005). According to this hierarchical model of perception, neurons at one level of the hierarchy receive predictions from higher levels, and feed forward the difference between these predictions and the input from layers below: the "prediction error". The synapses between levels then adjust so as to reduce prediction error in future. As a consequence, when a stimulus is repeated, the prediction error is reduced more rapidly (Fig. 3D), as the whole hierarchy settles into an interpretation of that stimulus. Because the feed-forward neurons tend to be the large pyramidal neurons in superficial layers of the cortex that produce the signal detected by EEG/MEG, these techniques are assumed to measure the prediction error directly. Henson and Friston (2006) produced a toy version of this model with two levels, in which the FFA was mapped to the lower level, while the upper level was assumed to reflect more anterior temporal regions with more abstract face representations (e.g., of identity). Simulations showed that repeating a stimulus produced reduced responses in both layers, but this reduction was greater and later in the lower level than in the upper level (Fig. 3E). This prediction could be tested with concurrent recording from neurons in two cortical areas assumed to map to different levels of the visual processing pathway. Future work could also specify more complex dynamical interactions, to test ideas about synchrony of firing (Gotts et al., 2012) and to fit data on changes in oscillatory power, e.g., in high-frequency gamma range (Gruber & Müller, 2002).
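The idea that synaptic change speeds up the reduction of prediction error can be sketched with a deliberately minimal, scalar two-level loop. This is not the model of Henson and Friston (2006): it assumes a single symmetric weight `g` used for both prediction and error feedback, a weak decay `lam` on the upper-level activity, and one weight update after the first presentation. All names and parameter values are hypothetical.

```python
def present(y, g, lam=0.1, dt=0.01, steps=1000):
    """Settle upper-level activity b for input y; return the error time course and final b."""
    b = 0.0
    errors = []
    for _ in range(steps):
        e = y - g * b                  # lower level: prediction error
        errors.append(e)
        b += dt * (g * e - lam * b)    # upper level: descend 0.5*e^2 + 0.5*lam*b^2
    return errors, b

def learn(g, e, b, lr=5.0):
    """One gradient step on 0.5*e^2 with respect to the weight g."""
    return g + lr * e * b

def time_to_half(errors):
    """Index of the first step at which the error has fallen to half its initial value."""
    half = 0.5 * errors[0]
    return next(i for i, e in enumerate(errors) if e <= half)

y, g0 = 1.0, 1.0                       # assumed input and initial weight
first, b_end = present(y, g0)          # initial presentation
g1 = learn(g0, first[-1], b_end)       # synaptic change after the first presentation
second, _ = present(y, g1)             # repeated presentation

print(sum(first), sum(second))                     # summed prediction error
print(time_to_half(first), time_to_half(second))   # settling time in steps
```

In this sketch the stronger weight after learning makes the error both decay faster and settle at a lower residual, so the summed error (loosely, what a slow measure such as BOLD would integrate) is smaller on repetition. Whether this carries over to a full hierarchy with separate forward and backward weights is exactly what a more complete model would need to establish.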

A digression on expectation versus prediction
Before closing, it is important to distinguish between the concept of "prediction" assumed by the type of models above, and the concept of "expectancy" that has received much recent interest in fMRI and single-cell studies of RS. Summerfield et al. (2008) showed that fMRI RS in FFA was greater when the probability of repetition was higher. They interpreted this in terms of stronger predictions, resulting in greater reduction of prediction error when repetition did occur. However, single-cell studies have not yet found this interaction between RS and repetition probability, finding instead RS regardless of repetition probability (Kaliukhovich & Vogels, 2011), while other human fMRI studies have shown that the probability effect is contingent on attention (Larsson & Smith, 2012). One possibility is that repetition probability induces a conscious expectancy that arises outside the visual processing pathway, e.g., from prefrontal cortex. This is different from the perceptual predictions following synaptic change within the visual pathway discussed above: Expectancy might reflect a general top-down bias/working memory signal indicating that the previous stimulus is likely to be seen again, whereas perceptual predictions arise automatically from content-specific synaptic changes following every stimulus encounter, which are part of normal perceptual adjustment/learning. It is possible that non-human primates do not develop such strong expectancy in the paradigms used for single-cell recording, explaining why no modulation by repetition probability is found. Importantly, however, the lack of a probability-by-repetition interaction does not falsify a prediction error account of the type described above, which would predict a main effect of repetition regardless of the probability of repetition.
Fig. 3 (panels C-E). First spatial (left) and temporal (right) modes of ERP data averaged across participants from Henson (2012) (C). Schematic of predictive coding model (D) with two levels, using notation from general linear model, where y = input to a level; Y = predictions from layer above; X = forward weight matrix; X⁻¹ = backward weight matrix; b = activity in higher level; e = (residual) prediction error. The weights X and X⁻¹ are more accurate after first presentation, reducing e during second presentation. Results of a simulation (E) from Henson and Friston (2006), where PSTH = peristimulus histogram; ATL = anterior temporal lobe; a.u. = arbitrary units; yellow circles indicate repetition effects.

Conclusion
My personal journey, like the journeys of many others contributing to this special issue, has revealed that repetition suppression (RS), even for a single stimulus-type (faces) in a single brain area (FFA), is a complex phenomenon that is likely to have multiple physiological causes, operating under different conditions and at different time-scales. Nonetheless, we have now developed a considerable database of empirical findings, not only from human fMRI and EEG/MEG, but also from single-cell recording. It seems important to me that future work uses computational models that simulate both firing rates and local field potentials across populations of neurons, in order to relate these different types of data. These models may reveal that concepts like fatigue, sharpening and prediction error are all reflections of the same neural principles. Whatever the details of these models, they cannot ignore the fact that the brain is a dynamical system, in which repetition effects have temporal as well as spatial dimensions.