Formalizing a Perceptual-Mnemonic Theory of the Medial Temporal Lobe

Animals rapidly transform sensory experience into memory. How the mammalian brain supports these transformations has been the subject of an enduring debate: Do medial temporal lobe (MTL) structures, typically implicated in memory-related behaviors, also play a role in perception? A rich experimental literature exists, yet reliance on descriptive accounts of stimulus properties (e.g. “feature ambiguity”) has made it difficult to synthesize results. To formalize perceptual demands on the MTL, and in particular the role of perirhinal cortex (PRC), we adopt a combination of meta-analytic and computational approaches. We begin by designing a null model of PRC function in visual discrimination tasks, building from a computational proxy for the primate ventral visual system (VVS). With this model, we identify stimuli from previous studies that may not be diagnostic of PRC’s role in perception. We then demonstrate a striking correspondence between model and PRC-lesioned behavior across ten experiments (r = .80). Critically, the model and PRC-lesioned subjects fail on the same visual discrimination tasks, unlike controls. This approach formalizes the MTL’s role in perception by providing a tractable, stimulus-computable proxy for visual discrimination tasks in a PRC-lesioned state.


Introduction
Animals rapidly transform perceptual experience into memory, allowing this information to guide future behavior. Here, we focus on how sensory information from the ventral visual system (VVS) is transformed within the medial temporal lobe (MTL). Traditionally, the MTL has been characterized as a system dedicated to supporting memory-related behaviors (Squire & Wixted 2011). In contrast, the Perceptual-Mnemonic Theory (PMT) argues that the MTL is also necessary for certain perceptual tasks (Bussy & Saksida 2007). This theory has centered on the role of perirhinal cortex (PRC), an MTL structure situated at the apex of the VVS (Suzuki & Naya 2014). Under PMT, PRC is thought to support "configural visual representations", enabling subjects to perform "complex" visual discrimination tasks that rely on configural properties of objects (Barense et al. 2007). In line with PMT, patients with perirhinal lesions have shown deficits on tasks designed to assess PRC's role in perception (Lee et al. 2005, Barense et al. 2007). By contrast, in other work, PRC-lesioned patients are unimpaired on similar visual discrimination tasks (Buffalo et al. 1998, Levi et al. 2005, Knutson et al. 2013), leading some to argue that the observed deficits reflect memory impairment. Unfortunately, reliance on informal, descriptive accounts of stimulus properties (e.g. visual "complexity," "feature ambiguity," or "high-level" perceptual demands) has made it difficult to compare results across experiments.
We begin by formalizing a null model of PRC function in accordance with PMT. As subjects with focal lesions to PRC still have an intact VVS, we suggest that visual discrimination behaviors of lesioned subjects should reflect those discriminations that are supported by the VVS. In particular, we should expect PRC-lesioned behavior to rely on Inferior Temporal cortex (IT) when performing object discrimination tasks. If a visual discrimination task is "IT computable," we suggest that the task is nondiagnostic of PRC's role in perception, as no perceptual processing beyond the VVS is necessary. Moreover, if PRC plays a role in perception, then PRC-intact control subjects should be able to perform non-IT computable visual discriminations; PRC-related deficits would only be evident for these tasks.

Methods
To implement this formalization of PMT, we borrow from a model class that makes quantitative predictions about neural activity within the primate visual system: task-optimized convolutional neural networks (CNNs). Given an image as input, these models largely recover the response patterns observed across the VVS (Yamins et al. 2014, Schrimpf et al. 2018). For simplicity, we refer here to a single instance of this model class optimized to perform object recognition (VGG16: Simonyan & Zisserman 2014). As primate visual behaviors have been shown to track a linear readout of population-level neural representations in IT (Majaj et al. 2015), we use linear separability as our metric of IT computability.

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0

We obtained stimulus sets from previously published studies (Buffalo et al. 1998, Barense et al. 2007, Stark & Squire 2000, Knutson et al. 2012) in which human subjects with perirhinal lesions and PRC-intact controls performed visual discrimination tasks. Tasks involved detecting the "oddity" within a choice array of relatively similar objects. Stimuli from each experiment were originally formatted as concurrently presented choice arrays, with between 3 and 8 objects on each screen. Each stimulus screen was segmented into N object-centered images of equal size. When appropriate, we used a k-means clustering approach to identify the centroid of each object; otherwise, the stimuli were split into quadrants of uniform size. We formatted each experiment such that the inputs to the model reflect the stimuli shown to subjects, preserving the visual statistics of each image as well as the distribution of objects.
For each trial, we passed these N trial images to the model and extracted feature vectors of length F from an intermediate layer (the first fully connected layer), resulting in an F x N response matrix. We defined a protocol (Figure 1) to determine the oddity from each trial's F x N array. To operationalize linear separability within the model, we used Pearson's correlation as a uniform linear readout, computing the covariance matrix between model responses to each item within a trial, resulting in an N x N array. We identified the item with the lowest mean off-diagonal covariance as the oddity. If the resulting model-selected oddity matched the ground truth, then that trial was classified as correct. This was repeated for each trial, resulting in an average accuracy for each experiment.
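The oddity-selection step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the function names (`pearson`, `select_oddity`, `experiment_accuracy`) are hypothetical, the feature vectors are assumed to have already been extracted from the model, and the N x N similarity array is assumed to hold Pearson correlations between items (the text mentions both correlation and covariance).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors.

    Assumes neither vector is constant (nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (norm_x * norm_y)


def select_oddity(features):
    """Pick the oddity for one trial.

    features: list of N feature vectors (each of length F), one per item,
    i.e. the columns of the F x N response matrix.
    Returns the index of the item whose mean off-diagonal similarity
    to the other items is lowest."""
    n = len(features)
    # N x N similarity array (assumption: Pearson correlation between items).
    sim = [[pearson(features[i], features[j]) for j in range(n)]
           for i in range(n)]
    # Mean similarity of each item to every *other* item (skip the diagonal).
    mean_sim = [sum(sim[i][j] for j in range(n) if j != i) / (n - 1)
                for i in range(n)]
    return min(range(n), key=mean_sim.__getitem__)


def experiment_accuracy(trials):
    """trials: list of (features, true_oddity_index) pairs.
    Returns the fraction of trials where the selected oddity is correct."""
    correct = sum(select_oddity(feats) == truth for feats, truth in trials)
    return correct / len(trials)


# Toy trial with made-up 4-dimensional "features": items 0, 1 and 3 are
# similar to one another, while item 2 follows the opposite pattern.
toy_trial = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.0, 3.2, 3.9],
    [4.0, 3.0, 2.0, 1.0],
    [0.9, 2.1, 2.9, 4.2],
]
chosen = select_oddity(toy_trial)
```

Because every item is scored with the same fixed similarity measure, no per-experiment classifier is trained; that is one way to read "uniform linear readout," though other readouts would also be consistent with the description.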

Results
First, this analytic approach identified published stimulus sets that are nondiagnostic of PRC's involvement in perception: Stimuli are perfectly separable in the model (i.e. model accuracy of 100%) and, by proxy, should not require visual processing beyond the VVS. This includes stimuli from studies on both sides of the debate (Knutson et al. 2012, Experiment 2 in Barense et al. 2007). PRC-lesion deficits (or the lack thereof) on these tasks may arise for reasons unrelated to the perceptual demands placed on PRC.
Second, we observed a striking correspondence between PRC-lesioned patient and model performance (r = .80). We analyzed ten visual discrimination experiments from two studies, where each study has experiments with putatively PRC-relevant and -irrelevant stimuli. After computing the model accuracy for each experiment, we standardized results into a common metric space, which is agnostic to the stimuli used, the number of items in the choice array, the number of trials, etc. Across all experiments, we then computed the correlation between model performance and the performance of PRC-lesioned patients (Figure 2).
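As a rough illustration of this comparison, the sketch below rescales accuracies and correlates them across experiments. The text does not specify how results were standardized, so the chance-corrected rescaling used here (0 at chance, 1 at ceiling, with chance taken as 1/N for an N-item array) is purely an assumption. The helper names are hypothetical, and the numbers are arbitrary placeholders rather than values from any study.

```python
from math import sqrt


def chance_corrected(accuracy, n_items):
    """Rescale accuracy so 0 is chance and 1 is ceiling.

    Assumption: chance on an N-item oddity task is 1 / N. This is one
    possible standardization, not necessarily the one used in the study."""
    chance = 1.0 / n_items
    return (accuracy - chance) / (1.0 - chance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (norm_x * norm_y)


# Placeholder inputs: (raw accuracy, number of items in the choice array)
# for each experiment. These are NOT values from the cited studies.
model_raw = [(0.95, 4), (0.60, 4), (0.40, 7), (1.00, 3)]
patient_raw = [(0.90, 4), (0.55, 4), (0.35, 7), (0.95, 3)]

model_scores = [chance_corrected(acc, n) for acc, n in model_raw]
patient_scores = [chance_corrected(acc, n) for acc, n in patient_raw]

# One correlation across experiments, computed on the rescaled scores.
r_model_patient = pearson_r(model_scores, patient_scores)
```

Rescaling before correlating keeps experiments with different array sizes from being compared on raw accuracy, where chance level alone would differ between them.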

Figure 2. Correspondence between model and PRC-lesioned behavior.
Solid dots along the diagonal suggest that the model predicts the behavior of PRC-lesioned patients across experiments. While patient and control behaviors are well predicted by the model in Stark & Squire 2000, the model poorly predicts control behavior in Barense et al. 2007. This lack of correspondence is driven by "high feature ambiguity" conditions that require putatively PRC-dependent representations.
Finally, we note a qualitative divergence between model and control behaviors within these two studies. In Stark & Squire (2000), model performance was well matched with control as well as lesioned behaviors (r = .77 and .93, respectively). For Barense et al. (2007), the model was far worse at predicting control relative to lesioned behavior (r = .28 and .89, respectively). The lack of correspondence was driven by the "high feature ambiguity" conditions, designed to rely on representations putatively unique to PRC. For these experiments, controls significantly outperformed both the model and lesioned subjects (Figure 2; open blue squares, upper left quadrant). PRC-lesioned subject and control behaviors diverge only when experimental stimulus sets are not IT computable, as determined by our computational approach.

Summary
We have leveraged computational and meta-analytic approaches to formalize the role of the MTL in perception. With this formalization, we suggest that not all existing experimental stimulus sets are diagnostic of the role that PRC may play in perception. When a stimulus set is "IT computable" it should not require perceptual processing beyond the VVS. Consequently, a lack of impairment for PRC-lesioned patients in these tasks would be expected, given that they still have an intact VVS. Additionally, across ten visual experiments, we have demonstrated a correspondence between the model and human subjects in a PRC-lesioned state. Critically, it appears that the behavior of PRC-lesioned and -intact subjects only diverges when a visual discrimination task is not IT computable, as indicated by the model. These results offer tentative support for the PMT, as the debate is currently construed, and suggest that theoretical disagreements may be due to differing perceptual demands across experiments.