An integrated deep learning based model of hippocampal spatial cells that combines self-motion with sensory information

A special class of hippocampal neurons broadly known as spatial cells, whose subcategories include place cells, grid cells and head direction cells, are considered to be the building blocks of the brain’s map of the spatial world. We present a general, deep learning-based modeling framework that describes the emergence of spatial cell responses and can also explain behavioral responses that involve a combination of path integration and vision. The first layer of the model consists of Head Direction (HD) cells that code for the preferred direction of the agent. The second layer is the path integration (PI) layer with oscillatory neurons: displacement of the agent in a given direction modulates the frequency of these oscillators. Principal Component Analysis (PCA) of the PI cell responses showed the emergence of cells with grid-like spatial periodicity. We show that the response of these cells can be described by Bessel functions. The output of the PI layer is used to train a stack of autoencoders. Neurons in both autoencoder layers exhibit responses resembling grid cells and place cells. The paper concludes by suggesting a wider applicability of the proposed modeling framework beyond the two simulated behavioral studies.


Introduction
The realization that deep neural networks (DNNs) match human performance in certain perceptual classification tasks has prompted researchers to consider DNNs not just as a tool for solving artificial pattern recognition problems, but as a potential model of human behavior. In the visual domain, for example, DNNs trained on visual object recognition matched even human error patterns across object classes 12 , variation of viewpoint 3 , shape 4 , and judgement of object similarity 5 . However, the similarity between DNN and human error patterns stops at the level of object class and does not carry over to individual images. In the auditory domain also, DNNs trained on speech and music recognition closely match human performance 67 . Encouraged by these successes of DNNs in emulating human behavior, researchers explored whether DNNs can predict neural responses in sensory areas of the brain. Although DNN hidden layer responses do not in themselves resemble responses of the brain's sensory cortical areas, it was shown that they can predict neural responses in both the visual 8 and auditory 9 domains. Lower layers of DNNs better predict the activity of primary sensory cortices, while deeper layers predict the activity of higher cortical areas, in both sensory domains. These relatively recent developments open the possibility that DNN models, despite drawbacks like the use of a biologically unrealistic learning algorithm, can serve as suitable and effective models of brain function when appropriately constrained and interpreted by knowledge from neurobiology.
Unrelated to the deep learning approach, there is a long line of theoretical studies that sought to model hippocampal spatial cells. Classical models of grid cells fall under two broad categories: oscillatory interference models 10 and attractor-based models 11 12 13 14 . Solstad et al 15 describe a model that can produce place cells from grid cells; the model, however, is not trainable and depends on special constraints on the weights connecting the grid cell layer to the place cell layer.
There are, however, trainable networks with a small number of stages (1 or 2) that are capable of reproducing hippocampal spatial cell responses. Dordek et al 16  Soman et al 19 20 describe a hierarchical model that performs path integration by integrating velocity into the phases of a layer of oscillators, followed by a cascade of layers in which the weights are trained by Hebbian (feedforward) and anti-Hebbian (lateral) learning. In 2 dimensions, this model generated predominantly grid cells in the lower layer and place cells in the higher layer. When extended to 3D navigation in bats, the model reproduced 3D place cells and a novel cell type dubbed the plane cell 19 . However, the last two modelling approaches explain single-cell responses but do not model behavioral studies. There is a need to develop integrated models that account for single neuron responses and at the same time explain navigational behavior.
In this study, we present two approaches, namely a PCA-based model and a deep learning-based model of spatial navigation, that exhibit neural responses of hippocampal spatial cells like grid cells and place cells, and also explain behavioral experiments involving rats navigating two-dimensional mazes. We begin with the oscillator model of Soman et al 20 for path integration, but eliminate the time-dependency using an averaging process. We also use the scale parameter (β) as an additional parameter in path integration (PI). Four different variations of path integration are considered. To establish a theory behind the emergence of grid cell behavior, we propose the PCA model, which generates grid cell-like patterns while providing insight into the possible mathematical nature of these encoded representations, revealing the relation of these patterns to Bessel functions.
When the responses of the path integration layer are used to train a hierarchical deep network, both layers show a mixed response of grid cell-like and place cell-like firing. The different PI variations, however, gave different grid cell-like and place cell-like responses, as discussed in the results section. The network thus developed is used to model two experimental studies 2122 . The hippocampus combines self-motion information with sensory information to generate spatial representations of the world. To show that the proposed model can work with such combinations, we simulate an experimental study 22 in which rats explore a multicompartmental environment with inhomogeneous lighting conditions. Our model is able to reproduce the changes in the place cell firing patterns in response to environmental lighting conditions.

Methods
In this study, we propose two models for representing spatial cell responses: 1) a PCA-based model, and 2) a deep learning approach. Both rest on one major assumption: the animal is treated as a point, which eliminates the effects of head rotation, so the head direction is always along the direction of movement of the animal. The response of the HD layer is given by eqn. (1), where v is the velocity of the animal and u_i, a unit vector, is the preferred direction of the i th HD neuron.

Path Integration (PI) layer:
This layer consists of oscillatory neurons that have one-to-one connections with neurons from the HD layer (Soman et al 2018a, 2018b). The processing in this layer can be divided into two stages. The first stage is frequency modulation of the HD responses. The second stage is low pass filtering, which eliminates the time-dependent high frequency oscillations and gives the path integration output (Fig: 1a).
The first stage, frequency modulation (FM), is described by eqns. (2) and (3), where ω is the base angular frequency of the oscillators, which lies in the theta band, z is the displacement of the animal from its initial position and β is the scaling factor. This was the model of path integration used in (Soman et al 2018a, 2018b). Note that the PI cell described in eqn. (3) is a spatio-temporal model that depends explicitly on space, z, and on time, t. Such a model was successful in explaining certain temporal phenomena like phase precession (Soman et al 2018a). However, most hippocampal spatial cells are described in purely spatial terms (e.g., place cells and grid cells), with their responses depicted as functions of space alone. In the present study, we are interested in describing these purely spatial responses.
Therefore, we convert the spatio-temporal model of eqn. (3) into a purely spatial model by an averaging process described below.
In order to eliminate temporal variation, the PI response shown in eqn. 3 is multiplied by a sine term and passed through a low pass filter, which blocks the high frequency components, as shown in eqn 4 and eqn 5.
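As a hedged illustration of this averaging step, suppose the oscillator output of eqn. 3 has the form sin(ωt + β z·u_i) and that the multiplying term is sin(ωt). Both forms are assumptions made here for illustration, since the original equations are not reproduced in this text. The product-to-sum identity then separates a purely spatial term from a term oscillating at twice the base frequency, and the low pass filter keeps only the former:

```latex
\begin{align}
\sin(\omega t + \beta\, z \cdot u_i)\,\sin(\omega t)
  &= \tfrac{1}{2}\Big[\cos(\beta\, z \cdot u_i)
     - \cos(2\omega t + \beta\, z \cdot u_i)\Big], \\
\text{after low pass filtering:}\quad
  & \tfrac{1}{2}\cos(\beta\, z \cdot u_i).
\end{align}
```

Under these assumptions the filtered response depends on the displacement z only, which is the property the purely spatial model requires.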
After eliminating the high frequency term, we obtain a purely spatial PI response. This formulation of path integration can also be extended to a complex version, shown in eqn. 6, where j is the imaginary unit, √-1.
For implementation purposes, an alternative to the complex form above is the ordered pair of its real and imaginary parts. Table 1 lists the four formulations of PI considered; in Type IV, this pair is evaluated on βk z·u_i for every scale βk and the results are concatenated over all k. The four formulations will be referred to as Type I, Type II, Type III and Type IV respectively from here on. We now take the PI vectors described in Table 1 and generate representations using two unsupervised learning approaches: 1) PCA, and 2) an autoencoder network.
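The four PI formulations can be sketched as follows. This is a minimal sketch, not the authors' implementation: the forms cos(β z·u_i) for Type I, the (cos, sin) pair for Type II, and their multi-scale concatenations for Types III and IV are assumptions inferred from the surrounding text, and the function name `pi_features`, the uniformly spaced preferred directions and all parameter values are hypothetical.

```python
import numpy as np

def pi_features(z, n_dirs=8, betas=(1.0,), complex_pair=False):
    """Path-integration features for positions z of shape (T, 2).

    Assumed forms (inferred, not copied from the paper's Table 1):
      one beta,      complex_pair=False -> Type I   : cos(beta * z.u_i)
      one beta,      complex_pair=True  -> Type II  : (cos, sin) pair
      several betas, complex_pair=False -> Type III : Type I per scale
      several betas, complex_pair=True  -> Type IV  : Type II per scale
    """
    # Preferred directions u_i, assumed uniformly spaced on the circle.
    angles = 2 * np.pi * np.arange(n_dirs) / n_dirs
    u = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (n_dirs, 2)
    proj = z @ u.T                                           # z.u_i, (T, n_dirs)
    blocks = []
    for beta in betas:
        phase = beta * proj
        blocks.append(np.cos(phase))      # real part
        if complex_pair:
            blocks.append(np.sin(phase))  # imaginary part
    return np.concatenate(blocks, axis=1)

# Toy random-walk trajectory standing in for the animal's path.
rng = np.random.default_rng(0)
z = np.cumsum(rng.normal(scale=0.05, size=(1000, 2)), axis=0)
type_i = pi_features(z, betas=(4.0,))
type_iv = pi_features(z, betas=(2.0, 4.0, 8.0), complex_pair=True)
```

Each row of the returned array is one PI vector, so the same array can be passed either to PCA or to an autoencoder.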

Principal Component Analysis (PCA) Model:
The input for this step is the path integration data in two formats, Type I and Type II, as described in the previous section (Table 1). PCA is performed on this input data by calculating the covariance matrix (Fig: 1b). The eigenvectors of the covariance matrix give the principal components, which form a basis for the encoded representation of the spatial information. We now show that the PCA process yields grid cell-like responses with spatial periodicity. It can be shown that the covariance matrix of the path integration data is a circulant matrix (see supplementary material: section 1). The eigenvectors of a circulant matrix are sinusoidal (they are the discrete Fourier modes). Hence, the PCA step yields sinusoidal eigenvectors. We can show that these eigenvectors lead to a final output with the grid cell-like periodicity reported in the results section.
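The circulant argument can be checked numerically with a short sketch. It assumes the Type I response has the form cos(β z·u_i) with uniformly spaced preferred directions, and it samples positions on a polar grid so that the rotational symmetry is exact; the paper's own trajectories, parameter values and proof (supplementary section 1) are not reproduced here.

```python
import numpy as np

N = 12        # number of PI neurons (assumed value)
beta = 10.0   # scale parameter (assumed value)

# Preferred directions alpha_i and a polar grid of positions (r, theta).
alpha = 2 * np.pi * np.arange(N) / N
r = np.linspace(0.05, 1.0, 40)
theta = 2 * np.pi * np.arange(4 * N) / (4 * N)
R, TH = np.meshgrid(r, theta, indexing="ij")

# z.u_i = r * cos(theta - alpha_i); assumed Type I response cos(beta * z.u_i).
proj = R.reshape(-1, 1) * np.cos(TH.reshape(-1, 1) - alpha[None, :])
X = np.cos(beta * proj)           # (samples, N)
C = np.cov(X, rowvar=False)       # (N, N) covariance across neurons

# Circulant check: shifting both indices by one leaves C unchanged.
circulant_err = np.max(np.abs(C - np.roll(C, 1, axis=(0, 1))))

# The unitary DFT matrix diagonalises any circulant matrix, i.e. the
# eigenvectors are discrete Fourier (sinusoidal) modes.
k = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)
D = F.conj().T @ C @ F
offdiag_err = np.max(np.abs(D - np.diag(np.diag(D))))
```

Both error terms should be at the level of floating-point rounding. For positions drawn from a finite trajectory in a square arena, the covariance would be circulant only approximately.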

Multi compartment model:
The second experimental study simulated is by Spiers et al 25 . In this experiment the rat is allowed to forage randomly in a four-compartment environment while the activity of place cells in the CA1 region of the hippocampus is recorded. Initially, all four compartments have the same visual cues and lighting conditions.
In such conditions a place cell fires at relatively the same location in each compartment. When the lighting condition of one of the compartments is changed, the place cell firing location inside that compartment shifts or disappears, indicating local remapping of place cell activity.
The model is used to reproduce the local remapping of place cell activity reported by Spiers et al 25 (Fig: 1d). The results are described in the results section.
For the different simulation studies described so far, the choice of β is crucial to generate the corresponding results. In the oscillatory interference (OI) model 10 , the wavelength of the interference pattern, which represents the spatial scaling of the grid cells, has been characterized as a function of β.
Here L is the wavelength of the interference pattern and is equal to the distance between two adjacent peaks in the grid cell firing field. Therefore, the choice of β can be made based on L (or the number of square or hexagonal fields in the environment) by inverting this relation.
For obtaining grid cells in a 2 by 2 square environment, the value = 2 3 will give one complete grid field.
However, the experimental studies are focused on the place cell responses and hence, we choose L = 1 such that there is one place field in the environment.
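The dependence of the grid spacing on β can be illustrated with a small numerical sketch. It assumes three plane waves cos(β z·u_i) whose directions differ by 60°, for which elementary geometry gives a peak-to-peak spacing of 4π/(√3 β). This convention is an assumption made here, and the relation used in the paper may differ by a constant factor depending on how the phase is defined. The function names are hypothetical.

```python
import numpy as np

def interference(z, beta):
    """Sum of three plane waves whose directions differ by 60 degrees."""
    ang = np.deg2rad([0.0, 60.0, 120.0])
    u = np.stack([np.cos(ang), np.sin(ang)], axis=1)   # (3, 2)
    return np.cos(beta * (z @ u.T)).sum(axis=1)

def grid_spacing(beta, y_max=10.0, n=200001):
    """Distance from the origin to the first peak along the y-axis."""
    y = np.linspace(0.0, y_max, n)
    z = np.stack([np.zeros_like(y), y], axis=1)
    f = interference(z, beta)
    # First interior local maximum of the sampled profile.
    is_peak = (f[1:-1] > f[:-2]) & (f[1:-1] >= f[2:])
    return y[1:-1][is_peak][0]

beta = 4 * np.pi / np.sqrt(3)               # chosen so that the spacing is 1
measured = grid_spacing(beta)
predicted = 4 * np.pi / (np.sqrt(3) * beta)
```

Under this assumed convention, a target spacing L is obtained by setting β = 4π/(√3 L).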

Analysis of Results:
For each of the studies performed, various methods have been used to analyze the results. We characterize the single neuron responses in the model in terms of firing rate maps, autocorrelation 26 and grid scores 20 . To characterize place cells, connected component analysis has been used on the firing rate maps 27 . The detailed procedure for each of the methods can be found in Supplementary Material Section 6.
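As an illustration of the connected component step, the sketch below counts contiguous firing fields in a thresholded rate map. The threshold (half of the peak rate), the minimum field size and the 4-neighbour connectivity are assumptions chosen for illustration; the criteria actually used are those of Supplementary Material Section 6, which is not reproduced here. The function name `count_fields` is hypothetical.

```python
import numpy as np
from collections import deque

def count_fields(rate_map, frac=0.5, min_bins=4):
    """Count 4-connected regions where rate >= frac * peak rate."""
    mask = rate_map >= frac * rate_map.max()
    seen = np.zeros(mask.shape, dtype=bool)
    sizes = []
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        # Breadth-first flood fill from an unvisited active bin.
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:
            i, j = queue.popleft()
            size += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                if inside and mask[ni, nj] and not seen[ni, nj]:
                    seen[ni, nj] = True
                    queue.append((ni, nj))
        if size >= min_bins:      # discard isolated noisy bins
            sizes.append(size)
    return len(sizes), sizes

# Toy rate map with two well separated Gaussian fields.
xx, yy = np.meshgrid(np.linspace(-1, 1, 41), np.linspace(-1, 1, 41))
rate = (np.exp(-((xx - 0.5) ** 2 + yy ** 2) / 0.02)
        + np.exp(-((xx + 0.5) ** 2 + yy ** 2) / 0.02))
n_fields, field_sizes = count_fields(rate)
```

A cell with a single large field would be a place cell candidate under such a criterion, while several regularly spaced fields would point to a grid-like response.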

Results
Grid Cells from PCA model: The PCA model described previously is simulated with two kinds of path integration inputs, the Type I and Type II forms of PI. The final outputs of this network show different spatial behavior depending on the form of path integration. The spatial cell responses are suitably thresholded and plotted as red dots against the trajectory of the animal, shown in blue (see Fig. 2). This maps the firing data to the locations at which firing occurs, revealing the spatial behavior. Fig. 2 (a) depicts the firing fields when the network is simulated with the Type I form of PI. The second column depicts the firing rate maps and the third column depicts the autocorrelation maps for the corresponding firing fields (see methods section for rate maps and autocorrelation maps). These behaviors can be understood theoretically using the derivations in the methods section. This is validated by the results shown in Fig. 2 (b), where the first and second columns are the raw outputs from the simulation. They can be compared to the third column, which shows the output function from the derivation (Supplementary material: section 2). In the derived theoretical output for PI Type I, r and the polar angle are the polar coordinates of the position, and J_k is the k th order Bessel function of the first kind.

Fig 2 (b): A comparison of the raw outputs from the model (columns 1 and 2) and the theoretical values of the derived output function (column 3).
Similar simulations are performed with PI Type II, and the firing fields from this simulation are plotted in Fig 3. Fig 3 (a) shows the firing fields, firing rate maps and autocorrelation maps of the output for this model.
In the corresponding theoretical output for PI Type II, r and the polar angle are the polar coordinates of the position, J_k is the k th order Bessel function of the first kind, and the complex-valued output function of order k is related to the firing fields as follows: the firing field of neuron number 2k is its real part and the firing field of neuron number 2k + 1 is its imaginary part.

Fig 3: PCA model results with PI Type II. (a) Firing fields of the first few neurons with PI Type II as input to the PCA model. The columns show firing fields, firing rate maps and autocorrelation maps of the corresponding firing fields. (b) A comparison of the outputs from the model (columns 1, 2, 4 and 5) and the theoretical values of the derived output function (columns 3 and 6).
It is observed from Fig. 2 (b) that the firing fields show only even order grid fields, while the outputs in Fig 3 (b) show both even and odd order grid fields. This once again confirms that the proposed theoretical framework holds for both cases and describes the simulation results.
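One way to see why only even orders appear for Type I is to look at the angular harmonics of the PI response. The sketch below assumes a Type I response of the form cos(βr cos φ) and a complex response exp(jβr cos φ), where φ is the angle between the position and a preferred direction; these forms and the value of βr are assumptions made for illustration. By the Jacobi-Anger expansion, the k th angular harmonic of the complex response has magnitude |J_k(βr)|, and taking the real part removes all odd harmonics.

```python
import numpy as np

beta_r = 5.0   # product of scale and radius (assumed value)
n = 64
phi = 2 * np.pi * np.arange(n) / n

real_resp = np.cos(beta_r * np.cos(phi))        # assumed Type I form
cplx_resp = np.exp(1j * beta_r * np.cos(phi))   # assumed complex form

# Magnitudes of the angular Fourier harmonics (index = harmonic order).
real_harm = np.abs(np.fft.fft(real_resp)) / n
cplx_harm = np.abs(np.fft.fft(cplx_resp)) / n

odd = np.arange(1, 8, 2)
odd_in_real = real_harm[odd]   # vanish up to rounding error
odd_in_cplx = cplx_harm[odd]   # equal |J_k(beta_r)| for odd k
```

Here `cplx_harm[1]` approximates |J_1(5)| ≈ 0.3276, while the corresponding entry of `real_harm` is zero to within rounding, consistent with the even-only pattern of Fig. 2 (b) and the mixed pattern of Fig 3 (b).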

Spatial cells from the autoencoder model:
The neurons in layer 1 and layer 2 of the autoencoder model are analyzed for their spatial responses using all four cases of PI discussed in the methods section (Table 1). The activity of the neurons (red dots) is plotted over the trajectory (blue) followed by the virtual animal (see Fig 4). The firing responses are similar to those of place cells and grid cells seen in the hippocampal formation. In this section, we also present a comparative analysis of place cells and grid cells in layer 1 and layer 2 for the four formulations of path integration.

Grid cells response:
The firing pattern of some neurons in layer 1 and layer 2 resembles the firing of grid cells observed in the Entorhinal Cortex (EC) in all four variants of PI described in the methods section (Table 1). Two types of grid patterns are observed in the model: hexagonal and square. Square grid patterns have previously been reported in experimental as well as computational studies 16,28 . In the model with PI Type I as the input, out of 50 neurons in layer 1, 56% are hexagonal grid cells and 18% are square grid cells (fig 4: 1.a, 1.b and 1.c). Layer 2 contains 30% hexagonal grid cells and 14% square grid cells out of 50 neurons (fig 4: 1.d, 1.e and 1.f). Similarly, when the input is PI Type III, layer 1 contains 8% hexagonal grid cells and 34% square grid cells (Supplementary fig 7a). Layer 2 contains 8% hexagonal grid cells and 44% square grid cells (Supplementary fig 7b).
In the model, with input as PI Type II to the autoencoders, 46% of neurons are hexagonal grid cells and 9% of neurons are square grid cells in Layer 1 (fig 4: 2.a, 2.b and 2 (Supplementary fig 7c). Layer 2 contains 22% of hexagonal grid cells and 24% of square grid cells (Supplementary fig 7d).   Table 2).

Stability of place cells with different trajectories:
An important way to characterize a neural response as a place cell is to check the stability of the firing field, i.e., the firing field must be in the same area of the environment irrespective of the trajectory the animal takes. The place cells from the autoencoder model have been analyzed for stability by testing against multiple trajectories and are found to be stable. The models with PI Type II and PI Type IV have been tested in this way.
Table 2 summarizes the number of place cells and the number of grid cells in layer 1 and layer 2 in all four cases of PI considered.
Both disto codes and place cells are observed in layer 1 and layer 2 of the model (fig: 6).

Multi compartment Model:
The second experimental study simulated is by Spiers et al 25 . In this experiment the rat is allowed to forage randomly in the four compartments of the environment while the activity of place cells in the CA1 region of the hippocampus is recorded. Initially, all four compartments have the same visual cues and lighting conditions. In such conditions a place cell fires at relatively the same location in each compartment. When the lighting condition of one of the compartments is changed, the place cell firing location inside that compartment shifts or disappears, suggestive of local remapping of place cell activity.
The input vector to the autoencoder model is described

Discussion
There are several models of hippocampal spatial cells in a network structure, yet very little can be explained about how these representations are generated and what they contribute to the larger challenge of spatial navigation. Most existing grid cell models can be broadly classified into two categories: continuous attractor neural networks (CANN) 29 and oscillatory interference (OI) models 10 . The CANN has a 2D sheet of neurons with special symmetrical lateral connections to nearby neurons, whose weights are inversely proportional to distance. Also, to manage the problem of distorted connections at the boundary of the 2D sheet, the model assumes connections between the last and first boundary neurons, giving a toroidal structure. This toroid pushes the network to produce periodic responses. In contrast, the DNN model proposed here produces the periodic response by encoding the path integration data, without any built-in cyclic connection between the neurons. The other category, the OI model, assumes interference of three dendritic inputs with the soma oscillating at theta frequency. The assumption made here is that the directions of these inputs are pre-tuned to differ by 60°, which is again unrealistic.
In an effort to answer the questions of spatial cell modeling while avoiding unrealistic assumptions about head direction selectivity, the PCA and autoencoder models have been proposed. The PCA approach is taken up mainly to establish a mathematical framework for the encoding of spatial information. This helps us understand the underlying principles and subsequently apply them to spatial reconstruction and versatile navigation as observed in animals. The autoencoder model, being a multi-layer architecture, is more nonlinear and is capable of modelling spatial cells in a hierarchical fashion. In addition to modelling spatial cells like grid cells and place cells, in the present study we also use it to model behavioural studies that include multisensory integration, such as the behaviour of place cells in a multicompartment environment and the emergence of disto codes on a triangular trajectory.
All these results follow from the original ansatz where PI is implemented by integrating the velocity into the phase of the theta oscillations given by eqn 2. This ansatz can be justified by invoking biological evidence that theta frequency varies with velocity nearly linearly 30 14 and eqn 15). The PCA on PI Type I shows output with only even order grid fields and PCA on PI Type II shows output with both even and odd order grid fields which are in accordance with the theoretical framework (Supplementary material: section 2). These output functions obtained are also seen in modes of vibration of a circular membrane. It is a known fact that any type of vibration can be decomposed into an infinite series of the vibrational modes. This process is like how a time signal can be decomposed into a Fourier series. The model being able to produce this kind of a basis in the encoded representation points to the existence of a possible basis for storing spatial information in the spatial frequency domain of the brain.
In the autoencoder model, out of the four cases, more grid cells appear when the input to the model is PI Type I or PI Type II. Since there is no phase offset in PI Type I and PI Type III, we do not observe any place cells in these cases. The phase offset in the input arises from the complex terms in PI Type II and PI Type IV, and we observe a greater number of place cells in these cases, which is in accordance with other studies like 10 . There are existing models which suggest the emergence of place cells from linear summation of variably scaled grid cells 15 . The same approach is taken when PI Type IV is presented as input to the autoencoder model, in which case we see the emergence of more place cells (refer Table 2).
The proposed deep model is capable of modelling spatial cells not only in a simple environment but also in experimental paradigms involving more complex environmental conditions. In this study, we have modelled two experimental paradigms which, to our knowledge, have not been modelled before. The model is able to reproduce the experimental results. The experiments utilize different sensory inputs as cues to the animal, while the model generates the same results with much simplified cues which substitute for vision, lighting conditions, etc.
The multilayer autoencoder model is capable of modelling more complex non-linear representations, which suggests a future direction of combining audition, olfaction and more realistic vision with PI to study spatial navigation as a whole. Since this approach of modelling cognitive space in the hippocampus provides a robust framework, it supports multiple modelling studies, such as modelling the object responsive neurons found in the lateral entorhinal cortex (LEC) 31 , object vector cells 32 etc. Muller et al 33 have shown that place cells exhibit direction sensitivity which varies with the environment and is more prominent in polarized environments. This behaviour can also be explored further using the DNN model. The DNN approach can be used for modelling theta sequences (which involve rapid sequential firing of place cells within a few theta cycles) and Sharp Wave Ripples (SWR), which are high frequency synchronous bursts of neurons when the animal is immobile 34 . The continuous integration of speed in the PI leads to an accumulation of error, so the position estimates are not accurate. This issue is addressed by resetting the integration limits in PI using visual input 35 . The current model does not account for phase resetting in PI, which is an important phenomenon for correcting accumulated error during navigation and needs to be addressed in the future. Furthermore, both the DNN and PCA models can be extended to 3D navigation to explore the extensions of the spatial cell behaviours in 3D environments and also to observe the emergence of new spatial cell patterns like plane cells and FCC lattice type firing fields 19 . Moreover, we can also investigate a suitable basis in 3D using the PCA model, similar to the 2D case.