Testing Methods of Neural Systems Understanding



Introduction
The scale of neurophysiological and neuroimaging experiments is growing, with increasing numbers of brain regions and neurons recorded under a variety of conditions and tasks [Weisenburger et al., 2019, Demas et al., 2021, Van Essen et al., 2012, Steinmetz et al., 2018, Bae et al., 2021]. This poses the challenge of how best to analyze these large and complex datasets. A somewhat de facto toolbox has arisen in systems neuroscience, with certain flavors of analysis being used across many different studies. In addition, novel methods aimed directly at tackling high-dimensional neural data are constantly being developed [Paninski and Cunningham, 2018].
A key question that is not frequently asked directly, however, is: are these methods helping us make progress towards better understanding of neural systems? And how can we know?
While scientific facts are expected to be validated through repeated demonstration of their truth in multiple different experiments, scientific tools tend to go through less explicit tests of their utility. The problem posed by neuroscience, however, is an extremely challenging one. High-dimensional nonlinear systems like the brain are notoriously difficult to understand, but deceptively easy to find stories in. Without well-honed tools, we risk compounding errors that waste years of scientific resources [Smaldino and McElreath, 2016]. The field would benefit from a more explicit research program that documents, in an unbiased way, the extent to which different tools have yielded accurate insights into the functioning of neural systems.
Neuroscience is not the only field interested in discovering the mechanisms by which neural systems produce intelligent behavior. Modern artificial intelligence (AI) relies on deep neural networks: parallel distributed processing systems with individual artificial neurons connected through weights trained to achieve some goal. The field of interpretable AI is charged with trying to make sense of how the resulting artificial neural systems process information [Gilpin et al., 2018]. Here too, empirical validation that the methods used are providing accurate and productive understanding of these systems would be valuable, especially as these systems get deployed in the real world.

What it means to understand a neural system

To decide on what good methods are, we need some sense of the kind of understanding we are aiming for and a way to evaluate it. In philosophy of science, many notions of understanding and explanation have been put forth for different purposes, including neuroscience-specific theories [Stinson and Sullivan, 2017, Stinson, 2018, Hills, 2016, Craver, 2006, 2007, Kaplan, 2011, Cao and Yamins, 2021a]. Here the focus is on what scientists working with neural systems (particularly within systems neuroscience) are striving for when they devise their studies and interpret their results.
Insofar as what is sought by neuroscientists can be described as an abstracted set of steps that the neural system follows to achieve its computational goals, it is roughly aligned with the algorithmic level of understanding as defined by Marr [Marr, 1982, Love, 2015, van Bree, 2022]. While not an algorithm in the technical sense, satisfactory algorithmic descriptions of neural datasets involve abstracting away many of the specifics of the activity patterns to provide a more compressed and logical description of the information transformations implemented by the neural circuit. An example of such an explanation is the description of the ventral visual stream as 'untangling' representations in order to achieve invariant object recognition [DiCarlo and Cox, 2007]. This way of understanding how object recognition is achieved provides a picture of the role each region of the ventral stream plays in terms of how it progressively untangles the activity evoked by images of the same object into smooth, well-separated manifolds. Though it is not a quantitative model, conceptualizing the ventral stream this way still allows for experimental predictions and can be used as the basis of more formal quantitative models [Cohen et al., 2020].
Focusing on the algorithmic level also makes connections across different types of systems possible. For example, describing the means by which birds achieve flight in terms of aerodynamic principles makes it possible to compare bird flight to that of airplanes, despite their very different implementations. This is particularly beneficial for studying neural systems, which can be implemented with different biological details across species or even with artificial neurons.
Of course, what counts as a satisfactory algorithmic-level understanding will vary across scientists to some extent, but pinpointing this level as the rough goal offers some guiding constraints. Once we assume that goal, we can ask if our tools are providing accurate understanding of this kind.

Operationalizing understanding
The aim here is to devise a research program that explores how productive a given analysis method is at providing insights on the algorithmic level. To do so, a formal measure of success is needed. The concept of 'experimentally-validated understanding' can fill this role.
Experimentally-validated understanding refers to verifying the insights gleaned from an analysis method through further experimentation based on those insights (Figure 1). Specifically, assuming we believe that applying the tools from neuroscience should provide us with a hypothesis of how a neural system works on the algorithmic level, then we should be able to devise experiments that directly test that hypothesis. If the hypothesis turns out to be correct, then the tools were successful at providing accurate insights. If not, then the tools were successful in creating a possible narrative, but not an accurate one.
Specifically, tools should be tested by their ability to provide insights that allow for behavioral control. That is, experimental manipulations applied to the neural system should result in predictable control of the output or behavior of the neural system. Demonstrating behavioral control is consistent with the definition of an algorithmic explanation: the algorithmic level is meant to describe the steps by which a system achieves its computational goals. Therefore, any true understanding of the algorithm will allow for behavioral control. In this way, control is necessary as a demonstration of understanding.
Whether it is sufficient, that is, whether the ability to control behavior demonstrates understanding, is more nuanced. One could imagine a perfect replica of a brain that could be used to find the perturbations needed to control behavior, but such a model would not be much more understandable than the brain itself. Therefore, the tools we use must provide some more compressed or simplified description of how the real neural system works in order to provide understanding. This too is aligned with the aim of finding an algorithmic-level description, which is necessarily compressed. We may therefore consider a two-dimensional definition of understanding, wherein the aim is to find tools that help us identify the simplest descriptions that achieve the most control [Lillicrap and Kording, 2019].
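To make this two-dimensional evaluation concrete, the following is a minimal Python sketch of how it might be organized. All of the object and method names (analyze, predict_behavior_change, perturb, run_task, description_length) are hypothetical placeholders introduced for illustration, not an existing API.

```python
# A hypothetical sketch of scoring an analysis tool along the two axes
# discussed above: how much behavioral control its hypothesis affords,
# and how compressed the resulting description is. None of these names
# refer to a real library; they are placeholders for illustration.

def score_tool(system, tool, perturbations, tolerance=0.05):
    # The tool digests the system into an algorithmic-level hypothesis.
    hypothesis = tool.analyze(system)

    # Control axis: fraction of perturbations whose behavioral effect
    # the hypothesis predicts within tolerance.
    hits = 0
    baseline = system.run_task()
    for p in perturbations:
        predicted = hypothesis.predict_behavior_change(p)
        observed = system.perturb(p).run_task() - baseline
        hits += abs(predicted - observed) <= tolerance
    control = hits / len(perturbations)

    # Compression axis: shorter descriptions are better, all else equal.
    simplicity = 1.0 / hypothesis.description_length()

    return control, simplicity
```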
Such an operationalized definition of understanding is critical for any honest exploration of progress in neuroscience. Informal measures of understanding do not suffice, just as informal measures of neural activity or behavior would not suffice in a neuroscience experiment.
As Carl Craver writes: "All scientists are motivated in part by the pleasure of understanding. Unfortunately, the pleasure of understanding is often indistinguishable from the pleasure of misunderstanding. The sense of understanding is at best an unreliable indicator of the quality and depth of an explanation" [Craver, 2007, Thompson, 2021]. This is in part because our pre-conceived notions and expectations can bias what form of explanation we find satisfying, and blind us to more accurate yet less pleasing answers [Gershman, 2021]. Furthermore, previous research has shown that small differences in analysis choices can lead to different conclusions regarding how an underlying neural system works [Botvinik-Nezer et al., 2020]. A recent article by a group of neuroscientists also called out the many different types of 'causal' claims that occur in the field, arguing that this confusion around explanatory goals slows progress. All of these issues point to the need for a pre-set definition of understanding in order to make sure that our test of analysis tools is applied consistently and transparently. Experimentally-validated understanding avoids the unspoken, underspecified or confused elements of neural data analysis by providing a well-defined standard of success.
Experimentally-validated understanding can also be related to previous philosophical conceptions of explanation. Because it requires claims about how the behavior of the network will change if specific aspects of the network are changed, it is closely related to the interventionist view of explanation, which focuses on identifying the factors needed to change the phenomenon of interest [Woodward, 2005, 2017]. The concepts put forth here can also be mapped to elements of the mechanistic perspective laid out in Stinson and Sullivan [2017]. Specifically, it identifies aspects of performance as the phenomena to be explained and it admits multiple scales in the explanation (including those based on individual neural units, population response, etc.). Our definition of the algorithmic level of understanding could also be seen as satisfying many of the requirements of understanding in Hills [2016].

Testing on artificial neural networks
While in theory this form of experimental validation of tools can occur in real neural networks, performing the exact experiment that would best test the insights provided by a given method is often infeasible, and certainly cannot be done as thoroughly and quickly as it can in artificial neural networks (ANNs). Therefore, the testing of the tools of neuroscience would best be carried out on ANNs.

Benefits of ANNs
As mentioned above, experiments can be carried out more easily on ANNs than on real brains. This is because ANNs are fully observable and perturbable. The study of biological neural systems is hindered in significant ways, especially historically, by the challenge of observing and perturbing neural activity. Such experimental limitations introduce multiple forms of uncertainty, put restrictions on the types of analyses that can be performed, and slow the process of building off the insights of past analyses. The activity and connectivity of ANNs, on the other hand, can be perfectly observed simultaneously across all units in the network under any experimental condition. A large variety of manipulations are easily implementable, including activity perturbations and connectivity changes. Even 'developmental' interventions are possible during network training through control of the learning algorithm or restriction of the training data (e.g. Geirhos et al. [2018]). The situation with ANNs thus far exceeds our experimental abilities for even the simplest of organisms, such as C. elegans.
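As an illustration, here is a toy sketch (using only NumPy; the random network and choice of lesioned unit are arbitrary) of the kind of complete observation and exact perturbation that ANNs afford:

```python
# A toy illustration of full observability and perturbability in an ANN:
# every unit's activity can be recorded on every trial, and any unit can
# be silenced exactly and reversibly.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(20, 10)), rng.normal(size=(2, 20))  # random weights

def forward(x, ablate=None):
    """Run the network, recording all hidden activity; optionally silence a unit."""
    h = np.tanh(W1 @ x)          # hidden layer: fully observable
    if ablate is not None:
        h[ablate] = 0.0          # exact 'lesion' of one unit
    return W2 @ h, h             # output and complete activity record

x = rng.normal(size=10)
y_full, h_full = forward(x)
y_lesioned, _ = forward(x, ablate=3)
print("effect of silencing unit 3:", y_full - y_lesioned)
```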
In addition to the practical benefits, full observability and perturbability offer an important constraint: with perfect data collection and perfect ability to manipulate the system, one cannot appeal to uncontrollable external factors to fill in the holes of an explanation. For example, it is not uncommon for neuroscience studies to make references to things like neuromodulation, inputs from an unrecorded brain region, downstream readout effects, or changes in behavioral state to explain confusing aspects of their data. With the use of ANNs, experimenters must face the fact that even with full knowledge of the details of a system, tools can still struggle to identify a satisfying mechanistic explanation. These models show that neural recording technology is not the only bottleneck to understanding the brain; the tools of analysis may be lacking too.
ANNs also have an unexplained 'algorithmic level' to explore. Because the weights in ANNs are trained rather than hand-designed, the principles by which these networks achieve their task performance remain mysterious. Much like in the brain, merely knowing the distributed activity of thousands (or more) neural units does not give a satisfying explanation of the computations that make these networks capable of carrying out difficult tasks. In this way, ANNs offer a setting in which to determine how to best recover an algorithmic explanation from a complex distributed system.

Generalizing from artificial to biological neural systems
The approach advocated here involves assessing methods from neuroscience by determining how well they can provide experimentally-validated understanding of ANN models. This directly demonstrates the utility of these tools for understanding ANNs, but does it mean the same is true for biological neural systems? There are several reasons to believe that validating a method on ANNs would be informative about its role in understanding the brain.

Similarity between artificial and biological neural networks
The recent resurgence of artificial neural network models in computational neuroscience has stemmed in large part from their success in matching important features of real neural activity. This was shown most prominently in primate visual cortex, where convolutional neural networks (CNNs) trained to perform object recognition were able to predict neural activity and match representational patterns recorded through electrophysiology and neuroimaging [Yamins et al., 2014, Khaligh-Razavi and Kriegeskorte, 2014]. This same approach has been fruitfully applied to capturing neural response properties in many other brain areas [Kell et al., 2018, Banino et al., 2018, Nayebi et al., 2021, Michaels et al., 2020, Wang et al., 2021, Barak, 2017]. In general, the representational similarity between models and data occurs at the population level; that is, layers of artificial neural networks show representations similar to those of populations of real neurons, but there is usually not a one-to-one correspondence between artificial and real neurons. While these trained models do not usually explain all of the neural variance in the data, they tend to do better than pre-existing and more hand-built models.
Even before training, many of the basic properties of ANNs match those of the brain, albeit at an abstract level. Given that artificial neurons were inspired by the physiology of real neurons [McCulloch and Pitts, 1943], they perform the same basic function of taking a weighted sum of inputs from other neurons and non-linearly transforming that into an output activation value. ANNs also implement parallel and distributed (and at times recurrent) processing of information, as used by the cortex. And in some cases the architectural details of ANNs are directly inspired by the brain. The convolutions and pooling layers in CNNs, for example, are based on the simple and complex cells identified in cat primary visual cortex [Lindsay, 2021].
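Written out, this shared input-output rule for a unit $i$ is

$$ a_i = \phi\Big( \sum_j w_{ij}\, a_j + b_i \Big) $$

where the $a_j$ are the activations of the input units, $w_{ij}$ are the learned connection weights, $b_i$ is a bias term, and $\phi$ is a nonlinearity such as a sigmoid or rectification.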
In this way, using ANNs as an analogy for different brain regions and systems has been productive for studying the brain. Their basic similarities (and the fact that more biological details can be added to ANNs) make it likely that many of the tools applied to real neural data will be applicable to ANNs as well.

ANNs as model organisms
While there are established relationships between trained ANNs and specific brain regions, ANNs do not need to function just like existing biological systems in order to be amenable to the same tools. Just as comparative psychology can use a common set of tools and concepts to study behavior across different species, the tools of neural analysis can be expected to work across different neural systems as well. Indeed, many common tools in systems neuroscience have been applied to species ranging from C. elegans to humans (e.g., Brennan et al. [2023]). There are many differences in the biological hardware of these species and in the behaviors they produce, yet this doesn't necessarily invalidate the use of the same tool across them all.
Importantly, ANNs and biological neural systems do not need to work the same way even on an algorithmic level. If we imagine two different animals (even from two different species) performing the same cognitive task, we would expect a useful analysis tool to be able to be applied to data from each and reveal something about the mechanism(s) at play. This may reveal that both animals use the same algorithm, or it may reveal a difference between the two. In either case, the tool was still validly applied and yielded useful insights on both animals (indeed, the main way we determine if two systems are working differently is by submitting them to the same analysis and this is commonly done to compare ANNs and the brain [Funke et al., 2021]). In the same way, an analysis tool may be proven useful through testing on ANNs and then when applied to real neural data reveal that the brain works differently than those ANNs.
In this way, ANNs can be thought of as just another model organism (or set of model organisms). When model organisms are used as a stand-in for humans to study cognitive function in health and disease, it is acknowledged that differences exist between the animal model and human, but assumed that enough similarities remain to make insights gleaned from the former applicable to the latter. Similarly, here it is assumed that enough similarities exist between ANNs and brains such that results about which tools are useful in studying ANNs can guide us in how we study the brain. Indeed, ANNs and biological neural systems have in common many of the features that make them difficult to understand: both are high dimensional, hierarchical and/or recurrent, nonlinear, distributed (but also modular), behavior-optimized information processing systems. Tools that can provide understanding in spite of these challenges will be useful for both types of systems.

The tools of systems neuroscience
The field of systems neuroscience takes a specific approach to understanding the brain that is focused on how circuits of neurons implement the computations that support perception, cognition and behavior. Its goals are aligned with identifying the algorithms at play in the brain and understanding how neural activity drives behavior. Here we summarize the main categories of analysis methods from systems neuroscience and discuss how they are suitable for testing on ANNs.
One common approach in systems neuroscience, visualization of activity via dimensionality reduction techniques such as principal components analysis (PCA), has been used to gain intuition about the response properties of populations with hundreds or thousands of recorded neurons [Cunningham and Byron, 2014]. These visualizations have provided insight to neuroscientists regarding the computations implemented by different brain regions and this method can be readily applied to ANNs.
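As a sketch of this workflow (assuming scikit-learn; the random data here are a stand-in for either recorded neurons or ANN units):

```python
# A minimal sketch of the standard visualization recipe: project
# high-dimensional population activity onto its top principal components.
# 'activity' is a placeholder matrix of timepoints x units.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
activity = rng.normal(size=(500, 300))          # 500 timepoints, 300 units

pca = PCA(n_components=3)
latents = pca.fit_transform(activity)           # (500, 3) low-d trajectory
print("variance explained:", pca.explained_variance_ratio_)
# 'latents' can now be plotted as a 2D/3D trajectory to eyeball structure.
```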
Table 1 summarizes the major categories of analysis methods from systems neuroscience, with an emphasis on those that can be tested on ANNs. Various flavors of dimensionality reduction have been developed to tackle different questions. Latent factor modeling also has some overlap with dimensionality reduction techniques in that it aims to find a small set of factors that explain much of the data. Many variants of representational similarity analysis have been used to compare representations across brain regions and/or models. Methods aimed at describing representational geometry are used to explain cognitive abilities like abstraction. Network neuroscience focuses on connections between neurons or brain regions rather than neural activity. Encoding/decoding approaches have a long history in neuroscience of trying to identify what kind of information is represented where.
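To give a concrete flavor of one of these categories, below is a hedged sketch of a basic representational similarity analysis: build a representational dissimilarity matrix (RDM) for each system over a shared stimulus set, then correlate the RDMs. The random responses are placeholders for real recordings and model activations.

```python
# A minimal RSA sketch: correlation-distance RDMs for two systems over
# the same stimuli, compared via Spearman correlation of their
# (condensed, upper-triangle) RDMs.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
stimuli = 50
resp_brain = rng.normal(size=(stimuli, 120))   # e.g. 120 recorded neurons
resp_model = rng.normal(size=(stimuli, 512))   # e.g. 512 units in an ANN layer

rdm_brain = pdist(resp_brain, metric="correlation")  # condensed RDM
rdm_model = pdist(resp_model, metric="correlation")

rho, _ = spearmanr(rdm_brain, rdm_model)
print(f"representational similarity (Spearman rho): {rho:.3f}")
```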
In fact, many of the methods of systems neuroscience are applicable to ANNs and could therefore be tested on them. This overlap in available methodology probably stems from the fact that many systems neuroscientists operate under an ANN-like view of the brain. That is, they mainly think of a neural population as a collection of simple input-output devices operating in parallel whose computations are enacted by the activity of these units, and that activity is a result of the connections the units make amongst themselves.
Still, not all analysis methods applied to neural data will be applicable to ANNs. Studying noise correlations in these networks, for example, would be difficult [Kohn et al., 2016]. Noise correlations rely on the existence of trialwise noise across the neural population, which could technically be added to ANNs, but is not normally part of their functioning. Studies of oscillations and analyses of local field potentials would similarly be difficult to properly replicate in ANNs [Friston et al., 2015]. In general, though, as long as the data produced by an ANN has the right properties as specified by the analysis method, there is no a priori reason we shouldn't be able to test its usefulness on ANNs.
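For instance, trialwise noise of the sort noise-correlation analyses require could be injected into an ANN's units; a minimal sketch follows (the isotropic Gaussian noise model is an illustrative assumption, not a standard choice):

```python
# A sketch of injecting trialwise noise into an ANN so that noise
# correlations become measurable, as discussed above.
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(50, 10))

def noisy_response(x, n_trials=200, sigma=0.5):
    """Repeat the same stimulus many times with injected unit noise."""
    noise = sigma * rng.normal(size=(n_trials, 50))
    return np.tanh(x @ W.T + noise)             # (trials, units)

responses = noisy_response(rng.normal(size=10))
noise_corr = np.corrcoef(responses.T)           # pairwise unit correlations
print("mean off-diagonal noise correlation:",
      (noise_corr.sum() - np.trace(noise_corr)) / (50 * 49))
```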
Different methods will be more or less appropriate for different networks and different questions. To determine the circumstances under which a given method provides productive results, it is important to test a tool on many different ANNs with different architectures and trained on a range of tasks. By identifying the circumstances under which a method reliably leads to experimentally-validated understanding, this testing will better guide the tool's use in practice.

What has already been shown about these tools
In addition to the above arguments for the need to test the tools of systems neuroscience, we can look to studies in which critical evaluations of methodology, including through the use of ANNs, have called into question the suitability of some of these tools.
Though not using an ANN, Jonas and Kording [2017] tested a wide swath of common analysis methods from neuroscience on a simulated microprocessor, which is also a complex distributed system. The results were not encouraging. Specifically, the authors applied techniques such as connectivity and cell type analyses, lesioning, tuning curves, and functional connectivity. While they got some promising results through connectome analysis and dimensionality reduction, they ultimately concluded "the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor. This suggests current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems, regardless of the amount of data." Other works have aimed to more formally explore just how difficult the task posed to these tools is. In Ramaswamy [2019], it is claimed that understanding the brain through perturbation tests will require an infeasible number of experiments. Considering the goal of cognitive science, Rich et al. [2021] come to similarly pessimistic conclusions. Such studies show the importance of explicitly reflecting on the goals of analysis tools and their ability to deliver on those goals.
Table 1. Major categories of analysis methods from systems neuroscience.
Latent Factor Models: aim to find a small set of factors that explain much of the data. See Recanatesi et al. [2021].
Representational Similarity Analyses: compare representations from recorded neural activity to models, across brain areas, or to theoretically-motivated encoding schemes to understand information transformation. See Kriegeskorte [2008], Kornblith et al. [2019], Morcos et al. [2018b], Kriegeskorte and Kievit [2013], Kriegeskorte and Wei [2021].
Representation Geometry: characterizes the shape of neural population responses to understand computations and transformations. See Bernardi et al. [2020], Chung and Abbott [2021], Nieh et al. [2021], Chaudhuri et al. [2019].
Network Analyses: use graph theory measures to find non-trivial topological features of functional or structural connectivity. See Bassett and Sporns [2017], Medaglia et al. [2015], Bassett et al. [2018], Sporns [2014], Mehler and Kording [2018].
Encoding Models: quantify the information encoded in neural activity and can include characterization of tuning properties, trained decoder performance, regression models, and formal metrics of information theory. See Kriegeskorte and Wei [2021], Kriegeskorte and Douglas [2019], Butts and Goldman [2006], Quiroga and Panzeri [2009], Timme and Lapish [2018], Paninski et al. [2007], Glaser et al. [2020].
Bespoke Methods: are designed specifically for the data collected in a study and don't undergo formal method development, but may be suitable for testing on ANNs.

One beloved tool of systems neuroscience that has been critically examined in ANNs is single-cell selectivity. For decades, the strength and quality of the tuning of individual neurons has been assumed to be important for understanding a brain region's function. A variety of studies investigating the relationship between the quality of single-cell tuning and task performance in ANNs have called this assumption into question [Morcos et al., 2018b,a, Amjad et al., 2021, Lindsay and Miller, 2018, Leavitt and Morcos, 2020]; some have even shown that strong single-cell selectivity can have a negative relationship with performance. This points to the possibility that historical experimental limitations (e.g. the ability to only record from single neurons at a time) may have counter-productively influenced the way we study the brain today. It also demonstrates that basic intuitions (e.g. that a cell responding strongly to a stimulus indicates its functional importance for processing that stimulus) can't always be trusted when studying complex nonlinear systems. To fully understand when and how single-cell selectivity can be responsible for network behavior will require continued investigation in ANNs and verification in real neural networks, which may for a variety of reasons show different selectivity properties [Ratan Murty et al., 2021]. However, some experimental results already do suggest an unexpected disconnect between selectivity and behavior in the brain [Lange et al., 2023, Koida and Komatsu, 2023].
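A minimal sketch of how such a selectivity-versus-importance test might be set up follows (the selectivity index and toy data are illustrative, and the ablation step is indicated only in comments, with hypothetical names):

```python
# A sketch of the selectivity-vs-ablation comparison the studies above
# perform: score each hidden unit's class selectivity, then ask whether
# ablating selective units hurts performance more than ablating
# unselective ones. Network and task are toy placeholders.
import numpy as np

rng = np.random.default_rng(3)
n_units = 100
acts = rng.gamma(2.0, size=(2, n_units))     # mean activity per class (A, B)

# Class-selectivity index per unit: |A - B| / (A + B), in [0, 1].
selectivity = np.abs(acts[0] - acts[1]) / (acts[0] + acts[1])

most_selective = np.argsort(selectivity)[-10:]   # candidate 'important' units
least_selective = np.argsort(selectivity)[:10]

# In a real test, one would ablate each group in the trained network and
# compare the resulting drops in task accuracy, e.g. (hypothetical API):
#   drop_sel  = accuracy(net) - accuracy(net, ablate=most_selective)
#   drop_rand = accuracy(net) - accuracy(net, ablate=least_selective)
print("top selectivity scores:", selectivity[most_selective].round(2))
```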
A recent study on recurrent ANNs explored the properties of representational similarity analyses as a means of comparing neural systems [Maheswaranathan et al., 2019]. This work showed that aspects of representational geometry are influenced by features of the networks that are not related to their function, calling into question the usefulness of these tools for developing productive mechanistic understanding. On the other hand, they found that an analysis of certain features of the dynamics did provide more consistent insights into the computations being performed.
Additionally, previous works that have developed and reflected on specific analysis tools have indeed acknowledged the usefulness of testing these tools on ANNs [Bernardi et al., 2020, Zaharia et al., 2021, Barrett et al., 2019, Schaeffer et al., 2020, Parde et al., 2021]. In Bernardi et al. [2020], in particular, the authors showed how representational geometry techniques can recover abstract representations of the inputs in artificial and real neural activity. According to Chung and Abbott [2021], 'ANNs can serve as a testbed for developing population-level analysis techniques, such as geometric approaches, even if they are ultimately aimed at neuroscience applications.' This idea is therefore intuitive to some scientists working to understand both real and artificial neural systems.
As these previous works show, studies that explicitly investigate the tools of neuroscience can return useful and at times surprising results about their ability to provide understanding of complex and distributed neural systems. We therefore stand to benefit from an extensive test of many common methods. This is especially true as the use of analysis methods can suffer from the 'file drawer problem' normally ascribed to null experimental results. That is, we don't know all of the times an analysis method was applied to data but did not provide interesting results and therefore the use of the method was not published. The controlled and complete test of analysis tools proposed here would thus provide a more accurate picture of how useful different methods are than the published literature does.

Interaction with interpretable AI
As alluded to above, trained artificial neural networks pose their own interpretability challenge, and this can be an issue for the computer scientists who build and use them. To meet this challenge, a variety of 'interpretable AI' methods have arisen that aim to give some insight about how a trained ANN works [Gilpin et al., 2018, Samek et al., 2021]. The overarching goal is to be able to answer why the network produces a particular behavior, and to understand how that behavior is implemented. In this way, the goals of interpretable AI are very similar to those of neuroscience: neuroscience also aims to understand the optimization principles that govern the brain (related to the concept of non-causal explanations as in Chirimuuta [2018]) and the computations it implements [Kanwisher et al., 2023, Kar et al., 2022]. The test of neuroscience tools on ANNs can therefore also benefit interpretable AI by bringing in a slew of new methods for understanding artificial neural networks.
Furthermore, the methods of interpretable AI could help guide the development of tools to study the brain. This has already occurred, for example, through the generation of optimal visual stimuli for V4 neurons, which builds on visualization techniques from interpretable AI [Bashivan et al., 2019]. Controversial stimuli, which are designed to determine which of two models humans are most aligned with, were also inspired by work on adversarial examples in machine learning [Golan et al., 2020].
Interpretable AI methods are often similar to techniques used to debug computer programs, for example, asking which portions of inputs cause a particular output, or which steps in the training or running of the model are responsible for a particular behavior. ANN interpretation methods have been able to identify situations where the network is attending to the wrong input [Kim et al., 2018, Hendricks et al., 2018] or contains a mechanism with an undesirable bias [Vig et al., 2020]. They have identified representations that contain structural information about the problem [Jawahar et al., 2019] and others that contain disentangled neurons that distill emergent features [Wu et al., 2021, Bau et al., 2018]. And recently, interpretable methods have begun to elucidate the emergence of algorithmic mechanisms in large networks, such as inductive copying behavior [Olsson et al., 2022] and the retrieval of factual associations [Meng et al., 2022]; these can allow the weights of a network to be modified to, for example, directly alter the factual associations stored in a network. See a list of interpretable AI methods in Table 2.
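As one concrete example from this family, input synthesis (see Table 2) can be sketched as gradient ascent on the input. This assumes a differentiable PyTorch model; the input shape, unit index, and hyperparameters are arbitrary illustrative choices:

```python
# A hedged sketch of input synthesis (activation maximization):
# gradient-ascend an input image so that it drives a chosen output unit.
import torch

def synthesize_input(model, unit=0, steps=200, lr=0.05):
    x = torch.randn(1, 3, 64, 64, requires_grad=True)   # start from noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -model(x)[0, unit]      # maximize the unit's activation
        loss.backward()
        opt.step()
    return x.detach()

# Usage (assuming a model whose output indexes the unit of interest):
# preferred_stimulus = synthesize_input(my_cnn, unit=42)
```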
While some methods from interpretable AI are specifically built to control the network's output, many are not. Interpretable AI methods also therefore stand to benefit from the same testing procedure described here and evaluation according to the goal of experimentally-validated understanding, especially given existing debates within interpretable AI about the suitability of different methods [Doshi-Velez and Kim, 2017, Lipton, 2018, Kindermans et al., 2019, Zimmermann et al., 2021]. Through this testing, a unified set of methods could be identified that are proven suitable for assisting neural systems understanding as a whole. That is, researchers aiming to understand AI or the brain could pull from a shared toolbox and set of frameworks for approaching these systems, and benefit from understanding gleaned from both sides.
Finally, some scientists suspect that there are features of the brain that make it inherently more interpretable than ANNs. This may mean that ANNs aren't an appropriate testbed for tools that we want to apply to the brain, as ANNs may actually be more difficult to understand. If this is the case, however, explicitly identifying those features would be immensely valuable. Specifically, this information could be used to build more brain-like ANNs which are thus more interpretable, as has been attempted recently [Liu et al., 2023].

Table 2. Methods from interpretable AI.
Input synthesis methods: help understand network decisions by using optimization methods to create inputs that induce a particular output. See Szegedy et al. [2013], Morris et al.

Challenges and limitations
Validating and unifying methods of neural systems understanding is a valuable goal, but will likely face a variety of challenges and limitations.
The work proposed here rests on the belief that there will be some overlap between the tools that can be used to understand ANNs and real neural systems. Not all researchers will subscribe to this belief. However, if ANNs do not violate any explicit assumption or requirement of the method being studied then why should these methods not illuminate the workings of ANNs? Usually the answer is that there is some looser, unstated assumption about why the tool is useful for understanding the brain, and ANNs somehow violate these less formal assumptions. One such example is the fact that the brain is the result of evolution and a set of developmental processes based mainly on local learning rules; ANNs on the other hand are typically trained from random initialization with gradient descent. The question of whether these differences actually invalidate the use of the tools of neuroscience on ANNs is an empirical one. A variety of more biologically-plausible forms of training ANNs are constantly being developed [Lillicrap et al., 2020]; these networks can be used to determine if local learning rules really do impact the applicability of our tools. In any case, it would be beneficial to make these unstated assumptions about the tools explicit, so that their application to real neural data is better understood.
What's more, it is sometimes claimed that we should not expect to get a satisfying human-comprehensible understanding of neural systems (either artificial or biological) at all [Cao and Yamins, 2021b]. According to this view, the networks are simply too distributed and unconstrained to promise that a simpler description of their function is possible. Therefore, attempting to develop methods that will provide a compressed, algorithmic understanding of their workings is ill-founded. Indeed, Lillicrap and Kording [2019] claim that 'a lot of approaches currently deployed to understand brains may be rather transparently unable to deliver the results neuroscientists are looking for' and that neuroscientists should therefore largely abandon this goal and instead aim to describe the architectures, objective functions, and learning rules that give rise to these systems [Richards et al., 2019].
While there is no guarantee that a satisfying description of the function of a neural network is always possible, past successes suggest that some progress can be made. At the very least, it seems reasonable to assume that we could find some description of how a neural network works that is more compressed than simply listing all of its weights and activity values. Still, neuroscientists' expectations regarding what understanding looks like, and what concepts it will be based on, will likely need to shift as the aim moves to understanding larger neural populations performing more naturalistic behaviors.
A final limitation of this approach is that results will be valid only for the type of understanding operationalized here, that is, experimentally-validated understanding focused on the goal of behavioral control. While we believe this is a well-motivated goal, other types of understanding may also be desirable. Other forms of mechanistic or non-mechanistic explanation [Chirimuuta, 2014, Ross, 2021] may need to be operationalized and tested in different ways.

Conclusion
The tools we use to analyze neural systems will determine how accurately and efficiently we develop useful theories of how these systems function. For this reason, it is important to explicitly reflect on and vet these tools. Here we argued that ANNs are the appropriate systems on which to perform this vetting, both for tools from interpretable AI and for tools from neuroscience. By encouraging more explicit definitions of the desired outcomes for these tools and empirical tests of their abilities, we hope to contribute to the development of sound and effective methods for the study of neural systems, both biological and artificial.

Acknowledgements
Thanks to Rosa Cao, Matteo Colombo, and Josh Merel for input on the manuscript.