Some free-energy puzzles resolved: response to Thornton

Chris Thornton [1] poses some simple but key questions about the free-energy principle reviewed in [2]. These puzzles have simple and clear answers:


Letters Response
Karl Friston
The Wellcome Trust Centre for Neuroimaging, University College London, Queen Square, London, WC1N 3BG, UK

Puzzle: "A generative model of causal structure in the environment is [then] obtained, on which basis the agent is able to infer the 'causes of sensory samples' [ibid. p. 294]. What is unclear is how this mechanism would function where sensory samples are ambiguous" [1].
Answer: One of the main motivations for the free-energy principle is its appeal to [approximate] Bayesian inference, where ambiguities are resolved by priors [3]. Priors are mandated by the (ill-posed) problems created by ambiguity, and empirical priors are an integral part of hierarchical inference [2, Box 3]. This is not theoretical hand-waving; in biophysics, the free-energy formulation is used routinely to solve difficult ill-posed inverse problems (e.g. [4]).
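To make the role of priors concrete, here is a minimal sketch in Python. It is not taken from [2] or [3]: the two candidate causes (labelled "convex" and "concave" purely for illustration), the function name posterior and all the numbers are assumptions. The likelihoods are deliberately identical, so the sensory sample alone cannot distinguish the causes, and only the prior can.

```python
# Illustrative toy only: causes, names and numbers are hypothetical.

def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of causes."""
    joint = {cause: prior[cause] * likelihood[cause] for cause in prior}
    evidence = sum(joint.values())
    return {cause: joint[cause] / evidence for cause in joint}

# An ambiguous sample: both causes explain it equally well.
likelihood = {"convex": 0.5, "concave": 0.5}

# An assumed prior favouring one cause breaks the tie.
prior = {"convex": 0.9, "concave": 0.1}

post = posterior(prior, likelihood)
print(post)
```

With a flat prior the posterior stays flat, which is the ill-posed case; with an informative prior the posterior simply inherits the prior's preference, because the data contribute nothing to the choice.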
Puzzle: "On the face of it, no particular stand is taken on emergence of the structures that mediate minimization. But looking at the definition of free-energy, we find a significant role being played by the variable W. It is values of this variable that encapsulate the brain's representation of 'environmental causes'" [1].
Answer: The representations are not environmental causes W but the sufficient statistics μ of the brain's recognition density q(W; μ); these include synaptic activity and efficacy [2]. The implicit optimization of neuronal connections (i.e. perceptual learning) leads to hierarchical brain structures (models) that recapitulate causal structure in the sensorium. This optimization process can 'prune' the form or structure of the model (cf. synaptic pruning [5]) and is used routinely in model optimization (e.g. automatic relevance determination [6]). Furthermore, one could regard natural selection as optimizing the structural form of models at an evolutionary scale, through minimizing free-energy (where it is called free-fitness [7]). In a statistical setting, free-energy bounds on model evidence are used routinely in Bayesian model selection (where the log model evidence is negative surprise, e.g. [8]) (Figure 1).

Figure 1. This schematic summarises the various timescales over which minimization of free-energy can be considered as optimizing the state (perception), configuration (action), connectivity (learning and attention), anatomy (neurodevelopment) and phenotype (evolution) of an agent. Here, F(s̃, μ^(i) | m^(i)) is the free-energy of the sensory data (and its temporal derivatives) s̃(a) and states of an agent m^(i) ∈ m that belongs to class m. The states μ ⊃ μ_x, μ_γ, μ_θ correspond to synaptic activity, gain and strength, respectively, whereas action a determines the sampling of sensory data.

Trends in Cognitive Sciences Vol.14 No.2
Puzzle: "With the framework providing no principle for deciding the range of W, the brain's representation of the conditional density is inevitably a 'slightly mysterious construct'" [1].
Answer: The range of W (the values it can take) is specified by the form of the (generative) model and the priors it entails. For example, the equation in Box 2 [2] specifies the range of hidden states in the world x^(i) ⊂ W with the range of a function, for example a neuronal activation function. The 'slightly mysterious' aspect of the recognition density is not its form (nor the implicit range of causes that are represented) but the fact that it is induced by the brain's physical states (which encode the recognition density).
Answer: The explanatory advance furnished by free-energy is fundamental: it provides a means to minimize surprise. This is because surprise cannot be quantified by an agent, whereas free-energy can. Again, this is not abstract hand-waving; the free-energy bound on surprise (or log-evidence for a model) plays an essential role in physics [9], machine learning [10] and statistics [11] for this reason.
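The bound can be checked numerically. The sketch below is a toy under stated assumptions, not the scheme described in [2]: it uses a discrete cause W with three hypothetical values, made-up prior and likelihood values for one fixed sample s, and the standard variational identity F = E_q[ln q(W) - ln p(s, W)] = -ln p(s) + KL(q || p(W|s)). The names free_energy, guess and the numbers are illustrative only.

```python
import math

def free_energy(q, joint):
    """F = sum_w q(w) * (ln q(w) - ln p(s, w)) for one fixed sample s."""
    return sum(qw * (math.log(qw) - math.log(joint[w]))
               for w, qw in q.items() if qw > 0)

# Hypothetical generative model over three causes (values are assumptions).
prior = {"a": 0.7, "b": 0.2, "c": 0.1}        # p(w)
likelihood = {"a": 0.1, "b": 0.6, "c": 0.3}   # p(s | w) for the observed s
joint = {w: prior[w] * likelihood[w] for w in prior}  # p(s, w)

# Surprise is tractable here only because the toy has three causes.
evidence = sum(joint.values())                # p(s)
surprise = -math.log(evidence)                # -ln p(s)
posterior = {w: joint[w] / evidence for w in joint}

# An arbitrary recognition density versus the exact posterior.
guess = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
F_guess = free_energy(guess, joint)
F_post = free_energy(posterior, joint)

print(surprise, F_guess, F_post)
```

Evaluating F needs only the recognition density q and the generative model p(s, W); the surprise is computed here solely to verify the bound. F exceeds the surprise for the arbitrary guess and equals it, up to rounding, when q is the exact posterior.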

David Over
Psychology Department, Durham University, Durham City DH1 3LE, UK

Despite a long tradition of research in both fields, the psychological study of intelligence and its tests has not been well integrated with the psychological study of rationality. Keith Stanovich's well-written and accessible book does integrate these studies and should, for that reason alone, have a highly beneficial impact on both.
Stanovich argues that standard intelligence tests miss a trait that can be of even greater value than relatively high intelligence: rationality. Although these tests measure something of value, it is not rationality: rationality is usually at best modestly correlated with scores on intelligence tests.
Stanovich relies on the standard view in cognitive science that rationality should be defined in two related ways. It can refer, most fundamentally, to instrumental rationality, which is the ability to achieve one's goals or (more technically) maximize expected utility. It can also refer to epistemic rationality: the capacity to acquire the well-justified beliefs that are usually necessary for goal achievement. Stanovich establishes, both informally and on the basis of an extensive empirical literature, that there is a clear distinction between rationality so defined and intelligence. Intelligence can help to solve some problems about rational belief or action, but is of little help in other cases. An example is myside bias, the tendency to evaluate evidence from an egocentric point of view. Stanovich and his collaborator Richard West found no correlation between the magnitude of this bias and intelligence.