Switch from ambient to focal processing mode explains the dynamics of free viewing eye movements

Previous studies have reported that humans employ ambient and focal modes of visual exploration while they freely view natural scenes. These two modes have been characterized based on eye movement parameters such as saccade amplitude and fixation duration, but not by any visual features of the viewed scenes. Here we propose a new characterization of eye movements during free viewing based on how eyes are moved from and to objects in a visual scene. We applied this characterization to data obtained from freely-viewing macaque monkeys. We show that the analysis based on this characterization gives a direct indication of a behavioral shift from ambient to focal processing mode along the course of free viewing exploration. We further propose a stochastic model of saccade sequence generation incorporating a switch between the two processing modes, which quantitatively reproduces the behavioral features observed in the data.


Object images used in the experiment
Supplementary Figure 1. The whole set of 64 object images used to generate stimulus images for the free viewing task. The scale bar at the upper right indicates 2 degrees of visual angle. "Human face" 1-8 are facial images of different individuals, not shown here for anonymisation reasons. The gray disk in each of these images indicates the size of the face. All images are drawn from the Microsoft image gallery (see Methods of the main article for details), except for the first two human face images, which were taken by one of the coauthors.

Sequential dependence between successive saccade types
In this analysis we examine the temporal structure of free viewing eye movements on a finer temporal scale than the analysis in the main article, by focusing on sequential dependences between the types of successive saccades. To this end, we formulate a framework to test, for every possible combination of saccade types, whether the types occur in succession more or less frequently than expected by chance. For simplicity, we focus on pairs of saccades that immediately precede and follow an object fixation (Fig. S2A). The null hypothesis for this test is that the types of the saccades preceding and following an object fixation are independent of each other. Rejection of this null hypothesis means that there is a bias in the occurrence pattern of successive saccade types.
For each possible combination of saccade types, we counted the successive saccade pairs whose types match the combination in the empirical eye movement data of the monkeys. From the empirical counts for all combinations, we constructed a 3×3 contingency table, in which the rows are the types of the preceding saccade (i.e., intra-object, trans-object and background-to-object) and the columns are the types of the following saccade (i.e., intra-object, trans-object and object-to-background), by filling the cells with the corresponding counts. We applied a χ² test to the table to test for deviation of the empirical counts from the counts expected under the null hypothesis of independence. In case of significant deviation, we further applied a post-hoc residual analysis to test separately the significance of the deviation of each empirical count from its expected count.
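The test procedure described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the article: the function name `chi2_independence` and the counts in `example_counts` are hypothetical, and the post-hoc step is assumed here to use adjusted standardized residuals compared against a standard normal distribution, which the text does not specify. The sketch uses only the Python standard library and relies on the closed-form survival function of the χ² distribution for an even number of degrees of freedom (4 for a 3×3 table).

```python
import math

def chi2_independence(table):
    """Pearson chi-squared test of independence plus per-cell residuals.

    `table` is a list of rows of counts. The p-value formula used here is
    only valid for an even number of degrees of freedom (e.g. a 3x3 table).
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_tot)

    # Counts expected if preceding and following saccade types are independent.
    expected = [[row_tot[i] * col_tot[j] / total for j in range(n_cols)]
                for i in range(n_rows)]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(n_rows) for j in range(n_cols))

    dof = (n_rows - 1) * (n_cols - 1)
    if dof % 2:
        raise ValueError("closed-form p-value needs an even number of dof")
    half = chi2 / 2.0
    # Survival function of chi-squared with even dof = 2m:
    # exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k)
                                    for k in range(dof // 2))

    residuals, cell_p, percent_diff = [], [], []
    for i in range(n_rows):
        res_row, p_row, pct_row = [], [], []
        for j in range(n_cols):
            # Adjusted standardized residual (assumed post-hoc statistic).
            denom = math.sqrt(expected[i][j]
                              * (1 - row_tot[i] / total)
                              * (1 - col_tot[j] / total))
            z = (table[i][j] - expected[i][j]) / denom
            res_row.append(z)
            # Two-sided p-value under the standard normal distribution.
            p_row.append(math.erfc(abs(z) / math.sqrt(2)))
            # Difference of the empirical count from the expected count, in %.
            pct_row.append((table[i][j] / expected[i][j] - 1) * 100)
        residuals.append(res_row)
        cell_p.append(p_row)
        percent_diff.append(pct_row)

    return {"chi2": chi2, "dof": dof, "p": p_value, "expected": expected,
            "residuals": residuals, "cell_p": cell_p,
            "percent_diff": percent_diff}

# Hypothetical counts, NOT data from the article.
# Rows: preceding saccade (intra-object, trans-object, background-to-object).
# Columns: following saccade (intra-object, trans-object, object-to-background).
example_counts = [[120, 40, 40],
                  [40, 120, 40],
                  [40, 40, 120]]
result = chi2_independence(example_counts)
```

A positive residual marks a combination that occurs more often than expected under independence and a negative residual one that occurs less often. The `percent_diff` entries correspond to the percentage differences between empirical and expected counts that are annotated on the transition diagrams.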
The results of this analysis are summarized in the form of a transition diagram (Fig. S2B and C). Both monkeys show significantly more frequent successive intra- and trans-object saccades (red circular arrows at the intra-object node and the trans-object node), while the transitions between these types occur significantly less often than expected by chance (blue arrows in both directions between the intra-object node and the trans-object node). Thus, the analysis reveals that saccades of the same type tend to occur in successive repetitions. This result motivated us to construct the saccade sequence generation model, described in the main article, from distinct states that each generate a specific type of saccade, in order to incorporate a mechanism for generating the successive repetitions revealed in the empirical data.
Supplementary Figure 2. Sequential dependence between successive saccade types. (A) Schematic illustration of the saccade types that we focus on in the present analysis. We test for biases in the combination of saccade types preceding and following an object fixation. Object images are taken from the Microsoft image gallery. (B) Graphical summary of the test performed on the whole saccade data of monkey H. Nodes and arrows represent saccade types and their transitions, respectively. A circular arrow represents recurrence of an identical saccade type. Numbers next to the arrows indicate the empirical occurrence counts of the respective transitions and recurrences, with the percentage values in parentheses representing the difference of the empirical count from the occurrence count expected by chance (e.g., "+50%" means that the empirical count is 150% of the expected count). Thick arrows indicate the transitions and recurrences with difference percentages above 30% or below -30%. The color of an arrow indicates whether the empirical count is significantly more (red), significantly less (blue), or not significantly different (gray) than expected. The significance level was set to p = 0.01. (C) Same graph as (B), but for monkey S.

Application of the model to free viewing eye movements on natural scene background stimuli
The model was applied to the eye movement data from trials with natural scene background stimuli (Fig. S3). The main results remain unchanged from those shown in the main article, which were obtained from free viewing on gray background stimuli. The most noticeable difference is that GoF is generally lower for natural scene background trials than for gray background trials. This is most likely because of an increase in the number of background fixations induced by the complex background images. This increase is indicated by the value of ; it increases from 0.23 to 0.41 for monkey H, and from 0.21 to 0.51 for monkey S. The occurrence of more background fixations causes larger variability in the numbers of intra-object and trans-object saccades across trials, and hence should degrade the performance of the model. Furthermore, the ratios of the saccade types other than intra-object and trans-object are not as constant as observed for free viewing on gray background stimuli. This feature further degrades the performance of the model, because the model by construction cannot account for such changes.
(F) Same as (E), but for monkey S. Note that, unlike in Fig. 6E and F of the main article, the same initial condition (the initial fixation on the background) was used for the simulations of monkey H and monkey S, because no object is placed at the center of the natural scene background stimuli used for both monkeys.

Mathematical formulation of the model
Our stochastic model of saccade sequence generation described in the main article can be formalized as a hidden Markov model with four hidden states and three observations. The four states are identical to the saccade generation states shown in Fig. 5A of the main article, which we denote here as { , , , }, with the following correspondence: and are the intra- and trans-states of the early mode, respectively, and and are the intra- and trans-states of the late mode, respectively. The three observations, which we denote as { , , }, are interpreted as the types of simulated fixations as follows: is a fixation on the same object as the previous fixation, is a fixation on an object different from that of the previous fixation, and is a fixation on the background.
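The state transition probabilities are defined as follows.
[Table of state transition probabilities, with columns indexed by the destination state ("To"); entries not recoverable]
The emission probabilities, which relate the current state of the model to the current observation, are defined as follows.
[Table of emission probabilities, with columns indexed by the observation; entries not recoverable]
Note that , , , , and are the same probabilities as those defined in the Results section and the Methods section of the main article.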
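The hidden Markov formulation in this section, with four hidden states and three observations, can be illustrated with a short simulation sketch. Everything numerical below is a placeholder: the entries of `TRANSITION` and `EMISSION` are hypothetical values chosen only so that each row sums to one, not the probabilities defined in the main article. The state and observation names, the function `simulate`, the one-way switch from the early to the late mode, and the choice to emit an observation before each state transition are likewise assumptions made for illustration.

```python
import random

# Hidden states: intra- and trans-states of the early and late modes.
STATES = ("early_intra", "early_trans", "late_intra", "late_trans")
# Observations: types of simulated fixations.
OBSERVATIONS = ("same_object", "different_object", "background")

# Placeholder transition probabilities (rows: from-state, columns: to-state,
# in the order of STATES). Hypothetical values, NOT those of the article.
# The zeros in the late rows encode an assumed one-way early-to-late switch.
TRANSITION = {
    "early_intra": (0.50, 0.40, 0.05, 0.05),
    "early_trans": (0.30, 0.60, 0.05, 0.05),
    "late_intra":  (0.00, 0.00, 0.70, 0.30),
    "late_trans":  (0.00, 0.00, 0.40, 0.60),
}

# Placeholder emission probabilities (columns in the order of OBSERVATIONS).
# Hypothetical values, NOT those of the article.
EMISSION = {
    "early_intra": (0.80, 0.00, 0.20),
    "early_trans": (0.00, 0.80, 0.20),
    "late_intra":  (0.80, 0.00, 0.20),
    "late_trans":  (0.00, 0.80, 0.20),
}

def simulate(n_fixations, initial_state="early_trans", seed=None):
    """Sample a sequence of hidden states and fixation types from the HMM."""
    rng = random.Random(seed)
    state = initial_state
    states, observations = [], []
    for _ in range(n_fixations):
        # Emit a fixation type from the current state (assumed ordering).
        obs = rng.choices(OBSERVATIONS, weights=EMISSION[state])[0]
        states.append(state)
        observations.append(obs)
        # Move to the next hidden state.
        state = rng.choices(STATES, weights=TRANSITION[state])[0]
    return states, observations

sim_states, sim_observations = simulate(n_fixations=50, seed=0)
```

Replacing the placeholder rows with the probabilities from the main article, and the initial state with the initial condition used for each monkey, would be needed before comparing simulated sequences with the empirical data.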