How workload and availability of spatial reference shape eye movement coupling in visuospatial working memory

Eyes are active in memory recall and visual imagination, yet our grasp of the underlying qualities and factors of these internally coupled eye movements is limited. To explore this, we studied 50 participants, examining how workload, spatial reference availability, and imagined movement direction influence internal coupling of eye movements. We designed a visuospatial working memory task in which participants mentally moved a black patch along a path within a matrix; each trial involved one step along this path (presented via speakers: up, down, left, or right). We varied workload by adjusting matrix size (3 × 3 vs. 5 × 5), manipulated availability of a spatial frame of reference by presenting either a blank screen (requiring participants to rely solely on their mental representation of the matrix) or spatial reference in the form of an empty matrix, and contrasted active task performance to two control conditions involving only active or passive listening. Our findings show that eye movements consistently matched the imagined movement of the patch in the matrix and were not driven solely by auditory or semantic cues. While workload influenced pupil diameter, perceived demand, and performance, it had no observable impact on internal coupling. The availability of spatial reference enhanced coupling of eye movements, leading to more frequent and more precise saccades that were more resilient to noise and bias. The absence of workload effects on coupled saccades in our study, in combination with the relatively high degree of coupling observed even in the invisible matrix condition, indicates that eye movements align with shifts in attention across both visually and internally represented information. This suggests that coupled eye movements are not merely strategic efforts to reduce workload, but rather a natural response to where attention is directed.


Introduction
Although the primary function of the eyes is to focus on and preprocess visual information, they exhibit notable activity during the recollection of memories or engagement in visual mental imagery (Benedek, 2018; Johansson et al., 2022; Mast and Kosslyn, 2002; Walcher et al., 2017). For example, our eyes adapt to imagined luminance (e.g., Kay et al., 2022; Laeng and Sulutvedt, 2014), follow imagined objects (e.g., Demarais and Cohen, 1998; De'Sperati, 2003; Huber and Krist, 2004; Johansson et al., 2006), and even adapt to the distance of an imagined object (e.g., Sulutvedt et al., 2018). Below, we use the term internal coupling of eye behavior to refer to this phenomenon where mental representations and internal processes trigger or affect eye behavior (Annerer-Walcher et al., 2021; Walcher et al., 2017).
To gain a deeper understanding of this phenomenon and its underlying mechanisms, we must identify its key characteristics and the factors that influence it. Below, we review literature on some of the factors that seem most relevant to us. First, what aspect of an internal activity is the main trigger of coupled eye movements: Do eyes mainly follow shifts of attention/updating of mental representations or are eye movements mainly triggered automatically by spatial cues unrelated to the current mental representation (e.g., hearing "up")? Second, what effect does workload have on internal coupling: Is internal coupling of eye movements used to reduce workload ("offloading")? Third, is coupling of eye behavior affected when spatial reference is available? Fourth, how precise, or how prone to distortions, is internal coupling?

What aspects of internal activities elicit internal coupling?
Most research on eye movements during internally directed cognition comes from studies in which participants had to encode, maintain, and retrieve a (static) picture or set of stimuli (for an overview, see Wynn et al., 2019). Those studies showed that gaze patterns exhibited during encoding of a picture are reinstated during retrieval (Johansson et al., 2022; Johansson and Johansson, 2013; Wynn et al., 2018). While some authors suggest that this gaze reinstatement plays a functional role for memory retrieval (Johansson et al., 2022; Johansson and Johansson, 2013; Wynn et al., 2018, 2019), others found no evidence for a functional role of gaze reinstatement in memory retrieval in a study using experimental manipulation of gaze reinstatement (Foulsham and Kingstone, 2013). Further, studies on the "looking-at-nothing" effect showed that when people try to remember information, they tend to look at the position where this information was presented previously, although the position is now empty (e.g., Ferreira et al., 2008). Therefore, some argued that eye movements could serve as some sort of visual index of items held in memory, i.e., the spatial position and/or eye movements are stored together with the memorized item and are reinstated during retrieval (e.g., Mast and Kosslyn, 2002; Spivey and Geng, 2001). This account is also supported by findings showing that simply reading (without visualizing) words associated with specific spatial positions (e.g., "sun", which is typically found overhead) facilitates eye movements in the congruent (vs. incongruent) direction (Dudschig et al., 2013). Johansson et al. (2006) found that during retelling and listening to a story involving spatial information ("left of the tree") eye movements more closely reflect the direction of the movement (e.g., leftward saccade) than the actual position (saccade landing position) in the mental representation. Even when gaze is restricted, in the form of requiring participants to maintain gaze on a fixation cross, small shifts in gaze position still reflect the attention shift towards the items represented in mind (Gresch et al., 2023; van Ede et al., 2019a). Eye movements also follow movements and manipulations in mental imagery. Already in 1964, Antrobus et al. found that imagining a dynamic scene elicits more eye movements than imagining a static scene (Antrobus et al., 1964). Huber and Krist (2004) found that the eyes follow the imagined track of an invisible ball when trying to estimate when the ball hits the ground. Hence, some researchers argue that eye movements during internally directed cognition reflect attention shifts within mental representations (Johansson et al., 2006; Kosslyn, 2005; van Ede et al., 2019a).
The literature highlights two primary theories on why eye movements occur during internal cognition. First, eye movements may relate to the spatial location of initially encoded memories, activating alongside memory retrieval (e.g., Mast and Kosslyn, 2002; Spivey and Geng, 2001). Second, they might reflect shifts in attention within mental imagery, similar to how eyes move in response to external stimuli (e.g., van Ede et al., 2019a). However, distinguishing a primary mechanism is complex, as studies often encapsulate both elements: memory-associated eye movements and attentional shifts. Rather than being contradictory, the two theories likely differ mainly in the experimental designs that support them. In tasks where participants recall a stimulus, eye movements align with the spatial aspects of the remembered content. Conversely, when tasks involve internal navigation or reaction to mental representations, eye movements indicate shifts in attention. Essentially, even when recalling, eye movements can be seen as tracking internal attention shifts, corresponding to the spatial characteristics of the remembered stimulus (Wynn et al., 2019).

The role of workload
Workload has been discussed as one factor that explains why we move our eyes during internally directed cognition. As we delve into the literature that follows, it is essential to keep in mind the diverse ways in which workload has been conceptualized and measured. In our study, "workload" is defined as the mental load imposed on an individual, manifested through the quantity of (visual) items to be retained or manipulated in memory (e.g., Draschkow et al., 2022), or the complexity of the stimuli involved which increases the mental effort to maintain and reconstruct information (such as the number of potential positions for an item in the current study; e.g., Korda et al., 2023; or the imaginability of a word; e.g., Kumcu and Thompson, 2020). Such a conceptualization of workload does not alter the inherent quality of the task but instead focuses on the cognitive demands it imposes on participants. It is critical to distinguish this from the related concept of "difficulty." While increasing workload typically increases the difficulty of a task, task difficulty may also be determined by other factors unrelated to workload. These could include the perceptual difficulty of stimuli (e.g., clear versus blurry images), time pressure, or distractions. Therefore, our definition emphasizes the cognitive aspects of workload, setting it apart from other dimensions of task difficulty.
It seems quite intuitive that eye movements during internally directed cognition might be a way to "offload" some of the workload by making eye movements or linking memory to spatial locations (see "cognitive offloading" theory; Risko and Gilbert, 2016). When there is spatial reference available, one can utilize this information as a scaffold for the mental process and thereby "offload" some of the workload. For example, when planning a route one can track the current position on a visible map with eye movements. This raises the question: Are the coupled eye movements themselves a functional part of the offloading or only a byproduct of moving attention along visible information? Hence, when no spatial reference is provided and a task needs to be performed solely in mind, does internal coupling of saccades continue to function as a form of offloading? If internally coupled saccades are strategically employed to alleviate workload, higher workload should imply increased coupling. For instance, Wynn et al. (2019) theorize that the phenomenon of gaze reinstatement (or "looking-at-nothing") correlates with workload, suggesting that an increase in task demands or a decrease in memory capacity leads to more frequent gaze reinstatement.
There is support for this notion in the literature: For example, a study on the looking-at-nothing effect by Kumcu and Thompson (2020) found that people look more often at the now empty position when trying to remember a difficult word (hard to imagine, less concrete, e.g., "idea") compared to an easy word (easy to imagine, concrete, e.g., "baby"). Similarly, Wynn et al. (2018) showed that under low mnemonic demands, in the form of brief delays between encoding and recall (recognizing whether a now presented stimulus is old or new), gaze reinstatement was used predominantly by younger adults with lower performance. However, with higher mnemonic demands, in the form of longer delays, gaze reinstatement was associated with improved performance. Also, under restricted gaze (the retro cue is presented visually at central fixation instead of auditorily and therefore restricts gaze), people shift their gaze more in the direction of the to-be-remembered item when workload is higher in the form of higher memory load (2 vs. 4 items during encoding; Draschkow et al., 2022; Liu et al., 2023).
Furthermore, some studies suggested associations between internal coupling and workload on a between-subject level, defining workload as the relationship between task demands and available working memory capacity. Johansson et al. (2011) found that individuals with lower spatial imagery skills exhibit increased gaze reinstatement during scene recall. Similarly, Wynn et al. (2018) reported a positive correlation between gaze reinstatement and performance in older adults but different effects for younger adults, suggesting the use of gaze reinstatement to compensate for lower capacity. In contrast, Gurtner et al. (2021) observed that higher task workload relative to working memory capacity correlates with weaker gaze coupling, indicated by a slight positive relationship between n-back task performance and the alignment of gaze during perception and imagination.
Besides these findings, research into the impact of workload on internal coupling is scarce and shows inconsistent results. Variations across studies in the materials to be recalled, task designs, eye movement measurements, and the definitions of workload likely contribute to these inconsistencies. Consequently, the specific role of internal coupling in cognitive offloading requires further exploration.

Availability of spatial reference
Another important factor that is oftentimes not directly addressed, but varies between studies, is the availability of a spatial frame of reference during the internal activity (like memory retrieval or mental imagery). In some of the studies, the screen was empty (e.g., Johansson et al., 2006) while in others there was a fixation cross (e.g., Liu et al., 2022; van Ede et al., 2019b) or even placeholders for the previously visible stimuli on the screen (Gresch et al., 2023).
Performing a task solely through mental representation versus having a spatial frame of reference visually available could impact the coupling of eye movements. Spivey and Geng (2001) presented objects in a grid during memory encoding and presented an empty grid, an outline of the grid, or nothing on the screen during recall. They found that people made more eye movements to the previous position of an object the more detailed the spatial frame of reference was during recall (grid > outline > nothing). Bourlon et al. (2011) asked participants to decide whether a city is left or right of Paris. They found no differences in eye behavior between having an empty map of France vs. no visual information. The blank map of France could serve as a cue for recalling the locations of cities, yet it may not offer a sufficiently detailed spatial frame of reference to significantly affect coupled eye movements.
Why would the availability of spatial reference affect internal coupling? When the screen is blank, all relevant information needs to be produced internally in the form of a mental representation. Mental representations are sparse and do not match the details and clarity achieved by perception (Koenig-Robert and Pearson, 2021), making it harder for the oculomotor system to plan and execute eye movements and making these movements more prone to internal interference and internal noise (LaBerge et al., 2018; Nuthmann et al., 2016). In line with that, Liu et al. (2022) showed that not all shifts of attention within mental representations are accompanied by shifts in gaze position because, although the oculomotor system is activated, it does not always reach the threshold for outputting a saccade. Consequently, shifts of attention within mental representations in the absence of visual input could be performed more often covertly (without accompanying eye movements) than shifts of attention with a visual spatial reference. Several studies showed that covert shifts of attention can have beneficial effects similar to eye movements (= overt shifts of attention) on retrieving information associated with spatial locations (Godijn and Theeuwes, 2012; Scholz et al., 2018).
Having a spatial reference reduces the workload by offloading part of the internal task onto the external world (Risko and Gilbert, 2016) and allows using the spatial reference as a scaffold for the internal operations (Loaiza and Souza, 2022). Further, the availability of a spatial reference (even just a fixation cross) can serve as some sort of visual anchor for the oculomotor system and therefore makes it easier to plan saccades, which increases the chances of reaching the threshold for executing saccades (LaBerge et al., 2018). To summarize, the availability of spatial reference has a high potential for influencing the number of eye movements triggered during internally directed cognition. However, research specifically exploring how the availability of spatial reference affects eye movement coupling remains limited.

Characteristics of internally coupled eye movements
While most work so far has focused on the frequency of internal coupling, the latency, precision, vulnerability to noise, and vulnerability to biases of internally coupled eye movements also provide valuable insights into the underlying mechanisms of internal coupling. Let us begin with latency of internally coupled eye movements. Memory-guided eye movements were shown to take longer (higher saccade latency) than visually-guided eye movements (e.g., Alexopoulou et al., 2023; Nuthmann et al., 2016). Potentially, retrieving and generating a mental representation takes longer than forming a percept from visual input, delaying eye movements. Further, the lower quality of mental representations compared to perception (Koenig-Robert and Pearson, 2021) could lead to longer times required for planning eye movements.
Let us continue with precision and vulnerability to noise, namely how closely eye movements match with the distance (amplitude and variation in amplitude of eye movements). In fact, studies on gaze reinstatement and looking-at-nothing never find a perfect match between gaze patterns during encoding and retrieval (e.g., Johansson et al., 2006; Spivey and Geng, 2001; Wynn et al., 2022). Some studies showed that eye movements during memory retrieval or saccades to remembered positions tend to over- or underestimate the distances. For example, Johansson et al. (2006) found that during memory retrieval the direction of saccades is often correct (local correspondence), but the magnitude is off (global correspondence; landing position). Similarly, sometimes a shrinkage of eye movements on mental images is reported (Wang et al., 2020); and another study showed that for short distances (e.g., 1.2 dva), saccades to remembered positions tend to slightly overshoot while for longer distances (e.g., 10 dva), they tend to undershoot (Nuthmann et al., 2016). Further, imagined smooth pursuit tends to under- or overestimate amplitude, depending on whether eyes are open or closed (De'Sperati, 2003; Lenox et al., 1970). These (systematic) over- and undershoots of saccade amplitudes support the view that mental representations are no exact copy of a visual percept (Koenig-Robert and Pearson, 2021).
Interestingly, in a recent study on gaze shifts during memory retrieval under maintained fixation, Liu et al. (2023) showed that when the direction is a unique feature of each item (one item per direction) then the distance to the fixation cross is not reflected in the amplitude of gaze shifts. However, when the direction is not unique (two items per direction) then gaze shifts also reflect the distance (longer amplitudes for the far item compared to the near item).
Besides over- and undershooting of saccade amplitude, memory-guided saccades also appear more variable and vulnerable to noise than vision-guided saccades (e.g., Nuthmann et al., 2016; Wang et al., 2020). This vulnerability to noise could be the result of the lower detail and vividness of internal representations compared to visual percepts (Koenig-Robert and Pearson, 2021), making it harder for the oculomotor system to plan precise eye movements and more prone to random neuronal noise and external visual noise (LaBerge et al., 2018; Nuthmann et al., 2016).

The present study
The present study's goal was to identify the characteristics and moderators of internally coupled eye movements to further our understanding of the underlying mechanisms of internal coupling. To this end, we systematically investigated eye movements during the performance of a visuospatial working memory task (visuospatial WM task) under free viewing conditions. Specifically, participants were asked to mentally move a black patch along a path within a matrix, with each trial involving one step along this path (presented via speakers: up, down, left, or right), a task similar to the one used by Kerr (1993). We were interested in whether the eyes would follow the directions of the steps and, if so, in examining the conditions and characteristics of those coupled saccades. We considered a saccade a "coupled saccade" if the saccade's angular direction was in the principal direction of the step.
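The coupled-saccade criterion can be illustrated with a minimal sketch. It assumes that each cardinal direction is assigned a 90° sector (± 45° around the cue direction) and that gaze samples are in screen coordinates with y increasing downward; neither detail is specified above, and the function names are hypothetical.

```python
import math

# Assumption: each cue direction owns a 90-degree sector (+/- 45 degrees).
# The text only states that the saccade must point in the "principal
# direction" of the step, so the sector width is illustrative.
SECTOR_CENTERS = {"right": 0.0, "up": 90.0, "left": 180.0, "down": 270.0}


def saccade_angle(x0, y0, x1, y1):
    """Saccade direction in degrees (0 = right, 90 = up).

    Assumes screen coordinates in which y grows downward, hence the sign flip.
    """
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0)) % 360.0


def is_coupled(x0, y0, x1, y1, cue, half_width=45.0):
    """True if the saccade direction lies within half_width degrees of the cue."""
    diff = abs(saccade_angle(x0, y0, x1, y1) - SECTOR_CENTERS[cue])
    diff = min(diff, 360.0 - diff)  # shortest angular distance on the circle
    return diff < half_width
```

Under these assumptions, a saccade from (100, 100) to (50, 100) would count as coupled for the cue "left" but not for "right".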
Traditional encoding and recall paradigms, frequently used to investigate internal coupling, rely on a sequence where information is presented visually and later retrieved (De Vries and Van Ede, 2023; Mast and Kosslyn, 2002; Spivey and Geng, 2001). In such paradigms, eye movements could be associated with the original position of the encoded memory content and co-activated when retrieving the memory content (e.g., Mast and Kosslyn, 2002; Spivey and Geng, 2001). Our approach, however, shifts focus to how the eyes track mental operations like updating positions, without physical observation. The specific design of mentally moving the patch one step at a time enables us to precisely track if and how closely eyes follow mental operations.
In a first step, we examined whether the visuospatial WM task triggered internal coupling of eye movements, by checking whether auditorily presented directions (up, down, left, right) trigger saccades in the same direction when the matrix is invisible and therefore represented in mind. Based on the literature reviewed above, we expected that the eyes consistently follow the imagined movement through the (invisible) matrix. According to the known horizontal bias (Foulsham and Kingstone, 2010; Tatler and Vincent, 2009), we expected fewer coupled saccades in the vertical (up, down) than horizontal directions (left, right). Then, to clarify whether eye movements couple to imagined movements along the path or also to hearing the direction cues, we included two control conditions: a passive listening and an active listening condition, using the same stimuli as in the visuospatial WM task. If semantic processing can trigger saccades when hearing directions, then we would expect more coupling in the active listening condition than in the passive listening condition and compared to chance. If internal coupling of saccades is (mainly or also) triggered by shifts of attention within mental representations then the visuospatial WM task should show more internal coupling than the passive and active listening conditions.
In our study, we aimed to distinguish effects of workload from the specific influence of spatial reference on saccade coupling. To examine workload's impact on internal coupling, as discussed in previous research (e.g., Johansson et al., 2006; Kumcu and Thompson, 2020; Richardson et al., 2009), we manipulated the complexity of the task by varying matrix sizes (3 × 3 vs. 5 × 5). This variation directly affects the number of possible positions (from 9 in a 3 × 3 matrix to 25 in a 5 × 5 matrix) and the potential paths the patch could take, subsequently increasing the cognitive demand to maintain and process this information. Here, workload pertains to the complexity of the stimulus (specifically, the number of possible positions), which increases the mental effort required for stimulus maintenance. To empirically confirm our manipulation's efficacy, we measured median pupil diameter as an objective indicator of workload. The average pupil diameter, indicative of the overall arousal level, correlates with the effort exerted during a task; a larger diameter signals increased effort (Unsworth and Robison, 2017). If saccade coupling facilitates cognitive offloading, as suggested by prior studies, higher workload, manifested in higher stimulus complexity, should lead to a greater incidence of coupled saccades.
We also investigated how the availability of a spatial reference, achieved by manipulating the visibility of the matrix (visible vs. invisible during the task), affects saccade coupling. While manipulating the availability of spatial reference inherently affects workload by adding more information to maintain in the invisible matrix condition, our study focuses on how spatial reference availability, versus its absence, affects saccade coupling. This contrasts with our workload manipulation, which explores the effects of increased cognitive demand without altering the task's fundamental characteristics, unlike the inclusion of spatial reference, which does. Based on our own pilots and the literature reviewed above, we expect more coupled saccades with a visible compared to invisible matrix and expect that the availability of spatial reference influences saccade amplitude (over- or undershoot of saccades). Further, we explore effects of matrix visibility on saccade latency (time from cue onset to saccade onset) and variability of saccade amplitude.
Moreover, we explore potential interactive effects of workload, availability of spatial reference and direction of the imagined movement on the frequency (number of trials with coupled saccade), latency (time from cue onset to saccade onset), amplitude (distance between starting and ending points of the saccade), and variation in amplitude of coupled saccades.

Method
We provide our materials, data, and analysis scripts on the Open Science Framework (OSF, https://osf.io/pdga8/). We preregistered our methods, especially hypotheses, design and outline of the analyses (https://osf.io/gf7mv), but we made minor deviations from the preregistration whenever we found it necessary; see Table S1 for a list of deviations.

Power analysis
Based on our internal pilots, we expected medium to large effect sizes. We determined the sample size a priori based on our own previous studies which used similar statistical approaches (i.e., within-subject LMM with more than one within-subject factor; Korda et al., 2023, 2024; Walcher et al., 2023) and power analysis using G*Power version 3.1 (Faul et al., 2009). Based on our previous studies, we expected medium-sized within-subject effects of dz = 0.4 (Annerer-Walcher et al., 2021). As a conservative estimation of required sample size, G*Power suggested a sample size of 46 participants to detect an effect of dz = 0.4 at p = .05 with a power of 80% in a repeated measures ANOVA. To account for possible exclusions of participants, we collected data from 50 participants. To check the actual power achieved with our dataset, we performed a post-hoc power analysis with the simr package (Green and MacLeod, 2016). We ran the powerSim function on the following binomial glmer model: coupled saccades ~ visibility * direction + (1 + visibility + direction | subject) + (1 | trial) (see analysis in section 3.4.1) and set the effect of visibility to the expected effect size of d = 0.4 (which equals an estimate of 0.72 in the glmer model) and the number of simulations to 100. The power for detecting the effect of visibility with p < .05 was 95% (CI: 88.72, 98.36) and with p < .01 the power was still 76% (CI: 66.43, 83.98).
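The confidence intervals around the simulated power can be illustrated as follows. This is a sketch under the assumption that they are exact (Clopper-Pearson) binomial intervals over the 100 simulations; the interval method is not stated above, and the helper names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function f that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(successes, n, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion (assumed method)."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound solves P(X >= successes | p) = alpha / 2.
        lower = _bisect_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(successes - 1, n, p))
        )
    if successes == n:
        upper = 1.0
    else:
        # Upper bound solves P(X <= successes | p) = alpha / 2.
        upper = _bisect_decreasing(lambda p: binom_cdf(successes, n, p) - alpha / 2)
    return lower, upper


# 95 of 100 simulations significant at p < .05, 76 of 100 at p < .01
print(exact_ci(95, 100))
print(exact_ci(76, 100))
```

If the assumption holds, the two calls should give bounds close to the intervals reported above.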

Participants
Participants were prescreened via online survey (LimeSurvey) regarding the following inclusion criteria (mainly regarding requirements for eye-tracking): age between 18 and 50 years, normal (up to 0.5 diopters) or corrected-to-normal vision (soft contact lenses only), native German speaker, no dyslexia, no problems distinguishing left and right, no neurological or psychological disorders, no eye diseases, no previous eye surgery affecting vision, no active medication affecting eyesight or driving abilities. Participants fulfilling the inclusion criteria were then able to book a lab session for the main experiment. We stopped recruiting when we reached our goal of having data from 50 participants. All participants gave written informed consent and were paid 10 € per hour or received partial course credit for participating in the lab session. The study protocol was approved by the local ethics committee. Lab sessions took place between February and March 2023.

Design
The paradigm of our study had four main experimental factors: (1) the task participants were asked to perform (Task: visuospatial WM, active listening, passive listening), (2) the visibility of the matrix during the trial (Matrix Visibility: invisible, visible), (3) the matrix size (Matrix Size: 3 × 3, 5 × 5), and (4) the direction cue (Direction: up, down, left, right). These four factors allowed us to assess how internal coupling is affected by the type of internal processing (imagined movement/mental updating vs. active listening vs. passive listening), the availability of spatial reference (invisible vs. visible matrix), the level of internal workload (low vs. high), and the imagined movement (up, down, left, right), respectively. The main dependent variable of interest was the internal coupling of saccades, which was measured as the proportion of trials that show a saccade made in the direction of the direction cue (for more details see section 2.5 Data Preprocessing). To investigate the characteristics of those internally coupled saccades, further dependent variables were the latency, amplitude, and variation in amplitude of coupled saccades.
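As an illustration of the main dependent variable, the proportion of trials with a coupled saccade can be computed per design cell as sketched below. The record layout (keys "visibility", "size", "coupled") and the toy data are hypothetical and do not reflect the format of the published dataset.

```python
from collections import defaultdict


def coupling_proportion(trials):
    """Proportion of trials with a coupled saccade per (visibility, size) cell.

    Each trial is a dict with hypothetical keys "visibility", "size", and
    "coupled" (True if a saccade in the cued direction occurred).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [coupled trials, all trials]
    for trial in trials:
        cell = (trial["visibility"], trial["size"])
        counts[cell][0] += int(trial["coupled"])
        counts[cell][1] += 1
    return {cell: coupled / total for cell, (coupled, total) in counts.items()}


# Made-up toy data, for illustration only
toy_trials = [
    {"visibility": "visible", "size": 3, "coupled": True},
    {"visibility": "visible", "size": 3, "coupled": False},
    {"visibility": "invisible", "size": 3, "coupled": False},
    {"visibility": "invisible", "size": 3, "coupled": False},
]
print(coupling_proportion(toy_trials))
```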

Tasks
The design of the visuospatial WM task is oriented on the task described in Kerr (1993).Each block started with the information on the task to be performed in this block (visuospatial WM, active listening, or passive listening).The schematic sequence of a task block is depicted in Fig. 1.
In the visuospatial WM task, a starting stimulus was presented, which was a matrix (3 × 3 or 5 × 5) with one filled patch.The matrix had "transparent" fields and black borders, each field was 70 × 70 pixels in size (approx.1.29 degrees of visual angle), the filled patch was black.Participants continued by pressing the spacebar.In half of the blocks, the matrix was then replaced by an empty matrix of the same size (visible matrix condition), while in the other half, the matrix disappeared and the screen was empty during the block.After 1 s, the first trial started.Each trial started with the auditory presentation of a direction cue (up, down, left, right; duration 0.6 s) and participants were instructed to mentally move the patch in the cued direction by one field.Time between direction cues was 1 s (= duration of a trial).Order of directions within a block was produced pseudorandomized in advance to make sure that the patch did not cross the limits of the matrix, the patch's path included many different areas within the matrix (e.g., the patch position did not remain in one corner of the matrix), the paths differed between blocks, there were not multiple repetitions of position (e.g., up, down, up, down), and each direction was included equally often across conditions.
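The pseudorandomization of direction sequences can be sketched as follows. This is an illustrative simplification, not the procedure used in the study: it enforces only two of the constraints listed above (the patch stays inside the matrix; no back-and-forth oscillation such as up, down, up), and the function and parameter names are hypothetical. Balancing direction frequencies and spreading the path across the matrix would need additional checks.

```python
import random

# (row, column) offsets; row 0 is the top row of the matrix
STEPS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def make_path(size, start, n_steps=10, rng=None):
    """Draw one block of direction cues for a size x size matrix.

    Returns the list of cues and the final (row, column) of the patch.
    """
    rng = rng or random.Random()
    row, col = start
    path = []
    for _ in range(n_steps):
        options = []
        for name, (d_row, d_col) in STEPS.items():
            new_row, new_col = row + d_row, col + d_col
            inside = 0 <= new_row < size and 0 <= new_col < size
            # Reject a third step that would complete a pattern like up, down, up
            oscillates = (
                len(path) >= 2
                and path[-2] == name
                and path[-1] == OPPOSITE[name]
            )
            if inside and not oscillates:
                options.append(name)
        step = rng.choice(options)
        path.append(step)
        row, col = row + STEPS[step][0], col + STEPS[step][1]
    return path, (row, col)


cues, final_position = make_path(size=3, start=(1, 1), rng=random.Random(1))
```

The returned final position corresponds to the field a participant would have to report at the end of a block.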
Through extensive piloting, we calibrated the matrix sizes to establish two conditions that significantly differed in pupil diameter in both the visible and invisible matrix condition, indicating differences in workload, while ensuring the tasks were neither too simple to disengage participants nor excessively challenging to preclude completion.Opting for matrices with uneven dimensions (e.g., 3 × 3 over 4 × 4) was strategic to minimize the likelihood of chunking strategies, which are more prevalent with even-sized matrices.Larger matrices were deemed impractical due to significantly reduced performance levels in the invisible matrix condition (e.g., 6 × 5 matrix already yielded performance below 50%, with an average of 30%), rendering the data unusable for analysis.Uniform matrix sizes were maintained across both visible and invisible condition to avoid influencing eye movement patterns-recognizing that disparities in matrix size could potentially affect saccade dynamics, such as saccades further from the center.
At the end of a block (= after 10 trials), participants performance was checked: an empty matrix appeared on the screen and participants had to report the last position of the filled patch by clicking on the respective field in the matrix and pressing the spacebar to continue.Participants were instructed that in case they lost track of the position of the filled patch during the block, they should continue with the last remembered position to ensure continuous task performance.
In the passive listening task, a block proceeded similarly as in the visuospatial WM task except that the starting screen showed an empty matrix (either 3 × 3 or 5 × 5, but no black patch) and participants were instructed to not perform any task at all and leave the performance check (empty matrix) at the end of the block empty.
Note. Each block started with the information of the block's task (visuospatial WM, active listening, passive listening; not depicted here). Then the block's starting stimulus was presented, which was a 3 × 3 or 5 × 5 matrix with one patch filled (visuospatial WM task) or an empty matrix (active and passive listening task). After 1 s, the trials started. In half of the blocks, the empty matrix remained on the screen (visible condition), in the other half the screen was empty (invisible condition). Each trial within the block started with the 0.6 s presentation of the direction cue (left, right, up, down) via speakers and lasted for 1 s, before the next trial started. In the active listening task, some direction cues were replaced by numbers and participants were instructed to press the spacebar as fast as possible when hearing one of these numbers. In the passive listening task, the visual and auditory setup was the same as in the visuospatial WM task, but participants had no task to perform. Each block comprised 10 trials. At the end of the block, an empty matrix appeared and participants were prompted to report the last position of the patch (visuospatial WM task) or not (active and passive listening task) and continue by spacebar press. Matrix in figure is not to scale.
In the active listening task, a block started with an empty matrix (3 × 3 or 5 × 5) and in 8% of the trials the direction was replaced by a number (one, two, three, or four). Participants were instructed to press the spacebar as fast as possible when a number was presented. Due to a technical error, oddballs (numbers) were only presented in the 3 × 3 matrix size condition. Since participants were instructed to attend to the direction cues only, we expect that they ignored the matrix size, and we do not think this error affected the data.
The visuospatial WM task comprised 10 blocks (= 100 trials) per combination of matrix visibility (visible, invisible) and matrix size (3 × 3, 5 × 5), resulting in a total of 40 blocks (400 trials). The passive and active listening tasks each had 5 blocks (= 50 trials) per combination of visibility and matrix size, resulting in a total of 20 blocks (200 trials) each. The order of blocks (task × matrix visibility × matrix size) was randomized.

Procedure
After arrival in the lab, participants gave written informed consent, filled out a questionnaire on their current physical and mental state (i.e., coffee, alcohol consumption, sleep, vigilance), and we verified their eyesight with the Landolt vision test (Wesemann, 2002). Participants read written instructions of the visuospatial WM task with all conditions and performed practice blocks (6 blocks in total). When participants gave correct responses at the end of each practice block, they continued with the main blocks; otherwise, the practice blocks were repeated. Performing the paradigm took about 45 min. Since the study was obviously an eye-tracking study (eye-tracker on table, calibration etc.), we told participants that we were interested in their pupil diameter (instead of eye movements), so that they would not focus on or try to change their eye movements. We clarified the real measure at the end of the session. After the paradigm, participants filled out a short questionnaire regarding the paradigm. We asked participants to report how demanding they perceived each condition on a scale ranging from 0 (not demanding at all) to 6 (very demanding). Further, we asked participants how well they were able to imagine the movement of the patch in the visuospatial WM task in each condition (perceived imagination) on a scale from 0 (not at all) to 6 (very clear). Finally, we asked them which strategy they used to perform the visuospatial WM task by asking them to select one or more options: imagining the matrix, verbalizing, naming rows and columns, using fingers, chunking, and another strategy not stated (with option to describe this strategy). After the paradigm, participants performed cognitive tests and filled out further questionnaires as part of another study. The total lab session took 2.5 h.

Apparatus
Participants were seated in a sound-attenuated room with lights on (illuminance at participants' eyes was 29.55 lx), in front of a 24-in. ASUS VG245qe monitor (1920 × 1080 pixels, ca. 33.52° × 19.73°, 60 Hz; monitor settings: 40% brightness, 20% contrast). Binocular eye behavior was tracked with an EyeLink 1000 Plus system (SR Research Ltd.) at 1000 Hz. The head was stabilized with a chin rest at 88 cm distance to the screen. The eye tracker was placed 59 cm in front of the chin rest. A 9-point calibration and validation were performed at the beginning of the paradigm with error thresholds of average gaze at 0.5° and maximum per position at 1°. Drift checks were conducted at the beginning of each block.
All text was black (0.37 dva high, font "Arial") and the background was grey (RGB 128, 128, 128). Auditory direction cues were presented through a Logitech PC speaker Z 200 with computer volume at 100% and speaker volume at medium. Audio files were created using https://wideo.co/text-to-speech/ with voice "[de-DE] Lisa Fischer-S", converted to Ogg files for compatibility with PsychoPy, and edited to an equal length of 600 ms. The experimental script was generated in PsychoPy (Version 2020.2.10; Peirce et al., 2019).
Pupil diameter was transformed from arbitrary units to millimeters, smoothed with an average filter (n = 20), and interpolated using the gazer package (Geller et al., 2020). We excluded pupil samples with missing data for one eye, abnormal eye vergence, fixation disparity outliers, abnormal pupil values (outside natural limits of 2-8 mm), pupil outliers (> 3 SD from the individual's mean), and samples recorded during saccades, resulting in an average rate of excluded samples per participant of 2%. We had preregistered that trials or blocks with <50% valid pupil data would be excluded from further analyses (excluded trials and blocks: 0). For the analysis of pupil diameter, we calculated the median pupil diameter per trial.
Regarding gaze position, we excluded samples with data from only one eye, within blinks (detected blink periods were extended by 100 ms backward and forward to account for possible data distortions due to partial lid closure), with abnormal or outlying fixation disparity, and with abnormal or outlying pupil diameter, resulting in an average rate of excluded samples per participant of 8.52%. We excluded trials and blocks with >50% missing gaze position data (1.58% of all trials and 0.45% of all blocks).
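Two of the pupil exclusion rules and the per-trial aggregation can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the preprocessing pipeline of the study (which used the gazer package in R): the function names are hypothetical, the participant mean and SD are assumed to be computed after removing values outside 2-8 mm, and the vergence, disparity, and saccade criteria are omitted.

```python
from statistics import mean, median, stdev


def participant_stats(samples_mm, lower=2.0, upper=8.0):
    """Mean and SD of one participant's physiologically plausible samples.

    Assumption: the SD criterion is computed after the 2-8 mm range check.
    """
    in_range = [s for s in samples_mm if s is not None and lower <= s <= upper]
    return mean(in_range), stdev(in_range)


def is_valid(sample_mm, m, sd, lower=2.0, upper=8.0, sd_cutoff=3.0):
    """Keep a sample if it is present, within 2-8 mm, and within 3 SD of m."""
    if sample_mm is None or not (lower <= sample_mm <= upper):
        return False
    return abs(sample_mm - m) <= sd_cutoff * sd


def trial_median(trial_samples_mm, m, sd, min_valid=0.5):
    """Median pupil diameter of a trial, or None if <50% of samples are valid."""
    valid = [s for s in trial_samples_mm if is_valid(s, m, sd)]
    if not trial_samples_mm or len(valid) / len(trial_samples_mm) < min_valid:
        return None
    return median(valid)
```

A trial for which trial_median returns None would correspond to a trial excluded under the preregistered 50% criterion.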
Gaze position samples were categorized as saccade if velocity was larger than 30°/s or acceleration was larger than 8000°/s² and if the duration was >6 ms. To determine whether a trial elicited a coupled saccade, we selected the first saccade that was made at least 80 ms from trial onset (the threshold of 80 ms was chosen to exclude anticipatory saccades that are too early to be the result of hearing the audio; e.g., Walker et al., 2000). Next, we determined the saccade orientation and categorized it into one of four principal directions (up, down, left, right), each encompassing a 90° angle (e.g., right = 45° to 135°). Saccades that were made in the same direction as the audio cue of the trial were then labelled as coupled. Saccade latency was calculated as the time between onset of a trial (onset of audio direction) and onset of the saccade.
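The labelling of coupled saccades described above can be sketched in a few lines of Python. The sketch assumes an angle convention in which 0° points up and angles increase clockwise, which is inferred from the example "right = 45° to 135°"; it further assumes screen coordinates with y increasing downward, half-open sector boundaries, and a hypothetical input format of (onset in ms, dx, dy) tuples. None of these details are specified in the text.

```python
import math


def saccade_direction(dx, dy):
    """Classify a saccade vector into up, right, down, or left.

    dx, dy: end point minus start point in screen coordinates
    (assumption: y grows downward, so an upward saccade has dy < 0).
    """
    # 0 deg = up, 90 deg = right, 180 deg = down, 270 deg = left.
    angle = math.degrees(math.atan2(dx, -dy)) % 360.0
    if angle < 45 or angle >= 315:
        return "up"
    if angle < 135:
        return "right"
    if angle < 225:
        return "down"
    return "left"


def first_coupled(saccades, cue, min_latency_ms=80):
    """Return (is_coupled, latency_ms) for the first eligible saccade.

    saccades: list of (onset_ms, dx, dy), onset relative to cue onset.
    Saccades starting earlier than 80 ms are skipped as anticipatory.
    """
    for onset_ms, dx, dy in sorted(saccades):
        if onset_ms >= min_latency_ms:
            return saccade_direction(dx, dy) == cue, onset_ms
    return False, None
```

For a trial with the cue "left", an early saccade at 40 ms would be ignored and a leftward saccade at 250 ms would mark the trial as coupled with a latency of 250 ms.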

Analysis strategy
The main analysis strategy was to predict which trials (i.e., Task, Matrix Visibility, Matrix Size, Direction) triggered a coupled saccade and which did not (binomial data). The trial-level eye behavior data were analyzed with generalized and linear mixed models using the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) packages, respectively. Besides the fixed effects of interest representing the main experimental factors (Task, Matrix Visibility, Matrix Size, Direction), we entered time of block to control for time-on-task effects. As random structure, we included random intercepts for participants and trials, and random slopes per participant for the fixed effects of interest. For example, a model predicting coupled saccades with the fixed effects visibility and direction would have the following structure: coupled saccades ~ visibility * direction + (1 + visibility * direction | subject) + (1 | trial). The final random structure was determined based on the correlation matrix and PCA of the random structure (rePCA function from the lme4 package; Bates et al., 2015). If the model did not converge, we removed fixed effects that had no significant effect and simplified the random structure by first removing interactions, then random slopes (starting with the one with the smallest SD), but always keeping the random intercepts (Matuschek et al., 2017). Fixed and random effects of each model are provided in tables in the supplemental material.
Global fixed effects for each model were tested with a Type III ANOVA using the function implemented in the lmerTest package. The emmeans package (Lenth et al., 2022) was used to calculate follow-up planned pairwise comparisons. An approximation to Cohen's d effect size was computed with the eff_size function from emmeans. Descriptive statistics (also in plots) were calculated with the summarySEwithin function from the Rmisc package (Hope, 2013), which adjusts confidence intervals for within-subject designs with the method from Morey (2008).
For the analysis of task performance and subjective ratings, we calculated repeated-measures ANOVAs and repeated-measures pairwise t-tests for planned pairwise comparisons. In all analyses, we considered effects with p < .05 as significant.

Results
In a first step, we examined how the experimental factors (task, visibility, matrix size) affected task performance, subjective experience, and pupil diameter. In the second step, we determined whether internal coupling occurred in the current study. In the third step, we explored the characteristics and moderators of internal coupling. Since this study comprises many effects of interest, below we mention only the most important statistical parameters. Detailed statistics are provided in the supplemental materials, and the full analysis output, including every descriptive statistic and statistical parameter, can be found here: https://osf.io/pdga8/.
Participants used different strategies to perform the visuospatial WM task: 96% indicated that they imagined the matrix, 60% (also) used verbalization, 34% (also) named rows and columns, 10% (also) used their fingers, 2% (also) used chunking, and 12% (also) used another strategy not stated.
We then examined performance in the active listening task. Participants on average caught 15.82 of 16 oddballs (SD = 0.52) with an average reaction time of 634 ms (SD = 9), missed 0.08 oddballs (SD = 0.34), and produced 0.20 false alarms (SD = 0.54). The active listening task was rated as fairly undemanding (M = 1.24, on a scale from 0 not demanding to 6 very demanding), comparable to the visible 3 × 3 visuospatial WM task.
Finally, we analyzed median pupil diameter as an objective measure of workload (see Table S2 for descriptive statistics). We ran an LMM predicting pupil diameter with the fixed effects task (passive listening, active listening, visuospatial working memory), matrix visibility (invisible, visible), matrix size (3 × 3, 5 × 5), their interactions, and time of block. The final random structure included random intercepts for trial and subject, and random slopes per subject for matrix visibility, matrix size, and the interaction of matrix visibility and matrix size (see Table S3).
In the visuospatial WM task, pupil diameter was smaller when the matrix was visible compared to invisible, and this effect was especially pronounced for the 5 × 5 matrix (3 × 3: t 71.7 = 7.65, p < .001, d = 0.21; 5 × 5: t 67.6 = 14.64, p < .001, d = 0.43). There was also a small effect of matrix visibility in the active listening task with the 3 × 3 matrix (t = 2.67, p = .008, d = 0.06), but not with the 5 × 5 matrix or in the passive listening task (t's < 1.58, p's > 0.116, d's < 0.04). The interaction of Task and Visibility and follow-up comparisons further showed that the effect of matrix visibility was larger in the visuospatial WM task than in the active or passive listening task (see Tables S2, S3, S4 and Fig. S1). Hence, the effect of matrix visibility on pupil diameter in the visuospatial WM task was not merely an artifact of visual stimulation.
Next, we inspected effects of matrix size on pupil diameter in the visuospatial WM task. While the 5 × 5 matrix yielded larger pupil diameter than the 3 × 3 matrix when the matrix was invisible, there were no matrix size effects on pupil diameter when the matrix was visible (visible: t 83.8 = 0.55, p = .586, d = 0.01; invisible: t 84.9 = 10.27, p < .001, d = 0.23; see Fig. 2D). Please note that in our study, participants were free to move their eyes. Changes of eye position lead to pupil foreshortening, namely a decrease in measured pupil diameter (Hayes and Petrov, 2016). More foreshortening could be expected for the 5 × 5 compared to the 3 × 3 matrix, and therefore some underestimation of the effects of matrix size on pupil diameter.
The passive listening task was accompanied by the smallest pupil diameter compared to all other conditions (t 49 ≥ 5.95, p < .001, d ≥ 0.21). The active listening task showed a similar pupil diameter to the visuospatial WM task with visible matrix (t 49 ≤ 0.74, p ≥ .465, d ≤ 0.03), and hence a smaller pupil diameter than the visuospatial WM task with invisible matrix (t 49 ≥ 6.67, p < .001, d ≥ 0.15).
To summarize, having an empty matrix available made the visuospatial WM task easier, both objectively (performance and pupil diameter) and subjectively (perceived demand). In the invisible matrix condition, the 5 × 5 matrix was perceived as more demanding and harder to imagine, and led to far fewer correct responses and larger pupil diameter than the 3 × 3 matrix. In the visible matrix condition, the 5 × 5 matrix was also perceived as more demanding and harder to imagine than the 3 × 3 matrix, but performance and pupil diameter were unaffected by matrix size.

Visualization of eye behavior
Visualization of gaze position (Fig. 3A) and change in gaze position (Fig. 3B) provided an overview of raw eye behavior across all conditions. It suggested that, at least in the visuospatial WM task, gaze positions reflected the matrix fields and changes in gaze position reflected the auditory directions in specific conditions. Fig. 4A illustrates how many trials contained a saccade and shows that saccadic activity in the visuospatial WM task was not only higher overall, but also systematic, as seen in more coupled than uncoupled saccades. These observations are subjected to statistical testing in the following sections.

Task effects on internal coupling
First, we need to assess whether the visuospatial WM task and the passive and active listening control tasks elicited coupled saccades (a saccade in the direction of the audio cue). In Fig. 4A and Table 1 we already see differences in the rate of coupling (proportion of trials with coupling), with coupling of saccades in around 50% of trials in the visuospatial WM task but in only around 10% of trials in the passive and active listening task. To test these differences statistically, we calculated a model with coupling of saccades (was there a coupled saccade in this trial or not) as binomial outcome, with Task (passive listening, active listening, visuospatial WM), Matrix Visibility (invisible, visible), and their interaction as fixed effects, and with the final random structure of random intercepts for trial and subject and random slopes per subject for Matrix Visibility and Task (see Table S5 for fixed and random effects and Table 2 for pairwise comparisons). Time of block was excluded to achieve convergence. We found a significant effect of Task (χ 2 2 = 81.25,p  2. The passive and active listening tasks triggered only a few coupled saccades (see Fig. 4A) and did not differ in their likelihood of triggering a coupled saccade (odds ratio: 0.78-0.85, z ≤ 1.56, p ≥ .352, d ≤ 0.15). The proportion of coupling was much smaller in the passive and active listening task than in the visuospatial WM task. Next (not preregistered), we tested whether there was still a small bias towards the direction of the audio in the saccades made during the passive and active listening task (see Fig. 4A). Due to the low number of saccades in the two control tasks, we aggregated data to participant level: we calculated the proportion of saccades in the correct direction (correct saccades/total saccades) and in any other direction (uncoupled saccades, see Fig. 4A; incorrect saccades/total saccades/3, divided by 3 to account for the three possible other directions besides the correct direction). If there is no bias, the proportion of saccades in the correct direction (direction of audio cue) and in any other direction (divided by 3) should be equal (close to 0.25). We calculated a within-subject ANOVA with proportion of saccades as dependent variable, and Task (passive listening, active listening, visuospatial WM), Saccade direction (correct, incorrect), and their interaction as predictors. Main effects and interactions were significant (F 1-2,49-98 ≥ 442, p < .001), and pairwise comparisons again showed a strong bias for coupled saccades in the visuospatial WM task (t 49 = 34.29, p < .001, d = 1.22). There was a small bias for coupled saccades in the passive listening task (t 49 = 2.93, p = .005, d = 0.10) but not in the active listening task (t 49 = 0.42, p = .674, d = 0.02). See Table S8 for descriptive statistics and pairwise comparisons.
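The participant-level bias measure described above amounts to a simple calculation, sketched here in Python. The function name and the handling of participants without any saccades are assumptions made for illustration only.

```python
def direction_bias(n_correct, n_total):
    """Proportion of saccades in the cued direction vs. per other direction.

    Returns (p_correct, p_other), where p_other is the proportion of
    saccades in the three non-cued directions divided by 3, or None if
    the participant made no saccades in this task.
    """
    if n_total == 0:
        return None
    p_correct = n_correct / n_total
    p_other = (n_total - n_correct) / n_total / 3
    return p_correct, p_other


# Without any bias, a quarter of all saccades falls in each direction:
# 10 of 40 saccades in the cued direction gives 0.25 for both values.
no_bias = direction_bias(10, 40)
```

A value of p_correct clearly above p_other indicates a bias towards the cued direction; equal values of 0.25 correspond to the no-bias case stated in the text.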
To summarize, we found internal coupling in the visuospatial WM task. This coupling of saccades was not merely triggered by hearing the directions (passive listening) or by semantic processing (active listening) but mainly by performing the visuospatial WM task in mind.

Moderators of internal coupling: matrix visibility, matrix size, direction of movement
After establishing that internal coupling occurs during visuospatial WM (but not during passive or active listening to direction cues), we continue to the next question: What are the characteristics and moderators of this internal coupling? Below, we test the potential moderating roles of matrix visibility, matrix size, and direction on the frequency (number of trials with coupled saccades), latency (time from cue onset to saccade onset), and amplitude (distance between starting and ending points of the saccade) of coupled saccades during the visuospatial WM task. All analyses below refer to data from the visuospatial WM task.

Frequency of coupled saccades
How often are coupled saccades triggered, and how is this influenced by matrix visibility, matrix size, and direction of the imagined movement (Fig. 4B)? We attempted to calculate a Generalized Linear Mixed Model (GLMM) to predict the occurrence of coupled saccades (does a trial have a coupled saccade or not, binomial data) based on matrix visibility, matrix size, audio direction, and their interactions, with time of block as an additional fixed factor. Unfortunately, despite reducing the random structure, the model failed to converge until the matrix size predictor was excluded, with minimal changes to the estimates of the remaining fixed effects. To still obtain an estimate of the effect of matrix size from a converging model, we ran a linear model on condition-level data (with the proportion of trials per condition) instead of trial-level data. This model converged and showed neither a main effect nor interaction effects of matrix size (t ≤ 1.93, p ≥ .055; Table S6), except for a small 3-way interaction of visibility * matrix size * audio direction (right vs. up) (t = 2.26, p = .024); see also Fig. 4B and Table 1. Therefore, we continued without the factor matrix size and returned to trial-level data analysis.
We calculated a model (GLMM again) with matrix visibility, direction cue, and their interaction, and time of block as fixed effects. Time of block was then excluded to achieve convergence. The final random structure comprised random intercepts for trials and subjects and random slopes per subject for matrix visibility and direction (fixed and random effects in Table S7, pairwise comparisons in Table 3, means and confidence intervals in Fig. 5A and Table 1). We found a main effect of visibility (χ 2 1,23 = 61.6, p < .001): across direction cues, there were 3.57 to 12.50 times more trials with coupled saccades when the matrix was visible (means: 0.56-0.78) compared to invisible (means: 0.17-0.50). We found a main effect of direction cue (χ 2 3,21 = 113.0, p < .001) and an interaction of visibility and direction cue (χ 2 3,21 = 74.0, p < .001): in both the invisible and visible matrix, the horizontal directions (left, right) triggered coupled saccades 1.49 to 9.09 times more often than the vertical directions (up, down; Table 3, see also gaze direction in Fig. 4). This effect of horizontal vs. vertical directions was less pronounced in the visible than in the invisible matrix condition (see Fig. 3 and 5B). Additionally, the down direction triggered 1.66 and 2.39 times fewer coupled saccades than the up direction in the visible and invisible matrix, respectively (z 49 ≥ 3.50, p < .003, d ≥ 0.27).
To summarize, saccades couple more often to the imagined movement of the patch when a visible matrix is available than when the matrix must be imagined. Matrix size (as operationalization of workload) does not affect coupling of saccades. The general pattern of more saccades in the horizontal than in the vertical direction was also present in the visuospatial WM task (see also Fig. S2 for horizontal bias in the passive and active listening task); however, this horizontal bias was smaller in the visible matrix compared to the invisible matrix condition, suggesting that eye behavior is more prone to general gaze behavior biases when the visuospatial WM task is performed fully in mind without external points of reference.

Timing of coupled saccades
Are there differences in the latency of coupled saccades between the invisible and visible matrix, and do matrix size and direction cue play a role? Saccade latency was calculated as the time between onset of a trial (onset of audio direction) and onset of the saccade. We performed an analysis (LMM) with matrix visibility, matrix size, direction cue, and their interactions, and time of block as fixed effects and the final random structure of random intercepts for trials and subjects and random slopes per subject for matrix visibility and direction. The main effect of and interactions with matrix size were again not significant (ts ≤ 1.94; p ≥ .053) except for two small interaction effects (ts ≤ 2.47; p ≥ .014; see Table S9 for descriptive statistics and Table S10 for fixed and random effects). Hence, below, we report results of the model after exclusion of matrix size as factor (Table S11); see Fig. 5B.
To summarize, in the visuospatial WM task, coupled saccades were faster in the visible than in the invisible matrix condition. Matrix size had no effect, and there were subtle differences between directions.

Saccade amplitude
Are there differences in the amplitude (distance between starting and ending points of the saccade) of coupled saccades between the invisible and visible matrix, and do matrix size and audio direction have an effect? See Table S13 for descriptive statistics. We performed an analysis (LMM) with matrix visibility, matrix size, direction cue, and their interactions, and time of block as fixed effects and amplitude of coupled saccades as dependent measure (we excluded extreme outliers with amplitudes >10 deg). The final random structure included random intercepts for trials and subjects and random slopes per subject for matrix visibility and direction. The main effect of and interactions with matrix size were not significant (t ≤ 1.70, p ≥ .090; see Table S14 for fixed and random effects). Hence, below, we report results of the model after exclusion of matrix size (Table S15).
We found a main effect of matrix visibility (F 1,55.7 = 13.15, p < .001), showing that saccade amplitudes were about 0.14-0.27 degrees of visual angle smaller when the matrix was visible than when the matrix was invisible (Table S16; see Fig. 5C). Further, a significant main effect of direction cue (F 3,98.7 = 15.19, p < .001) showed that in both the invisible and visible matrix, the vertical directions (up, down) elicited smaller saccade amplitudes than the horizontal directions (right, left). The interaction of matrix visibility and direction cue (F 3,310.7 = 3.37, p = .019) indicated a slightly larger difference between up and right for the invisible matrix compared to the visible matrix.
To summarize, saccade amplitude decreased when a matrix was on the screen compared to when the screen was empty. Further, coupled saccades during vertical direction cues were smaller than during horizontal direction cues.

Variation in saccade amplitude
Besides differences in the directedness of saccades (coupling of saccades) and average amplitude of saccades, Fig. 3 also suggests that eye movements show higher variability in amplitude when the matrix is invisible vs. visible (see also Table S17 for descriptive statistics). Hence, we additionally (not preregistered) calculated the within-subject root mean square error of saccade amplitude as a measure of variation in saccade amplitude. Since calculating variation in a measure requires a larger number of samples (in our case saccades), this measure was calculated per condition combination instead (invisible 3 × 3, invisible 5 × 5, visible 3 × 3, visible 5 × 5). We calculated an ANOVA, including only coupled saccades, with the predictors matrix visibility, matrix size, direction cue, and their interactions. Only matrix visibility was significant (F 1,24 = 30.66, p < .001), showing higher variation in amplitude when the matrix was invisible compared to visible. Matrix size, direction cue, and all interactions were not significant (see Table S18 and Fig. 5D).
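One way to compute such a variation measure is sketched below in Python. The text does not spell out the reference value of the root mean square error; the sketch assumes that it is the root mean square deviation of a participant's coupled-saccade amplitudes from their own mean within one condition combination. The function name and this reading are assumptions.

```python
import math


def rms_variation(amplitudes_deg):
    """Root mean square deviation of amplitudes from their own mean.

    amplitudes_deg: amplitudes (degrees of visual angle) of one
    participant's coupled saccades in one condition combination.
    """
    m = sum(amplitudes_deg) / len(amplitudes_deg)
    return math.sqrt(sum((a - m) ** 2 for a in amplitudes_deg)
                     / len(amplitudes_deg))


# Hypothetical amplitudes for one participant, for illustration only.
by_condition = {
    "invisible 3 x 3": [1.0, 2.0, 3.0],
    "visible 3 x 3": [1.4, 1.5, 1.6],
}
variation = {cond: rms_variation(a) for cond, a in by_condition.items()}
```

In this made-up example, the invisible condition has the larger variation value because its amplitudes are spread more widely around their mean.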

Association between coupling and performance
In the analyses above, we found no effect of matrix size on coupling, although matrix size increased task difficulty and was perceived as more demanding. To explore whether coupling is still associated with task performance, we additionally calculated the within-subject correlation between performance per block and rate of coupling within a block. Please note that performance was overall very high, leaving only few participants with variation in performance (not having 100% correct responses) for this analysis. Only the invisible 5 × 5 matrix had enough variation in performance to have a glimpse at possible associations between rate of coupling and performance. Spearman correlations varied strongly across participants from −0.71 to 0.80 with a median of 0.05 (see Fig. S4). Hence, there was no support for a consistent positive relationship of internal coupling rate with task performance.
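The within-subject correlation can be illustrated with a small, dependency-free Python sketch that computes a Spearman correlation between block-wise coupling rate and block-wise performance for one participant. The function names are hypothetical, and returning None for participants without variation in one of the variables is an assumption that mirrors the note above that participants with 100% correct responses provide no usable variation.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over all entries tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, or None if either variable has no variation."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return sxy / (sxx * syy) ** 0.5
```

Here x would hold one participant's coupling rate per block and y the corresponding block performance (1 = correct, 0 = incorrect); the per-participant coefficients could then be summarized by their range and median.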

Discussion
The goal of the present study was to identify the characteristics and moderators of internally coupled eye movements to further our understanding of the underlying mechanisms. We showed that the eyes follow the imagined movement of a patch in a matrix. The observed coupling of saccades was not solely initiated by the act of hearing directions during passive listening or through semantic processing in active listening. Rather, it was predominantly driven by engaging in the visuospatial working memory (WM) task. The manipulation of workload via matrix size had the expected effects on pupil diameter, perceived demand, and performance, but we found no effect of workload on internal coupling. Coupling of eye movements was strengthened by the availability of spatial reference (having a matrix on screen vs. a blank screen). This was evident through higher frequency, better precision, and reduced vulnerability to noise and horizontal bias in coupled saccades. Below, we discuss our results and their implications in more detail. We start by identifying the main trigger for internal coupling, followed by an exploration of workload's role, and finally, we discuss how availability of spatial reference affects coupled eye movements.

What task elicited internal coupling?
In the first part of our analyses, we showed that the visuospatial WM task triggered coupled saccades. This aligns with and extends findings from previous studies that found internal coupling of eye movements during memory retrieval (looking-at-nothing effect; e.g., Ferreira et al., 2008), attention shifts within mental representations (Gresch et al., 2023; van Ede et al., 2019a), and imagined movement (when trying to estimate time of contact of a moving object; Huber and Krist, 2004). Some studies even reported an effect of simply reading a word conveying spatial information (e.g., "sun"; Dudschig et al., 2013) on eye movements, suggesting that eye movements or spatial relations are associated with the memory content (also words) and are therefore co-activated when retrieving a memory or processing a cue (e.g., Mast and Kosslyn, 2002; Spivey and Geng, 2001). However, these studies showed effects in the form of a speeding-up of congruent eye movements in the task, and not per se the triggering of a saccade (Dudschig et al., 2013). Our study found that internal coupling was mainly triggered by the performance of the visuospatial WM task and only marginally by simply hearing (passive listening) the direction cues. Consequently, our findings provide further support that the internal coupling phenomenon emerges primarily because eye movements follow shifts of attention within mental representations, similar to how they would when exploring the external world (e.g., van Ede et al., 2019a).

No effect of workload
Examining the reasons behind internal coupling of eye movements, workload was a key consideration.Eye movements during internally directed cognition may serve as a form of "cognitive offloading", linking memory to spatial locations (Risko and Gilbert, 2016).The presence of the matrix on the screen enhanced saccade coupling, indicating that participants likely utilized the visible matrix as a scaffold to facilitate mental navigation of the patch within the matrix, a process akin to "offloading."This raises the question: Are coupled saccades integral to offloading or only a byproduct of attention movement?Hence, when no spatial reference is provided and a task needs to performed solely in mind, does internal coupling of saccades continue to function as a form of offloading?If internally coupled saccades are strategically employed to lighten workload, higher workload should lead to increased coupling.
In our study, we observed no effects of workload, which we defined as the mental load placed on an individual without altering the task's quality.We manipulated workload through matrix size (3 × 3 vs. 5 × 5) thereby increasing the mental effort required to maintain, reconstruct and update the current position of the patch in the matrix.It's important to note that workload was defined and manipulated differently in other studies.This variation should be kept in mind when interpreting our findings and comparing them with existing literature.
Some studies suggested more internal coupling of eye movements with increasing workload (Draschkow et al., 2022;Gurtner et al., 2021;Kumcu and Thompson, 2020); see section 1.2 for a more detailed description of these studies).To test the effect of workload in our paradigm, we used a 3 × 3 and a 5 × 5 matrix size.In the absence of a spatial reference (invisible matrix), the 5 × 5 matrix increased pupil diameter, perceived demand and decreased performance compared to the 3 × 3 matrix, indicating a successful manipulation of workload.Nonetheless, higher workload did not increase internal coupling.The lack of workload effects suggest that internal coupling of eye movements might not be strategically used e.g., to reduce workload ("offloading"), but might rather be the effect of an automatic mechanism (e.g., linked to shifts of attention).Adding to that, our exploratory analysis of withinsubject correlations between rate of coupling and task performance showed no overall benefit of coupling.
Interestingly, we found large individual differences in the rate of internal coupling, ranging from 1.5 to 78% (see Fig. S3). Future studies should explore those large individual differences, for example by investigating possible associations between working memory capacity, imaginative skills, and rate of internal coupling. Potentially, internal coupling is beneficial for some people as some sort of compensation mechanism, while it has no benefit or even detrimental effects for other people, resulting in an overall null-correlation between internal coupling and working memory capacity or imagery skills. For example, maybe internal coupling is effective for people with low working memory capacity or low spatial or object imagery skills. Recent studies found that lower working memory capacity (Klichowicz et al., 2022), lower cognitive resources (Wynn et al., 2019), and lower spatial imagery skills were associated with more looking-at-nothing behavior (Johansson et al., 2011). Similarly, in a 3D setting, one study found that lower object imagery skills are related to more looks back to now blank spaces when trying to imagine a previously seen object (Chiquet et al., 2022).
In contrast to that, strong visual imagination skills or higher working memory capacity might be required to build a visual mental representation that is strong enough for eye behavior to couple to it (Gurtner et al., 2021; LaBerge et al., 2018). These possible directions should be investigated further.

Spatial reference changes the characteristics of coupled eye movements
The availability of spatial reference during internal tasks like memory retrieval or mental imagery varies among studies (Gresch et al., 2023; Johansson et al., 2006; van Ede et al., 2019b), but its influence on eye behavior coupling has so far been unclear. In the current study, we tested the effect of spatial reference on coupling of eye behavior by having one condition in which an empty matrix remained on the screen (visible condition; compared to a blank screen in the invisible condition), allowing participants to use it as a scaffold for their task performance. The presence (or absence) of spatial reference emerged as an important factor influencing the coupling of eye behavior. We found that having an empty matrix available resulted in substantially more coupled saccades, and coupled saccades were faster, shorter, had less variation in amplitude, and were less prone to the horizontal bias (see Fig. S2 and S5 for visibility effects in the passive and active listening condition).
When the matrix was invisible, and hence the imagined movement was performed on a mental representation of the matrix, eyes followed the imagined movement around 50% of the time (for horizontal directions). In contrast, when spatial reference was available, eyes followed the imagined movement 75% of the time (for horizontal directions) when performing the visuospatial WM task. This increased coupling when spatial reference was available resembles findings from the looking-at-nothing effect: When people have to remember information that was presented at specific positions, they look back to those positions more often when there is still some spatial reference on the screen, e.g., an empty grid or placeholders for the positions (Loaiza and Souza, 2022; Spivey and Geng, 2001). The current study extends the literature by showing that this effect is not only present when trying to remember something that was visually presented, but also when imagining movement in visuospatial WM. Given the strength of the spatial reference effect, it would be wise to consider the availability of spatial reference when interpreting existing studies and when planning future research.
Assessment of saccade amplitude and variation in saccade amplitude further showed that the lack of spatial reference made saccade amplitudes larger and noisier (see Fig. S11 for illustration of visibility effects on eye parameters in all three tasks). Without spatial reference on the screen, it becomes harder for the visual system to plan and execute saccades (Nuthmann et al., 2016).
We also assessed the horizontal bias in coupled eye movements. In free viewing conditions, humans generally make more horizontal than vertical (especially down) eye movements (Foulsham and Kingstone, 2010; Tatler and Vincent, 2009), and this horizontal bias is most likely the result of asymmetries in higher-level attentional processes (Ossandón et al., 2014). See Fig. S2 for illustration of the horizontal bias in all three tasks. We found that eyes couple more often to horizontal directions (left, right) than to vertical directions (up, down). The horizontal bias was smaller in the visible matrix compared to the invisible matrix condition in the visuospatial WM task. Why was the horizontal bias smaller when the matrix was visible and larger when the matrix was invisible during the visuospatial WM task? When the matrix is visible and the visuospatial WM task is performed, attention to the matrix and the resulting bottom-up input could override some of the higher-level horizontal bias. When the screen is empty and the matrix has to be imagined, this imagination is less strong than actual perception (Koenig-Robert and Pearson, 2021) and therefore less able to override the horizontal bias.
Hence, when helpful spatial reference is available, like an empty matrix, one can use this visual information as a scaffold instead of relying only on mental representations. By using the visual information as a scaffold, task load can be reduced ("offloaded"). In our paradigm, participants likely imagined the black patch moving along the visible matrix, thereby reducing the mental load of maintaining a mental representation of the whole matrix. In contrast, when no spatial reference is available, the matrix must be imagined. The absence of workload effects on coupled saccades in our study, in combination with the relatively high degree of coupling observed even in the invisible matrix condition, indicates that eye movements align with shifts in attention across both visually and internally represented information. This suggests that such eye movements are not merely strategic efforts to reduce workload ("offloading"), but rather a natural response to where attention is directed.
The mental representation of the matrix can be expected to be less vivid than the visual perception of the visible matrix (Koenig-Robert and Pearson, 2021), making it harder for the visual system to plan and execute eye movements (LaBerge et al., 2018). It takes more time and effort to produce an internal representation of the matrix than to build a perception of the visually available matrix (Koenig-Robert and Pearson, 2021). Moreover, the mental representation does not match the detail and clarity of the perception, leading to delays in the planning of saccades and to reduced saccade accuracy (specifically, overshoot for short distances like in our paradigm; Nuthmann et al., 2016). Liu et al. (2022) demonstrated that not all shifts of attention within mental representations are associated with gaze shifts. They argued that although the shift of attention (co-)activates the oculomotor system, it does not always reach the threshold for outputting a saccade. The lower quality of mental representations compared to perception could be one reason why the threshold for outputting a saccade is less often reached. As a result, shifts of attention without spatial reference may more frequently occur covertly, lacking accompanying eye movements. This may also partly explain why we observed lower internal coupling in the invisible matrix condition.
Several studies showed that covert shifts of attention can have beneficial effects similar to eye movements (i.e., overt shifts of attention) on retrieving information associated with spatial locations (Godijn and Theeuwes, 2012; Scholz et al., 2018). In a similar vein, previous studies found that when participants had to perform an eye movement task (smooth pursuit, target-distractor task, or simply maintain fixation, respectively) in parallel to the visuospatial WM task, and hence internal coupling was restricted, performance was still high (Korda et al., 2023, 2024; Walcher et al., 2023). These findings suggest that eye movements follow internal shifts of attention automatically (Huber and Krist, 2004).

Limitations
Since this was an eye-tracking study, participants could have executed more saccades in accordance with the visuospatial WM task because they might have heard stories of "the eyes following the mind" and might have wanted to please the experimenter by showing exactly that behavior. We tried to counteract this effect by telling participants that this study examined effects on pupil diameter. Furthermore, by choosing matrix sizes that proved to be demanding in our in-house pilots (especially the 5 × 5 matrix), we tried to reduce the chance of intentional manipulation of eye behavior. Intentional eye movements require top-down control and cognitive resources. We showed in previous studies that performing the visuospatial WM task even with the 3 × 3 matrix interferes with voluntary eye movements required by a parallel task (Korda et al., 2023; Walcher et al., 2023). Hence, it is unlikely that the pattern of internal coupling found in the current study is the result of participants' expectations and intentional manipulation of eye movements.
We recognize that the instructions given in the passive and active listening tasks might have influenced participants' natural tendencies for eye movement and attention. Specifically, the directive to remain disengaged in the passive listening task could inhibit spontaneous gaze shifts, while the active listening task's unique demands (react to numbers but not directions) might prompt participants to suppress eye movements to focus on the task at hand. If simply hearing spatial information (directions in our study, "up, down, left, right") already facilitates eye movements in the corresponding direction, one would still expect an effect on eye movements even if ocular activity was reduced.
In our study, "workload" is defined as the mental load imposed on an individual without changing the quality of the task, manifested through the complexity of the stimuli involved, which increases the mental effort to maintain the information. The manipulation of matrix size achieved this by increasing the number of possible positions and paths of the patch in the matrix. This conceptualization and operationalization of workload partly differs from other studies and might explain divergence in results.
We acknowledge possible alternative explanations for the lack of differences between the 3 × 3 and 5 × 5 matrix regarding coupled saccades. Potentially, the 5 × 5 matrix was still not challenging enough to provoke a systematic use of coupling to support task performance. However, the self-reported demand and imageability as well as the increase in pupil diameter and drop in performance (from almost 100% to about 50%) in the invisible matrix conditions suggest an effective manipulation of workload. Additionally, our in-house piloting had not suggested a "kicking in" of coupling as a compensation strategy when matrix size was increased even further. Another alternative explanation could be a ceiling effect: the 3 × 3 matrix might have already triggered the maximum coupling response in the invisible matrix condition, making further increases in matrix size irrelevant. For example, coupling of eye movements in the invisible matrix may be limited by the absence of visual cues and the inherent quality of mental imagery, as discussed earlier. Therefore, additional research incorporating a wider range of workload levels and spatial reference conditions is necessary to conclusively address alternative explanations.

Conclusion
This study presents a comprehensive analysis of the characteristics and moderators of internally coupled eye movements during visuospatial WM and thereby provides valuable insights into the underlying mechanisms at play. Our findings revealed that eye movements frequently align with the execution of imagined movements, and this is not merely a result of auditory or semantic direction processing. When spatial reference, such as a visible matrix, was available, participants exhibited more frequent and more accurately coupled saccades, suggesting that spatial reference was used to scaffold imagination. The absence of workload effects on coupled saccades in our study, in combination with the relatively high degree of coupling observed even in the invisible matrix condition (ca. 50%), indicates that eye movements align with shifts in attention across both visually and internally represented information. This suggests that coupled eye movements are not merely strategic efforts to reduce workload, but rather a natural response to where attention is directed. We further found substantial individual differences in internal coupling effects, highlighting the need for deeper investigation into how differences in working memory capacity and the quality of mental representations influence internal coupling.

Fig. 1. Time Course of a Single Block in the Visuospatial Working Memory Task. Note. Each block started with the information of the block's task (visuospatial WM, active listening, passive listening; not depicted here). Then the block's starting stimulus was presented, which was a 3 × 3 or 5 × 5 matrix with one patch filled (visuospatial WM task) or an empty matrix (active and passive listening task). After 1 s, the trials started. In half of the blocks, the empty matrix remained on the screen (visible condition); in the other half, the screen was empty (invisible condition). Each trial within the block started with the 0.6 s presentation of the direction cue (left, right, up, down) via speakers and lasted for 1 s, before the next trial started. In the active listening task, some direction cues were replaced by numbers and participants were instructed to press the spacebar as fast as possible when hearing one of these numbers. In the passive listening task, the visual and auditory setup was the same as in the visuospatial WM task, but participants had no task to perform. Each block comprised 10 trials. At the end of the block, an empty matrix appeared and participants were prompted to report the last position of the patch (visuospatial WM task) or not (active and passive listening task) and continue by spacebar press. The matrix in the figure is not to scale.

Fig. 2. Performance (A), Perceived Demand (B), Perceived Imagination (C), and Pupil Diameter (D) in the Visuospatial WM Task as a Function of Matrix Visibility and Matrix Size. Note. Bars represent overall mean; error bars indicate 95% confidence intervals corrected for within-subject design. Lines represent each participant's mean (plus jitter to avoid overplotting). The plots show means and confidence intervals calculated for the presented data, not derived from the statistical model used for drawing inferences.
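The captions do not state which correction was applied to obtain confidence intervals "corrected for within-subject design". As an illustration only, the following minimal sketch assumes the commonly used Cousineau normalization with Morey's correction factor. The function name `within_subject_ci`, the data layout, and the caller-supplied critical value `t_crit` (roughly 2.01 for 50 participants) are hypothetical choices made for this example, not details taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def within_subject_ci(data, t_crit):
    """Return (condition mean, CI half-width) for each condition.

    data:   one list per participant, holding one value per condition
            (hypothetical layout, assumed for this sketch).
    t_crit: critical t value for n - 1 degrees of freedom, supplied by
            the caller (assumption; roughly 2.01 for n = 50).
    """
    n = len(data)        # number of participants
    k = len(data[0])     # number of within-subject conditions
    grand_mean = mean(v for row in data for v in row)

    # Cousineau normalization: remove each participant's overall offset
    # so that only within-subject variability remains.
    normalized = [[v - mean(row) + grand_mean for v in row] for row in data]

    # Morey correction for the bias introduced by the normalization.
    morey = sqrt(k / (k - 1))

    result = []
    for c in range(k):
        column = [row[c] for row in normalized]
        sem = stdev(column) / sqrt(n)
        condition_mean = mean(row[c] for row in data)
        result.append((condition_mean, t_crit * morey * sem))
    return result
```

If every participant shows the same difference between conditions, the normalized values are identical across participants and the interval collapses to zero width. This is the intended behavior: between-participant offsets no longer inflate the error bars.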

Fig. 3. Distribution of Gaze Positions (A) and Gaze Position Change during Saccades (B) as a Function of Task, Matrix Visibility, and Matrix Size. Note. This visualization of the gaze position distribution allows a first glimpse at how eye behavior was determined by task, matrix visibility, and matrix size. We zoomed in to 5° visual angle from screen center. Gaze position change reflects the difference in gaze position from saccade starting position to saccade landing position.

Fig. 4. Proportion of Trials with Saccades per Task (A), and Proportion of Trials with Coupled Saccades within the Visuospatial WM Task as a Function of Matrix Visibility, Matrix Size, and Audio Direction (B). Note. Prop. Trial with Coupling stands for the number of trials that show a saccade in the direction of the direction cue divided by the number of valid trials (values can range from 0 to 1). The plots show means and confidence intervals calculated for the presented data, not derived from the statistical model used for drawing inferences. Lines represent means of each participant. Error bars indicate 95% confidence intervals, corrected for within-subject design. vsWM = visuospatial working memory task.

Fig. 5. Proportion of Trials with Coupled Saccade (A), Latency (B), Amplitude (C), and Variation in Amplitude of Coupled Saccades (D) in the Visuospatial WM Task as a Function of Matrix Visibility and Audio Direction. Note. Prop. Trial with Coupling stands for the number of trials that show a saccade in the direction of the direction cue divided by the number of valid trials (values can range from 0 to 1). Saccade latency was calculated as the time between onset of a trial (onset of audio direction) and onset of the saccade. The plots show means and confidence intervals calculated for the presented data, not derived from the statistical model used for drawing inferences. Lines represent means of each participant. Error bars indicate 95% confidence intervals, corrected for within-subject design.
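The two measures defined in the note above (proportion of trials with coupling, and saccade latency) can be illustrated with a minimal sketch. The `Trial` record, its field names, the sign convention (positive `dy` = up), and the dominant-axis rule used to assign a saccade to one of the four directions are all assumptions made for this example. The criteria actually used to classify saccade direction in the study may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    # Hypothetical trial record; all field names are assumptions.
    cue: str                         # "left", "right", "up", or "down"
    cue_onset: float                 # onset of the audio direction cue (s)
    saccade_onset: Optional[float]   # onset of the saccade (s); None if none
    dx: float = 0.0                  # landing minus start, horizontal; + = right
    dy: float = 0.0                  # landing minus start, vertical; + = up
    valid: bool = True               # False if the trial was excluded


def saccade_direction(dx: float, dy: float) -> str:
    """Assign a gaze position change to its dominant axis (assumed rule)."""
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"


def is_coupled(trial: Trial) -> bool:
    """A trial is coupled if it has a saccade in the direction of the cue."""
    return (trial.saccade_onset is not None
            and saccade_direction(trial.dx, trial.dy) == trial.cue)


def coupling_proportion(trials) -> float:
    """Trials with a coupled saccade divided by valid trials (0 to 1)."""
    valid = [t for t in trials if t.valid]
    if not valid:
        return float("nan")
    return sum(is_coupled(t) for t in valid) / len(valid)


def coupled_latencies(trials) -> list:
    """Time from cue onset to saccade onset for coupled, valid trials."""
    return [t.saccade_onset - t.cue_onset
            for t in trials if t.valid and is_coupled(t)]
```

For instance, given one valid trial with a leftward saccade after a "left" cue, one valid trial without a saccade, and one valid trial with an upward saccade after a "right" cue, `coupling_proportion` returns one third, and `coupled_latencies` contains only the latency of the first trial.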

Table 1
Descriptive Statistics of Proportion of Trials with Coupled Saccades.

Table 3
Proportion of Trials with Coupled Saccades, Pairwise Comparisons of Matrix Visibility and Direction Cue.
Note. 95% confidence intervals; effect size approximates Cohen's d; inv. Odds Ratio = inverse of the Odds Ratio, hence comparison in the other direction.