Left hemispatial neglect and overt orienting in naturalistic conditions: Role of high-level and stimulus-driven signals

Deficits of visuospatial orienting in brain-damaged patients affected by hemispatial neglect have been extensively investigated. Nonetheless, spontaneous spatial orienting in naturalistic conditions is still poorly understood. Here, we investigated the role played by top-down and stimulus-driven signals in overt spatial orienting of neglect patients during free-viewing of short videos portraying everyday life situations. In Experiment 1, we assessed orienting when meaningful visual events competed on the left and right side of space, and tested whether sensory salience on the two sides biased orienting. In Experiment 2, we examined whether the spatial alignment of visual and auditory signals modulates orienting. The results of Experiment 1 showed that in neglect patients severe deficits in contralesional orienting were restricted to viewing conditions with bilateral visual events competing for attentional capture. In contrast, orienting towards the contralesional side was largely spared when the videos contained a single event on the left side. In neglect patients the processing of stimulus-driven salience was relatively spared and helped orienting towards the left side when multiple events were present. Experiment 2 showed that sounds spatially aligned with visual events on the left side improved orienting towards the otherwise neglected hemispace. Anatomical scans indicated that neglect patients suffered grey and white matter damage primarily in the ventral frontoparietal cortex. This suggests that the improvement of contralesional orienting associated with visual salience and audiovisual spatial alignment may be due to processing in the relatively intact dorsal frontoparietal areas.
Our data show that in naturalistic environments, the presence of multiple meaningful events is a major determinant of spatial orienting deficits in neglect patients, whereas the salience of visual signals and the spatial alignment between auditory and visual signals can counteract spatial orienting deficits. These results open new perspectives to develop novel rehabilitation strategies based on the use of naturalistic stimuli.


Keywords: Hemispatial neglect; Overt orienting; Naturalistic settings; Eye movements; Stimulus salience

Introduction
Stroke is one of the main causes of permanent disability in Western countries. When occurring in the right hemisphere, it commonly results in hemispatial neglect, a complex neurological syndrome characterised by reduced ability to spatially orient towards the contralesional (left) hemispace (Bartolomeo, Thiebaut de Schotten, & Doricchi, 2007; Doricchi, de Schotten, Tomaiuolo, & Bartolomeo, 2008; Husain, 2008; Vallar, 1998). The presence of multiple stimuli/objects competing for processing resources reduces the ability of neglect patients to orient in space (e.g., cancellation tasks, see Albert, 1973; Menon & Korner-Bitensky, 2004; Rorden & Karnath, 2010; and visual extinction test, see Karnath, 1988; Driver & Vuilleumier, 2001). Consistent with an attentional (rather than a perceptual-based) explanation of neglect, spatial orienting deficits and visual extinction can be modulated by low-level characteristics of the stimuli, such as similarity or perceptual grouping (Baylis, Driver, & Rafal, 1993; Gilchrist, Humphreys, & Riddoch, 1996; Ward, Goodrich, & Driver, 1994), as well as high-level factors such as expectation, working memory content, task-demand and/or action-relatedness (Ptak, Valenza, & Schnider, 2002; Rafal, Danziger, Grossi, Machado, & Ward, 2002; Riddoch, Humphreys, Edwards, Baker, & Willson, 2003; Soto & Humphreys, 2006; Wulff & Humphreys, 2013; see Riddoch, Rappaport, & Humphreys, 2009 for review).
Although extremely useful in clinical settings, traditional experimental paradigms fail to capture the complexity of signals occurring in real-life situations, where both high- and low-level factors jointly contribute to govern orienting behaviour (Macaluso & Doricchi, 2013; Nardo, Console, Reverberi, & Macaluso, 2016; Santangelo, Olivetti Belardinelli, Spence, & Macaluso, 2009). A few recent studies assessed spatial orienting deficits using more naturalistic conditions (Fellrath & Ptak, 2015; Machner et al., 2012; Müri, Cazzoli, Nyffeler, & Pflugshaupt, 2009). Machner et al. (2012) presented neglect patients with both static pictures and dynamic videos of natural scenes in free-viewing conditions. Their results revealed that low-level sensory features (e.g., brightness, colour, static and dynamic contrasts) contribute to overt eye-movements in neglect, with patients fixating more regions with a high sensory contrast in the contralesional space. The same study also included an active search condition, where participants had to detect a predefined target (e.g., pressing a button if the scene contained a bus). In the control group, this led to a reduced contribution of low-level signals, with participants orienting towards the target location even when this had little contrast. However, this was not the case in neglect patients, suggesting alterations in the mechanisms regulating the interactions between top-down and stimulus-driven signals (see also Ptak & Fellrath, 2013).
These previous studies manipulated top-down, endogenous signalling by introducing explicit, goal-directed tasks. However, in real-life situations endogenous signalling is also associated with other types of high-level signals, such as those arising from the processing of meaningful events. While competition between stimulus-driven signals can be characterised by using computational models based on low-level features (e.g., Saliency Maps; see Itti, Koch, & Niebur, 1998; Itti & Koch, 2001), competition between semantic events needs to be addressed differently. The latter is based on the relationship between an object/agent and a meaningful context. In a recent study on healthy participants, we used short videos of everyday life situations in free-viewing conditions (i.e., without an explicit goal-directed task). We assessed the impact of low-level competition by using Saliency Maps and high-level competition by varying the number of semantically-relevant events (single vs multiple; cf. Nardo et al., 2016). The results showed that stimulus-driven salience affected spatial orienting only in the presence of multiple competing (but not single) semantic events, indicating an interaction between stimulus-driven and internal/semantic signalling during processing of naturalistic stimuli, in the absence of any goal-directed task (cf. Machner et al., 2012).
Here, we investigated how high-level (i.e., distinctive and context-related semantically meaningful visual events) and stimulus-driven signals (visual salience and audiovisual spatial alignment, see below) contribute to spatial orienting behaviour of neglect patients (cf. Snow & Mattingley, 2006) using dynamic naturalistic stimuli and in the absence of any goal-directed task. The first experiment included short videos without any sound (visual only experiment, 'Vonly') and the participants were asked to freely view the stimuli. Operationally, we manipulated the competition between high-level representations by presenting videos that either included a single semantically meaningful event on one side, or multiple events on both sides of space (cf. Nardo et al., 2016). The time spent looking towards the left/right side was the primary dependent measure. Based on previous findings that brain damage in patients with hemispatial neglect would primarily entail the ventral attentional system (Corbetta & Shulman, 2011; Karnath & Rorden, 2012; Mort et al., 2003; Vallar & Perani, 1986), we hypothesised that these patients would show a contralesional orienting deficit selectively when stimuli contain multiple events competing for processing resources (see Ptak & Valenza, 2005; Geng & Behrmann, 2006, for related results using visual displays with simple stimuli), while the processing of stimulus-driven signals would be relatively intact even in the contralesional hemispace (cf. Machner et al., 2012; but see He et al., 2007, about the role of connectivity between the two systems, and see Discussion).
Cortex 113 (2019) 329–346
In the second experiment, we investigated orienting behaviour in naturalistic audiovisual conditions ('AVstim'). While the presence of multiple objects/events within a visual scene generally intensifies spatial orienting deficits in neglect patients (Bartolomeo & Chokron, 2001; Geng & Behrmann, 2006; Coulthard, Nachev, & Husain, 2008; Riddoch & Humphreys, 1983), multiple stimuli in different sensory modalities, but at the same location, can improve the orienting performance of neglect patients (Frassinetti, Pavani, & Làdavas, 2002; Robertson, Mattingley, Rorden, & Driver, 1998; Van Vleet & Robertson, 2006). Here, we made use of short videos portraying everyday life situations, with sounds presented either on the same side (spatially-congruent) or on the opposite side (spatially-incongruent) of the main semantically-distinctive visual event. Based on our previous imaging results in healthy subjects showing audiovisual spatial interaction in the dorsal parietal cortex (Nardo, Santangelo, & Macaluso, 2014), we predicted that neglect patients would show spared influences of audiovisual spatial alignment on orienting behaviour. In particular, we expected that coupling a left visual event with a spatially-congruent sound also on the left side would increase orienting towards the contralesional hemispace (Frassinetti et al., 2002).
To summarise, we used naturalistic stimuli to investigate the role of high-level, semantically-distinctive visual events and stimulus-driven signals (visual salience and audiovisual spatial alignment) for overt spatial orienting in neglect patients. Unlike the vast majority of previous studies on spatial orienting and orienting deficits, here the manipulation of high-level endogenous signals did not entail any task-directed goal, but rather concerned implicit processing related to internal knowledge (cf. also Riddoch et al., 2003) that characterises any everyday life situation. The use of this methodological approach should allow us to bridge the gap between previous results produced in highly controlled (but rather artificial) laboratory conditions and real-life situations that neglect patients experience in their everyday life.

2. Material and methods
Demographic and clinical data of all participants are summarized in Table 1. N+ and N− were recruited among hospitalized brain-damaged patients at the Santa Lucia Foundation, Rome, Italy. The study was approved by the independent Ethics Committee of the Foundation. HS were recruited by means of private announcements. Exclusion criteria for patients (both N+ and N−) were: tumour aetiology, presence of left-sided or diffuse/bilateral brain lesions, and presence of speech impairments. Healthy subjects reported no history of psychiatric or neurological disease or drug abuse. All patients and healthy subjects were right-handed, had normal or corrected-to-normal (contact lenses) visual acuity, as well as self-reported normal hearing. After having received instructions, all participants gave their written informed consent.
All patients were re-assessed on the day of the experiment with the letter cancellation and line bisection tests, to evaluate any change in neglect severity as a function of the time elapsed from lesion occurrence, plus the gap detection task (Ota, Fujii, Suzuki, Fukatsu, & Yamadori, 2001), to exclude the presence of allocentric neglect (see Table 1). HS also underwent these three neuropsychological tests before the beginning of the experiment, to exclude the presence of any visuospatial deficit. Patients also underwent a dynamic perimetry (to exclude the presence of any visual field reduction), and a visual extinction task (as described in Lecce et al., 2015).

Experimental design
This study included two behavioural experiments. In the Vonly experiment, participants were presented with videos portraying everyday life scenes to investigate visuospatial orienting deficits in complex visual environments. The role of competition between co-occurring visual events and the role of stimulus-driven signals (salience) were investigated by presenting distinctive visual events either as single events lateralised to the left/right side of space (Lat-trials), or as multiple events presented across both sides (Multi-trials), and by quantifying stimulus salience using a computational approach (see Visual salience). In the AVstim experiment, naturalistic sounds were delivered together with everyday life scenes to investigate the effect of crossmodal spatial interactions on spatial orienting deficits. We manipulated the spatial alignment between the side of the visual event and the side of sound delivery, thus obtaining spatially-congruent versus spatially-incongruent stimuli. In both experiments, we asked our participants to freely view the videos. The dependent variable was the ratio of time spent looking towards the left versus right visual hemispace, that is, a measure of efficacy of the stimuli in guiding visuospatial orienting (Gaze_idx, see Eye-movements).

Vonly
Stimuli consisted of 140 videos showing everyday life situations (cf. Nardo et al., 2016). The videos were non-Italian TV commercial clips, either purchased from an Advertising Archive (http://www.coloribus.com) or downloaded from YouTube. Using video editing software (Final Cut Pro, Apple Inc.), we selected 1.5 sec video-segments that included a single continuous meaningful scene with either one lateralised distinctive event (Lat-trials) or multiple events on both sides (Multi-trials). The majority of distinctive events consisted of one or more persons in the foreground, who either performed an action (e.g., walking, dancing, manipulating objects, etc.) or changed posture. In approx. 10% of videos, the event consisted of a moving vehicle (car, motorbike, plane, etc., equally distributed across conditions). The selected segments did not include any writings/watermarks in the foreground. Stimuli were further divided into 'Left' and 'Right' as follows. Lateralised stimuli were sorted into L/R according to the side of the distinctive visual event. Videos containing multiple visual events were categorised as L/R on the basis of the corresponding Saliency Maps (Sal_idx, see Visual salience). The full set of videos included 80 lateralised trials and 60 multiple trials, equally split into left- and right-trials.

AVstim
Stimuli consisted of 96 short videos created on purpose (cf. Nardo et al., 2014) displaying everyday life situations in real environments. Each stimulus contained a distinctive main visual event on the left or right side, plus an environmental sound associated with the visual scene. The main visual event consisted of either an action performed by the agent (e.g., someone putting an object on a table) or the setting off of a device (e.g., switching on the TV). The sound was produced either by the actor's action (e.g., the noise of the object hitting the table) or by an electronic device present in the scene (e.g., computer, mobile phone, radio, TV, etc.). Stimuli had a duration of 2.5 sec. By crossing the side of the main visual event and the left/right position of the sound, we obtained four main experimental conditions: spatially-congruent AV trials on the left or right side (Lcon and Rcon), and left/right spatially-incongruent AV trials (Linc and Rinc, where the L/R label refers to the position of the visual event). In congruent trials, the sound was produced either by the same agent associated with the main visual event or by a different one. In incongruent trials, the sound was still produced by a person/object in the scene, but not the one associated with the main visual event. In addition, all videos were also presented without any sound (NoS: no-sound conditions). This allowed us to ensure that any difference between congruent and incongruent conditions did not simply arise because of some uncontrolled visual feature distinguishing the two classes of videos (cf. Results). Both the distinctive visual event and the sound took place approx. 1 sec after the video onset. There were 56 spatially-congruent trials and 40 spatially-incongruent trials (and 96 silent versions of the stimuli as a control). Within each category, 50% of the visual events included a human agent and 50% a device, balanced for left/right side of presentation.

Note to Table 1: Test scores express the difference of items correctly detected in the left and right hemispace. Line bisection is expressed in mm of rightward bias from the midpoint. Note that, although on average at the time of the experiments the severity of neglect symptoms had reduced from the admittance to the care centre (cf. the different values on the letter cancellation and line bisection tests), N+ still showed significantly different scores from both N− and HS.

Procedure
The experiments (10 min each) were carried out in a quiet and dimly lit room. The participants were seated comfortably in front of a laptop equipped with a portable eye-tracking system (RED-m Eye Tracking System 3.2; SensoMotoric Instruments, operating at 120 Hz), at a distance of approx. 60 cm from the screen. To make the procedure easier for patients, the calibration of eye position was based on a single (central) fixation point, then validated with a four-point (corners) procedure. The visual stimuli covered a visual angle of approx. 25 × 14 deg. In the AVstim experiment, the auditory stimuli were delivered unilaterally either on the left or right side of the scene by means of two loudspeakers placed close to the screen of the laptop. In both experiments, participants were simply asked to freely view the videos, without any explicit task. In the Vonly experiment, lateralised and multiple stimuli were counterbalanced by presenting half of the stimuli in their original orientation (i.e., left is left, right is right), and the other half in a flipped configuration (i.e., left is right, right is left) as a control for possible left/right biases in the videos related to their specific content both in terms of events (e.g., size, presence of objects/people) and stimulus-driven signals (i.e., salience). Videos presented in the original or flipped versions were counterbalanced across subjects, so that each of the 140 videos was presented to half of the subjects in the original, and to the other half in the flipped version.
In the AVstim experiment, half of the stimuli were presented with sounds (S), whereas the other half included only silent versions of the videos (NoS). This allowed assessing the effect of sound presence, over and above any spatial bias associated with the visual component of the stimuli (i.e., the interaction between left/right side of the visual event and presence/absence of the sound), and any uncontrolled differences between congruent and incongruent videos (note that each video could be used only for one of these two conditions). The presentation of sound (S) and silent (NoS) conditions was counterbalanced across subjects, so that each of the 96 videos was presented to half of the subjects with sound and to the other half without any sound. The order of presentation of the 96 stimuli was randomized across subjects.
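The counterbalancing described above can be illustrated with a minimal Python sketch. The source does not state how versions were assigned to individual subjects; the parity rule below, and the names `version_for`, `subject_index` and `video_index`, are assumptions chosen only because they give each video in each version to half of the subjects (for an even number of subjects).

```python
def version_for(subject_index, video_index, versions=("original", "flipped")):
    """Pick one of two stimulus versions from the parity of the two indices.

    Hypothetical scheme (not taken from the paper): for any given video,
    consecutive subjects receive alternating versions, and for any given
    subject, consecutive videos alternate as well.
    """
    return versions[(subject_index + video_index) % 2]


# Example: 10 subjects and the 140 Vonly videos.
schedule = {
    (subject, video): version_for(subject, video)
    for subject in range(10)
    for video in range(140)
}
```

The same function could be called with `versions=("S", "NoS")` to sketch the sound/no-sound counterbalancing of the 96 AVstim videos.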
The Inter-Trial Intervals varied between 3.5 and 5 sec in the Vonly experiment (mean 4.25 sec) and between 4.5 and 6 sec in the AVstim experiment (mean 5.25 sec), during which a fixation point was presented at the centre of the screen.

Visual salience
For the Vonly experiment, we indexed the level of lateralisation of stimulus-driven visual signals using the computational model of visual salience proposed by Itti and colleagues (Itti et al., 1998; Itti & Koch, 2001). The videos were analysed using the MT_TOOLS toolbox developed in-house (http://www.slneuroimaginglab.com/mt-tools). Saliency Maps were computed by using local centre-surround contrasts separately for intensity, colour, orientation, flicker and motion (Itti et al., 1998; Itti & Koch, 2001). This generated a series of conspicuity maps that were then combined into a single Saliency Map by equally weighting each visual feature. The resulting Saliency Map displays the most salient locations within a bidimensional space, representing the vertical and horizontal axes of a given visual stimulus for each frame of the video. On the basis of the Saliency Maps, a visual salience index (Sal_idx) was computed as the ratio between the salience of the left and the right side of the video. For each video, on a frame-by-frame basis, we extracted the mean salience separately for the two sides, excluding a central area of 2 deg. These values were averaged across all frames of the video. The values for the two sides were then subtracted and normalised [(R − L)/(R + L)], so as to obtain a single index ranging between 1 (salience fully lateralised on the right side) and −1 (salience fully lateralised on the left side). The Sal_idx of all videos with a single lateralised event (Lat-trials) was congruent with the side of the visual event (i.e., positive Sal_idx for all Rlat-trials, and negative Sal_idx for all Llat-trials). For videos including multiple events (Multi-trials), the Sal_idx was positive for half of the stimuli and negative for the other half. Thus, Sal_idx was used to categorise these videos into left and right conditions (i.e., Lmulti- and Rmulti-trials).
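As an illustration of the Sal_idx computation described above, the following plain-Python sketch takes a video as a list of Saliency Maps and returns (R − L)/(R + L). It is not the MT_TOOLS implementation: the function names, the representation of each map as a 2D list of non-negative values, and the `central_cols` parameter (the excluded central strip expressed in pixel columns rather than degrees) are all assumptions made for the example.

```python
def side_means(frame, central_cols=0):
    """Mean salience of the left and right sides of one Saliency Map.

    `frame` is a 2D list (rows of pixel values). Columns whose centre falls
    within `central_cols` / 2 of the vertical midline are skipped.
    """
    width = len(frame[0])
    mid = width / 2
    half_gap = central_cols / 2
    left, right = [], []
    for row in frame:
        for col, value in enumerate(row):
            centre = col + 0.5  # horizontal centre of this pixel column
            if centre < mid - half_gap:
                left.append(value)
            elif centre > mid + half_gap:
                right.append(value)
    return sum(left) / len(left), sum(right) / len(right)


def sal_idx(frames, central_cols=0):
    """Average the per-frame side means over frames, then normalise.

    Returns a value between -1 (salience fully on the left) and
    1 (salience fully on the right).
    """
    means = [side_means(frame, central_cols) for frame in frames]
    left = sum(m[0] for m in means) / len(means)
    right = sum(m[1] for m in means) / len(means)
    return (right - left) / (right + left)
```

Under this sign convention, a Multi-trial video with a negative index would be labelled Lmulti and one with a positive index Rmulti. The sketch assumes that at least one side has non-zero salience, otherwise the normalisation is undefined.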

Eye-movements
The dependent variable of the present study was the ratio of time participants spent fixating on the left versus right hemispace (Gaze_idx). Raw eye-tracking data were processed using the MT_TOOLS toolbox. Fixations were defined as gaze position remaining within an area of 1.5 × 1.5 deg. for a minimum duration of 100 msec. In order to obtain a specific measure of how the presentation of the short videos affected orienting behaviour, we ensured that the participant's gaze was central before the video onset. For each subject and each trial, we considered only eye-traces where the pre-stimulus gaze position was within ±2 deg. of the centre of the screen. The selection of trials with central fixation allowed us to ensure the correspondence between left/right side of the stimulus and left/right side of the stimulated visual field at the start of each trial. Furthermore, this enabled us to specifically target overt spatial shifts associated with salience and meaningful events, rather than any stimulus-unrelated sustained bias of gaze position (see also Table S1 in the Supplementary Material, reporting the percentages of trials excluded because of pre-stimulus biases). In addition, we considered only trials where at least 50% of data points during the presentation of the video could be categorised as fixations, i.e., excluding trials including many blinks and/or other artefacts. We counted how many trials satisfied the specified criteria and discarded participants who had fewer than 4 trials for each experimental condition (see also Participants).
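The inclusion criteria above can be summarised in a short Python sketch. The thresholds come from the text; the dictionary keys, the function names, and the choice to apply the ±2 deg. criterion to both the horizontal and the vertical pre-stimulus offset are assumptions made for illustration only.

```python
from collections import Counter

MAX_PRESTIM_OFFSET_DEG = 2.0   # pre-stimulus gaze within +/-2 deg of centre
MIN_FIXATION_FRACTION = 0.5    # at least 50% of samples classified as fixation
MIN_TRIALS_PER_CONDITION = 4   # participants with fewer valid trials are discarded


def trial_is_valid(trial):
    """Apply the two per-trial criteria to one trial record (a dict)."""
    central = (
        abs(trial["prestim_x_deg"]) <= MAX_PRESTIM_OFFSET_DEG
        and abs(trial["prestim_y_deg"]) <= MAX_PRESTIM_OFFSET_DEG
    )
    enough_fixations = (
        trial["n_fixation_samples"] / trial["n_samples"] >= MIN_FIXATION_FRACTION
    )
    return central and enough_fixations


def participant_is_retained(trials, conditions):
    """Keep a participant only if every condition has enough valid trials."""
    counts = Counter(t["condition"] for t in trials if trial_is_valid(t))
    return all(counts[c] >= MIN_TRIALS_PER_CONDITION for c in conditions)
```

For example, a participant with four valid Llat trials but no valid Rlat trials would not be retained when both conditions are required.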
The computation of the Gaze_idx comprised several steps. First, for each subject, we extracted the time spent fixating the left and right side of the display (Ltime and Rtime), considering the full stimulus duration in the Vonly experiment, and the 1.5 sec window after the onset of the main visual event/sound in the AVstim experiment. The computation excluded any datapoint falling into a 2 deg. central area, because small deviations of horizontal gaze position around the centre of the screen (even below the spatial precision of our measurement) would inappropriately affect the Ltime/Rtime ratios. Second, we computed the difference between the time spent in the two hemispaces and normalised this between 1 and −1 [i.e., (Rtime − Ltime)/(Rtime + Ltime); cf. Sal_idx above]. For each video, we obtained a Gaze_idx by averaging the individual values across participants, separately for the four groups (N+H−, N+H+, N−, HS).
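A minimal sketch of these steps in Python follows. It assumes that gaze data are available as horizontal positions of fixation samples in degrees relative to the screen centre (negative values on the left), that samples are equally spaced in time so that sample counts stand in for Ltime and Rtime, and that the 2 deg. central area spans ±1 deg.; these assumptions and all names are illustrative rather than taken from MT_TOOLS.

```python
CENTRAL_HALF_WIDTH_DEG = 1.0  # assumed: the 2 deg central area spans +/-1 deg


def gaze_idx(fixation_x_deg):
    """(Rtime - Ltime) / (Rtime + Ltime) for one participant and one video.

    Samples inside the central area are ignored. Returns None when no
    sample falls outside it, because the ratio is then undefined.
    """
    left = sum(1 for x in fixation_x_deg if x < -CENTRAL_HALF_WIDTH_DEG)
    right = sum(1 for x in fixation_x_deg if x > CENTRAL_HALF_WIDTH_DEG)
    if left + right == 0:
        return None
    return (right - left) / (right + left)


def video_gaze_idx(per_participant_values):
    """Average the defined individual indices across the participants of a group."""
    valid = [v for v in per_participant_values if v is not None]
    return sum(valid) / len(valid)
```

Calling `video_gaze_idx` once per group would give the video-specific index for that group.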
Gaze_idx provides us with a video-specific measure of the orienting bias towards one or the other visual hemispace: positive values for longer times spent with gaze on the right side, and negative values for longer times spent with gaze on the left side. To facilitate the interpretation of statistical analyses, Gaze_idx values were further transformed to obtain positive values reflecting orienting towards the side of the main visual event and/or the side of the spatially-congruent audiovisual stimuli. Accordingly, for the Vonly experiment, Gaze_idx values of left conditions (Llat and Lmulti) were multiplied by −1. For the AVstim experiment, the transformation accounted for both the side of the visual event and the no-sound control condition, when participants watched the video without any sound (NoS). Thus, we first subtracted Gaze_idx in the NoS condition from the Sound condition and then, for trials with a visual event on the left side, multiplied the resulting value by −1. The resulting index will be positive when adding a sound increased the time participants spent with gaze on the side of the visual event, and negative when adding the sound decreased the time spent on the side of the visual event. Thus, we expected this measure to be positive for spatially-congruent AV conditions, and negative for incongruent conditions.
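The two transformations can be written compactly as follows. This is a sketch under the sign convention stated above (positive Gaze_idx means more time on the right); the function names and the "L"/"R" side labels are hypothetical.

```python
def transform_vonly(gaze_idx, event_side):
    """Flip the sign for left conditions (Llat, Lmulti) so that positive
    values always mean orienting towards the labelled side."""
    return -gaze_idx if event_side == "L" else gaze_idx


def transform_avstim(gaze_idx_sound, gaze_idx_nosound, visual_side):
    """Subtract the no-sound baseline, then flip the sign for left visual
    events, so that positive values mean that the sound increased the time
    spent on the side of the visual event."""
    difference = gaze_idx_sound - gaze_idx_nosound
    return -difference if visual_side == "L" else difference
```

For instance, a left visual event with Gaze_idx of −0.6 with sound and −0.2 without sound gives a transformed value of about 0.4, that is, the sound pulled gaze further towards the left event.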
The transformed data were used for statistical analyses that tested the effects of conditions and groups. For completeness, we also report the corresponding untransformed data (see Supplementary Figure S1). Please note that such data transformations do not affect the statistics; they only reduce the number of factors and simplify the presentation of the ANOVA results.
It should be noted that this index provides us with a global measure of orienting over the entire stimulus duration. Thus, in a set of additional and non-independent analyses, we considered the position and the timing of the first fixations to gain insights about the temporal dynamics of spatial orienting following the video/stimulus onset. This allowed us to answer the question of whether any left/right bias observed for our global/full-video measures was already present at the level of the initial orienting response, or rather may reflect some later (possibly more 'strategic') processing phase. For these additional analyses, each condition was further sub-categorised according to the position of the first fixation (e.g., Llat trials were subdivided into Llat-Lfix and Llat-Rfix), which precluded us from performing these additional tests for the AVstim experiment, which entailed too few trials when sorted according to the side of first fixation.

Structural imaging and lesion mapping
We sought to confirm the overall lesion patterns associated with hemispatial neglect using anatomical scans. Patients underwent a standard neuroradiological assessment including Magnetic Resonance Imaging (MRI) scans of the brain, according to standard stroke protocols at the Radiology Unit of the Santa Lucia Foundation. Brain scans included an MPRAGE T1-weighted sequence (TR = 2.5 sec, TE = 2.74 msec, voxel size 1 × 1 × 1 mm, matrix resolution 256 × 256 × 176, axial acquisition), as well as fluid-attenuated inversion recovery (FLAIR), T2-weighted, and Proton Density sequences obtained with standard parameters on a 3T Siemens Allegra scanner. When an MRI scan was not possible (due to contraindications for the patient), a Computerized Tomography (CT) scan was acquired instead. Individual lesions were drawn by a trained physician (BS) on either MR (n = 25) or CT (n = 18) scans and double-checked for accuracy by a senior neurologist (MB) experienced in reading brain scans, both blind to the medical history of patients. Hypointense lesions were outlined directly on the MPRAGE T1-weighted (or hypodense lesions on CT) slices using a semi-automated local thresholding contouring software (Jim 5.0, Xinapse System, Leicester, UK, www.xinapse.com) and were then normalized to the standard MNI space by using ANTs 1.9.x (picsl.upenn.edu/software/ants) to obtain an optimized spatial transformation (Avants et al., 2011).
Lesion overlaps are shown in Fig. 1A. In order to confirm the implication of the ventral frontoparietal cortex in neglect (Corbetta & Shulman, 2011; Mort et al., 2003; Vallar & Perani, 1986), we carried out regions-of-interest (ROIs) analyses comparing the frequency of frontoparietal lesions between groups (2-tailed Fisher exact test). We assessed the global level of damage of both ventral and dorsal frontoparietal networks, considering grey matter (GM) and white matter (WM) ROIs. Both GM and WM ROIs were defined based on available brain atlases and, thus, independently of the current MR and CT data. The GM ROIs were created using the AAL atlas (Tzourio-Mazoyer et al., 2002). The ventral frontoparietal network (vFP; 52.6 cm³) was defined as the inferior frontal gyrus (both opercular and triangular parts), plus the inferior parietal lobule (supramarginal and angular gyri). The dorsal frontoparietal network (dFP; 38.3 cm³) was defined as the superior frontal gyrus plus the superior parietal lobule. The WM ROIs were created using the Tractotron atlas (Rojkova et al., 2016). The three branches of the superior longitudinal fasciculus (SLF I, II, III; 16.1, 20.5 and 22.1 cm³, respectively) were extracted and thresholded at a probability of ≥.9. Each ROI was scored as 'lesioned' if at least 5% of its volume was damaged (see Fig. 1B).
Legend: N+H− = neglect patients without hemianopia; N+H+ = neglect patients with hemianopia; N− = right hemisphere brain-damaged patients without neglect; GM = grey matter; WM = white matter; vFP = ventral frontoparietal network; dFP = dorsal frontoparietal network; SLF = superior longitudinal fasciculus; I, II, III = branches of the SLF.
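The ROI scoring rule and the group comparison described in this section can be sketched in plain Python as follows. The 5% threshold and the use of a 2-tailed Fisher exact test come from the text; the function names, the input format, and the particular two-tailed definition (summing the probabilities of all tables that are no more likely than the observed one) are assumptions, and the authors' actual software is not specified in the source.

```python
from math import comb


def roi_is_lesioned(damaged_volume_cm3, roi_volume_cm3, threshold=0.05):
    """Score an ROI as lesioned if at least 5% of its volume is damaged."""
    return damaged_volume_cm3 / roi_volume_cm3 >= threshold


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Rows could be two groups, columns 'lesioned' / 'not lesioned'.
    The p-value sums the hypergeometric probabilities of all tables with
    the same margins that are at most as probable as the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
```

With this rule, an ROI of 52.6 cm³ would count as lesioned from roughly 2.6 cm³ of damage upwards.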

3.2. Visual experiment (Vonly)

Fig. 2 shows the visual exploration patterns as a function of condition and group. Our main analyses quantified these patterns in terms of the ratio of time spent looking toward the left/right side (see Methods). First, we assessed the influence of high-level signals by comparing videos that included a single event lateralised on one side (Lat-trials) with videos including multiple competing events on both sides (Multi-trials). We performed a mixed three-way ANOVA on transformed Gaze_idx data with Group as between factor (N+H−, N+H+, N−, HS), and Competition (Lat, Multi) and visual Side (Left, Right) as within factors. We expected that the level of competition would modulate the orienting deficits in the contralesional hemispace in neglect patients, which would be captured by the three-way interaction. All main effects and interactions were significant (see Table 3). Because the three-way interaction Group × Competition × Side was significant (p < .001), we will focus on this. Fig. 3A shows the time spent looking on the left/right side as a function of Group, Competition and Side. HS participants (green bars) oriented systematically towards the main visual event in Lat-trials and showed some tendency to orient towards the most salient side in Multi-trials (see below). N+H− patients (red bars) showed a pattern similar to HS in Lat-trials, irrespective of side. Accordingly, when an N+H− patient was presented with a video including a single lateralised visual event on the left side, s/he oriented systematically towards the contralesional hemispace. By contrast, when the videos included multiple visual events on both sides, a substantial deficit emerged. This time N+H− patients failed to explore the contralesional hemispace and spent most of the time gazing at the right side (cf. negative values for the Lmulti condition, red bars in Fig. 3A; plus the corresponding Supplementary Figure). The gaze pattern in N− was somewhat in-between that of the N+H− and HS groups (see cyan bars in Fig. 3A). By contrast, N+H+ patients displayed a pattern of spatial orienting that was qualitatively different. While in Multi-conditions they exhibited a deficit that appeared to be along a continuum with the effect observed in N− and N+H− patients (i.e., a rightward bias irrespective of the most salient side; cf. Multi-condition, orange bars in Fig. 3A), N+H+ patients failed to explore the contralesional hemispace even when the video contained a single lateralised event on the left side (see first, leftmost orange bar in Fig. 3A; see also Fig. 2D).
In order to exclude that the behaviour of the N+H+ group was driving the significant three-way interaction when considering the four groups (Group × Competition × Side), we directly compared the N+H− and N− groups. Post-hoc analyses (Duncan test) showed that N+H− significantly differed from N− in Multi-trials (left and right: both p ≤ .01), but, critically, not when watching the Lat-videos (left and right: p = .30 and p = .65, respectively). The latter confirmed that when the videos contained a single meaningful event lateralised on the left side, the spatial orienting of N+H− patients was similar to that of N− controls.
These results suggest that, overall, the spatial exploration of naturalistic stimuli containing a single distinctive event on the left side was intact in N+H−. Nonetheless, our primary measure considered the time spent on each side of space along the entire video duration. It is possible that the N+H− patients displayed a more selective deficit/bias only at a short interval after stimulus presentation (cf. Posner, Walker, Friedrich, & Rafal, 1984; Ptak & Golay, 2006). We tested this hypothesis in additional analyses that considered only the first fixation after stimulus onset. To do this, we computed the percentage of 'first fixations', that is, we re-computed the Gaze_idx index [i.e., (R − L)/(R + L)], but this time using the number of first fixations on each side, rather than the total time on each side. It should be noted that only fixations at eccentricities larger than 2 deg. from the centre of the screen were considered for these analyses (cf. also computation of the overall Gaze_idx). These percentages were then submitted to the same Group × Competition × Side ANOVA described above. The results confirmed the significant three-way interaction (p < .001; see also Supplementary Figure S2A). The post-hoc analyses (Duncan test) showed again that N+H− significantly differed from N− in Multi-trials (left and right: both p ≤ .001), but not in Lat-trials (left and right: p = .15 and p = .25). Hence, N+H− orienting towards left lateralised stimuli was largely preserved even in the very early phases of stimulus processing (i.e., first fixations).
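The first-fixation variant of the index can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, fixation positions are taken as horizontal offsets in degrees from the screen centre (negative = left), and the 2 deg. eccentricity criterion is applied to the horizontal coordinate only, which may differ from the authors' implementation.

```python
def gaze_idx(right, left):
    """Lateralisation index (R - L) / (R + L).

    Positive values indicate a rightward bias, negative values a leftward bias.
    Returns None when there is no valid data on either side.
    """
    total = right + left
    if total == 0:
        return None
    return (right - left) / total


def first_fixation_idx(first_fix_x_deg, min_ecc_deg=2.0):
    """Index computed from counts of first fixations, one position per trial.

    Fixations within `min_ecc_deg` of the screen centre are discarded
    (assumption: horizontal eccentricity only).
    """
    valid = [x for x in first_fix_x_deg if abs(x) > min_ecc_deg]
    n_right = sum(1 for x in valid if x > 0)
    n_left = len(valid) - n_right
    return gaze_idx(n_right, n_left)


# Hypothetical first-fixation positions (deg) for five trials of one subject.
# The 1.0 deg fixation is discarded; 3 leftward and 1 rightward remain.
print(first_fixation_idx([-5.0, -3.0, 1.0, 4.0, -6.0]))  # -0.5
```

The same `gaze_idx` function applies to the time-based measure by passing the time spent on the right and left side instead of fixation counts.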
Nonetheless, when we examined the timing of these fixations (i.e., the time between video onset and first fixation, considering only 'congruent' first fixations, i.e., Left first fixations for Llat and Lmulti trials, and Right first fixations for Rlat and Rmulti trials) we found that the N+H− patients were significantly slower in orienting towards left- compared with right-lateralised events (Llat vs Rlat: 471 vs 365 msec; p < .001), while N− patients did not show any such difference (Llat vs Rlat: 368 vs 372 msec; p = .87); see also Supplementary Figure S2B for the full Group × Competition × Side ANOVA (here N+H+ were excluded because most of these patients never made any first fixation on the left side). Accordingly, while N+H− patients oriented towards the left side when the video contained a single semantically-distinctive event on that side, the processing of these contralesional single events was slowed down by about 100 msec.
Next, we assessed the relationship between stimulus-driven signals (visual salience) and overt spatial orienting. We formally tested the effect of stimulus-driven salience by correlating the amount of salience lateralisation (Sal_idx) with the amount of gaze lateralisation (Gaze_idx; see Fig. 4). The main hypothesis we sought to test here was whether visual salience affected spatial orienting in the left hemispace of N+H− patients on Multi-trials (cf. Introduction). For each subject, a linear regression estimated the relationship between gaze and salience using trial-by-trial variance. The corresponding regression slopes (betas) were submitted to a one-sample t-test for statistical inference at the group level. In N+H− patients, for the critical Multi-trials with saliency lateralised on the left side (Lmulti-condition), we found a significant effect of salience on gaze (T(14) = 2.04, p = .031; see red panel on the left in Fig. 4C). This demonstrates that, despite the presence of a rightward attentional bias (cf. corresponding condition in Fig. 3A), there was some spared processing of bottom-up salience in the contralesional left hemispace in N+H− patients. For completeness, we tested the significance of the relationship between salience and gaze also in all the other conditions and groups (see Table S2, in the Supplementary Material). Consistent with our previous results (Nardo et al., 2016), for the HS group we found a positive correlation between Sal_idx and Gaze_idx in the presence of multiple/competing events (Lmulti- and Rmulti-trials), but not in the presence of single/lateralised events (Llat and Rlat; see Fig. 4A). The N− group showed an analogous pattern of results (Fig. 4B), and so did the N+H− patients in all the conditions. By contrast, in hemianopic patients (N+H+) there was no relationship between salience and gaze in any of the conditions (Fig. 4D, see also Table S2).
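The two-level procedure described above (a per-subject regression of Gaze_idx on Sal_idx, followed by a group-level one-sample t-test on the slopes) can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up data; it returns only the t statistic and degrees of freedom, since converting these to a p-value requires a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def one_sample_t(values, mu=0.0):
    """One-sample t statistic against `mu`, with its degrees of freedom."""
    n = len(values)
    t = (mean(values) - mu) / (stdev(values) / sqrt(n))
    return t, n - 1


# Hypothetical trial-by-trial data for three subjects in one condition:
# each entry is (Sal_idx per trial, Gaze_idx per trial).
subjects = [
    ([-0.8, -0.5, -0.2], [-0.30, -0.20, -0.05]),
    ([-0.9, -0.4, -0.1], [-0.50, -0.35, -0.30]),
    ([-0.7, -0.6, -0.3], [-0.10, -0.12, 0.02]),
]

# First level: one slope (beta) per subject.
betas = [ols_slope(sal, gaze) for sal, gaze in subjects]

# Second level: test whether the slopes differ from zero across subjects.
t_stat, df = one_sample_t(betas)
print(betas, t_stat, df)
```

Treating each subject's slope as a single observation at the group level is what lets the test generalise across subjects rather than across trials.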

Audiovisual experiment (AVstim)
In the second experiment, we investigated stimulus-driven effects by pairing the onset of the main visual event with a sound, either on the same or the opposite side of space (spatially-congruent vs spatially-incongruent audiovisual conditions). We hypothesised that residual stimulus-driven processing in the left hemispace of neglect patients would lead to longer looking times on the left side when a left visual event was coupled with a spatially-congruent left sound. First, we performed a mixed three-way ANOVA on transformed Gaze_idx data (see Methods) with Group as between factor (N+H−, N+H+, N−, HS), and Congruency (Congruent, Incongruent) and visual Side (Left, Right) as within factors (see Fig. 3B). Here, we predicted primarily a main effect of audiovisual spatial congruency. Next, we targeted more directly the influence of congruent audiovisual stimulation in the left hemispace in N+H− patients, using a one-tailed t-test assessing the significance of the crossmodal effect ('Sound minus No-Sound' > 0) in the left hemispace in the N+H− group. A positive effect would confirm that stimulus-driven audiovisual interactions can boost spatial orienting towards the contralesional hemispace in these patients.
The mixed ANOVA showed a main effect of Congruency and a significant interaction Congruency × Side (see Table 3). Overall, all participants (irrespective of group) spent more time orienting towards the side of the visual event when the sound was on the same side than when the visual event and the sound were on opposite sides. This effect of audiovisual spatial congruency was larger when the visual event was on the left as compared to the right side (see below). Fig. 3B shows the transformed gaze data plotted separately according to Group and Condition (see Supplementary Figure S1B for the corresponding untransformed data). In these plots, positive values mean that adding a sound to the video (cf. subtraction of sound vs no-sound conditions) led participants to spend longer times looking towards the side of the visual event. By contrast, negative values indicate that adding the sound reduced the time spent on the side of the main visual event. The plot shows that in the HS group the effect of audiovisual spatial congruence was mainly driven by the incongruent condition, that is, presenting a sound on the opposite side of the visual event led to a reduction of orienting towards the visual event (cf. green bars with negative values for incongruent conditions in Fig. 3B). N+H− patients showed an analogous effect of audiovisual incongruence, but they also showed positive values for the Lcon condition. Accordingly, adding a sound to the left side, while watching a video with a visual event on the left, increased the time patients spent looking towards the contralesional hemispace. The orienting pattern in patients with hemianopia (N+H+) was similar to that of N+H−, showing an even larger effect of audiovisual congruence on the left side (see first orange bar in Fig. 3B). By contrast, the pattern in N− was more similar to HS, primarily showing an effect of audiovisual incongruence (cf. cyan bars in Fig. 3B).
We sought to further confirm our main hypothesis about partially spared stimulus-driven audiovisual spatial interactions in the left hemispace of N+H− patients. To do this, we tested whether the effect of audiovisual congruency in the left hemispace of N+H− patients was significant. A one-sample t-test on the corresponding Lcon-condition confirmed that adding a left sound to a left visual event significantly increased the time N+H− patients spent looking towards the contralesional hemispace (T(12) = 4.26; p = .002).
In sum, the AVstim experiment showed that audiovisual spatial congruence affected orienting behaviour in all groups. This included a reduction of the time spent on the side of the visual event when an auditory signal was presented on the opposite side (audiovisual incongruence). Most importantly, in N+H− patients there was also a positive effect of audiovisual congruence in the left hemispace. Neglect patients (irrespective of the presence of hemianopia) spent longer time looking towards the contralesional side when a left sound was added to a left visual event.

Discussion
Spatial orienting deficits in neglect patients are thought to arise from a complex combination of factors related to both endogenous control and stimulus-related features. Standard paradigms making use of simple stimuli (e.g., geometrical shapes in search, cueing tasks) allow disentangling these influences, but fail to capture how external stimulus-related and endogenous signals jointly contribute to spatial processing in naturalistic conditions. The latter are characterized by a high number of objects and/or events that compete for processing resources. Here, we sought to reproduce such complex conditions using short videos portraying naturalistic situations. We characterised each video in terms of high-level features (presence of semantically meaningful visual events) and low-level sensory signals (visual salience and audiovisual spatial alignment). Our participants were asked to freely view the stimuli without any specific task, thus minimising any endogenous influence of strategic/task-based control. Our main finding was that in neglect patients without hemianopia (N+H−) spatial orienting deficits arose primarily as a result of the competition between distinctive and semantically meaningful visual events, while the competition between sensory-related factors seemed to play a minor role in determining the attentional imbalance between the ipsi- and contralesional side of space in these patients. Our finding that internal (here, semantic-related) signals play a central role in controlling spatial orienting in naturalistic conditions is consistent with behavioural studies in healthy participants (Einhäuser, Spain, & Perona, 2008a; Nuthmann & Henderson, 2010; Stoll, Thrun, Nuthmann, & Einhäuser, 2015). In our first experiment (Vonly), we show that neglect patients oriented towards single distinctive events in the left hemispace, even though the onset of the videos implied stimulation of both sides. This indicates that in naturalistic conditions of visual stimulation the detection of single distinctive events can be relatively spared both on the left and right side, and that mere visual stimulation on the right side plays a minor role in reducing the detection of visual events in the contralesional hemispace in neglect patients (cf. Karnath, 2015). Additional analyses that specifically considered the first fixation after stimulus onset indicated that the likelihood of leftward fixations for Llat-trials (which included a single distinctive event on the left side) was the same in N+H− and N− patients. This confirms that events on the left side were able to grab the patients' attention/gaze even when embedded within a complex input comprising visual stimulation of both sides. The additional timing analyses revealed that these first gaze-shifts towards the contralesional side were approx. 100 msec slower in N+H− patients as compared with N− controls. The latter indicates some processing deficit for stimuli on the left side.

Figure legend (see Supplementary Figure S1 for raw, untransformed data of both experiments): LAT = single/lateralised events; MULTI = multiple/competing events; CON = spatially-congruent audiovisual stimuli; INC = spatially-incongruent audiovisual stimuli; NoS = stimuli without sound (visual-only controls); S = stimuli with sound (i.e., audiovisual); HS = healthy subjects; N− = right-hemisphere-damaged patients without neglect; N+H− = neglect patients without hemianopia; N+H+ = neglect patients with hemianopia.
Our current approach follows a well-established methodology of using gaze orienting as an indirect index of the allocation of visuospatial attention using naturalistic stimuli (e.g., Land, Mennie, & Rusted, 1999; Foulsham, Walker, & Kingstone, 2011; Hwang et al., 2011; Tatler, Hayhoe, Land, & Ballard, 2011; Stoll et al., 2015; see also Henderson, 2017), but it does not inform us about the amount of processing of the objects at the fixated locations. This is particularly true for brain-lesioned patients, who may overtly orient towards the left hemifield but still exhibit limited processing on that side (Doricchi & Incoccia, 1998; Doricchi & Galati, 2000; see also Driver & Vuilleumier, 2001; Riddoch et al., 2003; discussing residual perceptual and semantic processing on the left side in neglect).

Fig. 4. Relationship between spatial orienting (Gaze_idx) and stimulus salience (Sal_idx) in the Vonly experiment, plotted separately for experimental condition and group. Gaze_idx correlated with Sal_idx in trials containing multiple/competing events on both sides (Multi-trials, plots in red) for all groups, except N+H+. By contrast, in the presence of single/lateralised stimuli (Lat-trials, plots in green), visual salience did not contribute and the side of the visual event fully determined the time spent on each side (cf. also Fig. 3A). For both Gaze_idx and Sal_idx positive values indicate a rightward bias, while negative values indicate a leftward bias. Legend: HS = healthy subjects; N− = right-hemisphere-damaged patients without neglect; N+H− = neglect patients without hemianopia; N+H+ = neglect patients with hemianopia.
On the other hand, patients with neglect have relatively spared ocular pursuit of motion and slow phases of optokinetic nystagmus towards the contralesional direction (Smith & Cogan, 1959; Baloh, Yee, & Honrubia, 1980; Kömpf, 1986; Incoccia, Doricchi, Galati, & Pizzamiglio, 1995; Doricchi, Siegler, Iaria, & Berthoz, 2002; Doricchi, Iaria, Silvetti, Figliozzi, & Siegler, 2007; see also Lynch & McLaren, 1983, for related studies in non-human primates). Here, spared ocular pursuit of our dynamic stimuli may have contributed to the residual capability of exploration of the left visual space in N+H− (see also below for additional points concerning the role of motion signals in the current study). Future studies may seek to better characterise to what extent distinctive events on the left side are processed, including dissociating attention and eye-movements (e.g., Làdavas et al., 1997; Walker et al., 1996) and testing for the recognition of these events (Jelsone-Swain, Smith, & Baylis, 2012). Nonetheless, it should be noted that introducing any explicit reporting procedure, such as explicit object search or scene memory, implies imposing a goal-directed task, which in turn may change the balance between the different signals contributing to spatial orienting (cf. reduction of the effect of stimulus-related salience in the presence of a goal-directed task; e.g., Einhäuser, Rutishauser, & Koch, 2008b). While the N+H− patients oriented systematically towards single events on the left side, we found a marked ipsilesional attentional bias as soon as multiple distinctive and semantically-relevant events were presented on the two sides of space (see also Riddoch & Humphreys, 1983; Bartolomeo & Chokron, 2001; Corben et al., 2001; Coulthard et al., 2008).
Healthy participants spent approximately the same amount of time looking towards the left and right side (but see below for the effect of visual salience), while N+H− patients systematically oriented towards the right side (see also Fig. 2). This condition, with videos containing events on both sides of space, may be related to extinction tests that are routinely used to examine the impact of competition on spatial orienting (de Haan, Karnath, & Driver, 2012). Only two out of 15 N+H− patients showed signs of extinction in the clinical evaluation, which here was assessed using a sensitive computer-based test (two black squares of 1 × 1 deg. presented for 200 msec at 6 deg. of eccentricity with respect to a central fixation point). Given the short duration of the stimuli in the clinical test, as compared to the much longer duration of the videos in the Vonly experiment, it may be expected that any deficit in terms of low-level competitive interactions between the two sides would lead to higher rates of extinction in the clinical test as compared with the Vonly experiment (Bonato, 2012). Instead, the current pattern (no extinction with brief simple stimuli vs marked rightward bias with long, semantically meaningful events) points to a central role of high-level signals and, more specifically, to the competition between events entailing high-level semantic information (e.g., see also Walker et al., 1996).

In contrast to this, in neglect patients without hemianopia the processing of low-level signals (salience) in the contralesional side appeared to be relatively spared. Both in N+H− and HS the effect of salience was found selectively when the videos included multiple distinctive events (Multi-trials). The finding that single distinctive events (Lat-trials) abolished any effect of visual salience fits with previous studies showing that guidance by internal, top-down signals can override the influence of stimulus-related salience (e.g., Einhäuser et al., 2008a,b). Nonetheless, one study that directly addressed the interaction between visual salience and top-down control using dynamic naturalistic stimuli in neglect patients found that low-level visual features contributed to spatial orienting irrespective of current top-down signalling (Machner et al., 2012). The authors interpreted these results as suggesting that the lesions interfered with endogenous control, thus allowing stimulus-driven control to guide spatial orienting even in the presence of top-down signals.
A critical difference between Machner et al.'s study and our Vonly experiment is that we manipulated the contribution of endogenous signalling by changing the level of competition between semantically distinctive events (Lat- vs Multi-trials), rather than imposing a specific goal-directed task (e.g., search for a specific target object as in Machner et al., 2012). The mechanisms regulating these two types of control are likely to be substantially different. Goal-directed control entails holding a specific target-template in memory, comparing the sensory input with this internal template, detecting the targets, rejecting distractors and, more generally, guiding the allocation of processing resources in a strategic manner (e.g., minimising the re-exploration of objects that were already fixated). By contrast, in our study participants did not receive any explicit instruction, hence the significance of visual events was determined by their distinctiveness within the scene rather than task-relevance. Accordingly, here endogenous control did not operate on the basis of any goal-directed target-related operations. Instead, the free-viewing condition most likely entailed a simpler form of event detection. The difference between the present findings and the results of Machner et al. (2012) emphasises the importance of addressing the role of stimulus-driven and endogenous factors from multiple perspectives, and indicates that different constraints govern the relative contribution of these two types of signals as a function of the specific context (e.g., goal-directed vs knowledge-based orienting).
Further evidence of spared stimulus-related processing in N+H− comes from the second experiment, based on audiovisual stimuli (AVstim). Crossmodal interactions are known to affect orienting behaviour in neglect patients, possibly via both general arousal (Chica et al., 2012; Robertson et al., 1998) and the engagement of multisensory spatial representations (Frassinetti et al., 2002; Pavani, Làdavas, & Driver, 2003; Van Vleet & Robertson, 2006). Here, we lacked any clinical measure of auditory spatial processing, but our results indicate that patients were able to encode the spatial position of task-irrelevant sounds (left/right side) and, most importantly, to combine that with information about the location of the concurrent visual event. In line with Frassinetti et al. (2002), who used simple and stereotyped stimuli, we found a reduced orienting deficit for left visual events specifically when these were coupled with a sound on the left side. These results demonstrate that the mechanisms that enable combining information about (the position of) the distinctive visual event and (the position of) the task-irrelevant auditory stimuli were still effective in N+H− patients (see also Golay, Hauert, Greber, Schnider, & Ptak, 2005; Ishihara et al., 2013; for related effects using simple and stereotyped stimuli).
We interpret this overall pattern of results in the framework of a possible distinction between dorsal and ventral frontoparietal networks for the processing of endogenous versus stimulus-driven signals. We put forward that in Lat-trials the detection of a single distinctive event generated a processing priority bias at the location of the event, which overrode other signals based on sensory salience. By contrast, Multi-trials were associated with a series of such event detections that overall did not generate any spatial bias favouring one or the other side. Under these circumstances (i.e., no spatial priority based on event-detection), low-level sensory signals start contributing to spatial orienting. Following existing proposals of attention control, the encoding of processing priorities may take place in the dorsal network (cf. Priority Maps, see Gottlieb, 2007; Ptak, 2012), while the detection of distinctive events could be initially implemented in the ventral network (Kincade, Abrams, Astafiev, Shulman, & Corbetta, 2005; Shulman et al., 2003). These detection signals would then contribute to updating Priority Maps in the dorsal network via interregional communication (cf. Astafiev, Shulman, & Corbetta, 2006; Corbetta & Shulman, 2002; Doricchi, Macci, Silvetti, & Macaluso, 2009; Macaluso & Doricchi, 2013; Shapiro, Hillstrom, & Husain, 2002).
The role of the ventral network for the detection of distinctive events in naturalistic conditions is in line with several fMRI studies that we previously carried out in healthy participants (e.g., Nardo, Santangelo, & Macaluso, 2011; Nardo, Console, Reverberi, & Macaluso, 2016). In particular, Nardo et al. (2016) made use of the same videos and free-viewing conditions as in the present Vonly experiment. Direct comparison between Multi- vs Lat-trials revealed activation of the right temporoparietal junction, plus more anterior regions including the right middle/inferior frontal cortex, that is, the same regions damaged in N+H− patients here. Given the complexity of our naturalistic stimuli, the heterogeneity of trials (each trial included different objects/events) and of the ensuing oculomotor behaviour, it is difficult to pinpoint the exact processes underlying the previous fMRI results and the neglect deficit here. However, it should be noted that both studies associated the ventral network specifically with the processing of videos containing multiple distinctive events. This suggests that this system may not merely detect distinctive events (cf. also preserved orienting towards single events on the left side, Llat-trials), but rather may perform more complex operations needed to handle the co-occurrence of multiple such events. This may entail establishing an order of processing priorities that would then allow sequential orienting towards different distinctive events.
The second main result of the present study was that stimulus-related signals affected orienting behaviour in N+H−, in whom we found significant damage of the ventral frontoparietal cortex. This seems inconsistent with a mechanism where stimulus-driven signals are first detected in the ventral network (lesioned in N+H−; cf. also Corbetta & Shulman, 2002) and subsequently affect Priority Maps in dorsal regions via interregional connectivity. Nonetheless, we cannot exclude that the connectivity between spared regions in the ventral network and dorsal areas played a role here (e.g., see He et al., 2007; and see also limitations of the anatomical analyses here). An additional analysis with the Fisher exact test comparing the frequency of lesions between groups in the middle frontal gyrus (which may act as a connection hub between dorsal and ventral attention networks, see He et al., 2007) showed significant differences between N− and neglect patients (N− vs N+H−: p = .004; and N− vs N+H+: p = .006), whereas no significant difference was found between the two groups of neglect patients (N+H− vs N+H+: p = 1.000). An alternative interpretation would be that salience modulated Priority Maps in the dorsal network via more direct occipitoparietal pathways (see Dragone, Lasaponara, Silvetti, Macaluso, & Doricchi, 2015; Geng & Vossel, 2013; Ptak & Schnider, 2011; Silvetti et al., 2016). Irrespective of the specific paths involved, the current finding that sensory salience affected orienting behaviour in patients with relatively spared dorsal regions fits with the proposal of stimulus-driven control in dorsal regions during orienting in naturalistic conditions (Nardo et al., 2011;. The results of the second experiment (AVstim) further support this view. We found significant effects of audiovisual spatial congruence on spatial orienting in N+H− patients with relatively spared dorsal frontoparietal cortex.
These results are in agreement with our previous fMRI data in healthy participants that highlighted crossmodal spatial interactions in the dorsal parietal cortex during free-viewing of the same videos employed in the current AVstim experiment (Nardo et al., 2014).
While we were able to identify some differential contribution of endogenous and stimulus-related factors during spatial orienting in naturalistic conditions, the current approach includes several limitations. First, we made use of simple measures of overt orienting, that is, the ratio between the time spent looking towards one versus the other side of space, plus some additional tests regarding the side and timing of the first fixations. These do not provide us with any detailed information about oculomotor dynamics (e.g., saccade number/amplitude/direction). Our choice was primarily motivated by the complexity of the stimuli. Our videos included dynamic visual stimuli that differed for each single trial. A more detailed quantification of the oculomotor behaviour would have required also an analogous analysis of the stimuli, which in turn would need a large number of subjective decisions (e.g., identifying the position of single objects in each frame of each video; but see Machner et al., 2012). Related to this point, here the identification of the distinctive visual events and the categorization of the videos in Lat- vs Multi-conditions for the Vonly experiment were based on a subjective evaluation of videos. It is unlikely that the subjective categorization led to some systematic bias across the different conditions (i.e., the vast majority of the events were very noticeable), but we cannot exclude that the distinctive events comprised also some stimulus-related features. In particular, the majority of the distinctive visual events entailed moving people or objects in the foreground. At the same time, motion is one of the 'channels' used for computing the Saliency Map (cf. Itti et al., 1998; Itti & Koch, 2001). Thus, local motion contributed to both the distinctive visual events and low-level sensory salience.
While dissociating the relative contribution of motion signals in defining the distinctive events versus low-level salience appears challenging here, we should point out that our results highlighted a linear relationship between saliency and gaze within each side of space. This is more consistent with an influence of motion via low-level signalling rather than high-level semantics, because it seems unlikely that a progressive increase in local motion would translate into an analogous change in semantic distinctiveness. Moreover, in both experiments we introduced specific experimental manipulations aiming at minimising possible confounding effects associated with the naturalistic stimuli. In the Vonly experiment, we sought to account for possible left/right differences both in low-level signals (salience and/or other visual features) and high-level information (distinctiveness of the visual events) by presenting each video either in its original version or in a left/right flipped version. In the AVstim experiment, we included no-sound baseline trials (NoS) to ensure that any difference between congruent and incongruent conditions (note that these comprised distinct videos) could not be simply attributed to visual differences.
Further, we need to acknowledge the small sample size of the N+H+ group. It should be noted that in the present study we were not interested in drawing conclusions about the role of hemianopia. Rather, we focussed on the N+H− group and ensured that our conclusions were not driven by any visual deficits. However, it should also be noted that in the Vonly experiment N+H+ patients showed a qualitatively different pattern of spatial orienting, which may be further investigated in a larger sample that includes an additional group of patients with hemianopia but without neglect ('N−H+').
Finally, the anatomical analyses concerning the lesions associated with neglect used a low-resolution ROI approach that does not consider specific frontoparietal sub-regions. In addition, our main analyses included CT scans, which are less accurate than MR scans. We examined whether we could improve our anatomical study by using voxel-based lesion-symptom mapping (VLSM, as implemented in the NPM software; http://people.cas.sc.edu/rorden/mricron/stats.html) on the subset of patients with MR scans available. The power analysis showed that our data were underpowered (cf. Rorden, Karnath, & Bonilha, 2007; Rudrauf et al., 2008) and therefore the results of the VLSM could not be considered reliable. However, we need to point out that in the context of the present study the anatomical data were intended as support for the behavioural findings, primarily seeking to confirm the association between neglect and structural damage of the ventral frontoparietal network (Corbetta & Shulman, 2011; Karnath & Rorden, 2012; Mort et al., 2003; Vallar & Perani, 1986).

Conclusions
In the present study, we have shown that the competition between semantically distinctive, co-occurring events is a major determinant of spatial orienting deficits in hemispatial neglect. By contrast, spatial orienting towards isolated visual events was spared on both sides, suggesting relative sparing of detection mechanisms despite the lesion of the ventral frontoparietal cortex. Moreover, we found that both low-level visual salience and spatial alignment between auditory and visual stimuli can increase the time patients spend exploring the contralesional hemispace. This suggests that stimulus-related signals may affect orienting behaviour via relatively intact (multisensory) representations of processing priorities in dorsal frontoparietal regions. These results provide a novel perspective on the influence of stimulus-driven and endogenous signalling in neglect (Bartolomeo & Chokron, 2002; Smania et al., 1998), highlighting the contribution of semantic distinctiveness as opposed to task-relevance (cf. Machner et al., 2012) during spatial orienting in naturalistic conditions. Notwithstanding several limitations, the current approach contributes to bridging the gap between observations in well-controlled (but artificial) laboratory conditions and the problems that neglect patients may experience in their everyday life. The finding that competition between high-level, meaningful events plays a central role in the impairment associated with neglect, together with the evidence that low-level sensory features play a minor role in the spatial imbalance in these patients, might open new perspectives for treatment. While traditional rehabilitation protocols rely heavily on goal-driven, voluntary strategies (e.g., external instructions, scanning training, etc.), our current results advocate the development of novel approaches based on passive viewing of (multisensory) naturalistic stimuli with specific spatial configurations of distinctive and/or salient events.