Attentional Networks and Biological Motion

Our ability to see meaningful actions when presented with point-light traces of human movement is commonly referred to as the perception of biological motion. While traditional explanations have emphasized the spontaneous and automatic nature of this ability, more recent findings suggest that attention may play a larger role than is typically assumed. In two studies we show that the speed and accuracy of responding to point-light stimuli are highly correlated with the ability to control selective attention. In our first experiment we measured thresholds for determining the walking direction of a masked point-light figure, and performance on a range of attention-related tasks, in the same set of observers. Mask-density thresholds for the direction discrimination task varied considerably from observer to observer, and this variation was highly correlated with performance on both Stroop and flanker interference tasks. Other components of attention, such as orienting, alerting and visual search efficiency, showed no such relationship. In a second experiment, we examined the relationship between the ability to determine the orientation of unmasked point-light actions and Stroop interference, again finding a strong correlation. Our results are consistent with previous research suggesting that biological motion processing may require attention, and specifically implicate networks of attention related to executive control and selection.

If attention is involved in the processing or interpretation of point-light stimuli, then we might expect to find a measurable relationship between the efficiency with which an individual controls attention, and their ability to process biological motion. Finding such a correlation would not, of course, tell us about the role attention might be playing. However, if the observed relationship is restricted to one or more of the previously mentioned networks of attention, then this may help to inform future studies aimed at more directly testing the nature of that role.
In the two experiments reported here, we found that performance varied considerably from observer to observer when performing biological motion tasks. This was true both when performance was measured in terms of resistance to visual clutter (Exp 1) and in terms of simple responses to unmasked upright and inverted actions (Exp 2). Importantly, this variation was highly correlated with one specific aspect of visual attention, namely the ability to selectively attend. Observers who were better at selectively attending were faster and more accurate at processing point-light displays.

EXPERIMENT 1
In Experiment 1 we measured individual thresholds for accurately determining the direction in which a masked point-light walker was facing. Thresholds were established by adaptively increasing and decreasing the number of masking elements in the display, so that a stable level of 71% correct was achieved. In the same individuals, we then measured the efficiency with which a range of attention-specific tasks were performed. Finally, we looked for correlations between biological motion processing and these networks of attention.

Method
Participants. Twelve members of the Tübingen community were paid for participation in this study. All observers reported normal or corrected to normal vision and were naive with regard to the purpose of the study.
Apparatus. Stimuli were presented on a 21 inch (37 cm x 28 cm) monitor with a refresh rate of 75 Hz and a resolution of 1152 x 870 pixels. Observers sat approximately 60 cm from the monitor in a dimly lit room. Responses were collected via a standard keyboard.
Biological Motion Task. The task for observers was to report the left/right orientation of a walking figure that was presented in sagittal view at a random location within a central 9.3° x 9.3° visual angle viewing area. Each walking figure consisted of 11 dots (head, near shoulder, both elbows, both wrists, near hip, both knees, and both ankles) drawn in black on a gray background, each dot subtending 0.17°. The figures subtended 3° in height (head to ankle) and 1° in width (at the most extended point of the step cycle) and were animated using James Cutting's synthetic walker algorithm (Cutting, 1978). A complete stride cycle was achieved in 40 animation frames with a frame duration of 40 ms, simulating a natural walking speed of 38 strides per minute (Inman, Ralston & Todd, 1981). The walking figure did not translate, but moved in place, as if on a treadmill. The starting position within the step cycle was randomly chosen on each trial.
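The stimulus geometry and timing described above can be checked with a short calculation. The sketch below is illustrative only and is not the presentation code used in the study; it assumes the standard visual-angle formula and takes the viewing distance, screen width, resolution and refresh rate from the Apparatus description. All function and variable names are hypothetical.

```python
import math


def visual_angle_to_cm(angle_deg, distance_cm):
    """On-screen size (cm) that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def strides_per_minute(frames_per_stride, frame_ms):
    """Stride rate implied by the animation timing."""
    stride_seconds = frames_per_stride * frame_ms / 1000.0
    return 60.0 / stride_seconds


# Values reported for Experiment 1.
VIEWING_DISTANCE_CM = 60.0
SCREEN_WIDTH_CM = 37.0
SCREEN_WIDTH_PX = 1152
REFRESH_HZ = 75
FRAME_MS = 40

px_per_cm = SCREEN_WIDTH_PX / SCREEN_WIDTH_CM
figure_height_cm = visual_angle_to_cm(3.0, VIEWING_DISTANCE_CM)
dot_size_px = visual_angle_to_cm(0.17, VIEWING_DISTANCE_CM) * px_per_cm
stride_rate = strides_per_minute(40, FRAME_MS)
refreshes_per_frame = FRAME_MS * REFRESH_HZ / 1000.0

print(f"figure height: {figure_height_cm:.2f} cm")
print(f"dot size: {dot_size_px:.1f} px")
print(f"stride rate: {stride_rate:.1f} strides/min")
print(f"monitor refreshes per animation frame: {refreshes_per_frame:.0f}")
```

Under these assumptions, a 40-frame cycle at 40 ms per frame lasts 1.6 s, which corresponds to 37.5 strides per minute, in line with the rounded figure of 38 given above.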
Masking stimuli were created by randomly positioning the individual dots of a walking figure within the central viewing area. Such "scrambled walker" masks are very effective as they mimic the local behaviour of the target, without conveying global structure (e.g., Bertenthal & Pinto, 1994; Cutting, Moore & Morrison, 1988; Thornton et al., 1998). We note that as there was no global translation of either target or mask elements, and as limb pairs have pendulum-like, periodic motions, 180° out of phase, there is very little local information that would favour a directional right or left response. For example, an isolated elbow or knee dot moving from left to right could equally have originated from a left- or right-facing walker. The only local cues to direction come from trajectory asymmetries, for example at the end points of ankle and wrist dot movements. To minimize the effects of these cues, 50% of all mask elements were generated from left-facing walkers, and 50% from right-facing walkers. In general then, direction is almost exclusively conveyed by the global structure of the target figure, not by the local elements, either of the target or the mask.
During an initial training phase, the walker appeared unmasked for 100 trials. To assess individual mask thresholds, two interleaved staircases were presented in which the number of scrambled walker dots was either increased or decreased from starting levels of 110 (5 left and 5 right walkers) and 550 (25 left and 25 right walkers) dots, respectively. For either staircase, two consecutive correct responses resulted in the addition of 22 dots (1 left and 1 right scrambled walker) to the mask. A single incorrect response resulted in the removal of 22 dots. A reversal occurred whenever the direction of this mask alteration changed, from addition to subtraction or vice versa. A staircase terminated after 30 such reversals. Thresholds were estimated by averaging across the last eight reversal points and collapsing across the two interleaved staircases. This standard 2 up/1 down procedure provides an estimate of the mask level at 71% correct. The entire task took approximately 30 minutes.
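The staircase rule above can be sketched as a small simulation. This is an illustration under stated assumptions, not the experimental code: the observer model (`simulated_p_correct`, with its `midpoint` and `spread` parameters) is entirely hypothetical, and the two staircases are run one after the other rather than interleaved, which does not affect the rule itself. The step size, stopping rule and averaging follow the description in the text.

```python
import math
import random

DOTS_PER_STEP = 22       # one left-facing plus one right-facing scrambled walker
REVERSALS_TO_STOP = 30   # a staircase ends after this many reversals
REVERSALS_AVERAGED = 8   # threshold = mean of the last eight reversal points


def simulated_p_correct(mask_dots, midpoint=220.0, spread=60.0):
    """Hypothetical observer: accuracy falls from about 1.0 towards 0.5 as the mask grows."""
    return 0.5 + 0.5 / (1.0 + math.exp((mask_dots - midpoint) / spread))


def run_staircase(start_dots, rng, max_trials=5000):
    """Return the mean mask level over the final reversals of one staircase."""
    dots = start_dots
    streak = 0            # consecutive correct responses since the last step
    last_direction = 0    # +1 after adding dots, -1 after removing dots
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= REVERSALS_TO_STOP:
            break
        if rng.random() < simulated_p_correct(dots):
            streak += 1
            if streak < 2:
                continue    # wait for a second consecutive correct response
            direction = 1   # two correct in a row: add dots (harder)
        else:
            direction = -1  # a single error: remove dots (easier)
        streak = 0
        if last_direction != 0 and direction != last_direction:
            reversals.append(dots)
        last_direction = direction
        dots = max(0, dots + direction * DOTS_PER_STEP)
    tail = reversals[-REVERSALS_AVERAGED:]
    return sum(tail) / len(tail)


rng = random.Random(1)  # fixed seed so the simulation is repeatable
threshold = (run_staircase(110, rng) + run_staircase(550, rng)) / 2.0

# The rule settles where two correct responses in a row are as likely as not:
# p * p = 0.5, so p = sqrt(0.5), about 0.707.
target_accuracy = math.sqrt(0.5)
print(f"estimated threshold: {threshold:.0f} dots")
print(f"target accuracy: {target_accuracy:.3f}")
```

The last lines show where the 71% figure comes from: the mask level stops drifting when the probability of two consecutive correct responses equals 0.5.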
Visual Search Task. Visual search has proven to be a very useful technique for exploring human perception, in particular the relationship between vision and attention (see Wolfe, 1998 for a review). Here we employed a relatively inefficient search for the absence of a feature (the letter "O" compared to the letter "Q") that is thought to involve effortful, serial deployment of attention. Such a task not only provides an assessment of the ability to shift and selectively deploy attention, but also closely parallels the search component of our main walker task. Observers performed 320 trials in which the presence/absence of the target (the letter O, present 50% of the time) and the number of distractors (6, 8, 10, 12 instances of the letter Q) were crossed and randomly intermixed. Target and distractor letters were drawn in black on a middle-gray background, subtended 0.8° visual angle and were spatially distributed within a 16° x 16° viewing square. The main dependent measure of interest was search efficiency, indexed by the increase in target present response time as a function of set size.
Stroop Task. The Stroop (1935) colour naming task provides a simple but highly effective measure of selective attention. Observers are asked to read aloud the ink colour of each item in a list of words or neutral strings of letters (e.g., XXXXX). Even though observers are told to ignore the meaning of the items, when they consist of incongruent colour terms (e.g., the word red presented in blue ink) reaction times are dramatically slowed. The magnitude of this slowing provides an index of how well observers can selectively attend. Here we presented four lists of twelve items and manually recorded the total time taken to read down each list. The words were presented at the centre of the computer screen in a 12° x 3° column. The first two lists consisted of neutral words (e.g., Cat, Star, Poster, Watch) which could be drawn in red, blue, green or yellow. These lists were used as a training phase. A list of neutral items and a list of incongruent colour terms were then presented with the order counterbalanced across observers. The dependent measure was the reaction time difference between the neutral and incongruent lists. Reaction time was recorded via a manually-operated software timing routine under the control of the experimenter.

Attentional Network Test. The Attentional Network Test (ANT) was developed by Michael Posner and colleagues (Fan, McCandliss, Sommer, Raz, & Posner, 2002) to provide a fast and efficient attentional assessment technique appropriate for use with children, animals, patient populations and in the context of brain imaging. The name derives from the observation, discussed above, that components of attention, such as alerting, orienting (e.g., selection of information) and executive control (e.g., conflict resolution), appear to be subserved by networks of different brain areas (Posner & Petersen, 1990).
To provide an assessment of these three functional networks in a single, short (approx. 30 min) task, the ANT combines a Posner cueing paradigm (Posner, 1980) with an Eriksen flanker task (Eriksen & Eriksen, 1974). Observers are asked to make a speeded response to the left/right orientation of a central arrow that can appear above or below fixation. In some trials the target is preceded by a spatially uninformative (alerting) or informative (orienting) cue and can appear alone or in the presence of congruent or incongruent flanking arrows (congruency). The task is run in a single session, with trial types fully intermixed. Appropriate reaction time subtractions are used to derive separate assessments of alerting, orienting and executive control. These subtractions are described in the results section.
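The three subtractions can be sketched as follows. This is an illustration rather than the analysis code used in the study. It applies the definitions given in the Results to the rounded condition means reported there, and it assumes that the no cue condition is the slower of the two alerting conditions (641 ms versus 602 ms), as a positive alerting effect requires. Because the inputs are rounded group means, the outputs only approximate the reported effects. The dictionary keys are hypothetical names.

```python
def ant_effects(mean_rt):
    """Derive the three network scores (ms) from condition mean reaction times."""
    return {
        # benefit of a temporal warning without spatial information
        "alerting": mean_rt["no_cue"] - mean_rt["double_cue"],
        # benefit of knowing where the target will appear
        "orienting": mean_rt["center_cue"] - mean_rt["spatial_cue"],
        # cost of incongruent flankers (executive control)
        "conflict": mean_rt["incongruent"] - mean_rt["congruent"],
    }


# Rounded group means (ms) from the Results of Experiment 1.
# Assumption: no cue = 641 ms and double cue = 602 ms.
group_means = {
    "no_cue": 641,
    "double_cue": 602,
    "center_cue": 612,
    "spatial_cue": 570,
    "congruent": 591,
    "incongruent": 683,
}

effects = ant_effects(group_means)
for name, value in effects.items():
    print(f"{name}: {value} ms")
```

In practice the same subtractions would be applied to each observer's condition means, giving one alerting, orienting and conflict score per observer.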
Procedure. Each task was run as a separate mini-experiment, with written instructions, verbal explanation and relevant training preceding each period of data collection. Short breaks were provided between each task. The Biological Motion task was always run first, with the order of the remaining tasks counterbalanced across observers. The entire data collection period was approximately two hours.

Results
Biological Motion Task. Table 1 contains a summary of each observer's performance, as well as overall means and standard deviations, on this and all other tasks from Experiment 1. On average, the biological motion staircases terminated after 226 trials, which took approximately 20 minutes. The average threshold was 242 mask dots. Of particular interest was the spread of this distribution, which ranged from 114 to 338 dots (see Table 1), suggesting considerable individual difference in the level of masking that led to 71% correct performance.
Visual Search Task. As expected, search was slow and serial, with reaction time increasing linearly for both target present and target absent trials. To obtain individual measures of search efficiency, linear regression lines were fitted to the search data of each observer. These estimates indicated average target present slopes of 44 ms/item (see Table 1) and target absent slopes of 76 ms/item.
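A minimal sketch of the slope estimate follows, assuming an ordinary least-squares fit of mean response time against the number of distractors. The response times are hypothetical values constructed to lie on lines with the reported average slopes (44 and 76 ms/item); they are not data from Table 1, and the intercepts are arbitrary.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


set_sizes = [6, 8, 10, 12]  # number of distractor Qs, as in the Method

# Hypothetical mean response times (ms) for one observer.
target_present_rt = [904, 992, 1080, 1168]
target_absent_rt = [1156, 1308, 1460, 1612]

present_slope, present_intercept = fit_line(set_sizes, target_present_rt)
absent_slope, absent_intercept = fit_line(set_sizes, target_absent_rt)

print(f"target present: {present_slope:.0f} ms/item")
print(f"target absent: {absent_slope:.0f} ms/item")
```

The target present slope is the search efficiency measure entered into the correlation analysis; a steeper slope indicates less efficient search.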
Stroop Task. Data from one observer were lost due to a technical error. For the remaining 11 observers there was a strong and consistent cost associated with the ink colour/colour label conflict. Specifically, reading times for the colour terms (M = 10 s) were some three seconds longer than for the neutral letter strings (M = 7 s), t(10) = 24.6, p < .001. These conflict scores are summarised in Table 1.
Attentional Network Test. The raw reaction time data obtained from the ANT are summarised in Table 2, as a function of cue and flanker conditions. Three measures of interest were obtained from the ANT (see Table 1). An orienting effect was computed by subtracting reaction times on trials with spatially informative up/down cues from those on centrally cued trials. This subtraction indicated that observers were on average 41.17 ms faster in the spatially cued trials (M = 570 ms) compared to the centrally cued trials (M = 612 ms), a pattern that was highly reliable, t(11) = 5.722, p < .001. Alerting was computed by subtracting double cue trials (M = 602 ms) from no cue trials (M = 641 ms). This subtraction revealed a reliable alerting effect of approximately 40 ms, t(11) = 7.375, p < .001. Finally, congruency (executive control) was computed by subtracting congruent (M = 591 ms) from incongruent (M = 683 ms) trials, having collapsed across all cue types. There was a strong (M = 92 ms) effect of congruency, which was again highly reliable, t(11) = 11.717, p < .001. In general, the raw reaction times and attentional estimates from the ANT were very similar to those previously reported by Fan et al. (2002).
Chandrasekaran, Turner, Bülthoff & Thornton
Correlation Analysis. To explore the relationship between the biological motion task and the various measures of attention, we constructed the correlation matrix shown in Table 3. Of primary interest is the final column that directly compares biological motion to the various attentional measures. There were only two factors that were significantly correlated with biological motion. Performance on the Stroop task was negatively correlated (r = -0.679, p < 0.05), such that observers who performed well on the biological motion task were also less affected by colour conflicts in the Stroop task.
Similarly, observers who did well on the biological motion task were less affected by flanker congruency in the ANT (r = -0.753, p < 0.01). Scatter plots for these two effects are shown in Figures 1A and B. Interestingly, the cross-correlation between the Stroop and congruency effects was only marginally significant (r = 0.562, p = 0.07). This suggests that they may be relating to slightly different aspects of biological motion. Consistent with this notion, multiple regression including both the Stroop and congruency effects as independent parameters accounts for a larger percentage of variance (66.3%) than separate analyses of these factors (46% and 56.5% of the variance for the Stroop and congruency effects, respectively). As the cross-correlations that exclude biological motion are not the primary focus of this paper, we will not discuss them in detail here. However, we note that none of these cross-correlations reached significance, although the relationship between orienting and congruency was marginal (r = 0.57, p = 0.06).
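The variance figures above can be cross-checked from the reported correlations alone. The sketch below uses the standard identity for the squared multiple correlation with two predictors; it is a consistency check, not the regression analysis run on the raw data. It is approximate because the inputs are rounded and because the Stroop correlations are based on 11 observers while the congruency correlation is based on 12.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of y on two predictors.

    r_y1 and r_y2 are the correlations of each predictor with y;
    r_12 is the correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)


# Correlations reported in the text.
r_bm_stroop = -0.679      # biological motion threshold vs Stroop interference
r_bm_congruency = -0.753  # biological motion threshold vs flanker congruency
r_stroop_congruency = 0.562

stroop_alone = r_bm_stroop ** 2
congruency_alone = r_bm_congruency ** 2
combined = r_squared_two_predictors(
    r_bm_stroop, r_bm_congruency, r_stroop_congruency
)

print(f"Stroop alone: {100 * stroop_alone:.1f}% of variance")
print(f"congruency alone: {100 * congruency_alone:.1f}% of variance")
print(f"both predictors: {100 * combined:.1f}% of variance")
```

With these inputs the combined value comes out at about 66.3%, and the single-predictor values land close to the reported 46% and 56.5%.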

Discussion
The results of Experiment 1 show a clear relationship between measures of selective attention and the ability to process a point-light walker in a mask. Specifically, those observers who are better able to selectively attend can withstand greater levels of scrambled-walker noise and still achieve 71% correct performance. Other components of attention, such as orienting, alerting and shifting of attention during search, were not correlated with biological motion. The involvement of selective attention is consistent with previous studies that have suggested certain aspects of biological motion processing may be slow, active and effortful (e.g., Cavanagh, LaBianca, & Thornton, 2001; Thornton, Rensink, & Shiffrar, 2002), rather than spontaneous and automatic, as has been traditionally claimed (Johansson, 1973). Given the nature of the current walker task (finding a target in a mask and holding on to the dynamic pattern long enough to determine direction), it is perhaps not surprising that performance is correlated with an ability to selectively attend and to ignore irrelevant items. Of course, a task analysis might also have predicted that visual search would be highly correlated, which was not the case. Similarly, a relationship with the ability to orient attention, or with individual differences in arousal changes in response to a biologically salient target, could have emerged. Again, there was no evidence that this was the case.
Nevertheless, it seems important to determine whether the current findings depend on the use of concurrent masking. That is, does the observed relationship reflect something about segmenting the target from a dynamic background of noise, or does it relate more generally to the demands of processing biological motion? The next experiment was designed to address this question.

EXPERIMENT 2
In Experiment 2, we presented observers with unmasked point-light figures performing a variety of complex, familiar actions. The task was simply to make a speeded judgement on the orientation of the display (Pavlova & Sokolov, 2000; Sumi, 1984). That is, on each trial, observers were asked to decide if the figure was upside-down or upright. On 50% of trials, the figures were presented in a normal orientation; on the remaining 50% of trials they were presented inverted, that is, rotated 180° in the picture plane. We measured selective attention via the same Stroop task used in Experiment 1. Our question was whether a similar correlation between the two types of tasks would still be present even when the figure did not need to be extracted from a mask.

Method
Participants. Twelve students from Swansea University took part in this experiment in exchange for partial course credit. All observers reported normal or corrected to normal vision and were naive with regard to the purpose of the study.
Apparatus. Stimuli were presented on a 21 inch (37 cm x 28 cm) monitor with a refresh rate of 75 Hz and a resolution of 1600x1200 pixels. Observers sat approximately 50 cm from the monitor in a dimly lit room. Responses were collected via a standard keyboard. A hand-held, electronic stopwatch was used to collect reaction times for the Stroop task.
Biological Motion Task. The point-light stimuli were obtained from the database of motion-captured actions described in detail in Vanrie & Verfaillie (2004). Eighteen different point-light actions were used, namely: Chop, Crawl, Cycle, Drink, Drive, Jump, Paddle, Paint, Play pool, Play tennis, Pump, Row, Saw, Spade, Stir, Sweep, Walk, and Wave. Each upright movie was duplicated and rotated 180° to give a total of 36 target actions. The figures were drawn as 13 white dots on a black background presented in the centre of the screen. Each dot subtended approximately 0.25° visual angle and each figure approximately 14° in height, with width varying as a function of action, between 3° and 5° at the widest extent of the limbs. All figures were oriented 45° away from the observer (right when upright, left when inverted) to increase the visibility of limb movement during the action. As many of the actions were non-symmetrical, the difference in left/right orientation would not be a reliable cue to picture plane orientation, particularly as the starting frame of each movie was randomized. Individual actions ranged in duration from 0.6 to 3.4 seconds. However, the video files were looped and remained visible until the observer responded. Custom-written MATLAB code was used to load the movies, control playback and collect responses.
Stroop Task. The stimuli and design of this task were identical to those described in Experiment 1, except that a handheld stopwatch was used to record reaction times.
Procedure. As in Experiment 1, the biological motion task was always run first. Written instructions were provided and the task was explained by the experimenter. Each trial was initiated by the participant, who was instructed to make a speeded response to the orientation of the figure by pressing one of two designated keys. Three practice actions were randomly selected to familiarize the observers with the nature of the stimuli, but no specific training on biological motion was provided. Each participant completed 36 trials in a separately randomized order. On average the experiment took around 15 minutes to complete.
Once the biological motion task had been completed, the Stroop task was performed. This began with verbal instructions and the presentation of a practice list, which contained colour-neutral words. When the participant was familiar with the task, two further lists were read aloud and the total reading time recorded by the experimenter. The order of these two lists (incongruent and neutral) was counterbalanced across participants. This section of the experiment took a further 5 minutes.

Results
Biological Motion Task. Performance varied considerably in this task, both in terms of speed and accuracy (see Table 4). The percentage of correct responses ranged from 33% to 91% (M = 66%) and the median reaction times for correct responses ranged from 646 ms to 1649 ms (M = 1052 ms). There was a clear, negative correlation between these two measures, with the more accurate observers also tending to respond more quickly (r = -0.69, p < 0.05). Only one observer appeared to have been trading speed for accuracy, suggesting they were not performing the task as instructed. Removal of this potential outlier increased the fit, both here and in subsequent analyses. However, as the overall pattern of results did not change, we opted not to exclude the data. Further analysis of the accuracy data revealed that the four lowest scoring participants were at or below chance levels of performance, with d-prime values in the range -0.6 to 0.0. The patterns of hits and false alarms shown in Table 4 indicate that these observers found the task very difficult and appear to have been responding almost randomly. More generally, we note that only three observers exceeded 75% correct responses, suggesting that this task was far from easy. We return to this issue, and to the possible implications of chance-level performance, in the discussion.
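For readers unfamiliar with d-prime, the measure can be sketched as below. Several details are assumptions rather than statements about the analysis actually performed: an "upright" response to an upright figure is treated as a hit and an "upright" response to an inverted figure as a false alarm, there are 18 trials of each kind, and a log-linear correction (adding 0.5 to each cell) is used to avoid infinite values when a rate is 0 or 1. The response counts are hypothetical and are not taken from Table 4.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate).

    Uses a log-linear correction (an assumption, not stated in the text)
    so that rates of exactly 0 or 1 remain finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical observers, 18 upright and 18 inverted trials each.
accurate = d_prime(hits=17, misses=1, false_alarms=1, correct_rejections=17)
at_chance = d_prime(hits=9, misses=9, false_alarms=9, correct_rejections=9)
below_chance = d_prime(hits=7, misses=11, false_alarms=10, correct_rejections=8)

print(f"accurate observer: d' = {accurate:.2f}")
print(f"chance observer: d' = {at_chance:.2f}")
print(f"below-chance observer: d' = {below_chance:.2f}")
```

A value of zero means that "upright" responses were equally likely for upright and inverted figures; negative values arise when false alarms outnumber hits.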
Stroop Task. As in Experiment 1, Stroop interference was calculated as the total reading time for the incongruent list (M = 12.7 seconds) minus the reading time for the neutral list (M = 7.2 seconds). The presence of interference was consistent across all observers, t(11) = 19.5, p < .001, and ranged from 4.51 seconds to 7.11 seconds (M = 5.5 seconds). As shown in Figures 1C and 1D, there were strong correlations between Stroop interference and both biological motion measures. Specifically, there was a positive correlation between Stroop interference and reaction time, with observers most able to withstand Stroop interference also responding more quickly on the biological motion task (r = 0.72, p < 0.01); and a negative correlation between Stroop interference and correct responses (r = -0.69, p < 0.01), with high levels of interference associated with low levels of accuracy.
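The direction of these two relationships can be illustrated with a plain Pearson correlation. The sketch below uses hypothetical scores for six observers, chosen only to show the expected signs; they are not the data in Table 4 or Figure 1, and the variable names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for six observers (NOT the data reported in Table 4).
stroop_interference_s = [4.5, 5.0, 5.2, 5.8, 6.4, 7.1]  # incongruent minus neutral
median_rt_ms = [650, 900, 820, 1100, 1300, 1600]
percent_correct = [91, 80, 83, 66, 50, 36]

r_rt = pearson_r(stroop_interference_s, median_rt_ms)
r_accuracy = pearson_r(stroop_interference_s, percent_correct)

print(f"interference vs reaction time: r = {r_rt:.2f}")
print(f"interference vs accuracy: r = {r_accuracy:.2f}")
```

A positive coefficient for reaction time and a negative one for accuracy both indicate that larger Stroop interference goes with poorer biological motion performance.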

Discussion
The relationship between biological motion performance and selective attention, observed in Experiment 1, was replicated in this experiment, despite the absence of masking. This suggests that the correlation does not reflect a general ability to segment targets from their background but is more specifically related to the presence of point-light figures in these two, quite different, tasks.
A rather unexpected finding in Experiment 2 was the relatively low level of correct responses. In general, observers were not able to perform the orientation judgement as easily as we anticipated. As mentioned above, only three observers performed at anywhere near ceiling levels. Modest hit rates and the appearance of false alarms were quite common (see Table 4). Sumi (1984) noted that inverted figures are sometimes interpreted as novel, upright actions, demonstrating the dominance of the stimulus-driven percept. This may account for the frequency of the false alarms. It remains unclear why the upright actions were missed on numerous occasions. One possibility is that the use of a 45° viewing angle could have caused some confusion.
The fact that four of our observers were at or below chance raised concerns about whether they should be included in the correlation analysis. However, we should note that when they were removed, the patterns of both speed and accuracy correlations did not change. For example, the strength of the reaction time/Stroop relationship actually improved (r = 0.79, p < 0.05).

GENERAL DISCUSSION
In two experiments we have demonstrated that performance on biological motion tasks is highly correlated with the ability to control selective attention. Observers who are better able to focus on relevant targets or dimensions, as indexed by Stroop and flanker interference measures, are better able to process point-light figures. This is true both when those figures are embedded in noise and when they appear alone, without any form of masking. Other aspects of attention, such as orienting, alerting and visual search efficiency, showed no such relationship.
The significance of these results lies in the fact that biological motion processing is typically characterized as a spontaneous and automatic process, not one thought to rely heavily on central resources, such as attention (Giese & Poggio, 2003; Johansson, 1973, 1975; Mather et al., 1992). Thus, while many other forms of effortful visual task might show a similar correlation with attention, such a finding is not a straightforward prediction within the context of biological motion.
We should be clear that we are not disputing the fact that the processing of biological motion can proceed in a fast, efficient, bottom-up manner. There is considerable evidence to support this notion (e.g., Giese & Poggio, 2003; Mather et al., 1992), and, indeed, some of the most direct empirical work comes from our own group (Thornton & Vuong, 2004). It also seems, however, that this processing route is sometimes not sufficient to support appropriate behavioural responses (Blake & Shiffrar, 2007). That is, certain display manipulations or task demands may require the addition of top-down, active processing strategies in order to explicitly interpret point-light stimuli (e.g., Bertenthal & Pinto, 1993; Bülthoff et al., 1998; Cavanagh et al., 2001; Thornton et al., 1998; Thornton et al., 2002). We believe our current findings relate specifically to this latter form of processing.
Following from previous studies indicating that attention may be involved in biological motion processing (e.g., Battelli et al., 2003; Cavanagh et al., 2001; Thornton et al., 2002), our contribution here is to highlight the likely role of one specific network of attention, one that is involved in selection and executive control (e.g., Posner & Petersen, 1990). This finding may help to identify appropriate tasks to further probe the role of attention in biological motion processing, for example search for change (Rensink, 2002), attentional blink (Raymond et al., 1992) or visual marking (Watson & Humphreys, 1997), and may even help constrain the interpretation of emerging lesion and imaging studies, such as the recently noted involvement of frontal cortex in the processing of biological motion (e.g., Saygin et al., 2004; Saygin, 2007).
Finally, we would like to consider the implications of another feature of our data, that is, the variability in biological motion responses. While some intersubject variability is always to be expected in any task, the range of responses we found, particularly in Experiment 2 (with one group of observers at chance, and another close to ceiling), was a little surprising. Such variability seems inconsistent with the notion that a single, bottom-up, passive mechanism could be supporting behaviour in an automatic fashion (Johansson, 1973, 1975). Our observers were relatively naïve, and received only very minimal pre-experimental exposure, but then their task (is the actor upside-down?) could not be considered very demanding. This raises two issues. First, is such variability unusual when naïve observers have to make decisions on unmasked point-light stimuli? Typically, studies of biological motion, at least with normal populations, have not considered individual data to be of interest, so it will have been averaged away. Second, could this variability tell us something useful about the nature of biological motion processing? Here, clearly, we have shown one way in which such data can be informative.