Main

The optic flow1 that is generated when a person moves through the environment can be locally decomposed into several basic components, including radial, circular, translation and shear motion2,3. Neurons in the dorsal portion of the medial superior temporal cortex (MSTd) of macaque monkeys respond selectively to these components, alone or in combination4,5,6,7. Microstimulation can influence the direction of heading of a behaving monkey8, which demonstrates the functional importance of MST to heading. In the adjacent area MT (or V5), neurons also respond to motion and are highly directionally selective, but they do not show specific selectivity to circular or radial trajectories9. The neurophysiological findings are reinforced by psychophysical studies suggesting the existence of analogous neural units in humans that integrate local-motion signals along complex flow trajectories10,11. In agreement with the neurophysiological studies, these units have very large receptive fields12 and sum information over periods of one or two seconds (L. Santoro and D.C. Burr, Perception 28, 90c, 1999). There is also some evidence in humans for selectivity along the 'cardinal directions' of optic flow (radial and circular)13, although this issue is somewhat controversial14.

A very strong motion-selective response at the boundary of Brodmann's areas 19 and 37 is shown by human imaging15,16,17,18,19,20,21,22,23. This area is generally thought to be the human analogue of monkey V5/MT-MST regions, and is referred to as 'V5/MT complex' (MT+). Other motion-sensitive areas have been identified: a dorsal region referred to as V3A16,21,22,24 and a ventral region, seemingly activated by motion boundaries and second-order motion20,21. Here, using fMRI, we examined the response to optic-flow stimuli, and showed selective activation of a large area within the V5/MT complex to radial and circular motion, very different from the area activated by translational motion. Activation in response to optic flow occurred only when the direction of optic flow changed—abruptly or gradually—during the presentation period; continuous radial or circular motion produced no reliable activation in any area when measured against a matched random control.

Results

Response to optic flow

We created dynamic displays of circular, radial, spiral and translational motion from random-dot patterns. The control stimuli were locally identical to the active stimuli, but each dot moved along an independent, randomly chosen trajectory (Fig. 1; Methods). Responses of subjects to flow motion were measured against these random controls, with both active and control stimuli reversing direction every two seconds. There was a clear and specific activation in the temporal–occipital cortex, with no response in V1 or any other cortical area (example, Fig. 2a). The response extended for more than 1 cm along the sulcus that separates Brodmann's areas 19 and 37, within the region that is usually referred to as the human analogue of V5/MT complex15,16,17,18,19,20,21. Figure 2d shows the response time course, averaged over nine subjects; even when measured against a control that was well matched for local motion, there was a strong and specific fMRI response within the V5/MT complex, suggesting specialized detectors for this type of motion.

Figure 1: Stimuli used in this study.

Left, the three types of coherent motion: rotation, radial and translation. Right, noise controls for each condition. For rotation and radial conditions, control dots moved along independent spiral trajectories, following equation 1 with φ set to random values. Both coherent and noise stimuli had the same center of flow. For the translation condition, the control dots moved along independent straight trajectories, with α of equation 2 set to random values. In conditions when the coherent motion abruptly inverted direction, the noise inverted direction as well.

Figure 2: Example of localization of flow response in subject MC (left), and response time courses averaged over subjects (right).

Left, activity maps for three bicommissural slices for three different stimuli in subject MC, recorded consecutively in the same session. Red, voxels with correlation coefficient greater than 0.2 and aggregation areas greater than 6 voxels (corrected, p < 0.01). There was a strong response to the rotation stimulus, which abruptly changed direction every two seconds (a). However, there was no measurable response to continuous rotation in any region (b). Gradually changing flow (periodicity, 4 s) produced a response as strong as abruptly changing motion (c). Right (d–f), time courses of the response, normalized to the individual mean, then averaged over subjects. Error bars, s.e.m. The number of subjects included in the average was 9 (d), 6 (e) or 9 (f). To facilitate comparison between one condition and another, all time courses refer to the same ROI in each individual subject (Methods), encompassing the extensive response that was obtained with the gradually changing flow (pink in the example in b).

We also measured the response to non-inverting flow motion against non-inverting random controls. Surprisingly, the non-inverting flow stimulus elicited no response, either in the individual activity maps (Fig. 2b) or in the averaged response time courses (Fig. 2e). However, when the same stimulus was measured against a control of mean luminance (data not shown), there was a very strong response, both in V5/MT complex and in many other visual areas (including V1).

To investigate further the importance of change of flow direction, we produced a stimulus that changed flow direction gradually. By steadily increasing the angle φ of equation 1, the direction of flow motion changed smoothly from pure expansion, to expanding clockwise rotation, to pure clockwise rotation, to contracting clockwise rotation, and returned to pure expansion over a period of four seconds. This stimulus also produced a specific response against a continuous motion control (Fig. 2c and f) of similar strength to the response produced by the abruptly reversing stimuli. This result suggests that the change in flow direction is important to elicit the response, not the local transients of the abrupt inversion.

We quantified the strength and reliability of the fMRI responses in two ways: as the percentage of response modulation in synchrony with the stimulus, and as the ratio of the synchronous to the asynchronous response (signal-to-noise ratio, S/N; Fig. 3). All forms of changing flow motion produced strong and significant responses, indicated both by the high S/N ratio and percentage response, and by the consistent phase, all grouped around the stimulus phase of 90° (Fig. 3). Continuous motion, however, produced weak responses with low S/N ratios and a wide spread of phases.

Figure 3: Summary of responses to various flow stimuli for 13 subjects.

S/N ratio, left, and percentage response, right, represented as the distance from the origin. The polar angle represents the phase of the fundamental harmonic of the response. Small open circles (bottom), continuous flow (either radial or circular); large open circles, reversing rotation; triangles, inverting radial motion; filled diamonds, gradually changing flow. For all three conditions of changing flow, response was strong and phase-locked to the stimulus (corresponding to a temporal phase of 90°). The continuous flow condition produced low amplitudes and low signal-to-noise ratios, with no phase coherence.

For each subject, we compared both S/N ratio and percentage response to changing flow (abrupt and gradual) with those produced by continuous flow (Fig. 4). For all subjects, both S/N and percent responses to changing flow were far greater than responses to continuous flow. We also compared the responses to the three types of changing flow, by plotting S/N and percent modulation for gradually changing flow and inverting radial motion against those for inverting rotation (Fig. 5). For both measures, there was no systematic tendency for responsiveness to differ for the two types of flow, or for gradual versus abrupt change.

Figure 4: Comparison of response to continuous flow and response to the various types of changing flow, measured in the same recording session for the same ROI.

S/N ratios, left; percentage response, right. In every case, the response to continuous flow was negligible and always less than responses to changing flow. The continuous lines show the best fit of the data, with correlation coefficients of −0.052 for S/N and 0.13 for percent modulation.

Figure 5: Comparison of gradually changing flow and of inverting radial motion to inverting rotation, both measured in the same recording session and the same ROI.

The results showed no obvious tendency for the response to be higher for either condition.

Response to translation

We measured the response (against matched controls) to comparable stimuli that translated vertically, either continuously or inverting direction every two seconds. Both the individual time courses and the time courses averaged over subjects showed strong and reliable modulation in synchrony with the stimulus alternation. Unlike for the flow stimuli, there was a clear response to continuous motion, stronger than that to inverting motion, and both originated from the same cluster of voxels (Fig. 6).

Figure 6: Time course of the responses to vertical motion.

Responses were normalized and averaged over seven subjects for the upper curve and five subjects for the lower. Error bars, s.e.m. Top curve, continuous (upward) translation against random noise of matched local speed. Unlike radial and circular motion, this produced a strong response, slightly greater than the response to inverting vertical motion against inverting noise (bottom curve).

The response to translational motion was highly localized, but distinct from that to flow stimuli (examples, Figs. 7 and 8). Both the series of contiguous slices and the three-dimensional reconstruction clearly showed that the two types of motion excited distinct areas. The area responding to flow motion was often larger than that responding to translation, although the depth of the modulation was less for rotation than for translation. Talairach coordinates of the most significant voxels responding to translation and to flow (Table 1; analyzed by SPM99, Methods) showed that, for all subjects, the centers of the regions were far apart; the average distance was 1.4 cm. As these areas (particularly the area responding to flow) were often extensive (1 cm), we also calculated the gap between the regions defined by significantly responsive voxels. Again, this distance was considerable, on average 1.0 cm. In five of the seven subjects, the center of the area responding to translation was more dorsal and posterior than the center of the area responding to flow. In the other two subjects (TLB and AP), the two areas were clearly separated, but the flow area was more posterior. Interestingly, neither of these two subjects had a normal sulcus separating BA 37 and 19; TLB had a slight dysplasia, and AP showed an increased sulcation.

Figure 7: Three contiguous slices parallel to the calcarine fissure, obtained during stimulation by either inverting rotation against inverting noise (top slices) or inverting translation against inverting noise (bottom slices).

As before, the labeled areas show voxels with correlation coefficient greater than 0.2, and aggregation areas greater than 6 voxels (corrected p < 0.01).

Figure 8: Three-dimensional reconstruction of the areas of responsiveness to translation (purple) or rotation (red) in subject SB (correlation coefficient greater than 0.2, and aggregation areas greater than 6 voxels, corrected p < 0.01).

Although the area that responds significantly to translation is smaller than the area that responds to rotation, the average depth of response modulation was higher for translation, 1.6% compared with 1.4%.

Table 1 Talairach coordinates for significant voxel activation to flow and translation.

Discussion

Previous imaging studies have measured the response to translating stimuli against stationary controls. This protocol tends to activate not only direction-selective motion neurons, but all neurons with transient temporal tuning. We avoided this problem by using controls with well-matched local-motion properties. Each individual dot in the control and in the active stimulus moved at the same local speed, and followed a similar trajectory (linear for translation, spiral for optic flow). Differences in response between the active and control stimuli must have been due to the coherence of the motion trajectories. This implies that neurons exist within the human V5/MT complex that integrate local-motion information along complex trajectories, including translation, rotation and radial motion, confirming neurophysiological4,5,6,7,8,9 and psychophysical10,11,12,13,14 evidence.

The cortical areas that responded to flow motion and translation were distinct. The region that responded to translation was similar to the region reported in other studies that used dynamic controls16,22. On the other hand, the region that responded to flow stimuli was separated by more than 1 cm from the area responding to translation, and was usually more ventral. Both regions were within the confines of the area usually considered to be the human homologue of V5/MT complex, indicating a functional subdivision of this area for different types of motion analysis, as previously suggested19.

The fMRI response to radial or circular motion was not only localized to a different position than the response to translation, but also showed very different functional properties. A response could be obtained from radial or circular motion stimuli only if the direction of flow was periodically changed, either gradually or abruptly, whereas there was a clear response to translation for both continuous and alternating motion. This finding could account for the seemingly contradictory results reported between studies that used continuous flow motion (R.M. Rutschmann et al., Invest. Ophthalmol. Vis. Sci. Suppl. 40, S819, 1999)23,25 and another that used abrupt reversal (J. Intriligator et al., Invest. Ophthalmol. Vis. Sci. Suppl. 40, S819, 1999).

To understand the response specificity to flow change, it is useful to consider the neural responses to the stimuli of this study. It is well established that neurons in various visual areas, including both V1 and MT, show directional selectivity, so a subset of these neurons will respond well to the upward motion of the active translation stimulus, whereas most others presumably remain silent. However, the random control comprises motion in all directions, so it should stimulate, to some extent, all motion-sensitive neurons. If the neural activity and the BOLD response are both completely linear, the mass response to the random control and to the unidirectional stimulus should be equal, producing no net modulation. Indeed, this is observed in V1, which showed a strong response with a blank control but no response with the matched random control. The finding that MT complex responded against the matched random control implies a nonlinearity, either within the neurons themselves or in the BOLD response. As there is good evidence for approximate linearity in the BOLD response, at least in V1 (ref. 26), the nonlinearity probably arises from the neural response. A likely source for neural nonlinearity is mutual inhibition between neurons of differing direction selectivity, for which there is fairly good evidence in MT, but not V1, in both monkeys27 and humans28. The mutual inhibition would clearly decrease the response to the random control, where all directions are presented simultaneously, but not the response to the active stimulus that has a single motion direction.

The same argument can be applied for flow-field stimuli. Individual neurons in MSTd are selective to particular directions of optic flow, including the two directions of circular and radial motion. All of these should be weakly activated by the random control, whereas a small subset should be strongly activated by the single direction, such as clockwise motion. Again, if everything were linear, there should be no response, either to the continuous or changing motion. This is one possible explanation for the lack of response to continuous flow motion. Perhaps neurons selective to flow do not exhibit the same type of mutual inhibition that seems to occur in MT, so the mass response to the random control and to unidirectional flow are equal.

Why, then, was there a strong response to changing but not constant flow? Besides the linearity argument stated above, it is also possible that constant flow causes strong adaptation in neurons, so the response is too weak to distinguish from the response to well-matched noise. However, single-cell recordings from macaque MSTd do not support this idea. These neurons have a strong sustained component that remains invariant for up to 30 seconds29,30. However, over 50% of neurons have a transient nonlinear component, both to motion onset and offset29 and to change in flow trajectory30. It is possible that this transient component makes a major contribution to the fMRI response observed here.

The selective response to changing flow stimuli may not have explicit physiological involvement; it may be a simple consequence of the temporal characteristics of the cells tuned to this type of motion. However, it is also possible that the temporal selectivity to change in flow conveys information that is useful, for example, for navigation. In natural conditions, the flow of images on our retinas is rarely constant over time; it changes continuously with each movement of the eyes, head and body. Psychophysical studies show that humans can accurately estimate heading during pursuit eye movements, somehow correcting for the eye movement31. Physiological recordings show that MST neurons change their specificity for flow focus during tracking eye movements, presumably taking advantage of an eye-position signal32. A neuronal population that signals the change of flow produced by these body movements may be instrumental in synchronizing the eye-movement signal with the flow response, to extract stable visual information about the external world from sensors mounted on a highly unstable platform.

Methods

Subjects.

Subjects were eighteen healthy young volunteers (12 male, 6 female). Seven subjects participated in at least two sessions, and some participated in up to five sessions. Subjects lay on their backs and fixated a central dot on a translucent screen within the bore (15 cm square, subtending 33° × 33°) through a mirror at an optical distance of 26 cm. All subjects had normal or slightly myopic vision and could perform the task without effort. Stimuli were generated by a framestore (Cambridge Research Systems, Rochester, Kent, UK) and back-projected onto the screen by a Polaroid (Cambridge, Massachusetts) projector with a suitable collimating lens (Buhl Optics, Pittsburgh, Pennsylvania).

Stimuli.

Motion stimuli were generated from random-dot patterns that moved coherently along either circular, radial or translational trajectories that defined optic flow (Fig. 1). Each dot traveled along an appropriate trajectory for a 'limited lifetime' of 10 frames (300 ms), after which it disappeared to be 'reborn' at a new random position. Fifty dots were presented, half black and half white against a gray background of 300 cd/m², with 90% contrast. The optic flow trajectory was defined by the following equations:

$$\frac{dr}{dt} = v\cos\varphi, \qquad \frac{d\theta}{dt} = \frac{v\sin\varphi}{r} \tag{1}$$

Radial and angular velocities are defined by dr/dt and dθ/dt respectively, and local speed is defined by v (set to 7°/s, found to be optimal in previous psychophysical studies12). The angle φ defines the direction of optic flow: φ = 0 or π generates radial motion (outward and inward, respectively), and φ = ±π/2 generates clockwise and anti-clockwise circular motion. Intermediate values of φ generate spiral motions (combinations of radial and circular motion). The local speed did not vary with distance from the origin (as it would for rigid rotation), but was constant for all positions (because of the normalization by radius) so as to match more closely the translation condition. However, in two subjects, we also measured the response to rigidly rotating patterns (where local speed varied with radius), and found that the response to this stimulus was virtually identical to the response produced with constant local speed.
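The constant-speed property described above can be checked numerically. The following is a minimal sketch, not the original stimulus code; it assumes that equation 1 has the form dr/dt = v cos φ, dθ/dt = (v sin φ)/r, which is inferred from the definitions in the text. The function names are hypothetical, and the mapping of the sign of φ onto clockwise or anti-clockwise rotation depends on the orientation of the display axes, which is not specified here.

```python
import math

V_DEG_PER_S = 7.0  # local speed v given in the text (deg/s)


def flow_velocity(x, y, phi, v=V_DEG_PER_S):
    """Cartesian velocity (dx/dt, dy/dt) of a dot at (x, y), measured from the flow center.

    Assumed form of equation 1: dr/dt = v*cos(phi), dtheta/dt = v*sin(phi)/r.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0, 0.0  # direction is undefined exactly at the center of flow
    radial = v * math.cos(phi)       # dr/dt
    tangential = v * math.sin(phi)   # r * dtheta/dt, independent of r
    ux, uy = x / r, y / r            # unit radial vector
    # The unit tangential vector in a y-up frame is (-uy, ux).
    return radial * ux - tangential * uy, radial * uy + tangential * ux


def local_speed(x, y, phi):
    """Magnitude of the dot velocity; equals v everywhere except at the center."""
    return math.hypot(*flow_velocity(x, y, phi))


# Speed is the same near the center and in the periphery, for any phi.
speeds = [local_speed(x, y, phi)
          for (x, y) in [(0.5, 0.0), (3.0, -4.0), (10.0, 12.0)]
          for phi in [0.0, math.pi / 4, math.pi / 2, math.pi]]
```

Under these assumptions every entry of `speeds` equals 7°/s, whereas rigid rotation would give a speed proportional to r.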

For translational motion, the horizontal and vertical velocities dx/dt and dy/dt were given by the following equations:

$$\frac{dx}{dt} = v\cos\alpha, \qquad \frac{dy}{dt} = v\sin\alpha \tag{2}$$

Direction of motion is defined by α, and local speed is defined by v, again set to 7°/s. In most experiments, the control stimulus was also generated by these equations, by choosing random constant values of φ or α for each dot.
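The construction of the translation stimulus and its noise control can be sketched as follows. This is an illustration under stated assumptions rather than the original stimulus code: equation 2 is taken to be dx/dt = v cos α, dy/dt = v sin α (inferred from the definitions in the text), the random directions are drawn uniformly over 0 to 2π (the text says only that they were random), and all function names are hypothetical.

```python
import math
import random

V_DEG_PER_S = 7.0  # local speed v given in the text (deg/s)
N_DOTS = 50        # number of dots given in the text


def translation_velocity(alpha, v=V_DEG_PER_S):
    """Assumed form of equation 2: (dx/dt, dy/dt) = (v*cos(alpha), v*sin(alpha))."""
    return v * math.cos(alpha), v * math.sin(alpha)


def coherent_directions(alpha, n_dots=N_DOTS):
    """Active stimulus: every dot shares the same direction alpha."""
    return [alpha] * n_dots


def control_directions(n_dots=N_DOTS, rng=random):
    """Noise control: each dot gets its own random, constant direction (uniform is an assumption)."""
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_dots)]


def invert(directions):
    """Direction inversion, applied to active and control stimuli alike."""
    return [a + math.pi for a in directions]


rng = random.Random(0)  # seeded only to make the sketch reproducible
active = [translation_velocity(a) for a in coherent_directions(math.pi / 2)]
control_alphas = control_directions(rng=rng)
control = [translation_velocity(a) for a in control_alphas]
control_inverted = [translation_velocity(a) for a in invert(control_alphas)]
```

Every dot in `active` and `control` has the same speed, so the two stimuli differ only in the coherence of their directions. The same scheme applies to the flow controls, with a random constant φ per dot in place of α.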

The use of vertically moving stimuli minimized tracking eye movements. However, to be certain that this was not a problem, we measured the eye movements of two subjects observing the stimuli under similar conditions; there were occasional (but rare) saccades, but no continuous optokinetic nystagmus.

fMRI methods.

BOLD responses were acquired with a 1.5 T General Electric Signa Horizon System (General Electric, Milwaukee, Wisconsin), equipped with echo-speed gradient coil and amplifier hardware, using a standard quadrature head coil. Activation images were acquired using an echoplanar imaging (EPI) gradient-recalled echo sequence (TR/TE/flip angle, 3 s/50 ms/90°; FOV, 280 × 210 mm; matrix, 128 × 96; acquisition time, 3.13 min). Volumes of contiguous 5-mm slices parallel either to the calcarine fissure or to the bicommissural axis were acquired every 3 s. A time course series of 64 images for each volume was collected, usually in 6 epochs alternating between control and active conditions, each 30 seconds in duration. In some sessions, only five stimulus epochs were used. The first epoch always lasted 13 s longer to allow the signal to stabilize. This initial period was eliminated from any successive analysis. The original sampling volume matrix was resampled to 128 × 128 pixels over 8–10 slices, with final voxel size of 2.19 × 2.19 × 5 mm.

An additional set of anatomical high-resolution two-dimensional SPGR data (TR/TE/flip angle, 150 ms/2.3 ms/120°; RBW, 12.8 kHz; FOV, 280 × 210 mm; matrix, 256 × 192; NEX, 3; acquisition time, 1.41 min) matched to the fMRI images was acquired for subsequent localization of the activation areas. A volumetric set of data was acquired (three-dimensional FSPGR, TR/TE/TI/flip angle, 21.1 ms/3.8 ms/700 ms/10°; RBW, 10.4 kHz; FOV, 240 × 180 mm; matrix, 512 × 192; NEX, 1; acquisition time, 9.51 min) to generate a three-dimensional whole-brain reconstruction and a bicommissural axial projection to estimate the anatomical Talairach coordinates33.

BOLD maps for signal intensity changes were generated by temporal correlation of the T2*-weighted EPI images acquired with the task and baseline alternation sequences34 using the software package STIMULATE (J.P. Strupp. Neuroimage 3, S607, 1996). The correlation coefficient threshold used to create the functional maps was set at 0.2, with the additional requirement of a cluster size of 6 voxels. With these values, the effective probability, after correction for multiple comparisons (following refs. 35–37), was estimated to be less than 0.01. To test whether the subject moved during the recording session, we measured the time course of the center of mass of the head, and these signals were analyzed by FFT (fast Fourier transform). If, in any recording, the S/N ratio of any of the signals (x, y or z) was greater than 1.5, that recording session was rejected.
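The logic of a correlation map with a cluster-size requirement can be sketched as below. This is not the STIMULATE implementation, whose internals are not described here. It assumes a plain boxcar reference with no haemodynamic delay, a Pearson correlation, and 4-connected clusters within a single slice; the function names and the `min_size` parameter are hypothetical. Whether a cluster of exactly 6 voxels passes is left to the caller, as the text does not settle that point.

```python
import math


def boxcar(n_volumes, volumes_per_epoch):
    """Reference waveform: 0 in control epochs, 1 in active epochs (control first)."""
    return [(i // volumes_per_epoch) % 2 for i in range(n_volumes)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_filter(mask, min_size=6):
    """Keep only 4-connected clusters of True cells containing at least min_size cells."""
    rows, cols = len(mask), len(mask[0])
    keep = [[False] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood fill from (r0, c0) to collect one connected cluster.
            stack, cluster = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                cluster.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            if len(cluster) >= min_size:
                for r, c in cluster:
                    keep[r][c] = True
    return keep


def activation_map(slice_timecourses, reference, r_threshold=0.2, min_size=6):
    """Threshold the voxelwise correlation with the reference, then apply the cluster filter."""
    mask = [[pearson_r(reference, tc) > r_threshold for tc in row]
            for row in slice_timecourses]
    return cluster_filter(mask, min_size)
```

With TR = 3 s and 30-s epochs, `boxcar(60, 10)` would give the reference for a 60-volume series; a practical analysis would also shift or smooth the reference to allow for the delay of the BOLD response.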

To validate the differential activation localization between the translational motion and flow motion, we also analyzed the same data using SPM99 (Wellcome Department of Cognitive Neurology, London, http://www.fil.ion.ucl.ac.uk/spm). The scan from each subject was realigned using the first frame of the temporal sequence as reference. The correction was usually very limited, given that recording sessions with significant head movements had already been eliminated. The images were then smoothed with a Gaussian filter of space constant equal to 6 mm FWHM, and the response was modeled with a delayed box-car function with no temporal filtering. A t-test parametric map was generated from voxels labeled significant with p < 0.0009 (uncorrected). The analysis also corrected for multiple comparisons, both for individual voxels and clusters. We considered significant areas with corrected probability for both the cluster and the single voxel less than 0.05, and determined the Talairach coordinates and z-scores of the most significant voxel of each area (Table 1).

For further quantitative analyses of the signal, we made two separate estimates of signal strength: signal-to-noise ratio and the more conventional modulation amplitude. Both estimates were derived from the average signal of voxels in a particular region of interest (ROI), without considering the individual temporal correlation coefficients. The ROI, usually comprising 100 voxels, encompassed the posterior extreme of the inferior temporal sulcus, extending over the contiguous areas 37 and 19. The average signal was fast Fourier transformed to yield the phase and amplitude of the harmonics in synchrony with the stimulus alternation. The amplitude of the fundamental component (half the peak-to-trough excursion), normalized by the mean signal over the full time course, defined the 'response amplitude.' The S/N ratio was given by the ratio of the fundamental component to the mean amplitude of the first 10 components asynchronous to the stimulus periodicity. This approach does not rely on previous knowledge of the shape or delay of the temporal filter, and the phase values of the reliable response give an independent evaluation of the synchrony of the response to the stimuli. The method is particularly valuable in demonstrating the absence of response, which is not available with standard statistical maps.
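The two estimates can be illustrated with a short sketch, which is not the analysis code used in the study. It assumes that the 'asynchronous' components are the ten lowest non-zero frequencies that are not integer multiples of the stimulus frequency, and it uses a direct discrete Fourier sum rather than an FFT for brevity. The function names are hypothetical, the example time course is synthetic, and the phase returned follows the sign convention of the complex Fourier coefficient, which need not match the convention of Fig. 3.

```python
import cmath
import math


def dft_component(signal, k):
    """Complex Fourier coefficient of harmonic k (k cycles over the whole record)."""
    n = len(signal)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(signal)) / n


def harmonic_amplitude(signal, k):
    """Amplitude of harmonic k, that is, half its peak-to-trough excursion."""
    return 2.0 * abs(dft_component(signal, k))


def response_summary(signal, n_cycles, n_noise=10):
    """Percent modulation, S/N ratio and phase of the component at the stimulus frequency."""
    n = len(signal)
    mean = sum(signal) / n
    fundamental = harmonic_amplitude(signal, n_cycles)
    # Assumed definition: lowest-frequency bins that are not harmonics of the stimulus.
    noise_bins = [k for k in range(1, n // 2) if k % n_cycles != 0][:n_noise]
    if not noise_bins:
        raise ValueError("no asynchronous components available for the noise estimate")
    noise = sum(harmonic_amplitude(signal, k) for k in noise_bins) / len(noise_bins)
    return {
        "percent": 100.0 * fundamental / mean,
        "snr": fundamental / noise if noise > 0.0 else math.inf,
        "phase_deg": math.degrees(cmath.phase(dft_component(signal, n_cycles))),
    }


# Synthetic ROI time course: 60 volumes spanning 3 control/active cycles,
# with a 1.5% modulation at the stimulus frequency and a small slow drift.
N, CYCLES = 60, 3
roi = [100.0
       + 1.5 * math.sin(2.0 * math.pi * CYCLES * i / N)
       + 0.1 * math.cos(2.0 * math.pi * i / N)
       for i in range(N)]
summary = response_summary(roi, CYCLES)
```

Because the noise estimate comes from the same record as the signal, a low S/N with scattered phases across sessions indicates that no response is present, which a thresholded correlation map cannot show.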