Feature binding in zebrafish

The binding problem is the brain's fundamental challenge for advanced sensory processing: objects in the outside world possess multiple features, which must be bound into a cohesive perceptual representation. Although there is suggestive evidence that nonmammalian vertebrates (and possibly insects) may support it, this rudimentary form of sensory syntax is ascribed primarily to cortex or similarly complex avian structures. The experiments reported here provide evidence that a small vertebrate lacking cortex supports visual feature binding for social behaviour. Zebrafish, Danio rerio, displayed spontaneous preference for images of other zebrafish in which the visual attributes of form and motion were paired in a meaningful fashion, while each attribute in isolation was rendered ineffective as a cue for discrimination. The ability to conjoin the two features was robust and remarkably flexible. These results challenge the notion that feature binding may require cortical structures and demonstrate that the nervous system of small vertebrates can afford unexpectedly complex computations.

© 2012 The Association for the Study of Animal Behaviour. Published by Elsevier Ltd.
The earliest stage of biological image processing is widely regarded as a highly specialized process supported by detectors selectively tuned to individual features of the incoming stimulus, such as orientation, colour and motion (Zeki & Shipp 1988); these different attributes, initially encoded within distinct neural structures, must be reassembled into a unified perceptual representation of the outside world (Treisman 1996). It has been recognized for several decades that this more advanced stage of processing is highly demanding and can fail under some circumstances (Wolfe & Cave 1999), thus representing a challenging 'binding' problem for sensory systems (Roskies 1999).
Current attempts to relate existing theories of feature binding (Treisman 1996) to known neural structures rely primarily on cortex (Zeki & Shipp 1988; Shafritz et al. 2002; Robertson 2003; Botly & De Rosa 2009). The preferential attribution of feature-binding capabilities to this highly evolved mammalian structure is motivated by the lack of conclusive evidence that perceptual feature binding may be performed by animals with allegedly more limited neural resources than mammals (chapter 3 in Shettleworth 2008). Birds (which lack cortex) possess this ability (Cook 1992; Blough & Blough 1997; Katz et al. 2010) but their brains are equipped with neural structures of equivalent estimated potential to those of mammals (Jarvis et al. 2005).
The above statements specifically refer to perceptual feature binding: the ability to carry out perceptual discriminations that require access to a bound sensory representation and cannot be performed by relying on individual features alone (see General Discussion for further clarification). At present there is no conclusive experimental evidence for this ability in reptiles, amphibians or fish (there is also no definitive evidence from invertebrates such as insects, even though these animals have been shown to display remarkably complex visually guided behaviour; Collett & Collett 2002; Srinivasan 2010).
Because nonhuman primates find conjunction tasks especially difficult (Smith et al. 2004), it is conceivable that creatures such as fish may not support this ability at all, particularly in view of the current notion that binding is intimately linked to higher-level cognitive phenomena such as attention (Treisman 1996; Robertson 2003). On the other hand, there is substantial suggestive evidence from other forms of binding-like operations that nonmammalian vertebrates (e.g. toads, Bufo bufo: Ewert et al. 1979) and some insects (e.g. honeybees, Apis mellifera: Schubert et al. 2002) may support this type of cognitive operation; furthermore, fish possess neural structures that may be homologous with the mammalian cortex (Mueller & Wullimann 2009). The question remains open.
In this study, I investigated whether the zebrafish, Danio rerio, a small teleost, supports feature binding and whether it relies on this ability for the purpose of social aggregation (Miller & Gerlai 2011: 'shoaling'). I used stimuli specifically designed to exclude the possibility that the results may be explained by the animal relying on a single visual feature (Shepard et al. 1961; Smith et al. 2004), requiring instead compulsory conjunction of form and motion. These two visual attributes are widely believed to be processed by different cortical regions in primates (Zeki & Shipp 1988) encompassing a rich circuitry which, by some morphological accounts, may appear orders of magnitude more complex and articulate than the zebrafish brain.

Animals and Test Apparatus
I used wild-type zebrafish (age range 4–12 months) bred and maintained by trained staff in a dedicated facility (Institute of Medical Sciences, Aberdeen, U.K.). Outside testing, fish were kept inside a 10-litre storage tank (average density two fish per litre) attached to a recirculated system (Aquatic Habitats, Apopka, FL, U.S.A.) at 27 °C on a 14:10 h light:dark photoperiod and never exposed to heterospecifics. They were fed brine shrimp twice a day (at 0930 and 1630 hours). During testing, one fish was transferred to a test tank measuring 25 × 13 cm and 11 cm high; water within the test tank came from the storage tank and room temperature was thermostatically controlled. The two furthest sides of the test tank were placed against two identical LCD monitors (Samsung EX1920W) while the remaining sides were lined with nonreflective white paper. The two monitors were clones controlled by one computer but the two regions of the monitors that were adjacent to the tank were different, allowing independent control over the images displayed to the two sides. All stimuli were generated and presented using custom Matlab (Mathworks, Natick, MA, U.S.A.) software; the operating system (Linux) simultaneously controlled a webcam located above the test tank (44 cm from the water surface) and acquired images of 320 × 240 pixels at 4 Hz (see Supplementary Movie S1). These images were stored on the hard drive for automated offline analysis (see below). To tailor image quality to the tracking algorithm, as well as to avoid the fish inspecting irrelevant features lying above the tank, the sides of the test tank were raised 24 cm above the water surface using black nonreflective cardboard and indirect lighting was generated by a halogen lamp.
Each fish was tested only once for a given experimental condition and stimulus generation/data acquisition were automatically controlled by computer software; after placing the fish in the test tank and launching the software, I would leave the room and return at the end of the experiment to repeat the process for a different fish. After testing, fish were returned to the breeding stock. Ethical approval for all the research reported in this study was obtained from the University of Aberdeen Ethical Review Committee. The work was deemed as nonregulated by the Home Office Inspector; however, input was received from the Home Office Inspector and the Named Veterinary Surgeon and the care of all fish was under the remit of the Animals (Scientific Procedures) Act 1986. No animal licence was required because the behavioural procedures used here were harmless and only involved wild-type animals.

Visual Stimuli and Presentation Protocol
The footage shown in Supplementary Movie S2 was obtained by filming wild-type zebrafish from the same colony that comprised the test fish. In addition, synthetic movies (Supplementary Movies S3–S5) were generated by adding small images of a zebrafish, a manipulated zebrafish or a needlefish, Xenentodon, to a grey background. I refer to these images as 'icons' and illustrate the procedure for the movie shown in Supplementary Movie S3; identical procedures were adopted for the other movies. Individual icons were initially placed within the image at random spatial locations and made to drift horizontally at a constant speed of 6.5 cm/s. Half the icons faced left and half faced right; half moved to the left and half to the right. Icons that were facing left (right) were also moving left (right) in the congruent condition; the opposite pairing was adopted for the incongruent condition (this was simply obtained by playing the movie backwards). When two icons overlapped within the image, the icon added more recently was painted over the other icon (partial occlusion, see Supplementary Movie S3). All movies lasted 16 s and were generated using a cyclical structure: the end of the movie matched the beginning of the movie, so that the movie could be played smoothly for many repetitions without glitches. The footage clip was similarly selected so that the first and last images were almost identical (see Supplementary Movie S2), resulting in a smooth transition during repetition (no detectable glitch). Each phase (test/baseline) lasted 8 min (30 movie cycles). The movie presented on one end of the tank was 8 s out of phase with the movie presented on the other end; this means that even during baseline phases, when the same movie was presented on both ends, two different portions of the movie were presented at a given time.
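The construction of the synthetic movies can be illustrated with a short sketch. It is not the author's Matlab code: the display width (26 cm, chosen so that 16 s of drift at 6.5 cm/s equals exactly four screen widths), the frame rate, the wrap-around at the screen edge and all function names are assumptions made for illustration. Only the drift speed, the movie duration, the left/right split and the 8 s offset come from the text.

```python
import random

SPEED_CM_S = 6.5   # drift speed stated in the text
MOVIE_S = 16.0     # movie duration stated in the text
WIDTH_CM = 26.0    # hypothetical display width: 6.5 * 16 = 104 cm = 4 full wraps
FPS = 10           # hypothetical frame rate


def make_icons(n_icons, rng):
    """Create icons at random start positions; half face and move right, half left."""
    icons = []
    for i in range(n_icons):
        direction = 1 if i % 2 == 0 else -1  # +1 = rightward, -1 = leftward
        icons.append({
            "x0": rng.uniform(0.0, WIDTH_CM),
            "y0": rng.uniform(0.0, 1.0),
            "facing": direction,     # congruent movie: facing matches motion
            "direction": direction,
        })
    return icons


def positions(icons, t):
    """Horizontal position of each icon at time t, wrapping at the screen edge."""
    return [(ic["x0"] + ic["direction"] * SPEED_CM_S * t) % WIDTH_CM for ic in icons]


def render_movie(icons):
    """One loop of the movie as a list of frames (each frame is a list of x positions)."""
    n_frames = int(MOVIE_S * FPS)
    return [positions(icons, k / FPS) for k in range(n_frames)]


def frame_for_monitor(k, offset_s=0.0):
    """Frame index shown at tick k on a monitor whose loop is shifted by offset_s."""
    n_frames = int(MOVIE_S * FPS)
    return (k + int(offset_s * FPS)) % n_frames


rng = random.Random(0)
icons = make_icons(12, rng)
congruent = render_movie(icons)
incongruent = congruent[::-1]  # same frames played backwards
```

Because the total drift over one loop is a whole number of screen widths, the last frame leads back into the first without a jump. Reversing the frame order flips each icon's direction of motion while leaving the direction it faces unchanged.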
When different movies were presented on the two ends (test phase), the movie presented on a given monitor was alternated between monitors from fish to fish to eliminate potential lateral bias (all data were realigned to the same notional side for analysis and presentation purposes). Any such bias would also be factored out by subtracting the baseline phase from the test phase (Figs 1–4); however, in practice there was no significant bias during the baseline phase (Supplementary Fig. S1). I retained this phase in all experiments for two reasons: (1) it enabled me to confirm, on an experiment-by-experiment basis, that the apparatus and procedures were correctly calibrated (i.e. unbiased); (2) it allowed the fish to acclimatize and recover from the stress of being caught. The baseline phase displayed the original movie for experiments shown in Fig. 3, the baseline phase displayed the congruent zebrafish stimulus for experiments involving zebrafish stimuli (black/yellow/red symbols) and the congruent needlefish stimulus for experiments involving needlefish stimuli (blue/magenta symbols); for the form-only experiments in which both zebrafish and needlefish stimuli were presented during the test phase (green symbols in Fig. 3b), the baseline phase displayed blank screens.

Movie Tracking
I wrote software specifically tailored to the images collected during the experiments; the algorithm was therefore robust and efficient in the absence of any human intervention (see Supplementary Movie S1). The software relied on standard subtraction methods for motion detection (McIvor 2000): the average image was computed across all 16 min of movie recording (baseline plus test phases) and subtracted from each individual frame. The software then applied a threshold of 6× the standard deviation of intensity values within each frame and performed cluster analysis of the thresholded image around the location of minimum intensity (fish image was dark). The resulting cluster was selected (red-tinted pixels in Supplementary Movie S1) and its centroid coordinates were used as position pointers for the test animal (yellow cross in Supplementary Movie S1). After the animal's position had been identified on every frame, the software automatically selected (via edge detection) an active area for the test tank (indicated by blue rectangle in Supplementary Movie S1) and rescaled all longitudinal positions to range between 0 and 1 within this region (so that 0.5 corresponded to equidistance from the two monitors). In Fig. 1 the active area is indicated by the outer rectangle and the individual tracked positions by dots.

Zebrafish Discriminate the Movie Played Backwards
Individual zebrafish were placed inside a test tank (rectangular area in Fig. 1b, e) fitted with two visual displays on opposite ends. The position of the fish was monitored via an automated videotracking procedure (Cachat et al. 2010; see Methods and Supplementary Movie S1). When footage of other zebrafish was shown on one end ('test' phase), the fish displayed a robust tendency to spend more time within that region of the tank (Rosenthal & Ryan 2005; Engeszer et al. 2008; Miller & Gerlai 2011; Fig. 1d–f; see Supplementary Movie S2 for stimulus examples). No such biased behaviour was observed when the same movie was shown on both ends ('baseline' phase, Fig. 1a–c). This result was confirmed across the sample population by computing the difference in mean longitudinal position between test and baseline phases for each fish: data points (one per fish) in Fig. 1g fall to the left of the vertical dashed line at P < 10⁻⁷ on a two-tailed t test (the same conclusions were reached when test and baseline phases were analysed separately, see Supplementary Fig. S1).
I then presented the intact movie on one end of the tank and a manipulated movie on the other end. The first manipulation consisted of flipping the movie upside-down (see Supplementary Movie S2) and was motivated by the established result that inversion impairs the perception of agency in human vision (Neri et al. 2007). In contrast to humans and other vertebrates (Vallortigara & Regolin 2006), zebrafish did not discriminate between upright and upside-down images of conspecifics (Fig. 1h). The second manipulation consisted of playing the movie backwards. This manipulation left form information unaltered (the same set of static pictures was displayed) and motion information virtually unaltered (with the exception that accelerating motions became decelerating and vice versa); form and motion therefore had to be integrated for the fish to distinguish the intact movie from the one played backwards, in line with current thinking about the perception of agency as representing a paradigmatic example of form–motion binding in human vision (Oram & Perrett 1996; Giese & Poggio 2003; Ibbotson 2007; Nishida 2011). Zebrafish were able to carry out this discrimination and demonstrated preference for the intact movie (Fig. 1i).
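The population analysis described in this paragraph amounts to a one-sample t test on the per-fish shift in mean position. The sketch below shows the arithmetic only. The function names are hypothetical, and placing the stimulus of interest at position 0 is an assumption, chosen so that a preference appears as a negative shift. The P value is not computed because it requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def realign(track, stimulus_on_right):
    """Mirror longitudinal positions so the stimulus of interest always sits at x = 0."""
    return [1.0 - x for x in track] if stimulus_on_right else list(track)


def shift_per_fish(baseline_tracks, test_tracks):
    """Mean position during test minus mean position during baseline, one value per fish."""
    return [mean(t) - mean(b) for b, t in zip(baseline_tracks, test_tracks)]


def one_sample_t(values, mu=0.0):
    """t statistic and degrees of freedom for the null hypothesis that the mean equals mu."""
    n = len(values)
    t = (mean(values) - mu) / (stdev(values) / sqrt(n))
    return t, n - 1
```

A two-tailed P value would then be read from the t distribution with n - 1 degrees of freedom, for instance with a statistics package.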

Zebrafish Prefer Congruent Form and Motion
The footage stimulus preserved many aspects of the sensory stimulation experienced by the fish during natural vision, but was difficult to manipulate in a detailed and controlled fashion. I therefore resorted to using computer-generated stimuli (Saverino & Gerlai 2008) containing a side-view image of a zebrafish that was artificially moved along a rectilinear, constant-velocity trajectory. The final movie contained 12 such synthetic zebrafish, six moving to the left and six to the right (Fig. 2a). As a first step towards validating this artificial stimulus, I successfully replicated the results obtained using real footage for detection (Fig. 2c), inversion (Fig. 2d) and time-reversal (red symbols in Fig. 2e), and verified that the choice of baseline stimulus was unimportant (magenta/yellow symbols in Fig. 2e). I also found that zebrafish showed positive preference for the incongruent stimulus in a detection experiment (Fig. 2f), demonstrating that both congruent and incongruent stimuli were preferred by the fish in the absence of a competing stimulus. However, when the two stimuli were directly pitted against each other, the fish showed a marked preference for the congruent condition (Fig. 2e).
Reversing the artificial stimulus left all motion information unaltered (this was not exactly the case for the footage owing to the presence of accelerating/decelerating movements): both intact and reversed movies contained six rightward-moving and six leftward-moving objects (Fig. 2a, b); current models of biological motion detectors are insensitive to featural shape information (Adelson & Bergen 1985), so for example a rightward-preferring motion-energy detector will respond equally to a rightward-moving fish shape regardless of whether the fish shape is oriented to face right or left (Feldman & Tremoulet 2006). Form was equally preserved: both movies contained six rightward-facing and six leftward-facing fish shapes. The difference between the two movies lay solely in the conjunction of motion and form: congruent in one case (the fish image faced the same direction in which it moved; Fig. 2a), incongruent in the other (Fig. 2b; see also Supplementary Movie S3). The result obtained with the artificial stimulus demonstrates unambiguously that the fish were able to bind form and motion. In related experiments with human participants, reversing the heading direction associated with ambiguous figures often biases object interpretation (Bernstein & Cooper 1997).
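The logic of this argument can be checked with a small bookkeeping sketch. Each synthetic fish is reduced to a (facing, motion) pair, an abstraction introduced here for illustration only, and the two movies are compared feature by feature. The counts of six and six come from the text; the names and the +1/-1 coding are assumptions.

```python
from collections import Counter

RIGHT, LEFT = +1, -1

# Each icon is a (facing, motion) pair; the congruent movie has matching signs.
congruent = [(RIGHT, RIGHT)] * 6 + [(LEFT, LEFT)] * 6
# Playing the movie backwards flips motion but leaves facing untouched.
incongruent = [(facing, -motion) for facing, motion in congruent]


def form_counts(movie):
    """How many icons face each way (form considered alone)."""
    return Counter(facing for facing, _ in movie)


def motion_counts(movie):
    """How many icons move each way (motion considered alone)."""
    return Counter(motion for _, motion in movie)


def conjunction_counts(movie):
    """Product of facing and motion: +1 for a congruent icon, -1 for an incongruent one."""
    return Counter(facing * motion for facing, motion in movie)
```

Under this coding the form counts and the motion counts are identical for the two movies, and only the conjunction counts differ. This is the exclusive-or structure that renders each feature alone ineffective as a cue.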

Feature Conjunction, But Not Discrimination, Depends on Shoal Size
Shoaling of an individual towards a conspecific shoal may depend on shoal size (Pritchard et al. 2001). I found that detection showed no such dependence (black data in Fig. 3a), while the conjunction of form and motion showed a marked dependence on shoal size (red data in Fig. 3c). A possible explanation for this result is that one of the two features (either form or motion) ceased to be perceptually detectable at small shoal sizes: detection would still be possible by relying on the other feature, while binding would fail. I carried out two additional experiments to test this possibility.
The role of form was selectively examined by presenting synthetic needlefish (known to share habitat with zebrafish, Engeszer et al. 2007) on the competing end of the tank (see icons at bottom of Fig. 3b and Supplementary Movie S3). Zebrafish and needlefish stimuli delivered comparable motion signals; however, different shapes were presented on opposite ends of the tank. Zebrafish shoaled towards their conspecifics (Ward et al. 2003) for all shoal sizes tested (green data in Fig. 3b), demonstrating that the visual feature of form (comprising shape, colour and size) remained discriminable independently of shoal size.
I probed the role of motion by presenting the intact synthetic zebrafish movie on one end and still images randomly sampled from the same movie every 16 s on the other end. Over the course of the test phase (8 min) the same average number of shapes was therefore presented on both ends of the tank. However, only the intact movie contained motion. Zebrafish showed robust preference for the moving stimulus; similarly to form, the visual feature of motion remained discriminable independently of shoal size (yellow data in Fig. 3b).
From the above two experiments I conclude that both form and motion could be discriminated by the zebrafish for small shoal sizes. This result implies that when only two synthetic fish were presented, the fish was able to discriminate both their form and their motion (bottom data points in Fig. 3b) but was nevertheless unable to discriminate the conjunction of these two features (bottom red point in Fig. 3c): feature binding failed despite both features being independently available to the sensory system of the fish. As the number of stimulus fish was increased, feature binding became successful and the task of discriminating congruent versus incongruent movies could be performed (top red data point in Fig. 3c).

Response to Images of Heterospecific Animal
It is important to establish whether the feature-binding ability documented so far is flexible and capable of generalization, or whether it is rigidly connected with the specific stimulus configuration used in the previous experiments. As a first step towards answering this question I repeated the experiments for a different synthetic stimulus fish, the needlefish used earlier (see Supplementary Movie S3). The results for detection were similar to those obtained with synthetic zebrafish (blue data in Fig. 3a); those for conjunction showed similar dependence on shoal size but were in the opposite direction (zebrafish shoaled towards the incongruent stimulus; see magenta data in Fig. 3c).
When taken together, the results detailed above provide evidence that feature binding in the zebrafish is a qualitatively different phenomenon from the perception of individual features: while preference for the congruent movie against a blank screen was similar regardless of the identity of the stimulus fish and showed little or no dependence on shoal size (Fig. 3a), preference for the congruent versus incongruent conjunction was qualitatively different depending on the identity of the stimulus fish and showed a marked dependence on shoal size (Fig. 3c). One could speculate that these phenomena may be supported by different neural structures. It is relevant in this respect that human patients with parietal lesions experience illusory conjunctions of form and motion: when presented with a letter 'X' moving vertically and a letter 'T' moving horizontally, they occasionally perceive a letter 'X' moving horizontally and a letter 'T' moving vertically (Bernstein & Robertson 1998).

Local Processing and Invariance to Orientation/Direction
I created a version of the synthetic fish in which head and tail swapped positions (see icon in Fig. 4a). In one movie (locally congruent) the fish moved in a direction that matched the local orientation of its subparts (head/tail) but not their global arrangement (the tail moved in front of the head; left icon in Fig. 4a); conversely, in the other movie (globally congruent) the fish moved in a direction that matched the relative arrangement of head and tail (head moved in front of the tail) but was opposite to their local configuration (right icon in Fig. 4a). By pitting these two stimuli against each other, I could determine whether zebrafish encode stimuli in global terms, that is, as consisting of head and tail regardless of the local details of each, or in local terms, that is, with relation to how each subcomponent is oriented and regardless of how they are arranged with respect to each other. The results showed that feature binding was supported by the local configuration of the stimulus (Fig. 4a).
I then tested the role of head and tail separately. Feature conjunction was supported by head-only stimuli (Fig. 4c) but not tail-only stimuli (Fig. 4b), indicating that the critical component of the visual stimulus is represented by the head. Using this minimal stimulus, I then asked whether feature binding generalized to other shape orientations and motion directions (see Supplementary Movie S5). Zebrafish displayed shoaling preference for the congruent conjunction of form and motion when the synthetic fish heads were oriented vertically (Fig. 4e) as well as diagonally (Fig. 4d), demonstrating remarkable flexibility of the underlying visual computation (see Ewert et al. 1979 and the General Discussion for related results in toads).

GENERAL DISCUSSION
Visually driven preferential shoaling in zebrafish has been demonstrated on many occasions in the literature (Miller & Gerlai 2011), but the results from all these previous studies can be explained in terms of selectivity for a single visual feature (e.g. body stripes: Rosenthal & Ryan 2005; Engeszer et al. 2008). This explanation is unable to account for the preferential response to the congruent stimulus documented here; the results detailed earlier provide the first conclusive demonstration of perceptual feature binding in a teleost fish. Because the phenomenon reported here is robust and can be demonstrated using fully automated procedures on a relatively small number of animals (ca. 10), it represents a promising avenue of investigation for high-throughput genetic characterization and manipulation (Muto et al. 2005; Friedrich et al. 2010; Norton & Bally-Cuif 2010). It is also noteworthy that it relates to the perception of agency with associated shoaling preference and that it applies not only to synthetic stimuli (Fig. 2e) but also to more ecologically valid ones (Fig. 1i), underscoring both its robustness and its immediate relevance to the animal's social behaviour (Miller & Gerlai 2011). The class of associative perceptual operations probed by the experimental paradigm used here is distinct from other forms of complex associations that have been studied in animal behaviour such as delayed-symbolic-match-to-sample tasks (Alsop et al. 1995). The ability to perform the latter class of tasks demonstrates that the animal supports memory-based groupings of different features (D'Amato et al. 1985); for example honeybees can be trained to associate (via reward) specific colours with specific orientations presented at different times (Zhang et al. 1999), and even to assign different choice patterns to multiple combinations of orientations and shapes (Fauria et al. 2000).
In related experiments, spatial navigation in fish has been shown to rely on both geometric and featural (nongeometric) information in a conjoined fashion (Sovrano et al. 2002; Vargas et al. 2004). These notable results have important implications for memory and learning (Srinivasan 2010) but not necessarily for perception: they do not imply that the animal is able to discriminate the conjunction of those two features visually in the absence of other (e.g. single-feature) visual cues. The protocol used here was designed to probe the strictly perceptual nature of feature binding in the sense of classical work on visual exclusive-or (XOR) classification (Shepard et al. 1961) in which binding information is selectively isolated by rendering individual features ineffective as cues for discrimination.
Previous work with nonmammals has provided suggestive evidence that they may support this kind of classification, but has never tested it directly. For example, in what is perhaps the series of experiments that has come closest to addressing this topic, honeybees have been shown capable of learning stimulus–reward associations involving combinations of colour and orientation of simple visual patterns (Schubert et al. 2002). These results suggest that honeybees support associations between different features, but the bees were not required to perform perceptual discriminations involving compulsory feature binding (all stimulus pairs to be discriminated differed with respect to at least one visual feature). This ability has been tested directly in humans (Shepard et al. 1961; Neri & Levi 2006) and nonhuman primates (Smith et al. 2004), but not in other vertebrates or insects. In order for this ability to be tested directly, the same visual pattern should contain both features (e.g. orientation and colour) and each feature should appear in both configurations (e.g. vertical and horizontal for orientation, blue and yellow for colour); the way in which the two features are then combined into separate objects (e.g. vertical yellow objects, horizontal blue objects) becomes the sole cue for discrimination (Shepard et al. 1961).
The above specifications are not merely of academic interest, but rather fundamental to the notion of binding: they emphasize the importance of demonstrating that the animal is able to bind features to specific objects using spatial proximity as the cohesive element defining distinct objects (Treisman 1996; Wolfe & Cave 1999). In previous studies in which preference was demonstrated for the presence of multiple features (e.g. Ewert et al. 1979; Fauria et al. 2000; Smith et al. 2004) it was not established whether such preference was conditional upon two features belonging to the same visual object: preference may have been driven by the presence of the two relevant features regardless of whether these were arranged in a manner consistent with an object-bound interpretation. For example, Ewert's pioneering work on toads has demonstrated that these animals show preference for a segment oriented along the direction in which it is moving (Ewert et al. 1979; as opposed to orthogonal to it). This preference was invariant to the chosen direction/orientation, a result that is clearly related to the one demonstrated in Fig. 4c–e. These experiments have provided suggestive evidence that toads are able to bind form and motion, but have not established whether they are able to exploit the binding information when all other cues are selectively eliminated, and have not provided clear indications as to whether orientation and motion must belong to the same object for the toad to operate perceptual binding. In the experiments reported here, binding to a spatially defined object was critical for supporting discrimination: the differential pairing between a specific direction of motion (e.g. leftward-moving) and the underlying shape (e.g. rightward-facing) was defined locally within the image at the level of individual objects.
In the absence of object-based spatial binding, congruent and incongruent stimuli are not discriminable (they are matched with respect to both form and motion signals); this fundamental binding property is selectively targeted by variants of the XOR classification protocol (Shepard et al. 1961).
It may be argued that the very definition of the binding problem relies on the prior notion that different visual features, such as form and motion, are analysed separately in the brain of the animal (Roskies 1999), at least in the early stages of the sensory process. As mentioned earlier, this notion is supported by substantial evidence in primates (Zeki & Shipp 1988; Treisman 1996). Although we lack sufficient physiological evidence to be certain that it would also apply to fish, existing measurements from motion-sensitive neurons in goldfish, Carassius auratus auratus (Masseck & Hoffmann 2009), dogfish, Scyliorhinus canicula (Masseck & Hoffmann 2008), trout, Salmo gairdneri (Klar & Hoffmann 2002) and zebrafish (Sajovic & Levinthal 1982) all indicate that these animals possess dedicated centres for processing visual movement with similar properties to those studied in tetrapods. Furthermore, behavioural responses to motion signals in the zebrafish are consistent with the motion energy model across a diagnostic range of stimulus configurations (Orger et al. 2000), underscoring the general applicability of this modelling framework. It therefore seems likely that the conceptual structure underlying the binding problem is relevant to the visual system of fish. In this context, it should be noted that in the experiments reported here I observed qualitative differences between feature discrimination and feature conjunction (Fig. 3), suggesting that it is reasonable to treat them as separate operations.
Earlier research has shown that zebrafish can detect second-order motion (Orger et al. 2000), a visual function previously hypothesized to require advanced neural circuitry (Baker 1999; see Sovrano & Bisazza 2008 for examples of higher visual function in other fish species); however, the visual ability reported here is of a substantially different scale in its complexity. The only known neural structure able to support the level of discrimination required to explain the present results is the superior temporal polysensory area (STPa), a higher-level associative region of the monkey brain in which neurons respond preferentially to congruent form and motion of walking agents (Oram & Perrett 1996). The complexity of this cortical network (Van Essen et al. 1992) should be contrasted with the far simpler response properties documented for visually responsive neurons in the zebrafish optic tectum (Sajovic & Levinthal 1982; McDowell et al. 2004; Del Bene et al. 2010) and the associated neuronal resource which, at an estimated 100 000–200 000 cells (Kawai et al. 2001; Hill et al. 2003), is comparable in size to the minimal functional unit of monkey primary visual cortex (Hansel & Sompolinsky 1996; Horton & Adams 2005; the hypercolumn). It is possible that zebrafish are equipped with single cells that respond selectively to both direction and orientation of visual stimuli, thus integrating motion and form in an analogous manner to STPa neurons in the monkey (Oram & Perrett 1996); future electrophysiological experiments will hopefully clarify this issue.
The behavioural results reported here show that, notwithstanding the vast disparity in available circuitry between primate and teleost, the zebrafish brain can compete in its capability for visual function; it therefore appears that the algorithm underlying feature binding can be implemented with substantially less neural resource than posited by existing theories of functional architecture in neural systems (Treisman 1996; Shafritz et al. 2002; Robertson 2003).