Abstract
Understanding how the human visual system develops is crucial to understanding the nature and organization of our complex and varied visual representations. However, previous investigations of the development of the visual system using fMRI are primarily confined to a subset of the visual system (high-level vision: faces, scenes) and relatively late in visual development (starting at 4–5 years of age). The current study extends our understanding of human visual development by presenting the first systematic investigation of a mid-level visual region [the lateral occipital cortex (LOC)] in a population much younger than has been investigated in the past: 6 month olds. We use functional near-infrared spectroscopy (fNIRS), an emerging optical method for recording cortical hemodynamics, to perform neuroimaging with this very young population. Whereas previous fNIRS studies have suffered from imprecise neuroanatomical localization, we rely on the most rigorous MR coregistration of fNIRS data to date to image the infant LOC. We find surprising evidence that at 6 months the LOC has functional specialization that is highly similar to adults. Following Cant and Goodale (2007), we investigate whether the LOC tracks shape information and not other cues to object identity (e.g., texture/material). This finding extends evidence of LOC specialization from early childhood into infancy and earlier than developmental trajectories of high-level visual regions.
SIGNIFICANCE STATEMENT Understanding visual development is crucial to understanding the nature of visual representations in the human brain. Previous studies of visual development have investigated children (4 years and older) and high-level visual areas. This study expands our knowledge of visual development by investigating the functional development of mid-level vision [lateral occipital cortex (LOC)] early in infancy. We find surprisingly adult-like functional specialization of the LOC by 6 months of age: infants exhibit shape selectivity, but not object selectivity, in this region.
Introduction
In the last 15 years, fMRI has been used to reveal the incredible complexity and specialization in the adult human visual system, with regions exhibiting preferential activation for low-level (e.g., color, motion), mid-level (e.g., shape, objects), and high-level (e.g., letters, faces, places) visual features (Grill-Spector and Malach, 2004; Wandell et al., 2005). However, the meaning of these differential patterns of activation is widely debated (e.g., what does the fusiform face area represent?; McKone et al., 2007; McGugin et al., 2012). (Note: The meaning of these segregated levels of vision is also debated given the bidirectional flow of information in the adult brain.) A fresh approach to these entrenched debates is to investigate the development of the visual system, but this work has been highly restricted in two ways. First, since fMRI experiments are prohibitively difficult with awake populations below 4 years of age (for a rare exception, see Biagi et al., 2015), fMRI studies have been restricted to mid-to-late childhood (e.g., 5–12 years; Golarai et al., 2007; Scherf et al., 2011; Cohen Kadosh et al., 2013). By contrast, behavioral research has established that there are rapid changes in visual perception in the first year of life (Aslin and Smith, 1988; de Haan and Nelson, 1999; Arterberry and Kellman, 2016). Thus, these studies miss the periods with the most rapid development. Second, there are very few studies that have investigated low-level (e.g., retinotopic organization; Conner et al., 2004) and mid-level (e.g., object perception; Dekker et al., 2011) vision. These studies have reported subtle developmental differences, but this is likely because they have only investigated older children and have missed the crucial, early developmental stages of these abilities.
We present the first systematic investigation of a mid-level visual region [the lateral occipital cortex (LOC)] in a population much younger than has been investigated in the past: 6 month olds. We use an emerging neuroimaging modality called functional near-infrared spectroscopy (fNIRS), which uses optical methods to record the same physiological signals as fMRI but in a way that allows infants to be able to move, interact with caregivers, and, importantly, attend to visual stimuli (Gervain et al., 2011; Aslin et al., 2015).
Previous research has suggested that the LOC is functionally similar to adults by 5–8 years of age (Golarai et al., 2007; Scherf et al., 2007; Dekker et al., 2011). To functionally localize the LOC, these studies used the classic comparison of activation to various objects compared with their scrambled counterparts (Grill-Spector et al., 1999). However, it is well known that object identity can be conveyed through many different cues (e.g., shape, color, and texture; Nicholson and Humphrey, 2003), and this localizer includes many of these cues to object identity. Subsequent neuroimaging work has suggested that the LOC is particularly sensitive to object shape: Cant and Goodale (2007) presented adults with various objects distinguished by either shape or texture/material and found that the adult LOC is modulated by changes in shape but not texture. Here, we investigate whether the infant LOC is also tracking object identity and, furthermore, whether it is using shape as the major cue to object identity.
Crucially, we conducted the most rigorous anatomical localization for fNIRS data to date. Based on a meta-analysis that revealed a highly consistent anatomical location of the LOC in adults, we used MR coregistration to determine the anatomical location of each fNIRS channel, based on age- and head-size-matched templates, thereby selecting only those channels sampling the same anatomical location in the infant brain. Visual feature selectivity was investigated using repetition suppression (RS; Grill-Spector et al., 1999; Vuilleumier et al., 2002; Eger et al., 2008).
Materials and Methods
Participants
Six-month-old infants were recruited for two experimental conditions that varied one of two visual features: shape or texture. In the Shape group, 19 infants were included in the final sample (M = 5.78 months; SD, 0.77; range, 4.9–7.0; six males; 1 Hispanic, 18 non-Hispanic; 18 white and 1 “other” ethnicity). An additional five infants were excluded because of poor signal quality (three infants) or failure to watch to the minimum criterion (two infants). In the Texture group, 22 infants were included in the final sample [M = 5.90 months; SD, 0.72; range, 5.1–7.4; 10 males; all non-Hispanic: 21 white and 1 biracial (white and Asian)]. An additional four infants were excluded from the final sample for failing to watch to the minimum criterion (three infants) or because of experimenter error (one infant).
Stimuli and experimental design
Two sets of visual stimuli were designed to disentangle LOC sensitivity to two different cues to object identity (shape and texture). One set varied in shape as well as color while holding texture constant, whereas the other set varied in texture and color while holding shape constant. Thus, for both sets, color was a relevant feature, but only shape or texture was a relevant feature in their respective conditions (Fig. 1). These sets of stimuli were presented in a between-subject design (to avoid interference between stimulus types across blocks and global habituation relevant to stimulus dimensions). A given infant only viewed one set of stimuli but viewed these stimuli in either repeated or variable blocks. All stimuli had two eyes that varied in gaze direction to promote the infant's interest in the stimuli. However, the gaze direction did not vary for a given stimulus, rendering stimulus repetition unconfounded by changes in gaze direction for the two stimulus sets.
Stimuli were presented in Repeated (a single stimulus presented eight times) and Variable (eight different stimuli) blocks. Each stimulus was presented for 750 ms with a 250 ms interstimulus interval. Infants could view eight different Repeated blocks (one per stimulus) and eight Variable blocks (each with a shuffled order of the eight stimuli). Between these 8 s blocks, a jittered interblock interval of 4–9 s allowed the fNIRS activation to partially return to baseline (Plichta et al., 2007). Although under ideal circumstances the baseline would contain no stimuli, it is not possible to maintain infants' attention and avoid their tendency to become fussy in the absence of any stimulation.
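The block structure described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MATLAB/Psychtoolbox code: the function names are hypothetical, the uniform draw for the interblock interval is an assumption (the text reports only the 4–9 s range), and the fully shuffled block order is likewise assumed. Only the visual blocks are modeled.

```python
import random

STIM_MS = 750   # stimulus duration (from the text)
ISI_MS = 250    # interstimulus interval (from the text)
N_STIM = 8      # stimuli per block (from the text)

def block_duration_s():
    """Duration of one block in seconds."""
    return N_STIM * (STIM_MS + ISI_MS) / 1000.0

def make_schedule(stimuli, rng):
    """Build a shuffled list of (block_type, stimulus_sequence, gap_s) tuples."""
    blocks = []
    for stim in stimuli:
        # Repeated block: one stimulus shown eight times.
        blocks.append(("repeated", [stim] * N_STIM))
        # Variable block: all eight stimuli in a shuffled order.
        order = list(stimuli)
        rng.shuffle(order)
        blocks.append(("variable", order))
    rng.shuffle(blocks)
    # Assumption: gap drawn uniformly from 4-9 s; the text gives only the range.
    return [(kind, seq, rng.uniform(4.0, 9.0)) for kind, seq in blocks]

schedule = make_schedule(list(range(8)), random.Random(0))
```

Eight stimuli at 750 ms plus a 250 ms interval each give the 8 s block duration stated in the text, and a full session under these assumptions contains at most 16 visual blocks.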
Thus, low-salience fireworks and calming music were presented during these baseline intervals (Emberson et al., 2015). Stimuli were presented on a Tobii 1750 eye tracker (screen measuring 33.7 × 27 cm) using MATLAB for Mac (R2007b) and Psychtoolbox (3.0.8 Beta, SVN revision 1245). Each stimulus subtended 15.2° of visual angle (11.5 cm with the infant sitting ∼43 ± 5 cm from the screen).
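The reported visual angle follows from the standard formula θ = 2·arctan(s / 2d), where s is the stimulus size and d the viewing distance. The short check below takes 43 cm as the nominal viewing distance; the function name is illustrative.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# Values from the text: 11.5 cm stimulus, nominal 43 cm viewing distance.
angle = visual_angle_deg(11.5, 43.0)

# The reported +/- 5 cm variation in distance changes the angle noticeably.
angle_near = visual_angle_deg(11.5, 38.0)
angle_far = visual_angle_deg(11.5, 48.0)
```

At the nominal distance this gives approximately 15.2°, matching the reported value; the angle is larger when the infant sits closer and smaller when the infant sits farther away.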
The experiment was conducted in a darkened room with black floor-to-ceiling curtains surrounding the infant and the caregiver. Each infant sat on their caregiver's lap. Only the monitor (Tobii eye tracker) was visible to the infant. The experiment ended when infants watched all possible blocks or were judged by the experimenter to no longer consistently attend to the stimuli (e.g., became interested in an object other than the monitor as judged by an experimenter viewing the caregiver and infant from a video camera underneath the monitor). In the Shape condition, infants were also presented with two blocks of auditory stimuli (Repeated and Variable: novel, nonspeech auditory sounds). These four block types were presented in shuffled order throughout the experiment (the auditory data are not reported here). In the Texture condition, only visual blocks were presented in the same manner (shuffled order of two blocks continued until the end of the experiment). Previous research has shown that the inclusion or exclusion of additional auditory blocks does not affect visual repetition suppression effects in infants of this age using fNIRS (Emberson et al., 2017).
fNIRS recordings and preprocessing
fNIRS data were collected using a Hitachi ETG-4000. The NIRS cap included two 3 × 3 arrays, each consisting of five emitters and four detectors. The cap was placed so that one 3 × 3 array was centered over the left ear and the other was centered at the midline on the back of the head with the most ventral row over the inion. This cap position was chosen based on which NIRS channels were most likely to record from temporal and occipital cortex in infants. Although there were 24 possible fNIRS channels, because of curvature of the infant head, a number of channels did not provide consistently good optical contact across infants (the most dorsal channels for each pad). We did not consider the recordings from these channels in subsequent analyses and only considered a subset of the channels that had good optical contact across the entire population (seven for the lateral pad over the ear and five for the pad at the rear of the head). Caregivers were not prevented from viewing the stimuli but were instructed to refrain from influencing their children, only providing comfort if needed and keeping them from either grabbing at the cap or rubbing their head against the caregiver. Compliance with these instructions was monitored through observation by the experimenter (as reported above).
fNIRS recordings were sampled at 10 Hz. Using a serial port, marks indicating the start and end of each block were sent from MATLAB on the stimulus presentation computer to the Hitachi ETG-4000 using standard methods. The raw data were exported from the Hitachi ETG-4000 to MATLAB for subsequent analyses with HomER 2. As in the study by Emberson et al. (2016), first the raw intensity data were normalized by dividing each sample by the mean to provide a relative (percent) change (HomER 1.0 manual), and then the signal was low-pass filtered (cutoff, 3 Hz) to remove high-frequency noise such as cardiac signals. A principal component analysis (PCA) and removal of the first principal component helped reduce artifacts attributable to motion. Then, changes in optical density were calculated for each wavelength. Finally, the modified Beer–Lambert law was used to determine the changes (delta) in the concentration of oxygenated and deoxygenated hemoglobin for each channel (the procResult.dc output variable was used for subsequent analyses; as HOMER1 and HOMER2 have nearly identical preprocessing methods for the options that we have selected, see the HOMER 1 Users Guide for full details). Subsequent analyses were conducted in MATLAB (version R2015b) with custom analysis scripts.
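Two of the steps above, mean normalization of the raw intensity and conversion to a change in optical density, can be illustrated for a single channel and wavelength. This is a simplified sketch, not the HomER implementation: the function names are hypothetical, the optical density change is assumed to follow the conventional definition (the negative natural logarithm of the normalized intensity), and the filtering, PCA, and modified Beer–Lambert steps are omitted.

```python
import math

def normalize_intensity(samples):
    """Divide each raw intensity sample by the mean of the recording."""
    mean = sum(samples) / len(samples)
    return [s / mean for s in samples]

def percent_change(normalized):
    """Express normalized intensity as a percent change from the mean."""
    return [100.0 * (n - 1.0) for n in normalized]

def delta_optical_density(normalized):
    """Assumed convention: delta OD = -ln(I / mean(I))."""
    return [-math.log(n) for n in normalized]

# Hypothetical raw intensities for one channel at one wavelength.
raw = [2.0, 2.2, 1.8, 2.0]
norm = normalize_intensity(raw)
pct = percent_change(norm)
d_od = delta_optical_density(norm)
```

Under this convention a drop in detected light (more absorption) yields a positive change in optical density, which the modified Beer–Lambert law then converts into hemoglobin concentration changes.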
Anatomical localization of the LOC for ROI analyses
Following seminal work by Grill-Spector et al. (1999), the LOC has been functionally localized using a standard task consisting of “2-D black and white line drawings of objects (animals, tools and letters) alternating with scrambled versions of the same images” (Large et al., 2007, p. 131). The relative activation for these object images (animals, tools, letters) compared with scrambled versions reveals activation of the LOC, among other regions. A review of the literature revealed that across a number of studies that use either identical or nearly identical localizers, there is remarkable consistency in the coordinates of the peak activation of the LOC. Focusing only on Talairach (TLRC) coordinates and tasks that used identical localizers, we find the following coordinates: Grill-Spector (2003): 40, −72, −2; Freud et al. (2013): 4 ± 5, −75 ± 6, −6 ± 4; −47 ± 5, −75 ± 4, −8 ± 5; Large et al. (2007): −41.8 ± 4, −72.3 ± 8, −3.4 ± 5; 41.3 ± 2, −78 ± 6, −1.85 ± 3.
Given the extensive neuroanatomical change exhibited early in development, TLRC coordinates cannot simply be plotted on an infant-size MR atlas. To ensure that we localized the same anatomical region early in development (the infant LOC), we used the following procedure. The average of these adult coordinates was taken and then plotted on an adult MR template. Then, an atlas (e.g., LPBA, Hammers) was used to determine the macro-anatomical area(s) in which the coordinates fall. Fillmore et al. (2015) created versions of these atlases that are age- and head-size-appropriate for infants in the first year of life (3–12 months). The same macro-anatomical area can be identified using the correspondence of the location in the adult atlases and the infant versions of these atlases. Thus, we did not simply perform a numerical transformation of TLRC or MNI coordinates but determined the location of the infant LOC through the use of analogous macro-anatomical regions in these adult and infant atlases.
Bilateral LOC ROI.
The center location for each fNIRS channel was calculated for each infant based on the MR-localization methods reported by Emberson et al. (2015) and Lloyd-Fox et al. (2014). The distance between each channel and the LOC (as determined in the meta-analysis) was calculated. Each fNIRS channel was identified by a 3 cm separation of each light source–detector pair. Thus, the center represents the center of a larger swath of cortex that is being sampled. We determined which channels were within 3 cm of the LOC bilaterally for each infant. These channels created a bilateral LOC ROI (Fig. 1).
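The channel-selection rule can be sketched as a distance threshold applied to each infant's channel centers. The coordinates below are invented purely for illustration, and the function and variable names are hypothetical; Euclidean distance to the nearer of the two LOC centers is an assumption about how distance was measured.

```python
import math

LOC_RADIUS_CM = 3.0  # inclusion threshold from the text

def channels_near_roi(channel_centers, roi_centers, radius_cm=LOC_RADIUS_CM):
    """Return {channel_id: distance_cm} for channels within radius of an ROI center."""
    selected = {}
    for channel_id, xyz in channel_centers.items():
        # Distance to the nearer of the left and right ROI centers.
        distance = min(math.dist(xyz, roi) for roi in roi_centers)
        if distance < radius_cm:
            selected[channel_id] = distance
    return selected

# Hypothetical coordinates in cm (not taken from the study).
loc_centers = [(-4.2, -7.4, -0.4), (4.2, -7.4, -0.4)]
channels = {1: (4.0, -6.0, 0.5), 3: (4.5, -7.0, 0.0), 7: (1.0, -2.0, 5.0)}
roi_channels = channels_near_roi(channels, loc_centers)
```

Running the same selection separately for each infant yields a per-infant set of channels, which is why the number of channels contributing to the ROI can differ across infants.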
In the Shape condition, a total of 55 channels across infants were included (24 on the left, 31 on the right). For each infant, an average of 2.89 channels were included (SD, 0.88; range, 2–5). The average distance of the channels from the interpolated LOC center was 2.56 cm (minimum, 1.89; maximum, 2.98), with no difference in the mean distance for channels across the left and the right LOC (p > 0.5). Even with a precise cap placement procedure, there was a distribution of channels that localized the LOC. Indeed, the LOCs were localized using six separate NIRS channels (channel 3, 18; channel 5, 15; channel 1, 13; channel 15, 4; channel 13, 3; channel 2, 2).
In the Texture condition, a total of 58 channels across infants were included (25 on the left, 33 on the right). For each infant, an average of 2.64 channels were included (SD, 1.33; range, 1–5). The average distance of the channels from the interpolated LOC center was 2.62 cm (minimum, 2.03; maximum, 3.00), with no difference in the mean distance for channels across the left and the right LOC (p > 0.14). Again, there was a distribution of channels that localized the LOC (channel 3, 18; channel 1, 15; channel 5, 11; channel 15, 5; channel 13, 5; channel 2, 4).
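The channel tallies reported for the two conditions are internally consistent, as the following arithmetic check shows. All numbers are taken from the text (including the final sample sizes of 19 and 22 infants); only the variable names are introduced here.

```python
# Per-channel inclusion counts reported in the text: {channel: infants}.
shape_counts = {3: 18, 5: 15, 1: 13, 15: 4, 13: 3, 2: 2}
texture_counts = {3: 18, 1: 15, 5: 11, 15: 5, 13: 5, 2: 4}

shape_total = sum(shape_counts.values())
texture_total = sum(texture_counts.values())

# Left/right splits reported in the text.
shape_left_right = 24 + 31
texture_left_right = 25 + 33

# Mean channels per infant, using the final sample sizes (19 and 22).
shape_mean = round(shape_total / 19, 2)
texture_mean = round(texture_total / 22, 2)
```

The per-channel counts sum to the reported totals (55 and 58), the left and right counts sum to the same totals, and the means per infant round to the reported 2.89 and 2.64.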
Results
We find that the infant LOC exhibits significantly attenuated responses to repeated versus variable blocks of shapes. In our LOC ROI (defined as center localization of a channel <3 cm from the anatomical coordinates of LOC as defined based on an adult meta-analysis), we find a significant response relative to baseline for both repeated and variable shape blocks (all t(19) > 3.5, all p < 0.003). However, the response to variable shapes was significantly greater than the response to repeated shapes (t(19) = 1.78, p = 0.044; Fig. 1). There is no significant difference between these two conditions if all occipital channels are included in the analysis (p > 0.15).
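A repetition suppression contrast of this kind is commonly tested by comparing each infant's mean ROI response in the two block types. The sketch below assumes a paired, within-infant t test and uses invented response values; it is not the authors' analysis script, and it returns only the t statistic and degrees of freedom.

```python
import math
import statistics

def paired_t(variable, repeated):
    """Paired t statistic and degrees of freedom for variable minus repeated."""
    diffs = [v - r for v, r in zip(variable, repeated)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = statistics.stdev(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical per-infant ROI responses (arbitrary units), not study data.
variable_resp = [0.30, 0.25, 0.40, 0.10, 0.35]
repeated_resp = [0.20, 0.22, 0.25, 0.12, 0.20]
t_stat, df = paired_t(variable_resp, repeated_resp)
```

A positive t statistic indicates a larger response to variable than to repeated blocks, which is the direction expected under repetition suppression.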
Importantly, this difference is specific to shape and not to texture. In the Texture condition, there was a significant response relative to baseline for both repeated and variable blocks (all t(20) > 2.5, all p < 0.03), but activation for variable texture blocks was not significantly greater than for repeated texture blocks (t(20) = 0.13, p = 0.45). Direct comparisons across the Shape and Texture conditions (mixed ANOVA) yielded a significant effect of condition (F(1,38) = 4.96, p = 0.032) but no main effect of block type and no interaction (all p > 0.2). A priori t tests between the Shape and Texture conditions confirmed that there was a significant difference between the variable blocks (t(37.9) = 3.15, p = 0.003) but no difference between the repeated blocks (t(38) = 0.55, p = 0.59).
Having confirmed the presence of repetition suppression to repeated shapes but not to repeated textures, we further probed this finding by examining the time course of oxygenated hemoglobin responses between variable and repeated shape presentation blocks (bilateral LOC ROI; Fig. 2). Using a priori defined time bins, we compared the average response for each second of recording (10 bins) between block types; this revealed a significantly greater response to variable shape presentations starting at 10 s after stimulus onset. The slower fNIRS response is consistent with many infant studies (Aslin et al., 2015) and with the blocked design used in the current experiment (stimulus offset at 8 s). No robust differences between presentation block types were found in the Texture condition. Finally, we conducted an exploratory analysis of the lateralization of these effects (Fig. 3), which revealed that only the right LOC exhibited significant RS in the Shape condition. Although there were no significant differences between left and right regions and the localization tended to be better for the right LOC, this suggests some lateralization of the effects that warrants future investigation.
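Averaging a response into 1 s bins at the 10 Hz sampling rate amounts to taking the mean of each consecutive run of 10 samples. The sketch below is a minimal illustration with a made-up trace; the function name is hypothetical, and how the authors aligned bins to stimulus onset is not specified in the text.

```python
SAMPLE_RATE_HZ = 10  # sampling rate reported in the text

def bin_means(trace, bin_s=1.0, rate_hz=SAMPLE_RATE_HZ):
    """Mean of each consecutive bin; trailing samples that do not fill a bin are dropped."""
    per_bin = int(round(bin_s * rate_hz))
    n_bins = len(trace) // per_bin
    return [
        sum(trace[i * per_bin:(i + 1) * per_bin]) / per_bin
        for i in range(n_bins)
    ]

# Hypothetical 10 s trace sampled at 10 Hz (100 samples).
trace = [float(i) for i in range(100)]
binned = bin_means(trace)
```

Each block type then contributes one value per bin per infant, so the comparison between variable and repeated blocks can be made bin by bin.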
Discussion
We present results that represent the first investigation of a mid-level visual area in early development (6 months). Specifically, we combined the most rigorous anatomical localization of fNIRS to date and an RS paradigm to examine whether the infant LOC exhibits shape selectivity, as has been established in adults. Indeed, when a single shape was repeated eight times, we found an attenuation of hemodynamic responses compared with the presentation of eight different shapes. In contrast, when shape was held constant but texture was varied, we found no evidence of RS, confirming that this cortical area is specific to shape and not to other cues to object identity.
These findings extend the study of shape selectivity in the LOC to early infancy. Importantly, the pattern of cortical selectivity found in the LOC of young infants is remarkably consistent with previous findings in adults. Almost two decades of research has examined object selectivity in the LOC while allowing numerous cues to object identity to naturally vary (Grill-Spector et al., 1999; Grill-Spector, 2003; Large et al., 2007; Freud et al., 2013). Indeed, the standard localizer for the LOC presents highly varied objects (e.g., tools, letters, animals) compared with highly visually scrambled versions that vary many cues to object identity, notably shape and texture (both cues that have been found to affect object perception; Nicholson and Humphrey, 2003). Subsequent work has argued that the aspect of object identity that the LOC is tracking is shape. Cant and Goodale (2007) found that the LOC fails to track objects when shape is held constant (i.e., object identity varied based on material/texture/color alone without changes in shape).
It is notable to find parallel functional specialization of the LOC between adults and infants. Given the protracted development of higher-level regions of the visual system (e.g., the fusiform face area [FFA] exhibits developmental changes into adulthood; Golarai et al., 2007; Scherf et al., 2011), it was quite possible that mid-level regions like the LOC would exhibit relatively late functional emergence and strong developmental refinement. Previous work established evidence that the LOC is sensitive to object identity by 5–8 years (Golarai et al., 2007; Scherf et al., 2007; Dekker et al., 2011). However, this is quite late in visual development. Indeed, looking 5–8 years earlier, the present findings provide evidence of adult-like LOC functional specialization, suggesting that the LOC follows a rapid time course of early development. Future work is needed to examine whether even younger infants exhibit similar stimulus selectivity of this region. Moreover, following the classic studies in this literature (Grill-Spector et al., 1999), previous developmental studies used stimuli that varied on multiple features of shape, color, and material/texture and thus provide little evidence for which aspect of object identity the LOC is tracking. The current study uses fNIRS to demonstrate adult-like functional properties of LOC early in infancy and identifies the specific cue to object identity that infants are tracking: shape. However, despite finding a strong convergence between the LOC at 6 months and adulthood, it is still possible that this region exhibits developmental changes. Indeed, even after selectivity in a cortical region has been demonstrated for higher-level stimuli like faces and places, they exhibit notable developmental changes (Golarai et al., 2007; Scherf et al., 2007).
Whereas there have been previous investigations of object processing using fNIRS, these studies have not focused on uncovering shape-selective regions per se and have instead focused on how object identity modulates activity in the inferior temporal cortex (Wilcox et al., 2008, 2014; Wilcox and Biondi, 2015) or on responses to faces, as a visual category, compared with baseline nonface stimuli (Otsuka et al., 2007; Honda et al., 2010). Although the current study augments our stimuli with faces to maintain infant attention, facial information is identical across conditions (the eyes and mouth are identical for all stimuli), and although we did vary eye gaze across shapes (and thus this varies between variable and repeated blocks), this does not change across conditions. Thus, although current findings do not provide evidence about face perception, they do bear on the question of how object perception develops early in life and build from the work of Wilcox and colleagues. Broadly, Wilcox and colleagues have found that the infant inferior temporal cortex selectively tracks object-individuation events (i.e., a greater response when a new object is introduced through a change in shape compared with trials with the same types of visual movements in which shape and other individuation cues are held constant; Wilcox et al., 2008, 2014). In addition to temporal lobe recordings, Wilcox et al. (2008) included fNIRS recordings from low-level visual areas but did not target the LOC specifically. Correspondingly, they found that, in contrast to the profile of activation in the inferior temporal cortex, these low-level visual cortex areas are activated during visual events regardless of whether there are changes in object identity (e.g., as cued through shape or other cues). Later studies by this group have included some recordings from sites between their temporal and lower-level visual recording sites.
Unfortunately, it is not clear whether the regions sampled by these channels are LOC specifically. However, these recordings have provided some convergent evidence that these regions also do not track object identity based on multiple cues (Wilcox and Biondi, 2015). In summary, the current results are consistent with the work by Wilcox and colleagues on object individuation in the ventral visual stream in the first year of life. They find that regions of the mid-level visual system do not track object identity. We extend this work by integrating MR localization with fNIRS recordings to anatomically target the LOC in infants and demonstrate that this mid-level visual region is selectively responsive to object shape early in life.
Wilcox (1999) also demonstrated that the use of either color alone or texture alone is a later-developing object individuation ability. However, the current study provides strong individuation cues in the absence of shape by presenting substantial changes in both texture and color. The combination of color and texture has been shown to be readily used by young infants to individuate visual events. For example, Needham et al. (2005) found that presenting either a single exemplar or varying exemplars based on color and pattern modulated the ability of infants age 4–5 months to form object categories. Needham et al. (2005) provide direct evidence that the combination of color and texture is a sufficiently strong cue to induce object individuation for infants at the age studied here.
The early specialization of the infant visual system to shape information provides new insight into the heterogeneous behavioral evidence for shape sensitivity early in development. On one hand, in the object-recognition literature, it is well established that infants use shape information as a cue to object identity starting quite early (Wilcox, 1999). On the other hand, shape is an important cue in early word learning, and infants fail to use this cue until late into their second postnatal year (Smith et al., 2002). This latter finding suggests some kind of late-developing access to shape information. However, our current fNIRS findings demonstrate that by 6 months, infants have a visual region that is selective for shape. Although future work is needed to track the featural selectivity of the LOC from 6 months to 5–8 years, it is clear that shape information is available early in life and that shape is part of the specialized visual machinery of the developing brain. The convergence of behavioral and neural data confirms that shape information is readily available early in life and suggests that the inability to use shape in word learning likely arises from the need to establish that this cue is systematically informative (in the semantics of object naming) and not from the fidelity of the shape information being received.
In summary, starting with a meta-analysis of fMRI studies in adults using identical stimuli, we established the likely anatomical coordinates of the LOC in young infants. Using a rigorous MR coregistration of fNIRS data, we included only fNIRS channels within a small distance of the LOC. This LOC ROI, based on functional data from adults, exhibits a profile of shape sensitivity highly similar to that of the adult LOC. Together, these findings establish not only that the same functional specialization exists somewhere in the infant brain but that it exists in the same anatomical location as in adults. This finding extends our understanding of the development of mid-level vision from 5–8 years to 6 months, revealing that this mid-level visual region develops much earlier than high-level vision (e.g., faces, scenes).
Footnotes
This work was supported by Grants K99 HD076166-01A1 and 4R00HD076166-02 (Eunice Kennedy Shriver National Institute of Child Health and Human Development), Canadian Institutes of Health Research postdoctoral fellowship 201210MFE-290131-231192 (to L.L.E.), and National Science Foundation EAGER (Early-Concept Grants for Exploratory Research) Grant BCS-1514351 (to R.N.A.). We thank the infants and caregivers who volunteered their time to make this research happen. We also thank our funding sources.
Correspondence should be addressed to Lauren Emberson at the above address. lauren.emberson{at}princeton.edu