Humans appear to rely on spatial mappings to represent and describe concepts. We refer to someone who is happy as up and describe someone who is condescending as looking down upon others, and we look forward to the future or back in time. The systematic associations between concepts and space have been documented in a growing body of studies (Chasteen, Burdzy, & Pratt, 2010; Dudschig, Souman, Lachmair, de la Vega, & Kaup, 2013; Estes, Verges, & Barsalou, 2008; Fischer, Castel, Dodd, & Pratt, 2003; Gozli, Chasteen, & Pratt, 2013; Meier & Robinson, 2004; Santiago, Lupiáñez, Perez, & Funes, 2007; Weger & Pratt, 2008; Zwaan & Yaxley, 2003). Whereas this research has been predominantly descriptive, in this article, we propose and provide evidence for a potential explanatory factor for the origin of this phenomenon.

In one of the early demonstrations of such associations between concepts and space, Zwaan and Yaxley (2003) showed that the spatial layout in which words appeared influenced how efficiently they were processed, even though this dimension was not relevant to the task. The participants’ task was to determine whether two words presented on the screen, one above the other, were semantically related or unrelated. When the words were semantically related and referred to objects that possessed a canonical spatial arrangement (e.g., attic and basement), participants’ responses were faster when the words appeared in locations consistent with the spatial arrangement (i.e., attic above basement) than when the placement of the word pairs violated this arrangement (basement above attic).

In addition to affecting the processing efficiency of words, concept words appear to have the ability to orient attention. For example, Chasteen et al. (2010) presented participants with a cue word in the center of the screen, followed by a white-square target, which could appear above, below, to the left of, or to the right of the center of the screen. The cue words referred to religious deities. Participants were faster to detect the above and right targets after a God-related cue word (e.g., Lord), whereas they were faster to detect below and left targets after a Devil-related cue word (e.g., Satan). Importantly, the location of the cue had no direct relevance to the task (target detection), yet the relationship between the meaning of the cue word and the location of the subsequent target systematically influenced participants’ response speed at different locations, suggesting that the location of attention in space had been shifted (Chasteen et al., 2010). Similar effects have been reported for nonreligious words that have a positive or negative valence (e.g., champion vs. enemy), which facilitate responses to targets above and below screen center, respectively (Meier & Robinson, 2004). Furthermore, concrete object words (e.g., hat and boot) have also been found to have systematic associations with spatial locations (Estes et al., 2008). We refer to the ability of the meaning of a word to orient spatial attention as conceptual cuing.

Conceptual cuing has typically been explained within the embodied cognition framework, according to which higher cognitive processes, such as concept representation, are grounded in lower-level sensorimotor mechanisms (Barsalou, 1999, 2005, 2008; Gallese & Lakoff, 2005). This means, for example, that one’s concept of apple would draw upon sensory and motor memories of having seen, held, and tasted apples. According to this perspective, participants are faster to process the word attic above basement because of our experiences with objects conforming to this arrangement in the world around us. Similarly, attention is shifted to above locations after the word sky because we have experiences of the sky being above us (an upward affordance). While this provides a plausible explanation for effects obtained with concrete words, it is not as clear how this explains the mapping with abstract words (e.g., happy or hero), which have no obvious spatial affordance other than that conferred to them by language. While conceptual metaphor theory (Lakoff & Johnson, 1999; Landau, Meier, & Keefer, 2010) proposes that people ground abstract concepts in easier-to-comprehend concrete concepts (e.g., comparing a relationship with a journey), it does not offer a compelling explanation for the particular spatial mappings that occur. More problematically, in previous work, researchers appear to have assigned both abstract and concrete words to categories on the basis of intuition, rather than any specified criterion or independent ratings (e.g., Meier and Robinson, 2004, classified ambitious as a positive word, although this concept could plausibly carry negative connotations equally well). While these intuitive groupings have produced systematic effects, suggesting that they are tapping something meaningful, a more objective a priori criterion is required. Otherwise, the categorization of cue words as having an upward versus a downward affordance is in danger of becoming circular, that is, dependent on the effect it produces. Here, therefore, we sought to test a potential explanatory factor for conceptual cuing: language use (specifically, the frequency with which spatial terms and concept words co-occur in language).

Several features of language representation and use implicate language in the conceptual cuing effect. First, language frequently draws on spatial metaphors (Barsalou, 2008): We talk about “God above,” “feeling down” (sad), and “perking up” (happy). Second, theoretical development in linguistics has led to an increasingly rich view of the lexicon, where lexical entries are annotated with rich information about their meaning and use (e.g., Bybee, 2010). Finally, psycholinguistic data suggest that frequently occurring strings are stored in the lexicon as holistic chunks (e.g., a cup of tea, I don’t know) (Arnon & Snider, 2010; Bannard & Matthews, 2008). The suggestion is that the conceptual cuing effect may be driven by the frequent use of spatial terms with target concepts (e.g., God above); that is, participants may shift their attention upward upon processing God because they have frequently and systematically been exposed to the term in conjunction with above and up, such that this spatial information has become associated with the concept itself. This account stipulates that the magnitude of conceptual cuing is predictable from linguistic collocations of the concept word and spatial words.

Previous research has yielded some preliminary support for the notion that language use might account for concept–space mappings. It has been shown that the efficiency of semantic-relatedness judgments for vertically separated word pairs (e.g., attic–basement, as in Zwaan & Yaxley, 2003) is predicted by the frequency with which the words appeared in this order in language (Louwerse, 2008; Louwerse & Jeuniaux, 2010). That is, the speed of responses can be predicted from the frequency with which words appear in the presented order (attic–basement), as compared with the reverse order (basement–attic). In the present study, however, rather than a semantic relatedness judgment, which may prime people to the language context, we used a paradigm where participants did not directly respond to the cue word and, instead, made a judgment about a subsequent target that was unrelated to the meaning of the cue. This would provide insight into conceptual cuing, that is, the shift of attention associated with processing word meaning. Moreover, Louwerse (2008) and Louwerse and Jeuniaux (2010) used only concrete object words in their stimulus set, whereas it is conceptual cuing for abstract words that is most in need of explanation. Abstract words, by definition, are not tangible objects, and therefore, the ability of perceptual simulation or experience to create the affordance is unclear. Here, therefore, we used a classic conceptual cuing paradigm, in which participants were presented with a central cue word and then identified a target letter that appeared above or below fixation. We selected abstract and concrete cue words that had been used in previous studies, half of which were associated with a downward affordance and half of which were associated with an upward affordance. For each cue word, we calculated the differential collocation (frequency of use of the words together), that is, its collocation with spatial words for up versus its collocation with spatial words for down. This allowed us to directly compare the effectiveness of the affordances derived from the embodied cognition framework against language use as explanations for conceptual cuing.

Method

Participants

Fifty-seven participants (40 female) completed the experiment in exchange for course credit or monetary compensation (mean age = 22 years, SD = 7.2). All participants reported normal or corrected-to-normal vision.

Materials

Stimuli were presented on a cathode ray tube monitor running at a refresh rate of 75 Hz. The experiment was programmed in the Psychophysics Toolbox extension within MATLAB. A chinrest was used to fix viewing distance at 44 cm.

Item selection

A total of 231 words were selected from previous studies (Estes et al., 2008; Gozli et al., 2013; Meier & Robinson, 2004; Zwaan, Stanfield, & Yaxley, 2002). Each word was coded according to whether it had been categorized as an upward or downward affordance (affordance variable). We then used the English corpus of Google Ngram (Michel et al., 2011), a large, publicly searchable corpus of approximately 361 billion words, to calculate bigram frequencies (collocations) between target words and spatial words denoting the up and down dimensions. Language use changes over time; therefore, we restricted our search to publications that fell within 1998–2008 (the 10 most recent years available). To calculate the collocation between words and the upward spatial dimension, for each target word, we took the sum of the collocation between the word and the two synonyms up and above, and to calculate the collocation between words and the downward spatial dimension, for each target word, we took the sum of the collocation between the word and the synonyms down and below. These figures were then log transformed (see Footnote 1). The difference between these two figures yielded the collocation difference value for the word (this corresponds to the sixth column in Table 1). Therefore, a positive difference score indicates an upward affordance, and a negative difference score indicates a downward affordance. This was done for all 231 words.
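
The collocation difference score described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the bigram counts are made up (they are not Google Ngram values), the bigram order (concept word followed by spatial term) follows the "word + spatial term" description given in the Discussion, and both the natural logarithm and the added constant that keeps zero counts defined are assumptions, since the text here does not state the log base or how zero frequencies were handled.

```python
import math

# Hypothetical bigram counts for illustration only (NOT real Google Ngram data).
# Keys are (concept word, spatial term) pairs.
bigram_counts = {
    ("sky", "up"): 1200, ("sky", "above"): 5400,
    ("sky", "down"): 300, ("sky", "below"): 900,
    ("basement", "up"): 150, ("basement", "above"): 100,
    ("basement", "down"): 2000, ("basement", "below"): 1500,
}

UP_TERMS = ("up", "above")
DOWN_TERMS = ("down", "below")


def collocation_difference(word, counts, smoothing=1.0):
    """Return log(summed up-collocations) minus log(summed down-collocations).

    `smoothing` is an assumed constant added before taking the log so that
    a word with zero collocations still gets a defined score.
    """
    up_total = sum(counts.get((word, term), 0) for term in UP_TERMS)
    down_total = sum(counts.get((word, term), 0) for term in DOWN_TERMS)
    return math.log(up_total + smoothing) - math.log(down_total + smoothing)


for word in ("sky", "basement"):
    score = collocation_difference(word, bigram_counts)
    direction = "upward" if score > 0 else "downward"
    print(f"{word}: {score:+.2f} ({direction})")
```

With these invented counts, sky receives a positive score and basement a negative one, matching the sign convention in the text. A word with no recorded collocations scores exactly zero under the assumed smoothing.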

Table 1 The 24 conceptual cues selected for the present study, as a function of word type, affordance, and collocation log frequency

In order to ensure sufficient repetitions of cue words to obtain reliable estimates of cuing, it was necessary to select a subset of these 231 words. Twenty-four words were chosen for the present experiment. Twelve were concrete, and 12 were abstract. Within each of these categories, half of the words were categorized as having a downward affordance in past studies, and half had been categorized as having an upward affordance. Collocational values were chosen to vary across these affordance categories as much as possible. That is, we attempted to select words with collocational difference values that were both consistent and inconsistent with the prior affordance categorization (see Table 1). However, it is important to note that overall, these difference values were, on average, consistent with the affordance categories (upward, M = 0.96, SD = 1.44; downward, M = −.39, SD = 1.44), t(23) = 2.48, p = .02 (two-tailed), d = 1.02. Therefore, there was significant overlap between the two independent measures.

Design

The experiment used a 2 (affordance: up and down) × 2 (word type: abstract and concrete) within-subjects design, with collocation differences as a continuous predictor. The dependent variables measured were accuracy and reaction time (RT), with only correct RTs included in the analysis. Each participant had 24 average RT differences (differences in RT for above vs. below targets), one average for each word.

Procedure

The stimuli were presented in white on a black background. Each trial began with the presentation of a small cross (fixation cross) in the center of the screen for 1,000 ms. Next, a cue word was presented for 800 ms. Following this, a target letter k or l appeared 8° either above or below the center of the screen (each letter occurred equally often overall and at each location). Participants’ task was to identify the target letter (as k or l) as quickly and accurately as possible by pressing the corresponding key on the keyboard. Importantly, this means that participants’ task was unrelated to the meaning of the cue word. The target letter was visible until participants responded (Fig. 1). Finally, the screen was blank during a 1,000-ms interval prior to the start of the next trial. Across the experiment, each cue word was repeated 20 times (10 trials for each subsequent target location), yielding a total of 480 trials. Each participant completed these 480 trials, divided into four blocks with rest breaks in between, the length of which was at the discretion of the participant. One participant’s data were excluded because of low accuracy (70 %). Only RTs for correct trials were analyzed.
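
The trial counts and timings in this procedure can be checked with a short Python sketch. It is not the MATLAB/Psychophysics Toolbox code used in the experiment. The cue names are placeholders, and several details are assumptions because the text does not specify them: that the two letters were balanced within every cue-by-location cell, that trial order was shuffled across the whole session before being cut into blocks, and that durations were rounded to whole refreshes of the 75-Hz monitor.

```python
import itertools
import random

REFRESH_HZ = 75                                    # monitor refresh rate from the Materials section
FIXATION_MS, CUE_MS, BLANK_MS = 1000, 800, 1000    # durations from the Procedure section


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Convert a duration in ms to a whole number of monitor refreshes."""
    return round(duration_ms * refresh_hz / 1000)


def build_trials(cue_words, reps_per_location=10, seed=0):
    """Build a shuffled trial list balanced over cue, target location, and letter.

    Assumption: each letter appears equally often within every
    cue-by-location cell, so reps_per_location must be even.
    """
    if reps_per_location % 2:
        raise ValueError("reps_per_location must be even to balance the two letters")
    trials = []
    for cue, location, letter in itertools.product(cue_words, ("above", "below"), ("k", "l")):
        for _ in range(reps_per_location // 2):
            trials.append({"cue": cue, "location": location, "target": letter})
    random.Random(seed).shuffle(trials)
    return trials


def split_into_blocks(trials, n_blocks=4):
    """Cut the trial list into equal blocks (any remainder trials are dropped)."""
    size = len(trials) // n_blocks
    return [trials[i * size:(i + 1) * size] for i in range(n_blocks)]


cue_words = [f"word{i:02d}" for i in range(1, 25)]  # 24 placeholder cues, not the study's items
trials = build_trials(cue_words)
blocks = split_into_blocks(trials)
print(len(trials), [len(block) for block in blocks])
print(ms_to_frames(FIXATION_MS), ms_to_frames(CUE_MS), ms_to_frames(BLANK_MS))
```

Twenty-four cues, two locations, and 10 repetitions per location give the 480 trials reported, or 120 per block. At 75 Hz, the 800-ms cue corresponds to 60 refreshes and each 1,000-ms interval to 75.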

Fig. 1 A schematic representation of the trial structure. Participants’ task was to identify the target letter (as k or l) as quickly and accurately as possible. A cue word was presented on every trial but was not relevant to participants’ task

Results

Trials (1.6 % of total) were excluded from analysis if responses were quicker than 200 ms or slower than 2.5 standard deviations above the participant’s mean RT. Accuracy of responses for the remaining participants was high (96.7 % for above targets and 95.5 % for below targets). We calculated the difference in RTs between above and below fixation targets for each word cue for each participant [i.e., RT(target above center) − RT(target below center)]. The mean RT differences by word type, affordance, and target position are shown in Table 2.
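
The trimming rule and the per-cue difference score can be sketched as follows for a single participant. The data are invented for illustration, and one detail is an assumption: the participant's mean and standard deviation are computed here over all correct trials before the 200-ms floor is applied, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_trials(trials, floor_ms=200.0, sd_cutoff=2.5):
    """Keep correct trials with floor_ms <= RT <= participant mean + sd_cutoff * SD.

    Assumption: the mean and SD are taken over the participant's correct
    trials before the floor is applied.
    """
    correct = [t for t in trials if t["correct"]]
    rts = [t["rt"] for t in correct]
    upper = mean(rts) + sd_cutoff * stdev(rts)
    return [t for t in correct if floor_ms <= t["rt"] <= upper]


def rt_difference_by_cue(trials):
    """Return mean RT(target above) minus mean RT(target below) for each cue word.

    Raises statistics.StatisticsError if a cue has no trials left in one location.
    """
    by_cell = defaultdict(list)
    for t in trials:
        by_cell[(t["cue"], t["location"])].append(t["rt"])
    cues = sorted({cue for cue, _ in by_cell})
    return {cue: mean(by_cell[(cue, "above")]) - mean(by_cell[(cue, "below")]) for cue in cues}


# Invented trials for one participant and one cue word (RTs in ms).
example_trials = [
    {"cue": "sky", "location": "above", "rt": 480.0, "correct": True},
    {"cue": "sky", "location": "above", "rt": 500.0, "correct": True},
    {"cue": "sky", "location": "below", "rt": 530.0, "correct": True},
    {"cue": "sky", "location": "below", "rt": 550.0, "correct": True},
    {"cue": "sky", "location": "below", "rt": 150.0, "correct": True},   # faster than the 200-ms floor
    {"cue": "sky", "location": "above", "rt": 510.0, "correct": False},  # error trial
]

kept = trim_trials(example_trials)
differences = rt_difference_by_cue(kept)
print(differences)
```

A negative difference means faster responses to above targets than to below targets after that cue. In the study, this yields 24 such values per participant, one per cue word.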

Table 2 Mean reaction time (RT) differences (in milliseconds) by word type (abstract vs. concrete), affordance (upward vs. downward), and spatial dimension of target (above vs. below)

Table 2 indicates that participants were faster to respond to targets presented above the center, regardless of whether the target appeared in a location consistent with the cue word’s affordance. The larger negative RT difference values for words with an upward affordance suggest facilitation for affordance-consistent targets. This difference was similar for both abstract and concrete words.

The data were analyzed using linear mixed effects models in R (version 2.15.2; R Development Core Team, 2009), which were calculated using the lme4 package (Bates & Maechler, 2010). The fixed effects were (1) word type (abstract vs. concrete), (2) affordance (upward vs. downward), and (3) log collocational frequency difference. Participants and items were treated as random effects to account for by-participant and by-item variation in one model (Baayen, 2008), and by-participant and by-item random slopes were included to ensure that significant fixed effects reflected the slopes for the effects and not between-participant or between-item variance (Barr, 2008). We started with a null model and added fixed and random effects systematically, retaining terms only if they significantly increased model fit (as indicated by model comparison using the anova function). Separate models were built for the affordance and collocational variables, and the extent to which they predicted RT difference was compared via a comparison of fit indices. The final models for both the affordance and collocational frequency variables are reported in Table 3. In each case, the fixed effect of word type did not significantly contribute to model fit. The significant intercepts in both models confirm the processing advantage for above-center targets. Both affordance and collocational frequency predicted RTs in the same direction. The model fit statistics suggested that the two models explain a similar amount of variance in the data (see Footnote 2).
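
For readers less familiar with mixed models, the collocation model can be written out as an equation. This is a sketch of one plausible specification, not the final model reported in Table 3. The symbols are introduced here for illustration: p indexes participants, i indexes cue words, and C_i is the log collocational frequency difference for word i. The random-effects structure shown (by-participant intercept and slope, by-item intercept) is an assumption. The by-item slopes mentioned in the text are left out because this section does not say which predictor they apply to.

```latex
\Delta RT_{pi} = \beta_0 + \beta_1 C_i + u_{0p} + u_{1p} C_i + w_{0i} + \varepsilon_{pi},
\qquad
(u_{0p}, u_{1p}) \sim \mathcal{N}(\mathbf{0}, \Sigma_u), \quad
w_{0i} \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{pi} \sim \mathcal{N}(0, \sigma^2)
```

Here ΔRT is the above-minus-below RT difference for participant p and word i. A reliably negative β0 corresponds to the overall advantage for above-center targets, and β1 captures how the cuing effect changes with the collocation difference. The affordance model would take the same form, with a two-level affordance code in place of C_i.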

Table 3 Final linear mixed effects models predicting reaction time difference scores from word affordance (model 1) and collocational frequency difference (model 2)

Discussion

In the present study, we found that linguistic collocations between cue words and spatial words predict the magnitude of conceptual cuing. These data provide the first principled operationalization and potential explanation of affordance in the literature on conceptual cuing. The finding is striking, because the participants’ task was unrelated to both the meaning of the word and the spatial dimension (unlike, e.g., Louwerse, 2008). Yet the relationship between these two factors had a systematic effect on participants’ responses. The predictive value of collocations in explaining conceptual cuing with concrete words implies one of two possible interpretations. Either language use is the cause of the associations between concepts and space, or language use reflects a consequence of an implicit, prelinguistic mapping arising from our perceptual experiences with objects in the world around us. Our results cannot differentiate between these two possibilities for concrete words. For abstract words, there are no direct perceptual experiences of tangible objects to draw on to create these systematic associations (e.g., we do not see a sensible occupying any particular spatial location in the world around us), and thus perceptual simulation is inadequate as an explanation. We note, however, that conceptual metaphor theory (e.g., Landau et al., 2010) predicts that abstract concepts are spatially mapped, without attributing a causal role to language in this process. Future theoretical development should consider language usage as one potential contributing factor to the development and transmission of spatial biases associated with concepts.

Our data firmly ground embodied cognition accounts in language use (see also Hutchinson & Louwerse, 2013) and are consistent with neo-Whorfian arguments regarding the effect of language on cognition (e.g., Boroditsky, Fuhrman, & McCormick, 2011). We stress, however, that more research is required to determine exactly how language usage exerts its influence on the conceptual cuing effect. Our operationalization of usage, collocational frequency, is likely to only partially capture this relationship, because it measures just one possible way in which concepts and spatial terms can be expressed (i.e., word + spatial term). There are a multitude of ways in which spatial relationships can enrich concepts that do not involve a simple juxtaposition of words (e.g., God lives up in heaven). Future research should seek to provide more fine-grained estimates of word–spatial-term associations (e.g., via more sophisticated computational analyses of corpora) to determine whether such estimates explain additional variance in conceptual cuing. At the same time, although our language use measure is likely to have been imperfect, the fact that it explained the data as well as the traditional affordance categorization is significant.

The finding that responses were overall faster for above-fixation targets is consistent with previous basic visual and attentional research. There appears to be a lower-visual-field advantage for contrast sensitivity (Cameron, Tai, & Carrasco, 2002), for resolving the orientation of crowded low spatial frequency Gabors (He, Cavanagh, & Intriligator, 1996), and for sensitivity to motion (Edwards & Badcock, 1993), whereas there is an upper-visual-field advantage for visual search (Previc & Blume, 1993) and object recognition (Chambers, McBeath, Schiano, & Metz, 1999), consistent with the properties of the dorsal and ventral cortical processing streams, respectively (Previc, 1990). The present finding that responses were facilitated for letters above fixation is consistent with this pattern, whereby processing of detailed content (such as the identity of a letter) should be facilitated in the upper visual field.

In conclusion, the linguistic collocation between spatial words and concepts predicts the magnitude of the shift of attention upon presentation of concept words. This suggests that language use is one potential explanation for the relationship between concepts and space.