Introduction

Within psychology and neuroscience, there are two prevalent theories that seek to explain the relationship between space and time perception: a theory of magnitude (ATOM) and metaphor theory. ATOM suggests that space and time originate from the same neurally represented general magnitude system (and thus have a roughly symmetrical relationship; Walsh, 2003). Alternatively, metaphor theory suggests that our sense of time is grounded in our sense of space (implying that space should influence time more than time influences space; Boroditsky, 2000; Lakoff & Johnson, 1980; Lakoff et al., 1999). Thus far, direct tests of these two theories have favored metaphor theory (Casasanto & Boroditsky, 2008; Casasanto, Fotakopoulou, & Boroditsky, 2010; Merritt, Casasanto, & Brannon, 2010). However, more recent evidence suggests that the assumptions of metaphor theory may not hold under all conditions, such as when space and time are not presented in the visual modality (Cai & Connell, 2015). Thus, we conducted a registered replication of the original work by Casasanto and Boroditsky (2008) to clarify this theoretically important finding.

Metaphor theory suggests that, because time is intangible and abstract, we use concrete and tangible experiences with space as metaphors to ground our representation of time (Lakoff & Johnson, 1980; Lakoff et al., 1999). This theory is empirically supported by the work of Boroditsky (2000), who showed that people employ spatial metaphors to represent time both cognitively and linguistically, but not the other way around. Casasanto and Boroditsky (2008) tested these refined theoretical predictions with a task in which space and time perception were measured simultaneously. In this task, participants viewed a horizontal line that grew across a screen (and that varied in spatial length/displacement and duration) and then were asked to reproduce either the length or the duration of that line. Casasanto and Boroditsky (2008) found that participants incorporated irrelevant spatial information into their time estimates (they estimated spatially longer lines as having longer durations), but did not incorporate irrelevant temporal information into their space estimates (displacement of a line was estimated similarly regardless of duration). These results were taken as support for metaphor theory. More recent investigations of metaphor theory suggest that time can influence space, as long as space has a greater influence on time (i.e., space and time have an asymmetrical relationship; Merritt, Casasanto, & Brannon, 2010).

ATOM similarly acknowledges the relationship between space and time, yet explains these relations with a different framework. ATOM proposes a shared magnitude system between space, time, and numerosity grounded in evidence that these dimensions may share neural underpinnings within the parietal cortex (Walsh, 2003; Bueti & Walsh, 2009). Critchley (1953) noted that parietal cortex damage almost always impaired both spatial and temporal perception and that it was rare for neurological damage to disrupt one process without disrupting the other. Further, behavioral evidence from animal cognition demonstrates that time and number are discriminated in similar ways (Church, 1984), and are similarly affected by substances known to alter time perception such as amphetamines (Meck & Church, 1983). ATOM also draws support from the developmental literature documenting how children often confuse and have trouble discriminating space, time, and number (Bryant & Squire, 2001). More recently, ATOM has been favored by the work of Cai and Connell (2015), who found that when space and time are presented auditorily and haptically they have bidirectional effects on one another.

Cai and Connell (2015) suggest that findings in support of metaphor theory may be explained by the precision with which we represent and remember magnitudes. Humans are visually dominant animals, and because space is frequently experienced visually, it may be the case that spatial representations have less variability than temporal representations and can be remembered with higher accuracy. Indeed, when spatial extent is presented with an unfilled length instead of a solid line (making the spatial representation noisier), space and time have bidirectional effects on one another (Wang & Cai, 2017; Cai, Wang, Shen, & Speekenbrink, 2018).

Table 1 OLS pilot data results (N = 19)
Table 2 MLM pilot data results (N = 19)

These findings highlight that the relationship between space and time perception remains unsettled. The foundational work of Casasanto and Boroditsky (2008) provides compelling evidence for metaphor theory. However, our recent unpublished online replication attempt of this work found different results. We analyzed the data with both OLS regression (to be consistent with previous analytical approaches; see Table 1) and mixed models that included participant as a random intercept to account for the nested nature of the data (see Table 2; tables were made using the Stargazer package in R; Hlavac, 2013). Our pilot data indicate that irrelevant temporal information affected spatial estimates and irrelevant spatial information affected temporal estimates to roughly the same degree. The semi-partial marginal R\(^{2}\) values (calculated in accordance with the methods outlined by Stoffel, Nakagawa, & Schielzeth, 2021) for the variance explained by actual displacement in predicting duration estimates and for actual duration in predicting displacement estimates fell within each other’s confidence intervals (see Table 2). This suggests a roughly symmetrical effect, in which irrelevant spatial information explained about as much variance in duration estimates as irrelevant temporal information explained in displacement estimates.

Because our pilot data were collected online, it is possible that our results are an artifact of the online environment. Yet, despite potential concerns about online behavioral research, there is evidence that at least some behavioral and psychophysical tasks can replicate in an online environment (Semmelmann & Weigelt, 2017; Crump, McDonnell, & Gureckis, 2013). Given that online data collection has many benefits, such as gathering more diverse samples and being more efficient than in-person research, we think it is critical to identify and evaluate which tasks may be well suited for online data collection. Thus, this registered report provides a replication of the original work of Casasanto and Boroditsky (2008) and compares results from this task in a laboratory to results obtained in an online setting. This approach allows for a replication of a theoretically important finding, while also contributing to the body of work comparing results of behavioral studies conducted online to those conducted in a laboratory. Our hypotheses are as follows:

  1. Participants will incorporate spatial information into their temporal estimates such that they will judge lines with more displacement as having longer durations (consistent with both ATOM and metaphor theory).

  2. Participants will incorporate temporal information into their spatial estimates such that they will judge lines with a longer duration as having more displacement (consistent with both ATOM and metaphor theory).

  3. The relationship between space and time will be roughly symmetrical (as predicted by ATOM, but not metaphor theory).

Methods

Participants

Effect sizes from our pilot data (which were smaller than those in the original work) were used in our sample size calculation. Data were analyzed with mixed models to account for the nested nature of the data and for potential individual differences in time perception (Matthews & Meck, 2014). A power analysis was conducted in the R package simr (Green & MacLeod, 2016), using our smallest effect size, with the approach suggested by Arend and Schäfer (2019). The correlation coefficient for our weakest correlation was r = .07 (estimated displacement and actual duration). To be conservative, a slightly smaller correlation (r = .05) was used for our standardized effect size alongside the ICC of the corresponding null model (ICC = .08). The alpha value was set at .05 and cluster size (number of trials) at 162. Our analyses indicated that with 30 participants (162 trials each), we had an observed power of .92. To ensure power would be well above .9, we slightly over-sampled (N = 35). This is a larger sample size than the original work of Casasanto and Boroditsky (2008) (N = 9), our pilot sample (N = 19), and previous studies using similar tasks such as Wang and Cai (2017) (N = 23–25).

Thirty-five online and thirty-five in-lab participants (N = 70) were recruited. Participants were excluded if they did not provide complete data (e.g., did not finish the experiment), responded incorrectly to more than one catch trial, or showed low accuracy for either duration or displacement estimates (a correlation below .5 between estimated and actual duration, or between estimated and actual displacement). Catch trials were added for online data collection, but otherwise the exclusion criteria were kept the same as in Casasanto and Boroditsky (2008).
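The exclusion rules above can be written as a single check per participant. The following is a minimal sketch rather than the authors' analysis code: it assumes the accuracy criterion is a Pearson correlation, and the function and argument names (`pearson_r`, `should_exclude`, `duration_pairs`, `displacement_pairs`) are hypothetical.

```python
from math import sqrt

MAX_CATCH_ERRORS = 1   # more than one incorrect catch trial leads to exclusion
MIN_ACCURACY_R = 0.5   # minimum correlation between actual and estimated values


def pearson_r(xs, ys):
    """Pearson correlation (assumed here; the text only says 'correlation')."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Assumption: constant responses are treated as zero correlation.
        return 0.0
    return sxy / sqrt(sxx * syy)


def should_exclude(completed, catch_errors, duration_pairs, displacement_pairs):
    """Return True if a participant meets any exclusion criterion.

    duration_pairs and displacement_pairs are lists of
    (actual, estimated) tuples, one per relevant trial.
    """
    if not completed:
        return True
    if catch_errors > MAX_CATCH_ERRORS:
        return True
    for pairs in (duration_pairs, displacement_pairs):
        actual = [a for a, _ in pairs]
        estimated = [e for _, e in pairs]
        if pearson_r(actual, estimated) < MIN_ACCURACY_R:
            return True
    return False
```

Keeping the two thresholds as named constants makes it easy to see that a participant with exactly one catch-trial error, or with a correlation of exactly .5, is retained under this reading of the criteria.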

Materials

Our task was modeled on the Experiment 1 methods of Casasanto and Boroditsky (2008), was programmed in PsychoPy (Peirce, 2007), and was run on Pavlovia (https://run.pavlovia.org/MirindaWhitaker/rrcb_task). Participants were presented with lines of varying displacements that appeared for varying durations. In-lab participants viewed stimuli on a Dell OptiPlex 7050 desktop with an attached monitor (1920 × 1200, 59-Hz refresh rate). Online participants completed the experiment on their own devices, which were required to have a physical keyboard and a track-pad or mouse (i.e., not a touchscreen-only tablet or a phone). Durations ranged from 1000 to 5000 ms in 500-ms increments and displacements ranged from 200 to 800 pixels in 75-pixel increments. The nine durations and nine displacements were fully crossed (81 unique line types, each presented twice for 162 total trials). Target lines were black and grew from the left side of the screen. Their starting point was randomly jittered (200–600 pixels left of screen center) so that the monitor would not provide a reliable frame of reference for estimates. This jittering method deviates slightly from Casasanto and Boroditsky (2008): all stimuli were positioned relative to the center of the screen, which ensured the task would work regardless of screen size. Each line stayed on the screen until its displacement value was reached and then disappeared. All experimental materials are available at https://osf.io/t2a67/?view_only=9f5af0489440428a8fd5d6f6d756aedf (see Fig. 1 for a depiction of the task).
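The stimulus design described above can be enumerated in a few lines. This is an illustrative sketch, not the PsychoPy task itself: the uniform integer jitter, the shuffled trial order, and all names (`build_trials`, `start_x_px`, `end_x_px`) are assumptions made for the example.

```python
import random

DURATIONS_MS = list(range(1000, 5001, 500))    # 1000, 1500, ..., 5000 (nine values)
DISPLACEMENTS_PX = list(range(200, 801, 75))   # 200, 275, ..., 800 (nine values)
REPETITIONS = 2                                # each line type is presented twice


def build_trials(seed=None):
    """Fully cross durations and displacements and jitter each start point."""
    rng = random.Random(seed)
    trials = []
    for _ in range(REPETITIONS):
        for duration in DURATIONS_MS:
            for displacement in DISPLACEMENTS_PX:
                # Start 200-600 px left of screen center (negative = left).
                # Assumption: jitter is drawn uniformly in whole pixels.
                start_x = -rng.randint(200, 600)
                trials.append({
                    "duration_ms": duration,
                    "displacement_px": displacement,
                    "start_x_px": start_x,
                    "end_x_px": start_x + displacement,
                })
    # Assumption: trial order is randomized; the text does not specify this.
    rng.shuffle(trials)
    return trials


trials = build_trials(seed=1)
```

Expressing positions relative to screen center mirrors the stated rationale for the jitter: under these values a line never ends more than 600 pixels to the right of center, whatever the participant's screen size.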

Fig. 1 Schematic of experimental procedure

Procedure

After reviewing a consent form on Qualtrics, participants were redirected to Pavlovia and entered their participant ID, age, gender, and first language. Participants were given unlimited time to read the instructions (which included an instructional YouTube video to ensure participants understood the task) and then proceeded to the task. For the task, a fixation cross appeared for 500 ms followed by a line that grew across the screen and then disappeared once it reached its displacement value (200–800 pixels). A prompt then appeared instructing the participant to reproduce either the length (“line”) or the duration (“time”) of the line. For the length reproduction, participants saw a white “X” with a black outline appear in either the upper or lower left-hand corner of their screen. Participants were instructed to click in the center of the “X” with their mouse to acknowledge that as the starting point of their length reproduction. Participants were then instructed to place another “X” at a horizontal distance away from the first “X” that they thought was equivalent to the length of the line they had just seen. For the duration reproduction, participants saw a grey hourglass icon appear in either the upper or lower left-hand corner of their screen. Participants were instructed to click the hourglass to start their duration reproduction (at which point the hourglass turned black) and to click the hourglass again to end their reproduction (at which point the hourglass turned back to grey). Before starting the experimental trials, participants completed eight practice trials to get familiar with the controls and then moved through the 162 experimental trials at their own pace. Halfway through the experimental trials, participants were given the opportunity to take a 5-min break. Approximately every 20 trials, participants encountered a catch trial (eight total) that asked them to “please press the ‘c’ key on your keyboard.”

Data analysis

Participant-level and sample-level correlations were run between all main variables (estimated duration, actual duration, estimated displacement, and actual displacement). Mixed models were run in R’s lme4 package (version 1.1-25) using the lmer function (Bates, Mächler, Bolker, & Walker, 2015). Estimated duration and estimated displacement were each predicted by the fixed effects of actual duration and actual displacement, with the inclusion of a participant random intercept. All predictors were centered at the lowest meaningful value (1 s for actual duration and 200 pixels for actual displacement). Centered duration was then multiplied by 100 as a re-scaling procedure to ensure appropriate model fit. For intraclass correlations, an intercept-only model was run for each dependent variable. Comparing the symmetry of the space/time and time/space effects (i.e., hypothesis 3) is a somewhat difficult methodological problem because space and time were on different scales; we therefore measured symmetry using semi-partial marginal R\(^{2}\). Semi-partial marginal R\(^{2}\) and corresponding confidence intervals were computed in accordance with the methods outlined by Stoffel, Nakagawa, and Schielzeth (2021), which allowed for comparison of the amount of variance explained by each predictor in the mixed models. To formally test hypothesis 3, we determined whether the semi-partial marginal R\(^{2}\) values for the opposite predictors (displacement in Eq. (1) and duration in Eq. (2) below) fell within each other’s confidence intervals. If the space/time and time/space effects are symmetrical, we would expect these values to fall within each other’s confidence intervals. If the effects are asymmetrical, we would instead expect them to fall outside each other’s confidence intervals and the two confidence intervals to be non-overlapping.
All mixed model formulas are listed below, where i represents individuals and j represents trials (i.e., single observations; see Eqs. 1–3).

$$\begin{aligned} Y(est.\,duration)_{ij} &= \beta_{0i} + \beta_{1}(duration)_{ij} + \beta_{2}(displacement)_{ij} + \epsilon_{ij} \end{aligned}$$
(1)
$$\begin{aligned} Y(est.\,displacement)_{ij} &= \beta_{0i} + \beta_{1}(duration)_{ij} + \beta_{2}(displacement)_{ij} + \epsilon_{ij} \end{aligned}$$
(2)
$$\begin{aligned} \beta_{0i} &= \gamma_{00} + u_{0i} \end{aligned}$$
(3)
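The decision rule for hypothesis 3 described above can be sketched as a small function. This is an illustration under stated assumptions, not the authors' code: the names (`Estimate`, `classify_symmetry`) are hypothetical, the numbers in the usage lines are made up for demonstration and are not study results, and the "inconclusive" label for cases meeting neither criterion is our own addition, since the text does not say how such cases are treated.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """A semi-partial marginal R^2 with its confidence interval bounds."""
    r2: float
    ci_low: float
    ci_high: float


def within(value: float, est: Estimate) -> bool:
    """True if value lies inside the confidence interval of est."""
    return est.ci_low <= value <= est.ci_high


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """True if the two confidence intervals share at least one point."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


def classify_symmetry(space_on_time: Estimate, time_on_space: Estimate) -> str:
    """Apply the stated rule to the two cross-dimensional effects.

    space_on_time: displacement predicting estimated duration (Eq. 1).
    time_on_space: duration predicting estimated displacement (Eq. 2).
    """
    mutual = (within(space_on_time.r2, time_on_space)
              and within(time_on_space.r2, space_on_time))
    if mutual:
        return "symmetrical"
    if not intervals_overlap(space_on_time, time_on_space):
        return "asymmetrical"
    # Assumption: partial overlap is not covered by the stated rule.
    return "inconclusive"


# Illustrative values only (hypothetical, not taken from the study).
example = classify_symmetry(Estimate(0.020, 0.010, 0.035),
                            Estimate(0.015, 0.008, 0.030))
print(example)  # symmetrical
```

Because semi-partial marginal R\(^{2}\) is unitless, the comparison sidesteps the problem that duration and displacement are measured on different scales.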

All data and materials are available at https://osf.io/t2a67/?view_only=9f5af0489440428a8fd5d6f6d756aedf.