Linear effects, which capture a straight-line relation between two variables, are very common in cognitive research in general, and in the field of numerical cognition in particular. Consider, for example, a task in which participants are presented with pairs of numbers and are asked to decide which member of the pair corresponds to a greater magnitude. Response latencies in such a case are described by a robust linear effect, known as the distance effect (Moyer & Landauer, 1967). The distance effect refers to the linear decrease in response latencies as the intrapair distance between the compared numbers increases (e.g., comparison is faster between 2 and 8 than between 2 and 3; Fig. 1). This effect is taken as evidence for a continuous representation of numerical magnitude, often referred to as the mental number line (e.g., Brysbaert, 1995; Dehaene, 1992; Gallistel & Gelman, 1992; Verguts, Fias, & Stevens, 2005; Zorzi & Butterworth, 1999).

Fig. 1
figure 1

Mean reaction time (RT) as a function of distance. The distance effect is depicted by faster responses as the intrapair distance increases. A regression line with a negative slope can be fitted to model this linear relation. The vertical bars denote 95% confidence intervals

Another robust phenomenon in the field of numerical cognition that captures linear relations is the SNARC (i.e., “spatial numerical association of response codes”) effect (Dehaene, Bossini, & Giraux, 1993). It refers to the pattern of faster left-hand responses for smaller numbers (e.g., 1 or 2) and faster right-hand responses for larger numbers (e.g., 8 or 9). The effect is obtained in parity judgments, numerical comparisons, and other tasks (for reviews, see Fias & Fischer, 2005; Hubbard, Piazza, Pinel, & Dehaene, 2005; for a recent meta-analysis, see Wood, Willmes, Nuerk, & Fischer, 2008). The SNARC effect is taken as evidence for the mapping of numerical magnitude onto a spatial representation, with smaller magnitudes represented on the left side and larger numbers on the right side. The effect can be estimated as an interaction between number magnitude and hand, calculated on reaction times (RTs), or as a main effect of number magnitude calculated on the difference in RTs (dRT) between right- and left-hand responses (Fig. 2a and b, respectively). The latter analysis is in fact an analysis of the linear relation between dRT and magnitude. The association between magnitude and space would be indicated by a negative linear relationship between dRT and magnitude. For smaller numbers, dRT should be positive, while it should be negative for larger numbers.

Fig. 2
figure 2

(a) The SNARC effect presented as an interaction between hand and number magnitude computed on mean RTs. A group of 10 participants performed a typical parity judgment task on the numbers 1, 2, 8, and 9 to measure the SNARC effect (Pinhas & Fischer, unpublished data; for a similar method, see, e.g., Fias et al., 1996). (b) The same SNARC effect data presented as a main effect of number magnitude calculated on the RT difference (dRT) between right- and left-hand responses. Positive dRTs indicate faster left-hand responses; negative dRTs indicate faster right-hand responses. A regression line with a negative slope can be fitted to model the linear relation between dRT and number magnitude. In both panels, vertical bars denote 95% confidence intervals

The goal of the present methodological note is to review the common, existing methods used for analyzing linear effects in numerical cognition, to point to their advantages and disadvantages, and to propose analyzing such effects within the framework of analysis of variance (ANOVA).

Following Fias et al. (1996), the SNARC effect has frequently been estimated as the negative correlation between dRT and number magnitude. Fias et al. adopted Lorch and Myers’s (1990) proposal of analyzing repeated measures data with multiple regression. One way to do this requires running a regression analysis for each individual participant and then averaging over participants. The averaged regression coefficients are then statistically tested (each with a separate t test) to determine whether they deviate significantly from zero at the group level. This method is termed here the individual regression equations method.

As was pointed out by Fias et al. (1996), a key advantage of this method is that it allows for quantifying the SNARC effect size in terms of a slope, which captures the essence of the effect (i.e., what is the expected latency difference between the right- and left-hand responses for a given change in number magnitude?). However, while their work has become the model for analyzing linear effects in the field of numerical cognition, this method has two main practical disadvantages, originally discussed by Lorch and Myers (1990). First, it requires the computation of individual regression equations, one for each participant. Second, and more importantly, effect sizes cannot be estimated in terms of the proportion of variability accounted for (R²), because this method does not provide such values.¹
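To make Method I concrete, the following is a minimal Python sketch of the individual regression equations procedure. The data, seed, and function name are our own illustrative assumptions, not values from any of the cited studies:

```python
import numpy as np
from scipy import stats

def individual_regression_slopes(rt, predictor):
    """Method I sketch: one simple regression per participant, then a
    one-sample t test of the averaged slope against zero.

    rt        -- (participants x conditions) matrix of mean RTs
    predictor -- condition values (e.g., intrapair distances 1..5)
    """
    # np.polyfit returns coefficients highest degree first, so [0] is the slope
    slopes = np.array([np.polyfit(predictor, row, 1)[0] for row in rt])
    t, p = stats.ttest_1samp(slopes, 0.0)  # does the mean slope deviate from 0?
    return slopes.mean(), t, p

# Hypothetical data: 8 participants, distances 1-5, built-in slope near -17 ms
rng = np.random.default_rng(0)
distances = np.arange(1, 6)
rt = 600 - 17 * distances + rng.normal(0, 10, size=(8, 5))
mean_slope, t, p = individual_regression_slopes(rt, distances)
```

Note that the procedure yields a mean slope and its t test, but, as discussed above, no proportion-of-variance measure.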

Note that in most studies the presence of the distance or SNARC effect was not the only question investigated; researchers also examined additional factors, such as notation or task requirements, and the extent to which they affected behavior. Thus, the common practice has been to apply an ANOVA to address the latter questions and to conduct a separate regression analysis to test for the presence of the distance and/or the SNARC effect (SNARC effect: see, e.g., Fias, 2001; Fischer, 2003; Fischer & Rottmann, 2005; Gevers, Verguts, Reynvoet, Caessens, & Fias, 2006; Nuerk, Iversen, & Willmes, 2004; Shaki & Petrusic, 2005. Distance effect: see, e.g., Bonato, Fabbri, Umiltà, & Zorzi, 2007; Fischer & Rottmann, 2005; Ganor-Stern, Pinhas, Kallai, & Tzelgov, 2010; Kallai & Tzelgov, 2009). Applying two different statistical analyses to the same data set is both inefficient and flawed in terms of statistical reasoning.

In this methodological note, we propose to examine linear relationships using repeated measures ANOVA followed by linear trend analysis. This method is termed here the repeated measures ANOVA and linear trends method. As we will show next, this method has the advantages both of simplicity and of providing the necessary information in terms of slope and of variability accounted for. Moreover, the same analysis is used to test all effects of interest rather than applying multiple statistical analyses on the same data set.

To demonstrate our analysis with real data, we ran a simple experiment in which 8 participants were presented with pairs of numbers (generated from the integers 1 to 6) and were asked to select the larger member of the pair (for a similar method in a numerical comparison task, see, e.g., Ganor-Stern et al., 2010, simultaneous condition). We used the distance effect as an example and tested the relations between the methods discussed above for estimating linear effects; that is, we used both (1) individual regression equations and (2) repeated measures ANOVA and linear trends.

Results and analysis

Individual regression equations

Following Lorch and Myers (1990) and Fias et al. (1996), we ran a single regression analysis for each participant, with the mean RT as the dependent variable and intrapair numerical distance as the predictor. The averaged negative slope of −17.31 ms significantly deviated from zero [t(7) = −6.36, SD = 7.70, p < .05], signifying the distance effect—a decrease in response latencies with increase in the intrapair distance (see Table 1 and Fig. 1). Note that this procedure does not provide information regarding the proportion of variance accounted for by distance.

Table 1 Mean RTs as a function of the intrapair distance and the corresponding regression slope for each participant (individual regression equations)

Repeated measures ANOVA and linear trends

Mean RTs were submitted to a one-way ANOVA with distance (1, 2, 3, 4, or 5) as a within-participants variable. The distance effect was evaluated by testing the significance of the linear trend. Before turning to the results of the ANOVA (summarized in Table 2), let us briefly discuss the formal partition of the variance involved in such an analysis.

Table 2 Overview of the repeated measures ANOVA conducted to test the distance effect

Given that there were s participants, each of whom performed the task with each intrapair distance condition, the total sum of squares (SST) of the mean latency scores representing the performance of each participant in each condition can be written as

$$ SST = SSS + SSA + SSAS, $$
(1)

where SSS (the subjects’ sum of squares) refers to the between-participants variability; that is, it reflects the individual differences in latency among the participants. Both SSA (the sum of squares due to the manipulated factor A) and SSAS (the sum of squares of the interaction between factor A and participants) reflect the within-participants variability. In particular, in our example, SSA represents the variability between (the means of) the intrapair distance conditions, and it can be further decomposed into

$$ SSA = SSlinear + SSres, $$
(2)

where SSlinear reflects the sum of squares due to linear differences between the intrapair distance conditions of factor A, and SSres is the variability among the conditions of factor A (i.e., numerical distance) that remains after the linear component has been removed. Thus, SSres is the sum of squares of the nonlinear systematic differences among the intrapair distance conditions.

The significance of SSlinear is tested against SSsubjects × Linear (divided by the relevant degrees of freedom of this error term), which is the sum of the squares due to the interaction between participants and the linear component of distance. As can be seen in Table 2, the linear trend capturing the presence of the distance effect in this analysis was significant.
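The contrast computations above can be sketched in a few lines of Python. This is a minimal sketch on hypothetical data (the function name, data, and seed are our own assumptions); it computes SSlinear from per-participant linear contrast scores and tests it against SSsubjects × linear:

```python
import numpy as np
from scipy import stats

def linear_trend_test(rt, weights):
    """Linear-trend test for a one-way repeated measures design.

    rt      -- (participants x conditions) matrix of mean RTs
    weights -- orthogonal linear trend weights, e.g. (-2, -1, 0, 1, 2)
    """
    s = rt.shape[0]
    w = np.asarray(weights, dtype=float)
    L = rt @ w                                    # per-participant contrast scores
    ss_linear = L.sum() ** 2 / (s * (w ** 2).sum())               # SSlinear
    ss_subj_x_lin = (L ** 2).sum() / (w ** 2).sum() - ss_linear   # SSsubjects x linear
    F = ss_linear / (ss_subj_x_lin / (s - 1))                     # df = 1, s - 1
    p = stats.f.sf(F, 1, s - 1)
    return ss_linear, ss_subj_x_lin, F, p

# Hypothetical data: 8 participants, distances 1-5, built-in slope of -17 ms
rng = np.random.default_rng(0)
distances = np.arange(1, 6)
rt = 600 - 17 * distances + rng.normal(0, 5, size=(8, 5))
ss_linear, ss_err, F, p = linear_trend_test(rt, distances - 3)
```

The F ratio has 1 and s − 1 degrees of freedom, matching the test of SSlinear against SSsubjects × linear described above.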

Furthermore, the linear relation between the intrapair distances and response latencies can be calculated in terms of slope as:

$$ Slope = \sqrt{\frac{SSlinear}{s\sum W^2}}, $$
(3)

where W refers to the weights of the linear trend and s refers to the number of participants (see Appendix A for further details).

For example, in the present study there were 8 participants (s = 8), and the weights of the linear trend were −2, −1, 0, 1, and 2 for the intrapair distances of 1, 2, 3, 4, and 5, respectively. Furthermore, the sum of squares of the linear trend equaled 23,985.43 (Table 2). Thus, the slope for distance could be calculated in terms of ANOVA as:

$$ Slope_{distance} = \sqrt{\frac{23{,}985.43}{8\left[ (-2)^2 + (-1)^2 + (0)^2 + (1)^2 + (2)^2 \right]}} = 17.31 $$

Note that this is the same (absolute) averaged slope of distance calculated using the individual regression equations method above (Method I).
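This equivalence is easy to verify numerically. The sketch below (hypothetical data and seed; variable names are our own) computes the slope both ways on the same participants × distances matrix; because the centered distances double as linear trend weights, the two estimates agree up to sign:

```python
import numpy as np

rng = np.random.default_rng(1)
distances = np.arange(1, 6)
w = distances - distances.mean()          # centered x doubles as linear weights
rt = 600 - 17 * distances + rng.normal(0, 10, size=(8, 5))
s = rt.shape[0]

# Method I: average the per-participant regression slopes
mean_slope = np.mean([np.polyfit(distances, row, 1)[0] for row in rt])

# Method II: recover the slope from the linear-trend sum of squares (Eq. 3)
L = rt @ w
ss_linear = L.sum() ** 2 / (s * (w ** 2).sum())
anova_slope = np.sqrt(ss_linear / (s * (w ** 2).sum()))
```

Algebraically, each per-participant regression slope equals L_i / ΣW², so the averaged slope is ΣL / (sΣW²), whose absolute value is exactly the square root in Eq. 3.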

To complete the picture, note that since SSlinear is tested for significance against SSsubjects × linear, one can use partial eta-squared (ηp²; Cohen, 1965, 1973) to estimate the magnitude of the linear component as:

$$ \eta_p^2 = \frac{{SSlinear}}{{SSlinear + \left( {SSsubjects\, \times \,linear} \right)}}. $$
(4)

This statistic is commonly used as a measure of effect size in higher-order designs because it “partials out” all sources of variance other than those attributable to the effect of interest and its error term.
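Eq. 4 reduces to a one-line computation; the helper name below is our own, and the example numbers are illustrative rather than taken from Table 2:

```python
def partial_eta_squared(ss_effect, ss_error):
    """Eq. 4: SSeffect / (SSeffect + SSerror), where SSerror is the error term
    against which the effect is tested (here, SSsubjects x linear)."""
    return ss_effect / (ss_effect + ss_error)

# Illustration: an effect SS of 70 against an error SS of 30 yields .70
eta_p2 = partial_eta_squared(70.0, 30.0)
```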

Estimating the SNARC effect

It should be clear that although our proposed Method II was demonstrated with a linear trend in a one-way repeated measures design, it can easily be implemented in more complex situations. Let us consider the case of the SNARC effect (e.g., Dehaene et al., 1993; Fias et al., 1996). As mentioned earlier, the SNARC effect can be estimated as an interaction between number magnitude and hand, calculated on RTs, or as a main effect of number magnitude calculated on the dRT between the right- and left-hand responses (Fig. 2a and b, respectively). The latter case simply follows our example of the distance effect. Thus, a repeated measures ANOVA could be used with dRT as the dependent variable and number magnitude as a within-participants variable. Following the finding of a main effect for number magnitude, a linear trend could be computed. Importantly, the SNARC effect size can then be easily evaluated in terms of a slope (Eq. 3), as it has previously been quantified through regression analysis. Moreover, the effect size can also be evaluated with common measures of the proportion of variability accounted for (i.e., ηp² from Eq. 4).

In the case of describing the SNARC effect in terms of an interaction, a two-way ANOVA design should be used, with hand (factor A) and number magnitude (factor B) as within-participants factors (see Table 3). The SNARC effect can be obtained in such a design as the interaction between hand and the linear component of the number magnitude factor. In other words, the effect is estimated by the interaction of linear trends (InterLcomp; see Table 4 for the detailed ANOVA table).

Table 3 Mean RTs as a function of hand and number magnitude
Table 4 Overview of the repeated measures ANOVA conducted to test the SNARC effect

Similar to the case of a one-way ANOVA, the SNARC effect size in terms of slope can be easily estimated as:

$$ Slope = \sqrt{\frac{2 \cdot SSInterLcomp}{s\sum W^2}}, $$
(5)

where W refers to the weights of the linear trend and s refers to the number of participants (see Appendix B for further details).

For example, in the presented SNARC data (Table 3), there were 10 participants (s = 10), and the corresponding linear weights for the number magnitude levels of 1, 2, 8, and 9 were −4, −3, 3, and 4, respectively.² Furthermore, the sum of squares of the interaction of linear trends (i.e., SSInterLcomp) equaled 22,308.74 (Table 4). Thus, the slope for the SNARC effect can be calculated in terms of a two-way repeated measures ANOVA as:

$$ Slope_{SNARC} = \sqrt{\frac{2 \cdot 22{,}308.74}{10\left[ (-4)^2 + (-3)^2 + (3)^2 + (4)^2 \right]}} = 9.45. $$

If an analysis of individual regression equations (Method I) were conducted on the same data with dRT as the dependent variable, the calculated averaged slope would have the same absolute value.
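This two-way equivalence can also be checked numerically. Below is a minimal sketch with hypothetical data (the ±4.5 ms hand effects, seed, and variable names are our own assumptions, not the values behind Table 3); it compares the dRT regression slope with the slope recovered from the interaction-of-linear-trends sum of squares via Eq. 5:

```python
import numpy as np

rng = np.random.default_rng(2)
magnitudes = np.array([1, 2, 8, 9])
w = magnitudes - magnitudes.mean()        # linear weights -4, -3, 3, 4
s = 10

# Hypothetical RTs: a SNARC-like pattern with a dRT slope of about -9 ms
base = 550 + rng.normal(0, 20, size=(s, 1))
left = base + 4.5 * (magnitudes - 5) + rng.normal(0, 8, size=(s, 4))
right = base - 4.5 * (magnitudes - 5) + rng.normal(0, 8, size=(s, 4))
drt = right - left                        # positive dRT = faster left hand

# Method I on dRT: average the per-participant regression slopes
mean_slope = np.mean([np.polyfit(magnitudes, row, 1)[0] for row in drt])

# Method II: interaction-of-linear-trends SS with hand weights of +/-1 (Eq. 5)
K = drt @ w                               # per-participant interaction contrast
ss_inter = K.sum() ** 2 / (s * 2 * (w ** 2).sum())
anova_slope = np.sqrt(2 * ss_inter / (s * (w ** 2).sum()))
```

With hand weights of −1 and +1, the interaction contrast for each participant reduces to the linear contrast on that participant's dRTs, which is why the factor of 2 appears in Eq. 5.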

Finally, the SNARC effect could also be evaluated in terms of the variability accounted for using ηp²:

$$ \eta_p^2 = \frac{SSInterLcomp}{SSInterLcomp + \left( SSsubjects \times InterLcomp \right)}, $$
(6)

which in this case equals .70 (see Table 4).

Concluding remarks

Two ways of testing linear relationships were discussed. The first method, individual regression equations (Lorch & Myers, 1990), is simple but flawed, due to its inability to provide measures of the variability accounted for. The requirement to report effect sizes in terms of variability accounted for (e.g., the η² and ηp² statistics) is becoming more common among scientific journals and is strongly recommended in the Publication Manual of the American Psychological Association (2009). Thus, statistical techniques that do not easily provide such measures seem less appealing.

The second method, repeated measures ANOVA and linear trends, is simple and easy to use, on the one hand, and provides measures of effect size in terms of both the slope and the proportion of variability accounted for (i.e., η² and ηp²), on the other. Moreover, this method enables use of the same statistical analysis to answer other questions of interest, such as the effect of notation or the effect of presentation mode on performance. The latter advantage makes it not only more efficient, but also more statistically sound.

By demonstrating the use of these two methods on the same data sets, the present study has revealed the specific relations between regression and ANOVA methods when testing for a linear effect. Specifically, we have shown that the estimates of slope and of accounted variability derived within the ANOVA method are identical to the ones provided by the regression analyses. Thus, we conclude with a recommendation to use ANOVA with a linear trend analysis whenever there is a need to capture the presence of a linear relationship at the group level. Please note that this method is applicable for testing linear effects in terms of main effects or of interactions, and hence, can be used in higher-order ANOVA designs. We also wish to emphasize the generality of our argument beyond the numerical cognition domain. The method proposed is useful for any research that examines the presence of a linear relationship between variables. It should be applicable, for example, for research in perception that focuses on the presence of set-size effects on search latencies (e.g., Treisman & Gelade, 1980) or on performance in working memory tasks (e.g., Kessler & Meiran, 2008).