Introduction

Evidence from different lines of research suggests that readers access sublexical information during visual word identification. For example, we see orthographic neighbors embedded in simple words (e.g., HAT in THAT; Bowers, Davis, & Hanley, 2005) and we can extract stems and affixes in morphologically complex words (e.g., DEAL–ER, BASKET–BALL; for a review see Amenta & Crepaldi, 2012). These two lines of research have mostly run in parallel with little contact. However, orthographic activation mechanisms have recently been suggested to also drive stem recognition in morphologically complex words (Grainger & Beyersmann, 2017), and orthographic and morphological processing are thought to interact during visual word identification (e.g., Grainger & Ziegler, 2011).

Studies on morphological processing in reading have typically used the masked priming lexical decision task to establish that suffixed words are automatically decomposed into stem and affix (Rastle & Davis, 2008; Rastle, Davis, & New, 2004). This evidence shows that suffixed words are efficient primes for their stems (farmer-farm). Interestingly, pseudo-suffixed words also act as primes for morphologically unrelated (pseudo)stems (corner-corn), while this is not the case for non-suffixed words (cashew-cash). This has led to the view that morphological decomposition is based on sublexical (morpho-orthographic) information, whether or not the (pseudo)stem actually contributes to the word meaning. Thus, morphological analysis seems to be triggered by the presence of an affix. However, the extension of the paradigm to suffixed and non-suffixed nonword primes (farmity-farm, farmekt-farm) has indicated priming effects even in the absence of suffixes (e.g., Beyersmann, Casalis, Ziegler, & Grainger, 2014; Beyersmann, Cavalli, Casalis, & Colé, 2016; Hasenäcker, Beyersmann, & Schroeder, 2016; Heathcote, Nation, Castles, & Beyersmann, 2018). To account for these findings, Grainger and Beyersmann (2017) put forward the idea of automatic embedded stem activation: priming is not only triggered by an affix, but also emerges because the stem contained in the word is activated directly from orthography and thus essentially identified as a word itself.

The idea of automatic activation of shorter words embedded within longer words has thus entered the morphological processing debate. However, it is not entirely new in visual word recognition research. Embedded words can be seen as a particular case of orthographic neighbors, namely deletion neighbors (e.g., crow and crown). Within the orthographic literature, embedded word activation has also been examined with masked priming. Contrary to the findings from masked morphological priming, deletion neighbor prime-target pairs were shown to yield inhibitory effects, at least in the absence of a morphological or semantic relationship (Drews & Zwitserlood, 1995; De Moor & Brysbaert, 2000).

To examine whether automatic orthographic activation of embedded words directly spreads to semantics, Bowers, Davis, and Hanley (2005) used a semantic categorization task. Participants had to make judgements about the semantic category of words (e.g., is CROWN a bird?). Crucially, some items contained embedded words that actually fell into the category in question (e.g., CROW in CROWN). When this was the case and the embedded words were of higher frequency than the carrier words, decision times were significantly slower than when the semantic category was unrelated to the embedded word (e.g., is CROWN a vehicle?). This study provided compelling evidence that orthographically embedded words directly activate semantics, and that embedded words and carrier words compete with each other (see Nation & Cocksey, 2009, for converging evidence from beginning readers).

With a focus on deletion neighbors, studies on the spread of activation from orthography to semantics have used carrier words that were usually only one letter longer than the embedded word (e.g., crow-crown). Thus, it is not clear how the effects would hold up when the carrier word features a longer extant orthographic chunk (e.g., car in cartoon). Moreover, previous studies have not examined the influence of the morphological status of the additional letters, i.e., whether they correspond to a suffix. In the morphology literature, a strong focus on masked priming has limited the investigation of how early morphological processing drives activation in the semantic system. However, it is crucial to better understand the dynamics of morphological processing. For example, Grainger and Beyersmann (2017) suggest that lateral inhibition arises between embedded stems and whole-words, like cash and cashew, but the inhibition is overcome and the stem activation is maintained in the presence of a pseudo-suffix, like -er in corner (cf. Taft, Li, & Beyersmann, 2018). Consequently, the spread of activation from orthography to semantics should depend on the morphological status of the non-overlapping letters.

In the present study, we aimed to bring together insights from the two lines of research described above. We used two semantic categorization tasks in Italian in order to examine the automatic semantic activation of embedded words in the presence and absence of (pseudo)suffixes, that is, with corner and cashew items. Words that embed a category-congruent stem (e.g., is CORNER a type of food? where the embedded word corn is a type of food, but the carrier word corner is not) were compared to two different control conditions: (1) the same word in a different, category-incongruent condition (e.g., is CORNER a type of food? vs. is CORNER an animal?) and (2) a different word with no embedded stem in the same category (e.g., is CORNER a type of food? vs. is TIMBER a type of food?). To date, effects of morphemic and pseudomorphemic structure have typically been investigated with (masked) priming lexical decision tasks, for which identification at a shallower level could suffice, and most theories of morphological processing in reading are based on this evidence. In contrast to previous lexical decision studies, semantic categorization moves the focus to meaning, which has been shown to influence morphological processing and, more generally, what kind of information readers extract, even in the face of the exact same sensory stimulation (Marelli, Amenta, Morone, & Crepaldi, 2012; Amenta, Marelli, & Crepaldi, 2015). Longer rejection times for words that embed a category-congruent word (e.g., is CORNER a type of food?) would indicate that not only is the orthographic representation of the stem activated (as in the case of masked priming LDT), but this activation is also fed forward to the semantic level. By comparing embedded words in pseudo-suffixed and non-suffixed words, we can test whether embedded word meaning is activated to the same extent regardless of the presence or absence of a (pseudo)suffix.
If embedded word orthography directly activates semantics (Bowers et al., 2005; Nation & Cocksey, 2009) without any recourse to morpho-orthographic decomposition processes, we should see equal effects for corner and cashew type words. If, however, the presence or absence of a (pseudo)suffix plays a role along the processing path and affects the extent to which the embedded word is activated relative to the carrier word, we should see differences between pseudo-suffixed and non-suffixed words.

Experiment 1

Method

Participants

Thirty-two native Italian speakers (22 female; age: M=23.63 years, SD=3.71, min=19, max=33) participated in the experiment for monetary compensation. All participants reported having normal or corrected-to-normal vision and no history of cognitive or reading-related difficulties. All provided their informed consent to take part in the study.

Materials

We selected 40 Italian nouns as carrier words containing an embedded word that belonged to one of the following categories: animal (eight items), body part (11 items), food (eight items), house (three items), landscape (seven items), and person (three items). The embedded word was always a noun itself embedded at the beginning of the carrier word (e.g., burrone, ravine, containing burro, butter). Carrier words were six to ten letters long (M=7.35, SD=0.95), while embedded words were four to six letters long (M=4.70, SD=0.61). Each embedded word was of higher frequency (log-scale: M=3.35, SD=0.64) than its carrier word (log-scale: M=2.20, SD=0.62), because this has been shown to elicit more stable effects (Nation & Cocksey, 2009; see also Luke & Christianson, 2011, for the dynamic effects of stem and whole-word frequencies in different tasks). Half of the carrier words had a pseudo-suffix after the embedded word (e.g., -one in burrone), while half did not (e.g., -ace in rapace, rapacious).

Due to a linguistic trait of Italian, the word-final vowel can change or drop when a suffix is added. For example, when adding the diminutive suffix -ina to tazza (cup), one gets tazzina (little cup). This was also the case for some of our pseudo-suffixed and non-suffixed items. Therefore, we marked for each word whether only the stem (i.e., tazz-) or the complete singular form (i.e., tazza) was embedded, to check whether this may influence the embedded stem effect. Stem and complete embeddings were roughly equally distributed over conditions.
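The distinction between complete and stem embeddings can be illustrated with a minimal Python sketch. It is not part of the original materials: the function name `embedding_type` is hypothetical, and the rule that a stem is the singular form minus its final vowel is a simplifying assumption based on the tazza/tazzina example above.

```python
def embedding_type(carrier: str, embedded: str) -> str:
    """Classify how `embedded` sits at the beginning of `carrier`.

    Returns 'complete' if the full singular form starts the carrier word,
    'stem' if only the form without its final vowel does, 'none' otherwise.
    """
    carrier, embedded = carrier.lower(), embedded.lower()
    if carrier.startswith(embedded):
        return "complete"
    # Assumption: the stem is the singular form minus its word-final vowel.
    has_final_vowel = len(embedded) > 1 and embedded[-1] in "aeiou"
    if has_final_vowel and carrier.startswith(embedded[:-1]):
        return "stem"
    return "none"


# Examples taken from the text: burro in burrone, tazza in tazzina.
for carrier, embedded in [("burrone", "burro"), ("tazzina", "tazza")]:
    print(carrier, embedded, embedding_type(carrier, embedded))
```

A coding of this kind would only be a starting point; in the study, each item was marked individually.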

Two counterbalanced experimental lists were constructed from the carrier words, such that each embedded word was assigned to its congruent category in one list and to another category in the other. For example, from the eight carrier words containing an embedded animal, four were shown to half of the participants in the category “animal” (congruent condition) and four were shown in another category (incongruent condition). This was reversed for the other half of the participants. Thus, we had a 2 (Congruency: category-congruent vs. category-incongruent) × 2 (Ending: pseudo-suffix vs. non-suffix) design with Congruency as a within-items, within-participants manipulation, and Ending as an across-items, within-participants manipulation.
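The counterbalancing logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used to build the lists: the function `build_lists`, the placeholder item names, and the choice of the alphabetically next category as the incongruent one are all hypothetical. In the real materials, the incongruent category would also have to be one to which neither the carrier nor the embedded word belongs.

```python
def build_lists(items_by_category):
    """Build two counterbalanced lists of (carrier, shown_category, condition).

    `items_by_category` maps the category of the embedded word to the
    carrier words containing it. Each carrier is congruent in one list
    and incongruent in the other.
    """
    categories = sorted(items_by_category)
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    list_a, list_b = [], []
    for idx, category in enumerate(categories):
        # Assumption: the next category serves as the incongruent one.
        other = categories[(idx + 1) % len(categories)]
        for pos, carrier in enumerate(items_by_category[category]):
            congruent = (carrier, category, "congruent")
            incongruent = (carrier, other, "incongruent")
            # Alternate items so each list gets half of each category
            # in the congruent condition.
            if pos % 2 == 0:
                list_a.append(congruent)
                list_b.append(incongruent)
            else:
                list_a.append(incongruent)
                list_b.append(congruent)
    return list_a, list_b


# Placeholder items, not the actual stimuli.
items = {
    "animal": [f"animal_{k}" for k in range(8)],
    "food": [f"food_{k}" for k in range(8)],
}
list_a, list_b = build_lists(items)
print(list_a[:2])
print(list_b[:2])
```

With eight items per category, each list contains four congruent and four incongruent trials per category, which mirrors the animal example given above.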

In addition to the 40 carrier words, which were all NO-answers in the experiment, 40 words were selected as members of the relevant categories to serve as fillers requiring a YES-response. The fillers were matched to the carrier words on lexical characteristics (see Supplementary Material for details).

Procedure

The experiment was run using OpenSesame (Mathôt, Schreij, & Theeuwes, 2012). Participants were instructed to categorize each word as quickly and accurately as possible by pressing one of two buttons on an Arduino response box. Similar to Bowers et al. (2005), each trial began with a fixation cross (800 ms), followed by a blank screen (350 ms), which was in turn followed by the target word, which remained on the screen until keypress or for a maximum of 2,500 ms. Feedback was given for 500 ms after each trial, in the form of a happy or sad smiley. The words were presented in blocks by category, and the order of blocks in the experiment was randomized across participants. Each block started with the presentation of the relevant category. During all trials in a block, the category label was displayed in the top left corner of the screen. Two practice blocks of six trials each were included, using the categories “vehicle” and “weather,” which were not used in the main experiment.

Analysis

All analyses were carried out using R (R Core Team, 2014). We first checked error rates on the carrier words in each category, to ensure that they were correctly rejected as non-members of the relevant category in at least 60% of the trials. This led to the exclusion of two items. For the response-time analysis, incorrect responses (5.18%) were excluded, as were response times faster than 200 ms (0.09%). Further outlier trimming was done following Baayen and Milin (2010): we fitted a simple model with only random effects and excluded all data points with residuals exceeding 2.5 SD (2.95%). Error data and cleaned response times were then analyzed using (generalized) linear mixed-effects modelling as implemented in the lme4 package (Bates, Mächler, Bolker, & Walker, 2015). Models included Congruency (category-congruent embedded word vs. category-incongruent embedded word), Ending (pseudo-suffix vs. non-suffix) and their interaction as contrast-coded, categorical fixed effects, and random intercepts for Subject, Item, and Category. Next, Embedding (stem vs. complete) was added as a further fixed effect, to test whether stem and complete embeddings evoked different responses. Overall effects were tested using Type III sum of squares and χ² Wald tests.
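The three exclusion steps for the response-time data can be illustrated with a minimal Python sketch. It is not the code used for the reported analyses, which were run in R with lme4. In particular, the residuals below come from simple additive subject and item means, used here as a rough stand-in for the random-effects-only mixed model; the function name `trim_trials` and the trial fields are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def trim_trials(trials, min_rt=200.0, sd_cutoff=2.5):
    """Return the trials kept after the three exclusion steps.

    Each trial is a dict with keys 'subject', 'item', 'rt' (ms), 'correct'.
    """
    # Steps 1 and 2: drop incorrect responses and responses under 200 ms.
    kept = [t for t in trials if t["correct"] and t["rt"] >= min_rt]
    if not kept:
        return []

    # Step 3 (stand-in, NOT a mixed model): residuals from an additive
    # model with subject and item means.
    grand = mean(t["rt"] for t in kept)
    by_subject, by_item = defaultdict(list), defaultdict(list)
    for t in kept:
        by_subject[t["subject"]].append(t["rt"])
        by_item[t["item"]].append(t["rt"])
    subject_mean = {s: mean(v) for s, v in by_subject.items()}
    item_mean = {i: mean(v) for i, v in by_item.items()}
    residuals = [
        t["rt"] - subject_mean[t["subject"]] - item_mean[t["item"]] + grand
        for t in kept
    ]

    # Exclude data points whose residual exceeds 2.5 SD of the residuals.
    sd = pstdev(residuals)
    if sd == 0:
        return kept
    return [t for t, r in zip(kept, residuals) if abs(r) <= sd_cutoff * sd]
```

Applied to a data set with one row per trial, the function returns the subset that would enter the response-time model; a mixed model with random intercepts would shrink the subject and item estimates and could therefore flag slightly different trials.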

Results

For the error-rate analysis, the model revealed a significant main effect of Congruency (χ²=7.95, b=-0.43, t=-2.82, p=.005, ΔER=3.08%), indicating that more errors were made when the embedded word was category congruent. Neither the main effect of Ending (χ²<1, b=0.17, t=0.85, p=.396, ΔER=1.79%) nor the interaction of Congruency and Ending (χ²=2.59, b=0.25, t=1.61, p=.108) was significant. Descriptive statistics based on model estimates are depicted in Fig. 1 (upper panel). Adding Embedding as a main and interaction effect did not change the pattern of results: the Congruency effect remained significant (χ²=7.90, b=-0.57, t=-2.81, p=.005, ΔER=3.23%). Embedding itself did not seem to play much of a role (χ²=1.67, b=0.31, t=1.29, p=.196, ΔER=0.88%); only its interaction with Congruency came close to significance (χ²=3.59, b=-0.39, t=-1.90, p=.058), in the direction that the congruency effect was larger for complete embeddings (ΔER=4.80%) than for stem embeddings (ΔER=1.66%). No other main or interaction effect reached significance (all t<1, all p>.300).

Fig. 1. Error rates (upper plot) and response times (lower plot) in the different conditions based on model estimates. Error bars represent 95% confidence intervals.

For the response-time analysis, the model revealed a significant main effect of Congruency (χ²=6.44, b=13.34, t=2.54, p=.011, ΔRT=27 ms), indicating that words were rejected more slowly as non-members of a category when the embedded word was congruent with it. Again, neither the main effect of Ending (χ²<1, b=-4.72, t=-0.45, p=.652, ΔRT=9 ms) nor the interaction of Congruency and Ending (χ²<1, b=1.48, t=0.25, p=.800) was significant. Model estimates for RT in the design cells are depicted in Fig. 1 (lower panel). The Congruency effect remained robust when Embedding was added to the model (χ²=6.53, b=14.60, t=2.55, p=.011, ΔRT=29 ms) and no other main effects or interactions were significant (all t<1.40, all p>.100).
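For readers who want to see how the reported statistics relate to one another, the following Python sketch checks the response-time values given above. It rests on two assumptions that are not stated explicitly in the text: that each single-degree-of-freedom Wald χ² is the square of the corresponding t value, and that the two-level factors were coded as -1/+1, so that the condition difference is twice the coefficient. The helper names are hypothetical. The second relation applies only to the linear response-time model, not to the error model, whose coefficients are on the logit scale.

```python
import math


def wald_p_from_chi2(chi2: float) -> float:
    """Two-sided p-value for a Wald chi-square statistic with 1 df."""
    # For 1 df, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def effect_from_coefficient(b: float) -> float:
    """Condition difference implied by b, ASSUMING -1/+1 contrast coding."""
    return 2.0 * abs(b)


# Reported values for the main effect of Congruency on response times.
chi2, b, t = 6.44, 13.34, 2.54
print("t squared:", round(t ** 2, 2))
print("p from chi-square:", round(wald_p_from_chi2(chi2), 3))
print("implied difference (ms):", round(effect_from_coefficient(b)))
```

Under these assumptions the reported χ², t, p, and ΔRT values are mutually consistent up to rounding.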

Experiment 2

Method

Participants

Another 31 native Italian speakers (22 female; age: M=23.03 years, SD=4.27, min=19, max=35; normal or corrected-to-normal vision; no history of cognitive or reading-related difficulties) participated in the experiment for monetary compensation after providing their informed consent for the study.

Materials

The items of interest were the same 40 words with embedded stems as in Experiment 1. We paired each of the 40 nouns with another Italian noun that contained no embedded stem, but was matched on length (M=7.35, SD=0.95), frequency (M=2.20, SD=0.62), OLD20 (M=1.85, SD=0.38), and type of ending (pseudo-suffix/non-suffix).

In this experiment, no list design was necessary: each embedded word was only shown in its congruent category (e.g., burrone, ravine, containing burro, butter, in the category FOOD). The paired no-embedding control item (e.g., balcone, balcony) was shown in the same category. Thus, we had a 2 (Embedding Presence: yes vs. no) × 2 (Ending: pseudo-suffix vs. non-suffix) design with Embedding Presence and Ending as across-items, within-participants manipulations.

As in Experiment 1, equally many category-congruent words were included to serve as fillers requiring a YES-response. The fillers were matched on lexical characteristics (see Supplementary Material for details).

Procedure

The procedure was the same as in Experiment 1.

Analysis

The analyses were identical to those in Experiment 1. We excluded one item with an overall error rate exceeding 60%. Incorrect responses amounted to 5.43% of the data points. We further trimmed 2.72% of the data as outliers. The models included Embedding Presence (yes vs. no), Ending (pseudo-suffix vs. non-suffix), and their interaction as contrast-coded, categorical fixed effects, and random intercepts for Subject, Item, and Category.

Results

For the error-rate analysis, the model revealed a significant main effect of Embedding Presence (χ²=6.72, b=0.48, t=2.59, p=.010, ΔER=2.33%), indicating that more errors were made when words contained a category-congruent embedding. Neither the main effect of Ending (χ²=1.20, b=0.20, t=1.10, p=.273, ΔER=1.35%) nor the interaction of Embedding Presence and Ending (χ²=1.12, b=-0.20, t=-1.06, p=.290) was significant. Model estimates are depicted in Fig. 2 (upper panel).

Fig. 2. Error rates (upper plot) and response times (lower plot) in the different conditions based on model estimates. Error bars represent 95% confidence intervals.

For the response-time analysis, the model again revealed a significant main effect of Embedding Presence (χ²=5.51, b=-17.93, t=-2.35, p=.019, ΔRT=36 ms), indicating that words were rejected more slowly as non-members of a category when they contained a category-congruent embedded word. The main effect of Ending was marginally significant (χ²=3.58, b=-15.39, t=-1.89, p=.058, ΔRT=31 ms), indicating slightly slower responses overall to pseudo-suffixed compared to non-suffixed words. The interaction of Embedding Presence and Ending was again not significant (χ²<1, b=5.87, t=0.77, p=.442). Descriptive statistics are depicted in Fig. 2 (lower panel).

Discussion

Two lines of research, on morphological and orthographic processing, have provided independent evidence that readers access sublexical information in visual word identification. Here, we made an attempt at bringing together these lines of research to shed light on what determines access to embedded stems, and what the role of affixes in this process is. The results from two semantic categorization tasks provide clear-cut evidence that the lexical identification system activates the meaning of embedded word stems regardless of the presence or absence of morphological structure. When a word contained an embedded (pseudo)stem that was category-congruent (e.g., corn in corner is a type of food), rejecting the carrier word as a non-member of that category (corner is not a type of food) was harder, as evidenced by both higher error rates and longer response times. This was true irrespective of what followed the embedded stem, either a (pseudo)affix (e.g., corner) or a non-morphological ending (e.g., peace).

Our results add to the growing pool of evidence showing that neither a true morphological relationship nor the presence of a pseudo-suffix is necessary for the visual word identification system to spot an embedded stem (cf. Beyersmann et al., 2014; Beyersmann et al., 2016; Hasenäcker et al., 2016; Heathcote et al., 2018). This emerging evidence was captured in the model proposed by Grainger and Beyersmann (2017), which highlights the role of stem identification as an independent process unrelated to morphological decomposition. On this view, affixes would play little role in stem identification. However, the model predicts additional activation (and thus a larger effect) when the embedded stem is followed by a suffix. In our data, we did not find strong support for this prediction, suggesting that the feed-forward dynamics of activation to semantics do not depend on the presence of an affix.

Viewed from the perspective of the morphological processing literature, our study shows that (pseudo)stem activation is not a mere by-product of the masked priming lexical decision task, but can be observed across tasks. This is in agreement with the finding of Amenta et al. (2015) that stems are also accessed in sentence reading (see also Luke & Christianson, 2011, for converging evidence for inflections), whether or not they contribute to whole-word meaning. Amenta et al. (2015) eye-tracked their participants as they read sentences including genuine derivations and, critically for a comparison with the present study, pseudo-complex words (e.g., copertina meaning book cover, not the diminutive of coperta, blanket). They found an inhibitory effect of stem word frequency for these latter items: stems attracted longer fixations when they did not contribute to word meaning, which mirrors the worse performance of our participants on words with conflicting embedded stems. This suggests that the identification of embedded stems is automatic, in the sense of involuntary and not task-dictated. What seems to be task-dictated, instead, is access to semantics, which is another similarity between our results and those of Amenta et al. (2015). Both tasks emphasize access to word meaning; they can only be accomplished if words are fully understood. Under these conditions, readers seem to access not only the form but also the meaning of the embedded stem, contrary to what is generally found in masked priming lexical decision experiments (e.g., Davis & Rastle, 2010; Longtin, Segui, & Halle, 2003; Rastle et al., 2004; but see Feldman, O’Connor, & del Prado Martín, 2009). So, essentially, it seems that we cannot avoid spotting an embedded stem, and we access its meaning when the task pushes for semantics, even in situations where the embedded stem meaning interferes with task requirements.

The finding that rejecting a word with an embedded category-congruent word was harder, both in comparison to the same word presented in an incongruent category (Experiment 1) and in comparison to a word without any embedding (Experiment 2), indicates that it is not the mere presence of an embedded word that slows down the response, but its category membership. Taken together, the two experiments thus suggest that the effect is semantic rather than purely lexical.

Given that affixes did not modulate the results in the present experiments, a question arises as to whether stem activation in pseudo-morphological words and the identification of embedded word neighbors actually draw on the same mechanisms and are, in fact, the same phenomenon. Viewed from the perspective of the deletion neighbor literature, our results suggest that the identification of embedded word neighbors extends well beyond the close orthographic neighborhood investigated thus far. Embedded words in previous studies were typically one-letter deletion neighbors (e.g., crow-crown). In the present experiments, the length difference between embedded and carrier words spanned from three to five letters. Orthographic similarity, however computed, was thus much weaker in our experiments, and we found similar results nonetheless. How long the extant string of letters after the embedded word can be, and whether effects are graded depending on the number of additional letters or on orthographic similarity, are questions that we leave open to further research. Snell, Grainger, and Declerck (2018) took a step in this direction by providing evidence that the effect of embedded words in sentence reading depends on the length of the embedded word relative to the carrier word in Dutch, but not in English. Future research needs to address this issue in further detail to determine the boundaries of embedded stem/deletion neighbor effects and, eventually, to shed light on the orthographic coding schemes that the human lexical system adopts.

Finally, it is worth mentioning that our data were not affected by whether a whole word was embedded (e.g., burro in burrone) or rather its stem alone (e.g., latt(-e) in lattina). On the surface, this may point to a more prominent role for morphology. Alternatively, however, it could just be that lexical representations are fully accessed via their stems, particularly in Italian where content word stems are all bound to take in affixes (e.g., gatto, cat, gatta, female cat, gatti, cats, but gatt is not a word itself).

To conclude, the present study ties together orthographic and morphological research on visual word identification by comparing the effect of embedded stems in pseudo-suffixed and non-suffixed words using a semantic categorization task. We showed that words were harder to reject as non-members of a category when they contained embedded word stems that were indeed category-congruent, very similar to what has been reported before for closer deletion neighbors. Importantly, in our semantic task, the effects were not modulated by the morphological structure of the words, that is, whether they ended in a pseudo-suffix or a non-suffix. These findings provide evidence that the lexical identification system automatically identifies embedded word stems, and activates their meaning from orthography when the task requires semantic information.

Open Practices Statement

The data and materials for the experiment reported here are available at https://osf.io/f2rjd/. The experiment was not preregistered.