Complex meanings are often derived by simply combining the meanings of constituents in an expression. However, some meanings require comprehenders to add unexpressed meaning, as in (1), where it is necessary to determine what event is being begun:

  (1) The man began the book that same afternoon.

Numerous studies have shown that expressions that are hypothesized to require enriched compositional operations are indeed more costly to process than minimally contrastive control expressions (for a review, see Pylkkänen & McElree, 2006). In these studies, the coerced expressions all involve type mismatches, which we will term typing violations, in which the semantic type of a phrase (e.g., the book, which is an entity-denoting object) does not coincide with the semantic type expected in the sentence (e.g., that began is an event-selecting verb). As is discussed below, the processing cost for these constructions appears to lie in the need to introduce new semantic material so that the complement satisfies the semantic requirements of the element, such as the verb, that selects for the complement. In (1), the book is interpreted as “to read the book” (although other interpretations, such as “to write the book,” are possible), which refers to an event. This processing cost is commonly referred to as the coercion cost.

Now consider the noun phrase the difficult mountain in (2):

  (2) The man photographed the difficult mountain.

In (2), there is no type mismatch, since the verb selects for an entity and mountain satisfies this requirement. At the same time, adjectives such as difficult modify events. In order to satisfy this demand, people might add additional semantic content, interpreting the adjective–noun phrase as meaning a mountain that is difficult to climb. If this is true, the process of arriving at an enriched interpretation of the adjective–noun phrase appears to be quite similar to the process of interpreting other coerced constructions. If the coercion cost indeed reflects the introduction of new semantic material, we would expect adjective–noun combinations that require enriched interpretation to be harder to process than adjective–noun combinations that can be interpreted straightforwardly. Crucially, this would indicate that coercion effects can be found in the absence of a type mismatch and within a single noun phrase (rather than between two larger elements in a sentence, such as a verb and its complement).

Most research has addressed cases of complement coercion such as (1) and has provided a relatively consistent picture of the mechanisms underlying the resolution of coerced constructions. In (1), the noun phrase the book is the complement of the event-selecting verb began. Since the book is an entity-denoting object while the verb expects an event, strict compositional interpretation would lead to semantic mismatch (Jackendoff, 1997, 2002; Pustejovsky, 1995; Pylkkänen & McElree, 2006). In order to resolve this discrepancy, the interpretation of the complement is “enriched” by (mentally) adding semantic structure so that it refers to an event involving a book. For example, an event often associated with books is that they are read, so a likely interpretation of (1) is that the man started to read the book. Since “to read the book” is an event, it now satisfies the selection criteria of the verb, and compositional interpretation can proceed.

Results from a range of different methodologies clearly indicate that sentences containing complement coercion are more costly to process than noncoerced constructions, such as . . . read the book or . . . began the fight, and that this cost is evident while, or soon after, the critical word (book) is processed. These methodologies include self-paced reading and eye-tracking (Frisson & McElree, 2008; Lapata, Keller, & Scheepers, 2003; McElree, Frisson, & Pickering, 2006a; McElree, Traxler, Pickering, Seely, & Jackendoff, 2001; Pickering, McElree, & Traxler, 2005; Traxler, McElree, Williams, & Pickering, 2005; Traxler, Pickering, & McElree, 2002), magnetoencephalographic (MEG; Pylkkänen & McElree, 2007), ERP (Baggio, Choma, van Lambalgen, & Hagoort, 2010; Kuperberg, Choi, Cohn, Paczynski, & Jackendoff, 2010), and speed–accuracy trade-off (SAT) (McElree, Pylkkänen, Pickering, & Traxler, 2006b) studies.

Significant progress has been made in pinpointing the source of the extra cost, and it has been argued that it reflects the application of complex composition operations to generate unexpressed semantic structure (see Frisson & McElree, 2008, and Pylkkänen & McElree, 2006; although cf. Kuperberg et al., 2010, for a slightly different account). Concretely, in the case of complement coercion, this means that an event sense is selected or generated for the complement if it is of a nonevent type, such that the object interpretation of book is superseded by an event interpretation (verbing the book). Since this process results in an interpretation that goes beyond a simple, compositional interpretation, it is referred to as an enriched composition (Jackendoff, 1997). In short, we argue that the “coercion cost” found in complement coercion constructions originates in the need to introduce new semantic material so that the complement satisfies the semantic requirements from the verb.

Small clause constructions also involve the generation of extra semantic content. McElree, Pylkkänen, Pickering, and Traxler (2006b) and Pylkkänen, Martin, McElree, and Smart (2009) investigated the processing of adjectival predicates, in which the adjective is derived from an event-selecting verb. In (3), the subject NP ice refers to an object, while in (4) the subject NP fall refers to an event:

  (3) The climber proved the ice survivable.

  (4) The climber proved the fall survivable.

The adjectival predicate, survivable, requires a governing eventive NP. When this is already the case, as in fall, processing will be straightforward. However, when the NP is of a different type, it needs to be coerced into an eventive interpretation, which entails the introduction of additional content (e.g., [climbing of the ice]). Results from both SAT (McElree et al., 2006b) and MEG (Pylkkänen, Martin, McElree, & Smart, 2009) studies indicate that the enriched composition necessary to interpret noneventive NPs results in increased (or dissimilar) processing.

The final construction to mention, aspectual coercion, is more controversial, from both a theoretical and a processing perspective. In (5), there is a mismatch between the default telic (i.e., bounded) reading of the verb hopped and the unbounded, iterative interpretation suggested by the durative phrase for 10 minutes. This mismatch is absent for verbs such as glide in (6), whose default meaning is atelic:

  (5) The insect hopped effortlessly for 10 minutes.

  (6) The insect glided effortlessly for 10 minutes.

Experimental evidence of how language users process aspectual coercion is equivocal, with a processing cost found in certain tasks (Brennan & Pylkkänen, 2008; Piñango, Zurif, & Jackendoff, 1999; Piñango, Winnick, Ullah, & Zurif, 2006; Todorova, Straub, Badecker, & Frank, 2000), but not in tasks more closely resembling normal reading (Pickering, McElree, Frisson, Chen, & Traxler, 2006).

There might be several reasons why processing costs for aspectual coercion are inconsistent. First, as was pointed out by Pylkkänen and McElree (2006; see also Pickering et al., 2006), an alternative interpretation not involving aspectual coercion is possible for (5); it could, in theory, be one very long hop, and the felicity of such an interpretation will depend on the individual items used. Second, the reason to change the verb from a bounded to an unbounded interpretation does not come from semantic considerations but depends on our real-world knowledge that the insect is more likely to have made a series of hops rather than one long one. Third, the generation of additional content is minimal in aspectual coercion, since the only change is the adoption of the iterative interpretation of the verb.

Coercion in adjective–noun phrases

In adjective–noun phrases such as (7), the adjective difficult calls for an event to modify, which may require enriched composition to interpret the expression in a manner such as the mountain being difficult to climb (Pustejovsky, 1995).

  (7) The athlete is convinced that the difficult mountain will require all his strengths.

To explain the eventive interpretation of adjective–noun combinations such as difficult mountain, Pustejovsky (1995) proposed a generative mechanism called selective binding. According to this approach, adjectives such as difficult are treated as event predicates that modify a noun by selecting some part of its semantic content. Concretely, the adjective selectively binds with an activity stored in one of the qualia of the noun (a structure of defining attributes that is part of a word’s lexical representation). In this case, the telic quale contains information about the typical purpose and functions of the entity denoted by the word, and the activities the object is typically utilized for, which, in the case of mountain, might be climbing it. Hence, in order for the event-selecting adjective to be interpreted correctly, it modifies one particular, eventive aspect of the noun’s lexical representation, although it does not change the noun’s semantic type (which can be seen in its ability to be in conjunction with an intersective adjective: the high and difficult mountain). When the noun itself already refers to an event (e.g., exercise in difficult exercise), selective binding is not necessary, and normal composition can take place. In short, adjective–noun combinations such as difficult mountain are examples of enriched composition, since the interpretation depends on more than simple composition.

There are, however, different views of how adjective–noun combinations are interpreted. For example, adjectives tend to be highly polysemous, as can be seen in the different interpretations of difficult in difficult mountain, difficult child, and difficult road, or of fast in fast audition, fast typist, and fast service. It is possible that all these different senses of an adjective are enumerated in the lexicon and are activated upon reading the word (see Klein & Murphy, 2002, for arguments in favor of such a lexicon). If so, then the interpretation of the adjective–noun phrase would not involve coercion processes but, rather, a standard composition of the appropriate sense of the adjective with the meaning of the noun. In that case, one would not expect a processing cost for expressions such as difficult mountain. In contrast, if the interpretation of the adjective is largely determined by the head noun, comprehenders need not activate all the individual stored senses (if they are indeed lexically stored) but, rather, can activate a monosemous or underspecified meaning (see Frisson & Pickering, 2001, for a discussion of underspecification). The specific interpretation is then achieved by its interaction with the head noun. This view is in line with the assumptions behind selective binding.

In short, adjective–noun phrases such as difficult mountain might show a cost related to the activation or generation of additional unexpressed semantic content (such as “to climb”) in order to arrive at an enriched compositional interpretation. If this is the case, this would provide evidence that costly enriched compositional processes also operate in the absence of a type mismatch and within a single noun phrase. Alternatively, assuming that all the different senses of the adjective get activated immediately, straightforward composition can take place, and no cost is expected.

Method

Participants

Thirty-six students from the University of Massachusetts at Amherst took part for course credit or financial reward. All spoke American English as their first language, and all had normal or corrected-to-normal vision. None had participated in the pretest (see below).

Materials

The materials were 20 item pairs, containing either a coerced or a noncoerced adjective–noun combination (see Examples 7 and 8 and the Appendix):

  (7) The athlete is convinced that the difficult mountain will require all his strengths and extra precautions. As far as I know, he would be one of the first to undertake this feat.

  (8) The athlete is convinced that the difficult exercise will require all his strengths and extra precautions. As far as I know, he would be one of the first to undertake this feat.

The target nouns were controlled for length (both coerced and noncoerced: 6.9 characters, t(19) < 1) and frequency (coerced, 36.8 per million; noncoerced, 37.3 per million; t(19) < 1; based on the CELEX database, Baayen, Piepenbrock, & Van Rijn, 1993). We also checked co-occurrence frequencies for the adjective–noun combinations in the 100-million-word British National Corpus (Burnage & Dunlop, 1992). Co-occurrence frequency was extremely low, with only three coerced and four noncoerced phrases appearing in the corpus. On average, co-occurrence frequency for coerced phrases was 0.40 per 100 million and 0.95 for noncoerced phrases, t(19) < 1. In addition, we log-transformed the number of “hits” from a Google search of the adjective–noun combinations and calculated the difference score. These scores were then correlated with the difference scores for all reading measures between the coerced and noncoerced conditions. None of the correlations approached significance (all ps > .14). Lastly, we checked the latent semantic analysis (LSA; Landauer & Dumais, 1997) scores of the adjective–noun combinations, using document-to-document analysis. The LSA scores were low overall, and there was no significant difference between the noncoerced and coerced phrases (.02 vs. .01), t(19) < 1.
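The difference-score control described above can be sketched as follows. This is only an illustration in Python, not the authors' script: the function names are hypothetical, base-10 logarithms and nonzero hit counts are assumptions, and no real hit counts or reading times are reproduced here.

```python
import math

def log_difference(hits_coerced, hits_noncoerced):
    """Difference of log-transformed hit counts for one item pair.
    Assumes both counts are greater than zero; the log base (10) is an assumption."""
    return math.log10(hits_coerced) - math.log10(hits_noncoerced)

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

In use, one list would hold the per-item log difference in hits and the other the per-item difference (coerced minus noncoerced) on a given reading measure; `pearson_r` would then be computed once per reading measure.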

In a plausibility pretest, 42 undergraduates from New York University rated the actual target sentences, together with 113 other sentences from different experiments and fillers, on a scale of 1 to 7. They were asked to circle “1” if they “think the sentence is absolutely implausible or doesn’t make sense at all” and “7” if they “think the sentence is perfectly plausible or acceptable,” and to use the other numbers to make finer distinctions. Coerced sentences were rated 5.2 (SD = 0.92), and noncoerced sentences 5.3 (SD = 0.70), t(19) < 1.

The experimental items were divided over two lists, with 10 items of each condition in each list, and mixed with 103 filler items of different types. The items were placed in a fixed random order.
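A two-list counterbalancing scheme of this kind can be sketched as below. The alternating assignment by item number is an assumption made for illustration; the text only states that each list contained 10 items per condition, and the function name is hypothetical.

```python
def build_lists(n_items=20):
    """Assign each item pair to two lists so that every item appears once per
    list, in opposite conditions, with equal numbers of each condition."""
    list_a, list_b = {}, {}
    for item in range(1, n_items + 1):
        if item % 2 == 1:
            # odd-numbered items: coerced in list A, noncoerced in list B (assumed scheme)
            list_a[item], list_b[item] = "coerced", "noncoerced"
        else:
            list_a[item], list_b[item] = "noncoerced", "coerced"
    return list_a, list_b
```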

Procedure

We recorded eye movements using a Fourward Technologies Dual Purkinje Generation 5.5 eyetracker. This eyetracker records eye position every millisecond and has a resolution of less than 10 min arc. Stimuli were presented on a 15-in. color monitor at a distance of 61 cm from the participants’ eyes. Participants used both eyes to read the sentences, but only the position of the right eye was recorded. In order to minimize head movements, each participant was fitted with a bite bar. In addition, a forehead rest was used, and participants’ heads were held in place with Velcro straps.

Before the start of the experiment, a calibration procedure was performed. Calibration was checked before the presentation of each new stimulus, and recalibration was carried out whenever the experimenter deemed necessary. The experiment took about 45 min, with a short break halfway through.

The participants were asked to read at a normal pace. The items appeared on the screen when the participants looked at a box coinciding with the first letter of the first sentence. Once they finished reading, they pressed a button to make the item disappear. In order to ensure that they read for understanding, yes/no comprehension questions, balanced across conditions, followed 50% of the items. Accuracy for the questions was 91.9%.

Analyses

An automatic procedure combined short fixations of less than 80 ms and within one character of another fixation into one larger fixation. Fixations longer than 1,200 ms and any remaining fixations shorter than 80 ms were excluded. Items with blinks in the critical regions or which showed misalignments were removed. In total, about 2.4% of the data were eliminated.
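The fixation-cleaning step can be approximated by the sketch below. It is one possible reading of the procedure, not the software actually used: fixations are assumed to be (character position, duration in ms) pairs in temporal order, a short fixation is merged into the temporally adjacent fixation (previous one first, then next) if that fixation lies within one character, and the function name is hypothetical.

```python
def clean_fixations(fixations, short_ms=80, long_ms=1200, max_chars=1):
    """Merge short fixations into a neighbouring fixation within max_chars,
    then drop remaining short fixations and overly long fixations."""
    fixes = [[pos, dur] for pos, dur in fixations]
    absorbed = [False] * len(fixes)
    for i, (pos, dur) in enumerate(fixes):
        if dur >= short_ms:
            continue
        # try the previous fixation first, then the next one (assumed order)
        for j in (i - 1, i + 1):
            if (0 <= j < len(fixes) and not absorbed[j]
                    and abs(fixes[j][0] - pos) <= max_chars):
                fixes[j][1] += dur
                absorbed[i] = True
                break
    return [(pos, dur) for (pos, dur), gone in zip(fixes, absorbed)
            if not gone and short_ms <= dur <= long_ms]
```

For example, in the sequence `[(10, 200), (11, 60), (20, 50), (30, 1500), (40, 250)]` the 60-ms fixation is merged into its neighbour at position 10, while the isolated 50-ms fixation and the 1,500-ms fixation are excluded.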

We report analyses for three regions: the adjective (including the preceding determiner: the difficult), the noun (mountain/exercise), and a spillover region, defined as the next two words, or the next three words if one was a determiner, following the target noun (will require).

We report the following measures: first-pass duration (the sum of all fixations in a region before that region is left to the left or right; for single-word regions, first-pass duration is equivalent to first-gaze duration), first-pass regressions (the percentage of regressions following a first-pass fixation), regression-path duration (the sum of all fixations starting from the first fixation in a region until the region’s right-hand boundary is crossed; this can include rereading of earlier parts of the sentence), second-pass duration (the time spent rereading a region after having crossed its right-hand boundary on first pass), regressions into a region (total percentage of regressions into a region), and total gaze duration (the total amount of time spent reading a region).
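To make these definitions concrete, the sketch below computes the duration measures for one region from a single trial. It rests on several assumptions: regions are numbered from left to right, each fixation is a (region index, duration in ms) pair in temporal order, second-pass duration is taken to be time in the region after its right-hand boundary has first been crossed, and the case in which a later region is fixated before the target region is not treated specially. The function name is hypothetical.

```python
def reading_measures(fixations, region):
    """Return duration measures (ms) for `region`, or None if it was never fixated."""
    first = next((i for i, (r, _) in enumerate(fixations) if r == region), None)
    if first is None:
        return None

    # First pass: consecutive fixations in the region from first entry until it is left.
    i, first_pass = first, 0
    while i < len(fixations) and fixations[i][0] == region:
        first_pass += fixations[i][1]
        i += 1
    # A first-pass regression: the region is left to the left.
    first_pass_regression = i < len(fixations) and fixations[i][0] < region

    # Regression path: everything from first entry until the right boundary is crossed.
    j, regression_path = first, 0
    while j < len(fixations) and fixations[j][0] <= region:
        regression_path += fixations[j][1]
        j += 1

    # Second pass: time in the region after the right boundary was first crossed.
    second_pass = sum(d for r, d in fixations[j:] if r == region)
    total = sum(d for r, d in fixations if r == region)
    return {"first_pass": first_pass,
            "first_pass_regression": first_pass_regression,
            "regression_path": regression_path,
            "second_pass": second_pass,
            "total": total}
```

Note that, under this reading, a fixation that returns to the region after a leftward regression but before the right boundary is crossed counts toward regression-path and total duration but toward neither first-pass nor second-pass duration.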

Results

We analyzed the data using multiple regression mixed-effects models with participant and item as random effects, using R (R Development Core Team, 2007) and the lme4 package (Bates, Maechler, & Dai, 2008). For all analyses, we calculated the null model and a model including the coercion (coerced vs. noncoerced) predictor term and then compared the two models in order to see whether the addition of the term improved the model significantly. Estimated marginal means can be found in Table 1.
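The arithmetic behind such a model comparison can be sketched as follows. This is not the lme4 code used for the analyses; it only illustrates the likelihood ratio test for one added predictor, where the statistic is twice the difference in log-likelihood and is referred to a chi-square distribution with 1 degree of freedom. The function names are hypothetical, and the upper-tail probability uses the identity P(χ²₁ > x) = erfc(√(x/2)), which holds only for 1 degree of freedom.

```python
import math

def lr_statistic(loglik_null, loglik_full):
    """Likelihood ratio statistic for two nested models fitted to the same data."""
    return 2.0 * (loglik_full - loglik_null)

def chi2_pvalue_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Applied to a statistic of 2.67, `chi2_pvalue_df1` gives a value of about .10, and applied to 6.40 a value of about .01.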

Table 1. Reading times (in milliseconds) and percentages of regressions for the three regions of interest

Adjective region

Adding coercion to the model resulted in a marginally better model for the percentage of regressions into this region: likelihood ratio chi-square, χ²(1) = 2.67, p = .10. The means indicate that this was caused by a higher percentage of regressions to the adjective for the coerced condition. No improvement was found for any of the other measures, all χ²s(1) < 1.74, all ps > .18.

Noun region

No significant improvement for the coercion predictor was found in the early measures (first-pass duration, first-pass regressions, and regression-path duration), all χ²s(1) < 1.00, all ps > .36. Coercion did improve the model for second-pass durations, χ²(1) = 6.40, p = .01, with longer durations in the coerced than in the noncoerced condition. There were also more regressions into the noun region in the coerced than in the noncoerced condition, χ²(1) = 5.95, p < .02. The coercion predictor had no effect on the total gaze measure.

Spillover region

No significant improvement was found for the first-pass duration measure, χ²(1) < 1.00, but the first-pass regression data showed a marginal effect, χ²(1) = 3.62, p < .06, with more first-pass regressions triggered in the coerced condition than in the noncoerced condition. The regression-path effect was significant, χ²(1) = 5.14, p = .02, with longer times in the coerced condition than in the noncoerced condition. No other improvements to the model were found for any of the other measures, all χ²s(1) < 2.06, all ps > .15.

Discussion

The results reveal that processing is more effortful when the adjective–noun phrase does not have a straightforward compositional interpretation (e.g., difficult mountain), than when it does (e.g., difficult exercise). The first evidence of this cost is slightly delayed (as is the case in most eye-tracking experiments on coercion; e.g., McElree et al., 2006a; Traxler et al., 2002, 2005), occurring on first-pass regressions and regression-path durations from the spillover region and on the second-pass measure on the noun itself and regressions into that region.

These results are important for two reasons. First, they present the first evidence that costly coercion processes can also operate within a single noun phrase. While parallelism between and within different components is widely assumed in linguistic theory (see van der Hulst, 2006), this idea has not been systematically researched from a processing perspective, and it might be that what looks similar at different levels is nevertheless processed differently. For example, while strong incrementality is assumed at the sentence level, with each word being interpreted immediately, there is evidence that this does not necessarily occur at the phrasal level. Frisson, Pickering, and McElree (2003) found that adjectival phrases with a strong subsective bias (with the adjective picking out a subset of the set denoted by the noun, as in strong candidate) were processed at the noun phrase level first before being interpreted with respect to the wider sentence context. Hence, even if coercion rules hold between different components of a sentence (e.g., a verb and its complement), one cannot merely assume that these rules will engender equivalent processing profiles at other levels of the semantic composition.

Second, the observed cost suggests that a specific, fine-grained sense of the adjective was not used (for example, senses such as “difficult to climb,” “difficult to handle,” or “difficult to negotiate” for difficult mountain, difficult child, and difficult road, respectively). If that had been the case, enriched composition would not have been necessary, and no cost would have been expected. The processing cost therefore suggests that, initially, a semantically underspecified meaning of the adjective is activated (something like “hard to do something in relation to X,” which implies that the adjective will modify an action associated with the noun phrase) and that the correct interpretation comes from the interaction with its head noun. According to this view, rather than storing all possible senses of a word (comparable to storing the different meanings of homonyms), only the semantically underspecified meaning needs to be stored. As a result, there is no possible competition between different activated senses, and the systematicity between the different interpretations of the adjective remains apparent.

One inherent difference between the two noun sets is that the event nouns are either nominalized activities (e.g., exercise) or derived from activities (e.g., meeting). It is conceivable that event nouns are easier to process than object nouns. However, there are at least two reasons why we think this is unlikely. First, Traxler, Pickering, and McElree (2002) compared event (e.g., fight) and object (e.g., puzzle) nouns and found increased processing times for object nouns only when preceded by a coercing verb (e.g., started), but not when preceded by a neutral verb (e.g., saw). Second, Farmer, Christiansen, and Monaghan (2006) found that verblike nouns presented unambiguously as a noun were processed more slowly than nounlike nouns in the same sentence context. Using their database, we obtained noun- and verblike scores for 11 event nouns and 7 object nouns from our experiment. On average, the event nouns were more verblike and the object nouns more nounlike, and they differed significantly in verblikeliness, t(16) = 3.25, p < .01. Hence, if anything, we would expect a processing cost for the event nouns, which is contrary to our results. It might also be that the two types of nouns combine differently with adjectival modifiers. However, we do not know of any evidence suggesting that adjective + object noun constructions are more difficult to process than adjective + event noun constructions in general. Given the absence of any correlation effect with two measures of co-occurrence frequency and with the LSA scores, there is no reason to suspect that the adjective + object noun combinations used in our study are particularly hard to process either.

In conclusion, we found a processing cost for adjective–noun constructions (such as difficult mountain) that contain a conflict between the expectation or requirement of the adjective to be followed by an event and the semantic type of its head noun. We argued that this cost shows that comparable compositional processes also operate at the noun phrase level and are not restricted to cases where there is a conflict between different elements in a sentence. Finally, we have argued that our results are more compatible with the idea of the activation of an underspecified meaning of the adjective, rather than one that is fully specified.