Effects of phonological neighbourhood density and frequency in picture naming

https://doi.org/10.1016/j.jml.2021.104248

Highlights

  • Phonological neighbourhood effects depend on target frequency in picture naming.

  • The frequency of phonological neighbours, less so their number, affects performance.

  • Phonological neighbours can both inhibit and facilitate picture naming.

  • An interactive activation model was used to simulate picture naming.

  • A slow rise in lexical activation is required to successfully simulate the effects.

Abstract

Speaking involves selecting a word among co-activated words in the lexicon. The factors determining which potentially co-activated words affect the production of spoken words remain underspecified. This research investigated the influence of words that sound similar to a target word (phonological neighbours) on the picture naming latency and accuracy of young English-speaking adults. Response time analyses showed a significant interaction between the frequency of the target and the frequency of those phonological neighbours that were higher in frequency than the target. Analysis of a published picture naming dataset gave similar results. The mechanisms underlying these results were explored using computational modelling. The critical interaction observed in the human data was successfully reproduced in analyses of the output of some versions of an interactive activation model. This model featured a relatively slow rise of activation in the phonological lexicon nodes, resulting in an increase in the effect of frequency. Overall, results show that phonological neighbourhood effects are tightly related to frequency effects.

Introduction

Every act of oral communication requires retrieval of a phonological word form: We need to select the word form that corresponds to the meaning we wish to convey, from a range of words, including those with a similar phonological form. One common metric to characterise form similarity between words in the lexicon is phonological neighbourhood density (PND). The phonological neighbourhood density of a given word refers to the number of words in the lexicon that differ from that word by only one phoneme, either substituted, added, or deleted (Luce, 1987). For example, under this definition, ‘fat’, ‘kit’, ‘cab’, ‘at’, ‘scat’ and ‘cats’ all count as phonological neighbours of ‘cat’. This definition of neighbours is widely accepted and used in the speech production literature (e.g., Sadat et al., 2014; Vitevitch, 2002) and in programmes that allow the calculation of PND (e.g., N-Watch (Davis, 2005) and Clearpond (Marian, Bartolotti, Chabal, & Shook, 2012)). Consequently, this is the definition used in this paper.¹ Some words have many phonological neighbours (high PND, or ‘dense’ phonological neighbourhoods; e.g., cat, 50 neighbours; man, 51 neighbours), others have few (low PND, or ‘sparse’ neighbourhoods; e.g., inch, 5 neighbours; elk, 6 neighbours). Each of these neighbours has its own specific frequency value and hence a word’s phonological neighbourhood can be of overall high or low frequency depending on the frequency of the neighbours (high or low phonological neighbourhood frequency (PNF)).
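
The one-phoneme definition of a neighbour can be made concrete with a minimal sketch. The toy lexicon, the phoneme transcriptions and the function names below are illustrative assumptions, not the procedure used by N-Watch or Clearpond.

```python
from typing import Dict, List, Sequence, Tuple

def is_neighbour(a: Sequence[str], b: Sequence[str]) -> bool:
    """True if b differs from a by exactly one phoneme (substituted, added or deleted)."""
    a, b = tuple(a), tuple(b)
    if a == b:
        return False  # a word is not its own neighbour
    if len(a) == len(b):
        # Substitution: exactly one position mismatches.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Addition or deletion: dropping one phoneme from the longer form gives the shorter one.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

def neighbourhood(target: str, lexicon: Dict[str, Tuple[str, ...]]) -> List[str]:
    """All words in the lexicon that are phonological neighbours of the target."""
    return sorted(w for w, form in lexicon.items() if is_neighbour(lexicon[target], form))

# Hypothetical toy lexicon with rough phonemic transcriptions (assumed for illustration).
LEXICON = {
    "cat": ("k", "æ", "t"),
    "fat": ("f", "æ", "t"),
    "kit": ("k", "ɪ", "t"),
    "cab": ("k", "æ", "b"),
    "at": ("æ", "t"),
    "scat": ("s", "k", "æ", "t"),
    "cats": ("k", "æ", "t", "s"),
    "kiss": ("k", "ɪ", "s"),   # two substitutions away: not a neighbour
    "dog": ("d", "ɒ", "g"),    # three substitutions away: not a neighbour
}

cat_neighbours = neighbourhood("cat", LEXICON)
pnd_cat = len(cat_neighbours)  # PND of 'cat' within this toy lexicon only
```

In this toy lexicon, PND for ‘cat’ is simply the length of the returned list; real PND values depend on the full lexical database that is searched.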

It is generally assumed that when a word is activated in the lexicon, the phonological neighbours of this word are also activated (e.g., Luce, Pisoni, & Goldinger, 1990). This is true when we recognise spoken words, with consistent inhibitory effects of denser phonological neighbourhoods (e.g., Luce & Pisoni, 1998). Despite less consistent effects in spoken word production (see below for review), some theories also hypothesise that phonological neighbours are activated in spoken word production and that this activation affects the lexical selection process (e.g., Chen & Mirman, 2012; Dell & Gordon, 2003). The more phonological neighbours that are active, the greater their influence is hypothesised to be.

In models with interactivity between word nodes and phoneme nodes, phonological neighbours are generally assumed to be activated in spoken word production. For example, within the interactive activation account proposed by Dell et al. (1997), activation of phonological neighbours occurs by feedback from the phoneme level to the word level: activation flows back, not only to the target lexical item, but also to its phonological neighbours (see Fig. 1). Then, if (many) neighbours are active, they in turn will further activate the target’s phonemes, leading to facilitation of target production (e.g., Dell & Gordon, 2003). This model does not feature competition or inhibition within or between levels. However, even within this account, the effect of neighbours need not be facilitatory. If, for example, neighbours are strongly activated at the word level, approaching the level of activation of the target, then these neighbours might yield inhibitory effects. For example, if there is noise or damage to the system, such as weakened lexical connections following brain damage or in the case of healthy ageing, a phonological neighbour could be selected in the place of the target, therefore affecting accuracy. In addition, more time steps may be required for the target to reach a level of activation sufficiently superior to the level of activation of the phonological neighbours to be selected (see e.g., Gordon & Kurczek, 2013).

Chen and Mirman (2012) also modelled the influence of phonological neighbourhood density in a model where interactivity between levels is complemented with bi-directional inhibitory connections within the word level. The general principle here was that weakly active neighbours should exert facilitative effects while strongly active neighbours should be inhibitory. Chen and Mirman suggest that in spoken word production, phonological neighbours are weak neighbours and should, therefore, exert a facilitatory effect on response time, while semantic neighbours, on the other hand, are strong competitors and should therefore induce inhibitory effects.

It is, indeed, uncontroversial that producing a word to speak involves selecting the target word from a set of co-activated semantically related candidates. However, existing studies of “simple” picture naming do not find effects of the number of semantic competitors a word has in the lexicon. This can be seen in investigations of “semantic neighbourhood density” (the number of words that are close in meaning to a given target), which has not been shown to predict picture naming behaviour in unimpaired subjects (for a review, see Hameau, Nickels, & Biedermann, 2019). This is not to say that there is no competition between semantically related words in simple picture naming: the presence of semantic competition has been demonstrated in a range of paradigms and in a considerable number of studies (see Abdel Rahman & Melinger, 2009, for a review). However, because of these null findings, and in order not to lose the focus of the present study by considering this issue in detail, semantic neighbourhood density was not included as a predictor in the present study.²

Returning to phonological neighbours, it seems clear from the literature that current theories hypothesise a potentially critical role for the degree of activation of both target and neighbours in the effects of neighbourhood on production. One of the most well-known causes of variation in lexical activation is word frequency. Lexical frequency can be represented either as different resting activation levels (e.g., Dell, 1988) where, by virtue of their higher resting levels of activation, higher frequency words are given a “head start” in the selection process, or as different connection weights between lexical and sublexical units (e.g., Chen & Mirman, 2012), where more frequent use results in stronger connection weights between a higher frequency word’s lexical representation and its segments, compared to a lower frequency word. Both mechanisms result in faster and higher activation of phonemes, leading to greater accuracy and shorter latencies in word production for higher frequency words. If the frequencies of the phonological neighbours of the target are also taken into account, one would expect that, first, the higher the frequency of phonological neighbours, the stronger their effects on target word selection; and second, the lower the frequency of a target, the stronger the effects of its phonological neighbours. Hence, maximal effects of phonological neighbours would be expected on words that are lower in frequency but have many phonological neighbours of higher frequency. However, the precise balance between overall facilitation and inhibition in these scenarios is unclear. The present study aims to shed light on these patterns through both behavioural and computational modelling experiments. We first review the current literature on phonological neighbourhood effects in spoken word production, and then return to this issue in more detail.
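
The two ways of encoding frequency described above can be illustrated with a deliberately simplified toy unit. This is not the activation function of Dell (1988) or of Chen and Mirman (2012); the update rule, the threshold and all parameter values are assumptions chosen only to show that a higher resting level and a stronger connection weight both shorten the time to reach a selection threshold.

```python
def cycles_to_threshold(resting: float = 0.0, weight: float = 0.1,
                        threshold: float = 0.8, max_cycles: int = 1000) -> int:
    """Count update cycles until a single unit's activation reaches the threshold.

    On every cycle the unit moves a fraction `weight` of the remaining distance
    towards a ceiling of 1.0 (a hypothetical rule, for illustration only).
    """
    activation = resting
    cycles = 0
    while activation < threshold and cycles < max_cycles:
        activation += weight * (1.0 - activation)
        cycles += 1
    return cycles

# Low frequency word: low resting level, weak connection (assumed values).
low_freq = cycles_to_threshold(resting=0.0, weight=0.1)

# High frequency word, frequency coded as a higher resting level ("head start").
high_freq_resting = cycles_to_threshold(resting=0.3, weight=0.1)

# High frequency word, frequency coded as a stronger connection weight.
high_freq_weight = cycles_to_threshold(resting=0.0, weight=0.2)
```

Under either coding scheme, the high frequency unit needs fewer cycles than the low frequency unit, which corresponds to the shorter latencies described in the text.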

In spoken word production, the majority of research has focused on phonological neighbourhood density (PND), with less of a focus on effects of neighbourhood frequency. In general, there is a lower likelihood of phonological errors for words of high PND compared to low PND, in spontaneous speech (e.g., malapropisms in an English speech error corpus: Vitevitch, 1997), or in paradigms designed to induce speech errors experimentally (the SLIPs, or Spoonerisms of Laboratory-Induced Predisposition technique: Stemberger, 2004; Vitevitch, 2002). In naming to definition, high PND targets seem to elicit more correct responses and fewer tip-of-the-tongue states than low PND targets (e.g., Vitevitch & Sommers, 2003). However, in contrast to these facilitatory effects of PND in some tasks that incorporate spoken word production, there is not yet any clear consensus regarding the effects of PND on a particular task used to investigate spoken word production processes: picture naming. Findings in English picture naming differ with respect to the presence and the direction of any effect³ (e.g., facilitation: Vitevitch, 2002; no effect: Vitevitch, Armbrüster, & Chu, 2004; inhibition: Newman & German, 2005), and effects seem to depend on the age of the participants (e.g., inhibitory effects on accuracy in children: Newman & German, 2002; no effects on accuracy in young adults: Vitevitch, 2002). The relevant literature is summarised in Table 1. In English speaking young adults, PND seems to exert either a facilitatory effect on latency (Vitevitch, 2002: Experiments 3, 4, and 5; see also Newman & Bernstein Ratner, 2007, for a marginally significant facilitatory effect) or no significant effect (Gordon & Kurczek, 2013; Vitevitch, Armbrüster, & Chu, 2004); while the effect on accuracy has also been either facilitatory (Newman & Bernstein Ratner, 2007) or non-significant (Gordon & Kurczek, 2013; Vitevitch, 2002: Experiments 3–5; Vitevitch et al., 2004: Experiment 3).

A different pattern of results has been found in other age groups: in children, Bernstein Ratner and colleagues (2009) found no significant effect of PND on latencies despite facilitation on accuracy. In contrast, Arnold, Conture, and Ohde (2005) found inhibitory effects for both children’s latencies and their accuracy, and Newman and German (2002, 2005) observed a detrimental effect of high PND on accuracy. This held for three different PND measures: “standard” PND and, in the 2005 study, also the number of phonological neighbours of higher frequency than the target and frequency-weighted PND. Finally, in older adults, Gordon and Kurczek (2013) found inhibitory effects on latency (but not accuracy) of a measure that consisted of the residuals obtained by regressing PND on length (thereby removing the shared variance attributable to length).
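
A residualised measure of this kind can be sketched as follows. The simple least-squares fit is a generic illustration, not the exact procedure of Gordon and Kurczek (2013), and the item values are invented for the example.

```python
from typing import List

def residualise(y: List[float], x: List[float]) -> List[float]:
    """Residuals of y after a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # What is left of y once the linear contribution of x is removed.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Invented values: word length in phonemes and PND for six hypothetical items.
lengths = [2.0, 3.0, 3.0, 4.0, 5.0, 6.0]
pnd = [30.0, 25.0, 18.0, 12.0, 6.0, 2.0]

pnd_residuals = residualise(pnd, lengths)
```

By construction, the residuals are uncorrelated with length in the sample, so any effect they show cannot be attributed to the linear relationship between PND and length.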

Turning to those studies that have investigated the effect of phonological neighbourhood frequency (PNF), the effects mostly seem to be facilitatory. In young adults (Newman & Bernstein Ratner, 2007) and in older adults (Vitevitch & Sommers, 2003), facilitatory effects of PNF were found on both accuracy and response latency; in children, they were found on accuracy (Bernstein Ratner et al., 2009; Newman & German, 2002). However, these results need replication given the small number of available studies, in particular for the young adult group.

An important consideration here is how phonological neighbourhood frequency is calculated. As noted above, it refers to the overall frequency of a word’s neighbours. Some studies have used the average of the frequencies of each neighbour as a measure of neighbourhood frequency (e.g., Baus et al., 2008; Chan & Vitevitch, 2010; Vitevitch, 2002; Vitevitch & Sommers, 2003), while others have used the summed frequency of the neighbours (e.g., Coady & Aslin, 2003; Mirman & Graziano, 2013). In WEAVER++, the computational implementation of Levelt, Roelofs, and Meyer’s (1999) theory, the probability of lexical selection is determined by the Luce ratio, which refers to the activation of the target divided by the sum of the activation of the competitors and the target. Hence for this theory, what is important is the sum of the frequency of the phonological neighbours. In contrast, we are unaware of a theory that would predict the average frequency to be the relevant factor. Consequently, in the research presented here we used summed frequency as the measure of neighbourhood frequency. Compared to average PNF, summed PNF is more strongly confounded with the number of phonological neighbours (perhaps why, for example, Vitevitch and Luce (1998) refer to summed PNF as “frequency-weighted similarity neighborhood”).⁴ However, it has the advantage of being less affected by the presence of neighbours that are potentially very low in frequency than a metric based on average frequency of neighbours.
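
The Luce ratio and the contrast between summed and average PNF can be sketched in a few lines. The function names and all numerical values are hypothetical, and this is not the WEAVER++ implementation, only the ratio as described in the text.

```python
from typing import List

def luce_ratio(target_activation: float, competitor_activations: List[float]) -> float:
    """Target activation divided by the summed activation of target and competitors."""
    return target_activation / (target_activation + sum(competitor_activations))

def summed_pnf(neighbour_freqs: List[float]) -> float:
    """Summed frequency of all phonological neighbours."""
    return sum(neighbour_freqs)

def mean_pnf(neighbour_freqs: List[float]) -> float:
    """Average frequency of the phonological neighbours."""
    return sum(neighbour_freqs) / len(neighbour_freqs)

# Hypothetical activations: one target and two competitors.
selection_probability = luce_ratio(0.6, [0.2, 0.2])

# Summed PNF grows with the number of neighbours; average PNF does not.
dense = [10.0] * 20   # 20 neighbours of equal (invented) frequency
sparse = [10.0] * 2   # 2 neighbours of the same frequency

# Adding one very rare neighbour barely changes the sum but pulls the mean down sharply.
two_frequent = [50.0, 50.0]
with_rare_neighbour = two_frequent + [0.1]
```

Here `summed_pnf(dense)` is ten times `summed_pnf(sparse)` although both have the same mean, which is the confound with neighbourhood size noted above. Adding the rare neighbour moves the sum from 100.0 to about 100.1 while the mean drops from 50.0 to roughly 33.4.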

While previous research has examined effects of PNF, we have argued that the prediction from current theories is that what is more likely to be important is not the “main” effect of PND or PNF, but the strength of activation of neighbours relative to the target. Neighbours are predicted to have more influence on the production of low frequency targets than of high frequency targets, and neighbours of higher frequency than the target are predicted to have stronger effects than neighbours in general. If this prediction is correct, then one would expect effects of PND or PNF to vary depending on the frequency of targets relative to the frequency of these items’ phonological neighbours.

This idea of different effects of phonological neighbours depending on the relative levels of activation of a target and its phonological neighbours is not new. In auditory word recognition, the Neighbourhood Activation Model (Luce & Pisoni, 1998) has very similar predictions: Within this model, the Neighbourhood Probability Rule states that the absolute frequency of a given target word may have different effects on word recognition depending on the frequency of this target word’s phonological neighbours. Luce and Pisoni predicted, for instance, that the words that would be the most difficult to recognise would be low frequency target words with neighbours that are high in frequency. Similarly, Newman and German (2002) directly targeted the frequency of the target and the frequency of its phonological neighbours in spoken word production by investigating the effect of the number of neighbours of higher frequency than the target (and found an inhibitory effect of these neighbours). However, no study has, to our knowledge, looked at the interaction between target frequency and phonological neighbourhood density or phonological neighbourhood frequency in picture naming (density or frequency of either all phonological neighbours, or of neighbours of higher frequency than the target only). This is a focus of the present study and will allow a better specification of the dynamics at play during spoken word production.
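
Measures restricted to neighbours of higher frequency than the target can be computed as in the sketch below. The function name and the frequency values are illustrative assumptions rather than the exact operationalisation used in the studies cited.

```python
from typing import List, Tuple

def higher_frequency_neighbours(target_freq: float,
                                neighbour_freqs: List[float]) -> Tuple[int, float]:
    """Number and summed frequency of neighbours more frequent than the target."""
    higher = [f for f in neighbour_freqs if f > target_freq]
    return len(higher), sum(higher)

# Invented frequencies (e.g., occurrences per million) for one neighbourhood.
neighbour_freqs = [2.0, 15.0, 40.0, 300.0]

# The same neighbourhood yields different values depending on target frequency.
low_target = higher_frequency_neighbours(10.0, neighbour_freqs)    # low frequency target
high_target = higher_frequency_neighbours(100.0, neighbour_freqs)  # high frequency target
```

With these values, the low frequency target has three higher frequency neighbours (summed frequency 355.0) and the high frequency target has one (300.0), so these measures are inherently tied to target frequency.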

All the studies reviewed above but one (Gordon & Kurczek, 2013) used a factorial design, that is, controlled sets of stimuli with a dense/sparse neighbourhood or high frequency/low frequency neighbourhood condition. Because of the problems in precisely matching the item sets, this type of design usually leads to small numbers of items, resulting in a reduced number of trials. Item sets can be as small as eight items (Arnold et al., 2005) and range up to 72 (Newman & German, 2002), in contrast with, for example, Gordon and Kurczek (2013), who used 200 items in a continuous design (see Rabovsky et al., 2016, for a discussion of issues relating to the dichotomisation of continuous variables). Consequently, in the present study, we used a continuous design (i.e., without matching sets with manipulated variables) with a larger number of trials (more participants and more items) to increase power in the determination of which aspects of PND/PNF are most critical in predicting picture naming behaviour.

Hence, in Experiment 1, we used simple picture naming as a tool to investigate the influence of several measures of PND and PNF on spoken word production in a group of Australian English speakers, using a large number of stimuli. We used linear mixed effect modelling to take into account individual variation induced by different participants and different items. In Experiment 2 we replicated our latency analysis with a published set of picture naming data in British English. Finally, in Experiment 3, we used computational modelling to explore the characteristics of the language system that can replicate the effects found across Experiments 1 and 2. Computational modelling is a powerful tool for theory building. By adjusting the parameters of a computer program that aims to simulate a certain behaviour, and comparing the outcomes of simulations that use different parameter settings with the corresponding “real-life” behaviour, it is possible to test and refine theories. In Experiment 3, we ran a series of simulations in some versions of an interactive activation model (DRC-SEM, an extension of the Dual Route Computational (DRC) model of reading, Coltheart et al., 2001, that enables simulation of spoken word production from semantics) in order to explore the necessary features of the language production system required to simulate our behavioural effects.

Experiment 1: Picture naming in Australian English speakers

The goal of this experiment was, using a range of phonological neighbourhood measures, to determine:

  • (a) how phonological neighbourhood measures predict picture naming behaviour (response time and accuracy) in English speaking young adults, while controlling for other variables that have proven to be influential in picture naming,

  • (b) whether measures of phonological neighbourhood density and phonological neighbourhood frequency (including measures of neighbours of higher frequency than the target)

Method

Experiment 2 uses the picture naming data from Johnston et al. (2010).

Participants

Johnston et al. (2010) report response times from 25 native English speakers (21 female, 84%), aged 18 to 27 years (mean age 20.08 years), who had lived continuously in the United Kingdom. This sample appears similar to the Australian sample in Experiment 1 with respect to age, but with a higher proportion of female participants.

Stimuli

Participants named 539 black-and-white line drawings, mostly selected from Szekely et al. (2004).

Experiment 3: Computational modelling

This experiment aimed to explore the characteristics of the language production system that allow the effects of neighbourhood found in both Experiments 1 and 2 to emerge, using an adaptation of the DRC model (Coltheart et al., 2001). The DRC model of reading has been demonstrated to simulate aspects of human reading behaviour that are relevant to the present study. For instance, the frequency effect on DRC reading speed is remarkably similar to the frequency effect seen in human reading data.

General discussion

We have reported two picture naming experiments in English, one with a population of Australian English monolingual speakers, the other using a published dataset of picture naming latencies from British English monolingual speakers. Given the inconsistencies in the previous literature, our aim was to examine the effects of several phonological neighbourhood measures, and in particular, focus on neighbours of higher frequency than the target. Motivated by the literature on word recognition, our

Conclusion

In this investigation of effects of phonological neighbourhood on spoken picture naming, we identified a critical interaction between the summed frequency of phonological neighbours of higher frequency than the target and the log frequency of the target word. This phonological neighbourhood measure exerted inhibitory effects on low log frequency targets, but facilitatory effects on high log frequency targets. We argue that this observation may underpin the inconsistent findings in previous

Data availability

Raw data for this manuscript can be downloaded from https://dx.doi.org/10.17632/68f3fg56ff.1.

Funding

This research was supported by a Cross-program Grant under the Australian Research Council Centre of Excellence in Cognition and its Disorders (CE110001021), an Australian Research Council Discovery Project grant (DP190101490), an international Macquarie Research Excellence Scholarship (iMQRES) to SH, financial support from the Macquarie University Centre for Reading (MQCR) for SR, and an Australian Research Council Future Fellowship (FT120100102) to LN.

CRediT authorship contribution statement

Solène Hameau: Conceptualization, Methodology, Formal analysis, Data curation, Writing - original draft, Writing - review & editing, Visualization. Britta Biedermann: Conceptualization, Methodology, Writing - review & editing, Supervision. Serje Robidoux: Formal analysis, Visualization. Lyndsey Nickels: Conceptualization, Methodology, Formal analysis, Writing - review & editing, Supervision.

Declaration of Competing Interest

None.

Acknowledgements

We thank Max Coltheart and Steven Pritchard for their contribution to computational modelling training of the first author. Both provided valuable discussions during the preparation of this paper. We also thank three reviewers including Steve Lupker for their valuable comments and suggestions that have led to improvements of this manuscript.

References (77)

  • J.P. Stemberger (2004). Neighbourhood effects on error rates in speech production. Brain and Language.
  • A. Székely et al. (2004). A new on-line resource for psycholinguistic studies. Journal of Memory and Language.
  • K.I. Taylor et al. (2012). Contrasting effects of feature-based statistics on the categorisation and basic-level identification of visual objects. Cognition.
  • L.H. Wurm et al. (2014). What residualizing predictors in regression analyses does (and what it does not do). Journal of Memory and Language.
  • R. Abdel Rahman et al. (2009). Semantic context effects in language production: A swinging lexical network proposal and a review. Language and Cognitive Processes.
  • F.-X. Alario et al. (1999). A set of 400 pictures standardized for French: Norms for name agreement, image agreement, familiarity, visual complexity, image variability, and age of acquisition. Behavior Research Methods, Instruments, & Computers.
  • F.-X. Alario et al. (2004). Predictors of picture naming speed. Behavior Research Methods, Instruments, & Computers.
  • P. Allison (2012). When can you safely ignore multicollinearity? Statistical Horizons.
  • S. Andrews (1997). The effect of orthographic similarity on lexical retrieval: Resolving neighborhood conflicts. Psychonomic Bulletin & Review.
  • H.R. Baayen et al. (2010). Analyzing Reaction Times. International Journal of Psychological Research.
  • Baayen, R. H., Piepenbrock, R., & Rijn, van H. (1993). The CELEX lexical data base on...
  • C. Barry et al. (1997). Naming the Snodgrass and Vanderwart pictures: Effects of age of acquisition, frequency, and name agreement. The Quarterly Journal of Experimental Psychology Section A.
  • D.M. Bates et al. (2015). Parsimonious mixed models. ArXiv.
  • C. Baus et al. (2008). Neighbourhood density and frequency effects in speech production: A case for interactivity. Language and Cognitive Processes.
  • P. Bonin et al. (2002). The determinants of spoken and written picture naming latencies. British Journal of Psychology.
  • M. Boukadi et al. (2016). Norms for name agreement, familiarity, subjective frequency, and imageability for 348 object names in Tunisian Arabic. Behavior Research Methods.
  • Ceccherini, L. (2015). The effects of a concomitant distractor on word reading aloud and picture naming tasks [Doctoral...
  • K.Y. Chan et al. (2010). Network structure influences speech production. Cognitive Science.
  • G. Chedid et al. (2019). Norms of conceptual familiarity for 3,596 French nouns and their contribution in lexical decision. Behavior Research Methods.
  • Q. Chen et al. (2012). Competition and cooperation among similar representations: Toward a unified account of facilitative and inhibitory effects of lexical neighbors. Psychological Review.
  • J.A. Coady et al. (2003). Phonological neighbourhoods in the developing lexicon. Journal of Child Language.
  • M. Coltheart (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A.
  • M. Coltheart et al. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review.
  • M. Coltheart et al. (1999). A position-sensitive stroop effect: Further evidence for a left-to-right component in print-to-speech conversion. Psychonomic Bulletin and Review.
  • C.J. Davis (2005). N-Watch: A program for deriving neighborhood size and other psycholinguistic statistics. Behavior Research Methods.
  • G.S. Dell et al. Neighbors in the lexicon: Friends or foes?
  • G.S. Dell et al. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review.
  • A.W. Ellis et al. (1998). Real age-of-acquisition effects in lexical retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition.