Lexical representation and processing of word-initial morphological alternations: Scottish Gaelic mutation

When hearing speech, listeners begin recognizing words before reaching the end of the word. Therefore, early sounds impact spoken word recognition before sounds later in the word. In languages like English, most morphophonological alternations affect the ends of words, but in some languages, morphophonology can alter the early sounds of a word. Scottish Gaelic, an endangered language, has a pattern of ‘initial consonant mutation’ that changes initial consonants: Pòg ‘kiss’ begins with [pʰ], but phòg ‘kissed’ begins with [f]. This raises questions both of how listeners process words that might begin with a mutated consonant during spoken word recognition, and how listeners relate the mutated and unmutated forms to each other in the lexicon. We present three experiments to investigate these questions. A priming experiment shows that native speakers link the mutated and unmutated forms in the lexicon. A gating experiment shows that Gaelic listeners usually do not consider mutated forms as candidates during lexical recognition until there is enough evidence to force that interpretation. However, a phonetic identification experiment confirms that listeners can identify the mutated sounds correctly. Together, these experiments contribute to our understanding of how speakers represent and process a language with morphophonological alternations at word onset.


Introduction
This paper explores how a morphophonological alternation in Scottish Gaelic known as initial consonant mutation affects speakers' lexical representation and listeners' processing of morphologically related word forms. This alternation is very different from those found in better-studied languages, such as English, in that it affects the beginning rather than the end of the word, and thus poses a challenge for theoretical models that depend crucially upon the word onset as a primary factor in word recognition.
Scottish Gaelic is a Celtic language spoken in the Highlands and Islands of Scotland. Although it is endangered and all speakers are bilingual in English, there are fluent L1 speakers of Scottish Gaelic, especially on the Hebridean Islands, who still use Gaelic daily and who were monolingual in Gaelic until starting school. There is a Gaelic college on the Isle of Skye (Sabhal Mòr Ostaig) and many native speakers are associated with the college. The last author, Fisher, is a native speaker of Gaelic from Skye who teaches language courses at Sabhal Mòr Ostaig. Through her knowledge of the community, our group was able to test relatively large numbers of fluent L1 speakers who are literate in Gaelic. This allowed us to perform experiments on spoken word recognition in this language, with its typologically rare morphology.
In spoken word recognition, a variety of models have been developed to account for how morphologically complex forms are represented in the lexicon, ranging from models advocating for a holistic storage approach (e.g., Tyler et al., 1988) to those supporting morphological decomposition during word recognition (e.g., Marslen-Wilson et al., 1994). Additionally, hybrid models (e.g., Balling & Baayen, 2008) propose a role for morphological structure along with full listing.
Many of these models are based on studies limited to processing of languages such as English, German, and Dutch. In general, the role of morphology in spoken word recognition remains relatively less explored, although for example Mauth (2002) examines the question of how and whether listeners break words into morphemes during spoken word recognition. Some recent work has tested differing predictions with respect to morphological complexity in under-studied languages that exhibit relatively unusual morphological structure (e.g., Ussishkin et al., 2015 for Maltese root-and-pattern morphology; Schluter, 2013 for Moroccan Arabic root-and-pattern morphology), showing that for these languages as well there is support for models in which words are recognized on the basis of their constituent morphemes.
Also highly relevant to our questions regarding the relationship between mutated forms and their unmutated counterparts in Scottish Gaelic is a study by Boyce et al. (1987) on the related language Welsh, also known for its initial consonant mutation system. Based on evidence from three experiments in the auditory modality (see below for further details), Boyce et al. (1987) propose a lexical structure in which there are two levels. In their model for Welsh, mutation variants of a form are listed at one level, and are connected to an underspecified representation at a second level, allowing for an explanation of the symmetrical priming effects found between mutated vs. unmutated Welsh forms. This is similar to the model proposed by Meunier and Segui (1999) for processing of spoken words in French, in which suffixed words in the same morphological family share a lexical entry.
Another aspect of the role of morphology in spoken word recognition concerns when in the time-course of a word lexical competitors are eliminated from consideration. Models that include continuous parsing, like the Cohort model of Marslen-Wilson and Welsh (1978) and Shortlist B (Norris & McQueen, 2008), claim that with each incoming phoneme or shorter stretch of time during the unfolding of the speech signal, whole-word competitors not matching the onset are eliminated from consideration. In the Cohort model, this is a categorical match/mismatch, while Shortlist B evaluates gradient degree of mismatch. However, Balling and Baayen (2012) showed that in addition to whole-word competitors, morphological competitors also play a role in the identification of complex words, illustrating that morphological structure is in fact relevant to word recognition. This is especially important when the morphological variation is realized through the first segment of the word (Stewart, 2004, p. 7).
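The categorical onset matching attributed to the Cohort model can be illustrated with a minimal Python sketch. The toy lexicon uses only transcriptions that appear as examples in this paper; the segmentation into lists and the function name `cohort` are illustrative assumptions, not part of any published implementation of the model.

```python
# Toy lexicon: each word is a list of segments. Transcriptions come from
# examples in this paper; the segmentation itself is an assumption.
LEXICON = {
    "phòg 'kissed'": ["f", "ɔː", "k"],
    "pòg 'kiss'": ["pʰ", "ɔː", "k"],
    "foghlam 'education'": ["f", "ʊ", "ɫ", "ə", "m"],
    "fad 'length'": ["f", "a", "t"],
    "phaisg 'wrapped'": ["f", "a", "ʃ", "kʲ"],
}


def cohort(heard, lexicon=LEXICON):
    """Return the words whose onset matches every segment heard so far.

    This is a categorical match/mismatch: one mismatching segment
    removes a word from the candidate set.
    """
    heard = list(heard)
    n = len(heard)
    return sorted(word for word, segs in lexicon.items() if segs[:n] == heard)


if __name__ == "__main__":
    print(cohort(["f"]))       # all [f]-initial words, mutated or not
    print(cohort(["f", "a"]))  # narrows to fad and phaisg
```

In this sketch, hearing [f] leaves both words with underlying /f/ (fad, foghlam) and mutated forms (phòg, phaisg) in the candidate set, while the unmutated pòg is excluded by strict onset matching.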
Here, we look at a form of morphological relationship that is more unusual than, for instance, the relationship between a base form such as the English verb cook and its past tense counterpart cooked, derived by a straightforward linear suffixation process. We examine, in a series of three experiments, initial consonant mutation in Scottish Gaelic (Stewart, 2004), a phenomenon found in Celtic languages whereby morphologically related words share all but their initial consonant. Under mutation, word-initial consonants alternate predictably with other consonants; when this occurs the contrast between these initial consonants is the sole phonological realization of the morphological difference between these forms. For instance, the imperative form of the Scottish Gaelic verb pòg [pʰɔːk] 'kiss' is related to its past tense form phòg [fɔːk] 'kissed' via initial consonant mutation. In addition to mutation that is syntactically conditioned and may occur without an overt 'trigger,' the language also has cases in which the mutated counterpart of an unmutated word obligatorily co-occurs immediately adjacent to an overt trigger. Some possessives in Scottish Gaelic are marked this way, such as the third person masculine possessive (bò 'cow'/a bhò 'his cow'). See Stewart (2004) and references cited therein for additional details on Scottish Gaelic mutation.
The consonants subject to mutation in Scottish Gaelic and their mutated counterparts are listed below in Table 1. Marslen-Wilson et al. (1994) tested how prefixes in English affect word recognition, pointing out that many words begin with the same prefix (e.g., re- or dis- in English), delaying the point at which the lexical cohort starts to narrow. The Gaelic case is different: Mutation affects the word onset, but by altering a segment and thus changing the potential lexical candidate list, not by adding segments. In the Gaelic case, a listener might assume a mutated word beginning with [f-] belongs to a different set of lexical candidates (the ones beginning with underlying /f/) than it actually does.
Before we proceed further, a discussion of the Scottish Gaelic consonant inventory and its relationship with initial consonant mutation is necessary. Analyses of the phoneme inventory of Gaelic vary considerably, depending on dialect. The consonant inventory presented in Table 2 is based on the analysis presented in Ladefoged et al. (1998) concerning the Bernera variety of Lewis Gaelic, and that in Gillies (2009). This inventory is also representative of Fisher's dialect. Readers interested in other dialects are referred to e.g., Dorian (1978) for East Sutherland Gaelic, Ternes (1973) for Applecross, or Borgstrøm (1940) for other dialects of the Outer Hebrides.
In most dialects of Scottish Gaelic, there are two sets of voiceless oral stops which contrast in aspiration. In such dialects, aspiration is realized as post-aspiration in initial position and as preaspiration in medial and final positions (Nance & Stuart-Smith, 2013; Clayton, 2010; Ní Chasaide, 1985). Plain coronal consonants typically have dental articulations [t̪ t̪ʰ s̪ n̪ r̪ l̪]; however, the dental diacritic is generally omitted in the remainder of our discussion for typographical simplicity. Sonorant consonants in many dialects exhibit a three-way contrast between plain, velarized, and palatalized variants (cf. Nance 2014), though in some dialects this number is reduced to a palatal/velarized opposition (Ó Maolalaigh, 2008; Stewart, 2004).
The relationship between non-mutated consonants in Gaelic and their mutated counterparts, while complex, does not seem to involve two distinct consonant inventories. Instead, the set of consonants resulting from mutation seems to be a proper subset of the non-mutated inventory; that is, consonants that arise through mutation also exist underlyingly as non-mutated forms. Thus, mutation involves the neutralization of contrasts between two or even three phonemes e.g., /p m/ → [v], /t k/ → [ɣ], /tʲ kʲ/ → [j], and /tʰ s/ → [h], except in the case of /f/, which is deleted outright in mutation contexts.
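The neutralization described above means that a single surface consonant can correspond to more than one underlying onset. The following minimal Python sketch encodes only the correspondences quoted in the text (it is not a transcription of the full Table 1), and the names `MUTATION` and `possible_sources` are hypothetical.

```python
# Underlying onset -> mutated realization, restricted to the pairs
# mentioned in the text. None marks deletion of /f/ under mutation.
MUTATION = {
    "pʰ": "f",            # pòg [pʰɔːk] ~ phòg [fɔːk]
    "p": "v", "m": "v",   # /p m/ -> [v]
    "t": "ɣ", "k": "ɣ",   # /t k/ -> [ɣ]
    "tʲ": "j", "kʲ": "j",  # /tʲ kʲ/ -> [j]
    "tʰ": "h", "s": "h",  # /tʰ s/ -> [h]
    "f": None,            # /f/ is deleted outright
}


def possible_sources(surface):
    """Underlying onsets that mutation would map onto `surface`.

    Because mutated consonants also exist underlyingly, the surface
    sound itself may be a further candidate that this function does
    not list.
    """
    return sorted(u for u, m in MUTATION.items() if m == surface)


if __name__ == "__main__":
    for sound in ["v", "ɣ", "j", "h", "f"]:
        print(sound, "<-", possible_sources(sound))
```

Under these assumptions, a listener hearing initial [h] would have at least two mutation sources to consider (/tʰ/ and /s/), which is the sense in which mutation can widen rather than narrow the candidate set.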
One may ask whether initial consonant mutation in Scottish Gaelic can be viewed as a strictly phonological process, rather than as a regular, syntactically conditioned morphophonological process as we do here (cf. Stewart, 2004). However, a strictly phonological analysis of Gaelic mutation is challenging. There are several reasons for this. To begin with, there does not seem to be any consistent way to describe mutation in phonological terms, except very broadly as a lenition process. Mutation does not appear to involve the consistent loss or addition of certain phonological features, but rather varying and evidently arbitrary combinations of such features. Depending on the consonant, mutation may involve changes in voicing, continuity, place of articulation, or sonority, or some combination thereof. Nor does mutation appear to be triggered by any identifiable phonological conditioning factor or environment. This can be demonstrated in at least three ways. First, mutation may occur in a variety of phonological contexts, yet still exhibit the same alternations. The past tense of the verb, for example, usually occurs in clause-initial position (because Gaelic is a verb-initial language), and thus there is no consistent preceding phonological environment. Second, mutation is predictably triggered by certain preceding morphemes, but not by other morphemes which are phonologically similar.
Though mutation is evidently not a strictly phonological process, neither is it a purely morphological one, since there are certain restrictions on its operation which appear to be phonological (Hammond et al., to appear). First, the coronal consonants /t tʰ s/ do not mutate if the final sound of the preceding item is a coronal nasal, e.g., seann duine [ʃaʊn tɯnʲə] 'old man,' not *seann dhuine [ʃaʊn ɣɯnʲə]. Second, a set of onset clusters do not undergo mutation, which may be defined as those clusters whose second element would be subject to lenition if it occurred alone, i.e., /sp, st, sk, sm/. These facts suggest that mutation in Scottish Gaelic is best viewed as a morphophonological process, not as a strictly phonological one.
The relationship between Gaelic mutated forms and their unmutated counterparts is similar in some ways to English past tense verbs related to their present tense verbs via ablaut (e.g., strong verbs such as spoke-speak). Previous work in the auditory domain (Justus et al., 2008) has found both behavioral (facilitatory priming) as well as online (reduced N400 components in ERP measures) evidence for pairs related by ablaut, suggesting that even if irregulars are lexically listed they must nonetheless also be connected at some level of representation. In the Scottish Gaelic case, initial consonant mutation distinguishes different forms of the same stem in a similar, nonconcatenative way, but unlike English ablaut, applies regularly throughout the verbal and nominal systems. The other difference, of course, is that in Scottish Gaelic, it is the initial segment of the word where the morphological distinction is manifested. Thus, upon encountering a mutated form, a listener may need to entertain multiple hypotheses; the word being perceived could be a form beginning with an underlying /f/, for instance, or it could be a form beginning with an /f/ resulting from initial consonant mutation. Many models of spoken word recognition place importance on the initial segment in narrowing the set of lexical candidates in lexical retrieval, but if listeners must consider a wider pool of candidates, specifically forms related by mutation, then it could be the case that the initial consonant does not automatically reduce this pool.
Connine et al. (1993) tested the notion that word-initial phonemes have a special status through a series of lexical decision experiments with priming, using auditory nonword primes with visual targets. In their Experiment 2, greater facilitation was found for prime-target pairs in which the nonword prime's initial segment differed by two or fewer features from its target, compared with nonword primes in which the initial segment differed by at least four features from its target. However, in a subsequent experiment, the same priming effects were observed when nonword primes differed in a medial, rather than the initial, segment, thus raising questions for models that depend crucially on word onset. Connine et al. (1993) conclude that their findings support models such as TRACE (McClelland & Elman, 1986) in which cohorts based on word beginnings are not relevant to word recognition.
As mentioned above, initial consonant mutation is also found in the Celtic language Welsh, and was explored using auditory priming in a lexical identification task by Boyce et al. (1987). Welsh has three types of initial consonant mutation, though only two types (aspirate and soft mutation) were used in their study. All of their stimuli contained prime and target items embedded within a triggering context. Across a series of three experiments, native Welsh-speaking listeners first listened to a series of spoken primes and then were asked to identify spoken target items embedded in noise. Targets were presented in three priming conditions: The prime was identical to the target, related to the target via a morphophonological initial consonant mutation, or unrelated to the target. The hypothesis being tested in their experiment was whether exposure to the primes would facilitate identification of the targets. Their dependent variable was accuracy: How accurately did listeners identify the target items? Boyce et al. (1987) reported that there was priming between base forms and their mutated counterparts, and vice versa, based on significant differences in identification accuracy. Their results showed equivalent priming regardless of whether the prime and target were mutated or unmutated forms, and in addition that the priming obtained was due not to the close phonological relationship between a base form and its mutated counterpart but rather to the morphological structure shared by a base and its mutated form. They proposed a model of the lexicon with two levels: One for word forms and another at which forms sharing morphological structure are co-listed.
In the current work, we use three psycholinguistic methods to study the representation and processing of Gaelic initial consonant mutation. Because Gaelic is an endangered language, there are some limitations on what methods can be used that do not apply to studies on English, Dutch, etc. For example, it is not possible to recruit upward of 60 native listeners, especially if they must be literate in Gaelic in order to perform a given experimental task. However, the native speaker author (Fisher) was able to recruit a larger number of native speakers than is typical for experiments on endangered languages, particularly native speakers who use Gaelic in their daily life and who are literate in Gaelic. We begin with the task that addresses lexical representation most directly: Lexical decision with auditory masked priming. After using that task to examine relatedness of the forms in the lexicon, we turn to two speech perception tasks, gating and phonetic identification, to examine how listeners recognize words that might contain a mutated consonant while hearing speech. Thus, we begin with the lexical representations of the related forms, and then turn to how the forms are accessed during spoken word recognition.
Kouider and Dupoux (2005) developed a technique for studying word recognition in the auditory modality using masked auditory primes. Masking is achieved by durationally compressing the prime and embedding it within a sequence of forward and backward auditory masks. Primes sound like noise with a small amount of unrecognizable voice-like sound. Masking the primes allows us to examine potentially early and automatic stages of word recognition, with as little effect from working memory as possible given that the masked primes are not consciously perceived by participants. At a 35% compression rate, Kouider and Dupoux (2005) reported significant repetition priming, but no phonological, morphological, or semantic priming for French listeners in a lexical decision task with French primes and targets; across conditions, no prime awareness was found at this compression rate. For example, the masked prime cousine 'cousin, f.' facilitated recognition time for the repetition target cousine, but the morphologically related masked prime cousin 'cousin, m.' did not; similarly, no priming was found between phonologically related pairs such as devis 'estimate, quote' and devise 'motto,' nor between semantically related pairs such as lapin 'rabbit' and carotte 'carrot.'

Introduction
The same technique was used to explore a different type of typologically rare morphology in an auditory lexical decision task in Maltese (Ussishkin et al., 2015), where both repetition priming and morphological priming by consonantal roots were found for verbs. As in other Semitic languages, morphologically related Semitic words in Maltese share a sequence of three non-contiguous consonants that typically signal the contentful meaning; this sequence of consonants is known as the 'consonantal root.' In Ussishkin et al. (2015), facilitatory priming was found between the masked prime giddem 'to gnaw' and the repetition target giddem, as well as between the masked prime ngidem 'to be bitten' and the morphologically related target giddem, which shares its consonantal root with the prime. Similar root priming effects were found using the same method in Moroccan Arabic as well (Schluter, 2013). In both the Maltese (Ussishkin et al., 2015) and the Moroccan Arabic (Schluter, 2013) studies, root priming effects were shown to be morphological, and independent of any possible contribution from phonological form overlap between prime-target pairs sharing a consonantal root. Here, we apply lexical decision with auditory masked priming to Scottish Gaelic verbs to test whether the mutated form of a verb (e.g., bhuail [vuəʎ] 'hit, past') will facilitate lexical access to its unmutated counterpart (e.g., buail [puəʎ] 'hit, imperative').

Materials
Sixty real word targets were chosen. All targets were imperative forms of Scottish Gaelic verbs. Imperatives were chosen because their mutated counterparts (the past tense forms of the verbs) require no overt trigger for mutation, so that mutated and non-mutated forms are both words that can stand alone, unlike cases of mutation with an overt preceding trigger (e.g., the third person masculine possessive as in bò [poː] 'cow'/a bhò [ə voː] 'his cow'). Since plain imperatives and their corresponding mutated past tense forms are identical apart from their initial consonants, some degree of form priming due to the shared components may very well be expected (e.g., Dufour & Peereman, 2004). Previous work in the auditory domain has established that form priming can have different effects on target recognition, depending upon the position of form overlap between the prime and the target. When prime and target overlap at word onset, target recognition is inhibited, but when prime and target overlap at the end of the word (e.g., when prime and target rhyme) target recognition is facilitated (Radeau et al., 1995; Slowiaczek et al., 2000). We used three types of controls to parse out the effect of form priming, and quantify the amount of further priming that the morphophonological relation provides: A repetition priming condition, a rhyme-overlap phonological priming condition, and an unrelated/control priming condition.
The target stimuli were chosen with the native speaker author's assistance to enable a Latin square counterbalanced design. Sixty sets were chosen, each with an imperative verb target and four primes (as well as six unrelated words for use as "masks"; see examples in Table 3): An identity prime (prime and target are identical); a morphological prime (prime is the mutated past tense form of the target); a phonological prime (prime shares all but the first segment with the target, but its first segment is not the mutated counterpart of the target's initial consonant); and a control prime (prime and target are unrelated words).
The phonological priming condition served as a secondary type of control condition. In this condition, the prime and target shared all but the first segment, without sharing any morphological or semantic relationship, in order to test for strictly form-based or phonological priming. This was in contrast to the morphological priming condition, in which the prime and the target are mutated and unmutated counterparts of each other, so in addition to sharing their form they also share a close morphophonological relationship. If we were to find morphological priming, it would be necessary to rule out a strictly form-based effect, and including the phonological condition allowed for this comparison to be made. In this condition, we expect that recognition of targets may be somewhat faster than in the control condition (Slowiaczek et al., 2000), but not as fast as the morphological condition. This is because faster recognition of targets in the morphological condition would be ascribed to a closer lexical relationship between morphologically related forms relative to forms related merely by form overlap. The identity and unrelated priming conditions bookend the range of priming effects, and allow a calibration of the magnitude of the morphological effect under examination. Table 3 illustrates the four priming conditions using the real word target buail 'hit (imperative).' A table listing all primes and targets is provided in Appendix A.
In addition to the 60 real word targets and their primes, we created 60 nonword targets and paired them with real word primes as well, so that half of the items in each list were nonwords. Nonword targets were derived from the 60 real word targets by shuffling their initial consonants such that none of them was listed in our dictionaries, but they nonetheless formed phonotactically legal nonwords. During recording, a small number (~5) judged similar to words by the native speaker author were changed by again modifying the initial segment to render them more obviously nonword-like.
All participants received identical prime-target pairs for nonword targets, unlike real word targets, which were rotated to counterbalance all targets among the four priming conditions. So for example, while the unmutated target buain [puəɲ] 'harvest (imperative)' was primed with itself (identity condition) in list 1, it was primed with bhuain [vuəɲ] 'harvest (past)' (morphological condition) in list 2, with cuain [kʰuəɲ] 'ocean (gen)' (phonological condition) in list 3, and with the unrelated word ainleag [aɲʎak] 'swallow' (unrelated condition) in list 4. Nonword targets were also paired with real word primes, which are not consciously perceived (Ussishkin et al., 2015).
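The rotation of real word targets across the four lists can be sketched as a simple Latin square. The particular rotation rule below (condition index = item index plus list index, modulo four) and the function name are assumptions for illustration; the text specifies only that each target appears in a different priming condition in each list.

```python
CONDITIONS = ["identity", "morphological", "phonological", "control"]


def latin_square_lists(n_items=60, conditions=CONDITIONS):
    """Build one list per condition; item i in list l gets condition (i + l) mod k.

    Each returned list is a dict mapping item index -> condition name.
    """
    k = len(conditions)
    return [
        {item: conditions[(item + lst) % k] for item in range(n_items)}
        for lst in range(k)
    ]


if __name__ == "__main__":
    lists = latin_square_lists()
    # Item 0 rotates through all four conditions across lists 1-4,
    # in the same order as the buain example in the text.
    print([lst[0] for lst in lists])
```

With 60 targets and four conditions, each list contains 15 targets per condition, and every target is heard in every condition across the four participant groups.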
All real word and nonword items were recorded by our native speaker author in a sound-isolating Whisper Room recording booth at a sampling rate of 44,100 Hz with 16-bit quantization. The speaker was instructed not to use list intonation, and to pronounce each item clearly but at a normal speech rate. Each item was read three times from a printed list in Scottish Gaelic orthography, using a head-mounted Countryman Associates microphone connected to an Alesis Masterlink 9600 with a Symetrix Audio 302 pre-amplifier.  A trained phonetician research assistant selected the best of the three tokens for each item (the one with the most neutral intonational contour and no non-linguistic intrusions like coughs). This was typically the second of the three tokens for each item.

To create the stimuli, each prime-target pair was matched with an auditory mask composed of six unrelated Gaelic words with a very low degree of segmental overlap with the prime and target. These mask component words were recorded as part of the recording list mentioned above. All stimulus components (targets, primes, and masks) were scaled to the global average RMS levels for all recordings (roughly 82 dB SPL) prior to the combination into stimuli. Primes and masks (but not targets) were then compressed to 35% of their original duration, and their intensity downscaled by 15 dB, and masks were temporally reversed. The components were then combined such that onset of the target followed a forward mask and the prime, and was itself masked by five consecutive masking words. A typical item is illustrated in Figure 1 below to show the occurrence of each component over the timecourse of a trial. In this figure, the word "mask" is backward, compressed, and in smaller print, to visually represent that it has been reversed, compressed, and reduced in amplitude; the word "prime" is likewise prepared and represented, except that it has not been reversed.
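The arithmetic behind the level and duration manipulations can be sketched in Python. This is not the processing script used for the experiment: the function names are hypothetical, samples are represented as plain lists of floats, and the time-compression algorithm itself is not implemented, since the text does not specify it. Only the resulting duration is computed.

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def prepare_mask(samples, attenuation_db=-15.0):
    """Attenuate a mask by the given dB change and reverse it in time.

    Time compression is assumed to have been applied beforehand.
    """
    gain = db_to_gain(attenuation_db)
    return [x * gain for x in reversed(samples)]


def compressed_duration(duration_ms, rate=0.35):
    """Duration after compression to `rate` of the original duration."""
    return duration_ms * rate


if __name__ == "__main__":
    print(round(db_to_gain(-15.0), 3))  # linear factor for a 15 dB reduction
    print(compressed_duration(600.0))   # a 600 ms prime, in ms
```

A 15 dB reduction corresponds to multiplying the waveform by roughly 0.18, and a prime that was originally 600 ms long would last about 210 ms after compression to 35%.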

Participants
Twenty native speakers of Scottish Gaelic (6 males, 14 females; aged 22-66, mean age 47, SD 13) participated in the priming experiment. Nineteen grew up in the Hebridean Islands, one in Glasgow. All but one were raised with little or no non-Gaelic exposure until going to school. In terms of formal education level, two participants had completed secondary education (high school), and the remainder had attended university or other types of post-secondary education. All participants report spending most or all of their time using Gaelic in their current personal daily life and at work. Only one participant reports partial hearing loss, but her accuracy rate was above the mean across the participants, so her data were not excluded.

Procedure
The testing occurred in a quiet classroom at Sabhal Mòr Ostaig, a Gaelic language university in Sleat on the Isle of Skye. Participants were seated in front of a button box wearing headphones during the experiment. Each participant was orally instructed to listen to the stimuli and decide, as quickly and as accurately as possible, whether each target was a word they knew in Scottish Gaelic by pressing a button marked 'yes' or 'no' on a response box connected to a computer. The software randomized the order of stimuli each time the experiment was run, so each participant responded to items in a different, randomized order. Participants were assigned to the four counterbalanced lists in a rotating order. The experiment was run using E-Prime 2.0 Professional, which measured response accuracy and reaction time (RT). Before the testing began, participants were given two practice blocks in order to familiarize them with the slightly strange stimuli and the task. They were told by the experimenter, "Imagine you are at a party. There will be some noise from other people talking, but you will hear something that is louder than the noise. Listen for that thing, and decide whether it is a word you know or not." They then passively listened to four stimuli (both words and nonwords, with mutated consonants and nonmutated consonants), after which they were asked to practice responding to a further four practice items. At this point, they were invited to ask questions before beginning the experiment proper. The experimental test items themselves were presented in four blocks (30 items in each block), and participants were invited to take a short break in between. After the four blocks, the participants were debriefed and this completed the procedure. Participants were compensated monetarily for their participation at the rate of UK £50 each.

Results
First, a d-prime analysis was conducted to assess whether participants were responding as expected in the task, to what extent they were able to detect the difference between words and nonwords, and how much bias they displayed. All d-primes were positive (mean 1.58; SD 0.33; range 1.08 to 2.34), indicating participants were able to discriminate words from nonwords overall. All betas were also positive (mean 0.74; SD 0.30; range 0.11 to 1.56), indicating that they had a nonword bias, preferring to make errors responding "nonword" to words rather than false positives responding "word" to nonwords. We take this to mean that participants are able to do the task as expected.
We measured reaction time in milliseconds from onset of target, log-transformed the reaction times to ensure a near-Gaussian distribution, and used this (LogRT) as the dependent measure in our analyses. Mean reaction times, error rates, and outlier rates are given in Table 4. A linear mixed effects regression model was fitted to determine whether there was an effect of priming condition and whether the control condition was significantly different from the other three conditions. Some responses were considered ineligible for this modeling. Only correct responses given after the onset of a real word target were included in the analyses, and 3 responses more than 2.5 standard deviations from the mean LogRT were considered outliers and consequently excluded. In total, 1001 of 1200 total observations were used in modeling.
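For readers unfamiliar with signal detection measures, the following minimal Python sketch shows one standard way to compute sensitivity and bias from hit and false-alarm rates. The text does not state which bias statistic or which correction for extreme rates was used, so the formulas below (criterion c and ln β) and the function name `sdt_measures` are assumptions rather than a description of the analysis actually performed.

```python
from statistics import NormalDist

# Inverse of the standard normal cumulative distribution (z-transform).
_z = NormalDist().inv_cdf


def sdt_measures(hit_rate, fa_rate):
    """Return (d', c, ln beta) for hit and false-alarm rates strictly in (0, 1).

    A "hit" is a "word" response to a word; a "false alarm" is a
    "word" response to a nonword.
    """
    z_hit, z_fa = _z(hit_rate), _z(fa_rate)
    d_prime = z_hit - z_fa
    criterion = -(z_hit + z_fa) / 2
    ln_beta = d_prime * criterion
    return d_prime, criterion, ln_beta


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    print(sdt_measures(0.70, 0.20))
```

Under these definitions, positive d' indicates above-chance discrimination, and positive c or ln β indicates a conservative tendency, here a tendency to respond "nonword".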
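The log transformation and outlier criterion can be sketched as follows. The use of the natural logarithm, the sample standard deviation, and a single grand mean (rather than per-participant means) are assumptions, since the text does not specify them; the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def trim_log_rts(rts_ms, n_sd=2.5):
    """Log-transform RTs (ms) and drop values beyond n_sd SDs from the mean LogRT."""
    log_rts = [math.log(rt) for rt in rts_ms]
    m, s = mean(log_rts), stdev(log_rts)
    return [x for x in log_rts if abs(x - m) <= n_sd * s]


if __name__ == "__main__":
    # Hypothetical RTs for illustration: twenty typical responses and one very slow one.
    rts = [1000.0] * 20 + [20000.0]
    kept = trim_log_rts(rts)
    print(len(rts), "->", len(kept))
```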
Having determined the optimal model, we turn to the fixed effects. RT means, differences, and effects to follow are given back-transformed from the LogRTs of the model. The grand mean RT was 1.155 s; the model intercept RT was 925 ms. Target Duration relates significantly to response time (extending RTs by a factor of 1.27 s per 1.00 s of target duration, t = 5.07, df = 58.2, p < 0.001), with longer targets eliciting longer RTs. Within the Priming Conditions, the Unrelated Control condition was coded as the reference level. The Identity and Morphological priming levels showed significant facilitation of responses, while Phonological priming did not. These results are summarized in Table 5.

Discussion
These results are consistent with the hypothesis that the mutated form of a verb facilitates lexical access to its unmutated counterpart, and that this facilitation effect cannot be due solely to phonological overlap between two verbs related by initial consonant mutation. What's more, the facilitation effect we found must take place relatively early during processing, since the primes were not consciously perceived by native speaker participants due to being masked as described above.
This leaves open the question of how listeners process the sounds of a mutated or unmutated form. When listeners hear a sound that could be a mutated consonant, do they assume it is the result of mutation? When listeners are hearing Scottish Gaelic connected speech, do they consider mutation forms of words as possible words immediately in the spoken word recognition process, equally to words that are not the result of mutation? To address these questions, we turn to a gating task.
An open-response gating task provides an initial way to assess what words listeners are considering as potential words to recognize at a given point in the acoustic signal (Grosjean, 1996), and this method can be used in a field setting. In this task, a recording of each word is gated, that is, cut off at a particular time point in the signal. For example, listeners might hear the beginning of a recording of phòg [fɔːk] 'kissed' or pòg [pʰɔːk] 'kiss' or foghlam [fʊɫəm] 'education,' gated to end at the end of the first consonant, or the end of the first vowel, or the end of the second consonant. In the current study, the stimulus always begins at the onset of the word (or phrase when there is a preceding particle) up to the gate point. Listeners respond with a whole Gaelic word that the stimulus might have been the beginning of (or phrase beginning with the particle), writing their response with pencil and paper in Gaelic orthography. Across the pool of listeners, the set of words given as responses gives an indication of what set of words native listeners consider as candidates for lexical recognition (Grosjean, 1996; Warner, 1998). [...] 'open!,' this shows that listeners are considering at least these words as possible lexical items to recognize based on hearing that portion of the signal.
To answer our question about processing of mutation during spoken word recognition, we can examine what proportion of the listeners' responses are words where the initial consonant is caused by mutation, regardless of whether the response is the word that the stimulus was made from or not. For example, if a listener hears [f] from phòg [fɔːk] 'kissed' and responds with phaisg [faʃkʲ] 'wrapped,' this indicates that this listener was considering a mutation form at this point, since this word is derived by mutation from the imperative form of the verb paisg [pʰaʃkʲ] 'wrap!' However, if a listener responds to the same stimulus with fad [fat] 'length,' this indicates consideration of a non-mutation form, since the /f/ in this word is underlying, not caused by mutation. Since each listener gives only one response, a mutation response does not mean that the listener is not also considering non-mutation forms or vice versa, but the total set of responses given by listeners does indicate at least some of the words listeners are considering at a given time point in the signal. This task avoids asking listeners directly whether the stimulus matches a form derived by mutation or not. Furthermore, because the task is open response and responses are whole words, this task requires spoken word recognition, and does not prompt the listener to give a metalinguistic judgment about a given sound.

Materials
We chose nine sets of three Scottish Gaelic words each to use as the target stimuli (examples in Table 6). In each set, one word began with an unmutated consonant (e.g., the pòg [pʰɔːk] 'kiss' example above), another was the mutated form of the same word (e.g., phòg [fɔːk] 'kissed'), and the third was an unrelated word beginning with the same phoneme as the mutated word, but with that phoneme present underlyingly, not derived by mutation (e.g., foghlam [fʊɫəm] 'education'). For three sets, including the pòg set, there was no preceding context and the mutation was triggered by past tense or another morphological category that mutates the target without an overt particle. For another three sets, the mutation was triggered by a preceding particle, but a homophonous particle that does not trigger mutation can also occur in the same environment (the a peann set in Table 6). In this condition, the particle a [ə] can mean either 'his' (triggering mutation) or 'her' (not triggering mutation). The third word of this set, a feannag [ə fʲaʊnˠak] 'her crow,' has initial [fʲ] underlyingly, not caused by mutation. One might think these two item sets (no particle and ambiguous particle) would be equivalent, since neither gives the listener grammatical information about mutation, but it is not known whether one might offer listeners more information about mutation than the other, or whether these conditions might differ in listeners' bias toward or against mutation. We included both in order to determine whether presence of a preceding particle might make any difference. The final three sets of words also have a particle before the target consonant, but in this case, the particle specifies unambiguously whether mutation is expected on the following word-initial consonant or not. For example, am [am] 'the/their' and gu [ku] 'to' cannot be followed by mutation, while mo 'my' must be.
In this condition, if listeners recognize the particles and treat them grammatically as expected, then once listeners hear the particle, the mutation status of the following consonant should be clear. For all word sets, the vowel after the target consonant was matched as closely as possible, but an exact match is not necessary for this task. All words were chosen to have at least some other lexical items in Gaelic that begin with the same sound sequence, so that there would be more than one possible response.
Table 6: Examples of stimulus conditions. The target consonant (word-initial) is the initial consonant of the noun, whether there is a preceding particle containing a consonant or not.
In addition to the three items in each of the nine conditions in Table 6, we also included three words beginning with consonants or consonant clusters that are not subject to mutation (e.g., /l/, /sp/), to provide a small number of fillers. Furthermore, if listeners were to perform very poorly at giving responses to the stimuli based on words in Table 6, then we could use the filler non-mutatable words to assess ability to do the task when word-initial morphophonology is not at issue at all. All items, including these, appear in Appendix B. All items were chosen by first developing lists of candidate items from dictionaries, after which Fisher, the native speaker author, screened potential items and suggested alternatives. Because Gaelic has considerable dialectal variability and is endangered, many words listed in dictionaries are low frequency, archaic, or from a different dialect, and thus not familiar to at least some fluent Gaelic speakers. Fisher selected items that would be readily familiar to most fluent speakers on the Isle of Skye, where we conducted the experiments.
For each item, we created three gated stimuli. Fisher was recorded reading all of the target words as part of a longer list including the stimuli for Experiment 3 below. The recording was made at the University of Arizona under the same conditions as the recording for Experiment 1. Using Praat (Boersma & Weenink, 2016), we positioned gate onset points for each item at the beginning of any speech sound visible in the waveform or spectrogram, including the preceding particle in the conditions with particles.
We positioned gate end points at gate 1: The end of the target consonant, gate 2: Two-thirds of the duration of the vowel after it, and gate 3: Either two-thirds of the way through or at the end of the next segment after that (consonant or vowel), using the following criteria to determine boundaries between segments. The boundary between a voiceless consonant and a following vowel or voiced consonant was positioned at the onset of voicing. The few low amplitude periods between onset of voicing and onset of clear formant structure were considered to be part of the vowel, in order to avoid including potential vowel quality and onset of voicing cues in the consonant-initial gate 1 stimuli. Thus for example, the boundary between [fʲ] and its following vowel in mo phiuthar [mo fʲuɛr] 'my sister' was placed at onset of voicing, which occurs slightly before onset of F2 and F3 for the vowel. The boundary between a vowel and a voiced obstruent, or between a vowel and a following voiceless consonant, was considered to be onset/offset of clear F2 of the vowel. (This is not symmetric with the boundary for voiceless consonant followed by vowel because of the tendency of low-amplitude voicing to continue into a following phonemically voiceless consonant for a relatively large portion of the consonant.) For boundaries between vowels and nasals, the sudden change in distribution of energy as visible in the spectrogram was used as the boundary. For boundaries between vowels and approximants, if there was a sudden decrease in amplitude for the approximant, that was used as the boundary, otherwise, the point halfway through the duration of the formant shift was used. Because many orthographic consonants of Gaelic are realized as approximants and it was difficult to find enough items for some conditions, we could not avoid vowel-approximant boundaries. Similarly for vowel-vowel boundaries, the half-way point of the formant shift was used.
These boundaries were then used, for the vowel (gate 2) and following segment (gate 3), to locate the point at two-thirds of the duration of that segment. For following consonants, we placed the end of the third gate at two-thirds of the duration of the consonant, except if that consonant was a stop, in which case the end of the consonant was used in order to avoid having some gate end points fall before and others during or after the burst. The gate end point for vowels (gate 2) and for the next segment if not a stop (gate 3) was placed at two-thirds of the duration, not the end of the segment, in order to lessen coarticulatory cues so that the following segment would not already be clearly perceptible. However, for the gate end point in the target consonant (gate 1), we wanted to make sure that all necessary perceptual cues to that consonant were included in the stimulus, regardless of manner and voicing of that consonant. Thus, gate 1 provides the listener with information about the initial consonant of the target word and any preceding particle, as well as some coarticulatory information about the following vowel. Gate 2 provides information up through the following vowel, and gate 3 through the segment after that. Figure 2 shows an example of the time spans included in each gate.
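The placement rules for the three gate end points can be summarized in a short sketch. It assumes that the segment boundaries have already been located by hand as described above and are supplied as (start, end) times. The function name `gate_end_points` and the example times are hypothetical and do not come from the actual stimuli.

```python
def gate_end_points(consonant, vowel, following, following_is_stop):
    """Return the end points of gates 1-3 for one item.

    Each segment is a (start, end) pair in the same time unit.
    Gate 1: end of the target consonant.
    Gate 2: two-thirds of the way through the following vowel.
    Gate 3: end of the next segment if it is a stop,
            otherwise two-thirds of the way through it.
    """
    def two_thirds(segment):
        start, end = segment
        return start + 2 * (end - start) / 3

    gate1 = consonant[1]
    gate2 = two_thirds(vowel)
    gate3 = following[1] if following_is_stop else two_thirds(following)
    return gate1, gate2, gate3


# Made-up boundaries in milliseconds: consonant 0-90, vowel 90-300, next 300-420.
g1, g2, g3 = gate_end_points((0.0, 90.0), (90.0, 300.0), (300.0, 420.0),
                             following_is_stop=False)
_, _, g3_stop = gate_end_points((0.0, 90.0), (90.0, 300.0), (300.0, 420.0),
                                following_is_stop=True)
```

With these illustrative boundaries, gate 2 ends 140 ms into the 210 ms vowel, and gate 3 ends either 80 ms into the following segment or, for a stop, at its end.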
The stimuli were extracted from the recordings, and the amplitude was ramped down over the final 5 ms before the gate end point, to avoid introducing artifactual cues by cutting the waveform off suddenly (Smits et al., 2003). When producing gated stimuli, some type of non-speech noise such as a square wave is often used after the speech, with the amplitude of the square wave being ramped up while the amplitude of the speech is ramped down (Smits et al., 2003). However, in previous work with Gaelic, we found that such square wave beeps at the end of each stimulus made the experiment difficult and irritating for this listener population, so as in previous work (Hammond et al., 2014), we gated to silence.
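As an illustration of gating to silence, the following sketch fades out the final 5 ms of a list of audio samples. The linear shape of the ramp is an assumption, since the text gives only the ramp duration, and the function name `ramp_down` is hypothetical.

```python
def ramp_down(samples, sample_rate, ramp_ms=5.0):
    """Return a copy of samples with the final ramp_ms faded to silence.

    Assumption: the fade is linear; only its 5 ms duration is given in the text.
    """
    n_ramp = min(len(samples), int(round(sample_rate * ramp_ms / 1000.0)))
    out = list(samples)
    start = len(out) - n_ramp
    for i in range(n_ramp):
        # Gain falls in equal steps and reaches 0 on the last sample.
        gain = 1.0 - (i + 1) / n_ramp
        out[start + i] *= gain
    return out


# Toy signal: 100 samples of constant amplitude at a 1000 Hz sample rate,
# so the 5 ms ramp covers exactly the last 5 samples.
faded = ramp_down([1.0] * 100, sample_rate=1000)
```

At a realistic sample rate such as 44100 Hz the same call would fade roughly the last 220 samples; the toy rate is used here only to keep the example easy to inspect.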
The nine conditions (Table 6), with three items per condition and three gates for each word, along with three filler words also presented at three gates, resulted in a total of 90 stimuli ((27 + 3) words × 3 gates). Five additional similar items using different words were constructed as practice items.

Participants
Twenty-five native speakers of Scottish Gaelic participated in this experiment. They have the same characteristics regarding language background and current Gaelic usage as the participants in Experiment 1 above, and in fact, 13 people participated in both experiments. Experiment 1 was conducted one year after Experiments 2 and 3, which were conducted during the same visit to Skye. Because Gaelic is an endangered language, we took care to invite participants who both acquired Gaelic as their first language as children, and are highly fluent Gaelic speakers now. The participants' age range at the time of the experiment was 19 to approximately 70. All but 3 report that they spoke and learned exclusively Gaelic until they went to school. The remaining 3 estimate their childhood language exposure at 50-70% Gaelic. Only 3 participants report any age-related hearing loss. All participants continue to use Gaelic in some capacity in their daily lives, whether at work or with family, etc. All participants are literate in Gaelic, although only the youngest participants received any Gaelic-medium education. Many of the participants are active with the Gaelic language in their work, or were before retiring, in fields such as language teaching, broadcast media, or the arts. Thus, the participants are highly fluent native Gaelic speakers who are readily able to think of lexical items as responses to an open-response gating task, and to write their responses on an answer sheet.

Procedures
The experiment was conducted in a quiet room at the Gaelic language college Sabhal Mòr Ostaig or the Columba 1400 community center in Staffin, both on the Isle of Skye, Scotland. Participants were instructed to listen to the stimulus, and to write down on a numbered worksheet a whole Gaelic word that the stimulus could have been the beginning of. Gaelic examples were given in the instructions. The EPrime software (Psychology Software Tools) was used to present the stimuli. The stimuli were randomized and presented over high-quality enclosed headphones, in the same random order for all participants, after first presenting the five practice items and pausing to allow participants to ask any questions. After hearing each stimulus, participants wrote their response on a paper answer sheet. After 10 seconds, the program presented the next stimulus automatically. In pilot testing, 10 seconds from stimulus onset was determined to be enough to comfortably think of a response and write it. The entire experiment took approximately 15 minutes.

Coding of data
The open-response gating task typically leads to somewhat noisy data. Hand-written responses and the fact that many fluent speakers of Gaelic do not have many opportunities to write Gaelic also add noise to this data. In this task, despite the instructions, it is typical that some participants sometimes fail to write a complete word, write a nonsense word that perhaps sounds word-like or might be a word in that speaker's lexicon, or fail to respond at all (cf. Warner, 1998, describing this for similar English and Japanese experiments). Each response was coded for real-word status. Responses of nonce words, partial responses giving only some sounds, or no response at all were all counted as non-word responses. An example of a partial response is bh or bh i given to stimuli beginning with orthographic bh /v/. An example of a nonce word response is ath fhluinn (approximately [a ʎuiɳ]), which one listener gave to two different first gate stimuli. For this coding, minor departures in spelling from the prescriptive standard, such as omission of accent marks on vowels, reversals of letters, etc. were ignored. For example, a response of spog was assumed to be the real word spòg [spɔːk] 'claw,' a response of tiut was assumed to be the real word tuit [tʰuhtʲ] 'drop!,' etc. In any questionable cases and any cases of difficult handwriting, Fisher judged the response. This method of collecting and analyzing results is subject to a small amount of experimental error through handwriting and spelling difficulties, but the alternative of recording responses out loud would require judging after the fact what was said and intended for each response, and we feel that would introduce more error. Furthermore, especially at gate 3, many listeners responded with the same word, which facilitates identifying the intended word among spelling variations.
Each response was classified by whether the initial consonant of the response word (after the particle, if any) was the result of mutation (e.g., orthographic ph by mutation, pronounced /f/ in the response word, formed from a related p-initial root), or a 'matched underlying' sound (e.g., orthographic f /f/ in the response word, not formed by mutation). The consonant might also be the unmutated consonant related to the target consonant set. For example, because the stimulus word pòg forms part of a set with phòg, a response word beginning with an orthographic p (pronounced as /pʰ/ in the response word) would be counted as an unmutated response if it was in response to a stimulus beginning with any of the p/ph/f set. To be classified as mutated, unmutated, or matched underlying, the response consonant had to be from the same set (e.g., p/ph/f or d/dh/dh) as the target consonant of the stimulus, as shown in Table 6. Finally, it was also possible for the initial consonant of a response word to be completely unrelated to the entire target set of consonants of the stimulus. For example, one participant responded to the first gate of bheòthaich /vʲoːiç/ 'revived' (where only the /v/ is presented, and it could be misheard) with the word ruig [rɯkʲ] 'reach!,' a real word that begins with a completely unrelated consonant.
A consonant was judged as being the product of mutation according to several criteria. The most important criterion was whether the response corresponded transparently with an unmutated counterpart. For example, a response bheò [vʲoː] 'alive' corresponds to the unmutated form beò [pʲoː], so bheò would be labeled as a mutated response. By contrast, a response bheil [veλ] 'am/are/is' would be labeled as matched underlying, because bheil has no unmutated counterpart; it is not the mutated form of *beil [pʲeλ]. Occasionally, orthography offered a reliable clue to mutated status as well. For instance, the spellings f and ph both represent the pronunciation /f/, but ph represents a /f/ produced through mutation of an underlying /pʰ/, while f represents an underlying /f/. Thus, the response pheann [fʲaʊnˠ] 'pen' would be labeled as mutated, while the response feannag [fʲaʊnˠak] 'crow' would be labeled as matched underlying. Finally, a response was judged to be mutated if its orthography made clear that the participant was considering a mutated form whose initial consonant was identical to that of the stimulus, but derived through mutation from a different underlying source consonant. For instance, the spellings mh and bh both represent the sound /v/, but are mutated forms of different underlying consonants: The unmutated counterpart of mh is /m/, while the unmutated counterpart of bh is /p/. Thus, the response mheall [vʲaʊɫ] was judged to be a mutated response to the stimulus bheòthaich [vʲoːiç] 'revived,' because mheall corresponds transparently to an unmutated form meall [mʲaʊɫ] 'hill,' and has the same initial surface consonant as bheòthaich.
To summarize the classification of mutation status of the response, each response was classified by whether the consonant in the position of the target consonant was the correct consonant set (e.g., p/ph/f) and a result of mutation, the correct consonant set and a 'matched underlying' consonant, the correct consonant set but the unmutated member of the set, substitution of an unrelated consonant, or no response at all. This classification was done even if the response consisted of a few sounds rather than a full word, as long as a consonant was present in the relevant position (initial or after a particle). For example, a sounds-only response of a fa- to stimuli created from a facal [ə faxcaɫ] 'her word' was classified as a matched underlying response, based on the orthographic "f" of the response.
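The five-way classification summarized above can be sketched as a simple decision procedure. This is only an illustration, not the coding procedure actually used, which relied on native-speaker judgment. The function name, the dictionary layout, and the b/bh set are assumptions. The lexical judgment about whether a response has an unmutated counterpart is passed in as a flag rather than computed, and responses that are mutated from a different underlying consonant (such as mh for a bh stimulus) are not handled.

```python
def classify_response(consonant, target_set, has_unmutated_counterpart=False):
    """Classify the consonant written in the target position of a response.

    consonant: orthographic consonant in target position, or None/"" if absent.
    target_set: dict with the spellings of the 'unmutated', 'mutated', and
        'underlying' members of the stimulus consonant set.
    has_unmutated_counterpart: human judgment, needed only when the mutated
        and underlying members share a spelling (e.g., bh).
    """
    if not consonant:
        return "no response"
    c = consonant.lower()
    if c == target_set["unmutated"]:
        return "unmutated"
    if c == target_set["mutated"] and c == target_set["underlying"]:
        # Spelling alone cannot decide, as with bheò (mutated) vs. bheil.
        return "mutated" if has_unmutated_counterpart else "matched underlying"
    if c == target_set["mutated"]:
        return "mutated"
    if c == target_set["underlying"]:
        return "matched underlying"
    return "substitution"


P_SET = {"unmutated": "p", "mutated": "ph", "underlying": "f"}
# Assumed layout for the b set, where one spelling covers both categories.
B_SET = {"unmutated": "b", "mutated": "bh", "underlying": "bh"}

codes = [
    classify_response("ph", P_SET),        # e.g., pheann
    classify_response("f", P_SET),         # e.g., feannag
    classify_response("p", P_SET),
    classify_response("r", P_SET),
    classify_response(None, P_SET),
    classify_response("bh", B_SET, True),  # e.g., bheò, from beò
    classify_response("bh", B_SET, False), # e.g., bheil
]
```

Keeping the lexical judgment as an explicit input reflects the point made above that orthography is only occasionally a reliable cue to mutation status.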
Two participants' data were excluded from both Experiments 2 and 3 after coding of data because these participants showed much higher error rates than other participants on either this experiment or Experiment 3. One of these participants mentioned problems with hearing loss. The other gave anomalous responses that did not match the phonemes of the stimuli more often than other participants. All results below for both Experiments 2 and 3 are reported without these two participants.

Results
Participants wrote no response at all or wrote only the initial particle for 6.1% of all stimuli. Sounds-only responses were given to 3.8% of all stimulus presentations, and nonsense words constituted 3.3% of all responses. For 86.7% of all stimulus presentations, participants gave a real word of Gaelic (allowing for spelling variation, as explained above). Thus, the participants were for the most part able to do the task.
Turning to what type of consonants participants used in their responses, 6.7% of all stimulus presentations received either no response at all or one containing no consonant in the position of the target consonant. Consonant substitutions (e.g., a response with r to a stimulus with bh /v/) accounted for only 2.7% of all responses. Across all conditions, the remaining responses were distributed as 27.3% mutation responses (e.g., a word beginning with ph in response to a word in the p/ph/f set), 33.6% matched underlying responses (e.g., a word beginning with f to the same stimuli), and 29.7% unmutated responses (e.g., a word beginning with p to the same stimuli). This suggests that the task was successful in eliciting a variety of types of lexical responses with regard to mutation. Figure 3 shows the proportion of responses with mutated consonants for each of the nine conditions. For this and all further analyses, non-responses, responses without any consonant in the target position, and responses with substitution of an unrelated consonant were excluded, so figures show the proportion only out of responses with some consonant of the relevant set in the target position (e.g., one of p, ph, or f to a word in the pòg item set). The top panel of Figure 3 shows that if the stimulus contains an unmutated consonant (e.g., [pʰ]), listeners do not respond with mutated or matched underlying words (e.g., [f]-initial words either from underlying /f/ or from a word mutated to [f] from /pʰ/). This is not surprising: Native Gaelic listeners can accurately distinguish [pʰ] from [f], and the existence of a morphophonological alternation between the two sounds does not confuse them into responding with the opposite member of a mutation pair. Since the proportion of mutation responses to unmutated stimuli is so near 0, the results in the top panel will not be included in statistical analyses.
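The dependent variable described above, the proportion of mutated responses among responses containing a consonant of the relevant set, can be sketched as follows. The category labels and the function name `mutated_proportion` are illustrative assumptions, and the example codes are made up.

```python
RELEVANT = {"mutated", "matched underlying", "unmutated"}


def mutated_proportion(codes):
    """Proportion of mutated responses among responses of the relevant set.

    Non-responses and substitutions of an unrelated consonant are excluded
    from the denominator. Returns None if no relevant responses remain.
    """
    relevant = [code for code in codes if code in RELEVANT]
    if not relevant:
        return None
    return relevant.count("mutated") / len(relevant)


# Made-up coded responses for one condition.
example_codes = [
    "mutated", "matched underlying", "unmutated",
    "substitution", "no response", "mutated",
]
proportion = mutated_proportion(example_codes)
empty = mutated_proportion(["no response", "substitution"])
```

In this example four of the six responses contain a consonant of the relevant set, and two of those four are mutated. Returning None when nothing remains makes explicit the case where a condition has no usable responses at all.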
The small number of items per condition, which was necessitated by the difficulty of the task, a lexicon with few near-minimal pairs, and the endangerment situation, creates problems for statistical analysis. A Generalized Linear Mixed Models analysis with Participant and Item as random factors may encounter problems because of the low number of items. Averaging over items to conduct by-participants ANOVAs is also not ideal, not only because averaging over a second random factor is dispreferred, but because with only three items per participant per condition, the dependent variable can only have one of four possible values (proportions of 0, one-third of items, two-thirds of items, or 1). Thus, this is not continuous data. However, it is not possible to conduct this experiment with 10 or more items per condition. Therefore, within the limits of the possible data, we use by-participants ANOVAs in conjunction with fitting a generalized linear mixed model. We use the two statistical methods as convergent evidence. For the by-participants ANOVAs, two participants failed to give a response containing a target consonant for any of the three items in at least one condition. Therefore, these two participants were excluded from the statistical analysis for the ANOVAs only (and for this experiment only).
We conducted an overall by-participants ANOVA with target Consonant type (mutated, matched underlying), Context (no particle, ambiguous particle, unambiguous particle), and Gate (1-3) as factors. All factors were within-participants factors. The dependent variable was proportion of responses containing a mutated consonant in the target position (Figure 3). Because of the significant interactions, we conducted tests of the simple effect of Gate for each combination of Consonant type and Context, and also conducted specific planned comparisons. Mutated stimulus consonants showed a significant increase in the proportion of mutated responses as the gate endpoint moved further into the word for no-particle stimuli (F(2,40) = 69.83, p < .001) and ambiguous-particle stimuli (F(2,40) = 19.39, p < .001), but no change over gates for stimuli containing an unambiguous particle before the target consonant (F < 1). We do not make any predictions about whether the increase in mutation responses happens primarily from the first to second gate, or second to third, because this depends on the lexical competitors available for the various items. Therefore we do not conduct any pairwise comparisons between neighboring gates. The important information is rather in the overall effect of Gate.
For the stimuli in the matched underlying condition (target consonant not a result of mutation, but phonetically the same as the mutated condition, e.g., f [f]), only for items with unambiguous particles, there was a small but significant increase in the proportion of mutated responses at later gates (F(2,40) = 7.08, p < .005). Neither of the other context conditions showed a significant change over gates (no particle: F < 1, ambiguous particle: F(2,40) = 2.20, p > .10). This increase in proportion of responses with a mutated consonant at later gates for the unambiguous particle condition is the opposite of the predicted direction of effect, as these items do not contain mutated consonants, and later gates should, if anything, make this even clearer than it already is from the preceding particle. This will be discussed below.
In order to investigate how the Consonant type of the stimulus influences listeners' use of mutation in word recognition, we compared responses to the mutated vs. the underlying Consonant types at only the first gate, for each Context separately. For both the no-particle context and the ambiguous-particle context, listeners responded equally often with a mutated consonant at gate 1 regardless of whether the consonant actually came from a mutated or an underlying word type (both Fs < 1). Thus, the mutation status of the consonant itself did not affect listeners' choice of whether or not to respond with a mutated word. Only with an unambiguous preceding particle, which specified whether the target consonant should be mutated or not, did the listeners respond more often with mutated consonants to mutated stimuli than to underlying stimuli (F(1,20) = 294.71, p < .001).
In order to confirm that the Context (preceding particle type, if any) affects how listeners determine whether a consonant might be mutated or not, before they hear enough segments after the consonant to use the lexical identity of the word, we compared only the gate 1 stimuli made from words with mutated consonants. There was a significant difference in the proportion of mutated responses across the three Context conditions (F(2,40) = 91.17, p < .001). Pairwise comparisons within these three conditions showed that the no-particle and ambiguous-particle conditions did not differ (F < 1), while the unambiguous-particle condition received significantly more mutation responses than the ambiguous-particle condition at this gate (F(1,20) = 120.00, p < .001). This confirms that for mutated stimuli, it is only the unambiguous-particle condition that differs from the other two.
For the generalized linear mixed models analysis of Experiment 2, a model was fit to the dataset without the unmutated stimuli, because of the near-categorical lack of mutated responses in those conditions. Mutated response was used as the categorical dependent variable. Model selection was performed using ANOVA comparison for nested models with the same random effects structure, and comparison of AIC for comparison of models with differing random effects structure. The model that obtained the lowest AIC was one with the fixed factors Consonant type (mutated as reference level), Context (ambiguous particle as reference level) and Gate (gate 2 as reference level), and allowing all interactions of the fixed factors, with random by-participant and by-item intercepts (mutatedResp ~ Cons * Context * gate + (1 | participant) + (1 | item_set) in R). This model, like others attempted, returned "failure to converge" warnings, but was accepted despite the warnings. A more complex model with several random slopes would be motivated based on the experimental design (Barr et al., 2013), but the increase in AIC, as well as the small number of items, argue against including these in the model. One might also consider omitting the items random factor from the design because of the small number of items, although this would violate the independence assumption. This also increased the AIC. The results for the selected model are as follows:
To confirm the origin of the significant 3-way interactions, models with the same random effects structure (random by-participant and by-items intercepts but no random slopes) were fit to subsets of the data defined by Consonant type and Context. This is equivalent to testing the simple effect of Gate for each Consonant type by Context combination, and the only fixed effect included was Gate, with the second gate as the reference level.
These analyses confirmed that either gate 1 or gate 3 differed significantly from gate 2 for the mutated no-particle condition, the mutated ambiguous-particle condition, and the underlying unambiguous particle condition. This mirrors the results obtained with by-participants ANOVA.
Figure 4 shows the proportion of responses containing an unmutated consonant of the correct set in the target consonant position (for example, responses with p to stimuli made from a p/ph/f set). Examining this proportion as the dependent variable confirms that Gaelic listeners have no difficulty distinguishing unmutated consonants (e.g., [pʰ]) from their corresponding mutated or matched underlying consonants (e.g., [f]), and no difficulty responding appropriately with unmutated words. If a stimulus contains an unmutated consonant, listeners give almost exclusively responses with an unmutated consonant, and if a stimulus contains a mutated or matched underlying consonant, they almost never do. Because the data in Figure 4 is so nearly categorical, it will not be analyzed statistically, but this completes the picture together with the responses in Figure 3. (The remaining proportion of responses not graphed are those containing a matched underlying consonant, with the three types summing to 1.0.)
Figure 5 displays the proportion of responses that consisted of the same word as the stimulus or a morphological form of it (allowing for spelling variations). This data indicates that even when listeners heard the third gate, which ended late in the third segment of the target word, they were often not able to recognize the whole word the stimulus was made from. However, the progression across the gates shows that listeners did choose the actual stimulus word as their response more often as they heard more of the acoustic signal.
A by-participants ANOVA on the same subset of data used for the statistical analyses above   the conclusion we wish to draw from this dependent variable involves only Gate, and the direction of the effect of Gate is consistent in almost all conditions, we will not pursue these interactions further.

Discussion
Comparing listeners' responses to stimuli containing unmutated consonants vs. the other two conditions (e.g., pòg vs. phòg and foghlam), it is clear that the existence of a morphophonological alternation between /pʰ/ and /f/ in the language does not create any difficulties for listeners in distinguishing the acoustic difference between [pʰ] and [f] or in activating words that begin with the appropriate sound. This is as expected: The mutated and unmutated sounds that form mutation pairs are acoustically quite distinct (not just the [pʰ/f] pair, but also other pairs, such as [tʲ/j]), as well as being distinct phonemes in Gaelic apart from mutation. Gaelic listeners, like listeners of any language where these sounds are phonemically distinct, should be able to distinguish them accurately and use that information in spoken word recognition. These results confirm that the presence of a pervasive word-initial morphophonological alternation involving these sounds does not hinder the use of the distinct phonemes for word recognition.
The middle panel of Figure 3, the proportion of responses with a mutated consonant to mutated stimuli, provides the most important information about how listeners recognize words with initial mutation. In words with no preceding particle or an ambiguous preceding particle (the third-person singular possessive particle a [ə], which triggers mutation if it is masculine but not if it is feminine), participants rarely gave mutated forms as responses at the first gate, where they heard only up through the target consonant. However, at the third gate, when they could hear enough of the segments of the word to narrow the lexical candidate set, they were much more likely to respond with a word containing a mutated consonant. When they heard a preceding particle that must occur with mutation (the unambiguous condition), they almost always responded with a mutated consonant by the first gate.
Taken together, the results for mutated stimuli show that participants did not assume that the consonant was caused by mutation unless or until lexical/grammatical evidence was available to suggest a mutated consonant. That evidence could come either from a preceding particle that triggers mutation, or from hearing enough sounds of the word to narrow the set of lexical candidates. However, what is striking here is how rarely listeners gave a mutated response if such evidence was lacking: At the first gate, for no-particle and ambiguous-particle conditions, listeners only gave mutated responses approximately 20% of the time. The responses to the same items at gate 3, however, verify that listeners do have mutation forms in their lexicon that they could have given to these items at gate 1. This suggests that unmutated words are the primary lexical candidates activated during spoken word recognition until more evidence accumulates for a mutated word. The preference for non-mutated responses when either type is possible could reflect higher frequency of unmutated forms than mutated forms (which by definition are derived and typically lower frequency). Given that we cannot obtain reliable frequency measures for mutated and unmutated forms of all relevant words in each environment in a given Gaelic speaker's dialect, we cannot test whether this effect is purely frequency-based, or based on listeners' grammatical knowledge, or both. Regardless of whether this effect stems from frequency differences, the results show that listeners do not favor mutated forms unless there is evidence that makes them the only possibility.
The results for matched underlying stimuli (made from words with the consonant phonetically matched to the mutation consonant, but not caused by mutation, bottom panel in the figures) also show that when there is no specific evidence for a mutation form, listeners only rarely give responses containing mutation. Thus regardless of preceding particle, at gate 1, listeners give responses using mutation only about 20% of the time for these stimuli, even though there are possible lexical candidate words with mutation. In these items, mutation responses remain rare even at gate 3, because more acoustic information will only help to confirm that the word does not contain mutation.
We should note here that the unambiguous particle condition for these matched-underlying words shows a surprising reversal of the expected effect: When hearing a stimulus like gu Fionnlagh [ku fʲunɫa] 'to Finlay,' where the preceding particle specifies that the upcoming word cannot contain mutation, listeners actually become significantly more likely to give a mutation response over time, rather than less so. We believe this is because these phrases, consisting of a non-mutation-triggering particle followed by a word beginning with underlying /f, v, h/, are quite rare in the language. We had difficulty finding appropriate items for this condition. (Items also had to match the other words in the set on following vowel.) Of the three items, one is an idiom (mar thalla [mar haʊɫa] 'go away!'), and the other two are gu 'to' followed by a placename. The item gu Bhatarsaigh [ku vahtɛrsəɪ] 'to Vatersay' with non-mutation /v/ only occurs because this /v/-initial placename was borrowed from North Germanic from the Vikings. Thus, listeners had difficulty thinking of responses for these items, and if they wrote only gu bh-, this was coded as a mutation response, because orthographic bh typically indicates mutation. We therefore believe the anomalous result for this condition reflects the rarity of such items.
In total, the results of Experiment 2 show that native Gaelic listeners tend not to consider mutated forms very often during the process of spoken word recognition until there is evidence in the surrounding context to indicate mutation. Mutated forms seem to be less activated than forms that are not caused by mutation. However, when there is evidence that the stimulus contains a mutated consonant, listeners are indeed able to recognize appropriate words. One thing this open-response whole-word task cannot address, however, is how well Gaelic listeners perceive acoustic cues in the target consonant. It is possible that there are acoustic differences between an [f] that results from mutation and an [f] that is underlying, in which case these acoustic differences might provide perceptual cues. Furthermore, the open-response gating task provides a window into what words are activated as candidates for word recognition at a given time, but a different task may provide different insights into how Gaelic listeners perceive the sounds needed to recognize these words. Therefore, we conducted a phonetic identification task.

Introduction
A phonetic identification task, where listeners hear a stimulus such as [f] and are asked whether it best matches the word pòg, phòg, or foghlam, can provide insight into Gaelic listeners' uptake of acoustic cues from the signal for words with or without mutated consonants. This emphasis on perception of sounds rather than words complements the information on what lexical candidates are being considered from Experiment 2. For Experiment 3, we chose additional item sets of the same type as were used in Experiment 2, except that the unambiguous particle conditions were omitted. (If listeners were to hear [ku v], it would be pointless to ask them whether this is a better match to gu Bhatarsaigh [ku vahtɛrsəɪ] or do mhàthair [to vaːhərʲ] 'your mother,' even though the target consonant of both is [v], since they would also hear the preceding particle.) Listeners heard stimuli which were produced in the same way as the stimuli for Experiment 2, cut off during the first, second, or third segment of the word. However, they were asked to press a button to say which of the three stimulus words in a matched set they felt the stimulus was a good match to, rather than being asked to think of a word the stimulus could come from. Thus, this is a three-alternative forced choice phonetic identification task.

Materials
The current experimental design is a 3 (Consonant type) by 2 (Context) by 3 (Gate) design, with the same factors and levels as in Experiment 2, except for the omission of the unambiguous particle context. We selected sets of matched items for each of the Context conditions (6 sets for no-particle and 5 sets for ambiguous-particle). As for Experiment 2, they were matched on the target consonant and as closely as possible on the following vowel. For the ambiguous-particle condition, because of difficulties finding enough matched sets of words that were acceptable with this particle (which is not the most basic word for 'his/her'), the items from Experiment 2 had to be included along with two additional item sets. Table 7 provides examples of the stimulus items, and the complete list appears in Appendix C. It was not possible to match the items in the matched underlying condition more closely because of limitations of what pairs were available in the lexicon.
Fisher was recorded reading the items in the same way (and in the same recording session) as the items for Experiment 2. The gated stimuli were created in the same way. One item, tha [ha] 'be' (irr. pr.), had only two segments, so only two gates were used for it. Four similar practice items were created from other words.

Participants
The participants were the same as for Experiment 2, except that one participant did not participate in Experiment 3 because of time constraints. The two participants whose data were excluded from Experiment 2 because of apparent difficulties with either the task or hearing were also excluded here. This leaves 22 participants for Experiment 3.

Procedures
Experiment 3 was conducted in the same locations, during the participants' same visit, as Experiment 2. Experiment 3 was conducted after Experiment 2 in order to avoid participation in Experiment 3 influencing lexical activation during Experiment 2. For Experiment 3, the 98 stimuli were randomized, using a different random order for each participant, with the four practice items preceding the rest of the experiment. Three response alternatives were displayed on the laptop screen for each stimulus, written in Gaelic orthography. These were the three words of the relevant matched set (e.g., pioc, phioc, feòrag). The response alternatives were always displayed with the unmutated category on the left of the screen, the mutated category in the middle, and the matched underlying category on the right, in order to reduce error and make the task easier. Participants were instructed to press a button on a button box indicating which word the stimulus matched best with, or could be the beginning of. There was a six-second time-out, so that if a participant did not respond to an item at all the experiment advanced to the next item. However, this occurred for less than 1% of trials, which were treated as missing data. The EPrime software was used to present stimuli and record responses. The entire experiment took approximately five minutes. Figure 6 shows the proportion of responses of the mutated response option for each condition (e.g., the proportion of participants choosing the phioc-type response, regardless of stimulus type).
Table 7: Example words from which phonetic identification stimuli were made. The target consonant is the first consonant of the non-particle word.
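The stimulus count and the presentation order described above can be checked with a short sketch. It assumes that every matched set contributes three words (one per Consonant type) at three gates, except for the single two-segment item tha, which has no third gate; under that assumption the design yields the 98 test stimuli reported. The seeding by participant ID, the tuple layout, and the choice of which set holds the two-segment item are hypothetical and serve only as illustration; the experiment itself was run in E-Prime.

```python
import random

# Counts taken from the Materials section.
SETS_PER_CONTEXT = {"no_particle": 6, "ambiguous_particle": 5}
CONSONANT_TYPES = ("unmutated", "mutated", "matched_underlying")
GATES = (1, 2, 3)
N_PRACTICE = 4


def build_stimuli():
    """Return one (context, set index, consonant type, gate) tuple per test stimulus."""
    stimuli = [
        (context, set_index, cons, gate)
        for context, n_sets in SETS_PER_CONTEXT.items()
        for set_index in range(n_sets)
        for cons in CONSONANT_TYPES
        for gate in GATES
    ]
    # One item (tha) has only two segments, so it has no gate 3.
    # Which set and condition it belongs to is assumed here for illustration.
    stimuli.remove(("no_particle", 0, "matched_underlying", 3))
    return stimuli


def presentation_order(participant_id, stimuli):
    """Practice items first, then the test stimuli in a participant-specific random order."""
    practice = [("practice", i) for i in range(N_PRACTICE)]
    rng = random.Random(participant_id)  # hypothetical: seed by participant ID
    shuffled = list(stimuli)
    rng.shuffle(shuffled)
    return practice + shuffled


stimuli = build_stimuli()
order = presentation_order(1, stimuli)
print(len(stimuli))  # 11 sets x 3 words x 3 gates - 1 missing gate
print(len(order))    # test stimuli plus the four practice items
```

Seeding a separate generator per participant is one simple way to obtain a different but reproducible order for each listener.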

Results
The near-absence of mutation responses to unmutated stimuli again confirms that Gaelic listeners can distinguish [pʰ] from [f] (and likewise for the other consonant pairs), as in Experiment 2. As in Experiment 2, no statistical analysis will be conducted with the unmutated stimulus category, as the lack of mutated responses to it is so nearly categorical. Therefore, statistical analyses are conducted with the factors Consonant type (mutated, matched underlying), Context (no particle, ambiguous particle), and Gate. For the same reasons as with the data in Experiment 2, the same statistical approach of by-participants ANOVA in conjunction with generalized linear mixed models will be used with this data. The problem of the proportion of mutated responses being non-continuous is slightly less severe in the current data, because the proportion is calculated over 5-6 stimuli rather than 3.
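The point about non-continuous proportions can be made concrete with a small worked example. The sketch below lists every value that a by-participant proportion can take when it is computed over a given number of stimuli; the function name is hypothetical, and the sketch ignores the rare timed-out trials, which would reduce the denominator for an individual cell.

```python
from fractions import Fraction


def possible_proportions(n_items):
    """All values a proportion of mutated responses can take over n_items trials."""
    return [Fraction(k, n_items) for k in range(n_items + 1)]


for n in (3, 5, 6):
    values = possible_proportions(n)
    print(n, len(values), [str(v) for v in values])
```

With 3 stimuli per cell the dependent variable can take only 4 distinct values (0, 1/3, 2/3, 1), whereas 5 or 6 stimuli allow 6 or 7 values, which is still discrete but somewhat closer to the continuous scale that ANOVA assumes.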
The overall by-participants ANOVA showed significant main effects of all three factors, as well as several interactions (Consonant type : F(1,21)   p < .001). There were significant simple effects of Gate for each Consonant type and Context separately, with the proportion of mutated responses increasing at later gates for mutated stimuli and decreasing for underlying stimuli (mutated, no particle: F(2,42) = 34.78; mutated, ambiguous particle: F(2,42) = 16.48; underlying, no particle: F(2,42) = 59.43; underlying, ambiguous particle: F(2,42) = 50.25; all ps < .001). The effects of Gate indicate that listeners become more sure of whether the word-initial consonant comes from a mutated or underlying source as they hear more segments after the consonant, which narrow down the possible set of lexical items.
To determine whether listeners already perceive a difference between mutated and underlying source consonants from acoustic cues in the consonant itself, when little or no lexical information beyond the consonant is available, we performed planned comparisons of the two consonant types at the first gate only. For the no particle items, listeners chose the mutated response equally often regardless of whether the stimulus contained a mutated or underlying consonant (F < 1). For the ambiguous particle items, listeners were slightly more likely to choose the mutated response for stimuli that actually were made from words with mutation (F(1,21) = 5.27, p < .04).
For the generalized linear mixed models analysis of Experiment 3, as in Experiment 2, a model was fit to the dataset without the unmutated stimuli. Mutated response was used as the categorical dependent variable. Model selection was performed as described for Experiment 2. The model that was chosen had the fixed factors Consonant type (mutated as reference level), Context (no-particle as reference level), and Gate (gate 2 as reference level), allowing all interactions of the fixed factors, with random intercepts for participants and items and random slopes for all within-participant and within-item factors: LenitResp ~ Cons * Context * Gate + (1 + Cons + Context + gate2 | Participant) + (1 + Gate + LenStatus | item_set) in R. A simpler model with no random slopes had lower AIC, but since the AIC difference was less than 1, the model with random slopes was chosen, as motivated by the within-participants design (Barr et al., 2013). This model and the others it was compared to received "failure to converge" warnings, but it was used even though it may not be the optimal solution. The results for the selected model are as follows:
As for Experiment 2, models with the same random effects structure as far as possible were fit to subsets of the data defined by Consonant type and Context in order to explore the 3-way interactions. For these models, the only fixed effect was Gate (gate 2 as reference level), and the random effects structure used random by-participant and by-items slopes for Gate as well as random intercepts. These analyses confirmed that either gate 1 or gate 3 differed significantly from gate 2 for each Consonant type by Context condition. This mirrors the results obtained with by-participants ANOVA.
Figure 7 shows the proportion of trials on which the unmutated option (e.g., pioc [pʰʲoxk] 'pick') was chosen. Examining this dependent variable confirms that listeners are able to correctly perceive the distinction between unmutated (e.g., [pʰ]) and mutated or matched underlying (e.g., [f]) consonants, and are able to correctly identify which words contain these consonants. The remaining proportion of responses not included in Figures 6 and 7 were choices of the matched underlying response option.
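The mixed models reported above use gate 2 as the reference level for Gate, which is why the by-condition results are stated as gate 1 or gate 3 differing from gate 2. The sketch below illustrates this under the assumption that the models used treatment (dummy) coding, R's default for unordered factors; the text does not state the contrast scheme outright, and the function and column names here are hypothetical.

```python
GATE_LEVELS = (1, 2, 3)
REFERENCE_GATE = 2


def treatment_code(gate, reference=REFERENCE_GATE, levels=GATE_LEVELS):
    """Return the indicator variables for one trial under treatment coding.

    Each non-reference level gets its own 0/1 indicator; the reference
    level is the row in which every indicator is 0.
    """
    if gate not in levels:
        raise ValueError(f"unknown gate: {gate}")
    return {
        f"gate{level}_vs_gate{reference}": int(gate == level)
        for level in levels
        if level != reference
    }


for gate in GATE_LEVELS:
    print(gate, treatment_code(gate))
```

Under this coding, the coefficient attached to each indicator estimates the difference in log-odds of a mutated response between that gate and gate 2, so the model directly tests the two adjacent steps (gate 1 to gate 2, and gate 2 to gate 3) rather than gate 1 against gate 3.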

Discussion
The results of the phonetic identification task in Experiment 3 show that listeners are able to acoustically distinguish the mutated and matched underlying consonants (e.g., [f] as orthographic ph or f) from the unmutated consonant of the same pair (e.g., [pʰ], orthographic p). In this task, listeners are solely asked to identify the sounds, although we did use real words as response alternatives. They do not have to think of responses from the lexicon themselves; they only have to identify which word matches the sounds of the stimulus. This verifies that the widespread alternations in word-initial consonants caused by the mutation pattern do not hinder listeners' uptake of acoustic information from the speech signal.
More importantly, there is little or no difference in how often listeners choose the mutated vs. the matched underlying consonants (e.g., phioc [fʲoxk] vs. feòrag [fʲoːrak] 'squirrel') at gate 1, which allows the listener to hear only up to the end of the target consonant itself. This shows that based on just acoustic cues during the consonant, listeners are at or near chance in distinguishing these two word types. Although there is a significant difference for the items with a preceding particle, it is small, and the item sets are not perfectly matched, so differences in the following vowel could cause coarticulatory differences in the target consonant that could lead to this effect. For the words with no preceding particle, listeners respond in the same way to both consonant types. However, at later gates, as the segmental information after the target consonant becomes available, listeners become more accurate at identifying which stimuli form part of which word. This is particularly true by gate 3, where the third segment differs for most of the item sets. Thus, acoustic information within the target consonant itself does not provide sufficient cues to disambiguate whether a consonant has come about through mutation vs. being underlying. There may be small acoustic differences between underlying vs. mutation consonants (e.g., [f] from the two sources), but they are not sufficient for clear perception.
If there are such acoustic differences, this would be a case of incomplete neutralization (cf. Warner et al., 2004); however, we have no clear evidence here that the neutralization is not complete. Listeners must decide whether a word contains mutation or not based on other cues outside the consonant itself, such as a preceding particle or the lexical identity based on later segments of the word. The fact that listeners make this distinction accurately at gate 3 shows that later segments of the word do disambiguate the source of the consonant.

General Discussion
The results of Experiment 1 showed that mutated past tense verbs prime their unmutated imperative counterparts, and this priming effect is independent of phonological overlap between mutated and unmutated verbs. Recall that in the phonological priming condition, targets trended toward faster response times, but this trend did not reach significance compared with the control condition. In contrast, targets in the morphological priming condition were recognized significantly faster than in the control condition.
We interpret this to mean that the mental representations of mutated and unmutated forms are connected at the morphological level, independently of their form-based relationship. A plausible further interpretation of these results is that while all words may be lexically listed as word units, mutated forms and their unmutated counterparts share the same lexical entry. This explains why listeners are faster to recognize an unmutated form when it has been primed by its mutated counterpart: The processing of the prime activates its lexical entry, which is the same as that for the unmutated target. What's more, these results are consistent with morphological priming effects found for well-studied languages like English. For instance, in English, Emmorey (1989) reported facilitated recognition of target words sharing a bound morpheme with their prime in a lexical decision task with audible primes (e.g., prime = receive, target = deceive). In less well-studied languages (e.g., Ussishkin et al., 2015 for Maltese and Schluter, 2013 for Moroccan Arabic), similar results were found, all of which implicate a lexical relationship between morphologically related forms in auditory word recognition, even when the realization of the morphemes at issue is typologically unusual, as in initial consonant mutation.
In Scottish Gaelic, if both forms are listed in the lexicon, then this suggests that both should be available as potential competitor words during the process of spoken word recognition. (If both forms are listed in the lexicon, we assume that they are linked in the lexicon in such a way that speakers and listeners are aware of the mutated and unmutated form of a word being forms of the same word. The conclusions here do not depend on whether the forms have separate lexical listings or not.) Experiment 2 showed that in fact, mutated forms are available as candidate words for recognition, but listeners disfavor them, using them less than 20% of the time until there is clear evidence that the word being heard must be a mutated form.
In Experiment 2, using a lexical open-response task, listeners were unlikely to give words containing mutation unless the signal supplied direct evidence for a mutation form (e.g., subsequent segments, or a preceding particle that required mutation). One might then wonder whether there was something stopping the listeners in Experiment 2 from giving words with mutation as responses, such as the fact that mutated forms probably have lower frequency than unmutated ones, or that mutated forms are derived and therefore more complex than unmutated roots. Experiment 3 verified that there is at least nothing in the acoustics stopping them from doing so. When the mutated forms are presented as one of the three response alternatives, listeners readily choose that response. Thus, it must be a difference between the open-response task, which requires accessing the lexicon, and the three-alternative forced choice phonetic identification task that explains this.
The direction of bias in responses is an especially noteworthy difference between Experiments 2 and 3. In Experiment 2, listeners were so unlikely to give mutation responses unless there was specific evidence for mutation that there was no significant effect of Gate on the proportion of mutation responses for stimuli in two of the Underlying conditions (no particle and ambiguous particle conditions). In these conditions, at gate 1 listeners already gave very few mutated responses, and later gates simply supplied more evidence for them to maintain that behavior. Experiment 2 showed an overall bias in the mutated and underlying conditions toward underlying responses, away from mutation responses. However, in Experiment 3, the direction of bias was reversed: At gate 1, the average proportion of mutated responses to mutated or underlying stimuli across preceding particle conditions was 62%. The corresponding percentage in Experiment 2 was 19%. Thus, in the phonetic identification task, if there is no acoustic or lexical information that would disambiguate the choice, Gaelic listeners are biased toward the mutated consonant, while in the open-response whole word task, they show the opposite bias.
We believe the direction of bias in Experiment 2, against mutated forms, shows that during the process of spoken word recognition, listeners do not assume that morphophonology has applied to the speech sounds they are hearing until information in the signal makes it necessary to assume that. This is somewhat similar to Grosjean's (1980) finding (also mirrored in Warner's [1998] results) that listeners do not respond with longer, suffixed forms in an open-response gating task until they hear evidence for them. For example, Grosjean found that listeners responded uniformly with stretch to stimuli made by cutting off the word stretcher until the stimulus actually included the second syllable vowel, when they switched their responses to stretcher. This was true regardless of whether the longer material was a derivational or inflectional suffix or part of the root, so listeners also gave cap instead of captain and parse instead of parsley until acoustic information for the longer form became available. Listeners also did not respond with for example gulps or gulping instead of gulp, even though the suffixed forms would also be consistent with gated stimuli made from gulp. These results suggest that at least when listeners are asked what word they might be hearing, they do not consider all the longer forms such as those with suffixes that could be consistent with what they have heard so far. This is similar to our Gaelic results in that listeners respond with unmutated forms, but it differs in that the Gaelic mutation case involves substitution of a segment rather than a choice between a longer and shorter segment.
Our results show that during the process of spoken word recognition, listeners stick closer to the surface form until other sounds lead to an interpretation that the surface form results from the morphophonological alternation of mutation. However, in the phonetic identification task of Experiment 3, since all three response options appear on the screen and are already activated, listeners are not hindered from choosing the mutation response. As for why listeners choose it more than 50% of the time in Experiment 3, there may be a simple explanation: Two of the three response alternatives are related words (e.g., pioc and phioc), which may facilitate choosing phioc over feòrag, and additionally the mutation response was displayed in the center of the screen.
Regardless of why listeners are biased toward mutated forms in the acoustic-level task (Experiment 3), it is clear that they are at least not hindered from choosing the mutation response. Thus, their tendency not to use mutated forms as responses in Experiment 2 must reflect spoken word recognition processes. It may be that potential mutated lexical candidates have lower word frequency than matched underlying candidate words do. Or it may be that the spoken word recognition process involves first accessing words that are similar to their underlying representations, that have minimal morphophonological derivation. Listeners do not often consider words that are created by morphophonological processes until there is evidence that forces this interpretation. Since there are some mutation responses even at the first gate with no mutation information from a previous particle, it is clear that listeners are able to access mutated forms as well. It seems that listeners' initial strategy during spoken word recognition is to assume that words have not been affected by morphophonology.
Gaelic word-initial mutation makes it possible to test this at word-onset, when listeners have not yet heard any other sounds of the word and therefore have to decide without any context whether to interpret the sounds they hear as the result of morphophonological processes or not. In terms of a model of Spoken Word Recognition such as Shortlist-B (Norris & McQueen, 2008) or other models, as Gaelic listeners begin to hear a word onset that is actually a mutated word, they activate words where the underlying form matches the speech more strongly (e.g., foghlam [fʊɫəm] 'education' when hearing /f/), but they do activate words that only match in their mutated forms to some degree as well (e.g., phòg [fɔːk] 'kissed'). As more acoustic information becomes available and forms where the underlying form matches (without mutation) cease to match the speech input, items like foghlam drop out of the candidate set, and the more weakly activated mutation forms are the only remaining candidates, so they become more strongly activated. Because mutation occurs word-initially, it has an especially strong influence on what the possible candidate words for recognition are. The typologically rare morphology of Scottish Gaelic allowed us to gain a better understanding of the organization of the lexicon. Complex words are stored holistically and related to their simple forms.
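The candidate-set account in the preceding paragraph can be illustrated with a toy sketch. This is not an implementation of Shortlist-B or of any published model: the three-word lexicon, the segment-by-segment matching, and the two activation weights are all assumptions made for illustration. In particular, the weight for mutated forms is a hypothetical value chosen so that a mutated candidate starts out with a small share of activation, loosely echoing the low rate of mutated responses at gate 1.

```python
UNDERLYING_WEIGHT = 1.0  # candidate matches the input in its underlying form
MUTATED_WEIGHT = 0.25    # hypothetical: candidate matches only in its mutated form

# (orthographic form, surface segments, is the initial consonant a product of mutation?)
LEXICON = [
    ("foghlam", ("f", "ʊ", "ɫ", "ə", "m"), False),
    ("pòg", ("pʰ", "ɔː", "k"), False),
    ("phòg", ("f", "ɔː", "k"), True),
]


def candidate_activations(heard):
    """Return each matching candidate's share of activation for the input so far."""
    heard = tuple(heard)
    raw = {}
    for word, segments, is_mutated in LEXICON:
        # A candidate stays in the set only while its onset matches the input.
        if segments[: len(heard)] == heard:
            raw[word] = MUTATED_WEIGHT if is_mutated else UNDERLYING_WEIGHT
    total = sum(raw.values())
    return {word: weight / total for word, weight in raw.items()}


print(candidate_activations(["f"]))        # gate 1: only the consonant is heard
print(candidate_activations(["f", "ɔː"]))  # later gate: the vowel is also heard
```

With only [f] available, foghlam holds most of the activation and phòg a small share; once the vowel [ɔː] is heard, foghlam no longer matches and phòg is the only remaining candidate, so its share rises to 1.0.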

Additional Files
The additional files for this article can be found as follows: • Appendix A. Real word primes and targets used in Experiment 1 (lexical decision with auditory masked priming