Morphophonological gradience in Korean n-insertion

This study addresses the questions of what factors may have a gradient effect on the application of a morphophonological process, how they interact, and which of the gradient effects speakers are aware of, by investigating the variation patterns of Korean n-insertion. An analysis is performed on the results of two surveys of speakers of two dialects of Korean, Seoul and Kyungsang, one using existing words and the other using novel Korean words. From the results of the survey on n-insertion in existing Korean words, I have found several gradient tendencies involving a variety of factors, including phonological and morphological ones, and their interactions. Such factors include morphological category, etymology and length of component morphemes, sonorancy and place of articulation of the consonant preceding the insertion site, height of the initial vowel of the morpheme following the insertion site, and the dialect of the speakers. Contrary to previous studies on Korean n-insertion, none of these factors is an absolute condition for the occurrence of n-insertion; instead, they have gradient effects, contributing to the overall probability of n-insertion. Consequently, Korean n-insertion provides a clear case where previous categorical proposals do not match the gradient reality, lending support to quantitative theories of morphophonology.


Introduction
A morphophonological process is typically conditioned by phonological and morphological factors. Such factors categorically distinguish words that can undergo the process from those that cannot. Some other factors, mainly non-linguistic ones such as word frequency, may have a gradient effect. They affect the process only probabilistically within the set of potential target words. However, recent research (Zuraw 2000, 2002, 2010; Hayes et al. 2009; Coetzee & Pater 2011; Hayes & White 2013; McPherson & Hayes 2016; Jurgec 2016; Moore-Cantwell & Pater 2016; Zuraw & Hayes 2017; and Zymet 2018) shows that phonological and morphological factors may have a gradient effect, contributing to the overall probability of the occurrence of a morphophonological process. The present study addresses the questions of what factors may have such a gradient effect in application of a morphophonological process, how they interact, and which of the gradient effects speakers are aware of, by investigating the variation patterns of Korean n-insertion.
For this purpose, I investigated the results of two surveys on speakers of two dialects of Korean, Seoul and Kyungsang, one using existing Korean words and the other using novel Korean words. The results of the survey with existing words show that a variety of factors, including phonological and morphological ones, interact to give rise to several interesting gradient tendencies involving n-insertion. This is consistent with the recent research on gradient morphophonology, cited above, but contrary to the traditional literature on Korean n-insertion, which argues that most of these factors are absolute conditions for the occurrence of n-insertion.
In order to establish whether Korean speakers are aware of the tendencies observed in existing words, I explored the results of a novel word survey. Several tendencies observed in existing words were mirrored in the results of a novel word survey. Thus, Korean speakers are aware of the differential influence of the relevant factors on the probability of the application of n-insertion. There were, on the other hand, some apparent mismatches between the results of the surveys on existing and novel words, which I attribute to the lack of phonological naturalness involved.
The present work seeks to complement that of Jun (2015) by further exploring what factors affect the distribution of n-insertion in Korean and how they interact. Although this study can be considered an extension of Jun's study on Seoul Korean n-insertion, it makes a distinct contribution by using a larger set of data and providing a more elaborate data analysis. Specifically, the database of the present study consists of the results of two surveys on Korean n-insertion, one on Seoul Korean speakers, conducted by Jun, and the other on Kyungsang Korean speakers, conducted in the present study. I will analyze the data using an expanded set of factors, including the dialect of the participants and the morphological categories of component morphemes, which were not considered in Jun's work. Thus, the present study provides a more comprehensive experimental investigation of Korean n-insertion, addressing a wider set of empirical and theoretical issues. A variety of factors that were proposed to have a categorical effect in previous studies on Korean n-insertion turned out to have only a gradient effect in this study. In addition, the present analysis of Korean n-insertion can lead to a better understanding of whether and how speakers learn gradient lexical patterns.
The organization of this paper is as follows. Section 2 provides background information on Korean n-insertion, beginning with a description of the basic patterns of Korean n-insertion. It then discusses conditioning factors for the occurrence of n-insertion which previous studies argue to have categorical effects. Sections 3-4 describe surveys on Seoul and Kyungsang Korean speakers. Section 3 concerns the survey on existing Korean words whereas section 4 the survey on novel Korean words. Effects of conditioning factors will be examined both separately and in combination. Section 5 provides discussions of how to analyze Korean n-insertion, the extension of existing word patterns to novel words, dialectal variation and remaining problems. The final section concludes the present study.
The basic conditions for application of n-insertion, illustrated above with (1), are not sufficient to determine when n-insertion applies and when it does not. There are a number of words which meet the basic conditions but do not undergo n-insertion. Many additional conditioning factors have been proposed in the previous literature for the purpose of properly restricting the domain of application of n-insertion. The crucial factors include morphological category of component morphemes, syllabicity of M 2 -initial vocoid, and sonorancy of C 1 . I will now provide a brief review of relevant previous studies.

Morphology
It has been reported in the literature that n-insertion does or does not apply depending on the type of morpheme involved. Morphemes preceding and following the insertion site, called here M 1 and M 2 , have been argued to be subject to different restrictions on the occurrence of n-insertion. I will first discuss previous proposals on the restrictions on M 1 and then those on M 2 .

Morphology of M 1
n-insertion is more pervasive and widespread in Kyungsang Korean than in Seoul Korean. The main difference lies in the type of triggering M 1 morpheme. According to Han (1994: 114-130), as summarized in (3), n-insertion in Seoul Korean occurs when M 1 is a free stem (3a, b) or a prefix (3c), but does not occur when M 1 is a bound root (3d, e) which is mostly
(3) Category of M 1 vs. composition (O: n-insertion, X: non-n-insertion, based on Han 1994)
In contrast, n-insertion in Kyungsang Korean occurs even after a root (3d, e), as well as after a stem and a prefix (3a-c). Thus, the morphological domain of n-insertion is narrower for Seoul Korean than for Kyungsang Korean.
To summarize, the dialectal difference with respect to morphological category of M 1 suggests that a root M 1 is a weaker trigger of n-insertion than a non-root, namely, stem or prefix, M 1 .

Morphology of M 2
It has been mentioned in many previous studies on Seoul Korean n-insertion that n-insertion occurs only when M 2 is a stem (or root) which can be an independent word (Huh 1984; Han 1993, 1994; Ko 1992; Kim et al. 2002; Hong 2006; Cho 2016; and others). Some of those previous studies (for instance, Ko 1992: 37) consider this free M 2 requirement to hold only for native Korean words. This etymological restriction makes sense given that n-insertion may occur before Sino-Korean /j/-initial suffixes, as illustrated in (3b) where /-ju/ 'oil' is a Sino-Korean suffix. To summarize, the previous studies on the role of morphological category of M 2 in Korean n-insertion suggest that n-insertion occurs only before a free morpheme, at least in native Korean words.

Syllabicity of M 2 -initial vocoid
It is widely assumed in the traditional literature on Korean phonology and morphology that n-insertion occurs not only before a high front vowel /i/ but also before a palatal glide /j/. However, some previous studies have argued either that the vowel /i/ is not a trigger of n-insertion, or that /i/ is less likely to trigger n-insertion than the glide /j/. Lee & Lee (2006: 422-4), Hong (2006: 397) and Ahn (2009: 263) deny the existence of synchronic pre-/i/ insertion in Korean, thus claiming that /j/ is the sole trigger of n-insertion. Lee (1996: 168), Bae (2003: 241), and Oh (2006: 125-9) acknowledge the existence of pre-/i/ insertion, but describe it as less frequent, or less naturally occurring, than pre-/j/ insertion. To summarize, /j/ is a stronger trigger of n-insertion than /i/, although it is controversial whether the difference in strength is categorical or gradient.

Sonorancy of C 1
Some previous studies have argued that the sonorancy of C 1 determines the likelihood that n-insertion will apply. According to Cho (1995: 610-11) and Cho & Iverson (1997: 702), Korean n-insertion applies obligatorily after a sonorant C 1 and optionally after an obstruent C 1 .
To summarize, the previous studies on the role of sonorancy of C 1 in the application of Korean n-insertion suggest that a sonorant C 1 is a stronger trigger of n-insertion than an obstruent C 1 , although the exact strength and domain of the sonorancy asymmetry still need to be explored.

Exceptions and variation
As discussed above, traditional analyses of Korean n-insertion have attempted to explain the variation in n-insertion by restricting the domain of rule application in terms of the morphological category of M 1 and M 2 , syllabicity of M 2 -initial vocoids, and sonorancy of C 1 , which may interact with additional factors such as dialect, etymology and/or length of the morphemes involved, as summarized in (4). They mostly propose that such factors have categorical effects on variation, while describing the relevant patterns based mainly on the authors' intuitions as native speakers of Korean.
However, some exceptions to each of those factors have already been pointed out in the previous studies based on the authors' intuition, and an even wider range of exceptions was reported in experimental and survey studies on Korean n-insertion (Kim 2000; Choi 2002; Kim 2003). For instance, the free M 2 requirement (i.e. n-insertion applies to words with a free M 2 ) is neither a sufficient nor a necessary condition for the occurrence of n-insertion. Not all /i, j/-initial M 2 stems trigger n-insertion: e.g. /mas-is'-ta/ *[mannitt'a], [masitt'a] 'delicious (taste-exist-se)'. On the other hand, certain (/j/-initial) native Korean M 2 suffixes may trigger n-insertion: e.g. /camsi-man-jo/ [camsimannjo] 'Wait a moment (moment-only-se)' (Bae 2003; Oh 2006). It seems that none of the conditioning factors proposed in the previous studies completely determines the occurrence or absence of n-insertion. In addition, as mentioned in a number of previous studies on Korean n-insertion (Kim-Renaud 1974/1991; Ko 1992; and many others), n-insertion is often optional. The probability of n-insertion may vary greatly across speakers and words, as can be seen in the results of previous experimental/survey studies on Korean n-insertion (Choi 2002; Kim 2003; and Jun 2015), and as is also evident from the data of this study presented in sections 3-4. The widespread exceptions and variation involved make Korean n-insertion look quite irregular. Nonetheless, it is still possible that the factors proposed in the previous literature have a gradient effect, contributing to the overall probability of n-insertion. In fact, such possibilities have been mentioned for some of the factors in the previous studies: for instance, a higher frequency or preference of n-insertion before /j/ than before /i/ (Lee 1996; Bae 2003; and Oh 2006) and a greater likelihood of n-insertion after a sonorant C 1 than after an obstruent C 1 (Han 1994).
Unfortunately, the few claimed tendencies were mostly based on the authors' intuition, rather than the results of wide-scale experimental or survey studies.
In the next section, I will provide a comprehensive and systematic investigation of n-insertion in existing Korean words to establish what factors affect n-insertion and whether they have categorical or gradient effects. For this purpose, I consider the following expanded set of factors:
(5) Potential factors affecting Korean n-insertion
a. syllabicity of M 2 -initial vocoid
b. dialect
c. morphological category of component morphemes
d. etymology of component morphemes
e. type of C 1
f. length of component morphemes
g. height of a vowel following M 2 -initial /j/, called here V 2
h. word frequency
The type of C 1 , the length of component morphemes, and the height of V 2 are added to the list of factors, mainly because they have been shown to be active in the application of Seoul Korean n-insertion by some previous studies (Hwang 2008, Jun 2015). Note that the length of a word or morpheme has been identified as a factor affecting the rate of a phonological process such as Turkish final devoicing (Becker et al. 2011). It has, in fact, been shown in Jun (2015) that n-insertion in Seoul Korean is less likely with words with monosyllabic M 1 than those with longer M 1 . As discussed in section 2.2, several previous studies on Korean n-insertion have argued for the effect of C 1 sonorancy. However, the results of Hwang's (2008) experimental study and Jun's (2015) survey on Seoul Korean speakers show that this C 1 sonorancy effect differs depending on the place of articulation of C 1 sonorants. Specifically, n-insertion is less likely after a velar nasal C 1 than after other sonorants. Thus, the present investigation will involve a comparison not only between words with obstruent vs. sonorant (except /ŋ/) C 1 , but also between those with the velar nasal vs. other sonorant C 1 . Jun also reports that the height of V 2 plays a significant role in determining the rate of Seoul Korean n-insertion, and, specifically, that insertion is more likely before a high vowel than before a non-high vowel. Finally, word frequency is added to the list in (5) as it has often been shown to affect the application rate of an optional process (for instance, English final t/d deletion as discussed by Coetzee and Pater 2011).

Data and experimental procedure
In this section, I explore n-insertion in existing Korean words by investigating the results of the surveys on native speakers of two dialects of Korean, Seoul and Kyungsang. For Seoul Korean data, I use the results of the survey conducted by Jun (2015). In that survey, twenty-two Seoul Korean speakers participated. For Kyungsang Korean data, I performed a survey on Kyungsang Korean speakers, using the same method and the same word set as adopted by Jun.
Twenty-three paid Kyungsang Korean speakers participated in the survey (mean age = 25.9 years, eleven females and twelve males). Most of the participants (n = 20) were raised in Daegu, and the rest (n = 3) in other parts of Northern Kyungsang province. Sixteen participants were recruited from the community at Seoul National University through public advertising, and they took the survey in a quiet room (for about 40-50 minutes). The rest of the participants (n = 7), who were recruited through personal referrals, completed and returned the survey via email.
304 multi-morphemic Korean words with /j/-initial M 2 were employed as test words. These formed an exhaustive set of words with /j/-initial M 2 that Jun classified as multi-morphemic in a pilot investigation of his dictionary database. 4 No test words included /i/-initial M 2 , mainly because it was clear from the results of Jun's investigation of the data drawn from a dictionary and from previous surveys (Choi 2002, Kim 2003, Kook et al. 2005) that pre-/i/ insertion is attested, but less frequent in existing Korean words. Most of the test words were nouns (n = 286), and the exceptions included thirteen verbs and five adverbs.
A single survey form was used, and thus the tasks were administered in the same order for all participants. In the survey form, each test word was presented in standard Korean orthography along with its inserted and non-inserted forms. 5 For each test word, the participants were instructed to choose their pronunciation from the following three (or two) options: 6
(6) Options given in the survey form
e.g.                               (i) com-jak   (ii) sotok-jak   (iii) tʰaŋ-jak
                                   'mothball'    'antiseptic'     'herbal decoction'
a. insertion                       com.njak      so.toŋ.njak      tʰaŋ.njak
b. non-insertion (resyllabified)   co.mjak       so.to.kjak
c. non-insertion (aligned)         com.jak       sotok.jak        tʰaŋ.jak
4 Jun's database consists of words which occurred at least once in the Sejong corpus and were also listed as standard Korean words in the SKD.

5 For the test words (n = 114) which might be potentially ambiguous or unclear in meaning, their dictionary meanings were also presented in the survey form.

6 In Korean orthography, the phonemic characters are grouped into blocks, each of which corresponds to a syllable. Both syllable divisions and constituency such as onset and coda can be seen in the written words.
Jun Glossa: a journal of general linguistics DOI: 10.5334/gjgl.1401
When C 1 was not /ŋ/ as in (6i-ii), two non-inserted forms (6b,c) were given in addition to an inserted one (6a). The resyllabified form (6b) is known as the standard surface form in Korean, where an intervocalic consonant occupies an onset position. However, it has been argued in the literature (Lee 1992, Park 2001) that the aligned form (6c) is also possible when a morpheme boundary intervenes between a consonant and a following vocoid. When C 1 was /ŋ/ as in (6iii), only a single non-inserted option (6c) was given since there is no way to represent the resyllabified option (6b) in the standard Korean orthography, and it is generally assumed in Korean phonology that [ŋ] cannot be syllabified (at least exclusively) in onset position. When C 1 was underlyingly an obstruent as in (6ii), the option (6a) with insertion included nasalization of C 1 since obstruent nasalization is obligatory in Korean, as stated in section 2. The participants were allowed to choose more than one option, each of which was counted as one data point in the analysis. Also, when their pronunciation was not given as an option, they were asked to write it in the blank space on the survey form. (See Jun (2015) for more details of the methods and the reason for conducting a self-evaluation survey using written forms.)
In order to construct a database for the analysis of n-insertion in existing Korean words, I first combined the results of Jun's survey on Seoul Korean speakers and the results of the present survey on Kyungsang speakers. The resulting data will be analyzed and discussed using a statistical model and insertion rates (i.e. number of insertion responses divided by total number of responses).
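The insertion-rate measure defined above can be sketched in a few lines of Python. The record layout, the option labels, and the sample responses below are hypothetical and are not taken from the survey data. The only points carried over from the text are that each chosen option counts as one data point and that the rate is the number of insertion responses divided by the total number of responses.

```python
from collections import defaultdict

# Hypothetical response records: (participant, word, chosen option).
# A participant who picks two options for a word contributes two records,
# mirroring the rule that each chosen option counts as one data point.
responses = [
    ("p1", "com-jak", "insertion"),
    ("p1", "com-jak", "resyllabified"),
    ("p2", "com-jak", "insertion"),
    ("p1", "sotok-jak", "aligned"),
    ("p2", "sotok-jak", "insertion"),
]


def insertion_rates(records):
    """Return {word: insertion responses / total responses}."""
    inserted = defaultdict(int)
    total = defaultdict(int)
    for _participant, word, option in records:
        total[word] += 1
        if option == "insertion":
            inserted[word] += 1
    return {word: inserted[word] / total[word] for word in total}


rates = insertion_rates(responses)
```

Grouping by another key (for instance dialect or C 1 type instead of word) would give the per-factor mean rates of the kind reported in the raw data analysis.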
I conducted data screening before the main analysis. Responses to certain sets of test words would cause difficulty in the analysis of the effects of the factors, presented above, on Korean n-insertion. First, I excluded the responses to a test word with a compound marker in C 1 position (/twi-s-jɛki/ 'backbiting (back-compound marker-story)') from the database. The phonological nature of the compound marker is controversial in the literature on Korean phonology and morphology: /s/, /t/, /ʔ/, empty skeletal slot and others. 7 It is thus unclear which category the compound marker belongs to. It should probably not be classified as either obstruent or sonorant, although it is represented with the letter <s> in Korean orthography.
An additional problematic set of test words has the same M 2 -initial suffix sequence /jʌ/, a contracted form of a causative suffix /i/ and a converbal suffix /ʌ/: e.g. /cuk-jʌ-cu-ta/ 'be killing (die-caus.cvb-give-inf)'. 8 This suffix sequence is the only native Korean M 2 -initial suffix included in the test word set. There is almost no variation among the responses to the words containing it: most participants did not choose the insertion option for them. The insertion rates for the ten test words with /jʌ/ are either zero or close to it: 3 words (0%), 4 words (4%), 2 words (6%), 1 word (13%). Thus, it seems that n-insertion is blocked almost categorically before this suffix. In the analysis of the database, I excluded responses to the test words with /jʌ/ so that they would not weaken the effects of the factors adopted for investigation. The resulting data consist of 13,484 responses to 293 test words (285 nouns, five adverbs and three verbs).

Raw data analysis
This section considers only overall patterns of the data by calculating the mean insertion rates for each of the factors adopted in the experiment. In the next section, I provide a statistical analysis of the same data.

Dialect
The mean insertion rate for the entire data is 51.4%, and the rate is, on average, higher for Kyungsang speakers than for Seoul speakers. The observed higher rate of insertion for Kyungsang speakers is consistent with previous studies (section 2.1.1), but it will be shown that the observed difference is not statistically significant.

Morphological category
To establish the effect of M 1 morphology, insertion rates were calculated for different morphological categories of M 1 , as shown in Table 2.
Insertion rate was lowest for words with a bound root M 1 , highest for those with a free stem M 1 , and intermediate for those with a prefix M 1 . This observed ranking by M 1 morphological category, root < prefix < stem, did not differ depending on the dialect of the participants, as shown in Table 3 and Figure 2.
In addition, to establish the effect of M 2 morphology, insertion rates were calculated for different morphological categories of M 2 , as shown in Table 4.
Insertion rate was lowest for words with a bound root M 2 , highest for those with a suffix M 2 , and intermediate for those with a free stem M 2 . As can be seen in Table 5 and Figure 3, the observed ranking by M 2 morphological category, root < suffix < stem, did not differ across dialects.
In summary, n-insertion was more likely with a free stem or an affix than with a bound root, regardless of whether it precedes or follows the insertion site. In addition, the relative frequencies of n-insertion by morphological category did not differ by dialect.

Etymology
To establish the effect of M 1 etymology, insertion rates were calculated for different etymological origins of M 1 morphemes, as shown in Table 6.
Insertion rate was, on average, lower when M 1 is Sino-Korean than when it is native Korean. Although the rate was highest for words with a loanword M 1 , the test word set has only four such words. As can be seen in Table 7 and Figure 4, the observed ranking by M 1 etymology, Sino-Korean < native Korean (< loanword), did not differ by dialect.
In addition, to establish the effect of M 2 etymology, insertion rates were calculated for different etymological origins of M 2 morphemes, as shown in Table 8.
Similarly to the results for M 1 etymology, insertion rate was lower when M 2 is Sino-Korean than native Korean. As can be seen in Table 9 and Figure 5, the observed relative rate difference between words with native and Sino-Korean M 2 did not differ by dialect.
In summary, n-insertion was less likely with a Sino-Korean morpheme than with a native Korean (or loanword) morpheme, regardless of whether it precedes or follows the insertion site. This tendency held for both Seoul and Kyungsang speakers.

C 1 type
To establish how n-insertion varies depending on the type of C 1 , insertion rates were calculated for different C 1 consonants, as shown in Table 10.
Excluding consonants with extremely low frequency (less than or equal to 3), insertion rates are plotted by C 1 consonants in Figure 6.
Insertion rate was lower after an obstruent or velar nasal than after a sonorant other than /ŋ/, as shown in Table 11.

Figure 5
Insertion rate (%) by M 2 etymology and dialect.
The observed lower insertion rates with an obstruent or /ŋ/ C 1 hold for both Seoul and Kyungsang speakers. The relative rate difference between words with an obstruent and velar nasal C 1 was reversed between the participants of the two dialects: obstruent > /ŋ/ for Seoul, but obstruent < /ŋ/ for Kyungsang.
In summary, n-insertion was less likely after an obstruent or velar nasal than after other sonorant consonants, regardless of the dialect of the participants.

Length
To establish how n-insertion varies depending on the length of M 1 , insertion rates were calculated for different numbers of component syllables of M 1 , as shown in Table 13.
Insertion rate was higher when M 1 is disyllabic than when it is monosyllabic. The insertion rate is even higher when M 1 is trisyllabic, suggesting that the longer M 1 is, the more likely n-insertion is. However, since words with trisyllabic or longer M 1 are relatively few in the test set, I take the above rate distribution as indicating the difference between words with mono- and polysyllabic M 1 , collapsing the results for words with polysyllabic M 1 , as shown in Table 14.
The observed higher rate of insertion with polysyllabic M 1 holds for both Seoul and Kyungsang participants.
In addition, to establish how n-insertion varies depending on the length of M 2 , insertion rates were calculated for different numbers of component syllables of M 2 , as shown in Table 16.
Insertion rate does not seem to vary systematically with the length of M 2 . There is almost no difference in average insertion rate between words with mono and disyllabic M 2 . Words with trisyllabic M 2 show somewhat higher rates, but words with trisyllabic M 2 or longer are relatively few in the test set. Thus, as in the analysis of M 1 length effects, I collapsed the results of all words with polysyllabic M 2 into a single category. Its average insertion rate is 51.8%, which is not substantially different from the rate for words with monosyllabic M 2 , i.e. 51.2%, as shown in Table 17.
The rate (non-)difference between words with mono and polysyllabic M 2 did not vary much by dialect, as shown in Table 18 and Figure 9.
In summary, insertion rates differed according to the length of M 1 , not M 2 , regardless of the dialect of the participants. n-insertion was more likely with a polysyllabic M 1 than with a monosyllabic M 1 .

V 2 height
To establish the effect of the quality of a vowel following /j/, called V 2 , insertion rates were calculated for different V 2 vowels, as shown in Table 19 and Figure 10.
Insertion rate was higher when V 2 is /u/, namely, a high vowel (62.7%), than when V 2 is nonhigh (49.9%), as shown in Table 20. 9
The observed higher rate of n-insertion with a high V 2 is true for both Seoul and Kyungsang participants, as can be seen in Table 21 and Figure 11.
9 In Korean, /u/ is the only high vowel which can follow /j/, unlike the other high vowels, /i, ɨ/.
In summary, n-insertion was more likely before a high vowel than before non-high vowels. This observed tendency was true for both Seoul and Kyungsang participants.

Frequency
To establish how n-insertion varies depending on the frequency of words, insertion rates were calculated for different (log-transformed) frequencies, and plotted in Figure 12.
Here we use word frequencies in the Sejong corpus, reported in Kang and Kim (2004). Frequency appears to play no role in determining the rate of n-insertion, as suggested by the almost flat regression line in Figure 12 and by the weak, statistically non-significant correlation between insertion rates and (log-transformed) frequencies (r(291) = -0.040, p = 0.487).
The observed lack of frequency effect did not differ depending on the dialect of the participants, as can be seen in (7).
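As a rough illustration of the correlation reported above, the following Python sketch computes a Pearson coefficient and the t statistic that corresponds to a given r and degrees of freedom. The toy frequency and rate values are hypothetical; only r = -0.040 and df = 291 (293 test words minus 2) come from the text. This is a sketch of the standard textbook formulas, not a reconstruction of the original analysis script.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, df):
    """t statistic for testing r against zero with df = n - 2."""
    return r * math.sqrt(df / (1 - r * r))


# Hypothetical toy data: log10 word frequencies and per-word insertion rates.
log_freq = [1.0, 2.0, 3.0, 4.0]
toy_rates = [0.55, 0.45, 0.50, 0.52]
toy_r = pearson_r(log_freq, toy_rates)

# t statistic implied by the reported coefficient and degrees of freedom.
t_reported = t_from_r(-0.040, 291)
```

A t value this close to zero on 291 degrees of freedom is what one would expect for a non-significant correlation.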

Figure 11
Insertion rate (%) by V 2 height and dialect.

Mixed effects analysis
In the previous section, we considered only mean insertion rates, abstracting away from differences among individual participants and words. Here, we provide a more stringent test by taking individual participant and word differences into consideration. A mixed effects logistic regression model was fitted with the glmer function from the lmerTest package (Kuznetsova et al. 2017) in R (R Core Team 2020). The dependent variable is binary, i.e. n-inserted or not. Each subject and each test word were included as random intercepts. 10 In the model on the data of existing words, the following independent variables were adopted. 11 (sel = Seoul, ks = Kyungsang)
• dialect (sel, ks)
• 2-way interaction between M 1 origin and M 1 length
• 3-way interaction between dialect, M 1 length and frequency
• 3-way interaction between dialect, M 1 length and C 1 type
• 3-way interaction between dialect, M 1 origin and C 1 type
Categorical factors with three levels, M 1 and M 2 morphology and C 1 type, were forward-difference coded in order to compare not only the first and second levels (e.g. root vs. stem), but also the second and third levels (e.g. stem vs. affix). All the remaining categorical factors were sum-coded so that coefficient estimates would represent the main effect.
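The two coding schemes can be made concrete with a small Python sketch. It assumes the usual textbook contrast matrices: forward-difference columns (2/3, -1/3, -1/3) and (1/3, 1/3, -2/3) for a three-level factor, and +1/-1 sum coding for a two-level factor. The level means are hypothetical log-odds values, and the sign assigned to each dialect is an assumption. The sketch only shows that, under these contrasts, the coefficients equal differences between adjacent levels and the fitted values reproduce the level means.

```python
# Forward-difference contrasts for a three-level factor (rows = levels).
FORWARD_DIFF = [
    (2 / 3, 1 / 3),    # level 1 (e.g. root)
    (-1 / 3, 1 / 3),   # level 2 (e.g. stem)
    (-1 / 3, -2 / 3),  # level 3 (e.g. affix)
]

# Sum contrasts for a two-level factor (sign assignment is an assumption).
SUM_CODE = {"ks": 1.0, "sel": -1.0}


def forward_diff_coefs(m1, m2, m3):
    """Intercept and slopes implied by three level means."""
    grand_mean = (m1 + m2 + m3) / 3
    return grand_mean, m1 - m2, m2 - m3


def predict_three_level(level_index, coefs):
    """Fitted value for one level under forward-difference coding."""
    b0, b1, b2 = coefs
    c1, c2 = FORWARD_DIFF[level_index]
    return b0 + c1 * b1 + c2 * b2


def sum_code_coefs(mean_ks, mean_sel):
    """Intercept (unweighted mean of level means) and slope under sum coding."""
    return (mean_ks + mean_sel) / 2, (mean_ks - mean_sel) / 2


# Hypothetical level means on the log-odds scale.
level_means = (-0.8, -0.2, 0.1)
coefs = forward_diff_coefs(*level_means)
fitted = [predict_three_level(i, coefs) for i in range(3)]

b0_dialect, b_dialect = sum_code_coefs(0.4, 0.0)
fitted_ks = b0_dialect + SUM_CODE["ks"] * b_dialect
fitted_sel = b0_dialect + SUM_CODE["sel"] * b_dialect
```

Under sum coding, a positive slope for one level means that level lies above the mean of the level means, and the other level lies below it by the same amount.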
The random effects of the mixed effects model show large variation depending on the participant (variance = 1.178) and test word (variance = 0.886), meaning that different participants and test words show greater and lesser average rates of insertion, as can be seen in Figures 13 and 14, respectively.
Even once these random factors are taken into account, as can be seen in (8), many fixed factors and their interactions still have a sizable effect on the insertion rate, which holds independently of the specific participant and test word.
10 More complex models with random slopes failed to converge.
11 In the morphological category distinction, stem refers to a morpheme which can stand alone whereas root a morpheme which cannot. An exception to this criterion is that a reduplicated adverb (e.g. /jakɨm-jakɨm/ [jakɨmjakɨm] ~ [jakɨmnjakɨm] 'bit by bit'), which is known to undergo n-insertion in the literature on Seoul Korean n-insertion, consists of stems. The distinction between root and affix was based on the SKD.
Recall in section 3.2.3 that there were only four words with loanword M 1 in the test word set, and their insertion rates were more similar to those with native Korean M 1 than to those with Sino-Korean M 1 . Here, I collapse loanword and native Korean M 1 morphemes into a single category.

Figure 13
The numbers of participants for different insertion rates (bin width = 10).

Figure 14
The numbers of test words for different insertion rates (bin width = 10).
The next section discusses the results of the present survey focusing on the significant effects in the mixed effects model, just presented. 13

Significant effects
The significant main effects in the mixed effects model, shown in (8), are summarized in (9).

13 The non-reference level of sum-coded binary factors and the two levels of forward-difference coded ternary factors relevant to the calculation of the corresponding estimate are shown in parentheses. For instance, the positive estimate for dialect (ks), 0.209, means that the average insertion rate for Kyungsang speakers is above the grand mean rate of insertion. The negative estimate for M 1 .morphology (root minus stem), -0.6, means that the average insertion rate is lower with a root M 1 than a stem M 1 .
(9) Summary: significant main effects
a. M 1 morphology effect: n-insertion is less likely after a bound root than after a free stem (or prefix).
b. M 2 morphology effect: n-insertion is less likely before a bound root than before a free stem (or suffix).
c. Obstruency effect: n-insertion is less likely after an obstruent C 1 .
d. Velar nasal effect: n-insertion is less likely after a velar nasal C 1 .
e. Length effect: n-insertion is more likely with a polysyllabic M 1 than with a monosyllabic M 1 .
f. Height effect: n-insertion is more likely with a high vowel following /j/.
All of these main effects can also be seen in the corresponding tables and plots presented in section 3.2. Many of them are in part or probabilistically consistent with previous studies on Korean n-insertion, discussed in section 2 and summarized in (4). Let us consider each of the significant main effects.
First, the M 1 morphology effect (9a) suggests that a bound root M 1 is a less likely trigger of n-insertion than a free stem M 1 in both dialects. This and other related findings are only in part consistent with previous studies on Korean n-insertion. As discussed in section 2.1, Han (1994) argues that n-insertion may occur after a free stem, regardless of the dialect, but that dialectal variation occurs in words with a bound root M 1 . It is argued that n-insertion is blocked after a bound root in Seoul, not Kyungsang, Korean. The results of the present study agree with Han's argument in that n-insertion is more frequent after a free stem in Seoul Korean. However, the present study differs from the previous studies in that n-insertion in Seoul Korean was not completely blocked after a bound root M 1 , as can be seen in Table 3 and Figure 2. Some test words with a bound root M 1 , for which Seoul Korean participants applied n-insertion frequently, are shown in (10). Consequently, the observed dialectal difference with respect to M 1 morphology is consistent with a probabilistic version of Han's argument on the difference between Seoul and Kyungsang Korean n-insertion.

Second, the M 2 morphology effect (9b) is similar to the M 1 morphology effect, just discussed, in that roots are less likely to trigger n-insertion than stems and affixes. Note that n-insertion did occur when M 2 was a bound root (or affix), as can be seen in Tables 4, 5 and Figure 3. This observation is in conflict with the free M 2 requirement proposed in many previous studies on Seoul Korean n-insertion, mentioned in section 2.1.2. Since all M 2 roots and suffixes in the present data are Sino-Korean, this study provides no counter-examples to the native Korean specific version of the free M 2 requirement. But recall, from section 2.4, that cases of n-insertion before certain native Korean suffixes have already been reported in some previous studies. Consequently, the free M 2 requirement cannot be an absolute condition for n-insertion in Korean words, whether native or Sino-Korean. It does seem true, however, that n-insertion is blocked categorically before certain native Korean bound morphemes, as suggested by the results of the present survey involving the causative-converb suffix sequence /jʌ/, discussed in section 3.1.

Third, insertion rates differed depending on the type of C 1 . Two observed relevant effects, obstruency (9c) and velar nasal (9d), suggest that sonorant consonants other than /ŋ/ are frequent triggers of n-insertion. This is only in part consistent with the previous studies on Seoul Korean n-insertion (Cho 1995, Cho & Iverson 1997) arguing that n-insertion applies obligatorily after a sonorant consonant, and optionally after an obstruent. One difference of the present study from the previous studies is that n-insertion did not always apply in words with a sonorant C 1 . As illustrated in (11), Seoul Korean participants rarely applied n-insertion in some words with a sonorant C 1 . Thus, the observed obstruency effect is consistent with a probabilistic version of the previous argument on the difference between sonorant and obstruent C 1 consonants.
The velar nasal effect (i.e. /ŋ/ is not a likely trigger of n-insertion unlike other sonorant consonants) forms the second difference of the present study from the above-mentioned previous studies. The observed velar nasal effect is consistent with the results of Hwang's (2008) experimental study on Seoul Korean n-insertion.
Fourth, the length effect suggests that a polysyllabic M 1 is a more likely trigger of n-insertion than a monosyllabic M 1 , regardless of the dialect of the participants. See below for a discussion of significant interaction effects of the length factor with some other factors.
Finally, insertion rates differed depending on the height of V 2 vowels following the M 2 -initial /j/. The height effect suggests that high V 2 vowels are more likely triggers of n-insertion than non-high vowels.
Some of the significant main effects, just discussed, interact with other factors in the present data. Significant interaction effects in the mixed effects model, shown in (8), are summarized in (12). Let us discuss these significant interaction effects.
Four interaction effects in (12a-d) indicate how C 1 type effects, obstruency and velar nasal, varied according to dialect, M 1 length, and/or M 1 origin. The larger obstruency effect for Kyungsang speakers (12a) can be seen in Table 12 and Figure 7. This indicates that obstruent consonants are even weaker triggers of n-insertion for Kyungsang speakers than for Seoul speakers. But, the interaction effect in (12b) suggests that this is mainly true for words with monosyllabic M 1 since the obstruency effect for Kyungsang speakers is larger with monosyllabic M 1 than with polysyllabic M 1 , as can be seen in Table 22 and Figure 15.
The observed stronger obstruency effect with monosyllabic M 1 in the present Kyungsang data is probabilistically consistent with Lee's (2006)  sonorants, not obstruents, in Sino-Korean words consisting of monosyllabic root morphemes. 14

As stated in (12c), the velar nasal effect also interacts with dialect and M 1 length, but in the opposite direction. As can also be seen in Table 22 and Figure 15, the velar nasal effect is larger with polysyllabic M 1 morphemes in Kyungsang Korean data. As stated in (12d), the velar nasal effect also interacts with M 1 origin. The velar nasal effect is smaller with Sino-Korean M 1 in Kyungsang Korean data, as can be seen in Table 23 and Figure 16. In addition, the interaction between M 1 length and M 1 origin (12e) indicates that the length effect was larger with a Sino-Korean M 1 than with a native Korean M 1 , as shown in Table 24 and Figure 17.
Note in fact that the M 1 length effect can be seen only among words with Sino-Korean M 1 which formed the majority of the test words.
14 Note that the majority of root M 1 morphemes adopted in the current survey, 55 out of 58, are Sino-Korean.

Finally, although frequency plays no role in predicting rates of Korean n-insertion in the entire data set, as shown in section 3.2.7, it is a significant predictor in a subset of the Seoul Korean data consisting of words with a polysyllabic M 1 (12f). For Seoul Korean participants, words with a polysyllabic M 1 tend to undergo n-insertion more frequently as their token frequencies increase, as indicated by the correlation presented in (13).

Summary
All the significant effects, main or interaction, discussed above, are summarized in (14).

(14) Observations about n-insertion in existing words (A < B means "n-insertion is more frequent under condition B than condition A")

The probability of n-insertion in existing Korean words is significantly affected by various factors, as summarized above. Most of the factors do not completely determine the occurrence or absence of n-insertion; rather, they have a gradient effect, contributing to the overall probability. The next section examines which, if any, of the gradient effects found in this section Korean speakers are aware of. We will focus on the effects of dialect, C 1 type, M 1 length, and V 2 height, which can readily be tested with novel words. The rest of the factors tested in the survey on existing words were excluded from the novel word survey, mainly due to the difficulty of creating appropriate test items. For instance, it was unclear how to construct words with novel bound roots and novel Sino-Korean, as opposed to native Korean, words.

Data and experimental procedure
This section explores n-insertion in novel Korean words by investigating the results of the surveys on native speakers of two dialects of Korean, Seoul and Kyungsang. As in section 3, for novel word data of Seoul Korean speakers, I use the results of a novel word survey conducted by Jun (2015). The results are from responses of thirty-seven Seoul Korean speakers. For Kyungsang Korean data, I performed a survey on Kyungsang Korean speakers, using the same method and the same word set as adopted by Jun.
In total, thirty-two paid Northern Kyungsang Korean speakers participated in the test (mean age = 25.1 years, 17 females and 15 males). None of them participated in the survey with existing words. Most of the participants (n = 24) were raised in Daegu, and the rest (n = 8) in other parts of Northern Kyungsang province. All participants were recruited from the community at Seoul National University through public advertising and personal referrals. Twenty-eight participants took the survey in a quiet room (for about 25-35 minutes). The rest of the participants (n = 4) completed and returned the survey via email.
All test words consist of loanword M 1 and wug stem M 2 . M 1 , which is either mono- or disyllabic, ends with one of seven consonants, /m, n, ŋ, l, p, s, k/. M 2 begins with one of /i, ju, ja/. The total number of test items is 84 (2 syllable counts × 7 codas × 3 vocoid types × 2 repeating blocks). The same number of control items (vowel-final M 1 or /a, e/-initial M 2 ) was adopted. The order of items was pseudorandomized such that test items never followed each other and were always separated by a control item. Two sets of items were prepared: one set was the reversed version of the other.
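The item count and the ordering constraint described above can be checked with a short sketch. This is only an illustration, not the procedure actually used to build the survey: the function names, the placeholder control items, and the strict test/control alternation are assumptions introduced here (the text only says that test items were always separated by a control item).

```python
import itertools
import random

# Factor levels taken from the description of the test words.
SYLLABLE_COUNTS = ["mono", "di"]
CODAS = ["m", "n", "ŋ", "l", "p", "s", "k"]
VOCOIDS = ["i", "ju", "ja"]
BLOCKS = [1, 2]


def build_test_items():
    """Cross all four factors: 2 x 7 x 3 x 2 = 84 test items."""
    return list(itertools.product(SYLLABLE_COUNTS, CODAS, VOCOIDS, BLOCKS))


def interleave(test_items, control_items, seed=0):
    """Shuffle both lists and alternate them, so that two test items are
    never adjacent (one simple way to satisfy the stated constraint)."""
    rng = random.Random(seed)
    tests = list(test_items)
    controls = list(control_items)
    rng.shuffle(tests)
    rng.shuffle(controls)
    order = []
    for test, control in zip(tests, controls):
        order.append(("test", test))
        order.append(("control", control))
    return order


test_items = build_test_items()
# Hypothetical placeholders: the actual control items are not listed in the text.
control_items = [f"control_{i}" for i in range(len(test_items))]
order = interleave(test_items, control_items)
# The second item set is the reversed version of the first.
reversed_order = order[::-1]
```

Reversing a strictly alternating list keeps the alternation, so both item sets satisfy the separation constraint.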
In the survey form, two parts of a word, M 1 and M 2 , along with its inserted and non-inserted forms, were presented in standard Korean orthography. The experimenter told the participants that the combination of the two parts is a made-up compound noun for a new chemical product. The participants were instructed to choose their pronunciation of each of the given compounds from the following three (or two) options.
(15) Options in the test form
                                     (i)               (ii)              (iii)
   e.g.                              s'ʌm + jucenol    tʰap + jucenol    kʰiŋ + jucenol
                                     'some'+wug        'top'+wug         'king'+wug
   a. insertion                      s'ʌm.nju.ce.nol   tʰam.nju.ce.nol   kʰiŋ.nju.ce.nol
   b. non-insertion (resyllabified)  s'ʌ.mju.ce.nol    tʰa.pju.ce.nol
   c. non-insertion (aligned)        s'ʌm.ju.ce.nol    tʰap.ju.ce.nol    kʰiŋ.ju.ce.nol

As in the survey of existing words, when C 1 is /ŋ/ as in (15iii), only two options (15a,c) were given. When C 1 was underlyingly an obstruent as in (15ii), the option (15a) with insertion included nasalization of C 1 due to obstruent nasalization. The participants were allowed to choose more than one option, each of which was counted as one data point in the analysis. Also, when their pronunciation was not given as an option, they were asked to write it in the blank space on the test form.

In order to construct the database for analysis of n-insertion in novel Korean words, I first combined results of Jun's (2015) survey on Seoul Korean speakers and the results of the present survey on Kyungsang speakers. Two Seoul Korean speakers inserted more frequently in control tokens than target tokens, which I think is beyond the permissible range of errors. Their responses were considered not reliable, and thus excluded from analysis. After further excluding responses to fillers with vowel-final M 1 , the resulting data consist of 9,814 responses.
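The exclusion criterion just described (more insertion in control tokens than in target tokens) can be sketched as follows. The data layout, the function names and the toy responses are hypothetical, and comparing rates rather than raw counts is an assumption made here for illustration.

```python
from collections import defaultdict


def insertion_rate(responses):
    """Share of responses with n-insertion; `responses` is a list of 0/1."""
    return sum(responses) / len(responses) if responses else 0.0


def unreliable_participants(rows):
    """rows: iterable of (participant, item_type, inserted), where item_type
    is 'target' or 'control' and inserted is 0 or 1. Returns the set of
    participants whose control insertion rate exceeds their target rate."""
    by_participant = defaultdict(lambda: {"target": [], "control": []})
    for participant, item_type, inserted in rows:
        by_participant[participant][item_type].append(inserted)
    return {
        participant
        for participant, resp in by_participant.items()
        if insertion_rate(resp["control"]) > insertion_rate(resp["target"])
    }


# Toy responses, invented for illustration only.
rows = [
    ("s01", "target", 1), ("s01", "target", 0),
    ("s01", "control", 0), ("s01", "control", 0),
    ("s02", "target", 0), ("s02", "target", 0),
    ("s02", "control", 1), ("s02", "control", 0),
]
excluded = unreliable_participants(rows)
kept = [row for row in rows if row[0] not in excluded]
```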
An initial investigation of the novel word data shows a clear syllabicity effect of M 2 -initial vocoids (i.e. n-insertion is less likely before /i/ than /j/), as can be seen in Table 25.
Insertion rate is obviously higher before a glide /j/ than before vowels. But, /i/ is not higher in insertion rate than control vowels /a, e/, suggesting that insertion before /i/ is no more productive than insertion before other vowels.
Given that only pre-/j/ insertion is productive, the remainder of the analysis will focus on the data for test words with /j/-initial M 2 , which consist of 3,957 responses. This will facilitate a direct comparison between the patterns of novel and existing Korean words. As in section 3, the novel word data will be analyzed, first using mean insertion rates and then using a mixed effects logistic regression model.

Raw data analysis
This section discusses only overall patterns of the data by calculating the mean insertion rates for each of the factors adopted in the experiment. The next section provides a statistical analysis of the same data.
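As a rough illustration of what "mean insertion rate for each factor" amounts to, the sketch below computes the percentage of inserted responses per factor level. The row format, the function name `rate_by` and the toy rows are assumptions made here; they are not the survey data.

```python
from collections import defaultdict


def rate_by(rows, factor):
    """Mean insertion rate (%) for each level of `factor`."""
    counts = defaultdict(lambda: [0, 0])  # level -> [inserted, total]
    for row in rows:
        counts[row[factor]][0] += row["inserted"]
        counts[row[factor]][1] += 1
    return {level: 100.0 * ins / tot for level, (ins, tot) in counts.items()}


# Toy responses, invented for illustration only.
rows = [
    {"dialect": "sel", "c1": "son", "inserted": 1},
    {"dialect": "sel", "c1": "obs", "inserted": 0},
    {"dialect": "ks", "c1": "son", "inserted": 1},
    {"dialect": "ks", "c1": "obs", "inserted": 1},
]
by_dialect = rate_by(rows, "dialect")
by_c1 = rate_by(rows, "c1")
```

Such raw rates ignore participant and item variability, which is why the statistical analysis that follows uses a mixed effects model.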

Dialect
The mean rate of insertion for the entire data set is 31.4%, and the rate is higher for Kyungsang speakers than for Seoul speakers, as shown in Table 26 and Figure 19.

Table 26 Insertion rate (%) by dialect.

Figure 19 Insertion rate (%) by dialect.

It will be shown that the observed difference is not statistically significant, as in the results of the existing word survey.

C 1 type
Insertion rates were calculated for different C 1 consonants, as shown in Table 27 and Figure 20.
Insertion rates were higher with all three sonorant consonants, /m, n, l/, than with obstruents and /ŋ/, although the difference between /m/ and /k/ is very small. To establish the effects of C 1 type, i.e. obstruency and velar nasal effects, which were significant in existing words, insertion rates were calculated for obstruents, /ŋ/ and other sonorant consonants, as shown in Table 28.

Table 27 Insertion rate (%) by C 1 consonants.

Figure 20 Insertion rate (%) by C 1 consonants.

Note that n-insertion in novel words is less likely after obstruents and /ŋ/ than after other sonorant consonants, just like in existing words. The observed lower insertion rates with an obstruent or /ŋ/ C 1 are true for both Seoul and Kyungsang speakers, as can be seen in Table 29 and Figure 21.
The ranking by C 1 type, son > obs > /ŋ/, in novel words did not differ by dialect, which is not consistent with the corresponding results of the existing word survey where insertion rate in Kyungsang Korean data was higher with /ŋ/ C 1 than with an obstruent C 1 , unlike in Seoul Korean data.
In summary, n-insertion was less likely after an obstruent or velar nasal than after other sonorant consonants, regardless of the dialect of the participants. These tendencies in novel words are consistent with the tendencies, called obstruency and velar nasal effects, found in existing words.

Length
To establish the effect of M 1 length, insertion rates were calculated for different lengths of M 1 , as shown in Table 30.

Table 29 Insertion rate (%) by C 1 type and dialect.

Figure 21 Insertion rate (%) by C 1 type and dialect.

Insertion rate in novel words is higher with a monosyllabic M 1 , which is not consistent with the corresponding results of the existing word survey. The observed higher rate of insertion with monosyllabic M 1 is true for both Seoul and Kyungsang participants, as shown in Table 31 and Figure 22.

In summary, n-insertion was more likely with monosyllabic M 1 than with polysyllabic M 1 , regardless of the dialect of the participants. This is the opposite of the tendency observed in existing words, that is, higher rates with a polysyllabic M 1 . Consequently, the results of the existing and novel word surveys are not consistent with each other with respect to the effect of M 1 length. (See section 5.2 for a further discussion.)

V 2 height
To establish the effect of the height of a vowel following /j/, called V 2 , insertion rates were calculated for high and non-high V 2 vowels, as shown in Table 32.
Insertion rate in novel words was higher before a high V 2 than before a non-high V 2 , which is consistent with the corresponding results of the existing word survey. This observed higher rate of n-insertion with a high V 2 is true for both Seoul and Kyungsang participants, as can be seen in Table 33 and Figure 23.
In summary, n-insertion was more likely with a high V 2 than with a non-high V 2 , regardless of the dialect of the participants. This tendency in novel words is consistent with the tendency, called height effect, found in existing words.

Mixed effects analysis
As in the analysis of existing Korean words, a mixed effects logistic regression model was fitted with the glmer function from the lmerTest package (Kuznetsova et al. 2017) in R (R Core Team 2020). The dependent variable is binary, i.e. n-inserted or not. Each subject and each test word were included as random intercepts.
In the model on the data of novel words, the following independent variables were adopted.
• dialect (sel, ks)
• M 1 length (mono, poly)
• C 1 type (obs, ŋ, son)
• V 2 height (high, non-high)

The interactions of dialect with all the remaining factors were included to address the question of whether the effect of each variable varies across dialects. As in the analysis of existing  to determine which interactions are active in the novel word data. Although no interactions turned out to be active in the novel word data, I have added the following interaction, which was significant in existing words and can be tested in novel words: a 3-way interaction between dialect, M 1 length and C 1 type. As in the analysis of existing word data, C 1 type with three levels was forward-difference coded, and all the remaining categorical factors were sum-coded.
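To make the two coding schemes concrete, the sketch below writes out one common version of each and checks what the resulting coefficients mean. The level order (obs, ŋ, son), the direction of subtraction, the choice of reference level, and all coefficient values are assumptions made here for illustration; they are not the estimates reported in (16).

```python
from fractions import Fraction as F

# Sum coding for a binary factor: reference level = -1, other level = +1,
# so the coefficient is the deviation of the coded level from the grand mean.
SUM_CODING = {"sel": F(-1), "ks": F(1)}

# Forward-difference coding for a three-level factor (levels in this order).
# Coefficient 1 compares level 1 with level 2; coefficient 2 compares
# level 2 with level 3.
FORWARD_DIFF = {
    "obs": (F(2, 3), F(1, 3)),
    "ŋ": (F(-1, 3), F(1, 3)),
    "son": (F(-1, 3), F(-2, 3)),
}


def cell_value(intercept, b_dialect, b_c1, dialect, c1):
    """Linear predictor (log-odds) for one dialect x C1-type cell."""
    x1, x2 = FORWARD_DIFF[c1]
    return intercept + b_dialect * SUM_CODING[dialect] + b_c1[0] * x1 + b_c1[1] * x2


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
intercept, b_dialect, b_c1 = F(0), F(1, 2), (F(1), F(-2))
cells = {c1: cell_value(intercept, b_dialect, b_c1, "ks", c1) for c1 in FORWARD_DIFF}

diff_obs_ng = cells["obs"] - cells["ŋ"]       # equals b_c1[0]
diff_ng_son = cells["ŋ"] - cells["son"]       # equals b_c1[1]
ks_mean = sum(cells.values()) / len(cells)    # equals intercept + b_dialect
```

Because every contrast column sums to zero, the intercept stays interpretable as the grand mean on the log-odds scale.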
As in the analysis of existing word data, the random effects of the mixed effects model show variation depending on the participant (variance = 2.814) and test word (variance = 0.124), meaning that different participants and test words show greater and lesser average rates of insertion. Even after these random factors are taken into account, as can be seen in (16)

The next section discusses the results of the novel word survey, focusing on the significant effects in the mixed effects model just presented.
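To give a feel for the size of the random-intercept variances on the probability scale, the following sketch converts them from log-odds to probabilities. Only the two variances come from the text. Using the raw overall insertion rate (31.4%) as the baseline is an assumption made here for illustration; it is not the model intercept.

```python
import math


def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Standard deviations of the random intercepts (square roots of the variances).
participant_sd = math.sqrt(2.814)
word_sd = math.sqrt(0.124)

# ASSUMPTION: illustrative baseline taken from the raw mean insertion rate.
baseline = logit(0.314)

# Insertion probability for a participant one SD below / above the baseline.
low_participant = inv_logit(baseline - participant_sd)
high_participant = inv_logit(baseline + participant_sd)

# The same for a test word one SD below / above the baseline.
low_word = inv_logit(baseline - word_sd)
high_word = inv_logit(baseline + word_sd)
```

Under these assumptions the participant range is much wider than the test word range, reflecting the larger participant variance.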

Significant effects
Unlike in existing word data in which several main and interaction effects were significant, only the main effects involving C 1 type and V 2 height turned out to be significant (α = 0.05) in novel word data, as summarized in (17).

(17) Summary: significant main effects (novel words)
a. Obstruency effect: n-insertion is less likely after an obstruent C 1 .
b. Velar nasal effect: n-insertion is less likely after a velar nasal C 1 .
c. Height effect: n-insertion is more likely with a high vowel following /j/.
All of these main effects can also be seen in the corresponding tables and plots presented in section 4.2. Let us discuss these significant effects.
Insertion rates differed depending on the type of C 1 . Two observed relevant effects, obstruency (17a) and velar nasal (17b), suggest that sonorant consonants other than /ŋ/ are frequent triggers of n-insertion in novel Korean words. Recall in section 3 that the two observed effects involving C 1 type, obstruency and velar nasal, were also found significant in the existing word data. This suggests that both obstruency and velar nasal effects were extended to novel words.

In addition, the height effect (17c) suggests that a high V 2 vowel is a more likely trigger of n-insertion in novel words than non-high V 2 vowels. Recall in section 3 that this V 2 height effect was also significant among existing words, indicating that the V 2 height effect was extended to novel words.
To summarize the results of the survey with novel Korean words, three main effects, obstruency, velar nasal and V 2 height, were confirmed. None of the remaining effects, main or interaction, were significant. As will be discussed in section 5.2, these results suggest that Korean speakers can learn prominent patterns among existing words through phonological generalizations.

Discussion
This section discusses how to analyze the occurrence of n-insertion and the observed tendencies, the data vs. learning (mis)match, dialectal variation and remaining problems.

Motivation, constraint and P-map effects
To illustrate how to analyze the occurrence of n-insertion, consider inserted and non-inserted forms shown in (18). If /n/ is not inserted, C 1 consonants at the end of M 1 would be either resyllabified into the onset of the M 2 -initial syllable as in (18a) or be aligned with the end of the M 1 -final syllable as in (18b). The non-inserted forms with resyllabification in (18a) involve a misalignment between morpheme and syllable boundaries. Many previous analyses of Korean n-insertion have argued, or assumed, that n-insertion is motivated by the requirement to align morpheme-syllable boundaries (Jun 2015 and references therein). In the non-inserted forms with alignment in (18b), vocoids /i, j/ initiate a syllable, leaving the preceding C 1 consonants at the end of the M 1 -final syllable. This syllable structure is marked in Korean where an intervocalic consonant (with the possible exception of /ŋ/) is syllabified as an onset. In terms of Optimality Theory (Prince & Smolensky 1993), the forms with resyllabification and alignment violate the alignment constraint ("The right edge of a morpheme coincides with the right edge of a syllable.") and syllable structure constraints, respectively. The relevant syllable structure constraints here include Onset ("Syllables have an onset.") and Vocoid-Nucleus ("Every [−consonantal] segment must be in the nucleus."). The former penalizes aligned forms with a vowel-initial syllable (e.g. [som.i.pul]) whereas the latter penalizes those with a glide-initial syllable (e.g. [com.jak]). In contrast, the forms with n-insertion in (18c) satisfy both the alignment and syllable structure constraints at the cost of violating the constraint militating against insertion of a segment, namely Dep. The epenthetic /n/ initiates M 2 -initial syllables, not only aligning morpheme and syllable boundaries, but also avoiding syllables with no (or a bad, i.e. [-consonantal]) onset. To summarize, a consonant needs to be inserted at the beginning of M 2 to obey the alignment and syllable well-formedness constraints.
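The constraint interaction just described can be made concrete with a small violation counter. This is a deliberately simplified sketch, not an implementation of any published analysis: it assumes one character per segment, that M 1 surfaces unchanged at the left edge of the output, and that Vocoid-Nucleus is violated exactly by a syllable-initial glide. The function name and segment sets are hypothetical, and no constraint ranking is implied.

```python
VOWELS = set("aeiouʌɨ")
GLIDES = set("j")


def violations(m1, m2, candidate):
    """Count violations for a syllabified output such as 'com.njak'.
    m1, m2: input morphemes; candidate: output with '.' at syllable breaks."""
    syllables = candidate.split(".")
    segments = "".join(syllables)
    # Positions (in segments) at which a syllable ends.
    syllable_ends = set()
    position = 0
    for syllable in syllables:
        position += len(syllable)
        syllable_ends.add(position)
    return {
        # Right edge of M1 must coincide with the right edge of a syllable.
        "Align": 0 if len(m1) in syllable_ends else 1,
        # Syllables beginning with a vowel have no onset.
        "Onset": sum(1 for s in syllables if s[0] in VOWELS),
        # Simplification: a syllable-initial glide is outside the nucleus.
        "Vocoid-Nucleus": sum(1 for s in syllables if s[0] in GLIDES),
        # Output segments with no input correspondent.
        "Dep": max(0, len(segments) - len(m1) - len(m2)),
    }


resyllabified = violations("com", "jak", "co.mjak")
aligned = violations("com", "jak", "com.jak")
inserted = violations("com", "jak", "com.njak")
aligned_vowel = violations("som", "ipul", "som.i.pul")
```

Each of the three candidate types incurs exactly one violation, of a different constraint, which is the trade-off described in the text.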
Let us now consider the questions of why /n/, not any other consonant, is inserted, and why insertion takes place only before high front vocoids. Jun (2015) answers these questions by arguing that /n/ in Korean is perceptually weak in the context of n-insertion, i.e. before high front vocoids /i, j/. Specifically, Jun's analysis of Korean n-insertion is couched within the framework of Steriade's (2001, 2009) P-map theory, which proposes that it is segments with low perceptibility that are typically inserted or deleted, due to the fixed ranking of faithfulness constraints reflecting the relevant perceptibility scale. In Korean, if /n/ occurs before high front vocoids /i, j/, it undergoes allophonic palatalization, failing to induce coarticulatory changes on the following vocoids. Thus, the input-output pairs i-ni (or, more precisely, i-ɲi) and j-nj

Recall in section 3 that all the three significant effects in novel word data were also significant in existing word data. This suggests that Korean speakers are aware of each of the corresponding tendencies in the lexicon, and that they used this knowledge in the novel word survey.
What about the rest of the tendencies tested in the novel word survey? All of them, which turned out to be insignificant in novel words, involve M 1 length and dialect. In the remainder of this subsection, I will consider the M 1 length effect, postponing the discussion of dialect effects to the next section.
As shown in section 4.2.3, insertion rate in novel words was higher with monosyllabic M 1 . The relevant difference was not significant in the mixed effects model shown in section 4.3 (M 1 length (poly): b = -0.151, p = 0.098). This statistically insignificant tendency in novel words is the opposite of the tendency in existing words. Thus, this cannot be a case where the size of a lexical tendency is simply reduced in novel words. This suggests that Korean speakers failed to learn the tendency involving the M 1 length of existing words. This mismatch has been reported by Jun (2015), from which the Seoul Korean data of the present study are drawn. The insignificant interaction between M 1 length and dialect shown in (16), along with the insertion rates calculated in novel words for different lengths of M 1 , shown in Table 31 and Figure 22, suggests that the failure of extending the length effect to novel words was not confined to Seoul Korean speakers. Jun attributes the observed mismatch between data and learning to the lack of a phonetic/phonological motivation for the tendency related to the length of M 1 . Recall in the previous section that not only the occurrence of n-insertion but also most of the observed effects, except the length effect, can be explained in a phonetically and/or phonologically plausible way. It has been argued in the literature (Hayes et al. 2009, Becker et al. 2011, Hayes & White 2013) that lexical patterns lacking phonetic/phonological naturalness either cannot be learned or can be learned with much difficulty. Thus, the lack of phonetic/phonological naturalness seems to be a potentially plausible account of why the length effect in the lexicon failed to extend to novel words.

To explain the failure of extending the length effect to novel words, one might consider a possibility that Korean speakers applied n-insertion in the novel word survey using their knowledge of patterns in existing native, not Sino-Korean, words. Recall, as shown in Table 24 and Figure 17, that the M 1 length effect was effective mainly in existing words with Sino-Korean, not native, M 1 . However, it is unclear not only why Korean speakers ignored distributional patterns in Sino-Korean words which form the majority of the Korean lexicon, but how they successfully distinguished native Korean morphemes from Sino-Korean ones.
To summarize, three tendencies, i.e. obstruency, velar nasal and V 2 height effects, observed from the novel words mirror those from existing words, indicating a match between learning and data. This suggests that speakers can generalize gradient statistical patterns in the lexicon, as has been confirmed in many previous studies (Jun & Albright 2017 and references therein). An obvious mismatch involves the length effect, which may possibly lack phonetic/phonological naturalness.

Dialectal variation and remaining problems
One of the aims of the present study is to explore dialectal variation in Korean n-insertion. In this section, I will first discuss the interaction effects of dialect with other factors, and then turn to the main effect of dialect.
In the analysis of the results of the survey on existing words, five out of six significant interaction effects involve the dialect of the participants as summarized in (12). All the significant interaction effects of dialect and other factors in existing word data shown in (12) were either not tested or insignificant in the novel word data. Frequency and M 1 origin factors including their interactions with dialect were excluded from the novel word survey mainly due to the difficulty in creating appropriate test items. The remaining three interaction effects (12a-c) were tested in the novel word survey, but none of them turned out to be significant in the mixed effects logistic regression model presented in (16). It seems that the lack of statistical significance of the relevant dialectal differences in novel words might be due to the general reduction of average insertion rate in novel words, mentioned at the beginning of section 5.2.
Let us now compare existing and novel words with respect to the main effect of dialect. Recall that the average insertion rate was higher for Kyungsang speakers than Seoul speakers, regardless of whether test items were existing or novel Korean words. This might lead one to think that n-insertion is more likely to occur for Kyungsang speakers than Seoul speakers, and that the relevant lexical patterns were extended to novel words. However, given that the main effect of dialect was not significant in either of the statistical models for existing and novel word data, presented in (8) and (16) respectively, neither Kyungsang speakers' higher rate of insertion nor the consistency between existing and novel words can be established.
Note that the method of the present study has some limitations in detecting the dialectal difference even if the insertion rate is truly different between Kyungsang and Seoul Korean. First, a majority of Kyungsang participants employed in this study were relatively young, in their 20s (mean age = 25.9 years in existing word survey, 25.1 years in novel word survey). It is well-known in the literature on Korean linguistics that younger Kyungsang Korean speakers are more familiar with Seoul Korean than older Kyungsang Korean speakers. It is thus possible that the Kyungsang Korean participants in the current survey might, at least sometimes, have relied on their knowledge of Seoul Korean during the survey. Second, as pointed out by an anonymous reviewer, the modality of the task, i.e. text, not audio/production, might reduce the likelihood of finding dialectal differences. Given that non-standard dialects are usually used in spoken, as opposed to written, forms, it is possible that the modality of the present task prevented the Kyungsang participants from relying on their Kyungsang Korean grammar. Finally, as mentioned in section 3, all 304 test words used in the existing word survey were standard, thus Seoul, Korean words. The same set of test words was used for Kyungsang and Seoul Korean speakers for the purpose of making a direct comparison between the dialects. It might be then possible that Kyungsang Korean participants used their knowledge of Seoul Korean during the survey, at least when the test words were not in their native word lexicon. A more reliable detection of the dialectal difference under consideration can be made by employing older speakers of Kyungsang Korean as participants and spoken Kyungsang Korean words as test stimuli. This is left for future research.
To summarize, the present study found several statistically significant dialectal differences in the patterns of n-insertion in existing Korean words. However, it is still not known what other dialectal differences are active in existing words, and whether the dialectal differences in the lexicon can generally be extended to novel words.

Conclusion
In this study, I have explored variation in Korean n-insertion for the purpose of finding out whether phonological and morphological factors may have gradient effects in application of a morphophonological process, how they interact, and which of the gradient effects speakers are aware of. From the results of the survey on n-insertion in existing Korean words, I have found several gradient tendencies involving a variety of factors including phonological and morphological ones and their interactions. Such factors include morphological category, etymology and length of component morphemes, sonorancy and place of articulation of the consonant preceding the insertion site, height of the initial vowel of the morpheme following the insertion site and dialects of speakers. None of these factors are absolute conditions for the occurrence of n-insertion, contrary to the previous studies on Korean n-insertion, and they have gradient effect, contributing to the overall probability. Thus, Korean n-insertion provides a clear case where previous categorical proposals do not match with gradient reality, lending support to quantitative theories of morphophonology.
Some, not all, of the observed tendencies were tested in the survey with novel Korean words. The tendencies involving the morphological category and etymology of component morphemes were excluded from the novel word investigation, mainly due to the difficulty of constructing appropriate test items. Three phonological tendencies, obstruency, velar nasal and height effects, were mirrored in the results of a survey involving novel words, suggesting that Korean speakers are aware of the differential influence of such phonological factors on the probability of the application of n-insertion. No other effects, including the length and dialect effects, turned out to be significant in the novel word investigation. The statistical insignificance of some of these effects might be due to the general reduction of the average insertion rate in novel words. However, it seems clear from the relevant result of statistical analysis and insertion rates that