Open Access. Published by De Gruyter Mouton, September 22, 2022, under a CC BY 4.0 license.

What do complexity measures measure? Correlating and validating corpus-based measures of morphological complexity

  • Çağrı Çöltekin and Taraka Rama
From the journal Linguistics Vanguard

Abstract

We present an analysis of eight measures used for quantifying morphological complexity of natural languages. The measures we study are corpus-based measures of morphological complexity with varying requirements for corpus annotation. We present similarities and differences between these measures visually and through correlation analyses, as well as their relation to the relevant typological variables. Our analysis focuses on whether these ‘measures’ are measures of the same underlying variable, or whether they measure more than one dimension of morphological complexity. Principal component analysis indicates that the first principal component explains 92.62 percent of the variation in eight measures, indicating a strong linear dependence between the complexity measures studied.

1 Introduction

Whether a given language is more complex than another is an intriguing question. It has been widely assumed that all human languages have more-or-less equal complexity.[1] Recent challenges (McWhorter 2001; Sampson et al. 2009) to this ‘equal-complexity’ hypothesis resulted in a large number of studies which aim to objectively measure the complexities of human languages. In general, ranking the languages of the world on a scale of complexity is not necessarily very productive or useful by itself. However, such measures are useful, and are used, for assessing the effects of geographic, historical, social, cultural, political and cognitive variables on linguistic differences in specific domains of linguistic structure (for instance, Bentz et al. 2016; Berdicevskis 2018; Bozic et al. 2007; Brezina and Pallotti 2019; Bulté and Housen 2012; Chen and Meurers 2019; De Clercq and Housen 2019; Ehret and Szmrecsanyi 2019; Kusters 2003; McWhorter 2001; Mehravari et al. 2015; Michel et al. 2019; Miestamo et al. 2008; Szmrecsanyi and Kortmann 2012; Vainio et al. 2014; van der Slik et al. 2019; Weiss and Meurers 2019; Yoon 2017). Since these studies use complexity metrics as a reference point for relating linguistic differences to other variables, it is crucial to have objective, precise and well-understood metrics.

Complexity of a text from a single language can be determined relatively consistently by the speakers of the language. This notion of a complex language is also quantified by measures developed in a long tradition of assessing the complexity of texts written in the same language (see DuBay 2004, for a survey). Creating objective measures for comparing the complexities of different languages, however, comes with multiple difficulties, ranging from the lack of a clear definition of complexity (Andrason 2014) to the fact that it is likely impossible to summarize the complexity of a language with a single number (Deutscher 2009). Measuring the complexity of subsystems of a language, particularly the complexity of morphology, seems less controversial. Although it is not completely free of the issues noted for measuring the overall complexity of a language, the intuition that the morphological complexity of Mandarin is less than the morphological complexity of Estonian is hardly open to debate. Quantifying this intuition has been an active strand of research, yielding a relatively large number of measures of morphological complexity (Bentz et al. 2017; Berdicevskis et al. 2018; Cotterell et al. 2019; Dahl 2004; Juola 1998; Koplenig et al. 2017; Newmeyer and Preston 2014; Sagot and Walther 2011; Stump 2017, just to name a few).

Most complexity measures suggested in the literature are necessarily indirect, noisy, theory- or model-dependent, and can often only be applied to a limited number of languages due to a lack of resources or information. Furthermore, morphological complexity is argued to have multiple dimensions (Anderson 2015). As a result, understanding and validating these measures is crucial for research drawing conclusions based on them. Despite the large number of seemingly different measures proposed for quantifying linguistic complexity over the last few decades, there have been only a few attempts to compare and understand these measures (for instance, Bentz et al. 2016; Berdicevskis et al. 2018; Stump 2017). In this paper, we experiment with a number of measures of morphological complexity, investigating their similarities and differences as well as their relation to typological features obtained from grammar descriptions. In particular, given a set of measures of morphological complexity (described in Section 3), we focus on the question of whether these ‘measures’ address the same underlying concept and construct, and whether they differ in (typologically) meaningful ways.

2 Measuring morphological complexity

Compared to other areas of inquiry in linguistics, morphology is a very popular domain for studying linguistic complexity. Measuring morphological complexity is often motivated by the fact that morphology is relatively straightforward and theory-independent, at least in comparison to syntax (Juola 1998). Another motivating factor is the claim that the mere existence of morphology is complexity (Anderson 2015; Carstairs-McCarthy 2010), which is parallel to the claim that younger languages tend to have simpler morphologies (McWhorter 2001).

It is, however, often unclear what most studies mean by the complexity of (subsystems of) languages (Sagot 2013). Most typological studies quantify morphological complexity by counting a set of properties. Besides the strong consensus that different subsystems of a language may have different complexities, morphological complexity alone is probably a multi-faceted concept (Anderson 2015) which may be difficult or impossible to place on a single scale. A common distinction, popularized by Ackerman and Malouf (2013), is between enumerative and integrative complexity. Enumerative complexity is based on the number of morphosyntactic distinctions marked on the words of a language, while integrative complexity is about the predictability of morphologically related words from each other. The former notion is similar to the notion of complexity in most typological studies, and it is in line with what computational linguists typically call ‘morphologically rich’ (for instance, Tsarfaty et al. 2013). The latter, with some simplification, corresponds to what one would associate with ‘difficulty’ in processing and learning. A language may exhibit high complexity on one of these scales while being less complex on the other. For example, an agglutinating language with many possible morphosyntactic alternations and a regular mapping between the functions and the forms may have a high enumerative complexity but low integrative complexity.

One of the aims of the present study is to provide evidence for the distinction between enumerative and integrative complexity. Given the large number of measures suggested in the earlier literature, we perform a principal component analysis (PCA) to observe whether there is more than one meaningful, independent dimension measured by this seemingly diverse set of measures. Earlier work on quantifying ‘integrative’ complexity has been based on the paradigm cell filling problem (Ackerman et al. 2009). Both Ackerman and Malouf (2013) and Cotterell et al. (2019) calculate a version of this complexity using paradigms extracted from grammars and lexical databases, respectively. These studies are particularly interesting as they postulate multiple dimensions of morphological complexity, and propose methods of quantifying these dimensions. Cotterell et al. (2019) also report an inverse correlation between these two dimensions of morphological complexity.

Another aspect of a complexity measure is the resource needed for computing it. The two studies listed above use morphological information extracted from grammar descriptions and crowd-sourced lexical data. Corpus-based approaches are also commonly used for quantifying linguistic complexity (e.g., Bentz et al. 2016; Juola 1998; Oh et al. 2013). A straightforward method of measuring morphological complexity from an unannotated corpus is based on a measure of lexical diversity. Since morphologically complex languages tend to include a larger number of word forms, they tend to exhibit a higher degree of lexical diversity. Another common approach is to utilize the entropy of the text. In particular, individual words are expected to contain more regularities (lower entropy) in a morphologically complex language. Linguistically annotated corpora allow formulating more direct measures of the notions of complexity discussed above. For example, one can approximate the enumerative complexity by counting the morphological features available in the corpus, and the integrative complexity by training a machine learning method to learn a mapping between the forms and the functions of words based on the information available in the corpus.

It is important to note, however, that there are certain differences in comparison to the methods that work on data extracted from grammar descriptions or lexical resources. The information in a corpus reflects language use, rather than a theory or description of the language. This means some word forms or paradigm cells will not be observed in a corpus. On the other hand, obtaining a corpus is often easier than extracting data from linguistic documentation. Hence, a corpus-based approach is suitable for a wider range of languages. Furthermore, a corpus also provides frequency information, which can be utilized in addition to the purely type-based inferences one can make from lexical data. All measures we study in this paper are corpus-based measures.

3 Measures

The present study compares eight different corpus-based measures of morphological complexity. The annotation level required by each measure ranges from none to the full morphological (inflectional) annotations typically found in a treebank. Most of the measures we define in this section have been used in earlier studies for quantifying morphological complexity. A few new measures are also introduced here, but all are related to measures from the earlier literature. In some cases, we modify an existing measure to either resolve some methodological issues or adapt it to the current experimental setup. The remainder of this section describes the measures we study. The details of the experimental setup are described in Section 4.

3.1 Type/token ratio (TTR)

The type/token ratio is the ratio of word types (unique words) to word tokens in a given text sample. The TTR is a time-tested metric for measuring linguistic complexity. Although there have been criticisms of using the TTR as a measure of lexical diversity (Jarvis 2002; McCarthy and Jarvis 2010), it is one of the most straightforward measures to calculate, it has been used in a number of earlier studies for measuring morphological complexity, and it has shown rather high correlations with other, more complex methods (e.g., Bentz et al. 2016; Čech and Kubát 2018; Çöltekin and Rama 2018). Since morphologically complex languages have more diverse word forms, a high TTR indicates rich or complex morphology. Since the TTR depends on the corpus size, it is common practice to calculate the TTR using a fixed window size (Kettunen 2014).[2] We calculate the TTR on a fixed-length random sample, and report the average over multiple samples. The details of the sampling procedure are described in Section 4.1.
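
For concreteness, a minimal Python sketch of this computation (assuming a pre-tokenized corpus; in the actual experiments we sample whole sentences, see Section 4.2):

```python
import random

def ttr(tokens):
    """Type/token ratio: number of unique word forms divided by token count."""
    return len(set(tokens)) / len(tokens)

def mean_ttr(tokens, sample_size=20000, n_samples=100, seed=1):
    """Average TTR over fixed-size random samples, controlling for corpus size."""
    rng = random.Random(seed)
    return sum(ttr(rng.sample(tokens, sample_size))
               for _ in range(n_samples)) / n_samples
```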

3.2 Information in word structure (WS)

A popular method for measuring morphological complexity is based on comparing the information content of an original text and a distorted version of the text where the word structure is destroyed. The measure was initially proposed by Juola (1998) but variations of the framework have been used in a large number of studies (Bentz et al. 2016; Ehret 2014; Ehret and Szmrecsanyi 2016; Juola 2008; Koplenig et al. 2017; Montemurro and Zanette 2011, just to name a few). The general idea of the measure is that the difference of entropy between the original version and the distorted version is related to the information expressed by the morphology of the language.

In this work, we follow Juola (1998) and use ‘compressibility’ as a measure of (the lack of) information. In other words, we expect worse compression ratios for a distorted text in comparison to its non-distorted version in a morphologically complex language. However, instead of the ratios of compressed file sizes used in some of the earlier work, we take the difference of compression ratios between the original and the distorted text as the measure of complexity (similar to Koplenig et al. 2017 and Bentz et al. 2016). Crucially, we replace every word type in the corpus with the same random sequence of equal length. To preserve some of the phonological (more precisely, orthographic) information, we do not generate the random ‘words’ uniformly, but from a unigram language model of letters estimated from the corpus. Note that the measure depends on the units used for measuring entropy. As a result, it is not meaningful to use this method for comparing texts written with different writing systems.
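
The following sketch illustrates the idea; the choice of bz2 as the compressor and the whitespace tokenization are illustrative assumptions rather than the exact settings of our experiments:

```python
import bz2
import random

def compression_ratio(text):
    """Compressed size divided by original size; regular text compresses better."""
    raw = text.encode("utf-8")
    return len(bz2.compress(raw)) / len(raw)

def distort(text, seed=1):
    """Map every word type to a fixed random string of equal length, with letters
    drawn from a unigram letter model estimated from the corpus itself."""
    rng = random.Random(seed)
    words = text.split()
    letters = [c for w in words for c in w]  # empirical letter distribution
    mapping = {w: "".join(rng.choice(letters) for _ in w) for w in set(words)}
    return " ".join(mapping[w] for w in words)

def ws(text):
    """Difference of compression ratios, distorted minus original: larger values
    suggest more information carried by word-internal structure."""
    return compression_ratio(distort(text)) - compression_ratio(text)
```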

3.3 Word and lemma entropy (WH, LH)

The entropy of the word-frequency distribution has also been used as a measure of morphological complexity in the earlier literature. Bentz et al. (2016) motivate the measure as the average information content of a word in the text sample studied. Another interpretation of the score is based on the typical distributions of words observed in languages with varying morphological complexity. The word distributions are affected in two ways related to morphology. First, complex morphology creates many rare word forms, resulting in a longer tail of the word frequency distribution and, hence, less predictable words. Second, a morphologically ‘poor’ language typically uses more function words, resulting in more words with higher probabilities and, hence, high predictability and low entropy. The measure of word entropy we use is similar to the one used by Bentz et al. (2016). However, we use maximum-likelihood estimates of the word probabilities in the entropy calculation.[3] Furthermore, similar to the other metrics, we calculate the word entropy on a fixed-sized sample for all languages to remove potential effects of the sample size.

The interpretation offered above for the word entropy suggests two separate effects. Since we work with a data set including lemma annotations, we also calculate the lemma entropy, which should be less sensitive to the information packed into words on average, but should bring the effect of the large number of function words to the fore. The frequencies of content lemmas are expected to be relatively stable across languages with different morphological complexities. Since lemmatization strips the inflections from the words, the differences observed in lemma entropy across languages are likely to be due to the frequencies of function words, and to rich derivation and compounding. Since derivation and compounding increase the inventory of lemmas, languages with rich derivational morphology and compounding are expected to get higher LH scores. To our knowledge, the lemma entropy has not been considered in this form in the earlier literature. However, the ‘lexical predictability’ measure of Blache (2011) (the ratio of frequent lemmas to all lemmas) is related to our measure.
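
Both entropies reduce to the same maximum-likelihood estimate, applied to word tokens for WH and to lemma tokens for LH; a minimal sketch:

```python
import math
from collections import Counter

def ml_entropy(items):
    """Shannon entropy (bits) of the maximum-likelihood distribution
    estimated from a sequence of items (word tokens or lemma tokens)."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# wh = ml_entropy(word_tokens), lh = ml_entropy(lemma_tokens)
```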

3.4 Mean size of paradigm (MSP)

The mean size of paradigm is the number of word-form types divided by the number of lemma types. Our calculations follow Xanthos et al. (2011), who use MSP to relate morphological complexity to the development of morphology during first language acquisition. If the morphology of a language has a large number of paradigm cells (and if those complex paradigms are used in real-world language), then the MSP will be high. The same measure is termed ‘morphological variety’ by Blache (2011).
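
Given a token sample with lemma annotations, the computation itself is a one-liner (a sketch; the input representation is an assumption):

```python
def msp(word_tokens, lemma_tokens):
    """Mean size of paradigm: word-form types per lemma type."""
    return len(set(word_tokens)) / len(set(lemma_tokens))
```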

3.5 Inflectional synthesis (IS)

The index of synthesis of the verb (Comrie 1989) is a typological measure of morphological complexity, which is also used by Shosted (2006) for investigating the correlation between morphological and phonological complexity. Shosted (2006) uses the measure as extracted from grammar descriptions by Bickel and Nichols (2005). The measure is the number of inflectional features a verb can take. Here, we adapt it to a corpus-based approach. Our version is simply the maximum number of distinct inflectional features assigned to a lemma in the given sample. Some systematic differences are expected due to the linguistic coding in UD treebanks and the way Bickel and Nichols (2005) code morphological features. For example, Bickel and Nichols (2005) do not consider fused morphemes as separate categories, e.g., a fused tense/aspect/modality (TAM) marker is counted only once, while treebanks are likely to code each TAM dimension as a separate morphological feature.
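
A sketch of our corpus-based version follows; the (lemma, features) input representation and the pooling of feature names over all tokens of a lemma are our reading of the definition above:

```python
from collections import defaultdict

def inflectional_synthesis(tokens):
    """Maximum number of distinct inflectional feature names observed for any
    single lemma. `tokens` is an iterable of (lemma, feats) pairs, where
    `feats` is a dict of UD-style features, e.g. {"Tense": "Past"}."""
    feats_per_lemma = defaultdict(set)
    for lemma, feats in tokens:
        feats_per_lemma[lemma].update(feats)  # collect feature names per lemma
    return max((len(f) for f in feats_per_lemma.values()), default=0)
```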

3.6 Morphological feature entropy (MFH)

The inflectional synthesis measure described above counts morphological features only on verbs, a limitation likely due to the cost of collecting data from descriptive grammars for a large number of languages. Having annotated corpora for many languages allows a more direct estimate of the use of inflectional morphology in a language. To also include information on usage, we calculate the entropy of the feature–value pairs, similar to the WH and LH measures described above. Everything else being equal, the measure will be high for languages with many inflectional features. However, this measure is affected by language use. For example, a rarely used feature value, e.g., a rare/archaic case value, will not affect this measure as much as a set of uniformly used case values.
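
This reuses the entropy estimate given earlier, now over feature=value pairs pooled from all tokens (a sketch under the same assumed UD-style representation):

```python
import math
from collections import Counter

def mfh(token_feats):
    """Entropy (bits) of the distribution of feature=value pairs.
    `token_feats` is an iterable of per-token feature dicts,
    e.g. {"Case": "Nom", "Number": "Sing"}."""
    counts = Counter(f"{k}={v}" for feats in token_feats for k, v in feats.items())
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```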

3.7 Inflection accuracy (IA)

The inflection accuracy metric we use in this study is simply the accuracy of a machine learning model predicting the inflected word from its lemma and morphological features. The intuition is that if the language in question has a complex morphology, the accuracy is expected to be low. As a result, we report the negative inflection accuracy (indicated as –ia) in the results below.

Unlike the measures discussed above, inflection accuracy is expected to be high if the language has a rather regular, transparent inflection system, even if it utilizes a large number of inflections. In other words, this measure should be similar to the integrative complexity of Ackerman and Malouf (2013). Unlike Ackerman and Malouf (2013) and Cotterell et al. (2019), we estimate the inflection system from word tokens rather than word types, which results in incomplete paradigms in comparison to models trained on lexical data.

The inflection method used in this study is based on linear classifiers, which provide near-state-of-the-art accuracy with relatively small demands on computing power. A neural-network-based inflection system (common in recent SIGMORPHON shared tasks, e.g., Cotterell et al. 2018; McCarthy et al. 2019) may provide greater accuracy. However, our focus here is on the differences in inflection accuracy across languages rather than the overall success of the system in the inflection task, and there is no a priori reason to expect substantial differences when scores on different languages are compared. We use the freely available inflection implementation by Çöltekin (2019) in this study.

4 Data and experimental setup

4.1 Data

Our data consists of 63 treebanks from the Universal Dependencies (UD) project (Nivre et al. 2016). The set of treebanks was selected for the Workshop on Measuring Linguistic Complexity (MLC 2019).[4] The full list of treebanks, along with statistics, is provided in Table 2 in the appendix. Most treebanks are from the Indo-European language family (48 of the 63 treebanks). Fifteen languages are represented by multiple treebanks in the data set, which helps distinguish differences across languages from differences due to text types within the same language. Some treebanks/languages do not include morphological annotations, and the sizes of the treebanks are highly variable. The smallest treebank (Hungarian) has about 40 K tokens, while the largest (Czech PDT) consists of approximately 1.5 M tokens.

The UD POS tag inventory is used relatively consistently across languages: the number of POS tags used varies between 14 and 18. The morphological features in different treebanks are more varied, ranging between 2 and 29 feature labels.

Since some of our measures require morphological features, we exclude the treebanks without morphological features (notably the Japanese and Korean treebanks) from most of our analyses.[5] Furthermore, since some of the other measures depend on the writing system, we also exclude treebanks of languages with non-alphabetic writing systems, which excludes the Chinese treebank in the collection.

For testing the relevance of the measures to typological variables, we use a set of 28 typological variables related to morphology from the World Atlas of Language Structures (WALS; Dryer and Haspelmath 2013). The same set of features is also used by Bentz et al. (2016). Note that not all features are available for all languages in our sample. The list of features and their coverage is given in Table 3 in the appendix.

4.2 Experimental setup

As noted above, some of our measures depend on text size. For comparability, we calculate all measures on 20,000 tokens (approximately half of the smallest treebank) sampled randomly from the input treebank. Since some of the measures (e.g., WS) are sensitive to word order, our sampling process samples sentences randomly with replacement until the number of tokens reaches 20,000. For all measures except inflection accuracy, we repeat this process 100 times, and report the mean scores obtained over these random samples.[6]
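
A sketch of this sampling procedure (representing sentences as token lists is an assumption about the input format):

```python
import random

def sample_tokens(sentences, target=20000, seed=0):
    """Sample whole sentences with replacement until the sample reaches the
    target token count, preserving word order within sentences."""
    rng = random.Random(seed)
    sample, n_tokens = [], 0
    while n_tokens < target:
        sent = rng.choice(sentences)  # a sentence is a list of tokens
        sample.append(sent)
        n_tokens += len(sent)
    return sample

# e.g., mean score over 100 samples:
# scores = [measure(sample_tokens(sents, seed=i)) for i in range(100)]
```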

The inflection accuracy is measured on the same data set: the model is trained and tested on inflection tables extracted from a single 20,000-token sample randomly drawn from each treebank. For computational reasons, however, we do not repeat the process multiple times, but report the average score over cross-validation experiments. Specifically, the score we present is the best mean accuracy (exact match of the inflected word) obtained through 3-fold cross validation on this sample. The model is tuned for each language separately using a random search through the model parameters.

Since there is no gold standard for evaluating a complexity metric, we present all values graphically, which serves as an informal validation (a measure that places Finnish and Russian high on the scale while assigning lower complexities to English and Vietnamese is probably measuring something relevant to morphological complexity). The graphs also allow a visual inspection of differences between languages and between measures. We also present linear and rank-based correlation coefficients to quantify the relations between the individual measures.
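
With the per-treebank scores collected in a table, both correlation matrices reported later in Table 1 take one call each; in this sketch the random matrix merely stands in for the real scores:

```python
import numpy as np
import pandas as pd

measures = ["ttr", "ws", "wh", "lh", "msp", "is", "mfh", "-ia"]
scores = pd.DataFrame(np.random.rand(63, 8), columns=measures)  # placeholder data

pearson = scores.corr(method="pearson")    # linear coefficients (lower triangle)
spearman = scores.corr(method="spearman")  # rank coefficients (upper triangle)
```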

We further validate the measures by evaluating their relationship with the morphology-related features from WALS. Since WALS features are categorical, instead of assigning an ad hoc numeric score to a configuration of features, we use ridge regression (as implemented in scikit-learn; Pedregosa et al. 2011) to predict normalized complexity measures from the selected WALS features. If a complexity measure is relevant to one or more of the variables, the prediction error of the regression model will be low. To prevent overfitting, we tune and test the regression model using leave-one-out cross validation. For each measure, we present the reduction of average error (in comparison to a random baseline whose expected root-mean-squared error is 1 on a standardized variable) as a measure of relation to the WALS features.
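
A sketch of this evaluation with scikit-learn; the hyperparameter grid and the nested use of leave-one-out splits for tuning are our assumptions, not a full specification of the setup:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, LeaveOneOut, cross_val_predict

def wals_affinity(X, y):
    """Reduction of leave-one-out RMSE relative to a random baseline whose
    expected RMSE is 1.0 on the standardized measure. X holds encoded WALS
    features per language; y is a standardized complexity measure."""
    tuned = GridSearchCV(Ridge(), {"alpha": np.logspace(-3, 3, 13)},
                         cv=LeaveOneOut(), scoring="neg_mean_squared_error")
    pred = cross_val_predict(tuned, X, y, cv=LeaveOneOut())
    return 1.0 - np.sqrt(np.mean((pred - y) ** 2))
```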

To analyze whether the measures capture multiple dimensions, we perform dimensionality reduction using principal component analysis (PCA). The intuition here is that if the measures differ in what they measure, the explained variance should be shared among multiple principal components. Furthermore, if the lower-order principal components measure meaningful dimensions of morphological complexity, we expect them to indicate linguistically relevant differences between languages.
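
The analysis itself is standard (a sketch with scikit-learn; the random matrix stands in for the real treebank-by-measure score matrix):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

scores = np.random.rand(63, 8)  # placeholder for the treebank-by-measure matrix
pca = PCA().fit(StandardScaler().fit_transform(scores))
print(pca.explained_variance_ratio_)  # share of variance per principal component
```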

5 Results

Figure 1 presents the values of each complexity measure on the x-axis of the corresponding panel; the language/treebank codes displayed in each panel are sorted by the corresponding complexity score. Note that the points in these graphs represent averages over 100 bootstrap samples. The standard deviations of the scores obtained on the multiple samples are too small to be visible when plotted.

Figure 1: A visualization of all measures on all treebanks. Languages are sorted according to the measure value in each panel (higher points indicate higher complexity). The original scale of each measure is given under each panel. The vertical gray lines represent the mean value of the measure.

In general, all measures seem to show the expected trend: morphologically complex languages are generally at the top of all scales (mainly agglutinative languages like Finnish and Turkish, but also fusional ones like Russian and Latin). Similarly, languages like Vietnamese, English and Dutch are generally at the lower end of the scale. Figure 1 also includes scores for the treebanks with missing morphological annotations (Japanese and Korean treebanks). For the measures that do not require morphological annotations, these languages are placed close to the top of the scale. However, when morphological annotations are required, they are ranked at the bottom. Since the WS score is sensitive to the writing system, its ranking is meaningful only for languages with sufficiently similar writing systems.

Table 1 reports the correlations between the measures. The correlations are almost always positive, and quite strong for many pairs. In contrast to the findings of Cotterell et al. (2019), we do not observe any negative correlation between any of the measures and the negative inflection accuracy (which we expect to measure ‘integrative complexity’ to some extent). A potential reason for this is that our model is trained on inflection tables extracted from a corpus. Our inflection tables are sparse: infrequent paradigm cells are not present, and the paradigms of infrequent words are underrepresented in the input of our model. As a result, our model/measure is affected more by frequent forms, which are also expected to include more irregular forms.

Table 1:

Correlations between the measures. The lower triangular matrix reports linear (Pearson) correlation coefficients. The upper triangular matrix reports rank (Spearman) correlation coefficients. Darker shades indicate stronger correlation. All correlations, except the ones marked with an asterisk, are significant at p < 0.05.

To give an indication of the typological relevance of each measure, we present in Figure 2 the reduction of error in predicting each complexity measure from the WALS morphological variables (listed in Table 3). In general, all complexity measures seem to be related to the typological features. However, since not all morphology-related features in WALS indicate complexity, this is only an approximate indication of the validity of the measures. Furthermore, the WALS data has many missing features for the languages in our data set, reducing the value of the comparison even further. Nevertheless, the consistently positive effect of the features is a clear indication that there is a relation between the typological features and the measures evaluated in this study.

Table 2:

Information on the treebanks used in this study. Sizes are given in thousands of tokens.

id Language Treebank Size Family Notes
afr Afrikaans AfriBooms 49 Indo-European
ara Arabic PADT 282 Afroasiatic
bul Bulgarian BTB 156 Indo-European
cat Catalan AnCora 531 Indo-European
ces.cac Czech CAC 494 Indo-European
ces.fic Czech FicTree 167 Indo-European
ces.pdt Czech PDT 1,506 Indo-European
chu Old Church Slavonic PROIEL 57 Indo-European ancient
cmn Chinese GSD 123 Sino-Tibetan few features
dan Danish DDT 100 Indo-European
deu German GSD 292 Indo-European
ell Greek GDT 63 Indo-European
est Estonian EDT 434 Uralic
eng.ewt English EWT 254 Indo-European
eng.gum English GUM 80 Indo-European
eng.lin English LinES 82 Indo-European
eng.par English ParTUT 49 Indo-European
eus Basque BDT 121 Basque
fas Persian Seraji 152 Indo-European
fin.ftb Finnish FTB 159 Uralic
fin.tdt Finnish TDT 202 Uralic
fra.gsd French GSD 400 Indo-European
fra.seq French Sequoia 70 Indo-European
got Gothic PROIEL 55 Indo-European ancient
grc.per Ancient Greek Perseus 202 Indo-European ancient
grc.pro Ancient Greek PROIEL 214 Indo-European ancient
heb Hebrew HTB 161 Afroasiatic
hin Hindi HDTB 351 Indo-European
hrv Croatian SET 197 Indo-European
hun Hungarian Szeged 42 Uralic
ind Indonesian GSD 121 Austronesian
ita.isd Italian ISDT 298 Indo-European
ita.par Italian ParTUT 55 Indo-European
ita.pos Italian PoSTWITA 124 Indo-European
jpn Japanese GSD 184 Japonic few features
kor.gsd Korean GSD 80 Koreanic few features
kor.kai Korean Kaist 350 Koreanic no features
lat.itt Latin ITTB 353 Indo-European ancient
lat.pro Latin PROIEL 199 Indo-European ancient
lav Latvian LVTB 152 Indo-European
nld.alp Dutch Alpino 228 Indo-European
nld.las Dutch LassySmall 98 Indo-European
nno Norwegian Nynorsk 301 Indo-European
nob Norwegian Bokmaal 310 Indo-European
pol.lfg Polish LFG 130 Indo-European
pol.sz Polish SZ 83 Indo-European
por Portuguese Bosque 227 Indo-European
ron.non Romanian Nonstandard 195 Indo-European
ron.rrt Romanian RRT 218 Indo-European
rus.gsd Russian GSD 99 Indo-European
rus.syn Russian SynTagRus 1,107 Indo-European
slk Slovak SNK 106 Indo-European
slv Slovenian SSJ 140 Indo-European
spa.anc Spanish AnCora 549 Indo-European
spa.gsd Spanish GSD 431 Indo-European
srp Serbian SET 86 Indo-European
swe.lin Swedish LinES 79 Indo-European
swe.tal Swedish Talbanken 96 Indo-European
tur Turkish IMST 57 Turkic
uig Uyghur UDT 40 Turkic
ukr Ukrainian IU 116 Indo-European
urd Urdu UDTB 138 Indo-European
vie Vietnamese VTB 43 Austroasiatic few features
Figure 2: The average reduction of prediction error when predicting the complexity measures from the WALS features. Higher values indicate stronger affinity with WALS features. The labels on the x-axis are the abbreviations of the measures we study: type-token ratio (ttr), mean size of paradigm (msp), information in word structure (ws), word entropy (wh), lemma entropy (lh), inflectional synthesis (is), morphological feature entropy (mfh), and inflection accuracy (ia).

Table 3:

WALS features used. The column ‘Coverage’ indicates the number of languages in our sample for which the feature is defined in WALS.

Feature ID Description Coverage
22A Inflectional Synthesis of the Verb 18
26A Prefixing vs. Suffixing in Inflectional Morphology 34
27A Reduplication 22
28A Case Syncretism 19
29A Syncretism in Verbal Person/Number Marking 19
30A Number of Genders 18
33A Coding of Nominal Plurality 35
34A Occurrence of Nominal Plurality 23
37A Definite Articles 31
38A Indefinite Articles 29
49A Number of Cases 29
51A Position of Case Affixes 35
57A Position of Pronominal Possessive Affixes 27
59A Possessive Classification 18
65A Perfective/Imperfective Aspect 25
66A The Past Tense 25
67A The Future Tense 25
69A Position of Tense-Aspect Affixes 34
70A The Morphological Imperative 33
73A The Optative 25
74A Situational Possibility 28
75A Epistemic Possibility 28
78A Coding of Evidentiality 25
94A Order of Adverbial Subordinator and Clause 30
101A Expression of Pronominal Subjects 32
102A Verbal Person Marking 22
111A Nonperiphrastic Causative Constructions 19
112A Negative Morphemes 35

Despite overall positive correlations, in some cases the correlations are low. Furthermore, although correlated measures are reassuring, if we are actually measuring different aspects of morphological complexity, meaningful differences between the measures are more interesting. First, to understand whether we are measuring multiple underlying constructs, we perform a PCA. The PCA indicates that 92.62% of the variation in the results is explained by a single component, suggesting a single underlying dimension. To probe this further, we plot the first two principal components in Figure 3. The two dimensions together explain 97.88% of the variation in the data.

Figure 3: The first two dimensions after PCA transformation. The first component (x-axis) explains 92.62% of the variation in the data, while the first two components together explain 97.88% of the variation. Note that scales differ: the y-axis is stretched for clarity.

The plot in Figure 3 clearly shows that the first PCA component is in agreement with the intuitive notion of morphological complexity. The languages are ranked according to expectations without many exceptions. Languages like Basque, Finnish, Latvian and Turkish, from different language families but all known for their morphological complexity, are on the higher part of the scale of the first principal component. On the very low end of the scale, not surprisingly, is Vietnamese, followed by Dutch and Afrikaans. The middle part of the scale is difficult to interpret. However, we still see close clusters of languages that belong to the same language family and/or languages with similar morphological complexity. The figure also shows that multiple treebanks from the same language are placed very close to each other. Furthermore, the relation of the first principal component to the WALS features is also strong (0.28 according to the scale in Figure 2). However, the second component does not provide any discernible indication of morphological complexity. The treebanks from the same or similar languages are often far apart in the second dimension, and we do not observe any clear patterns that can be generalized from the second principal component. Looking further into the principal components does not reveal any pattern in lower-ranked components either (the rankings of languages according to all PCA dimensions are provided in Figure 4). In summary, the measures in this study seem to indicate a single underlying variable.

Figure 4: Visualization of all principal components. Languages are sorted according to the measure value in each panel (higher points indicate higher complexity). The original scale of each measure is given under each panel. The vertical gray lines represent the mean value of the measure.

6 Concluding remarks

We presented an analysis of eight corpus-based measures of morphological complexity. As shown in Figure 1, noise and minor exceptions aside, all measures capture a sense of morphological complexity: languages known to be morphologically complex are placed high on all scales, while languages that are morphologically less complex are ranked low. Furthermore, treebanks that belong to the same or closely related languages obtain similar scores. There are also some interesting differences to be observed here. For example, since the measures WH and LH are sensitive to derivational morphology and compounding, languages like Vietnamese and English, which otherwise get lower scores, obtain moderately higher scores on these metrics.

We also presented correlations between the scores (Table 1). The scores generally correlate positively. This is reassuring, since they are all intended to measure the same construct. However, an interesting finding here is the positive correlation between all measures and the negative inflection accuracy. Since inflection accuracy is a measure of ‘difficulty’, based on earlier findings in the literature (Cotterell et al. 2019) we expected a negative correlation between the enumerative complexity measures and the negative inflection accuracy. We note, however, that the data sets on which the systems are trained are different. The inflection tables we use are extracted from corpora. They are incomplete, and reflect language use, in contrast to inflection tables that include theoretically possible, but rarely attested, word forms. As a result, our model is trained and tested on more frequent forms, which are also more likely to be formed by irregular, unpredictable morphological processes. These frequent, irregular forms are likely to be overwhelmed by the many regular forms in the complete inflection table of a language with high enumerative complexity. In a table extracted from a relatively small corpus of the language, the irregular forms are expected to have a larger presence. Hence, when the model is tested on frequent words, the difficult forms are similarly distributed regardless of the enumerative complexity. When tested on full inflection tables, the effect of frequent, irregular forms diminishes for languages with high enumerative complexity. In other words, perhaps languages get similar budgets for morphological irregularities (hence integrative complexity), but the size of the inflection tables determines the impact of these irregularities on the average predictability of inflected word forms from morphological features. Even though this explanation needs further investigation to be confirmed, it is also supported by the fact that MSP is the measure with the highest correlation with the negative inflection accuracy.

Finally, despite the expectation of multiple dimensions of morphological complexity, our dimensionality reduction experiments indicate that the measures analyzed here are likely to measure a single underlying dimension. This is not to say that morphological complexity is uni-dimensional. The findings indicate that if the measures at hand are measuring different linguistic dimensions, these dimensions are highly (positively or negatively) correlated with each other. A practical side effect of the dimensionality reduction, however, is that the resulting single dimension seems to reflect intuitions about morphological complexity better than the individual measures, placing the same or related languages much closer to each other. This also suggests that, when available, combining multiple measures provides a more stable and reliable indication of morphological complexity compared to a single measure.


Corresponding author: Çağrı Çöltekin, Department of Linguistics, University of Tübingen, Tübingen, Germany.

Appendix: Data and additional visualizations

References

Ackerman, Farrell, James P. Blevins & Robert Malouf. 2009. Parts and wholes: Implicative patterns in inflectional paradigms. In James P. Blevins & Juliette Blevins (eds.), Analogy in grammar: Form and acquisition (Oxford Linguistics), 54–82. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199547548.003.0003.

Ackerman, Farrell & Robert Malouf. 2013. Morphological organization: The low conditional entropy conjecture. Language 89(3). 429–464. https://doi.org/10.1353/lan.2013.0054.

Anderson, Stephen R. 2015. Dimensions of morphological complexity. In M. Baerman, D. Brown & G. G. Corbett (eds.), Understanding and measuring morphological complexity, 11–26. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198723769.003.0002.

Andrason, Alexander. 2014. Language complexity: An insight from complex system theory. International Journal of Language and Linguistics 2(2). 74–89.

Bentz, Christian, Dimitrios Alikaniotis, Michael Cysouw & Ramon Ferrer-i-Cancho. 2017. The entropy of words: Learnability and expressivity across more than 1000 languages. Entropy 19(6). 275. https://doi.org/10.3390/e19060275.

Bentz, Christian, Tatyana Ruzsics, Alexander Koplenig & Tanja Samardžić. 2016. A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee.

Berdicevskis, Aleksandrs. 2018. Do non-native speakers create a pressure towards simplification? Corpus evidence. In The evolution of language: Proceedings of the 13th international conference (EVOLANGXII), 41–43. Torun, Poland: The Nicolaus Copernicus University Press. https://doi.org/10.12775/3991-1.007.

Berdicevskis, Aleksandrs, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama & Christian Bentz. 2018. Using Universal Dependencies in cross-linguistic complexity research. In Proceedings of the second workshop on universal dependencies (UDW 2018), 8–17. Brussels, Belgium: Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-6002.

Bickel, Balthasar & Johanna Nichols. 2005. Inflectional synthesis of the verb. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), The world atlas of language structures, 94–97. Oxford: Oxford University Press.

Blache, Philippe. 2011. A computational model for linguistic complexity. In Proceedings of the 1st international work-conference on linguistics, biology and computer science: Interplays, 155–167. Tarragona: IOS Press.

Bozic, Mirjana, William D. Marslen-Wilson, Emmanuel A. Stamatakis, Matthew H. Davis & Lorraine K. Tyler. 2007. Differentiating morphology, form, and meaning: Neural correlates of morphological complexity. Journal of Cognitive Neuroscience 19(9). 1464–1475. https://doi.org/10.1162/jocn.2007.19.9.1464.

Brezina, Vaclav & Gabriele Pallotti. 2019. Morphological complexity in written L2 texts. Second Language Research 35(1). 99–119. https://doi.org/10.1177/0267658316643125.

Bulté, Bram & Alex Housen. 2012. Defining and operationalising L2 complexity. In Alex Housen, Folkert Kuiken & Ineke Vedder (eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA, 23–46. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/lllt.32.02bul.

Carstairs-McCarthy, Andrew. 2010. The evolution of morphology (Oxford Linguistics). Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199541119.013.0047.

Čech, Radek & Miroslav Kubát. 2018. Morphological richness of text. In Masako Fidler & Václav Cvrček (eds.), Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature. https://doi.org/10.1007/978-3-319-98017-1_4.

Chen, Xiaobin & Detmar Meurers. 2019. Linking text readability and learner proficiency using linguistic complexity feature vector distance. Computer Assisted Language Learning 32(4). 418–447. https://doi.org/10.1080/09588221.2018.1527358.

Çöltekin, Çağrı. 2019. Cross-lingual morphological inflection with explicit alignment. In Proceedings of the 16th workshop on computational research in phonetics, phonology, and morphology, 71–79. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/W19-4209.

Çöltekin, Çağrı & Taraka Rama. 2018. Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs Berdicevskis & Christian Bentz (eds.), Proceedings of the first workshop on measuring language complexity, 1–7. Torun, Poland.

Comrie, Bernard. 1989. Language universals and linguistic typology: Syntax and morphology. Chicago: University of Chicago Press.

Cotterell, Ryan, Christo Kirov, Mans Hulden & Jason Eisner. 2019. On the complexity and typology of inflectional morphological systems. Transactions of the Association for Computational Linguistics 7. 327–342. https://doi.org/10.1162/tacl_a_00271.

Cotterell, Ryan, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sebastian Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner & Mans Hulden. 2018. The CoNLL-SIGMORPHON 2018 shared task: Universal morphological reinflection. In Proceedings of the CoNLL-SIGMORPHON 2018 shared task: Universal morphological reinflection, 1–27. Brussels: Association for Computational Linguistics.

De Clercq, Bastien & Alex Housen. 2019. The development of morphological complexity: A cross-linguistic study of L2 French and English. Second Language Research 35(1). 71–97. https://doi.org/10.1177/0267658316674506.

Dahl, Östen. 2004. The growth and maintenance of linguistic complexity (Studies in Language Companion Series). Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.71.

Deutscher, Guy. 2009. “Overall complexity”: A wild goose chase? In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 243–251. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780199545216.003.0017.

DuBay, William H. 2004. The principles of readability. Costa Mesa, CA: Impact Information.

Ehret, Katharina. 2014. Kolmogorov complexity of morphs and constructions in English. Linguistic Issues in Language Technology (LiLT) 11(2). 43–71. https://doi.org/10.33011/lilt.v11i.1363.

Ehret, Katharina & Benedikt Szmrecsanyi. 2016. An information-theoretic approach to assess linguistic complexity. In Raffaela Baechler & Guido Seiler (eds.), Complexity, isolation, and variation, vol. 57, 71–94. Berlin & Boston: De Gruyter. https://doi.org/10.1515/9783110348965-004.

Ehret, Katharina & Benedikt Szmrecsanyi. 2019. Compressing learner language: An information-theoretic measure of complexity in SLA production data. Second Language Research 35(1). 23–45. https://doi.org/10.1177/0267658316669559.

Hockett, Charles F. 1958. A course in modern linguistics. New York: Macmillan.

Jarvis, Scott. 2002. Short texts, best-fitting curves and new measures of lexical diversity. Language Testing 19(1). 57–84. https://doi.org/10.1191/0265532202lt220oa.

Juola, Patrick. 1998. Measuring linguistic complexity: The morphological tier. Journal of Quantitative Linguistics 5(3). 206–213. https://doi.org/10.1080/09296179808590128.

Juola, Patrick. 2008. Assessing linguistic complexity. In M. Miestamo, K. Sinnemäki & F. Karlsson (eds.), Language complexity: Typology, contact, change, 104–123. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.94.07juo.

Kettunen, Kimmo. 2014. Can type-token ratio be used to show morphological complexity of languages? Journal of Quantitative Linguistics 21(3). 223–245. https://doi.org/10.1080/09296174.2014.911506.

Koplenig, Alexander, Peter Meyer, Sascha Wolfer & Carolin Mueller-Spitzer. 2017. The statistical trade-off between word order and word structure: Large-scale evidence for the principle of least effort. PLoS One 12(3). e0173614. https://doi.org/10.1371/journal.pone.0173614.

Kusters, Wouter. 2003. Linguistic complexity: The influence of social change on verbal inflection. Utrecht: Netherlands Graduate School of Linguistics.

Dryer, Matthew S. & Martin Haspelmath (eds.). 2013. WALS online. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://wals.info.

McCarthy, Arya D., Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sebastian J. Mielke, Jeffrey Heinz, Ryan Cotterell & Mans Hulden. 2019. The SIGMORPHON 2019 shared task: Morphological analysis in context and cross-lingual transfer for inflection. In Proceedings of the 16th workshop on computational research in phonetics, phonology, and morphology, 229–244. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/W19-4226.

McCarthy, Philip M. & Scott Jarvis. 2010. MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods 42(2). 381–392. https://doi.org/10.3758/brm.42.2.381.

McWhorter, John H. 2001. The world’s simplest grammars are creole grammars. Linguistic Typology 5(2–3). 125–166. https://doi.org/10.1515/lity.2001.001.

Mehravari, Alison S., Darren Tanner, Emma K. Wampler, Geoffrey D. Valentine & Lee Osterhout. 2015. Effects of grammaticality and morphological complexity on the P600 event-related potential component. PLoS One 10(10). e0140850. https://doi.org/10.1371/journal.pone.0140850.

Michel, Marije, Akira Murakami, Theodora Alexopoulou & Detmar Meurers. 2019. Effects of task type on morphosyntactic complexity across proficiency: Evidence from a large learner corpus of A1 to C2 writings. Instructed Second Language Acquisition 3(2). 124–152. https://doi.org/10.1558/isla.38248.

Miestamo, Matti, Kaius Sinnemäki & Fred Karlsson. 2008. Language complexity: Typology, contact, change (Studies in Language Companion Series). Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.94.

Montemurro, Marcelo A. & Damián H. Zanette. 2011. Universal entropy of word ordering across linguistic families. PLoS One 6(5). e19875. https://doi.org/10.1371/journal.pone.0019875.

Newmeyer, Frederick J. & Laurel B. Preston. 2014. Measuring grammatical complexity (Oxford Linguistics). Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199685301.001.0001.

Nivre, Joakim, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty & Daniel Zeman. 2016. Universal Dependencies v1: A multilingual treebank collection. In Proceedings of the tenth international conference on language resources and evaluation (LREC’16), 23–28. Portorož, Slovenia: European Language Resources Association (ELRA).

Nivre, Joakim, Mitchell Abrams, Željko Agić, et al. 2018. Universal Dependencies 2.3. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, Prague: Universal Dependencies Consortium.

Oh, Yoon Mi, François Pellegrino, Egidio Marsico & Christophe Coupé. 2013. A quantitative and typological approach to correlating linguistic complexity. In Proceedings of the 5th conference on quantitative investigations in theoretical linguistics (QITL-5). Leuven: University of Leuven.

Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot & Édouard Duchesnay. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12. 2825–2830.

Sagot, Benoît. 2013. Comparing complexity measures. In Computational approaches to morphological complexity. Paris: Surrey Morphology Group.

Sagot, Benoît & Géraldine Walther. 2011. Non-canonical inflection: Data, formalisation and complexity measures. In Proceedings of the international workshop on systems and frameworks for computational morphology, 23–45. Zurich, Switzerland: Springer. https://doi.org/10.1007/978-3-642-23138-4_3.

Sampson, Geoffrey, David Gil & Peter Trudgill (eds.). 2009. Language complexity as an evolving variable, vol. 13. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780199545216.001.0001.

Shosted, Ryan K. 2006. Correlating complexity: A typological approach. Linguistic Typology 10(1). 1–40. https://doi.org/10.1515/LINGTY.2006.001.

van der Slik, Frans, Roeland van Hout & Job Schepens. 2019. The role of morphological complexity in predicting the learnability of an additional language: The case of La (additional language) Dutch. Second Language Research 35(1). 47–70. https://doi.org/10.1177/0267658317691322.

Stump, Gregory. 2017. The nature and dimensions of complexity in morphology. Annual Review of Linguistics 3. 65–83. https://doi.org/10.1146/annurev-linguistics-011415-040752.

Szmrecsanyi, Benedikt & Bernd Kortmann. 2012. Introduction: Linguistic complexity: Second language acquisition, indigenization, contact. In Bernd Kortmann & Benedikt Szmrecsanyi (eds.), Linguistic complexity: Second language acquisition, indigenization, contact, 6–34. Berlin: De Gruyter. https://doi.org/10.1515/9783110229226.6.

Tsarfaty, Reut, Djamé Seddah, Sandra Kübler & Joakim Nivre. 2013. Parsing morphologically rich languages: Introduction to the special issue. Computational Linguistics 39(1). 15–22. https://doi.org/10.1162/COLI_a_00133.

Vainio, Seppo, Anneli Pajunen & Jukka Hyönä. 2014. L1 and L2 word recognition in Finnish: Examining L1 effects on L2 processing of morphological complexity and morphophonological transparency. Studies in Second Language Acquisition 36(1). 133–162. https://doi.org/10.1017/s0272263113000478.

Weiss, Zarah & Detmar Meurers. 2019. Analyzing linguistic complexity and accuracy in academic language development of German across elementary and secondary school. In Proceedings of the fourteenth workshop on innovative use of NLP for building educational applications, 380–393. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/W19-4440.

Xanthos, Aris, Sabine Laaha, Steven Gillis, Ursula Stephany, Ayhan Aksu-Koç, Anastasia Christofidou, Natalia Gagarina, Gordana Hrzica, F. Nihan Ketrez, Marianne Kilani-Schoch, Katharina Korecky-Kröll, Melita Kovačević, Klaus Laalo, Marijan Palmović, Barbara Pfeiler, Maria D. Voeikova & Wolfgang U. Dressler. 2011. On the role of morphological richness in the early development of noun and verb inflection. First Language 31(4). 461–479. https://doi.org/10.1177/0142723711409976.

Yoon, Hyung-Jo. 2017. Linguistic complexity in L2 writing revisited: Issues of topic, proficiency, and construct multidimensionality. System 66. 130–141. https://doi.org/10.1016/j.system.2017.03.007.

Received: 2021-01-16
Accepted: 2021-09-23
Published Online: 2022-09-22

© 2022 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
