Tuning in to non-adjacencies: Exposure to learnable patterns supports discovering otherwise difficult structures
Introduction
Non-adjacent dependencies are ubiquitous in language. For instance, English marks number agreement (e.g., The linguists at the conference are restless) and aspect (e.g., People are learning all of the time) via inflectional morphemes that establish dependencies between distal items. Despite their prevalence in natural languages, non-adjacent dependencies in artificial grammar learning experiments are notoriously difficult to learn, both for adults and infants (e.g., Gómez, 2002; Gonzalez-Gomez & Nazzi, 2012; Newport & Aslin, 2004; Romberg & Saffran, 2013; see Wilson et al., 2018 for a recent review). Given their centrality to language structure, how do we learn non-adjacent dependencies that are not easily detected in speech?
Previous research suggests that the input can be structured to support learners' discovery of non-adjacent regularities. For example, learning can be facilitated simply by increasing exposure (Romberg & Saffran, 2013; Vuong, Meyer, & Christiansen, 2016); additional experience may allow learners more opportunity to uncover patterns. Learning can also be improved when the non-adjacent dependencies are paired with additional cues that highlight their relatedness (e.g., Onnis, Monaghan, Richmond, & Chater, 2005; van den Bos, Christiansen, & Misyak, 2012). For instance, Onnis et al. (2005) found that learners were better able to learn dependencies between phonologically similar syllables, and Newport and Aslin (2004) showed that participants could successfully detect non-adjacent patterns among sets of consonants or vowels, but failed to discover non-adjacent patterns among syllables. Thus, non-adjacent relations seem to be more easily tracked when dependent elements are perceived as similar. Perceptual cues that make relevant items more salient, such as prosody or pauses that mark boundaries in the speech stream, can also boost learning (e.g., Grama, Kerkhoff, & Wijnen, 2016; Peña, Bonatti, Nespor, & Mehler, 2002; Wang & Mintz, 2018), demonstrating that non-adjacent relations can be highlighted in numerous ways.
A particularly powerful factor that can highlight the presence of non-adjacent dependencies is the variability surrounding to-be-learned patterns (Gómez, 2002; Gómez & Maye, 2005). In a classic study by Gómez (2002), participants' learning of non-adjacent regularities improved significantly as the number of unique items that appeared between the dependent elements increased. Variability in the intervening elements affects learning because it can focus attention toward invariant, and hence reliable, structure in the input. With highly variable intermediate elements, learners are better able to detect the reliable associations between non-sequential items, suggesting that surrounding information can help direct learners' attention to non-adjacent regularities.
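The logic of the variability effect can be illustrated with a small calculation. The sketch below is not the procedure of any cited study: it assumes a toy a-X-b language in which each frame (a, b) is crossed with every intervening item X, and it compares the adjacent transitional probability P(X | a) with the non-adjacent probability P(b | a). The frame words other than pel and rud, the placeholder middle items, the set sizes, and the function names are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product


def build_language(frames, middles):
    """Return every a-X-b string: each frame (a, b) crossed with each middle X."""
    return [(a, x, b) for (a, b), x in product(frames, middles)]


def transitional_probability(strings, first_pos, second_pos):
    """Mean P(second | first) over the observed (first, second) pairs."""
    pair_counts = Counter((s[first_pos], s[second_pos]) for s in strings)
    first_counts = Counter(s[first_pos] for s in strings)
    probs = [n / first_counts[first] for (first, _), n in pair_counts.items()]
    return sum(probs) / len(probs)


# Hypothetical frames, assumed for illustration only.
frames = [("pel", "rud"), ("vot", "jic"), ("dak", "tood")]

# Illustrative set sizes for the intervening element (an assumption).
for set_size in (2, 24):
    middles = [f"x{i}" for i in range(set_size)]  # placeholder middle items
    strings = build_language(frames, middles)
    adjacent = transitional_probability(strings, 0, 1)     # P(X | a)
    nonadjacent = transitional_probability(strings, 0, 2)  # P(b | a)
    print(set_size, round(adjacent, 3), nonadjacent)
```

Under these assumptions, the adjacent probability falls from 1/2 to 1/24 as the set of intervening items grows, while the non-adjacent probability stays at 1. The a-b relation therefore becomes the only reliable statistic in the input, which is one way to read the claim that variability directs attention toward invariant structure.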
Learners can also build on past experience with related structures to detect the presence of non-adjacent structures. Previous experience can shape learners' expectations and change the statistical relations that they track (e.g., LaCross, 2015; Lew-Williams & Saffran, 2012; Potter, Wang, & Saffran, 2017; Wang, Zevin, & Mintz, 2017). For example, experiencing some word categories in adjacent structures subsequently helps learners recognize non-adjacent relations between the same words (Lany & Gómez, 2008; Lany, Gómez, & Gerken, 2007). Following experience with associations that are easily learnable, learners may be better able to detect more complex relations (e.g., Elman, 1990; Lai & Poletiek, 2011). Existing native language knowledge can have a particularly powerful impact on the expectations learners form about the structure of upcoming language input. In a recent study, Wang et al. (2017) showed that recent experience with consistent rhythmic patterns embedded in native language structures changes what patterns learners subsequently infer from novel materials. Participants learned non-adjacent dependencies embedded in an artificial language after they were exposed to English phrases that had a matched four-word structure, but not when the two structures were in conflict. This finding is consistent with evidence that infants are better able to discover regularities with a structure that matches their prior experience (Lew-Williams & Saffran, 2012). Together, these studies suggest that learners can use prior experience to improve their learning of non-adjacent dependencies by building on past learning about specific items in simpler contexts or by drawing on knowledge about non-adjacent structures from their first language. However, this leaves open the question of whether learners can discover non-adjacent dependencies de novo when the relevant dependencies only appear in non-adjacent relations. 
When acquiring a novel language, learners must learn new distal grammatical relations that are rarely, if ever, encountered in simpler forms. How might learners break into learning new non-adjacencies?
In the current work, we investigated whether past distributional learning itself may offer a solution to the problem of discovering new non-adjacencies. This explanation focuses on the role of past learning in guiding future learning. If the input is initially structured to support successful non-adjacent dependency learning, this could lead learners to expect to encounter non-adjacent structure in the language. These expectations could subsequently allow them to extract non-adjacent patterns, even in contexts when learning would otherwise be difficult. To test this proposal, we designed a series of experiments in which learners could build on past distributional learning to succeed when faced with a more difficult context for detecting non-adjacent structure. We hypothesized that prior experience with non-adjacent dependencies in the presence of high variability (a context known to support learning; Gómez, 2002; Gómez & Maye, 2005; Plante et al., 2014) would facilitate acquisition of a new set of non-adjacent associations among novel words. In three studies, we tested our hypothesis that experience with one set of non-adjacent dependencies presented in more learnable circumstances would subsequently facilitate learning of a new set of non-adjacent dependencies that learners otherwise struggle to detect. Together, these studies explore how pattern learning in the present builds on pattern learning from the past by testing whether prior experience with readily learnable structures allows difficult linguistic structures to be learned more easily.
Experiment 1
Our first study tested whether being pre-exposed to non-adjacent dependencies in a learnable context would aid participants in recognizing novel non-adjacent regularities that are difficult to learn. Learners were tested for their ability to discover the association between the first and third word in three-word sequences (e.g., pel-kicey-rud). One group of learners was pre-exposed to a set of artificial sentences that we expected to be learnable based on past work (Gómez, 2002): consistent
Experiment 2
In Experiment 2, we conducted a replication of Experiment 1 with an additional condition (No Pre-Exposure Condition) in which participants received no pre-exposure experience. We predicted a linear effect across the three conditions, such that performance would be strongest in the Learnable Condition, intermediate in the No Pre-Exposure Condition, and weakest in the Non-Learnable Condition, with significant differences between all three conditions. The linear hypothesis and analytic approach
Experiment 3
In Experiment 3, we tested the effect of exposure to learnable non-adjacent dependencies against a new condition (Unstructured Pre-Exposure Condition) in which total language exposure was equated with materials presented in the Learnable Pre-Exposure Condition. Crucially, the Unstructured Pre-Exposure Condition included a pre-exposure phase consisting of the same words as the pre-exposure in the Learnable Pre-Exposure Condition. However, the words occurred individually in random order, instead
General discussion
This set of studies investigated a proposal for how distributional learning might build on itself, such that learners develop expectations about linguistic structures that allow them to successfully learn otherwise difficult patterns. When learners were exposed to patterns with learnable non-adjacent dependencies, they were subsequently more successful at learning novel non-adjacent dependencies than if their previous exposure did not include learnable non-adjacent patterns. We tested three
Author contributions
All authors developed the study concept and design. Data collection and data analysis were performed by MZ. All authors contributed to the interpretation of the data and wrote the manuscript.
Acknowledgements
This research was supported by NSF-GRFP DGE-1747503 awarded to MZ, and grants from the NICHD to JRS (R37HD037466), CEP (F32 HD093139), and the Waisman Center (U54 HD090256). We thank Jill Lany for helpful comments on an earlier draft, and Emily Cummings, Grace McCune, Lauren Silber, and Amy So for aiding in data collection.
References (58)
Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language (2008).
Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language (2013).
Statistical learning of probabilistic nonadjacent dependencies by multiple-cue integration. Journal of Memory and Language (2012).
Probabilistic models of language processing and acquisition. Trends in Cognitive Sciences (2006).
Finding structure in time. Cognitive Science (1990).
Three ideal observer models for rule learning in simple languages. Cognition (2011).
Simultaneous segmentation and generalisation of non-adjacent dependencies from continuous speech. Cognition (2016).
Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language (2008).
The impact of adjacent-dependencies and staged-input on the learnability of center-embedded hierarchical structures. Cognition (2011).
All words are not created equal: Expectations about word length guide infant statistical learning. Cognition (2012).
Learning at a distance I. Statistical learning of non-adjacent dependencies. Cognitive Psychology.
Phonology impacts segmentation in online speech processing. Journal of Memory and Language.
Learning the unlearnable: The role of missing evidence. Cognition.
Linguistic entrenchment: Prior knowledge impacts statistical learning performance. Cognition.
Contrast tests of interaction hypotheses. Psychological Methods.
Fitting linear mixed-effects models using lme4. Journal of Statistical Software.
Second language acquisition from a functionalist perspective: Pragmatic, semantic, and perceptual strategies.
Statistical learning within and between modalities: Pitting abstract against stimulus-specific representations. Psychological Science.
Statistical learning of adjacent and non-adjacent dependencies among non-linguistic sounds. Psychonomic Bulletin and Review.
Infants avoid “labouring in vain” by attending more to learnable than unlearnable linguistic patterns. Developmental Science.
Learning multiple rules simultaneously: Affixes are more salient than reduplications. Memory and Cognition.
Variability and detection of invariant structure. Psychological Science.
The developmental trajectory of nonadjacent dependency learning. Infancy.
Acquisition of nonadjacent phonological dependencies in the native language during the first year of life. Infancy.
Gleaning structure from sound: The role of prosodic contrast in learning non-adjacent dependencies. Journal of Psycholinguistic Research.
Grammatical gender in L2: A production or a real-time processing problem? Second Language Research.
The gender marking effect in spoken word recognition: The case of bilinguals. Memory and Cognition.
The novel object and unusual name (NOUN) database: A collection of novel images for use in experimental research. Behavior Research Methods.
Exemplar variability facilitates rapid learning of an otherwise unlearnable grammar by individuals with language-based learning disability. Journal of Speech, Language, and Hearing Research.