What memory representation is acquired during nonword speech production learning? The influence of stimulus features and training modality on nonword encoding

Abstract: The purpose of this research was to investigate memory representations related to speech processing. Psycholinguistic and speech motor control theorists have hypothesized a variety of fundamental memory representations, such as syllables or phonemes, which may be learned during speech acquisition tasks. Yet, it remains unclear which fundamental representations are encoded and retrieved during learning and generalization tasks. Two experiments were conducted using a motor learning paradigm to investigate if representations for syllables and phonemes were acquired during a nonword repetition task. Additionally, different training modalities were implemented across studies to examine if training modality influenced memory encoding for nonword stimuli. Results suggest multiple representations may be acquired during training regardless of training modality; however, the underlying memory representations learned during training may be less abstract than current models hypothesize.


PUBLIC INTEREST STATEMENT
Speech-language pathologists focus on improving the communication of individuals who suffer from communication disorders. Treatments include training patients to speak in new and novel ways. However, it remains unclear what information is learned during speech therapy, and how this information can be used at a later time when therapeutic support is not available (e.g., in the patient's home).
The focus of this research was to examine two specific speech variables, i.e., syllables and sounds, which were predicted to be fundamental to speech learning. Results indicated more variables were learned during speech training than originally predicted. These results also suggest individuals process speech stimuli differently during learning than has previously been theorized.
Generally, these results provide a theoretical basis for investigating new speech variables during motor learning, as well as updating theoretical models of speech production. In the long-term, such investigations may have clinical implications for speech therapy practices.

Subjects: Memory; Motor Skills; Communication Disorders
Keywords: speech; motor learning; memory

There is a clinical and theoretical need to understand the features of stimuli that are encoded as memory representations during speech training, as well as the strategies or stimuli that enhance overall memory retrieval of these representations. Current speech processing models have described speech representations as connections (e.g., Dell, 1986), motor programs (e.g., Schmidt, 1975, 2003), and gestures (e.g., Browman & Goldstein, 1989, 1992). Each speech unit of analysis, e.g., the phoneme, may be viewed through a motor (e.g., phonetic) or language (e.g., phonologic) lens. For instance, in many linguistic models, phoneme representations detail the features associated with a given sound (e.g., place, manner, or voicing), as well as the phonotactic rules relating to the phoneme's position in words (e.g., Dell, 1986). However, phoneme representations in motor-based models of speech production may focus on the translation of an individual sound into spatial and temporal motor commands within the vocal tract (e.g., Van der Merwe, 2011). Each model assumes a memory representation is being retrieved from long-term memory (e.g., phoneme); however, the features associated with the representation vary widely based on the theoretical lens.
These interpretations have affected how speech researchers and speech-language pathologists have viewed different disorders. For instance, apraxia of speech (AOS) is considered a motor programming speech disorder despite historical debate on whether the deficits are truly phonetic-motor (e.g., M. R. McNeil, Robin, & Schmidt, 2009; Van der Merwe, 1997), phonologic violations (e.g., Dogil & Mayer, 1998; Mayer, 1995), working memory limitations (e.g., Clark & Robin, 1998; Rogers & Storkel, 1999; Whiteside & Varley, 1998), or some combination (e.g., Code, 1998). The model to which one subscribes will influence the focus and delivery of treatment. AOS treatment may consist of a phonetic approach utilizing principles of motor learning or a phonologic approach where stimuli are selected based on simple and complex phonotactic constraints. Determining the underlying variables that are encoded during learning may provide insight into the stimulus characteristics that are most salient in speech learning, as well as to how phonetic and phonologic levels of speech processing interact.
For the current work, a broad information-processing model is used to describe memory processing of speech stimuli during motor learning, which incorporates both psycholinguistic and speech vantage points. This model, found in work by Schmidt (1988) and Schmidt and Wrisberg (2004), identifies four levels of sequential processing: cognitive input, response selection, response programming, and execution. Early in memory processing, i.e., cognitive input and response selection stages, memory representations are postulated to be abstract and devoid of context-specific information (e.g., Curtis, Rao, & D'Esposito, 2004; Miller & Ulrich, 1998; Schmidt & Wrisberg, 2004). Specification of the representation, e.g., parameterization of timing and sequence order, is processed during the motor programming stage, and then physiologically realized during the execution stage (e.g., Hulstijn & Van Galen, 1983; Klapp, 1995; Schmidt & Wrisberg, 2004).
Each stage of processing is also measured separately, providing insight into the changing memory representation and resultant motor behavior. Although reaction time measurements may be used to evaluate cognitive input, response selection, and response programming stages, different variables are associated with changes in early programming (e.g., response selection) compared with later response programming (Klapp, 1995, 2003). During early programming, reaction times may increase with response uncertainty (Klapp, 1995, 1996, 2003), such as varying the number of distinctive markers that distinguish between two alternative responses (Heuer, 1982; Rosenbaum, 1980, 1990). However, later in programming, reaction time measures are more sensitive to changes related to the context-specific elements of the movement that are programmed prior to execution, such as the number of elements within a sequence (e.g., Anson, 1982; Deger & Ziegler, 2002), the number of effectors (e.g., Schmidt & Wrisberg, 2004), accuracy demands (e.g., Sidaway, Sekiya, & Fairweather, 1995), and/or movement durations (e.g., Schmidt & Wrisberg, 2004). Performance measures are more commonly used to evaluate the last stage of processing, i.e., execution, and may include measures of kinematic variables and physiologic responses using electromyography (Schmidt & Lee, 2005).
Clinically, speech motor deficits have been assumed to reside in the programming and execution stages of the information processing model, which has determined how these deficits are measured. Despite theoretical disparities, AOS historically has been considered a programming disorder characterized by speech with inappropriate spatial and temporal parameters, as well as sequencing errors (e.g., M. R. McNeil, Robin, & Schmidt, 2009; Wambaugh, Duffy, McNeil, Robin, & Rogers, 2006). In comparison, dysarthria is considered an execution disorder in which impaired neurophysiology and muscles result in an inability to execute movements properly (Duffy, 2013). Thus, much of the research evaluating speech disorders has focused on the programming and execution stages of processing where the representation is parameterized for specific environmental and task demands. However, the evaluation of speech representations prior to programming and execution, i.e., at the memory selection stage of processing, may provide insight into the abstract fundamental motor and linguistic properties of the representation prior to aberrant programming or execution. For this reason, the design and measures of the current series of studies are focused on manipulating the distinctive properties of the memory representation during the selection stage of processing. To focus on this specific level of processing, Schmidt's (1975, 2003) generalized motor program theory is detailed as an information processing framework to understand how memory encoding and retrieval align with speech learning. Generalized motor programs (GMPs) are hypothesized as abstract, context-independent representations that store invariant features of a movement, which include the relative timing, relative force, and movement sequence (Schmidt, 1975, 2003).
Within the information-processing model of motor control previously described, GMPs are selected during the selection stage of processing and then parameterized for a particular task during the motor programming stage (Figure 1). Thus, GMP memory representations are flexible in their application as there are no specific features of the movement programmed to meet a specific environmental or movement goal. Later processing during the response programming stage may include specification of muscles, training conditions (e.g., speaking versus listening), and environmental demands (e.g., distance to a target). However, reaction time measurements at the response selection stage should not vary based on later processing factors (e.g., training condition). This assumption is evaluated in the following two experiments when comparing two training modalities during a nonword repetition task.
The abstract nature of the motor representation allows a single GMP to direct performance on a wide variety of specific motor behaviors that share the same invariant features (termed a class of actions), minimizing overall cognitive demands (Chamberlin & Magill, 1992; Schmidt, 1975). Motor behaviors within a class of actions may not share similar physical attributes or training conditions; however, generalization is predicted because the same underlying representation, the GMP, is shared across motor behaviors (Chamberlin & Magill, 1992; Schmidt, 1975; Wulf & Schmidt, 1988). As noted earlier, reaction times during the memory selection stage should be stable unless there is response uncertainty present, e.g., varying the number of distinctive markers between two alternative responses (Heuer, 1982; Rosenbaum, 1980, 1990). Thus, evaluation of reaction times for a hypothesized class of actions should provide insight into underlying GMP representations. Stable reaction times, i.e., nonsignificant reaction time differences, would indicate that a set of behaviors shares the same underlying GMP, whereas significantly different reaction times may suggest more than one GMP is providing guidance for a set of movements. This prediction is evaluated in both experiments, in which participants were challenged to judge nonwords with trained and untrained motor class features. Although Schmidt's (1975, 2003) GMP theory has been adopted by speech motor control theorists (for a review: Maas et al., 2008; Meigh, 2017), the underlying speech GMP is still contested. The difficulty in defining GMP representations, and their associated class of actions, may be secondary to the multiple levels of processing that occur during rapid speech production (e.g., combining syllables into words), as well as the interaction between segments and suprasegmental processing (e.g., syllable stress).
Two main speech units have been proposed as fundamental memory representations in psycholinguistic and speech motor control theories: stressed syllables (e.g., Cholin & Levelt, 2009; Rapp, Buchwald, & Goldrick, 2014; Sevald, Dell, & Cole, 1995) and phonemes (e.g., Austermann-Hula, Robin, Ballard, Maas, & Robin, 2007).
From a motor perspective, salient characteristics of syllable stress align with the proposed invariant features of GMPs (Schmidt, 1975), including an increase in duration (relative timing) and increase in pitch and intensity (relative force; Meigh, 2014). However, there are also several linguistic variables that provide a predictable set of rules for placement of syllable stress in English words (Chomsky & Halle, 1968; Guion, Clark, Harada, & Wayland, 2003; Hayes, 1982), which provides additional evidence for syllable stress as a potential fundamental memory representation. Specifically, stress production patterns that occur more frequently in English have been hypothesized as a separate GMP from stress patterns that occur less frequently (Aichert & Ziegler, 2004; Cholin & Levelt, 2009; Laganaro, 2005, 2008; Staiger & Ziegler, 2008). By examining reaction times of frequent versus non-frequent stress patterns in multisyllabic words in English, syllable stress can be evaluated as a class of actions. Particularly, stimuli with frequently occurring syllable stress patterns (or trained stress patterns) should have similar stable reaction times compared to stimuli with less frequently occurring stress patterns. This hypothesis was tested in both experiments, in which participants learned first and second syllable-stress patterns in three-syllable nonwords.
Within the information processing model depicted in Figure 1, syllable GMPs would be selected during the memory selection stage; however, specification to meet specific communicative task demands would occur in the programming stage (e.g., Rapp et al., 2014; Sevald et al., 1995). Phonemic features (such as phoneme order) are hypothesized to provide specific parameterization to abstracted syllable frames (Buchwald & Miozzo, 2012; Rapp et al., 2014), and may alter the timing and force characteristics for specific phonemic contexts (e.g., Aichert & Ziegler, 2004). Thus, reaction times accessed during memory selection would not be influenced by the presence of specific phonemic properties.
Conversely, phoneme representations have also been hypothesized as GMPs (e.g., Austermann-Hula et al., 2008; Ballard et al., 2007). Thus, differences in reaction times may be based on phonemic properties, which may suggest phonemes are a fundamental speech GMP in addition to (or in lieu of) proposed syllable GMPs. Disassociation of phonemic from syllabic properties of stimuli may provide insight into the salient GMP features being processed during memory selection since the GMP has not yet been parameterized for a specific communicative task. This hypothesis was evaluated in both experiments by varying nonword stimuli by syllable stress and phonemic similarity. Although speculation exists that phonemes may be GMPs, there is little specification as to which phonemic features are invariant and govern a class of actions. The construct of similarity has been postulated to be essential for generalization of a motor behavior to a new behavior or environment (e.g., Magill & Hall, 1990; Wood & Ging, 1991); thus, the current studies explored two continua of phoneme similarity (presence of trained phonemes and phoneme order). It was assumed reaction times would increase as phonemic similarity between nonwords decreased, i.e., nonwords with similar phonemic construction would have the fastest reaction times whereas nonwords with dissimilar phonemic construction would have the slowest reaction times.
In summary, the purpose of this work was to investigate potential fundamental speech representations at specific levels of processing. Specifically, this series of experiments examined the stimulus features encoded during a nonword repetition task, as well as the training conditions influencing encoding, in an effort to better understand what is learned during speech training. Larger prosodic units, i.e., stressed syllables, and smaller segmental units, i.e., phonemes, were investigated in two different experiments that varied in training modality (speaking versus listening). During nonword repetition training, it was predicted participants would encode two frequently occurring stress patterns as a class of actions under the direction of a syllable stress GMP (e.g., Aichert & Ziegler, 2004; Cholin & Levelt, 2009; Meigh, 2017) regardless of training modality. Following training, an old-new judgment task was administered to investigate the stimulus features encoded during training (phonemic, prosodic, or both) and the influence of training modality (production versus perception training). Old-new judgment tasks are frequently used in cognitive science to investigate the salient characteristics of stimuli encoded following training (e.g., Shanks & Berry, 2012; Yonelinas, 2002). This judgment task has also been used to evaluate the similarity between motor stimuli targeted at the response selection stage of motor processing (Wood & Ging, 1991), the proposed processing level where GMPs are activated.
Reaction time, as well as judgment accuracy, obtained during an old-new judgment task were used to address three main hypotheses. First, it was hypothesized trained syllable stress patterns would be judged with greater accuracy and speed than untrained syllable stress patterns during the old-new judgment task. Nonword repetition training was presumed to create a class of actions for two frequently occurring syllable stress patterns. When presented with novel untrained nonwords during the old-new judgment task, it was predicted participants would use the encoded syllable GMP learned during training to guide their judgments. This would result in accurate, fast responses to trained stress patterns compared to untrained stress patterns. However, no difference in performance was predicted for untrained nonwords with trained stress patterns in comparison to one another regardless of the phonemic composition of the nonwords. Any untrained nonwords with the trained stress pattern were predicted to be in the same class of actions and should be recognized quickly by the trained stress pattern (GMP).
Second, it was hypothesized that phonemic stimulus features would influence accuracy and reaction times on the old-new judgment task. Based on the literature, it is unclear if phonemic representations are encoded as GMPs (e.g., Austermann-Hula et al., 2008; Ballard et al., 2007) or parameters of GMPs (e.g., Aichert & Ziegler, 2004). As an initial step, this study investigated whether two phonemic features were encoded during training: specific trained phonemes and phonemic sequence within a consonant-vowel (CV) syllable unit. Participants were expected to respond more quickly on the old-new judgment task to untrained stimuli with trained phonemes and phonemic sequences given the assumption that well-learned phonemic representations would be easier to retrieve from memory than novel representations (Lee, 1988; Shanks, 1995).
Finally, it was hypothesized that accuracy and reaction time measures from the old-new judgment task would be similar across both experiments, as training modality should not influence the GMP encoded during training. Training modality is presumed to be a context-dependent property of GMP representations, and is parameterized during the motor programming stages of processing (Buchwald & Miozzo, 2012; Hulstijn & Van Galen, 1983; Klapp, 1995, 1996; Schmidt & Wrisberg, 2004). The old-new judgment task was used in this study to target the abstract, context-independent GMPs encoded during speech training, which were theorized to be accessible at the memory selection stage of processing (e.g., Schmidt & Wrisberg, 2004). Thus, the parameterization of modality during training should not influence the underlying stimulus features encoded as a GMP.

Participants
Twenty-nine young adults (15 females, 14 males) between the ages of 18 and 34 years (M = 25.28, SD = 5.66) were recruited to participate in this study. All participants were required to be monolingual English speakers with no history of speech and hearing disorders. Speech was screened using the Test of Minimal Articulation sentence and reading screening subtests (Secord, 1981), an oral-facial-sensory-motor exam conducted by the principal investigator (a certified, licensed speech-language pathologist), and evaluation of conversational speech for fluency or articulation errors. Hearing was screened using pure tone thresholds at 35 dB HL at 500, 1,000, 2,000, and 4,000 Hz in at least one ear (American Speech-Language-Hearing Association, 1990). Speech discrimination was evaluated using the Northwestern University Auditory Test No. 6 word list (Tillman & Carhart, 1966), and all eligible participants correctly identified 98% or better of all words. Auditory processing was evaluated using the Computerized Revised Token Test (McNeil et al., 2015) to ensure participants could adequately process the stimuli and instructions for the experiment. All participants were required to score within the normed ranges for neurologically intact populations. All participants signed informed consent documents approved by the University of Pittsburgh Institutional Review Board and were compensated $30.00.

Procedures
1.1.2.1. Overview. The experiment consisted of a single session comprising two tasks: syllable-stress training followed by an old-new judgment task. During syllable-stress training, participants practiced repeating nonwords with varying stress patterns (either first or second syllable stress) using custom software that provided visual feedback regarding stress production. Following syllable-stress training, participants completed an old-new judgment task where they were asked to recognize stimuli as "old" or "new" by pressing a button on a response box. This judgment task was used to evaluate features of the representation at the response selection stage of motor programming (Healy & Wohldmann, 2012; Wood & Ging, 1991). Judgments were anticipated based on the type of stimuli, i.e., trained stimuli should elicit an "old" response, whereas untrained stimuli should elicit a "new" response from participants. Reaction time measures from accurate trials on the old-new judgment task were used to evaluate participants' responses to stimuli that varied by phonemic similarity and motor class membership.
1.1.2.2. Syllable stress training. Custom software, "Stimulate," presented a nonword stimulus auditorily, which the participant then repeated into the microphone. Stimulate then initiated "PRAAT" (Boersma & Weenink, 2013) to analyze intensity patterns of the participant's response to determine the participant's syllable stress production (full details regarding Stimulate and stress analysis can be found in Meigh, 2017). Participants were provided visual feedback from Stimulate on the accuracy of their stress production on 65% of trials. As depicted in Figure 2, the participant's feedback (as represented by yellow circles) was shown in reference to the correct stress pattern (as represented by horizontal bars). The examiner perceptually rated each trial for syllable stress accuracy in real time during training and provided summary feedback regarding accuracy at the end of each training block. Participants were given a maximum of 720 trials (120 trials per block for 6 blocks) to achieve 90% syllable stress accuracy for a given training block. Following each training block, a recognition probe was administered to ensure accuracy of the phonemic representation being encoded during training. Participants were required to listen to sets of three nonwords, one trained and two foil stimuli, to determine which of the three nonwords was the trained item. Ten sets of stimuli were included in each recognition probe and participants were required to achieve 90% accuracy to discontinue training. Syllable stress training continued, alternating between training blocks and recognition probes, until 90% accuracy was achieved on both measures. If a participant was unable to meet criterion on the training blocks or recognition probes within 720 trials, training was discontinued and the experiment was ended.
At the end of training it was assumed that participants had encoded first and second syllable stress patterns as the trained motor class, and that the phonemic representations of the nonwords were also accurately encoded into memory.
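The stopping rule for training described above can be sketched in a few lines. This is only an illustrative reading of the procedure, not the logic of the Stimulate software: the function name, the tuple layout, and the outcome labels are hypothetical, and the sketch assumes that criterion is checked once per completed block against both the stress-accuracy and recognition-probe scores.

```python
# Hypothetical sketch of the training stopping rule; names are illustrative.
TRIALS_PER_BLOCK = 120   # stated in the text
MAX_TRIALS = 720         # 6 blocks x 120 trials, stated in the text
CRITERION = 0.90         # required accuracy on both measures

def training_outcome(blocks):
    """blocks: list of (stress_accuracy, probe_accuracy), one per block.

    Returns ("criterion_met", trials_used) at the first block where both
    measures reach criterion; otherwise ("discontinued", trials_used).
    Assumption: blocks beyond the 720-trial ceiling are never run.
    """
    trials_used = 0
    for stress_acc, probe_acc in blocks:
        trials_used += TRIALS_PER_BLOCK
        if stress_acc >= CRITERION and probe_acc >= CRITERION:
            return ("criterion_met", trials_used)
        if trials_used >= MAX_TRIALS:
            break
    return ("discontinued", trials_used)
```

For example, a participant who misses the stress criterion in the first block but meets both criteria in the second would be scored as finishing training after 240 trials.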
Experimental nonwords used during syllable stress training were taken from Kendall, McNeil, Shaiman, and Pratt (2005). First and second syllable stress positions were targeted stress positions for this study secondary to their high frequency of occurrence in three-syllable words in the English language (Clopper, 2002). Each nonword was assigned stress on either the first or second syllable (40% and 60% of the stimuli, respectively). Equal distribution of syllable stress placement was not feasible as consistent mapping of stress and nonword syllable was maintained (e.g., /te/ was always unstressed regardless of its syllable position in a nonword). Filler stimuli with first and second syllable stress were adapted from Kendall et al. (2005), Roy and Chiat (2004), and Dollaghan (1998) to control for response bias in the old-new judgment task. All stimuli were pseudorandomized across all training blocks, and all participants received the same order of training blocks during training. All trained stimuli and their associated stress assignments are provided in Appendix A.
1.1.2.3. Old-new judgment task. For this task, a serial response box (Psychology Software Tools; Model #200A) was placed directly in front of the participant's dominant hand. Participants listened to a nonword and pressed either button 2 or 4 on the response box to indicate the nonword was "old" or "new." Response box buttons 2 and 4 were randomly assigned "old" and "new" positions for each participant, and button assignments were counterbalanced across participants. All stimuli were pseudorandomized and presented using E-Prime (v. 2.0 Professional; Schneider, Eschman, & Zuccolotto, 2002) with the following experimental cycle for a given trial: 250 ms long 500 Hz warning tone, 250 ms silent pause, auditory presentation of a single nonword (mean duration: 1,293 ms), 4,000 ms interval of time to capture the participant response, and a 3,000 ms silent inter-stimulus interval prior to the next trial. Participants listened to a total of sixty nonwords in a single block, and repetitions of a stimulus were not permitted.
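As a rough check on the trial cycle just described, the short sketch below adds up the phase durations to estimate the mean trial length and the approximate length of the 60-trial block. The phase labels and function names are illustrative, and the estimate assumes that the 4,000 ms response window always runs in full, which the text does not state.

```python
# Durations in ms taken from the trial description; the nonword value is
# the reported mean duration, so the totals are averages, not exact times.
TRIAL_PHASES_MS = [
    ("warning tone (500 Hz)", 250),
    ("silent pause", 250),
    ("nonword (mean duration)", 1293),
    ("response window", 4000),        # assumed to run in full on every trial
    ("inter-stimulus interval", 3000),
]
N_TRIALS = 60  # sixty nonwords in a single block

def mean_trial_ms(phases=TRIAL_PHASES_MS):
    """Sum the phase durations to get the mean length of one trial."""
    return sum(duration for _, duration in phases)

def block_minutes(n_trials=N_TRIALS):
    """Approximate block length in minutes under the assumptions above."""
    return n_trials * mean_trial_ms() / 60000
```

Under these assumptions a trial lasts about 8.8 s on average, so the full judgment block takes a little under nine minutes.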
Stimuli from syllable stress training and untrained stimuli were included in the old-new judgment task. All untrained stimuli were constructed to vary along a continuum of similarity compared to the Training stimuli on two parameters: syllable stress and phonemic context. Stress varied by trained stress patterns (first and second position) and one untrained pattern (third position). Phonemic similarity varied in two ways: (1) training phonemes used in stimuli (same or different) and (2) order of phonemes within a CV unit (same or different; Table 1). As shown in Table 1, Transfer Set 1 stimuli varied in order of phonemes only, whereas Transfer Set 2 varied in phoneme order and phonemes used. Transfer Set 3 was constructed to be as different as possible from the Training stimuli, with an untrained syllable stress pattern (third syllable), unfamiliar phonemes, and unfamiliar phoneme order. All untrained stimuli and their associated stress assignments are shown in Appendix A.

Data analysis
Data from 24 participants were analyzed for this study. Attrition was secondary to three participants failing one or more of the screening procedures, one participant who failed to meet the accuracy criterion required for syllable stress training, and one participant who failed to meet the accuracy criterion during the recognition probe tasks. This study employed a repeated measures design to evaluate reaction times and accuracy across stimulus types. Reaction time and accuracy analyses were conducted separately. Nonparametric statistics were used for all three analyses involving reaction time, as normality assumptions were not met. All reaction times were from correctly answered experimental stimuli derived during the old-new judgment task. Reaction times greater than three standard deviations from the median for a given trial were excluded from analysis, as were reaction times recorded as "0 ms." These latter trials were the result of the participant pushing the response button while E-Prime played the stimulus item, rendering an inaccurate reaction time. All responses were included in the accuracy analysis and coded as a dichotomous variable (correct or incorrect).
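The exclusion rules for reaction times can be illustrated with a minimal sketch. The function and field names below are hypothetical, and the text does not say over which set of trials the median and standard deviation were computed; the sketch simply assumes both are taken over whatever list of trials is passed in, using the sample standard deviation.

```python
from statistics import median, stdev

def clean_reaction_times(trials, sd_cutoff=3.0):
    """trials: list of dicts with 'rt_ms' (number) and 'correct' (bool).

    Keeps reaction times from correct trials, drops 0 ms recordings, then
    drops values more than `sd_cutoff` standard deviations from the median.
    Assumption: median and SD are computed over the trials passed in.
    """
    valid = [t["rt_ms"] for t in trials if t["correct"] and t["rt_ms"] > 0]
    if len(valid) < 2:
        return valid  # an SD cannot be computed from fewer than two values
    center = median(valid)
    spread = stdev(valid)
    return [rt for rt in valid if abs(rt - center) <= sd_cutoff * spread]
```

Accuracy would be scored separately from the uncleaned trial list, since all responses were retained for the accuracy analysis.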

Results
Two hypotheses were evaluated in this study. First, trained syllable stress patterns would be reacted to faster and with more accuracy than untrained syllable stress patterns. Nonword repetition training was presumed to create a class of actions for two frequently occurring syllable stress patterns. During the old-new judgment task, untrained nonwords with the same stress pattern would be reacted to more quickly and accurately than untrained stress patterns. This was based on the assumption the encoded syllable GMP learned during training would be retrieved from memory to aid participants' judgments. However, no significant difference in reaction time or accuracy was anticipated when comparing untrained stimuli with the trained stress pattern to one another, as all the untrained stimuli shared the same class of actions (i.e., GMP). This hypothesis was evaluated in the Syllable and Error analyses below.
The second hypothesis proposed trained phonemes and phoneme sequences would result in faster and more accurate responses than untrained phonemes/phoneme sequences. Within a GMP framework, phonemes may be parameters of syllable GMPs or separate GMP representations. If phonemes were parameters of syllable GMPs, then no difference in reaction time would be observed. Alternatively, phoneme GMPs should result in the proposed hypothesis, i.e., trained phonemic features will result in faster, more accurate judgments than untrained phonemic features. These hypotheses were evaluated in the Phoneme, Item, and Error analyses below.

Syllable analysis
A Wilcoxon signed-rank test was conducted to evaluate a motor class boundary based on syllable stress between stimuli Transfer Sets 2 and 3. These stimuli had novel phoneme patterns and

Phoneme analysis
A Friedman test was conducted to evaluate reaction time differences across stimulus types that varied by phoneme but shared the same syllable stress pattern (Trained, Transfer Sets 1 and 2). Transfer Set 3 was not included in this analysis, as it was considered to be outside the trained motor class (i.e., stress on the third syllable) and did not share any phonemes with the other stimuli. There was a significant difference in reaction times across stimulus type, χ²(2) = 18.58, p < .001. Pairwise comparisons were performed (SPSS, 2012) with Bonferroni correction for multiple comparisons. The median reaction times for Transfer Set 1 stimuli (Mdn = 607.85 ms) were significantly slower than the reaction times for the Trained (Mdn = 520.65 ms; p = .003) and Transfer Set 2 (Mdn = 454.39 ms; p < .001) stimuli (see Figure 3). There were no other significant differences.
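To make the Friedman statistic reported above concrete, here is a minimal pure-Python sketch of how it is computed from within-participant ranks. It is not the SPSS procedure used in the study: the function names are illustrative, the input is assumed to be one reaction time summary per participant per stimulus type, and the tie correction that statistical packages apply is omitted.

```python
def average_ranks(values):
    """Rank values 1..k within one participant; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def friedman_chi_square(rows):
    """rows: one list per participant, one value per stimulus type.

    Returns the uncorrected Friedman statistic (no tie correction), which
    is referred to a chi-square distribution with k - 1 degrees of freedom.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)
```

With three stimulus types the statistic has 2 degrees of freedom, which matches the χ²(2) reported for this analysis.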

Item analysis
An item analysis was conducted for each untrained stimulus set to investigate if specific phonemic features (e.g., specific phonemes or phonemic orders) were influencing participants' reaction times. A Friedman test was conducted and revealed a significant difference in reaction times across Transfer Set 2 stimuli, χ²(12) = 39.38, p < .001. Pairwise comparisons were performed (SPSS, 2012) with Bonferroni correction for multiple comparisons. Reaction times for the Transfer Set 2 stimulus /naeθodaep/ were significantly slower than for other stimuli within this set (Mdn = 649 ms). No significant difference in reaction times was observed for the stimuli in Transfer Set 1, χ²(9) = 12.54, p = .185. There was a significant difference in reaction times across Transfer Set 3 stimuli, χ²(9) = 22.61, p = .007; however, pairwise comparisons performed with Bonferroni correction for multiple comparisons (SPSS, 2012) were nonsignificant at p > .05. All item analyses are included in Appendix B.

Error analysis
Cochran's Q test (Cochran, 1950) was run to determine if the percentage of accurately identified nonwords varied across stimulus type. Sample size was adequate to use the χ²-distribution approximation (Tate & Brown, 1970). Participants were 96.3% accurate in identifying the trained nonwords as "old." Participant accuracy varied across the untrained stimuli when identifying transfer stimuli as "new": Transfer Set 1, 87.9%; Transfer Set 2, 97.5%; Transfer Set 3, 100%. The percentage of responses identified correctly as "old" or "new" differed significantly across stimuli, χ²(3) = 43.754, p < .001. Pairwise comparisons were performed using Dunn's (1964) procedure with a Bonferroni correction for multiple comparisons (adjusted p values are presented). There was a significant decrease in the percentage of accurate old-new judgments for Transfer Set 1 compared to all other stimuli (p < .001). No other significant differences were noted between accuracy judgments across the other stimulus sets.
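To make the accuracy analysis more concrete, the sketch below computes Cochran's Q for a small matrix of dichotomous scores and evaluates the result against a χ² distribution with 3 degrees of freedom (four stimulus types minus one). The matrix `hypothetical_scores` is invented, and coding one 0/1 score per participant per stimulus type is an assumption made for illustration, not a description of how the study data were arranged.

```python
import math

def cochran_q(rows):
    """Cochran's Q for a participants x conditions matrix of 0/1 scores."""
    k = len(rows[0])
    column_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in column_totals) - grand_total ** 2)
    # The denominator is zero only if every participant scores all 0s or all 1s.
    denominator = k * grand_total - sum(r * r for r in row_totals)
    return numerator / denominator

def chi_square_p_df3(statistic):
    """Upper-tail p value of a chi-square variate with exactly 3 df."""
    half = statistic / 2.0
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * statistic / math.pi) * math.exp(-half))

# Hypothetical scores (1 = correct judgment).
# Columns: Trained, Transfer Set 1, Transfer Set 2, Transfer Set 3.
hypothetical_scores = [
    [1, 0, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
]

q = cochran_q(hypothetical_scores)
print(q, chi_square_p_df3(q))
```

The closed-form tail probability used here holds only for 3 degrees of freedom; a general-purpose statistics library would be the safer choice for other designs.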

Discussion
During this experiment, participants repeated nonwords during an extensive training period to learn first and second syllable stress patterns. An old-new judgment task was administered to evaluate the salient stimuli features that influence motor encoding and retrieval. An investigation of syllabic features was conducted to evaluate high-frequency syllable stress patterns as a salient feature encoded during training (Hypothesis 1). Results indicate participants were significantly slower to judge Transfer Set 2 stimuli as "new" compared to Transfer Set 3 stimuli. A significant difference in reaction time between these two sets of stimuli was predicted, indicating the trained syllable stress pattern was encoded. However, the direction of the effect was unexpected. It was hypothesized that the trained stress patterns (i.e., GMP) in Transfer Set 2 would result in faster reaction times, but instead the similar stress pattern increased overall judgment reaction time. These data suggest participants may have based their judgments on other aspects of the stimuli, e.g., phonemic information, as well as the syllable stress pattern.
The phoneme analysis revealed participants were significantly slower to respond to Transfer Set 1 stimuli compared to Trained or Transfer Set 2 stimuli. Like the syllable analysis, the results indicate that increased similarity between untrained and trained stimuli increased reaction times. These results do not align with any of our GMP hypotheses. All three stimulus sets in this analysis (Trained, Transfer Sets 1 and 2) included the same trained syllable stress pattern (i.e., the proposed GMP for this study), which we postulated created a class of actions for these stimuli. Thus, reaction time differences across these stimuli were predicted to be nonsignificant, as all three sets of stimuli included the trained syllable stress pattern. A similar pattern of reaction time results was predicted if phonemes were a parameter of the trained syllable stress pattern. However, all analyses suggest phonemes were encoded during training and influenced the old-new judgment results. We had proposed trained phonemes and/or phoneme sequences may be encoded as GMPs during training. However, we would have anticipated a significant decrease in reaction time across Trained, Transfer Set 1, and Transfer Set 2 as the presence of trained phoneme features systematically decreased across these stimulus sets. Instead, the results of this study suggest that untrained phonemes that were not similar or present in the Trained stimuli influenced old-new judgments (as noted in Transfer Set 2).
The slowest Transfer Set 2 stimuli had phonemes identical to those found in the Trained stimulus set, whereas the fastest reaction times were associated with novel phonemes (Table 2). As noted in the table, trained syllables are bolded and underlined. Participants experienced significant increases in reaction time when encountering trained syllables, especially when the syllable was in the initial position of the nonword. This contradicts our hypotheses, which predicted trained syllable patterns in initial position would signal a familiar pattern to the participant early in the auditory presentation of the stimuli. This early recognition would allow the participant to ready him- or herself for a fast response once the stimulus had finished playing.
The item-analysis also revealed that phoneme sequence influenced reaction times. When syllable stress patterns were present later in the nonword (e.g., /nasaeθoʃ/), reaction times were significantly faster when novel, untrained phonemes were present in the first syllable. Furthermore, stimuli with two novel phonemes in the initial position of the nonword were faster than those stimuli with only a single novel phoneme. Taken together, these findings suggest participants were evaluating the untrained stimuli phoneme-by-phoneme instead of using syllable stress to guide their judgments as was originally predicted in the proposed GMP framework. Specifically, the more novel the phonemes and phoneme sequence in the initial part of the nonword, the faster the reaction time. Additionally, the phoneme- and item-analyses suggest that phoneme features are important to transfer performance even though trained phoneme features did not seem to be encoded as a GMP. GMP theory predicts trained features encoded during learning can be retrieved in novel contexts to speed judgments; however, the trained phonemic features in this study slowed down overall reaction times.
Furthermore, participants' accuracy in judging trained stimuli as "old" and untrained stimuli as "new" aligns with the reaction time results of this study. Participants were significantly less accurate in judging Transfer Set 1 stimuli as compared with any other stimulus set. Although not statistically significant, the most accurate old-new judgments occurred with stimuli that were most dissimilar from the Trained stimuli. Specifically, participants were more accurate in judging novel stimuli as "new" (Transfer Sets 2 and 3) compared with judging the Trained stimuli as "old" despite over an hour of syllable stress training.
A second experiment was conducted to evaluate the influence of training modality (listening versus speaking) on the old-new judgment task. We hypothesized that stimuli features learned during training would be similar regardless of modality, as the underlying representation was abstract and context-independent (e.g., Browman & Goldstein, 1989; Buchwald & Miozzo, 2012; Maas, Barlow, Robin, & Shapiro, 2002). Thus, both syllabic and phonemic features of the stimuli would be encoded, and results from the old-new judgment task for Experiments 1 and 2 would be the same.

Participants
Thirty young adults (13 females, 17 males) between the ages of 18-34 years (M = 20.26, SD = 1.62) were recruited for this study. All participants were required to meet the same screening criteria as in Experiment 1. All participants signed informed consent documents approved by the West Virginia Institutional Review Board and were compensated $20.00.

Procedures
4.1.2.1. Overview. The experimental tasks were exactly the same as in Experiment 1, except during syllable-stress training participants were asked to listen to the stimuli and push a button on a response box to indicate syllable stress production.  following experimental cycle for a given trial: 750 ms visual presentation of the word "listen," 250 ms silent pause, auditory presentation of a single nonword (mean duration: 1,293 ms), 4,000 ms interval of time to capture the participant response, and a 3,000 ms silent inter-stimulus interval prior to the next trial. Participants pushed one of two buttons labeled on a serial response box (Psychology Software Tools; Model #200A) to indicate syllable stress on the first or second syllable of the nonword. As in Experiment 1, participants were provided with visual feedback on their accuracy in identifying stress production on 65% of trials. Additionally, following each training block a recognition probe was administered to ensure accuracy of the phonemic representation being encoded during training. Syllable stress training continued, alternating between training blocks and recognition probes, until 89% accuracy was achieved on both measures. If this accuracy criterion was not met, training was discontinued and the experiment was ended.
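The trial cycle described above can be summarized as a simple timeline. The sketch below totals the listed phase durations and estimates the length of a training block. It assumes the phases run strictly in sequence and that the response window always lasts its full 4,000 ms; the number of trials per block (`n_trials`) is a hypothetical parameter, as it is not given in this section.

```python
# Phase durations (ms) for one syllable-stress training trial, as listed above.
TRIAL_PHASES = [
    ("visual cue 'listen'", 750),
    ("silent pause", 250),
    ("auditory nonword (mean duration)", 1293),
    ("response window", 4000),
    ("inter-stimulus interval", 3000),
]

def trial_duration_ms(phases=TRIAL_PHASES):
    """Total duration of one trial, assuming sequential phases."""
    return sum(duration for _, duration in phases)

def block_duration_min(n_trials, phases=TRIAL_PHASES):
    """Approximate block length in minutes for a hypothetical trial count."""
    return n_trials * trial_duration_ms(phases) / 60000.0

print(trial_duration_ms())
print(block_duration_min(60))  # 60 trials is an illustrative value only
```

Under these assumptions a single trial lasts about 9.3 s on average.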
4.1.2.3. Old-new judgment task. This task was identical to Experiment 1, where participants listened to nonwords (Trained and Transfer Sets 1-3) and made a judgment of "old" versus "new" by pressing a button on a response box.

Data analysis
Data from sixteen participants were analyzed for this study. Attrition was secondary to one participant failing one or more of the screening procedures, equipment failure during one participant's session, and twelve participants unable to meet the accuracy criterion required at the end of training. Of these twelve participants, two participants were unable to meet criterion on the second training block despite an overall increase in accuracy across training, eight participants were unable to meet the accuracy criterion on both training blocks despite a general increase in overall accuracy, one participant's accuracy decreased with training, and one participant did not meet criterion and no change was observed in performance between blocks one and two. The same analyses were conducted as in Experiment 1. Nonparametric analyses for reaction time data were used as normality assumptions were not met. All reaction times were from correctly answered experimental stimuli, and the same exclusion criteria to remove reaction times were used. All responses were included in the accuracy analysis and coded as dichotomous variables as in Experiment 1.

Results
This experiment evaluated the two main hypotheses put forth in Experiment 1: 1) trained syllable stress patterns would result in faster, more accurate responses than untrained syllable stress patterns and 2) trained phonemes and phoneme sequences would result in faster, more accurate responses than untrained phonemes/phoneme sequences. The comparison of each experiment's analyses provided insight into the third hypothesis for this series of experiments, which predicted accuracy and reaction time results would be similar in Experiments 1 and 2. Training modality (i.e., production versus perception) should not influence GMP encoding during training as this variable is presumed to be a parameter, or context-dependent, property of GMP representations present during the motor programming stage of processing. As such, the influence of training modality should not be present during the old-new judgment task, which targets the memory selection stage of processing. The outcome of the third hypothesis is not directly stated in the results section but is addressed in the discussion section.

Syllable analysis
A Wilcoxon signed-rank test was conducted to evaluate a motor class boundary based on syllable stress between stimuli Transfer Sets 2 and 3. Participants' reaction times were significantly slower when responding to Transfer Set 2 (Mdn = 516 ms) compared to Transfer Set 3 (Mdn = 437.5 ms), z = −2.966, p = .003.
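For readers unfamiliar with the test, the following sketch shows one way the signed-rank z statistic and its two-sided p value can be computed using the large-sample normal approximation. The paired reaction times are invented for illustration. The sketch drops zero differences and applies neither a tie correction nor a continuity correction, so a statistics package may return slightly different values.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def wilcoxon_z(first, second):
    """Normal-approximation z for the Wilcoxon signed-rank test."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd

def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical per-participant reaction times (ms).
hypothetical_set2 = [500, 520, 480, 610, 530]
hypothetical_set3 = [450, 470, 490, 500, 455]

z = wilcoxon_z(hypothetical_set2, hypothetical_set3)
print(z, two_sided_p(z))
print(two_sided_p(-2.966))  # z reported for this analysis
```

The sign of z depends only on the order of subtraction. Applying `two_sided_p` to the reported z = −2.966 gives approximately .003, in line with the reported p value.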

Phoneme analysis
A Friedman's Test was conducted to evaluate reaction time differences across stimulus types that varied by phoneme but shared the same syllable stress pattern (Trained, Transfer Sets 1 and 2). As in Experiment 1, Transfer Set 3 was not included in this analysis as it was considered to be outside the trained motor class. There was a significant difference in reaction times across stimulus type, χ²(2) = 9.910, p = .007. Pairwise comparisons were performed (SPSS, 2012) with Bonferroni correction for multiple comparisons. The median reaction times for Transfer Set 1 stimuli (Mdn = 688 ms) were significantly slower than the reaction times for the Trained (Mdn = 493 ms; p = .010) and Transfer Set 2 (Mdn = 516 ms; p = .040) stimuli (see Figure 4). There were no other significant differences.
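The Bonferroni step in the pairwise comparisons can be illustrated with a short sketch. With three stimulus types there are three pairwise comparisons, so each unadjusted p value is multiplied by three and capped at 1. The unadjusted values in `hypothetical_raw_p` are invented and are not taken from the study output.

```python
from itertools import combinations

def bonferroni_adjust(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}

conditions = ["Trained", "Transfer Set 1", "Transfer Set 2"]
pairs = list(combinations(conditions, 2))  # three pairwise comparisons

# Hypothetical unadjusted p values, one per pair, in the order of `pairs`.
hypothetical_raw_p = dict(zip(pairs, [0.004, 0.60, 0.012]))

adjusted = bonferroni_adjust(hypothetical_raw_p)
significant = [pair for pair, p in adjusted.items() if p < 0.05]
for pair, p in adjusted.items():
    print(pair, round(p, 3))
```

Comparing adjusted p values against .05 is equivalent to comparing unadjusted values against .05 divided by the number of comparisons.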

Error analysis
Cochran's Q test (Cochran, 1950) was run to determine if the percentage of accurately identified nonwords varied across stimulus type. Sample size was adequate to use the χ²-distribution approximation (Tate & Brown, 1970). Participants were 89.4% accurate in identifying the trained nonwords as "old." Participant accuracy varied across the untrained stimuli when identifying transfer stimuli as "new": Transfer Set 1, 76.3%; Transfer Set 2, 90.6%; Transfer Set 3, 97.5%. The percentage of responses identified correctly as "old" or "new" differed significantly across stimuli, χ²(3) = 34.904, p < .001. Pairwise comparisons were performed using Dunn's (1964) procedure with a Bonferroni correction for multiple comparisons (adjusted p values are presented). There was a significant decrease in the percentage of accurate old-new judgments for Transfer Set 1 compared to all other stimuli (p < .002). No other significant differences were noted between accuracy judgments across the other stimulus sets.

Discussion
During this experiment, participants listened to nonwords and determined syllable stress patterns during training. An old-new judgment task was then administered to evaluate whether trained syllable stress and phonemic patterns directed judgments on untrained stimuli. The results for Experiment 2 were nearly identical to those of Experiment 1 despite the increase in participant attrition in this experiment. The syllable analysis revealed participants were significantly slower to judge Transfer Set 2 stimuli as "new" compared to Transfer Set 3 stimuli. This contradicts our first hypothesis, which states trained syllable stress patterns (Transfer Set 2) should be reacted to faster than untrained stress patterns (Transfer Set 3). Additionally, participants were significantly less accurate in their judgments of "old" versus "new" when encountering Transfer Set 1 stimuli compared with any other stimulus set. However, the overall accuracy in identifying nonwords as "old" (i.e., trained stimuli) was lower for participants who trained by listening compared to those participants in Experiment 1 who trained by speaking the nonwords (cf., 89.4% vs. 96.3%). Overall training accuracy was also decreased in this experiment, as more participants were unable to meet the accuracy criterion compared to Experiment 1 (cf., 12 vs. 2 participants). This suggests that training modality may have influenced the memory representation encoded during training.
The results of the phoneme analysis were also similar to Experiment 1, where reaction times increased with transfer stimuli that were more similar to the Trained stimuli (e.g., Transfer Set 1) than stimuli that were significantly different (e.g., Transfer Set 3). The item-analysis also revealed similar results, in which novel initial phonemes were reacted to faster than stimuli with trained CV combinations. However, further inspection of the item analysis results for Transfer Sets 1 and 3 reveals a potential exposure effect for similarly constructed stimuli. Some of the transfer stimuli presented early during the old-new judgment task had faster reaction times than very similar stimuli presented later (see Table 3). This exposure effect will be discussed further in the General Discussion section.
In summary, the results of Experiment 2 suggest phonemic similarity and syllable stress frequency influenced reaction times and accuracy judgments on the old-new judgment task. It was assumed that syllable stress frequency would be a main predictor of transfer performance. However, for both experiments phonemic similarity within and across transfer stimuli sets influenced reaction times more broadly than syllable stress alone. This suggests that both syllable stress and phonemic information were influential during the old-new judgment task; however, the direction of the influence was unexpected with dissimilarity of syllabic and phonemic information influencing transfer performance.
Unique to Experiment 2, we explored the abstract, context-independent features of the underlying representation learned during training. Two different modalities of training, speaking versus listening, were used across experiments. Training modality was hypothesized to have negligible effect on the reaction time results of the old-new judgment task, as parameterization was anticipated to occur during the programming stage of motor processing. Results of both experiments were nearly identical, suggesting the underlying representations were not overtly influenced by training modality. However, accuracy and reaction time differences were observed between the two experiments, with less accurate judgments and slower reaction times present in Experiment 2. Moreover, participants' ability to achieve the accuracy criteria during Experiment 2 training was markedly different from Experiment 1, with fewer participants able to achieve or maintain the required level of accuracy. This suggests that portions of the task demands may have influenced the underlying representation being encoded, which would modify the representation selected during the old-new judgment task. Further elaboration of the potential underlying speech representations learned during these experiments will be discussed below.

Overall discussion
The aim of these studies was to investigate the underlying speech features encoded during a nonword repetition task to better understand what is learned during speech training. It was hypothesized that high-frequency syllable stress patterns would be encoded as GMPs during training, and these GMPs would speed reaction times for novel stimuli with the same syllable stress pattern during an old-new judgment task. It was also predicted that phonemic features would be encoded during training; however, this prediction was exploratory in nature, as the actual features and the level of representation encoded were not specified (e.g., GMP or parameter). Trained phonemic features were also assumed to speed reaction times for similar, novel stimuli during the old-new judgment task; participants would be able to retrieve these encoded features and not have to reconstruct features in working memory. Finally, it was predicted that differences in training modality across experiments would not influence reaction times on the old-new judgment task because the encoded GMPs were theorized as abstract, context-independent memory representations.
These hypotheses were partially met for both studies. Syllabic and phonemic features of the stimuli influenced reaction times on the old-new judgment task in both experiments; however, the direction of the effect was unexpected with participants responding slower to novel stimuli with trained syllabic and phonemic features. Moreover, phonemic features were more influential in overall judgments than the proposed syllable stress GMP. These results suggest phonemic features may be GMPs and should be considered fundamental memory representations in psycholinguistic and speech motor control models. However, syllable stress information was also encoded during training and influenced judgments; thus, information processing models need to incorporate how syllable templates and phonemic features interact to produce speech.
Many psycholinguistic models, as well as Schmidt's (1975, 2003) GMP theoretical framework used in this study, posit foundational memory representations as abstract and context-independent. Our third hypothesis evaluated this by contrasting different training modalities (speaking versus listening) across both experiments. Additionally, the old-new judgment task was used specifically to evaluate speech representations selected prior to programming and execution where environmental and task specification may occur. The results of Experiments 1 and 2 were nearly identical, which supported our third hypothesis; however, participants were slower and less accurate in making their judgments when they were only allowed to listen during training versus speaking. This was also noted in the attrition of participants in Experiments 1 and 2, where a substantial number of participants were unable to meet the accuracy criteria during perceptual training in Experiment 2. This suggests additional motor information may have been encoded during speech training, and this additional information may provide a richer, more specific memory representation.
There were several limitations to this study that may have contributed to these findings. The old-new judgment task was used to evaluate differences in the memory representation of the stimuli prior to motor programming or execution (Healy & Wohldmann, 2012; Wood & Ging, 1991). The instructions for this task to respond quickly and accurately may have biased participants to focus their attention on the initial portion of the nonwords (e.g., novel phonemes) in an effort to speed their responses. If this strategy were used, participants would quickly judge Transfer Set 3 stimuli as novel based on the initial phoneme but would require significant time to determine the novel characteristics of Transfer Set 1 to assign a judgment of "new." It is unclear if instructional bias influenced old-new judgments in this study. Studies have directly manipulated instructions to induce focus on specific aspects of stimuli during and after extensive training, and participants' responses are not always changed (e.g., learning artificial grammar; Dienes, Broadbent, & Berry, 1991; Vokey & Brooks, 1992; n. Experiment 1). Moreover, it remains unclear what aspects of the instructions (e.g., focusing on a particular facet of the stimuli or task) influence participants' responses.
Additionally, the experimental design and lack of randomization of the old-new judgment task across participants may have also influenced participants' judgments due to previous exposure to the stimuli. Randomization of the old-new judgment task stimuli was limited to a single experimental block presented to all participants. The item-analysis for Experiment 2 suggests an order effect influenced reaction times for specific stimuli, which was also likely present during Experiment 1. However, this limitation presents an insight into how stimuli are encoded into memory with only a single, brief exposure on the old-new judgment task.
The theoretical framework for these studies utilized an information-processing model of speech motor control (e.g., Hulstijn & Van Galen, 1983; Meyer & Gordon, 1985; Rosenbaum, 1980; Van der Merwe, 1997). These models, as well as many psycholinguistic models, rely on a rule-based system of memory, where the central representation is a context-independent set of rules or abstracted information (e.g., invariant features of a motor program; Doody & Zelaznik, 1988; Logan, 1988; Shanks, 1995). The process of abstraction occurs during training where central information about a stimulus is summarized into an averaged representation that lacks specificity or context (Dopkins & Gleason, 1997; Neal & Hesketh, 1997; Posner & Keele, 1970; Shanks, 1995). Abstracted memory representations align with homogeneous transfer predictions, e.g., motor class differences, as only a single variable needs to be matched for positive transfer (e.g., all behaviors must share the same invariant features/rules; Rochet-Capellan, Richer, & Ostry, 2012).
This model was assumed for this experiment, and significant amounts of training were undertaken to result in a refined, well-learned memory representation for syllable stress and phonemes. Additionally, it was anticipated the trained features of the stimuli would be the memory representations retrieved to aid judgments during the old-new judgment task. Thus, the influence of untrained stimuli presented for a single trial was not accounted for as a potential variable in influencing judgments. The nearly identical results of these two experiments suggest the underlying memory representations may be partially abstract. However, accuracy differences between production and listening training, as well as the potential exposure effect noted during the item-analyses, suggest that task and environmental demands may also influence encoding of the underlying memory representation. Theoretical frameworks that incorporate abstract memory representations, such as GMP theory (Schmidt, 1975, 2003), do not incorporate task or environmental specifications within the memory representation; instead, these features are programmed at later stages of information processing. Alternative accounts of memory and information processing that incorporate specificity at the memory selection stage may provide further understanding of the results of this study and the memory representation encoded during training. Exemplar models of memory rely on specific, context-dependent information encoded into multiple representations (or exemplars) during training (Logan, 1988; Rochet-Capellan et al., 2012; Tremblay, Houle, & Ostry, 2008).
Similar exemplars are encoded closely in psychological space (e.g., Nosofsky, 1986, 1992; Nosofsky, Little, Donkin, & Fific, 2011) during motor training, and the intersections of similar features during memory retrieval impact reaction time (Downing-Doucet & Guérard, 2014; Rochet-Capellan et al., 2012) or overall transfer performance (Rochet-Capellan et al., 2012). Transfer predictions are similar to rule-based models of memory; however, instead of strengthening an underlying memory representation, multiple individual representations are activated.
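The notion of exemplars lying close together in psychological space can be made concrete with a minimal sketch in the spirit of exemplar models such as the generalized context model, in which similarity decays exponentially with distance and a probe activates all stored exemplars at once. The feature vectors, the city-block distance, and the `sensitivity` value below are illustrative assumptions and are not fitted to the present data.

```python
import math

def similarity(probe, exemplar, sensitivity=1.0):
    """Exponential-decay similarity over city-block distance."""
    distance = sum(abs(a - b) for a, b in zip(probe, exemplar))
    return math.exp(-sensitivity * distance)

def summed_similarity(probe, exemplars, sensitivity=1.0):
    """Total activation of all stored exemplars by one probe."""
    return sum(similarity(probe, e, sensitivity) for e in exemplars)

# Hypothetical feature coordinates for three stored (trained) items.
stored_exemplars = [(1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 1)]
near_probe = (1, 1, 0, 0)  # untrained item sharing many trained features
far_probe = (0, 0, 0, 0)   # untrained item sharing none

print(summed_similarity(near_probe, stored_exemplars))
print(summed_similarity(far_probe, stored_exemplars))
```

In this toy example the untrained probe that shares more features with the stored items activates them more strongly. On an exemplar account, such a probe would be harder to reject as "new," which is one possible reading of the slower judgments for transfer stimuli that resembled the trained items.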
Task and environmental demands may influence transfer performance in exemplar models more so than in rule-based models of memory representation. Whereas abstracted representations are retrieved based on matching of context-independent features, exemplars may be retrieved based on the variety of features included in the memory representation. Task demands may rely on matching similar or dissimilar features to achieve the task goal. For motor-related task demands, similar exemplars may aggregate to produce complex motor behaviors. This has been noted with speech production (Rochet-Capellan et al., 2012; Tremblay et al., 2008), typing (Crump & Logan, 2010), and hand reaching (Meulenbroek, Thomassen, Rosenbaum, Loukopoulos, & Vaughan, 1996; Rosenbaum, Loukopoulos, Meulenbroek, Vaughan, & Engelbrecht, 1995; Rosenbaum, Meulenbroek, Vaughan, & Jansen, 2001). Alternatively, during judgment tasks similarity between exemplars would increase interference, making it difficult to distinguish old from new stimuli (Downing-Doucet & Guérard, 2014; Johns & Mewhort, 2002). Comparable reaction time patterns to those observed in this study are noted with other motor recognition tasks, including recalling handgrip positions (Downing-Doucet & Guérard, 2014) and finger presses during piano chord learning (Wifall, McMurray, & Hazeltine, 2014). For speech, syllabic and phonemic information are encoded into memory; however, this study suggests understanding the underlying memory representation as a singular entity (e.g., GMP) may not be straightforward.
Retrieval of trained memory representations to enhance transfer performance on untrained behaviors is the goal of clinical practice in speech-language pathology. However, transfer predictions are related to the underlying theoretical memory representations. The memory model we ascribe to dictates these predictions, and the resultant stimuli, treatment design, and measurements we use to assess transfer performance. Rule-based theories, as proposed in this series of studies, evaluate a common rule shared between two motor behaviors that may only consider a singular feature of the behavior (Pothos, 2007). Exemplar-based models of memory may provide a more flexible adaptation of similarity across a range of features associated with the speech movement and the task demands. From both a developmental and a disordered standpoint, patients with communication disorders may not be able to successfully identify and encode the speech features needed for successful communication. Instead of attempting to identify and train a single speech unit (e.g., syllable stress pattern), multiple features of speech should be targeted related to the individual's communication delay or disorder. The results of these studies suggest multiple features of the nonword were encoded during production and listening training sessions despite instructions, feedback, and focus on a single feature (syllable stress).
In conclusion, the initial findings of these studies provide a foundation for future work to explore the fundamental features encoded during speech learning, as well as those features inherent to enhanced transfer performance in novel contexts or with new stimuli. Future studies should evaluate multiple features of speech stimuli and task demands to determine variables that enhance and detract from overall transfer performance in typical training situations. This would inform the specific task and stimuli features that should be targeted in treatment with different clinical populations (e.g., individuals with AOS). Additionally, future studies should include children with typically-developing speech to further investigate how specific phonologic and motoric features of speech may be acquired through training as phonologic and motor systems are developing. The current study may not provide adequate insight into how speech motor learning occurs in a developing system, and similar studies with typically-developing children may shed light on specific stimuli and/or task features that are required for successful communication throughout development. Such insight could then be applied to clinical populations suffering from developmental communication disorders.

as well as the Audrey Holland Endowed Research Award