Expressive recall and recognition as complementary measures to assess novel word learning ability in aphasia

Novel word learning ability has been associated with language treatment outcomes in people with aphasia (PWA), and its assessment could inform prognosis and rehabilitation. We used a brief experimental task to examine novel word learning in PWA, determine the value of phonological cueing in assessing learning outcomes


Introduction
Learning novel words is a fundamental cognitive ability that supports the growth of one's mental lexicon throughout the lifespan. New word learning entails the ability to acquire novel word forms, their conceptual representations and the associative links between them (Gupta & Tisdale, 2009). Establishing these reciprocal associative connections between novel word forms and their meanings during learning enables speakers to access a semantic representation via its associated novel word form (receptive link) and access a novel word form via its semantic representation (expressive link) (Gupta & Tisdale, 2009). The ability to learn novel word-referent mappings and to ultimately access new words via their conceptual representations can be assessed using recall and recognition tests, although these measures place different requirements on the memory and language systems. Recall measures require effortful item retrieval while recognition performance relies on item familiarity (Haist et al., 1992). Further, while recall measures engage verbal output requirements, recognition tests bypass the high demands that recall measures pose on language production (Peñaloza et al., 2022). In this way, both recall and recognition measures can provide a comprehensive overview of novel word learning ability.
The study of novel word learning ability in aphasia using different complementary metrics is of particular interest, given that people with aphasia (PWA) often experience lexical access deficits after brain insult (Laine & Martin, 2006) and recent studies have demonstrated an association between novel word learning ability and response to anomia therapy in this population (Dignam et al., 2016; Tuomiranta et al., 2014a). Examining word learning ability in aphasia is theoretically important, as it may lead to a better understanding of the interactions between language processing and memory/learning systems in the presence of language impairment. Characterizing word learning in PWA is also clinically relevant, as word learning ability could be a potential mechanism underlying the recovery of word retrieval deficits (Basso et al., 2001). Learning mechanisms may contribute to strengthening the links between word forms and meanings to regain access to lexical knowledge and help anomia recovery via brain plasticity processes (Kelly & Armstrong, 2009). Notably, the relationship between novel word learning ability and anomia therapy outcomes in PWA (Dignam et al., 2016; Tuomiranta et al., 2014a) underscores its potential predictive value and suggests that assessing lexical learning capacity via recall and recognition measures could be useful for screening and prognostic purposes. For instance, expressive recall tasks could provide an objective estimation of potential individual improvement on single word production, its consolidation and long-term maintenance (Peñaloza et al., 2022). This is particularly relevant as there are currently no neuropsychological tests developed to assess the potential for word learning in PWA, and traditional aphasia tests characterize residual language processing but not language learning abilities.
The present study sought to evaluate novel word learning ability in PWA using both recall and recognition measures, to determine the effects of phonological cueing on expressive recall, which is particularly demanding for PWA, and to identify language, cognitive and lesion-related factors that modulate this learning ability.
Numerous studies have found that PWA can learn novel words and word-meaning associations despite damage to their language processing system, albeit showing great individual variability in their learning performance, which is often slower and below that of healthy controls (see Peñaloza et al., 2022, for a review). This learning ability has been demonstrated across a variety of studies including experimental learning tasks that use completely novel stimuli such as novel word forms and novel word-novel referent pairings. The use of completely novel stimuli in the study of verbal learning in aphasia offers several advantages, including a purer measurement of learning ability, as it minimizes compensatory influences from previously existing linguistic knowledge (Peñaloza et al., 2022). In this vein, studies have demonstrated that some PWA preserve their ability to learn novel words in different learning paradigms such as exposure to unknown spoken words that need to be segmented from a continuous speech stream (Peñaloza et al., 2015), training in unambiguous referential contexts via repetition (Coran et al., 2020; Dignam et al., 2016; Gupta et al., 2006; Tuomiranta et al., 2011), via orthography (Laganaro et al., 2006), using a variety of combined training methods (Kelly & Armstrong, 2009), training across auditory and visual modalities (Tuomiranta et al., 2014a; 2014b), and in more natural contexts of referential ambiguity via probabilistic and incidental associative learning with and without feedback (Peñaloza et al., 2016). However, little research has been dedicated to examining word learning ability in PWA more comprehensively using both recall and recognition measures. Recognition tests often require deciding whether a trained conceptual referent or novel word is old or new. They may also require identifying a trained conceptual referent or novel word among foils when being presented with its associated trained word or conceptual referent, respectively.
In turn, recall measures require the oral or written production of a trained novel word form when being presented with its corresponding conceptual referent. Importantly, most studies have demonstrated spared recognition performance in PWA in the absence of verbal output requirements, while their findings on recall measures have shown mixed results (see Peñaloza et al., 2022, for a review). For instance, Gupta et al. (2006) examined word learning ability in 20 PWA and found significant learning in a task measuring lexical acquisition via recognition, while the task measuring learning via expressive recall evidenced performance at floor levels. Similarly, Dignam et al. (2016) assessed novel word learning in 30 PWA and demonstrated that while most participants showed significant learning on recognition tests, only 4 PWA showed successful learning as measured by expressive recall tasks. Recently, Coran et al. (2020) described 3 PWA who underwent an extensive novel word learning practice schedule involving training on novel word-referent associations and learning tests of recognition and expressive recall. All participants showed successful learning as measured by recognition tests, yet expressive recall performance was impaired for one of them and limited for the other two PWA. Altogether, these studies suggest that PWA can show variable degrees of learning when both assessment approaches are combined (Coran et al., 2020; Dignam et al., 2016; Gupta et al., 2006), with recognition measures showing largely preserved learning while expressive recall measures suggest that verbal demonstrations of learning are much more challenging for this population. Of note, preliminary research has shown that novel word learning as assessed via expressive recall of trained items can be facilitated when preceded by learning measured via recognition tasks including the same items.
In contrast, other single-case studies have shown that some PWA can demonstrate significant novel word learning as measured via spontaneous and cued recall tasks after intensive training via repetition (Tuomiranta et al., 2011), with superior learning when training involves input in the written modality as compared to the auditory modality (Tuomiranta et al., 2014a; 2014b). These studies have also revealed that novel word learning in PWA can be comparable to that of healthy controls (Tuomiranta et al., 2014a; 2014b) and achieve significant long-term maintenance for up to 6 months (Tuomiranta et al., 2012; 2014a). Notably, these studies (i) used a flexible scoring system on expressive recall tests to account for phonological proximity to the correct target response and/or (ii) provided phonological cueing to aid the verbal retrieval of newly trained words during testing (Tuomiranta et al., 2011; 2014a; 2014b). These procedures revealed significant improvement on expressive recall measures of word learning in PWA even when the successful production of trained items had not yet been fully achieved (but see Dignam et al., 2016, for low expressive recall performance despite using similar scoring methods). Indeed, expressive recall measures should consider the inconsistency with which PWA successfully access their acquired vocabulary (Laine & Martin, 2006). This is particularly important given that word learning as assessed via expressive recall in aphasia seems to be modulated by the phonological complexity of the trained words (Kroenke et al., 2013). These findings highlight the importance of considering flexible measurements that effectively capture learning ability in aphasia.
Another key issue in the study of novel word learning in PWA is the examination of factors that underlie individual differences in their learning success. Novel word learning in aphasia can be influenced by a person's profile of preserved versus impaired language and cognitive abilities, as well as lesion-related factors (Peñaloza et al., 2022). In terms of language processing abilities, lexical-semantic processing has been associated with novel word learning as evaluated via recognition (Dignam et al., 2016; Gupta et al., 2006), and with learning as measured via recognition and expressive recall. Moreover, two single-case studies found that PWA with better lexical-semantic processing also demonstrated better word learning and long-term maintenance on expressive recall measures relative to PWA with impaired lexical-semantic processing (Tuomiranta et al., 2011). Also, better phonological processing is associated with better ability to learn nonwords (without visual referents) via repetition, but it is not associated with the ability to learn novel word-referent pairings as measured via recognition (Gupta et al., 2006). Moreover, there is evidence that the better the lexical-semantic abilities of PWA with a predominant phonological impairment, the better their gesture-supported word learning ability (Kroenke et al., 2013). In turn, the better the phonological processing abilities of PWA, the better their pure verbal learning via repetition (Kroenke et al., 2013). This evidence suggests that in aphasia, specific language abilities such as lexical-semantic and phonological processing modulate word learning ability.
Novel word learning in aphasia may also rely on the integrity of cognitive processes such as verbal short-term memory (STM), which makes essential contributions to novel word learning (Baddeley, 2003) but is often affected in PWA (Martin & Saffran, 1997). Freedman and Martin (2001) examined word learning in 5 PWA with phonological or semantic STM deficits using phonological and lexical-semantic tasks that measured learning via expressive recall. Their findings showed that deficits in learning novel phonological and lexical-semantic information align with profiles of verbal STM impairment in aphasia. Other single-case studies have shown better learning on expressive recall tasks in PWA with better verbal STM as measured by nonword repetition (Tuomiranta et al., 2011) and both digit and word span tasks (Tuomiranta et al., 2012). Moreover, verbal STM also modulates learning measured via recognition in both healthy controls and PWA (Peñaloza et al., 2015), although this association is less clear when accounting for aphasia severity. Finally, lesion-related factors, such as lesion volume in specific brain areas supporting word learning, may also explain individual variability in lexical acquisition in PWA. Previous studies have shown impaired novel word learning in PWA with left inferior frontal lesions (Peñaloza et al., 2015), in line with fMRI evidence showing increased activation in inferior frontal regions during novel word learning in healthy adults (Gore et al., 2022; Sliwinska et al., 2017). However, previous lesion findings require corroboration from detailed structural imaging studies that allow quantification of lesion volume in relevant brain regions, since lesion location has previously been defined only broadly according to clinical radiological reports (Peñaloza et al., 2015).
Diffusion Tensor Imaging (DTI) studies also suggest that damage to critical white matter pathways such as the left arcuate fasciculus and the inferior longitudinal fasciculi can impair learning ability in aphasia (Coran et al., 2020; Tuomiranta et al., 2014a). However, the contribution of lesion volume in key brain regions such as the left inferior frontal gyrus (IFG) to novel word learning has not yet been examined in aphasia.
The present study aimed to evaluate lexical acquisition in PWA using a brief, potentially clinically useful experimental task that required learning novel word-referent mappings and used expressive recall and recognition tests as indexes of word learning ability. As in previous single-case studies (Tuomiranta et al., 2011; 2014a; 2014b), our learning task incorporated a flexible scoring criterion to increase the likelihood of capturing improvement on expressive recall performance in PWA. Our first aim was to evaluate novel word learning ability in PWA (i) at the group level, comparing their learning performance on expressive recall and recognition tests over time to that of a healthy control group matched for age, and (ii) at the individual level, comparing their learning outcomes on both measures at the end of training to those of the healthy control group. Based on prior research and considering the flexible scoring method employed in our study, we expected that the group-level analyses would show a significant improvement of word learning ability in PWA as measured by recognition (Gupta et al., 2006; Dignam et al., 2016) and expressive recall (Tuomiranta et al., 2011) during training, although below that of the healthy controls. We also expected that case-by-case analyses would reveal PWA who can demonstrate learning outcomes on both measures comparable to those of the control group, as well as learning outcomes on recognition significantly above chance level performance. Our second aim was to examine whether phonological cueing during expressive recall testing at the end of training would facilitate the evaluation of new word learning ability in PWA. Similar to previous studies (Tuomiranta et al., 2011; 2014b), we expected that PWA would benefit from phonological cueing provided on the last recall test at the end of training, demonstrating significantly better learning outcomes with versus without phonological cues once training was discontinued.
Our final aim was to determine whether novel word learning in PWA was modulated by three aphasia-related factors: (i) single word processing abilities, (ii) verbal STM, and (iii) left IFG lesion volume previously linked to word learning deficits in aphasia (Peñaloza et al., 2015). Additionally, we examined whether verbal STM would modulate word learning ability in healthy controls as found in previous research. We expected to find that verbal STM would modulate word learning ability in both healthy controls and PWA (Freedman & Martin, 2001; Peñaloza et al., 2015). We also expected that word learning performance in PWA would be influenced by their post-stroke single-word processing abilities (Dignam et al., 2016; Gupta et al., 2006) and left IFG lesion volume (Peñaloza et al., 2015). It is worth noting that identifying factors that modulate new word learning ability in aphasia is important given that no previous studies have examined their contribution to learning as measured by both expressive recall and recognition. Previous research has only evaluated the effects of language, cognitive and lesion factors on recognition performance alone in more complex learning paradigms (Peñaloza et al., 2016). Some studies have also investigated language but not cognitive or lesion factors as predictors of learning ability (Dignam et al., 2016; Gupta et al., 2006). In addition, given that expressive recall has been found to be at minimal levels (Dignam et al., 2016; Gupta et al., 2006) with successful performance reported only in single-case studies (Tuomiranta et al., 2011; 2014a; 2014b), the factors underlying individual variation in expressive recall as an index of new word learning in aphasia have not been fully examined.

Participants
The total sample consisted of 31 participants across two groups. The first group included 12 adults with stroke-induced chronic aphasia (2 female, mean age = 56.7 years, SD = 8.9, range = 42-73 years; mean number of years of education = 10.4, SD = 3.3, range = 5-16 years; mean time post-stroke onset = 27.9 months, SD = 10.4, range = 9-41 months). PWA were recruited across three hospitals in Barcelona, Spain. To be included in this study, PWA were required to (i) be between 30 and 80 years old, (ii) be Spanish speakers, (iii) have persistent aphasia as determined by speech and language assessments at least 6 months after a single left hemisphere stroke verified by medical records, (iv) be able to follow instructions, and (v) be eligible for MRI scanning. They did not present with a history of psychiatric or neurological disorders other than stroke and did not report any severe visual or auditory difficulties (see Table 1 for demographics and clinical information).
The second group included 19 healthy controls (12 female; mean age = 59.9 years, SD = 7.6, range = 50-77 years; mean number of years of education = 13.9 years, SD = 3.2, range = 9-18 years).¹ All healthy controls were Spanish speakers, had normal or corrected-to-normal vision and audition, and did not present with a history of psychiatric or neurological disorders, subjective memory complaints, memory disorders or learning disabilities as reported in their initial interview. The two groups were matched by age [t(29) = −1.04, p = .3]. All participants gave their written informed consent to undergo study procedures approved by the Hospital Universitari de Bellvitge Institutional Review Board (reference number: PR224/12).

Aphasia diagnosis, language and verbal STM assessments
All PWA were diagnosed during hospital admission and were re-assessed for this study at least six months post stroke. Table 2 summarizes the scores of PWA on all language and verbal STM assessments. The Spanish version of the Boston Diagnostic Aphasia Examination (BDAE-III, Goodglass et al., 2005) was used to determine their clinical aphasia profile, aphasia severity and their Language Competence Index measuring overall expression and comprehension performance. We also used the following BDAE-III subtests: the Word Comprehension (section: Basic Word Discrimination), Commands and Complex Ideational Material subtests were used to measure verbal comprehension; the Boston Naming Test (BNT) was used to assess naming; the Repetition of Sentences subtest was used to evaluate verbal repetition; and the Oral Reading subtest (sections: Basic Oral Word Reading and Oral Reading of Sentences with Comprehension) was used to evaluate reading ability. The Token Test (De Renzi & Faglioni, 1978) was employed to assess verbal comprehension of increasingly complex demands. Finally, tasks of semantic (animals) and letter fluency (letters P, M, R) (Peña-Casanova et al., 2009) were administered to measure word retrieval.
¹ The mean number of years of education and SD are reported for 17 healthy controls since two participants reported not to have received any formal education in the classroom although they had acquired reading, writing and arithmetic skills via informal education (i.e., homeschooling). The aphasia and healthy control groups were largely similar in their number of years of education, although this comparison could not be reliably tested with the mean and SD of just 17 participants while 19 healthy controls were included in all statistical analyses.
Selected subtests of the Temple Assessment of Language and Short-term memory in Aphasia (TALSA, Martin et al., 2018) were used to evaluate phonological processing and verbal STM (the latter in both PWA and healthy controls). The Phoneme Discrimination subtest required deciding whether spoken pairs of words or nonwords sounded the same or not. The Rhyming Judgements subtest required identifying whether spoken words or nonwords rhymed or not. Both subtests were administered under two memory load conditions (1 and 5 s between the presentation of each item). The proportion of correct responses across subtests and conditions was averaged into a phonological processing composite score. The Nonword Repetition subtest required repeating 15 nonwords of varying length (1, 2, or 3 syllables) across two memory load conditions (1 and 5 s unfilled intervals). The proportion of correct responses was averaged across conditions to compute a nonword repetition composite score reflective of phonological STM.
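The composite scores described above are simple averages of proportions. The following is a minimal sketch of that arithmetic, assuming each condition contributes one proportion-correct value with equal weight; the function name and the example scores are hypothetical and are not taken from the study data.

```python
from statistics import mean

def composite_score(proportions_correct):
    """Average a set of proportion-correct scores into one composite."""
    if not proportions_correct:
        raise ValueError("at least one score is required")
    if any(not 0.0 <= p <= 1.0 for p in proportions_correct):
        raise ValueError("scores must be proportions between 0 and 1")
    return mean(proportions_correct)

# Hypothetical participant: proportion of the 15 nonwords repeated correctly
# under the 1-second and 5-second unfilled interval conditions.
nonword_repetition = composite_score([12 / 15, 9 / 15])
```

The same helper could take four values (two subtests by two conditions) for the phonological processing composite, under the equal-weighting assumption stated above.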
In order to examine factors that modulate learning ability in aphasia (aim 3), we computed a composite score for single-word processing abilities reflecting the proportion of accurate responses of each participant across the BDAE-III Word Comprehension subtest and the Boston Naming Test. These two measures combine phonological and lexical-semantic abilities which have been associated with novel word learning ability in PWA (Dignam et al., 2016; Gupta et al., 2006; Tuomiranta et al., 2011). Additionally, the nonword repetition composite score was chosen as a measure of phonological STM (Gathercole et al., 1994) since it allows for the assessment of encoding, storage and production processes while minimizing the potential contributions of semantic knowledge and item familiarity to STM performance as measured by conventional span tasks (Perrachione et al., 2017). Moreover, the requirement of repeating unfamiliar phoneme sequences is close to the experience of learning novel words (Baddeley, 2003), as demonstrated in our previous work showing an association between nonword repetition and word learning performance in PWA (Peñaloza et al., 2016) and healthy individuals.

Image acquisition and preprocessing
The structural MRI of PWA was obtained at Hospital Clinic (Barcelona, Spain) on a Siemens Magnetom 3 T scanner with the Syngo MR B17 software, using a 32-channel head-coil. High resolution T1 brain images (MPRAGE) were obtained (TR = 1970 ms; TE = 2.34 ms; slice thickness = 1.0 mm; acquisition matrix, 256 × 256; voxel size, 1.0 × 0.8 × 0.4 mm). The stroke lesion was traced manually on MRIcron (https://www.nitrc.org/projects/mricron/) for each PWA and the resulting lesion masks were normalized to the Montreal Neurological Institute (MNI) space using SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Lesion masks were then used to determine the lesion localization and to compute the left IFG lesion volume (Table 1) to evaluate its effects on novel word learning in PWA. Although eligible for the neuroimaging protocol upon initial screening, 3 PWA were unavailable for MRI scanning due to a change in personal circumstances which prevented their participation in this part of the study. Fig. 1 depicts the lesion overlay map for the 9 PWA who completed the MRI scanning procedures.

Experimental word learning task
The associative novel word learning task required participants to learn 6 novel word-referent pairings involving 6 black-and-white pictures of unknown objects (AFE paradigm, Laine & Salmelin, 2010) paired with 6 pseudowords (3 bisyllabic, 3 trisyllabic) created according to the phonotactic rules of Spanish (Fig. 2A). The learning task consisted of 3 training cycles, each with a learning phase of 4 training blocks, and a testing phase including a naming test which served as a measure of expressive recall, followed by a recognition test (Fig. 2B). The stimuli were presented on a laptop computer using a PowerPoint presentation and the task administration lasted approximately 20 min for each participant. Each training block included 6 learning trials comprising the randomized presentation of all stimuli. Each learning trial showed the picture of an unknown object with its corresponding written label (pseudoword) on the computer screen. The object label was read aloud by the examiner and participants needed to repeat it aloud. The learning task included 3 naming tests as measures of expressive recall, 1 per training cycle, and each test included 6 randomized trials. Each test trial presented a trained object picture for up to 30 s for naming.
Table 2 note: Scores on the BDAE-III below the 50th percentile are marked in bold to identify the cases with the most severe deficits as compared to the normative sample of PWA included in the development of this aphasia diagnostic battery. Scores on the Token test and the verbal fluency tasks reflecting deficits according to Spanish normative data are also marked in bold. Scores on the TALSA battery subtests are presented individually and as composite scores of phonological processing and phonological verbal short-term memory (nonword repetition) expressed as proportions of correct responses. 1secU = 1-second unfilled interval condition; 5secU = 5-second unfilled interval condition.
To determine whether learning outcomes on expressive recall at the end of training could be more clearly demonstrated in PWA via phonological cueing, the examiner provided the first syllable of the target pseudoword if participants could not retrieve it accurately on Naming Test 3, allowing them 10 additional seconds to name the test object. For instance, if the participant could not retrieve the correct name for the item "balute", the syllable "ba" was provided as a phonological cue. In this way, phonological cues were only provided after naming failures. Participants' oral responses were annotated by hand by the examiner and the naming tests were scored offline. As done in previous research (Tuomiranta et al., 2011), Naming Tests 1, 2, and 3 were scored using a permissive criterion [total score = fully correct responses + responses with a single phoneme change (addition, omission, or change of position) in an otherwise correct response]. Additionally, we computed a Naming with Phonological Cueing score [total score = fully correct responses produced on Naming Test 3 + fully correct responses produced after a phonological cue].
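The permissive criterion can be illustrated with a short sketch. It rests on several assumptions that are not stated in the text: letters stand in for phonemes (a rough approximation for Spanish-like pseudowords), a "change of position" is read as a single phoneme moved to a different slot, and substitutions are not accepted. The function names are hypothetical; scoring in the study was done by hand from the examiner's annotations.

```python
def one_removed(longer, shorter):
    """True if deleting exactly one element of `longer` yields `shorter`."""
    if len(longer) != len(shorter) + 1:
        return False
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))

def one_moved(response, target):
    """True if moving one phoneme of `response` to another slot yields `target`."""
    if len(response) != len(target):
        return False
    for i in range(len(response)):
        rest = response[:i] + response[i + 1:]
        for j in range(len(response)):
            if j != i and rest[:j] + [response[i]] + rest[j:] == target:
                return True
    return False

def is_permissive_correct(response, target):
    """Fully correct, or a single addition, omission, or change of position."""
    response, target = list(response), list(target)
    return (response == target
            or one_removed(response, target)   # one added phoneme
            or one_removed(target, response)   # one omitted phoneme
            or one_moved(response, target))    # one phoneme out of place

def naming_score(responses, targets):
    """Count the trials that meet the permissive criterion."""
    return sum(is_permissive_correct(r, t) for r, t in zip(responses, targets))
```

Under these assumptions, "balut" and "baulte" would both count as correct for the target "balute", whereas "balupe" (a substitution) and the bare cue "ba" would not.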
The learning task also included 3 self-paced recognition tests, 1 per training cycle following a naming test, and each recognition test included 12 randomized trials. Each trial presented a trained object picture with a trained spoken word and participants needed to decide whether the pairing was correct or not. Each trained object was tested twice, once with the correct word reflecting a correct word-referent association, and once with an incorrect trained word matching another object of the training set, thus reflecting an incorrect word-referent association. Incorrect word-referent pairings appeared only once in a recognition test and followed a randomized order across recognition tests.
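The structure of one recognition test can be sketched as follows. This is an illustration only: the object and word labels are placeholders, and the sketch assumes that each trained word serves as the foil for exactly one other object (a derangement), which the text does not state explicitly.

```python
import random

def build_recognition_test(pairs, rng):
    """Return 12 shuffled trials: each object once with its own word, once with a foil."""
    objects = list(pairs)
    words = [pairs[o] for o in objects]
    # Assumption: foils form a derangement, so every trained word is used
    # as the incorrect label for exactly one other object.
    foils = words[:]
    while any(f == w for f, w in zip(foils, words)):
        rng.shuffle(foils)
    trials = [(o, w, True) for o, w in zip(objects, words)]    # correct pairings
    trials += [(o, f, False) for o, f in zip(objects, foils)]  # incorrect pairings
    rng.shuffle(trials)
    return trials

# Placeholder stimuli; the actual pseudowords are not listed in the text.
example_pairs = {f"object_{i}": f"word_{i}" for i in range(1, 7)}
example_trials = build_recognition_test(example_pairs, random.Random(0))
```

Each trial is a tuple of (object, spoken word, whether the pairing is correct), so item-level accuracy can be scored by comparing a participant's yes/no decision with the third element.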

Statistical analyses
All group-level statistical analyses were performed using R (R Core Team, version 4.2.2). We used generalized linear mixed models (GLMMs) to evaluate the likelihood of correct responses to items in both measures of novel word learning, namely expressive recall and recognition, comparing PWA and healthy control groups across all three training cycles. GLMMs (lme4 package; Bates et al., 2015) were selected because they are well suited to work with binomially distributed responses reflecting accuracy in performance (e.g., correct versus incorrect) while accounting for sources of variation in the data unrelated to the experimental design (Jaeger, 2008). The GLMMs (binomial family; logit link function) were constructed following the forward method for variable selection, and we compared them using the Akaike Information Criterion (AIC) and the Likelihood Ratio Test (LRT) to identify the best model explaining the variability of the data. Of note, a backward approach to model building with all three predictors of interest in a reduced sample would lead to a highly complex model with many interactions and a low number of data points. Hence, the forward method was preferred since it allowed us to control the complexity of the models avoiding too many and unnecessary interactions. 
The order of predictors was guided by theory while also considering our data availability: (i) phonological STM was examined first since this is a relevant predictor of learning ability in both healthy adults (n = 19) and PWA (n = 12) and the data were available for the whole sample, (ii) single word processing abilities were assessed next with just PWA since there is substantial evidence that language abilities predict word learning ability in PWA and the data were available for all participants in this group (n = 12), and (iii) lesion volume was assessed last since lesion characteristics have been less frequently examined in relation to learning ability and this is the predictor with the fewest cases with available data (n = 9). Notably, single word processing abilities were not significantly correlated with phonological STM (r = 0.466, p = .146) or with left IFG lesion volume (r = −0.611, p = .081). Phonological STM was also not significantly correlated with left IFG lesion volume (r = −0.250, p = .516). Multicollinearity was further assessed and ruled out (tolerance > 0.95 and Variance Inflation Factors, VIFs, < 1.5 in the final models). This allowed us to evaluate their potential contributions to novel word learning in PWA as independent from one another.
All GLMMs are presented in Supplementary Table 1 as per best practice reporting guidelines (Meteyard & Davies, 2020). Results from the GLMMs were extracted, converted from log-odds to probabilities, and plotted using R packages ggeffects (Lüdecke, 2018) and ggsignif (Ahlmann-Eltze & Patil, 2021). In each model, item-level accuracy on either the naming or the recognition tests (i.e., scored as 0 or 1) was defined as the dependent variable. We included random intercepts for participant and item to allow for differences in accuracy on both tests according to individual participant and item characteristics. However, random slopes could not be included because the final models became singular, indicating that they could be overfitted and underpowered (Bates et al., 2015). Fixed effects included group (PWA; healthy controls), training cycle, and phonological STM, the latter to evaluate whether it was a significant predictor of expressive recall and recognition performance in both groups. Additionally, the GLMMs constructed for recognition performance also considered whether the recognition test presented a trained object picture with its corresponding trained word (correct word-referent association) or with a trained word that matched another object of the training set (incorrect word-referent association), hereafter: association (correct; incorrect) as a fixed effect. All models also considered all possible interactions between fixed factors. We followed the same procedure to construct separate GLMMs to test whether specific measures that were only relevant to PWA such as single word processing abilities and left IFG lesion volume were predictors of learning performance on both tests in this group. These two models were constructed separately since single word processing scores were available for all participants in the aphasia group but the lesion volume was available for only nine of them.
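Because the models use a logit link, their estimates are on the log-odds scale, and the conversion to probabilities mentioned above is the inverse logit. A minimal sketch of that transformation follows; it is written in Python for illustration only, whereas the study performed this step with R packages.

```python
import math

def inv_logit(log_odds):
    """Convert a value on the log-odds (logit) scale to a probability."""
    # Two branches keep the exponential from overflowing for large |log_odds|.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)
```

For example, a predicted log-odds of 0 corresponds to a probability of 0.5, and a log-odds of ln(3) corresponds to odds of 3:1, that is, a probability of 0.75.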
This approach allowed us to avoid reducing the analyses of predictors of word learning ability in aphasia to just the minimum sample for whom all data were available, which may have led to model overfitting. When needed, post-hoc analyses were conducted via t-test comparisons to identify the levels that showed significant differences, using the False Discovery Rate (FDR) to correct p-values (emmeans package, Lenth et al., 2019). Both the t-ratio and the corrected p-values (p_adj) are reported. All the steps performed to obtain the final models are detailed in the R file provided on the OSF platform, available at https://osf.io/879gf/.
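Two of the steps described above, converting model estimates from log-odds to probabilities and correcting p-values with the FDR, can be illustrated with a minimal sketch. It is written in Python for illustration only (the study's own analyses were run in R), it assumes that the FDR correction is the Benjamini-Hochberg procedure, and the function names and all numeric values are hypothetical.

```python
import math


def inv_logit(log_odds):
    """Convert a log-odds value (the scale of a logistic GLMM) to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical model estimates on the log-odds scale (not study data).
example_log_odds = [-1.2, 0.0, 0.8]
example_probs = [inv_logit(x) for x in example_log_odds]

# Hypothetical uncorrected p-values from three post-hoc contrasts.
raw_p = [0.004, 0.030, 0.020]
adj_p = fdr_bh(raw_p)

print(example_probs)
print(adj_p)
```

A log-odds of 0 maps to a probability of 0.5, which is why estimates near zero on the model scale correspond to chance-level accuracy on a binary outcome.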
Case-control comparisons used modified t-tests based on classical inferential methods for single cases (Crawford & Garthwaite, 2002), implemented in the Singlims_ES software available at https://homepages.abdn.ac.uk/j.crawford/pages/dept/psychom.htm, to compare the learning outcomes of PWA on both learning measures at the end of training to those of the healthy control group. Finally, the binomial test was used to examine whether the overall recognition performance of the PWA at the end of training was significantly above chance level. Case-control comparisons, reported as per guidelines (Crawford et al., 2010), and binomial test results are presented in Supplementary Tables 2, 3 and 4.
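The logic of a modified t-test for single cases can be sketched as follows. This is an illustrative Python sketch and not the Singlims_ES implementation: it assumes the standard form of the test, in which the control sample's standard deviation is inflated by sqrt((n + 1) / n) and the statistic is referred to a t-distribution with n − 1 degrees of freedom. The function names and the control scores are hypothetical, and the t-distribution tail is approximated numerically so that the example needs only the standard library.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_lower_tail(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def modified_t_test(case_score, control_scores):
    """Compare one case to a small control sample.

    Returns (t, df, one-tailed p for a score this low or lower, z_cc effect size).
    Requires control scores with non-zero variance.
    """
    n = len(control_scores)
    mean_c = statistics.mean(control_scores)
    sd_c = statistics.stdev(control_scores)
    t = (case_score - mean_c) / (sd_c * math.sqrt((n + 1) / n))
    p_one_tailed = t_lower_tail(t, n - 1)
    z_cc = (case_score - mean_c) / sd_c
    return t, n - 1, p_one_tailed, z_cc


# Hypothetical control scores and case score (not study data).
controls = [8, 9, 10, 9, 8, 10, 9, 9, 10, 8]
t_stat, df, p_value, z_cc = modified_t_test(5, controls)
print(t_stat, df, p_value, z_cc)
```

Treating the control mean and standard deviation as sample statistics rather than population parameters is what distinguishes this approach from a simple z-score, and it matters most when the control group is small.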

Novel word learning as measured by expressive recall in PWA and healthy controls
To evaluate change over time on expressive recall across the healthy control and aphasia groups, we followed the forward method for variable selection in the GLMM. We started with a null model that only considered the random intercepts, and the predictors were added one by one. The final GLMM [naming accuracy ~ cycle + group + (1|participant) + (1|item); χ²(3) = 37.2, p = 4.28 × 10⁻⁸, AIC = 563, LL = -275] revealed a significant improvement in expressive recall during training [χ²(2) = 31.0, p = 1.85 × 10⁻⁷], with significantly superior learning for the healthy controls relative to the PWA [χ²(1) = 4.7, p = .0299]. No differential benefit of training between PWA and healthy controls was observed across training cycles (Fig. 3A; see Supplementary Fig. 1 for group performance and individual data distribution) given that the model containing the main effects (Cycle + Group) was better than the model with the Cycle × Group interaction (compare models Main effects 1 and Cycle × Group provided in Supplementary Table 1).
To assess the factors that could influence expressive recall in PWA, we fitted two separate GLMMs using the PWA data, one to evaluate the influence of single word processing abilities, and the other to assess the influence of the left IFG lesion volume. The first model [PWA naming accuracy ~ cycle + (1|participant) + (1|item); χ²(2) = 8.8, p = .0126, AIC = 254, LL = -122] revealed that single word processing abilities had no effect on expressive recall, while the second model [PWA naming accuracy ~ cycle + IFG lesion + (1|participant) + (1|item); χ²(3) = 9.7, p = .0214, AIC = 201, LL = -94] indicated that the left IFG lesion volume did modulate expressive recall [χ²(1) = 7.8, p = .0053] with superior learning for PWA who had a smaller lesion volume relative to those with larger damage to the left IFG (Fig. 3B).
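As a worked check on how the chi-square statistics reported in this section relate to their p-values, the upper-tail probability of a chi-square distribution has simple closed forms for 1, 2 and 3 degrees of freedom. The sketch below is in Python for illustration only and the function name is hypothetical. Because the statistics in the text are rounded to one decimal place, the computed values should only approximate the reported p-values.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square variable for df = 1, 2 or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("closed forms are given here only for df in {1, 2, 3}")


# Statistics as reported (rounded) for the expressive recall model.
p_model = chi2_upper_tail(37.2, 3)   # final model versus the null model
p_cycle = chi2_upper_tail(31.0, 2)   # effect of training cycle
p_group = chi2_upper_tail(4.7, 1)    # effect of group

print(p_model, p_cycle, p_group)
```

For example, with 2 degrees of freedom the p-value is exp(−31.0 / 2), which is about 1.86 × 10⁻⁷ and agrees with the reported value up to rounding.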

Effects of phonological cueing on learning outcomes as measured by expressive recall in PWA
To capture the effect of phonological cueing on learning outcomes in PWA on expressive recall testing at the end of training, we used a paired-samples t-test to contrast Naming Test 3 scores versus Naming with Phonological Cueing scores (both measures including fully correct responses only) for just the participants who were not at ceiling on Naming Test 3 and could therefore show some benefit from receiving a phonological cue (n = 10 excluding P4 and P6; score on Naming Test 3 range = 0 to 5). The results showed that learning as measured by expressive recall at the end of training was significantly superior when including correct responses that followed a phonological cue (Naming with Phonological Cueing scores: M = 0.48; SD = 0.29) versus when accounting for only correct responses without the cue (Naming Test 3 scores: M = 0.33; SD = 0.25) [t(9) = 3.25, p = .01] (Fig. 3C). Thus, phonological cueing provided a sensitive measure of learning on expressive recall testing in PWA with substantial lexical acquisition yet inaccurate recall of the trained items at the end of training.
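The paired-samples contrast described above reduces to a one-sample t-test on the within-participant differences. The following Python sketch shows the arithmetic only. The function name and the ten pairs of scores are hypothetical and are not the study's data.

```python
import math
import statistics


def paired_t(uncued, cued):
    """Paired-samples t statistic and degrees of freedom for cued minus uncued scores."""
    if len(uncued) != len(cued):
        raise ValueError("both score lists must have one value per participant")
    diffs = [c - u for u, c in zip(uncued, cued)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical proportions correct for 10 participants (not study data).
uncued_scores = [0.0, 0.2, 0.3, 0.5, 0.3, 0.7, 0.2, 0.5, 0.0, 0.6]
cued_scores = [0.2, 0.3, 0.5, 0.5, 0.7, 0.8, 0.3, 0.8, 0.0, 0.7]

t_stat, df = paired_t(uncued_scores, cued_scores)
print(t_stat, df)
```

With n = 10 participants the test has 9 degrees of freedom, matching the form of the statistic reported in the text.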

Learning outcomes as measured by expressive recall in PWA at the individual level
Case-control comparisons revealed that 7 PWA showed spared learning outcomes on Naming Test 3 as compared to those of the healthy control group. Five PWA (P1, P2, P3, P8 and P12) showed impaired word learning on expressive recall testing (all p <.05) (Supplementary Table 2 and Supplementary Fig. 2).

Novel word learning as measured by recognition in PWA and healthy controls
A subsequent GLMM was fitted to evaluate change in recognition performance over time in the healthy control and aphasia groups. The final model [recognition accuracy ~ cycle + association * group + phonological STM + (1|participant) + (1|item); χ²(6) = 72.3, p = 1.41 × 10⁻¹³, AIC = 529, LL = -256] showed a significant overall increase in recognition performance across training cycles [χ²(2) = 29.2, p = 4.52 × 10⁻⁷; Fig. 3D], with a significant interaction between association and group [χ²(1) = 5.9, p = .0152] which indicates a superior performance by the healthy controls relative to PWA in incorrect association test trials (z-ratio = 3.01, p = .0026), but not in correct association test trials (z-ratio = 0.84, p = .4028) (Fig. 3E; see Supplementary Fig. 3 for group performance and individual data distribution). We also found a significant effect of the participants' phonological STM on recognition performance across groups [χ²(1) = 5.6, p = .0184], with higher recognition performance for participants with higher verbal STM scores (Fig. 3F). Similar to expressive recall, there was no differential benefit of training between PWA and healthy controls across training cycles given that the model with the Cycle × Group interaction was not as good as the model with the Association × Group interaction (compare models Association × Group and Cycle × Group in Supplementary Table 1).
Two separate GLMMs using just the PWA data were fitted to study the factors that could influence word learning in this group as measured by recognition tests. The first model [PWA recognition accuracy ~ cycle + association + single word processing + (1|participant) + (1|item); χ²(5) = 58.6, p = 2.38 × 10⁻¹¹, AIC = 293, LL = -140], evaluating the influence of single word processing abilities, revealed significantly superior learning for PWA with higher relative to lower language ability as reflected by this composite score [higher language score; χ²(1) = 16.8, p = 4.23 × 10⁻⁵; Fig. 3G]. The second model [recognition accuracy ~ cycle = 5.18 × 10⁻¹⁰, AIC = 205, LL = -93] showed a significant interaction between training cycle and left IFG lesion volume [χ²(2) = 7.3, p = .0257]. This interaction indicates that learning ability measured via recognition was different for PWA depending on their left IFG lesion volume. Specifically, there were significant differences between training cycles 1 and 2 for PWA with smaller lesions ≤ 2.17 cm³ (threshold set using median split; z-ratio = −3.0, p = .0043 for a lesion volume of 0.043 cm³, corresponding to the first quartile), but no differences across cycles for PWA with larger lesions > 2.17 cm³ (z-ratio = −1.88, p = .0607 for a lesion volume of 12.02 cm³, corresponding to the third quartile). For the sake of simplicity, this interaction is illustrated in Fig. 3H using the median split in the left IFG lesion volume factor.

Learning outcomes as measured by recognition in PWA at the individual level
Case-control comparisons revealed that most PWA showed spared learning outcomes on Recognition Test 3 relative to the control group. Only 3 PWA (P1, P3 and P8) showed impaired learning when considering all recognition trials (p = .001 in all cases) and only incorrect association trials (p ≤ 0.015 in all cases). Likewise, the exact binomial test showed that overall performance on this measure was significantly superior to chance level (50% correct) for most PWA (binomial test p ≤ 0.0032, one-tailed), except for P1, P3 and P8 (binomial test p = .073, one-tailed in all cases) (Supplementary Tables 3 and 4, Supplementary Fig. 4).
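The one-tailed exact binomial test against a 50% chance level can be sketched in a few lines. This Python example is illustrative only. The function name is hypothetical, and the trial count of 12 and the numbers of correct responses are assumptions chosen for the example rather than values taken from the study's data.

```python
import math


def binomial_upper_tail(successes, trials, chance=0.5):
    """One-tailed exact p-value: P(X >= successes) under the chance level."""
    return sum(
        math.comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical recognition test with 12 trials (an assumption for illustration).
p_nine_correct = binomial_upper_tail(9, 12)     # 299 / 4096, about .073
p_eleven_correct = binomial_upper_tail(11, 12)  # 13 / 4096, about .0032

print(p_nine_correct, p_eleven_correct)
```

On a test this short, a participant must be correct on nearly every trial to exceed chance reliably, which is one reason a non-significant binomial result should be read alongside the case-control comparisons rather than on its own.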

Discussion
The present study aimed to (i) evaluate novel word learning ability in PWA at the group and at the individual level, comparing their learning performance on expressive recall and recognition tests to that observed in healthy control individuals, (ii) to determine whether phonological cueing during expressive recall testing at the end of training would provide a sensitive measure of learning which is particularly demanding for PWA, and (iii) to identify language-related, cognitive and lesion factors that modulate word learning ability. Our findings revealed that most PWA demonstrated word learning ability as evaluated by expressive recall and recognition tests. Moreover, their expressive learning became more evident with phonological cueing on expressive recall testing at the end of training. Single word processing abilities and phonological STM modulated word learning ability in PWA on recognition but not on expressive recall tests. Phonological STM showed the same modulatory effect on learning as assessed by recognition tests in the healthy controls. Importantly, the integrity of the left IFG modulated learning ability in PWA in both recognition and expressive recall performance. In what follows, we discuss these findings in more detail.

Novel word learning as measured by expressive recall tests in aphasia
The results on expressive recall as a measure of novel word learning showed that PWA as a group demonstrated significant novel vocabulary acquisition during training, although their performance was lower than that of the healthy controls. Individually, 7 PWA showed learning outcomes on expressive recall comparable to those of the healthy controls, while 5 PWA demonstrated impaired learning on this measure. These findings contribute to the still limited evidence on word learning ability in aphasia as assessed by expressive recall and suggest that while this measure reveals preserved learning ability in some PWA, demonstrating lexical acquisition via expressive recall can be challenging for others (Coran et al., 2020;Dignam et al., 2016;Gupta et al., 2006). While past research has reported the learning ability of PWA to be at floor levels on expressive recall measures on group level analyses (Dignam et al., 2016;Gupta et al., 2006), our findings support single case studies (Tuomiranta et al., 2011;2014a;2014b) demonstrating significant learning in PWA on this measure. This discrepancy across studies could be explained in part by differences in the methods employed to measure learning ability. For example, Gupta et al. (2006) considered only fully correct responses reflecting full acquisition of the trained novel words. Similar to previous research (Coran et al., 2020;Tuomiranta et al., 2011;2014a;2014b), we used a permissive criterion which considered minimal variations in the spoken production of the trained words as correct responses, allowing us to capture significant improvements in learning on expressive recall testing as a function of training. Scoring systems that credit mostly correct responses may help capture successful word retrieval while disregarding minimal phonological deviations that may arise during post-lexical processes (Schuchard et al., 2020). 
In this way, our scoring criterion allowed us to capture improvement for newly acquired words that were nevertheless difficult to recall in a fully correct manner, possibly due to phonological output difficulties which are commonplace in aphasia (Laine & Martin, 2006). However, Dignam et al. (2016) did not find significant learning on expressive recall measures in most participants despite using a permissive criterion. This discrepancy could reflect differences in the participants' aphasia severity since our participants had predominantly mild to moderate aphasia, whereas those reported by Dignam et al. (2016) presented with more severe language impairment.

Phonological cueing during expressive recall testing facilitates the evaluation of novel word learning outcomes in aphasia
Fig. 3. Novel word learning in PWA and healthy controls as evaluated by expressive recall and recognition measures. The learning performance of the healthy controls (HC) and people with aphasia (PWA) is shown as proportion of correct responses in naming and recognition tests, with all panels showing the mean and the standard error of the mean (shaded area) reflecting individual variability. Naming accuracy reflects the use of a flexible scoring criterion unless noted otherwise. (A) Learning curves of PWA and HC across naming tests throughout the three training cycles. (B) For PWA, the left IFG lesion volume modulated their expressive recall performance indicating that PWA with smaller lesions showed better performance across naming tests. (C) Contrast between Naming Test 3 scores (without phonological cueing) versus Naming with Phonological Cueing scores (both scores computed with a strict scoring criterion), showing a significant phonological cueing effect on expressive recall outcomes for PWA. (D) Learning curves of PWA and HC across recognition tests throughout the three training cycles. (E) Interaction between group and association (correct vs incorrect) showing that recognition performance is significantly superior for HC relative to PWA only in incorrect association trials which require rejecting incorrect word-referent mappings. Main effects of (F) phonological STM and (G) single word processing abilities on overall recognition performance, each one showing better learning for PWA with better STM and language scores. (H) Interaction between training cycle and left IFG lesion volume in PWA, showing significant changes between training cycles 1 and 2 for PWA with smaller relative to larger lesions (median split).
We further examined if learning outcomes could be more clearly evidenced in PWA by providing phonological cues when a fully accurate response was not produced on expressive recall testing at the end of training. The comparison of naming accuracy with a phonological cue versus without one on the final naming test revealed that PWA could demonstrate significantly greater learning when phonologically cued responses were taken into account. This finding suggests that the acquisition of the trained novel word forms in some PWA was largely successful and the phonological cue effectively helped them to produce the target word correctly. Phonological cues can help to determine the status of word knowledge in the language system (Jefferies & Lambon Ralph, 2006) as words that respond well to cueing are the closest ones to activation threshold for accurate retrieval (Jefferies et al., 2008). In terms of novel word learning, our findings align with this proposal, indicating that minimally stable representations of the newly acquired words must be reached before phonological cueing can boost them to sufficient activation threshold levels for accurate retrieval. Such minimal stability of novel word representations may be a necessary condition for phonological cueing to effectively support the long-term maintenance of newly trained vocabulary in PWA after discontinued training as reported in previous studies (Tuomiranta et al., 2011; 2014a). This interpretation is supported by evidence of phonological cueing being sensitive to lexical information loss for newly acquired words in patients with preclinical Alzheimer's disease (Tort-Merino et al., 2017), mild cognitive impairment and Alzheimer's disease (Grönholm-Nyman et al., 2010) on similar learning tasks using the AFE paradigm (Laine & Salmelin, 2010). Phonological cueing is a clinically relevant facilitation technique since responsiveness to phonological cueing on naming assessments can indicate whether PWA will benefit from cue-based naming therapy (Hickin et al., 2002).
Moreover, words that can be successfully retrieved by PWA with only minimal cueing prior to therapy are more likely to show a more successful response to therapy than those requiring more substantial cues (Conroy et al., 2012). Thus, our study extends the evidence on the clinical utility of phonological cueing in naming to the assessment of word learning ability in aphasia on expressive recall measures, as it can more clearly evidence nearly full lexical acquisition for responses that otherwise would be taken as failed attempts of lexical retrieval.

Novel word learning as measured by recognition tests in aphasia
We found that PWA showed significant learning on recognition measures with group level performance being similar to the healthy controls in the identification of correct word-referent trained associations. However, PWA also showed significantly worse identification of incorrect word-referent associations (i.e., trained referents incorrectly paired with words of the training set associated with other referents during training) relative to the healthy controls. Recognition performance in correct word-referent association trials may have been less difficult for PWA since the mapping between the novel word form candidate and the target object was systematically presented during training, which enhances reliance on familiarity at testing. In turn, recognition performance on incorrect word-referent association trials may be more challenging for PWA because memory traces for correct novel word-referent mappings may be initially unstable, and the introduction of trained yet incorrect novel word candidates during recognition testing may lead to competition and interference, yielding a high number of false alarms during the initial phase of learning. It has been suggested that novel word learning may call for semantic retrieval mechanisms that require conflict monitoring and resolution between candidate lexical items to allow for the selection of the best fitting candidate (Rodríguez-Fornells et al., 2009). Moreover, the on-line control of irrelevant or competing memory associations during paired-associate learning can be disrupted in people with frontal lesions (Shimamura et al., 1995). Thus, our findings suggest that some PWA may present with more susceptibility to memory interference during initial encoding and that their learning ability may depend on their ability to resolve such interference.
Importantly, although the two groups differed in their recognition performance patterns during training, PWA were far behind the control group in their ability to reject incorrect word-referent mappings on the first but not on the following recognition tests. This finding suggests the difficulty in discriminating correct versus incorrect word-referent associations may resolve after additional training, indicating slower yet successful learning ability.
The group-level findings on successful learning in recognition testing were confirmed at the individual level since most PWA showed significantly superior learning outcomes (Recognition Test 3) relative to chance level and their performance on incorrect association trials was comparable to that of the healthy controls at the end of training. This study therefore contributes to the growing body of research demonstrating the largely preserved word learning ability of PWA on recognition tasks (Coran et al., 2020; Dignam et al., 2016; Gupta et al., 2006). It further underscores two important methodological suggestions for the assessment of learning ability in aphasia, namely (i) a fine-grained examination of recognition performance, as different patterns that arise for correct versus incorrect word-referent mappings may indicate interference during learning, and (ii) the use of repeated assessments to determine whether initial interference during learning is resolved over time, since only a few learning instances or a single time-point assessment may underestimate word learning capacity in PWA. Our findings confirm that recognition measures can provide a reliable metric of learning ability (Coran et al., 2020; Dignam et al., 2016) and may help some PWA with production difficulties to circumvent the higher demands of expressive recall measures (Peñaloza et al., 2022).
Of note, all PWA who showed successful learning on expressive recall also succeeded on recognition testing, but not the other way around. Specifically, P2 and P12 had spared learning on recognition measures but showed impaired learning outcomes on expressive recall. Although both had frontal lesions, P2 was more severely affected than P12 who was substantially recovered but presented with residual language complaints, making it difficult to identify the reasons for this pattern of performance. Nonetheless, this dissociation suggests that some PWA may show intra-individual variability in their learning ability across expressive recall and receptive recognition measures (Coran et al., 2020), and that using both expressive recall and recognition measures is essential to have a comprehensive overview of word learning abilities in PWA. Importantly, a variety of experimental methods have been described to enhance performance on expressive recall measures in PWA (Peñaloza et al., 2022) which may improve learning outcomes in individuals with similar profiles. However, the relationship between learning as measured via expressive recall and recognition and the clinical implications of their interactions requires further research.

Predictors of novel word learning performance as measured by expressive recall and recognition tests
We also examined the influence of language-related, cognitive and lesion factors on word learning in aphasia (see Peñaloza et al., 2022 for a review). We found that both single-word processing abilities and phonological STM modulated learning as measured by recognition, while left IFG lesion volume modulated learning performance on both recognition and expressive recall in aphasia. Our results support the finding that novel word learning measured via recognition relies on lexical-semantic processing abilities (Gupta et al., 2006) captured by measures of single word comprehension and lexical retrieval in our composite score. While we chose a combined metric of specific measures tapping phonological and lexical-semantic processing abilities to index the functionality of the language system, our findings align with other studies showing that recognition performance is modulated by language abilities as indexed by global measures such as aphasia severity (Dignam et al., 2016;Marshall et al., 2001;Peñaloza et al., 2016;. Altogether, these findings suggest that the progressive recognition of novel words during learning requires functional language abilities that help process novel linguistic representations to correctly identify newly trained word-referent mappings. In this way, effective recognition performance may signal the availability of the language processing system to support successful word learning (Peñaloza et al., 2022). The finding that learning as measured by expressive recall was not modulated by single word processing abilities may reflect the inclusion of mostly mild to moderate impairment profiles in our sample. While other studies have aimed to differentiate the contributions of specific language abilities such as phonological and lexical-semantic processing to word learning in aphasia (Gupta et al., 2006), our composite score reflected the combination of these abilities to minimize the number of predictors evaluated given our limited sample size. 
More research is needed to elucidate whether specific language processing abilities modulate learning on expressive recall measures in PWA.
Similarly, we found that phonological STM modulated novel word learning as measured by recognition but not expressive recall tests in both PWA and healthy controls. Our earlier work has shown that verbal STM tapped by different metrics is associated with novel word learning in aphasia as assessed via recognition performance (Peñaloza et al., 2015;. Moreover, the present study supports our previous research showing that phonological STM as measured by nonword repetition modulates the ability to learn novel word-referent mappings in healthy older adults  and PWA (Peñaloza et al., 2016). Phonological STM is known to make important contributions to novel word learning (Baddeley, 2003;Gupta & Tisdale, 2009). Verbal STM mechanisms would contribute to encoding and temporary maintenance of the serial order of activation of representations at the lexical and sub-lexical levels, allowing for accurate reproduction of a given linguistic sequence in both novel word learning and nonword repetition (Gupta, 2003). This short-term maintenance mechanism may help establish connections between the sub-lexical and lexical levels of representation (Gupta, 2003) facilitating the strengthening of memory traces for novel word forms as unique lexical entities during learning. Because the recognition test included trained novel words in incorrect word-referent mappings potentially leading to competition and interference during learning, the maintenance of serial order of sub-lexical units (e.g.: syllables) may have been required to help achieve appropriate lexical selection during recognition testing. In contrast, it is possible that our test of expressive recall was less taxing on phonological STM since only the trained object picture was presented for naming and recall accuracy was measured using a flexible criterion on the verbal outcome.
We also found that the left IFG lesion volume negatively modulated novel word learning ability in PWA in both recognition and expressive recall measures. Notably, the association between lesion volume in key brain regions for lexical acquisition and word learning performance in PWA had not been assessed in other studies. Our findings corroborate previous evidence for the involvement of the left IFG in vocabulary learning in healthy adults (Gore et al., 2022; Sliwinska et al., 2017), and research showing impaired novel word learning in PWA with left inferior frontal lesions (see Peñaloza et al., 2022 for a review). The fMRI study conducted by Sliwinska et al. (2017) revealed increased activation in the left IFG, among other brain regions in the cingulo-opercular network, in healthy adults during associative novel word learning. The left IFG is part of the cingulo-opercular system, one of the domain-general networks that constitute the Multiple Demand Cortex (Fedorenko et al., 2013) known to interact with domain-specific brain regions to facilitate learning (Chein & Schneider, 2005). The Multiple Demand Cortex may deploy general cognitive resources such as working memory, selective attention and performance monitoring during the initial stages of vocabulary learning characterized by high uncertainty on performance accuracy, and shows a decline in activity as more automated mechanisms are in place once learning has occurred (Sliwinska et al., 2017). A more recent fMRI study by Gore et al. (2022) suggests that the left IFG is part of the language processing network where newly acquired vocabulary is transferred from the hippocampus for consolidation and long-term storage (Davis & Gaskell, 2009; McClelland et al., 1995). Gore et al. (2022) found that greater left hippocampal activation was associated with lower accuracy and longer reaction times when naming newly acquired words.
In turn, greater neocortical activation in language regions (including the left IFG) supporting already well-known vocabulary was associated with higher accuracy and shorter reaction times during newly acquired vocabulary retrieval. In view of these fMRI studies, damage to the left IFG could impair novel vocabulary learning in aphasia by decreasing the capacity of the cingulo-opercular network to support initial lexical acquisition (Sliwinska et al., 2017), making it more difficult for novel language representations to become hippocampus-independent and stabilize in the language processing system (Gore et al., 2022).
It is worth noting that the left IFG also contributes to verbal short-term/working memory  and PWA with left inferior frontal lesions demonstrate both impaired verbal STM and new word learning (Peñaloza et al., 2015;. However, phonological STM and left IFG lesion volume were not significantly correlated in the present sample, suggesting that they possibly make independent contributions to word learning ability in aphasia. These findings concur with previous research by Martin et al. (2021) showing that phonological working memory (i.e., digit matching span task) is predominantly supported by the supramarginal gyrus, with additional involvement of other cortical and subcortical regions including the inferior frontal junction, whereas semantic working memory (i.e., category probe task) is supported by the left IFG and the angular gyrus. Our previous research has shown that although both phonological (i.e., nonword repetition task) and semantic STM (i.e., pointing and repetition span tasks) modulate novel word learning in aphasia, PWA with left frontal lesions and impaired learning show significantly worse semantic STM (but not worse phonological STM) relative to PWA with non-frontal lesions and better learning (Peñaloza et al., 2016).
As discussed here, the left IFG may have a role in different operations supporting novel word learning and different IFG sub-regions may make different contributions to these operations. Future neuroimaging studies using methods with high spatial and temporal resolution to capture word learning as it unfolds over time may help specify in more detail the contributions of the left IFG and other brain regions to novel word learning in aphasia. Altogether, our findings from language, cognitive and lesion factors that modulate learning ability support the proposal that damage to language regions may place input and output processing constraints for novel word learning, while deficits in verbal STM and damage to key brain regions for learning may make it difficult to consolidate newly acquired word representations in aphasia (Peñaloza et al., 2022).

Individual variation in novel word learning ability
Although the learning performance of PWA and the healthy controls was differentiated at the group level, most PWA performed within the range observed in the healthy controls across learning cycles, with a dispersion pattern that narrowed towards the end of training. At this time point, the individual learning outcomes of most PWA were comparable to the average performance of the healthy controls. It is worth considering that the PWA in our sample showed mostly mild to moderate aphasia severity and that larger differences relative to the healthy controls might be expected for more severe cases. Additionally, the individual variation in novel word learning ability in healthy older individuals may reflect both individual differences in phonological STM, as shown here and in prior research, and the substantial heterogeneity observed in memory performance as a function of advanced age (Nyberg et al., 2012). Research in healthy aging has shown increasing individual variation in cognitive performance on psychometric and non-psychometric tests with increasing age (Morse, 1993;Nelson & Daneffer, 1992). This individual variation is greater in memory measures (including tests of word recall and recognition) than in other cognitive abilities and persists after excluding low-scoring individuals and accounting for MMSE scores (Christensen et al., 1994). The inter-individual variability observed here has important implications for individual diagnostics. It is possible to conclude that PWA present with impaired learning ability after their brain insult when their performance is significantly below the average performance of a representative group of healthy controls. This approach also indicates the minimum proportion of the healthy population that may present with low performance, as identified by the case-control comparison methods for single case research employed here (Crawford & Garthwaite, 2002).
In turn, for PWA whose performance falls at the lower end of the healthy control range and does not differ significantly from that of the control group, it is not possible to reliably determine whether their performance is preserved or has declined relative to premorbid levels. Importantly, the task reported here is able to capture the individual variability of the (overall lower) word learning ability of PWA after stroke.
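The logic of such a single-case comparison can be illustrated with a minimal sketch. It assumes the modified t-test usually associated with this family of methods (Crawford & Howell's test, on which Crawford & Garthwaite, 2002, build) and is not the implementation used in the present study. The function names (`t_cdf`, `case_vs_controls`) and the control scores are hypothetical and chosen only for illustration. The one-tailed probability returned by the function doubles as a point estimate of the proportion of the healthy population expected to score below the case.

```python
import math
from statistics import mean, stdev


def t_cdf(t, df, steps=2000):
    """Student's t cumulative distribution, computed with Simpson's rule.

    Standard library only; adequate for the small control samples
    typical of single-case research (steps must be even).
    """
    if t == 0:
        return 0.5
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t > 0 else 0.5 - area


def case_vs_controls(case_score, control_scores):
    """Compare one case with a small control sample (modified t-test).

    Returns (t, p_lower): the t statistic on n - 1 degrees of freedom and
    the one-tailed probability of a control scoring below the case.
    """
    n = len(control_scores)
    m = mean(control_scores)
    s = stdev(control_scores)  # sample SD (n - 1 denominator)
    t = (case_score - m) / (s * math.sqrt((n + 1) / n))
    return t, t_cdf(t, n - 1)


# Hypothetical recall scores for five controls and one person with aphasia.
controls = [10, 12, 14, 16, 18]
t_stat, p_lower = case_vs_controls(6, controls)
print(f"t = {t_stat:.3f}, estimated proportion of controls below case = {p_lower:.3f}")
```

A case whose `p_lower` falls under the chosen alpha level would be classified as impaired relative to controls, whereas a case at the lower end of the control range yields a larger `p_lower` and cannot be distinguished from normal variation.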

Limitations and suggestions for future research
The present study had some limitations. Our examination of predictors of word learning ability was exploratory given the limited number of participants in our sample and the amount of missing data across predictors. Our small sample size may have reduced the statistical power needed to detect relevant factors that influence word learning as measured by expressive recall in aphasia, and the interpretation of our results regarding left IFG lesion volume, which was based on an even smaller sample, requires caution. Nevertheless, our findings cohere with previous research and may help future studies to narrow down hypotheses regarding potential factors that modulate learning ability on different assessment measures in aphasia. Also, as most participants had mild to moderate aphasia, future research should involve larger samples with diverse impairment profiles to ensure generalizability. Finally, while our study focused on left IFG lesion volume based on previous research (Peñaloza et al., 2015), examining the role of other brain regions is strongly encouraged. In this regard, the superior frontal gyrus/dorsal anterior cingulate cortex region within the cingulo-opercular system could be a relevant region of interest given its recruitment during novel word learning in healthy adults, the enhanced learning performance resulting from its stimulation (Sliwinska et al., 2017) and its potential role in aphasia recovery. Also, examining white matter tracts found to contribute to word learning in healthy adults (López-Barroso et al., 2013;Ripollés et al., 2017), verbal STM (Olivé et al., 2023) and word learning in PWA (Coran et al., 2020;Tuomiranta et al., 2014a) could provide complementary knowledge on the neural correlates of lexical acquisition in aphasia.

Conclusion
The present study shows that novel word learning as indexed by expressive recall and recognition tests can remain largely comparable to that of healthy controls in some PWA, although their learning may not be as fast as that of healthy individuals. It also demonstrates that flexible scoring criteria and phonological cueing can be helpful methods to better estimate learning on expressive recall tests by PWA, whereas requiring spontaneous, fully correct oral responses as the learning criterion may not properly reflect their learning potential. Moreover, our findings provide valuable insights into factors that influence word learning ability in aphasia and signal a path forward for the study of predictors of this ability in future research. This study demonstrates that our brief word learning task can capture individual variability in lexical acquisition capacity in PWA. Therefore, its capacity to predict language therapy outcomes in PWA should be examined in future research. If its potential prognostic value is demonstrated, its validation with PWA could be an important step towards its use as an informative assessment tool that is feasible to administer in the clinic.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The datasets and materials generated during the current study are available in the OSF repository at https://osf.io/879gf/