Functional near‐infrared spectroscopy measures of neural activity in children with and without developmental language disorder during a working memory task

Abstract

Introduction
Children with developmental language disorder (DLD) exhibit cognitive deficits that interfere with their ability to learn language. Little is known about the functional neuroanatomical differences between children developing typically (TD) and children with DLD.

Methods
Using functional near‐infrared spectroscopy, we recorded oxygenated hemoglobin (O2hb) concentration values associated with neural activity in children with and without DLD during an auditory N‐back task that included 0‐back, 1‐back, and 2‐back conditions. Analyses focused on the left dorsolateral prefrontal cortex (DLPFC) and left inferior parietal lobule (IPL). Multilevel models were constructed with accuracy, response time, and O2hb as outcome measures, with 0‐back outcomes as fixed effects to control for sustained attention.

Results
Children with DLD were significantly less accurate than their TD peers at both the 1‐back and 2‐back tasks, and they demonstrated slower response times during 2‐back. In addition, children in the TD group demonstrated significantly greater sensitivity to increased task difficulty, showing increased O2hb to the IPL during 1‐back and to the DLPFC during the 2‐back, whereas the DLD group did not. A secondary analysis revealed that higher O2hb in the DLPFC predicted better task accuracy across groups.

Conclusion
When task difficulty increased, children with DLD failed to recruit the DLPFC for monitoring information and the IPL for processing information. Reduced memory capacity and reduced engagement likely contribute to the language learning difficulties of children with DLD.


INTRODUCTION
Children with developmental language disorder (DLD) have impairments in language comprehension and production that are not attributable to conditions such as hearing loss or autism or to extenuating circumstances, such as lack of exposure to language (Bishop, 2017). DLD affects one in 15 children (Norbury et al., 2016) and is a life-long condition in which language deficits contribute to negative social, academic, and/or vocational consequences into adulthood (Arkkila et al., 2008; Clegg et al., 2005; Conti-Ramsden et al., 2018). While the specific cause of DLD is unknown, the condition has been linked to a variety of phenomena, including genetic influences and deficits in cognitive functions (Conti-Ramsden & Durkin, 2015). The influence of genetic factors is supported by high heritability rates in which families with a history of DLD have an estimated incidence of approximately 20%-40% compared to 4% in the general population (Choudhury & Benasich, 2003). DLD has been linked to specific single-nucleotide polymorphisms in genes such as FOXP2 (Fisher, 2005) and ATP2C2 (Martinelli et al., 2021). Commonly reported deficits in cognitive functions that are associated with DLD are reduced verbal working memory (WM) capacity (Montgomery et al., 2019) and procedural memory deficits (Ullman et al., 2020). In the present study, we examine WM performance in concert with cortical hemoglobin concentration measures to investigate how neural activity may differ between children with and without DLD.
The domain-general account of language learning posits that cognitive mechanisms such as memory and attention play an integral role in learning and maintaining language (Chow et al., 2021; Montgomery et al., 2021). Consistent with this account, many behavioral studies have shown that children with DLD have limited WM capacity (Archibald & Gathercole, 2006; Gillam et al., 1995; Gray et al., 2019), reduced sustained attention (Finneran et al., 2009; Spaulding et al., 2008), and slower processing speeds than their typically developing (TD) peers (Leonard et al., 2007; Miller et al., 2001). These children may show deficiencies in one or many cognitive processes (e.g., WM, attention, processing speed, and any combination of the three) that can negatively affect their ability to acquire and use language (Bishop, 2006; Gillam et al., 2021). Children with DLD who demonstrate diminished WM capacity have difficulty with sentence comprehension, understanding grammatical rules, and learning new words (Montgomery, 2003; Montgomery et al., 2010). Controlled attention allows a person to filter out irrelevant information and focus on relevant information. Individuals with limited attentional control are often overloaded with irrelevant information (Awh & Vogel, 2008; Cowan et al., 2005; deBettencourt et al., 2019; Magimairaj & Montgomery, 2013). Slowed processing primarily interferes with language comprehension. In order to understand sentences, listeners need to switch between retrieving lexical knowledge and world knowledge while building a complete representation of what is being said. Incoming information that is not processed quickly enough is vulnerable to decay or interference, resulting in incomplete representations of what was said (Leonard et al., 2007). While there is wide agreement that cognitive skills are impaired in a DLD population, there is less consensus on the specific details of how these impairments affect language.
The limited capacity theory is a specific orientation within the broader domain-general account of language. According to this theory, children with DLD are affected by limitations in processing, storing, and retrieving information. These processing limitations negatively affect nonverbal intellectual abilities (Gallinat & Spaulding, 2014), leaving children especially vulnerable in language learning contexts that pose higher cognitive and linguistic demands (Weismer et al., 2005; Im-Bolter et al., 2006). This finding is congruent with the work of Gillam et al. (2021), Leonard et al. (2013), Montgomery et al. (2018), and Robertson (2009), who have identified links between poor storage of previously processed linguistic material and sentence comprehension in a DLD population.
Cowan's embedded-processes model of WM (Cowan, 1999, 2014, 2017) is consistent with the basic tenets of the limited capacity theory of DLD. According to the embedded-processes model, WM is the ability to access and hold a limited amount of information in the focus of attention for use in cognition and language (Cowan et al., 2021). WM requires focused attention to activate a portion of long-term memory.
The focus of attention is limited in capacity and, as a result, affects the amount of possible information that is readily accessible to the learner at any given time (Cowan, 2014). Cognitive schemas found in long-term memory contain organized information about experiences, thoughts, and behaviors and facilitate recoding of information into chunks that are easier to store and retrieve (Ericsson & Kintsch, 1995).
Children with DLD have highly variable WM profiles, meaning some children will evidence deficits in one or two aspects of WM (e.g., phonological loop, visual-spatial sketchpad, episodic buffer, central executive, memory updating, inhibition), while others may present deficiencies in multiple aspects of WM (Adams et al., 2018; Archibald, 2017; Gray et al., 2019). Although much has been learned about WM in children with DLD, identifying WM deficits and how they vary remains an important issue.
Measuring patterns of neural activity in children with DLD during a WM task can shed light on the cognitive abilities and associated neural processes that may differ in children with DLD and TD children. Lee and colleagues conducted two recent functional magnetic resonance imaging (fMRI) studies on a cohort of young adults with DLD and a cohort of adolescents with DLD using a cross-sectional design. Both studies found that age-related changes in white matter structures in the corticostriatal, dorsal, and ventral pathways, as quantified by fractional anisotropy, gradually emerged across time for participants with typical language abilities but not for participants in the DLD group.
These biological changes were associated with multiple factors, including environmental, cognitive, and linguistic differences. Therefore, it was suggested that to understand the scope of DLD fully, researchers should employ multisystem methods that simultaneously investigate neural, cognitive, and linguistic abilities (Lee, Dick, et al., 2020; Lee, Nopoulos, et al., 2020).

N-back tasks
N-back is a useful paradigm for assessing WM because it measures the ability to store, manipulate, and update items in memory while inhibiting irrelevant information (Kane et al., 2007; Rottschy et al., 2012). In N-back tasks, participants must recall a stimulus that was seen and/or heard "N" trials before a recall cue. A key finding of N-back studies is the reliable decline in task performance as N increases (Braver et al., 1997; Ewing & Fairclough, 2010; Gajewski et al., 2018). In some N-back experiments, researchers include a 0-back condition, in which participants are asked to indicate when they hear a prespecified target within the presentation stream. Zero-back requires sustained attention but has minimal storage or retrieval components (Miller et al., 2009). Therefore, in data analysis, 0-back can be used as a covariate to control for sustained attention statistically, allowing the analysis to focus on the storage, recall, inhibition, and updating elements of WM.
Evans and colleagues (2011) used electroencephalography to assess speed of processing, P3 amplitude, and P3 latency in 10 adolescents with DLD and 10 age-matched controls as they performed an N-back task. There were no significant group differences in accuracy during the 1-back task, but the participants in the DLD group presented significantly lower P3 amplitude relative to matched controls in both 1-back and 2-back tasks. Lower P3 amplitude in the DLD group was interpreted as indicating a deficit in encoding and the need for greater cognitive resources to maintain and inhibit information.

Neural regions that contribute to WM and language
Several neuroimaging studies using N-back tasks have identified active core areas in the prefrontal and parietal regions that are widely recognized as critical areas for executive functioning and WM in children and adults (Eriksson et al., 2015; Fishburn et al., 2014; Miró-Padilla et al., 2020; Owen et al., 2005; Rottschy et al., 2012; Yaple & Arsalidou, 2018). Parietal areas, specifically the inferior parietal lobule (IPL), have been linked to encoding, phonological storage, and manipulation during WM tasks. Furthermore, studies have found that the left dorsolateral prefrontal cortex (DLPFC) plays an important role in WM performance by filtering and monitoring information in both visual and auditory modalities (Barbey et al., 2013; Cole et al., 2012; Crottaz-Herbette et al., 2004; Lamichhane et al., 2020; Rodriguez-Jimenez et al., 2009; Rottschy et al., 2012).
Functional imaging studies conducted on individuals with DLD have reported abnormalities in the prefrontal cortex (Gallagher & Watkin, 1997; Gauger et al., 1997) and decreased activation in left frontal and temporal language areas and subcortical areas such as the caudate and putamen while individuals performed a variety of tasks including verbal WM, task switching, covert auditory response naming, and phonological processing (see reviews: Badcock et al., 2012; de Guibert et al., 2011; Mayes et al., 2015; Pigdon et al., 2020; Weismer et al., 2005). In the study by Weismer and colleagues (2005), behavioral results showed that children with DLD had worse accuracy and displayed longer response times for high encoding items compared to the control group. The fMRI results revealed significant hypoactivation during encoding in the left parietal region, inferior frontal gyrus, and the precentral sulcus in the DLD group. Weismer and colleagues noted the important role of the parietal region, inferior frontal gyrus, and the precentral sulcus in attentional control, language processing, and retention. Thus, hypoactivation in these regions in DLD samples provides evidence for atypical brain activation during sentence encoding and recognition. Despite the inconsistencies in the literature, there is general agreement that key language areas (frontal, temporal, and parietal cortex) are affected in individuals with DLD and overlap with brain areas active in executive function tasks such as the N-back (Rottschy et al., 2012; Yaple & Arsalidou, 2018). Taken together, the left frontal and parietal areas seem important to probe in children with DLD given the role both brain regions play in executive functioning and language.

Functional near-infrared spectroscopy
Functional near-infrared spectroscopy (fNIRS) uses near-infrared wavelengths to measure cerebral metabolic changes that serve as a proxy for neuronal activity. As neuronal activity increases in an area of the cortex, there is a concomitant increase of oxygenated hemoglobin (O2hb) and a decrease in deoxygenated hemoglobin that provides a blood oxygen level-dependent (BOLD) signal similar to fMRI (Boas et al., 2003; Fantini et al., 2018; Huppert et al., 2006; Lecrux et al., 2019). Several studies have demonstrated the utility of using fNIRS for language research in adults and children due to its relatively low cost, noninvasive nature, portability, and tolerance to motion (see reviews: Butler et al., 2020; Pinti et al., 2020; Yücel et al., 2017). fNIRS has the advantage over other imaging techniques such as fMRI in that it is quiet and less restrictive and provides an environment that is more comfortable and conducive to capturing real-world behavior (with associated neural responses) in children and in clinical populations that might find methods such as fMRI distressing (Pinti et al., 2020; Soltanlou et al., 2018). fNIRS also has a higher tolerance to movement than either fMRI or electroencephalography (Aslin & Mehler, 2005; Quiñones-Camacho et al., 2019). Finally, Butler and colleagues (2020) point out that fMRI is particularly problematic for studying individuals who have difficulty in processing language, such as children with DLD, because it requires them to listen in a noisy environment.
Two fNIRS studies have investigated neural activity within a DLD population: one with children (Fu et al., 2016) and one with adults. fNIRS has also been used with other clinical populations, including individuals with ADHD (Gu et al., 2017) and patients with cochlear implants (Sherafati et al., 2022). In addition, fNIRS has been used to examine WM in adults (Baker et al., 2018; Berglund-Barraza et al., 2019; Fishburn et al., 2014; Meidenbauer et al., 2021). fNIRS appears to be a viable tool for investigating neural activity in children with DLD as they perform an auditory WM task.
The present study was designed to explore the general cognitive mechanisms in WM that are often found to be impaired in children with

Participants
Twenty-eight school-aged children between the ages of 9 and 14 years participated in the study. All the participants met the following criteria: (1) right-handed; (2) [...]
Recruitment of children in both groups occurred simultaneously. Parents were informed of the study by word of mouth or by flyers that were distributed throughout the community. Children and parents were familiarized with research procedures before obtaining informed consent. Children came to the University for three separate sessions. Each session included a test battery as well as neuroimaging and eye-tracking.

Measures
The test battery was designed to assess language, reading, and memory. Linguistic skills were measured using the comprehension and [...]
For the fNIRS task, we administered an auditory N-back task that consisted of three blocks: 0-back, 1-back, and 2-back. Children heard a stream of letters that contained "P," "Q," "R," and "S." In the 0-back task, for each auditory item, children were asked to press the key that indicated the letter that was heard. In the 1-back task, children were asked to press the key that indicated the letter presented one item before the current item. Finally, in the 2-back task, children were asked to press the key whenever the item they heard was the same as the one that occurred two items before the current item. We created a continuous N-back task because we wanted our hemoglobin concentration measure to reflect the extent of neural activity across the entirety of each block. To do that, we needed participants to respond to each stimulus, not just those stimuli that were followed by a tone.
Each block contained 31 trials that were 3 s apart. Every block was preceded by a 20-s fixation period in which children were instructed to "look at the cross until it goes away." The blocks were pseudorandomized and lasted a total of 120 s. The experiment was programmed in E-Prime 2.0, and children's responses were recorded on a button box whose buttons were marked with the letters "P," "Q," "R," and "S" (see Figure 1).
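The continuous design described above, in which every trial requires a response, can be illustrated with a short scoring sketch. The Python snippet below is only an illustration and not the E-Prime implementation used in the study. It assumes that the correct response on trial i is the letter presented N items earlier (the current letter for 0-back) and that the first N trials of a block, which have no valid target, are left unscored. The function name `score_nback` and the example stream are hypothetical.

```python
from typing import List, Optional


def score_nback(stimuli: List[str], responses: List[Optional[str]], n: int) -> float:
    """Return the proportion of correct responses for a continuous N-back block.

    Assumption (not stated in the source): the expected response on trial i
    is stimuli[i - n], and the first n trials are not scored.
    """
    if len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must have the same length")
    if n < 0:
        raise ValueError("n must be non-negative")
    scored = 0
    correct = 0
    for i in range(n, len(stimuli)):
        expected = stimuli[i - n]
        scored += 1
        # A missing response (None) never equals a letter, so it counts as incorrect.
        if responses[i] == expected:
            correct += 1
    if scored == 0:
        raise ValueError("no scorable trials in this block")
    return correct / scored


# Hypothetical six-item stream drawn from the four letters used in the task.
stream = ["P", "Q", "Q", "S", "R", "P"]
one_back_responses = [None, "P", "Q", "Q", "R", "R"]
print(score_nback(stream, one_back_responses, n=1))
```

In this made-up example, four of the five scorable 1-back trials match the letter presented one item earlier. The same function scores a 0-back block when `n=0`.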

Procedure
Participants underwent a training period in which a researcher read the instructions and provided the opportunity to practice each task before starting the actual experiment. Before the start of each block, children received a reminder of the instructions for the task. Once children indicated that they understood the tasks and responded consistently, the researchers placed the fNIRS cap on their heads. fNIRS data were acquired at 697.8 and 827.9 nm wavelengths using the Hitachi ETG-4000 system (Hitachi Medical Group, Tokyo) at 10 Hz.

fNIRS data preprocessing
The NIRS Brain AnalyzIR toolbox (Santosa et al., 2018) was used for data preprocessing and channel registration. Channels were visually inspected prior to analysis to examine overall data quality. Channels with excessive noise were removed if their coefficient of variation exceeded 7.5% (Hocke et al., 2018). Next, raw light intensity values were converted to optical density values, which were then corrected for motion using the temporal derivative distribution repair procedure. The final values were then converted to oxygenated and deoxygenated hemoglobin concentration values using the modified Beer-Lambert law (Jacques, 2013). Oxygenated values were used as dependent variables in the analyses because they have been shown to yield a more intense response to changes in physiological blood flow than deoxygenated concentration values (Strangman et al., 2002; Tachtsidis & Scholkmann, 2016). Because of the unique statistical properties of fNIRS data, such as serially correlated errors and heavy-tailed noise distributions, an autoregressive, iteratively reweighted least-squares model (prewhitening) was used, which is more statistically robust than standard motion correction techniques (Barker et al., 2013; Huppert, 2016). Beta values were estimated by solving a generalized linear model (GLM) for every channel for each subject and task. Boxplots representing the beta values are shown in Figure 2b.
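The first two preprocessing steps (channel pruning by coefficient of variation and conversion of raw intensity to optical density) can be sketched as follows. This is a minimal standard-library illustration, not the NIRS Brain AnalyzIR toolbox code. It assumes that the coefficient of variation is computed as 100 × SD / mean of the raw intensity and that optical density is taken relative to each channel's mean intensity, which is a common convention but is not specified in the text. All function names and the example data are hypothetical.

```python
import math
import statistics
from typing import Dict, List

CV_THRESHOLD_PERCENT = 7.5  # pruning threshold reported in the text


def coefficient_of_variation(intensity: List[float]) -> float:
    """CV in percent: 100 * sample SD / mean of the raw light intensity."""
    mean = statistics.fmean(intensity)
    return 100.0 * statistics.stdev(intensity) / mean


def prune_channels(raw: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Keep only channels whose CV does not exceed the threshold."""
    return {
        name: series
        for name, series in raw.items()
        if coefficient_of_variation(series) <= CV_THRESHOLD_PERCENT
    }


def to_optical_density(intensity: List[float]) -> List[float]:
    """Convert raw intensity to optical density relative to the channel mean (assumed baseline)."""
    baseline = statistics.fmean(intensity)
    return [-math.log(value / baseline) for value in intensity]


# Hypothetical raw intensities for two channels.
raw = {
    "ch1": [1.00, 1.02, 0.98, 1.01, 0.99],  # stable signal, CV below threshold
    "ch2": [1.00, 1.60, 0.50, 1.40, 0.60],  # noisy signal, CV above threshold
}
kept = prune_channels(raw)
od = {name: to_optical_density(series) for name, series in kept.items()}
print(sorted(kept))
```

Motion correction and the modified Beer-Lambert conversion are deliberately omitted because they depend on wavelength-specific extinction coefficients and toolbox settings that the text does not report.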
The known onsets and durations of each trial throughout each scan were used to define each regressor. All regressors were then convolved with the canonical hemodynamic response function (HRF). The average HRF for group, ROI, and N-back task is plotted in Figure 2c. The shaded areas represent the standard error from the function created by Martínez-Cagigal (2022).
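The construction of a task regressor from trial onsets and durations can be sketched in plain Python. The example below is an illustration under stated assumptions rather than the toolbox's implementation: it uses a widely used double-gamma parameterization of the canonical HRF (gamma shapes 6 and 16 with an undershoot ratio of 1/6), the 10 Hz sampling rate and 3-s trial spacing reported above, and a hypothetical 1-s event duration. All function names are hypothetical.

```python
import math
from typing import List, Sequence

FS_HZ = 10.0  # sampling rate reported in the text


def canonical_hrf(duration_s: float = 32.0, fs: float = FS_HZ) -> List[float]:
    """Double-gamma HRF sampled at fs (assumed shapes 6 and 16, undershoot ratio 1/6)."""
    hrf = []
    for k in range(int(duration_s * fs)):
        t = k / fs
        peak = t ** 5 * math.exp(-t) / math.gamma(6)
        undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
        hrf.append(peak - undershoot / 6.0)
    return hrf


def boxcar(onsets_s: Sequence[float], durations_s: Sequence[float],
           total_s: float, fs: float = FS_HZ) -> List[float]:
    """Return a 0/1 time series that is 1 while an event is on."""
    n = int(round(total_s * fs))
    series = [0.0] * n
    for onset, duration in zip(onsets_s, durations_s):
        start = int(round(onset * fs))
        stop = min(n, int(round((onset + duration) * fs)))
        for i in range(start, stop):
            series[i] = 1.0
    return series


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j >= len(out):
                break
            out[i + j] += s * k
    return out


# 31 trials spaced 3 s apart (from the text); the 1-s duration is hypothetical.
onsets = [3.0 * i for i in range(31)]
durations = [1.0] * 31
regressor = convolve(boxcar(onsets, durations, total_s=120.0), canonical_hrf())
print(len(regressor))
```

The resulting series would serve as one column of the GLM design matrix, and the beta estimated for that column is the quantity carried into the group-level models.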

Data analysis plan
We were primarily interested in determining whether children with [...] parameters package (Lüdecke et al., 2020), and performance packages (Lüdecke et al., 2021).

Descriptive data
Descriptive measures of the language and cognition data for the two groups and Cohen's d effect sizes are presented in Table 1.
[...] (Table 2). The best-fitting model determined by the likelihood ratio test was Model 4 (χ²(2) = 19.67, p < .001; standardized coefficient = 0.79), which contained a two-way interaction between group and task. This model indicated that children in the DLD group had significantly lower accuracy than the TD group during 1-back and 2-back tasks even when we equated the groups on sustained attention (see Figure 3).
[...] (Table 3). The best-fitting model, Model 4, as determined by the likelihood ratio test (χ²(1) = 54.08, p < .001; standardized coefficient = 0.33), included a significant two-way interaction between group and task. This model indicated that children in the TD group responded significantly faster than children in the DLD group during the 2-back task, while children in both groups responded with similar speed for the 1-back task (see Figure 4).

FIGURE 3
N-back accuracy in the 1-back and 2-back conditions. Model 4 fit for two-way interaction between group and task. DLD, developmental language disorder; TD, typically developing. Accuracy = proportion correct. Error bars represent model-based standard errors and are, therefore, of equal width. Zero-back is not included because it was incorporated into the model as a fixed effect.

TABLE 3
Progression of mixed-effects models for response time. Abbreviations: DLD, developmental language disorder; RT, response time; TD, typically developing. ***p < .001; **p < .01; *p < .05.

fNIRS data
Linear mixed models were used to assess the effects of group (DLD, [...] Finally, Model 5 included a three-way interaction between task, group, and ROI, with 0-back O2hb beta values as a fixed effect (Table 4).
The best-fitting model, Model 5, determined by the likelihood ratio test, contained a three-way interaction between task, group, and ROI

FIGURE 4
Response time for correct responses. Model 4 fit for two-way interaction between group and task. Response time in milliseconds (ms). DLD, developmental language disorder; TD, typically developing. Error bars represent model-based standard errors and are, therefore, of equal width.

FIGURE 5
Best fitting model for O2hb. Model 5 fit for three-way interaction between task, group, and ROI. On the y-axis, zero represents a predicted value of zero for a specific combination of task, group, and region when they are at the mean level of 0-back (across all random effects). O2hb, oxygenated hemoglobin; DLD, developmental language disorder; TD, typically developing; ROI, region of interest; IPL, inferior parietal lobule; DLPFC, dorsolateral prefrontal cortex. Error bars represent model-based standard errors and are, therefore, of equal width.

Secondary analyses
We asked to what extent performance on the N-back task depended on the O2hb concentration values in the left DLPFC or IPL.
We performed two secondary analyses: one based on trial-by-trial accuracy data and the other based on trial-by-trial response time data. Critical to these analyses, a difference score was computed by subtracting IPL beta values from the left DLPFC beta values. We hypothesized that greater activation in the left DLPFC (relative to IPL) would be associated with more efficient filtering and inhibition of irrelevant information, and thus improved accuracy. This hypothesis was tested using linear mixed-effects models, with the difference score as the main fixed effect, accounting for trial number, group, task, and 0-back accuracy or response time, with random intercepts by individual participant. This specification allowed the investigation of trial-by-trial estimates of the relationship between difference scores and accuracy or response time. As with previous analyses, the accuracy models used a binomial distribution and logit link, and response time models used a Gaussian distribution and an identity link. The best-fitting model (Model 2) for accuracy revealed a significant effect for ROI (χ²(1) = 8.64, p = .003; standardized coefficient = 1.18) (Table 5). This model indicated that when children had higher O2hb concentration values in the DLPFC relative to IPL, overall accuracy was increased regardless of group or task (see Figure 6). For response time, no significant effects were found as the models were not significantly different from the null model (χ²(1) = 0.854, p = .355).
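The likelihood ratio tests reported in this section compare nested models by referring twice the difference in log-likelihoods to a chi-square distribution. The sketch below shows how the p values follow from the reported statistics. It relies on the closed-form upper-tail probabilities of the chi-square distribution for 1 and 2 degrees of freedom, so it only covers those two cases. The function names are hypothetical and the snippet is not the authors' analysis code.

```python
import math


def likelihood_ratio_statistic(loglik_reduced: float, loglik_full: float) -> float:
    """Twice the log-likelihood gain of the fuller model over the nested one."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_upper_tail(statistic: float, df: int) -> float:
    """Upper-tail chi-square probability using closed forms for df = 1 or 2."""
    if statistic < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# Statistics reported in the secondary analyses (both with 1 degree of freedom).
p_accuracy = chi2_upper_tail(8.64, df=1)
p_response_time = chi2_upper_tail(0.854, df=1)
print(round(p_accuracy, 3), round(p_response_time, 3))
```

Applied to the two statistics above, the computation should agree, to rounding, with the reported p = .003 for the accuracy model and p = .355 for the response time model.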

DISCUSSION
We measured neural activity during a continuous N-back task in children with DLD and TD children while controlling for sustained attention (0-back). As expected, we found that accuracy decreased and response time increased from the 0-back to 1-back to 2-back conditions for children in both groups, demonstrating that the difficulty of our continuous N-back task increased as a function of N. There were significant group differences in accuracy and response time. Children in the DLD group exhibited overall lower accuracy scores than children in the TD group and longer response times for correct responses during the 2-back task. fNIRS data for the DLD group revealed no [...] Our results suggest a relationship between DLD and difficulties in engaging neural activity required to successfully perform the filtering, storing, and updating memory functions that underlie N-back success.
There are two findings suggesting that the 2-back task was unusually difficult for children with DLD. First, they were significantly less accurate than the children in the TD group (DLD M = 0.31 vs. TD M = 0.58). Second, they were significantly slower in responding even when they were correct (DLD M = 959.98 ms vs. TD M = 701.76 ms). It has been proposed that a minimum threshold of activation patterns in the frontal cortex is needed to support accuracy during executive function tasks (Aghajani et al., 2017; Mandrick et al., 2013, 2016; Mayes et al., 2015; Meidenbauer et al., 2021). This interpretation would be in line with Meidenbauer and colleagues (2021) and others (Aghajani et al., 2017; Mandrick et al., 2013, 2016) who reported a nonlinear effect of difficulty on activity in the frontal cortex. Specifically, frontal activity in the TD group tended to increase as a function of difficulty as long as the task did not consistently exceed the range in which the participant could still be successful at least 60% of the time. It is important to note that this interpretation is limited because we did not collect self-report data or provide feedback to the participants, which would better corroborate this hypothesis.

FIGURE 6
Predicted probabilities of accuracy. This model shows the trial-by-trial estimates of the relationship between difference scores and accuracy. The difference score was computed by subtracting inferior parietal lobule (IPL) beta values from the left dorsolateral prefrontal cortex (DLPFC) beta values.

For the children in the DLD group, it is possible that the lack of differences in neural activity in frontal and parietal regions during 1-back and 2-back tasks could relate to the heterogeneous linguistic and cognitive profile of the DLD population. Recall that children with DLD often present deficits in one or more areas of language (e.g., semantics, syntax, and morphology) together with deficits in one or more cognitive functions (e.g., WM, attention, speed of processing).
Such co-occurring deficits can impact the way children learn, produce, and comprehend language (Archibald, 2017; Badcock et al., 2012; Brown et al., 2014; Gray et al., 2019; Pigdon et al., 2020; Tomas & Vissers, 2019). As language develops, children with linguistic and cognitive deficits may employ compensatory strategies that could influence neural activity patterns in some or multiple brain regions (Badcock et al., 2012; Lee, Dick, et al., 2020; Lee, Nopoulos, et al., 2020). For example, Sherafati and colleagues (2022) found that adults with cochlear implants had reduced activation in the auditory cortex and increased activation in the left prefrontal cortex during a speech perception task compared to adults with normal hearing. Their findings suggest that adults with cochlear implants had learned to recruit cognitive processes via the left prefrontal cortex as a type of compensatory mechanism to help with deficits in the auditory cortex.
For children in the TD group, IPL O2hb concentration values were greater than those in the left DLPFC during the 1-back task. Recall that children in the TD group also had higher accuracy scores and faster response times in the 1-back compared to 2-back task. It is likely that 1-back task performance in our study represents a "Goldilocks effect" (Kidd et al., 2012; Wilson et al., 2019), in which the task was at just the right level of difficulty to keep the children in the TD group engaged.
[...] (Cole et al., 2012; Lamichhane et al., 2020). Activity in the prefrontal cortex has been found to be positively correlated with interindividual differences in visual WM capacity, suggesting a contribution to filtering out irrelevant information (McNab & Klingberg, 2008). In addition, the prefrontal cortex has been shown to play a role in boosting parietal memory capacity (Edin et al., 2009).
Interestingly, children in the DLD group had similar response times to the TD group during the 1-back task. However, group differences emerged as processing demands increased during the 2-back. The maintenance of task accuracy appeared to come at a cost of slower response time for children in the DLD group during the 2-back. This reflects a trade-off between processing time and response time. Similar findings were reported in Evans et al. (2011), who also found that children with DLD did not show slower response times until the auditory 2-back task. Furthermore, Weismer et al. (2005) reported that adolescents with DLD exhibited slower correct response times, but only during a higher complexity task.
Poor inhibitory control has been reported for children with DLD as a function of processing limitations (Larson et al., 2020; Marton et al., 2007; Poll & Miller, 2021). According to the inefficient inhibition hypothesis, children with DLD struggle to inhibit competing stimuli, resulting in higher demands on WM to process both relevant and irrelevant information (Bjorklund & Harnishfeger, 1990; Marton et al., 2007). Uncertainty has also been shown to increase demands on WM (Coutinho et al., 2015). Our behavioral data indicated slower response times together with poorer accuracy in children in the DLD group during the more difficult 2-back task. It is likely that their WM capacity limits were met, resulting in random, slow, inaccurate responses, thus reflecting the uncertainty and difficulty in processing and selecting their response. This idea is corroborated by the accuracy data showing that children in the TD group were more accurate during the 2-back task than the DLD group was during the 1-back task.
Our analyses employed multilevel modeling techniques, which yield more reliable estimates than classical analyses with multiple comparison adjustments (Gelman et al., 2012). Cautious interpretation of these results is warranted until further studies can replicate and extend the findings. A second limitation of this study is that our N-back task used a continuous stream of letters. Perhaps neural differences between groups would be more apparent in other types of complex WM tasks, such as a listening span task, that involve word or sentence recall after a secondary task. It may be valuable in future work to administer a task that can tease apart attentional differences and WM capacity within a DLD population. Finally, our study relies on a number of statistical models and tests, most of which were planned a priori. The secondary analyses of the O2hb concentration values in left DLPFC or IPL were not preplanned and should be considered exploratory.

CONCLUSION
To our knowledge, this is the first study that used fNIRS to compare neural activity between children with and without DLD as they performed an auditory N-back task. Our behavioral findings are generally consistent with the capacity limitation hypothesis of DLD in which higher cognitive demands resulted in lower accuracy and slower response times for children in the DLD group as compared to an age-matched TD control group. Additionally, neural activity patterns for the DLD group were consistent with decreased cognitive effort in the face of increased task difficulty. Specifically, as opposed to the children in the TD group, the children in the DLD group presented no task-related modulation in O2hb concentration values in the IPL and the left DLPFC as a function of task difficulty. The left DLPFC was a significant predictor of accuracy across groups, supporting the role of the prefrontal cortex in filtering and monitoring incoming information as a memory task becomes more difficult. It appeared that children with DLD failed to recruit the left DLPFC for monitoring and filtering incoming information and the IPL for encoding, processing, and storing relevant stimuli when task difficulty increased with the 2-back task.
Although these results cannot conclusively establish whether the failure was due to reduced WM capacity or reduced engagement, it seems likely that both factors were at play. Regardless, this work demonstrates how neuroimaging can reveal what may be important idiosyncrasies in a clinical population that would otherwise be missed from behavioral data alone.

ACKNOWLEDGMENTS
The authors wish to thank the research assistants who helped collect and process the data as well as the children who participated in this study and their parents.