Verbal Problem‐Solving Difficulties in Autism Spectrum Disorders and Atypical Language Development

Children with autism spectrum disorders (ASDs) adopt less efficient strategies than typically developing (TD) peers on the Twenty Questions Task (TQT), a measure of verbal problem‐solving skills. Although problems with the TQT are typically associated with executive dysfunction, they have also been reported in children who are deaf, suggesting a role for atypical language development. To test the contribution of language history to ASD problem solving, TQT performance was compared in children with high‐functioning autism (HFA), children with Asperger syndrome (AS) and TD children. The HFA group used significantly less efficient strategies than both AS and TD children. No group differences were evident on tests of question understanding, planning or verbal fluency. Potential explanations for differences in verbal problem‐solving skill are discussed with reference to the development of inner speech and use of visual strategies in ASD. Autism Res 2014, 7: 720–730. © 2014 The Authors. Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research

Young people with autism spectrum disorders (ASDs) are often reported to have difficulty with spontaneously generating plans and strategies to solve new problems [Channon, Charman, Heap, Crawford, & Rios, 2001; Mackinlay, Charman, & Karmiloff-Smith, 2006; Minshew, Meyer, & Goldstein, 2002]. Compared with tasks with a fixed set of responses, children with ASD can struggle with more "open-ended" cognitive tasks where a range of strategies could be deployed to achieve a particular goal [White, Burgess, & Hill, 2009]. Knowing more about why this occurs is important in both the lab and the real world, as it has implications for adaptive skills and independent living [Kenworthy, Yerys, Anthony, & Wallace, 2008].

Problem Solving in People with ASD
A simple example of this is seen on the Twenty Questions Task (TQT), a verbal problem-solving¹ test based on the traditional guessing game [Mosher & Hornsby, 1966]. In the TQT, the experimenter selects a target from a picture array of everyday objects, and the participant asks a series of questions to establish its identity. Typically, the questions will narrow down possibilities via a categorical hierarchy, such as "Is it living?", "Is it an animal?" and so on. Compared with age- and intelligence quotient (IQ)-matched typically developing (TD) peers, high-functioning children and adults with ASD take more guesses on the game and ask fewer category-based questions [Minshew, Siegel, Goldstein, & Weldy, 1994]. Moreover, the grouping questions used by ASD participants are often too specific: for example, they may ask "Is it something you eat soup with?" when it may be more effective to first ask "Is it something you eat with?" or "Is it cutlery?" [Alderson-Day & McGonigle-Chalmers, 2011]. Because many ASD individuals are able to identify basic categories when they are prompted to on other tasks [Tager-Flusberg, 1985; Ungerer & Sigman, 1987], it has been suggested that this reflects a specific problem with "concept formation," namely a difficulty in organizing a set of items into a new grouping heuristic when this needs to be done spontaneously [Minshew et al., 2002]. But the TQT, and problem solving more generally, also involves a range of other, complex demands that could be affecting ASD performance.
First, efficient problem solving relies on executive functions (EFs); that is, the set of skills required to retain and manipulate information "on-line" during goal-directed tasks, such as planning, flexibility, selective attention, inhibition and working memory [Hill, 2004]. Two studies by Alderson-Day and colleagues studied the effects of these factors on TQT performance [Alderson-Day, 2011; Alderson-Day & McGonigle-Chalmers, 2011]. The typical TQT includes an array of pictures that do not change throughout the task, meaning that participants have to remember their questions "on-line" as they play [Mosher & Hornsby, 1966]. Alderson-Day and McGonigle-Chalmers [2011] tested what effect this has using a version of the TQT based on a Guess Who? board, where participants could knock down items as they searched. Compared with controls, a sample of high-functioning children with ASD had to ask more questions on average to reach the target when they were unable to physically eliminate items.
¹ "Problem solving" is a term that has been applied to a wide range of tasks that can sometimes vary considerably [cf. Rumsey, 1985; Soulieres et al., 2009]. Broadly, it is used to refer to tasks or puzzles where the solution is not made apparent in the task materials. More specifically, problem-solving tasks often require (a) the generation of a strategy to achieve success and (b) working through a series of moves or steps towards a solution [Newell & Simon, 1972].
When items cannot be removed, participants not only have to remember questions, but they also have to selectively attend to relevant information in the visual array. To parse out these demands, a second study by Alderson-Day [2011] provided participants with a written reminder of their questions when knocking down items was prohibited. This eliminated the need for additional questions in the ASD group, even though the visual demands of the task had not changed, implying a problem with memory for questions rather than attention. In addition, the participants in Alderson-Day [2011] appeared to have difficulty with the planning demands of the TQT. Compared with controls, ASD participants could recognize good questions to ask in isolation but struggled to plan a series of questions in advance that would be likely to narrow down options. Thus, while the TQT may require some element of concept formation, problems with working memory and planning also appear to affect ASD problem solving in this case.

Effects of Language on Problem-Solving: The Comparison With Deafness
Another important factor to consider is the role of language skills, which is prompted by similarities in problem solving between ASD and deafness. In a study with deaf schoolchildren, Marschark and Everhart [1999] observed more guessing and less use of category questions in deaf participants compared with hearing participants, with similar problems being evident in a follow-up sample of deaf graduate students. Executive difficulties are sometimes evident in deaf children, usually presenting as problems with self-regulation and impulsivity [see Hauser, Lukomski, & Hillman, 2008, for a review]. But rather than explain their data in terms of EF skills, Marschark and Everhart proposed that they are likely to reflect the atypical language development that many deaf children experience. Deafness per se is not associated with delays or deficits: if deaf children have early access to language, usually by having deaf parents or relatives who can sign, they tend to develop very good language and cognitive skills [Mayberry, 2002]. However, over 90% of deaf children have hearing parents [Mitchell & Karchmer, 2004], meaning that many will not encounter skilled users of signing until school age, and some may only be encouraged to use spoken language rather than sign. Accordingly, there can be a range of delays in language skills for deaf children [e.g. Blamey, 2003; Moeller, Tomblin, Yoshinaga-Itano, Connor, & Jerger, 2007], and it has been suggested that this has consequences for language-related cognitive skills, particularly those more dependent on knowledge of spoken English [Marschark, 2006]. For instance, there is evidence of subtle differences in verbal reasoning, categorization and free recall in deaf adults when compared to hearing controls [Farjardo, Arfé, Benedetti, & Altoé, 2008; Koh, Vernon, & Bailey, 1971; Marschark, Convertino, McEvoy, & Masteller, 2004; McEvoy, Marschark, & Nelson, 1999; Ormel et al., 2010; Yi et al., 2011].
Given the presence of early communication difficulties in ASD [Boucher, 2012], it could be that similar factors affect verbal problem solving in autism. One way to test this is to compare TQT performance in young people with high-functioning autism (HFA) and Asperger syndrome (AS). In contrast to HFA, AS has typically been associated with the presence of intact structural language skills in the first 3 years of life [American Psychiatric Association, 1994; World Health Organization, 1993]. In most other respects, however, HFA and AS are considered to be alike [as indicated by the removal of AS as a separate diagnosis in DSM-5; American Psychiatric Association, 2013]. While some early studies reported greater EF skills and stronger verbal than nonverbal skills in AS compared with HFA [e.g. Szatmari, Archer, Fisman, Streiner, & Wilson, 1995], studies that have controlled for IQ generally find very few cognitive differences at all between the two groups, including similar performance on many EF tasks [Manjiviona & Prior, 1999; Mayes & Calhoun, 2004; Ozonoff, South, & Miller, 2000]. No studies, however, have compared verbal problem-solving skills of this kind between autism and AS.
If early language skills affect verbal problem solving in ASD, then children with AS should show intact verbal problem-solving skills compared with children with autism. The main aim of the present study was to test this by comparing children with HFA, AS and typical development in their TQT performance. The first hypothesis was that HFA but not AS participants would show impaired performance on the task compared with TD children.

Explaining Differences in Problem-Solving Performance
The second aim of the study was to explain why such a difference might exist by ruling out confounds and identifying potential markers of early language skills. Poor problem-solving performance could simply result from problems with question understanding, planning ahead and coming up with new questions on the spot, none of which are necessarily indicative of early language skills [AS participants, for instance, in some cases show an advantage over HFA participants on tests of word fluency; Spek, Schatorje, Scholte, & van Berckelaer-Onnes, 2009]. To rule out such differences, three tasks were deployed: a question discrimination (QD) task and a plan construction (PC) task from Alderson-Day [2011], and a verbal fluency measure. Following prior evidence of generally similar executive and language skills in HFA and AS, we hypothesized that there would be no difference between the two ASD groups on these measures.
For early language skills to have an effect on later problem solving, they would plausibly need to shape how different strategies are internally considered and selected. For instance, early language delays could disrupt the development of inner speech, interfering with self-regulation and verbal deliberation [Diaz & Berk, 1992]. Alternatively, delays in language could lead to visually mediated cognitive strategies taking precedence over verbally mediated ones [Soulieres et al., 2009]. Arguably the most plausible route, though, is via semantic memory. Delays to early communication could disrupt the learning of new semantic groupings and the development of typical associations between exemplars and categories [Horton & Markman, 1980; Marschark et al., 2004]. To test this, a novel semantic decision task (SDT) was included in the testing battery. It was hypothesized that HFA but not AS participants would show atypical semantic decision skills and that this would be associated with group differences in problem solving.
Finally, a questionnaire measure of language milestones was deployed as an exploratory tool to assess possible links between language history and task performance. If semantic skills were not observed to explain problem-solving performance, then language milestones could still indicate the presence of an unspecified effect of language delay.

Participants
Fifteen children with AS (14 m; ages 9-16) and 15 children with HFA (14 m; ages 9-18) were recruited from the local area via parent groups and a local autism charity. Participants possessed a diagnosis of either autism or AS in accordance with ICD-10 research diagnostic criteria [World Health Organization, 1993]. All ASD participants were originally diagnosed via contact with local clinical services, where diagnoses are made based on agreement by a multidisciplinary panel and use of the Autism Diagnostic Observation Schedule [Lord et al., 2000] and Autism Diagnostic Interview-Revised [ADI-R: Lord, Rutter, & Couteur, 1994]. Five participants had also had their diagnosis confirmed within the past 3 years by a trained researcher using the ADI-R. Exclusion criteria included the presence of any other neurological conditions, specific language impairments (SLIs) or reading difficulties.² Fifteen TD children (10 m; ages 9-18) were recruited from a participant database to provide a neurotypical comparison group. All recruitment and study procedures were approved by the University of Edinburgh research ethics committee.
Cognitive abilities were estimated using the vocabulary, similarities and matrix reasoning subtests of the Wechsler Abbreviated Scale of Intelligence [WASI: Wechsler, 1999], providing scores for full-scale IQ (vocabulary and matrix reasoning) and verbal IQ (vocabulary and similarities). Pairwise t-tests indicated that the three groups did not significantly differ in IQ, although trends were observed for mean differences in VIQ (P = 0.089) and, to a lesser extent, FSIQ (P = 0.098) between HFA and TD participants specifically. While HFA and TD participants were age matched, the HFA group was significantly older than the group of AS participants (HFA > AS, t(28) = 2.157, P = 0.040)³ (Table 1).
² One HFA participant had also previously received a diagnosis of attention deficit hyperactivity disorder (ADHD). Because of the high comorbidity of ASD and ADHD [Leyfer et al., 2006], this participant was not excluded, but the data were marked for later analysis in case of potential outliers in performance. However, all of the participant's data fell well within range for their group.
³ Parents were also asked to complete a version of the Autism Quotient [AQ-Adolescent; Baron-Cohen, Hoekstra, Knickmeyer, & Wheelwright, 2006] about their child as a further means of matching the groups. Questionnaires were available for all but one HFA participant. Both HFA and AS participants scored higher than TD participants (P < 0.05). No difference was observed between the ASD groups (P = 0.596).

Materials and Procedure
The TQT. The first task attempted was the TQT. The task was presented on a board containing pictures of 24 everyday items, displayed in hinged frames (allowing for participants to eliminate items after each question). Participants completed three trials of Twenty Questions: the first two trials allowed item elimination during search by knocking down pictures that were no longer needed. On the last trial, elimination was prohibited, increasing the memory demands of the task. Alongside the game board, a 15" laptop was used to provide a "random selector" animation and audiovisual feedback during the game [for a full explanation of the TQT procedure, see Alderson-Day, 2011].
The primary outcome for the TQT was question quality (QQ), defined as the minimum proportion of items eliminated per question. For example, in a set of 10 items including five animals, "Is it an animal?" would eliminate at least half of the items irrespective of the answer, providing a score of 0.5. A direct guess ("Is it the dog?") would only be guaranteed to eliminate one item out of 10, scoring 0.1. For comparison with previous studies, the number of questions used per trial and percentages of grouping questions and guesses were also recorded.
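The QQ definition can be illustrated with a short sketch. This is not the scoring code used in the study; the function name and its arguments are hypothetical, and it assumes that every question has a yes/no answer and that the number of remaining items matching the question is known.

```python
def question_quality(n_items: int, n_matching: int) -> float:
    """Minimum proportion of remaining items eliminated by a yes/no question.

    n_items:    number of items still in play (hypothetical argument name)
    n_matching: number of those items for which the answer would be "yes"
    """
    if n_items <= 0 or not 0 <= n_matching <= n_items:
        raise ValueError("need n_items > 0 and 0 <= n_matching <= n_items")
    # A "yes" answer eliminates the non-matching items;
    # a "no" answer eliminates the matching items.
    eliminated_if_yes = n_items - n_matching
    eliminated_if_no = n_matching
    # QQ credits only what is guaranteed, whichever answer is given.
    return min(eliminated_if_yes, eliminated_if_no) / n_items


# The two examples from the text: 10 items, of which five are animals.
print(question_quality(10, 5))  # "Is it an animal?"
print(question_quality(10, 1))  # direct guess, "Is it the dog?"
```

Taking the minimum over the two possible answers is what makes a direct guess score poorly: it may occasionally end the game, but it is only guaranteed to remove a single item.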

QD and PC.
Following the TQT, participants attempted the QD and PC tasks from Alderson-Day [2011]. For QD, participants were presented with 10 hypothetical scenarios from Twenty Questions and asked to select which of two questions would be the best to ask first in each scenario. Five 12-item scenarios and five 24-item scenarios were presented using a stimulus book. The task was scored for the number of correct answers out of 10.
For PC, participants were presented with an array of 32 possible questions and asked to select five questions that would be useful to use "if we were to play the game again in a moment." Once five questions were selected, participants were asked to order them in terms of which question they would ask first, second and so on. Responses were scored based on the mean QQ for the five questions selected, assuming a 24-item TQT set. For example, a sequence asking about living things, animals and pets would be guaranteed to eliminate 12, 6 and 3 items on average from the set, and would be allocated scores of 0.5, 0.25 and 0.125. Greater scores indicate greater efficiency of plans.
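The PC scoring example can be sketched in the same way. The function below is an illustrative assumption rather than the original scoring procedure: it takes the number of items each selected question is guaranteed to eliminate, divides by the full set size of 24, and averages the results. Only the three questions named in the example are used here, whereas actual plans contained five questions.

```python
def plan_score(eliminated_counts, set_size=24):
    """Mean QQ of a plan, scored against a fixed set size.

    eliminated_counts: guaranteed eliminations for each selected question
    set_size:          assumed size of the TQT set (24 in the text)
    """
    if not eliminated_counts:
        raise ValueError("a plan needs at least one question")
    scores = [count / set_size for count in eliminated_counts]
    return sum(scores) / len(scores)


# Living things, animals, pets: 12, 6 and 3 items, i.e. 0.5, 0.25 and 0.125.
print(plan_score([12, 6, 3]))
```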
Verbal fluency. To assess verbal fluency abilities, the letter and semantic fluency subtests from the Addenbrooke's Cognitive Examination-Revised [ACE-R; Mioshi et al., 2006] were administered. Raw scores for letter fluency (words beginning with "P") and semantic fluency (animals) were used.

SDT.
The SDT was based on semantic association measures used by Gaffrey et al. [2007] and Marschark et al. [2004], and presented on a laptop using E-Prime [Schneider, Eschman, & Zuccolotto, 2002]. Participants viewed a target word (e.g. ANIMAL) and were then asked to judge whether a series of cue words was associated with the target (e.g. DOG, HAMMER, HORSE). In the category condition, the target word was a superordinate category term (such as ANIMAL or TOOL), and the cue words were all basic exemplars, only some of which belonged to the target category. In the exemplar condition, a basic exemplar was the target (e.g. DOG), and the cue words were all superordinate category terms (e.g. ANIMAL, PET, FRUIT). Participants completed three blocks of 10 trials in each condition. Each trial consisted of a target word (2-sec presentation), a 500-msec interval and a cue word, which would remain on screen until the participant responded. Responses were followed by a feedback page (showing "Correct!" or "Incorrect"). Based on prior evidence of intact category identification in ASD [Minshew et al., 2002], the reaction times for accurate responses (indicating semantic association) were used as the primary outcome of the task. In addition, accuracy scores were collected for each condition.⁴
Language questionnaire. Parents were asked to indicate (a) age of first word, (b) age of first phrase of two or more words and (c) language ratings at age 3, 5, 7 and current age in relation to other children of the same age. Ratings were made on a Likert scale ranging from 1 ("Much worse than other children of the same age") to 5 ("Much better than other children"). Items (a) and (b) were chosen based on their standard use in the ADI-R [Lord et al., 1994]. Language ratings beyond age 3 were included to reflect the possibility of later language abilities also having important predictive value [see, e.g. Bennett et al., 2008].

Analysis
Unless otherwise stated, analysis of covariance (ANCOVA) was used to compare the three groups on the main task outcomes. Covariate analysis, using age and VIQ as covariates, was used to account for potential influences of age and general ability. VIQ but not full-scale IQ was included as a covariate because of (a) strong collinearity between scores for both and (b) the greater relevance of VIQ to verbal problem solving. Where dependent variables were nonnormal, nonparametric tests were used (specifically, Kruskal-Wallis tests with Mann-Whitney post-hoc tests when assessing group differences and Spearman's Rho for correlational analysis).
⁴ Participants also initially completed a practice round of identifying four-, six- and eight-letter words without a semantic decision component, but that is not reported here.
ANCOVA was first applied to performance on the TQT to test the hypothesis that HFA but not AS participants would be less efficient than TD participants in their problem solving. Second, ANCOVAs and Kruskal-Wallis tests were used to assess group differences in QD, planning, fluency and semantic decision. To test their effect on problem solving, they were then also included as covariates in a reanalysis of TQT performance. Finally, correlation and hierarchical regression analyses were used to test for potential predictors of problem-solving performance across all three groups combined.
P-values were not corrected across different tasks because the tasks were deemed to be testing separate questions (namely, do the groups differ in problem solving, is that because of clear confounds in other relevant skills, and is it because of a difference in semantic abilities?). Within each task, post hoc comparisons were made using P-values Bonferroni-corrected for the number of pairwise tests between groups.
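As a minimal sketch of the within-task correction, the snippet below applies a Bonferroni adjustment to the three pairwise contrasts that arise from three groups. The P-values are hypothetical placeholders chosen for illustration, not results from the study, and the function name is an assumption.

```python
from itertools import combinations


def bonferroni_adjust(p_values):
    """Multiply each P-value by the number of tests, capping at 1.0."""
    n_tests = len(p_values)
    return {pair: min(1.0, p * n_tests) for pair, p in p_values.items()}


groups = ["HFA", "AS", "TD"]
pairs = list(combinations(groups, 2))  # three pairwise contrasts

# Hypothetical uncorrected P-values, for illustration only (not study data).
raw = dict(zip(pairs, [0.010, 0.200, 0.500]))
adjusted = bonferroni_adjust(raw)

for pair, p in adjusted.items():
    print(pair, round(p, 3))
```

With three groups, each uncorrected P-value is multiplied by three, so a contrast needs an uncorrected P below roughly 0.0167 to remain significant at the 0.05 level.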
For the secondary outcomes of the TQT, similar group differences were evident for the number of questions on each trial (group main effect: F(2, 40) = 4.056, P = 0.025, ηp² = 0.169), although only the HFA vs. TD contrast was significant (P = 0.032). Use of grouping was high in all groups (60-65%), and on average guesses were used twice as much by ASD participants, but Kruskal-Wallis ANOVAs (used because of skew in the rates of grouping and guessing) indicated no significant group differences (all P > 0.400). A mixed ANCOVA was also used to check for any changes in efficiency across the three task trials. Despite the switch from allowing (trials 1 and 2) to prohibiting elimination (trial 3), no significant trial effects or interactions were evident for QQ (all P > 0.05, all ηp² < 0.1), suggesting that overall group differences on these variables were consistent across trials.
[...] difference on semantic fluency score (χ²(2) = 6.33, N = 45, P = 0.042) between the groups. In general, performance was best in TD participants and worst in AS participants (see Table 2), but no pairwise differences survived correction for multiple comparisons. To test for potential effects of fluency performance on problem solving, letter and semantic fluency scores were then added separately as covariates to ANCOVAs of TQT QQ. Neither significantly contributed to TQT performance, and all original main effects remained the same (all P > 0.600, all ηp² < 0.02).

SDT.
A 3 × 2 (group × condition) mixed ANCOVA was used to compare reaction times in each group on the SDT. Significant contributions of age (F(1, 40) = 10.774, P = 0.002, ηp² = 0.212) and VIQ (F(1, 40) = 5.388, P = 0.025, ηp² = 0.119) were observed, but no significant effect of group. Nominally, mean reaction times were slower for exemplar-to-category associations than the reverse (see Table 2), but no significant difference was observed between the two conditions (P = 0.154, ηp² = 0.050) nor any group × condition interactions. Accuracy scores for the same task were nonnormally distributed. Kruskal-Wallis tests indicated no significant differences in accuracy on the exemplar condition (χ²(2) = 4.295, N = 45, P = 0.117), but a significant contrast for the category condition (χ²(2) = 8.462, N = 45, P = 0.012). Mann-Whitney U-tests indicated that AS participants were less accurate than TD participants (U = 49.50, N = 30, P = 0.042) in their identification of exemplars when provided with a superordinate category (e.g. "Does it go with TOOL?"). No other pairwise comparisons reached significance (all P > 0.05).
When SDT outcomes were included as covariates in the TQT analysis, no significant covariate effects were observed (all P > 0.300, all ηp² < 0.03), suggesting that they could not explain group differences in problem-solving efficiency.
Early language ratings. Language milestones and parent ratings are displayed in Table 3. Spearman's correlations were used to assess the validity of language ratings for ages 3 and up, showing moderate correlations with full-scale (r = 0.26-0.29) and verbal IQ (r = 0.19-0.30). A hierarchical regression analysis was used to explore potential predictors of problem-solving performance, using mean QQ as the dependent variable. Block 1 included age and gender (as control variables), block 2 added ages of first word and first phrase, and block 3 added language ratings for 3, 5, 7 and current age. The only individual predictor to reach significance in any model was age of first phrase (standardized β = −0.532, P = 0.029), and while block 2 showed a significant R² change over block 1 (ΔR² = 0.145, F(2, 44) = 3.492, P = 0.043), none of the resulting models significantly predicted mean QQ (all P > 0.110).

Discussion
The main finding of the study was that HFA participants, but not AS participants, adopted less efficient strategies than TD children during verbal problem solving. As was hypothesized, HFA participants asked questions that eliminated fewer items each time, whereas AS participants performed at a similar level to TD children. This suggests that atypical language development may be important to explaining inefficiencies in the task performance of ASD participants and that prior evidence of problems on the TQT in ASD samples [Alderson-Day & McGonigle-Chalmers, 2011; Minshew et al., 1994, 2002] may only apply to those with experience of language delay. There was also tentative evidence to suggest that age of first phrase acquisition was related to problem-solving performance, although in general early language milestones and ratings from parents did not significantly predict success on the TQT.
Alongside this, AS and HFA participants displayed a very similar profile on a range of other measures. No differences between ASD participants were observed in question understanding, planning and verbal fluency, in support of the hypothesis that such skills would not explain group differences in problem solving. This is consistent with prior reports of comparable EF and fluency skills in autism and AS [Manjiviona & Prior, 1999; Verté, Geurts, Roeyers, Oosterlaan, & Sergeant, 2006; cf. Spek et al., 2009]. It might have been expected that AS participants would generally be more fluent than HFA participants and thus better able to generate questions on the task, but the direction of results indicated the opposite. Furthermore, performance on the fluency task was unrelated to problem-solving efficiency on the TQT.
These results add to the prior findings of Alderson-Day [2011] and Alderson-Day and McGonigle-Chalmers [2011] by suggesting that verbal problem solving might be a specific problem for HFA children, rather than ASD as a whole. Moreover, while those studies identified specific executive demands posed by the TQT, the present study suggests that language background may be more important to understanding why children with ASD struggle to use the most effective questions.
The final hypothesis, that differences on the TQT would map on to underlying differences in semantic skill, was not supported: performance on the SDT was unrelated to success on the TQT. Contrary to predictions, AS rather than HFA participants showed the most atypical performance on this task, scoring lowest for the identification of exemplars for specific superordinate categories. This is consistent with prior evidence of atypical semantic skills in AS compared with TD children [Kamio et al., 2007] but hard to explain in relation to HFA participants. Very few studies have directly compared categorization or other related lexico-semantic skills in AS and HFA, and those that have usually find HFA to be more atypical in profile than AS [e.g. Speirs, Yelland, Rinehart, & Tonge, 2011]. In any case, there is little evidence here to suggest that semantic skills provide the link between language history and later problem solving for children with HFA.
One process that could be implicated instead is inner speech (also known as silent speech or internal monologue). Inner speech is often argued to be developmental in origin and has been historically associated with problem solving and self-regulation [Vygotsky, 1987]. Problems with early communicative interaction would in theory impact upon inner speech and its developmental precursor, private speech [Fernyhough, 1996]. Intriguingly, use of private speech appears to be intact in children with ASD and can even enhance their performance on cognitive tasks relative to when they are silent [Winsler, Abar, Feder, Schunn, & Rubio, 2007]. However, a range of studies have indicated that inner speech is less likely to be utilized by people with ASD [Holland & Low, 2010; Wallace, Silvers, Martin, & Kenworthy, 2009; Whitehouse, Maybery, & Durkin, 2006], and this seems to be particularly the case for more complex planning and problem-solving tasks [Williams, Bowler, & Jarrold, 2012]. If the development and internalization of inner speech were more likely to be disrupted in HFA compared with AS, then this could have long-term consequences for activities like verbal problem solving.
Such an explanation is speculative, but it has specific implications that are testable. One prediction is that there would be differences in inner speech use within the autism spectrum according to language history, at its simplest varying as a function of language delay, or varying with the degree of early communicative impairment in some other way. Another implication is that we should expect similar problem-solving profiles in other children with a history of language difficulties, such as those with an SLI. There is initial evidence to suggest that children with SLI show intact use of inner speech but less internalized use of private speech during planning tasks, implying a delayed development of verbal strategy skills [Lidstone, Meins, & Fernyhough, 2012]. It may be that similar delays in the internalization of self-directed language skills affect ASD as well: a question for future research would be to examine how the relative proportions of private and inner speech use vary for ASD children in relation to their degree of language delay.
Another possibility, not mutually exclusive with the first, is that participants with HFA were more likely than AS or TD participants to adopt other, nonverbal strategies in their approach to the TQT. Anecdotally, there are many accounts of people with ASD preferring to "think in pictures" rather than speech [Grandin, 1995; Kunda & Goel, 2011]. Direct experimental comparisons are few, but there is some evidence to suggest HFA but not AS participants respond faster to visuospatial rather than verbal matrix reasoning puzzles [Sahyoun, Soulières, Belliveau, Mottron, & Mody, 2009]. If this were to explain differences in problem-solving skill, the implication would be that ASD individuals with language delay would be more likely to adopt visual strategies than those with more typical language development. As the TQT involves a visual array, visualizing potential groupings or basing questions on concrete and perceptual similarities represent possible ways of attempting the task, but also ones that may not identify the most abstract categories for questioning (such as organic vs. nonorganic entities). Dependence on visual or verbal strategies could be investigated by manipulating levels of perceptual similarity and abstractness in the test materials [for a preliminary example, see Alderson-Day & McGonigle-Chalmers, 2011].
It is of course possible that AS and HFA participants were differing in other ways on the task. Given its visual presentation, it could be that HFA participants were narrowly focusing on small groupings at the expense of more global categories, as would be typical of a "local-biased" processing style [Happé & Frith, 2006]. However, signs of local bias are generally evident across the autistic spectrum [e.g. Jolliffe & Baron-Cohen, 1997]. It is also not clear why, developmentally, the two groups would be more likely to differ in this regard, but not differ in other ways more closely related to language.
Before discussing the practical implications of these findings, some caveats must be acknowledged. First, the sample size tested here is small, and it was not possible to closely match the participant groups in age and IQ abilities. The analytic method used here to compensate for this (ANCOVA) adjusts for the effects of age and IQ, but it should not be interpreted as fully "controlling" for their influence [Miller & Chapman, 2001]. This is perhaps less of a concern regarding age, as HFA participants were significantly older than AS participants and yet still performed worse on the TQT. That being said, the relatively wide age range may have also obscured important differences in ability, given that executive skills and overall problem-solving competence can change considerably for ASD participants in adolescence [Van den Bergh, Scheeren, Begeer, Koot, & Geurts, 2014]. The inequivalence of the groups is more important regarding VIQ, as theoretically this could have driven group differences in performance despite the statistical correction of using ANCOVA. In mitigation, it is worth noting that group differences between HFA and TD participants have previously been observed in samples closely matched for IQ [Alderson-Day, 2011;Minshew et al., 1994] and that HFA participants in the present study performed comparably on almost every other task. Nevertheless, these findings need to be replicated in a larger, more closely matched sample before the potential contributions of age and IQ to group differences in problem solving can be clearly ruled out.
Second, the study did not include a standardized measure of language skills, such as the Clinical Evaluation of Language Fundamentals (CELF) test [Semel, Wiig, & Secord, 1995]. To allow for other experimental tasks to be used in the time allowed, it was not possible to deploy an in-depth language battery in this instance: a larger study with an existing database of ASD participants should be able to achieve this. However, while a standardized language measure was not deployed here, the tasks used covered a range of relevant skills, including lexical knowledge (WASI vocabulary), category knowledge (SDT) and word fluency (ACE-R letter and semantic fluency). Thus, a number of language-dependent skills were accounted for, even if a standardized battery was absent.
Finally, parents' retrospective reports of early language abilities, which may describe a period over 10 years ago, at best offer only a rough proxy for language skills at the time, and without additional data it is unknown how reliable those ratings truly are. The data provided by families generally fitted existing diagnoses, but only longitudinal data could fully demonstrate relationships between early language and later cognitive skills. Such data would also be important in assessing how problem-solving abilities may change with language skills over time for people with ASD.
Notwithstanding those limitations, the study has a range of potential implications for methods and practice. First, if the TQT and other measures of verbal problem-solving are used with ASD groups [as the TQT is in the Delis-Kaplan Executive Function System; Delis, Kaplan, & Kramer, 2001], then task performance needs to be considered in the context of current and past language skills. The TQT is not a simple measure of problem solving or concept formation: it is a complex task with considerable executive and linguistic demands. Other cognitive tasks where the most effective strategies are language dependent and the executive load is high (such as certain types of free recall or counterfactual reasoning) are also likely to create similar problems for HFA individuals.
Second, although the recent changes to diagnostic criteria have eliminated the diagnosis of Asperger disorder [American Psychiatric Association, 2013], these data act as a reminder that variation in language skills and development across the spectrum is important and can impact upon cognition in subtle ways for people with ASD, even if the large majority of cognitive outcomes appear similar. This is likely to be particularly important in educational contexts for understanding what kinds of strategies are going to be most useful for facilitating verbal problem-solving skills in ASD individuals. In social problem-solving training [Solomon, Goodlin-Jones, & Anders, 2004], for example, young people with HFA who have good structural language skills but a history of language delay may still need considerable support for use of new verbal strategies. Alternatively, they may be more likely to benefit from use of visual materials such as decision trees, Venn diagrams or other graphical techniques that can be used to support decision making [Davies, Stock, & Wehmeyer, 2003;Dexter & Hughes, 2011]. AS individuals, in contrast, may be better placed to handle the language demands of such training, while still struggling with the social-cognitive aspects of its core content.
Any problem-solving task presents a range of complex demands: verbal problem solving often requires generating linguistic strategies and applying them flexibly to a new situation. The results presented here suggest that even a simple, game-based example of problem solving could be affected by an individual's developmental background. A replication of this result, with more closely matched groups and a wider age range, would test this idea more comprehensively. Understanding how language development can selectively affect performance in a range of problem-solving contexts is crucial to developing better educational tools and better support for people with an ASD.