Assessment of verbal comprehension and non-verbal reasoning when standard response mode is challenging: A comparison of different response modes and an exploration of their clinical usefulness

Abstract Objective: Assessment of cognition is important for providing children with developmentally appropriate interventions. Often children with severe speech and motor disorders are not assessed, as standardized assessment is perceived as challenging. This study investigates how assessments of cognition can be conducted using alternative response modes. Method: The study has two parts. The response modes finger pointing (FP), gaze pointing (GP), and partner-assisted scanning (PAS) were compared when assessing verbal comprehension and visuospatial abilities in 27 typically developing (TD) children, aged 5:10–6:10 years. Clinical utility was investigated by assessing 69 children with cerebral palsy (CP), using FP or GP as response mode depending on motor functioning. Results: In TD first graders, using alternative response modes did not influence the test results (Wilks’s lambda = 0.871, F (2, 23) = 1.70, p = 0.204). PAS was more time consuming than FP and GP on some of the tasks. In the CP group, considerable individual variation was found both when using FP (standard scores 40–138) and GP (37–120). Conclusions: Type of response mode did not impact test results in typically developing children. Assessment of cognition is possible even when children have severe speech and movement impairments.


Introduction
Children with severe speech and movement impairments comprise a heterogeneous group. Their impairments may be congenital or acquired, and the underlying diagnoses include cerebral palsy (CP), neurodegenerative disorders, muscular dystrophy, and severe brain injuries (Van Tubbergen, Warschausky, Birnholz, & Baker, 2008). The extent of the challenges they face in areas such as motor, cognitive and social functioning, and mental health varies considerably (Andersen et al., 2008; Bjorgaas, Hysing, & Elgen, 2012). However, as a group, children with neurological illnesses and injuries are at significant risk of developing cognitive impairments and learning disabilities. The most common cause of severe speech and motor difficulties in children is CP, which is due to nonprogressive brain damage acquired during early brain development (Rosenbaum et al., 2007; Van Tubbergen et al., 2008).
Children with CP are at risk of both specific learning disabilities and intellectual disability (Andersen et al., 2008;Fazzi et al., 2012;Fennell & Dikel, 2000;Frampton, Yude, & Goodman, 1998;Sigurdardottir et al., 2008;Smits et al., 2011;Straub & Obrzut, 2009). However, there are few studies of cognitive functioning in children which include children with the most severe speech and motor impairments. In a registry-based study from Norway it was reported that only 29 percent of the CP population had received ordinary assessment (Andersen et al., 2008), and it is particularly children with more severe speech or motor impairments who are left out (Sherwell et al., 2014;Sigurdardottir et al., 2008). For example, in one study where cognition was assessed with the nonverbal reasoning test Raven's coloured progressive matrices (Raven, 2008), 97% of the children with lesser motor impairments, and 37% of those with more severe motor impairment were assessed (Smits et al., 2011).
One reason for the scarcity of studies of cognitive functioning in children with severe speech and motor impairments might be that assessment is perceived as a challenge. Most neuropsychological tests require verbal expressive abilities, good fine motor skills, and rapid response (Beukelman & Mirenda, 2005; Sabbadini, Bonanni, Carlesimo, & Caltagirone, 2001). However, some authors argue that it is possible to accommodate a test so that the influence of physical and/or sensory impairment on test results is minimized without changing what the instrument measures (Visser, 2014). When considering whether a proposed test accommodation is valid, the test administrator should carefully consider the purpose of the test, the skills to be evaluated and what inferences can be drawn from the test scores (Phillips, 1994). Examples of accommodations that can be applied are to use aided communication instead of speech, to alter test materials to make them more accessible and to alter response mode (Alant & Casey, 2005; Schiørbeck & Stadskleiv, 2008; Visser, Ruiter, Van der Meulen, Ruijssenaars, & Timmerman, 2013; Wasson, Arvidson, & Lloyd, 1997). These accommodations have been found not to influence performance on cognitive tests for healthy children, but to result in higher test scores for some children with disabilities (Visser et al., 2013).
One form of accommodation is alteration of response mode. Alteration of response mode can be achieved by deploying the methods that children use to access their communication aids.
Children with severe speech and motor impairments are in need of a form of alternative communication called aided communication. Aided communication is graphic symbols or letters presented on boards, in books or on speech generating devices (von Tetzchner & Martinsen, 2000).
The effect of altered response modes has been investigated on tests with a fixed number of answer options. The standard response form, pointing with a finger, has been compared to other means of pointing or to some form of scanning (Arvidson, 2000; Casey, Tonsing, & Alant, 2007; Miller, 1991; Spillane, Ross, & Vasa, 1996; Thurèn, 2010; von Tetzchner, 1987; Wagner, 1994; Warschausky et al., 2012). None of these studies found a significant difference between response modes (see Table 1). This implies that accommodating a test by changing response form provides an opportunity for children with disabilities to show their true cognitive abilities.
However, none of the studies has compared more than two response modes at a time, and it has not been examined to what extent alternative response modes affect the time it takes to complete test tasks. Few studies have examined the effect of altered response mode on tests with different layouts and from different cognitive domains (but see von Tetzchner, 1987; Warschausky et al., 2012). None of the studies has applied tests with altered response modes to the most severely speech and motor impaired children.

Research objectives
The main objectives of this study are (1) to compare two alternative response modes with the standard response, and (2) to explore the clinical usefulness of alternative response modes.
The first aim, to compare different response modes, was investigated by administering tests of non-verbal reasoning and verbal comprehension to typically developing (TD) children. Language and visuospatial reasoning are important areas to investigate in order to gain knowledge about a child's general cognitive abilities (Flanagan, 2013). In this study, verbal comprehension and visuospatial abilities were examined with tests that require only pointing with a finger to an answer alternative as standard response form. The tests all have a fixed number of answer options. This type of test was chosen so that the need for alterations of response mode could be kept at a minimum. It is necessary to compare the response modes in a group of TD children as only children without speech and motor impairment can use all types of response forms. If accommodations are to be considered valid, it is imperative to ensure that the scores obtained with and without accommodations are equivalent (Phillips, 1994). Based on results from earlier studies (Arvidson, 2000; Casey et al., 2007; Miller, 1991; Spillane et al., 1996; Thurèn, 2010; von Tetzchner, 1987; Warschausky et al., 2012) we therefore expect that response mode will not influence how healthy children score on standardized psychological instruments.
The different response modes were compared not only with regard to test results, but also with regard to time used to complete the tests. It is important to identify time efficient administration methods, as many children with severe physical disabilities suffer from fatigue or have attentional difficulties (Berrin et al., 2007;Dahlgren Sandberg, 2006). The different methods that are used to access communication aids, and especially partner-assisted scanning, are considered time consuming (Light & Drager, 2007;Schiørbeck & Stadskleiv, 2008). However, it is not known if the same applies when using the response modes for assessment, as this has not been investigated. We hypothesized that response mode would not influence test results, but that partner-assisted scanning would be more time consuming than other response forms.
The second aim, to explore the clinical usefulness of alternative response modes, was investigated by administering tests of non-verbal reasoning and verbal comprehension to children with CP. None of the prior studies of altered response mode have included the most severely speech and motor impaired children. It is therefore an aim to describe how clinically relevant test scores can be obtained using an alternative response mode (gaze pointing) when a child is too motor impaired to answer with the standard response mode (finger pointing). As the groups using finger pointing and gaze pointing are not matched, this part of the study is descriptive.

Method
The study has two parts, each with a cross-sectional design. The first part is the comparison of response modes in a group of TD children; henceforth termed "the response mode sample." The second part demonstrates how two different response modes can be used in assessing children with CP; henceforth referred to as "the clinical sample." The studies will be presented separately with regard to methods and results.

Participants
The participants were recruited from three primary schools in Oslo. After agreement with the management of the schools, the teachers distributed information letters, as well as consent and information forms, to the children in class and asked them to give these to their parents. Requirements for participants were normal vision and hearing (or corrected with glasses and hearing device), no known learning disability and Norwegian language skills good enough to follow ordinary classroom education. The teachers evaluated whether the children fulfilled the inclusion criteria and decided to which children the invitation letters should be extended. Information about vision and hearing was also provided by the children's parents.
A total of 138 invitations to participate in the project were distributed to children fulfilling the inclusion criteria, and parents of 31 children (22.5%) responded to the invitation. The study aimed for a convenience sample of 30 TD children. Four children were not assessed due to absence from school, so the final number of participants was 27 children (17 boys [63.0%]), with a mean age of 6:4 years (range 5:10-6:10). Nineteen children had Norwegian as their native language and eight children spoke another language in addition to Norwegian.

Instruments
To compare three different response modes, three different tests were needed within each domain in order to avoid learning effects. First grade students (6-7 year olds) were invited because, for this age group, there are three different tests of verbal comprehension and three different tests of non-verbal reasoning with Norwegian norms available. This made it possible to have comparable tests of language comprehension and visuospatial cognition for comparing the three different response modes.
Verbal comprehension was investigated with British Picture Vocabulary Scale, second edition, (BPVS-II;Dunn, Dunn, Whetton, & Burley, 1997;Lyster, Horn, & Rygvold, 2010), Test for Reception of Grammar, second edition (TROG-2; Bishop, 2009) and Receptive Vocabulary (RV) from the Wechsler Preschool and Primary Scale of Intelligence, third version (WPPSI-III; Wechsler, 2008). The three tests all have a similar layout; four pictures are presented on the screen and the child should show comprehension of one word or sentence by indicating one of the four pictures.
Visuospatial cognition was investigated with the subtests Matrix Reasoning (MR) from WPPSI-III and Wechsler Intelligence Scale for Children, fourth version (WISC-IV; Wechsler, 2009), as well as with Raven's Coloured Progressive Matrices. The layout of these three non-verbal tests is similar. A target figure, with a missing piece of information, is presented at the top of the page and the answer options for that missing information are presented as separate choices at the bottom of the page. There are five to six different answer alternatives for each task. The layout of the nonverbal tests implies that the answer options are of a smaller size than for the verbal tests (see Figure 1).
Norwegian versions of BPVS-II, TROG-2, WPPSI-III, and WISC-IV were used. Norms provided by the test publishers were used, and all tests have norms for the age groups 6-7 years of age. All tests are reported to have good psychometric properties. For BPVS-II, a Cronbach's alpha of 0.93 is reported (Dunn et al., 1997). The Norwegian version of TROG-2 has high internal consistency (Cronbach's alpha = 0.95) and BPVS-II and TROG are highly correlated (Bishop, 2009). Reliability coefficients for the Norwegian version of WPPSI-III are reported to be 0.89 for RV and 0.86 for MR (Wechsler, 2008), while it is 0.88 for MR from WISC-IV (Wechsler, 2009). WPPSI-III and WISC-IV are significantly correlated (0.89) (Wechsler, 2009). Reliability coefficient for Raven is 0.97 in the UK standardization sample (Raven, 2008).
In order to compare results from the tests, all results are presented as standard scores (mean of 100 and standard deviation of 15). Results lying within one standard deviation from the age mean are defined as being within the normal range.
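The conversion to a common metric can be illustrated with a minimal sketch. It assumes a normed score reported on some scale with a known mean and standard deviation; the function names and the example scale (mean 10, SD 3) are illustrative assumptions and are not taken from the test manuals used in the study.

```python
# Minimal sketch: re-express a normed score on the standard-score metric
# (mean 100, SD 15) and check whether it lies within the normal range,
# defined in the text as within one standard deviation of the age mean.

STANDARD_MEAN = 100.0
STANDARD_SD = 15.0


def to_standard_score(score: float, scale_mean: float, scale_sd: float) -> float:
    """Convert a score from a scale with the given mean/SD to a standard score."""
    if scale_sd <= 0:
        raise ValueError("scale_sd must be positive")
    z = (score - scale_mean) / scale_sd  # distance from the mean in SD units
    return STANDARD_MEAN + STANDARD_SD * z


def within_normal_range(standard_score: float) -> bool:
    """True if the standard score is within one SD of the mean (85 to 115)."""
    return abs(standard_score - STANDARD_MEAN) <= STANDARD_SD


if __name__ == "__main__":
    # Hypothetical score of 13 on a scale with mean 10 and SD 3 (one SD above the mean).
    converted = to_standard_score(13, scale_mean=10, scale_sd=3)
    print(converted, within_normal_range(converted))
```

Expressing every result on the same metric is what allows scores from six different instruments to be averaged within a response mode and compared across modes.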
The plan was to recruit children older than 6 years of age, but it turned out that of the participating children, two children were 5:10 and three children 5:11 at the time of assessment. As they were a bit younger than the available norms for the Matrix Reasoning subtest from the WISC-IV, the norms for the age group 6:0 years were used. The mean scores of the five children younger than 6:0 years were not significantly different from the 22 children older than 6:0 years of age (M(SD) = 52.0 (10.7) vs. 49.5 (6.5), t(25) = −0.667, p = 0.511). For the other tests, the norms for the age group 5:10-5:11 years were used.
The tests were administered on a computer with eye gaze equipment (a portable communication device of the type P10 from TobiiDynavox). This adaptation was done with the permission of the test company. Care was taken to make sure that the test tasks looked as close to the original test format as possible, and that the presentation was the same for all response modes. This implies that all answer alternatives for each task were always visible to the children, as they are in the standard test booklet, that the answer the child gave was recorded manually on the record forms, and that it was the test administrator who switched between the tasks of the test.

Response forms
The three response modes finger pointing (FP), pointing with eye gaze (GP) and partner-assisted scanning (PAS) were compared. FP is the standard response mode and means that a child points with a finger at the answer alternative on the computer screen that the child thinks is correct. GP enables the child to intentionally point to an answer alternative by just looking. Infrared light is projected from the computer into the child's eyes, and based on the reflection from the pupils it is computed where on the screen the child is looking. A frame then appears on the screen around the answer alternative that the child is looking at. The frame thus makes it easy to see where on the screen a child is looking, for both the child and the test administrator. The interaction time, i.e. the time given to look at an object before it is marked, can be individually set. For the purpose of this study, it was set to approximately one second. For PAS the test administrator systematically points out possible answer options on the screen and the child either confirms or denies each alternative in turn.
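The selection logic of GP and PAS described above can be sketched as follows. This is an illustration under stated assumptions, not the behaviour of the eye gaze equipment used in the study: the sample format (timestamp in seconds plus the label of the answer alternative under the gaze point, or `None`), the function names, and the exact one-second threshold are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# One gaze sample: (timestamp in seconds, answer alternative under the gaze or None).
GazeSample = Tuple[float, Optional[str]]


def dwell_select(samples: Iterable[GazeSample], dwell_s: float = 1.0) -> Optional[str]:
    """Return the first alternative looked at continuously for at least dwell_s seconds.

    Looking away, or moving to another alternative, restarts the timer.
    """
    current: Optional[str] = None
    started_at = 0.0
    for timestamp, option in samples:
        if option != current:
            # Gaze moved to a different alternative (or off all of them): restart timing.
            current = option
            started_at = timestamp
        elif current is not None and timestamp - started_at >= dwell_s:
            return current
    return None  # no alternative was fixated long enough


def partner_assisted_scan(options: Sequence[str],
                          child_confirms: Callable[[str], bool]) -> Optional[str]:
    """Point out each alternative in turn; return the first one the child confirms."""
    for option in options:
        if child_confirms(option):
            return option
    return None  # every alternative was denied


if __name__ == "__main__":
    gaze = [(0.0, "A"), (0.5, "A"), (1.0, "B"), (1.5, "B"), (2.0, "B")]
    print(dwell_select(gaze))                                           # B
    print(partner_assisted_scan(["A", "B", "C"], lambda o: o == "C"))   # C
```

The sketch also makes the time cost of PAS visible: in the worst case every alternative has to be presented and answered before a response is registered, whereas dwell selection needs only a single fixation on the chosen alternative.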
Each participant used one response form to answer two tests; one test of verbal comprehension and one test of non-verbal reasoning.
Responding by GP was a new experience for the children in the response mode sample. Before testing commenced, an introductory play activity lasting a couple of minutes was administered. The play task was to "find Winnie the Pooh" by looking at increasingly smaller pictures of him "hidden" among other Disney characters. In this activity the children experienced that the computer picked up where on the screen they were looking. All children in the response mode sample managed this task.

Procedure
The tests were administered individually to the children and all assessment of a child was performed in one session. At the start of the session, the children were given age appropriate information about the study, as well as instructions about the tests and the three response forms. The assessments lasted for about one and a half hours including a small break midway. The assessment process was filmed so that the administration process and time used to complete the tests could be analyzed afterward. The response mode sample was randomly divided into three groups. Each response form was used for two tests: first a verbal, then a visuospatial test. All three groups were given tests in different order, while order for response forms was the same (FP, GP, and then PAS) (see Table 2). Twenty-five children in the response mode sample completed all six tests and two children chose to complete only the four tests with FP and GP.

Statistics
Statistical package SPSS 22.0 was used for data analysis. To compare test scores and response time depending on response mode, one-way analysis of variance (ANOVA) and paired samples t-tests were performed. Preliminary analyses showed that the data were normally distributed and Levene's test indicated that the variance in scores and time used on tests was homogenous (p > 0.05). There were no extreme results lying more than two standard deviations from the mean. Post-hoc analysis with Tukey test was performed to correct for multiple comparisons.

Ethics
The response mode part of the study was approved by the Norwegian Social Science Data Services (NSD). The parents of the participating children were informed that the purpose of the study was to explore different response modes and not to assess the capabilities of the child. They were informed that they would be contacted and advised on referral practices if test results showed indications of

Participants
Participants in the clinical sample were recruited from the Pediatric habilitation unit at Oslo University Hospital, which offers follow-up to all children with CP in 12 of 15 districts of Oslo. All children born between 1995 and 2008 with a diagnosis of cerebral palsy (ICD-10 G80.0-80.9; World Health Organization, 2004) were invited. The study is part of an investigation of cognition in children with CP (Stadskleiv, Jahnsen, & von Tetzchner, 2016).
Seven of the 76 children are not included in this paper, as one child sadly died, consent for one child was withdrawn, one child had a co-diagnosis of Down syndrome, one child used multiple response modes and the hospital revoked the CP diagnosis for three of the children as incidents happening after 24 months of age had caused their motor impairments. According to Nordic traditions, the CP diagnosis is reserved for children with postnatal injuries happening before two years of age (Hagberg, 2000). The clinical sample therefore consists of 69 children with a diagnosis of CP.
Mean age in the clinical sample was 9:9 years (range 5:1-17:7), and 38 (54.3%) were girls. The majority of the children had a spastic form of CP (85.7%), and most had milder motor deficits (see Table 3).
It is common to use classification instruments to describe variability in function in children with CP. We classified gross and fine motor functioning according to the Gross Motor Function Classification System (GMFCS) (Palisano et al., 1997) and the Manual Ability Classification System (MACS) (Eliasson et al., 2006), speech according to the Viking Speech Scale (VSS) (Pennington et al., 2013) and communication according to the Communication Functioning Classification System (CFCS) (Hidecker et al., 2011). The GMFCS, MACS, and CFCS have five levels, and the VSS has four levels. All four classification instruments are organized from Level I (least impairment) to Level IV/V (most impairment).
Children with severe speech and motor impairments are defined as having GMFCS and MACS levels IV and V, indicating that they need wheelchairs for transportation and are severely restricted or not at all able to handle objects, and VSS level III and IV (very restricted and unclear or no understandable speech). Their communicative functioning will depend on their access to appropriate communication aids and can vary from CFCS level II (effective, but slow, sender and recipient with both familiar and unfamiliar partners) to V (seldom efficient sender and recipient, even with familiar partners).
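The definition above can be written as a simple rule. The sketch below is only an illustration: the class and function names are hypothetical, and levels are encoded as integers (I = 1, and so on) for convenience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalClassification:
    """Levels on the four classification instruments, coded as integers (I = 1)."""
    gmfcs: int  # gross motor function, levels 1-5
    macs: int   # manual ability, levels 1-5
    vss: int    # speech, levels 1-4
    cfcs: int   # communication, levels 1-5

    def __post_init__(self) -> None:
        # Reject levels outside the ranges given in the text.
        limits = {"gmfcs": 5, "macs": 5, "vss": 4, "cfcs": 5}
        for name, top in limits.items():
            level = getattr(self, name)
            if not 1 <= level <= top:
                raise ValueError(f"{name} level must be between 1 and {top}, got {level}")


def has_severe_speech_and_motor_impairment(c: FunctionalClassification) -> bool:
    """GMFCS and MACS at level IV or V, and VSS at level III or IV.

    CFCS is deliberately not part of the rule: communicative functioning
    depends on access to appropriate communication aids.
    """
    return c.gmfcs >= 4 and c.macs >= 4 and c.vss >= 3


if __name__ == "__main__":
    child = FunctionalClassification(gmfcs=5, macs=4, vss=4, cfcs=2)
    print(has_severe_speech_and_motor_impairment(child))  # True
```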

Instruments
The same six tests as in the response mode sample were administered. The children were administered tests that were age appropriate, and were therefore administered either the Matrix Reasoning from WPPSI-III or the Matrix Reasoning from WISC-IV. The RV from WPPSI-III was administered to the participants with severe impairments, even if they were older than 7 years of age.
All children cognitively capable of answering the tests were administered the BPVS-II, TROG-2, and Raven's matrices. This was determined by administering the BPVS-II or RV first, as they have norms from 2½-3 years of age. If the child was not able to answer any tasks on either of these two tests, the TROG-2, WPPSI-III Matrix Reasoning, or Raven were not attempted as they have norms from four years of age. Of the 69 children included, four children were not able to answer any tasks on BPVS-II or RV. They were instead assessed with a modified version of Bayley-III, and the results are not included in further analysis.

Response forms
The response modes FP and GP were used. All children using GP had prior knowledge about this response mode as they operated their communication devices in this manner.

Procedure
The children with CP needed two appointments to complete the assessments. It was not possible to compare response modes directly, as some of the children had too severe motor impairments to point at answer options with their hands. Instead, the group was split into a group using FP (N = 56) and a group using GP (N = 9). The two groups did not differ with regard to gender (χ²(1) = 0.015, p = 0.903), but the GP group was somewhat older (M(SD) = 113.6 (38.7) vs. 142.5 (51.6), t(67) = −6.068, p = 0.016). The difference in age is adjusted for using age normed tests, where each child is compared to what is expected at his or her age.
As expected, the GP group had more severe subtypes of CP (spastic quadriplegia and dyskinesia), severe speech (VSS level III-IV), and motor impairments (GMFCS level IV-V, and MACS level III-V), as well as a somewhat slower communication pace (CFCS levels II-IV) than the FP group (see Table 3).

Table 3. Characterization of the clinical sample (group of 69 children with cerebral palsy) and comparison of groups using finger versus gaze pointing
Columns: Finger pointing N(%); Gaze pointing N(%); Comparison of groups
Note: Pearson chi-square. *p < 0.05. **p < 0.01.

Statistics
To compare the distribution of categorical variables (physical severity of impairment and results from verbal and nonverbal tests) in the groups using FP and GP in the clinical sample, the Pearson chi-square test was computed. The assumptions regarding sample size and independence were fulfilled (Pallant, 2010; Wilson Van Voorhis & Morgan, 2007): each participant appears in only one group and the sample size is sufficiently large. No other statistical analysis was performed. The children with CP using GP have more severe impairments than children using FP, and the second study objective is therefore descriptive in nature: to document the clinical utility and feasibility of using alternative response modes.
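For readers who want to see how such a comparison is computed, the following is a minimal sketch of the Pearson chi-square test for a 2 × 2 table, without continuity correction. The analyses in the study were run in SPSS; the function names here are illustrative, and no cell counts from the study are used because they are not given in the text. The p-value function relies on the identity that, for one degree of freedom, the upper tail of the chi-square distribution equals erfc(√(χ²/2)).

```python
import math
from typing import Sequence


def pearson_chi_square_2x2(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic for a 2 x 2 table of counts (no continuity correction).

    All row and column totals must be non-zero, otherwise an expected
    count is zero and the statistic is undefined.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def chi_square_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Statistic reported in the text for the gender comparison between groups.
    print(round(chi_square_p_value_df1(0.015), 3))
```

Applied to the reported statistic of 0.015 with one degree of freedom, the p-value function gives approximately 0.90, in line with the reported p = 0.903.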

Ethics
In the clinical sample, parents were given written information about the project and asked to give consent to their child's participation in the project for children between six and sixteen years of age. Youth older than 16 years, who did not have an intellectual disability, gave consent themselves. This part of the study was approved by the Norwegian Regional Committees for Medical and Health Research Ethics (#2012/1409).

Results
Results from the response mode sample showed that standard scores for the six tests, not taking response mode into account, lay around the age mean and ranged from 99.5 to 105.0 (see Table 4). One-way repeated measures ANOVA showed no significant difference between the results on the six tests; Wilks's lambda = 0.699, F (5, 20) = 1.72, p = 0.176. When comparing results between tests with paired samples t-tests, scores on BPVS were somewhat higher than on TROG (t(24) = 2.13, p < 0.05). The mean standard scores across the three response modes, collapsed over tests, ranged from 100.2 to 104.5 (see Table 4) and one-way repeated measures ANOVA showed no significant difference between the three response forms, Wilks's lambda = 0.871, F (2, 23) = 1.71, p = 0.204. Splitting the tests into verbal and non-verbal domains did not alter this (verbal tests F (2, 22) = 1.403, p = 0.267 and non-verbal tests F (2, 23) = 0.191, p = 0.827).
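The relation between the reported Wilks's lambda values and the F statistics can be checked with a short sketch. It assumes the standard exact conversion for a design with a single within-subject factor, F = ((1 − Λ)/Λ) · (df_error/df_effect), and uses the closed-form upper tail of the F distribution that holds when the effect has two degrees of freedom. The function names are illustrative, and the analyses in the study were run in SPSS.

```python
def wilks_lambda_to_f(wilks_lambda: float, df_effect: int, df_error: int) -> float:
    """Exact F for a single within-subject factor: ((1 - L) / L) * (df_error / df_effect)."""
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks's lambda must lie in (0, 1]")
    return (1.0 - wilks_lambda) / wilks_lambda * df_error / df_effect


def p_value_f_two_effect_df(f_value: float, df_error: int) -> float:
    """Upper-tail p-value of F(2, df_error): (1 + 2F / df_error) ** (-df_error / 2).

    This closed form is valid only when the effect has exactly two degrees
    of freedom, as in the comparison of three response modes.
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)


if __name__ == "__main__":
    f_tests = wilks_lambda_to_f(0.699, df_effect=5, df_error=20)  # six tests
    f_modes = wilks_lambda_to_f(0.871, df_effect=2, df_error=23)  # three response modes
    print(round(f_tests, 2), round(f_modes, 2))
    print(round(p_value_f_two_effect_df(f_modes, 23), 3))
```

Because the lambda values are reported to three decimals, the recomputed F values can differ from the reported ones in the last digit; the sketch is meant as a consistency check, not as a replacement for the original analysis.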
The time used was compared for tests and response modes for the response mode sample (see Table 5). Between-group ANOVA shows that there were statistically significant differences for the time spent to complete tests using different response forms, but that the results were not significant for all tests. Post-hoc analysis with Tukey test revealed that for WPPSI MR, PAS was more time consuming than FP (p = 0.031). For Raven, PAS was more time consuming than both FP (p = 0.004) and GP (p = 0.005). For WPPSI RV PAS was also more time consuming than FP (p = 0.002) and GP (p = 0.008).
In the clinical sample, severity of motor impairment is related to severity of cognitive impairment; GMFCS is significantly correlated with both verbal tests (r = −0.39, p = 0.001) and non-verbal tests (r = −0.37, p = 0.003). In Table 6, descriptive statistics for the groups with CP using FP and GP are shown. The mean test scores are somewhat lower in the GP group than in the FP group, but the ranges of standard scores are comparable. In the FP group they range from 40 to 138, compared to a range from 37 to 120 in the GP group.

Table 4. Standard scores (mean and SD) from tests and across response modes in the response mode sample (group of typically developing children)
Notes: M = Average; SD = Standard deviation. 1 The matrix reasoning subtest from WPPSI-III. 2 The matrix reasoning subtest from WISC-IV. 3 Raven's coloured progressive matrices. 4 The receptive vocabulary subtest from WPPSI-III. 5 British picture vocabulary scale, second edition. 6 Test for reception of grammar, second edition. 7 FP = Finger pointing. 8 GP = Gaze pointing. 9 PAS = Partner-assisted scanning.

Discussion
In order to provide the best possible help for a child and promote development, it is important to identify each child's strengths and challenges early. Assessment of cognition is an important part of this process, and for children with severe speech and motor impairments it has the added value of providing knowledge about the child's level of understanding. This is vital for providing the child with an opportunity to express himself on a level comparable to his level of understanding. However, assessment of cognition in the most speech and motor impaired children is often not done, as they cannot respond in the standard manner to a test of cognition.
The most important finding of this study is that it is possible to substitute the standard response mode finger pointing with the alternative response modes gaze pointing or partner-assisted scanning on tests that have a fixed number of answer alternatives. Specifically, we found that type of response mode had no impact on the test results for TD children. This was found to be true when investigating six different tests from the two cognitive domains verbal comprehension and nonverbal reasoning.
If a modified response form uses the same cognitive resources as a standard response form, it means that the child processes information in the same way with both response forms (Casey et al., 2007). Our findings indicate that the cognitive and physical conditions for each response form did not affect, either positively or negatively, the performance of TD first grade children, thereby strengthening the argument that altering response mode is a form of test accommodation and that the alteration is small enough for the tests' original norms to still be applicable. The findings are consistent with and expand on results from previous studies that have shown that an alternative mode of response might be used on a single test (Arvidson, 2000; Casey et al., 2007; Miller, 1991; Spillane et al., 1996; Thurèn, 2010; Visser et al., 2013).
The second part of the study documents that using an alternative response mode such as gaze pointing, it possible to assess cognitive functioning in children with the most severe speech and motor disabilities. The results from the clinical sample reflect the wide variation in cognitive functioning that is expected to be found in a group of children with a neurological disability. The CP children responding with GP had more severe speech and motor impairments than the FP group, as confirmed by comparing subtype of CP and the levels on the classification instruments of gross and fine motor functioning, speech, and communication. Table 6. Standard scores (mean and standard deviation) in the clinical sample (group of 69 children with cerebral palsy); responding with either finger pointing or gaze pointing 1 FP: Finger pointing.
2 GP: Gaze pointing. 3 The matrix reasoning subtest from WPPSI-III. 4 The matrix reasoning subtest from WISC-IV. 5 Raven's coloured progressive matrices. 6 The receptive vocabulary subtest from WPPSI-III. We found a significant relationship between severity of cognitive impairment and severity of motor impairment, a finding also reported by others (Sigurdardottir et al., 2008). That the GP group gets somewhat lower scores than the FP group, even if only significantly so for the BPVS-II scores (results not showed), is therefore expected. However, this relationship, together with the lack of assessment of the severely motor impaired, sometimes leads to severe cognitive impairment being assumed in children with severe motor impairment (e.g. Hutton & Pharoah, 2002). Assumptions about cognitive functioning may be based on the type of school a child attends and not results from assessments (Vos et al., 2014), and degree of motor impairment have been used as an indicator of total disability load (Himmelmann, Beckung, Hagberg, & Uvebrant, 2006). Such imprecise methods do not yield a nuanced picture of child's cognitive profile and might lead to faulty estimations of the extent of intellectual impairments in the CP population. It also does not fully take into account that even though there are significant correlations between types of impairments, there is no one-to-one relationship between cognitive and motor functioning (Blair, 2010;Sigurdardottir et al., 2008). Therefore, the most important result from the clinical sample is the diversity of the results; it shows that it is possible to obtain meaningful and nuanced information about cognitive functioning using GP pointing as response mode. The test results demonstrate that even in a group of severely speech and motor impaired children there are children obtaining scores within the normal range, and even above the age mean on some tests (on TROG, BPVS, and Raven). 
Answering tests of verbal comprehension and non-verbal reasoning with GP as response mode makes it possible for these children to demonstrate their strengths, despite the severity of their speech and motor impairments. The results therefore underline the importance of individually assessing all children with CP, as inferences about cognitive functioning cannot be made based on extent of impairment in other areas (Blair, 2010).
The results from the response mode sample showed some differences between the response modes in the time needed to complete the tests. PAS was the most time-consuming response mode, but the difference was not observed for all tests. Most importantly, the extra time used did not impact scores in a group of TD children. It cannot, however, be ruled out that it might have affected scores in children with motor impairments. For children with severe motor impairments, time taken to complete tests is of importance, as prolonged assessment can lead to fatigue (Geytenbeek, Heim, Vermeulen, & Oostrom, 2010). One would therefore think that GP, and not PAS, should be recommended. However, time taken to complete tests is not the only consideration. Children with CP often have eye motor impairments and challenges with visual perception (Fazzi et al., 2012). If the child has eye motor problems, such as strabismus, or is wearing glasses with thick lenses, it will be difficult to calibrate an eye-tracking device, which is crucial if one uses GP as a form of communication. PAS is therefore the access method usually used by children with severe speech and motor impairments who also have eye motor or visual problems (Beukelman & Mirenda, 2005). Our study confirms that PAS might be a viable clinical option, even if it took somewhat longer to administer some of the tests in this manner.

Implications of the findings
The study illustrates how tests can be accommodated by altering response mode. It has important implications for children who, due to motor impairments, are not able to answer tests in a standard manner. Such accommodation is needed. Even though children with severe speech and motor impairments are in obvious need of interventions, they are often not assessed and are not always provided with the most basic interventions they need, such as AAC (Andersen, Mjøen, & Vik, 2010; Andersen et al., 2008). The possibility of obtaining reliable test results is also important for academic interventions, as they would otherwise be based merely on clinical estimations of cognitive level. In children with severe speech and movement impairments, such practice has been associated with underestimations of cognitive functioning (Sigurdardottir et al., 2008; Stadskleiv, Vik, Andersen, & Lien, 2015). It is therefore recommended that all children with CP, including those with the most severe speech and motor impairments, be assessed cognitively at regular intervals (Bøttcher et al., 2015).
Using eye gaze technology to assess children in a clinical setting is feasible. Provided that consent is obtained from test developers/publishers for this form of test administration, the technology for GP on computers is available. The type of computer used in this study is the same type that children use for communication (speech generating devices). For tests designed to be administered on a computer, the newest types of speech generating devices offer the possibility of using gaze pointing in the same manner as a computer mouse. If one wishes to use the test material in its original form, PAS can be chosen as response mode for both paper and computer-based tests.
Our results point to the necessity of considering several factors when choosing the suitable response modality for assessing cognition in children with severe speech and movement impairments. There is no one preferable response mode that is applicable for all children. Instead the choice should be based on knowledge of the child's functioning, and factors such as eye motor functioning, visual perception, and fatigability should be considered in order to select the response mode that allows a child with disabilities to show his or her true cognitive potential.
Although the study is based on children, there is no reason to assume that the described assessment methodology would not be suitable for adults. The youngest child using GP in our clinical sample was 5:1 and the oldest 17:6 years. This illustrates that alternative response modes are possible to use from a young age and that they can also be used for older children and adolescents. This implies that the findings have relevance not only for children with congenital disabilities, but also for children, adolescents, and adults who because of acquired brain injuries have severe speech and movement impairments.

Limitations of the study
The size of the response mode sample is small. The study aimed for a convenience sample of typically developing first graders. Out of 138 invitations provided, only 31 consented to participation. However, as the teachers were instructed to invite all children fulfilling the inclusion criteria, we have no reason to believe that the participating children were significantly different from non-participating children with regard to the use of different response modalities. The reasons for the somewhat low response rate are thought to be that the children had just started school and some parents therefore were reluctant to withdraw the children from class, even if only for two hours, or that some of the invitations did not reach the parents but were accidentally lost by the children among all the other papers handed out at the start of a school year. The small sample size is of course a weakness, as values that are divergent from the average, together with small sample sizes, reduce the strength of parametric tests and increase the likelihood of both Type I and Type II errors (Tabachnick & Fidell, 2007). In our study, Type II errors (reporting a non-significant difference when there is in fact a significant difference) are the most challenging. However, finding that the test scores were normally distributed with a mean as expected for age, and with no outliers exceeding two standard deviations, lessens this concern.
The clinical sample was recruited from a population-based cohort of children with CP, and is thus thought to be representative of children with CP living in Norway. The sample was relatively small, especially in view of its heterogeneity, which might limit its generalizability. However, the sample did not differ significantly from the total population of children with CP in Norway, where 43 percent have unilateral spastic CP, 44 percent bilateral spastic CP, 7 percent dyskinetic CP, and 5 percent ataxic movement pattern (Annual report CPOP and CPRN 2014, www.oslo-universitetssykehus.no/cpop), compared to 49 percent with unilateral spastic, 41 percent with bilateral spastic and 10 percent with dyskinetic CP in our sample.
The response mode sample consists of typically developing children, and it was expected that the average score would lie around the age mean, as was also the finding, and not be significantly different on the six tests. However, a small but statistically significant difference between the scores on the tests BPVS and TROG was observed. It would seem plausible that this could be explained by 30 percent of the response mode sample having a different language than Norwegian as their mother tongue, thereby making understanding of advanced grammar more challenging. However, the test scores on TROG were not significantly different for those having and not having Norwegian as native language. The difference is therefore assumed to reflect some difference in the norming of the two tests.
Another weakness of the study design is that the order of the response modes was not systematically varied among the participants. Due to the small number of participants, it was deemed most important to vary the order of the tests so that all tests were answered with all three response modes. The results indicate that the obtained test scores did not vary systematically between the different response modes, suggesting that the fixed order did not influence them. However, further studies are called for to investigate whether the order of response modes influenced time usage.
In the clinical group, all children using GP to answer the tests also used this as access method to their communication aids. It was therefore seen as reasonable to assume that they were practiced eye gaze users. That no formal assessment of gaze pointing ability was undertaken could still be considered a weakness. However, had gaze pointing been a severe obstacle for the children using it as response mode in the CP group, it is reasonable to expect that the scores on the non-verbal tests (with the visually more challenging layout) would be lower than the results on the verbal tests. Contrary to this, the mean scores for the GP group on the non-verbal tests are not significantly different from the mean scores on the verbal tests (M (SD) = 68.1 (16.5) vs. 69.1 (21.3), paired-samples t-test, t(7) = 0.126, p = 0.513). This suggests that any challenges the children had with the tests were not due to access method.

Conclusion
Assessment of cognition in children with severe speech and movement impairments is challenging, but possible. Substituting the standard response mode of FP with the alternative forms GP or PAS is regarded as a minimal but necessary accommodation of standardized psychological tests, so that children with severe speech and movement disorders can have the necessary assessments. Results from this study show that the alternative response forms GP and PAS did not influence how healthy children scored on tests of language comprehension and visuospatial reasoning. PAS seems to be a more time-consuming response form than FP and GP. Several factors should be considered when selecting an alternative response form, such as eye motor impairments, visual perceptual skills, and fatigability.