Dynamic Testing of Children’s Series Completion Ability: Cognitive Flexibility as a Predictor of Performance

Dynamic testing aims to explore a child’s potential to learn by assessing improvement after training. In this study we investigated the relationship between performance on a dynamic test of series completion and children’s cognitive flexibility. We used a pre-test-training-post-test control-group design with 95 children, aged 6-8 years (M = 7;1, SD = 12.5 months). Cognitive flexibility was measured in all children. Half of the children were trained in series completion according to a graduated prompting model, while the other half only practiced. Based on initial ability and performance change after training, children were classified as non-learner, learner or high performer. The results showed that training improved series completion performance more than practice alone. Cognitive flexibility predicted static pre-test performance and instructional needs during training and might therefore be of importance in the assessment of learning potential.

, and is specifically characterized by the ability to identify patterns in series of letters, numbers or schematic representations. Compared to solving a letter or number series task (e.g., Ferrara, Brown, & Campione, 1986; Simon & Kotovsky, 1963), solving pictorial series is argued to require a more complex procedure, as the schematic pictures do not have a fixed relationship to each other. Children are required to search for various strings of regularly repeating elements, in combination with unknown changes in the relationship between these elements, which is not necessarily a left-to-right process (e.g., Resing & Elliott, 2011; Resing et al., 2012b; Sternberg & Gardner, 1983).
The present study aimed to investigate the extent to which series completion skills taught during the training phase of a dynamic test are related to executive functions, and to cognitive flexibility specifically. Numerous studies have established a relationship between dynamic testing and school achievement in which individual differences in performance on dynamic measures provide additional information about an individual's cognitive potential and instructional needs (e.g., Caffrey, Fuchs, & Fuchs, 2008; Resing et al., 2012b). However, research exploring whether learning potential, as measured with dynamic testing, is related to executive functioning, and to cognitive flexibility in particular, is scarce. This is nevertheless an important question, since knowing more about this relationship may provide a basis for understanding which children do and do not profit from training. It may therefore contribute to fostering learning potential in children or indicate possibilities for compensating for persistent deficits.
Executive Functions (EFs) refer to inter-related mental processes that are necessary for the regulation of thinking and acting. Three core EFs are usually distinguished: working memory, inhibition and cognitive flexibility (e.g., Collette et al., 2005; Miyake, 2000). Cognitive flexibility, also referred to as set-shifting or mental flexibility, has been defined as the ability to change perspectives on a problem and to flexibly adjust to changing rules or priorities. In addition, it has been described as the ability to learn from mistakes and feedback, to generate alternative strategies and to process multiple sources of information simultaneously (e.g., Anderson, 2002; Diamond, 2013). In many models of executive functioning (e.g., Crone, Ridderinkhof, Worm, Somsen, & Van der Molen, 2004; Davidson, Amso, Anderson, & Diamond, 2006), cognitive flexibility is hypothesized to be strongly related to working memory capacity and inhibition control. Many researchers have found a strong relationship between fluid intelligence and executive functions, in particular working memory capacity and inhibition control (e.g., Conway, Kane, & Engle, 2003; Duncan et al., 2008; Roca et al., 2010). To date, little research has examined the relationship between cognitive flexibility and fluid intelligence, but the few available results show promising correlations (e.g., Van der Sluis, De Jong, & Van der Leij, 2007; Roca et al., 2010). Van der Sluis et al. (2007) reported that performance on naming-based shifting tasks and a trail-making task predicted a significant amount of variance in non-verbal reasoning ability. Roca et al. (2010) found strong correlations between independent measures of cognitive flexibility (a complex set-shifting task and a verbal fluency task) and measures of fluid intelligence.
These outcomes support the assumption that fluid intelligence, described as the ability to solve problems, reason and see patterns or relations among items (Ferrer, O'Hare, & Bunge, 2009), shows many similarities with the problem solving and reasoning components of the executive functions (Diamond, 2013).
The present study examined whether cognitive flexibility would predict children's inductive reasoning ability as measured with a dynamic series completion test. Previous research has shown that executive functions, working memory capacity in particular, have some relationship with dynamic test outcomes, especially when graduated prompting training effects of series completion and analogical reasoning tasks were explored (e.g., Resing et al., 2012b; Stevenson, Heiser, & Resing, 2013a; Tunteler & Resing, 2010). Reported inter-relations between executive functions (e.g., Crone et al., 2004; Davidson et al., 2006) and earlier established correlations between cognitive flexibility and fluid reasoning ability in a static testing context (e.g., Van der Sluis et al., 2007) raised two expectations: that cognitive flexibility would be related to the ability to reason inductively (more specifically, the ability to solve incomplete series) in the static pre-test of the dynamic test, and that flexibility would moderate the effect of training on children's ability to reason inductively.
In the commonly used dynamic testing pre-test-training-post-test design, structured feedback is provided during one or more training sessions and is considered a way of uncovering potential cognitive abilities (Sternberg & Grigorenko, 2002). Gain scores (post-test minus pre-test score) are often used as an indication of children's potential abilities, but they have been considered unreliable in the context of classical test theory (Cronbach & Furby, 1970; Embretson, 1991). The main problem with using performance change scores in a dynamic test setting is that pre-test and post-test scores, neither of which has optimal reliability, are often highly correlated. Furthermore, the scores are considered to be sensitive to floor and ceiling effects and to regression to the mean (e.g., Embretson & Reise, 2000; Guthke & Wiedl, 1996). To address these methodological problems, a typicality logic model of analysis was applied to the performance change scores (e.g., Schöttke, Bartram, & Wiedl, 1993; Wiedl, 1999). This model distinguishes between "Learners", "Non-Learners" and "High-Scorers" with regard to the applied training method. Participants are often classified according to their learner status using the number of correct responses on pre-test and post-test (e.g., Schöttke et al., 1993; Wiedl & Wienöbst, 1999).
The current study investigated the effect of a graduated prompts training method (e.g., Sternberg & Grigorenko, 2002; Stevenson, Hickendorf, Resing, Heiser, & De Boeck, 2013b) on children's series completion performance while examining the role of cognitive flexibility as measured with the Modified Wisconsin Card Sorting Task (M-WCST; Nelson, 1976; Schretlen, 2010). During the pictorial series completion tasks the children were prompted, if necessary, to complete the series. Feedback was given in the form of graduated prompts, which were provided whenever the children encountered difficulties in solving the tasks (e.g., Campione & Brown, 1987; Fabio, 2005; Resing & Elliott, 2011).
In accordance with previous research utilizing former versions of the dynamic series completion task, we expected (hypothesis 1) performance on the series completion task to improve more in children trained with graduated prompts than in children who only practiced with the items (e.g., Resing & Elliott, 2011; Resing, Tunteler, & Elliott, 2015; Resing et al., 2012b). In line with earlier established relationships between executive functions (working memory and cognitive flexibility) and inductive reasoning (e.g., Roca et al., 2010; Van der Sluis et al., 2007), we hypothesized that children with lower cognitive flexibility performance would on average display a weaker initial performance on the dynamic series completion task than children with higher cognitive flexibility performance (hypothesis 2a). In addition, given previously found relationships between executive functions and dynamic test outcomes (e.g., Resing et al., 2012b; Tunteler & Resing, 2010), we hypothesized that lower cognitive flexibility scores would predict a less efficient learner status (i.e., non-learner) in practice-only children but not in trained children (hypothesis 2b), indicating a moderator effect of cognitive flexibility on the relationship between training and children's series completion performance. Lastly, because of this moderator effect, a differential need for instruction during training based on cognitive flexibility scores was expected (hypothesis 2c).

Participants
Participants were 95 children (44 boys, 51 girls) from the first and second grades of primary school (M = 7;1, SD = 12.5 months). All children were native Dutch speakers from four middle-class elementary schools in the western part of the Netherlands. Schools and children were selected based upon their willingness to participate. Written informed consent was obtained from all parents.

Design & Procedure
A pre-test-training-post-test control-group design with randomized blocking was employed. Blocking was based on scores on a test of visual exclusion. Children were, per blocked pair, randomly allocated to one of two conditions: (1) training with graduated prompts and (2) practice only. All children were administered the Modified Wisconsin Card Sorting Task (Nelson, 1976; Schretlen, 2010) during the first session. During the second session, all children solved the pre-test items. In the following two experimental sessions, trained children received the graduated prompts training whereas the practice-only group solved dot-to-dot tasks. During the last session all children completed the post-test, a parallel version of the pre-test. Sessions took place weekly in a quiet location at the child's school and lasted approximately 30 minutes each.
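The blocked allocation described above can be illustrated with a minimal sketch. It assumes that children are ranked on their visual exclusion score, paired with their nearest neighbour in that ranking, and randomly split within each pair. The function name, the data layout and the handling of an unpaired child are hypothetical and are not taken from the study protocol.

```python
import random


def allocate_blocked_pairs(scores, seed=0):
    """Randomly allocate children to conditions within score-matched pairs.

    scores: dict mapping a child identifier to a blocking score
            (assumed here to be the visual exclusion score).
    """
    rng = random.Random(seed)
    # Rank children from highest to lowest blocking score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    allocation = {}
    # Walk through the ranking two children at a time.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random order within the blocked pair
        allocation[pair[0]] = "graduated prompts"
        allocation[pair[1]] = "practice-only"
    # Assumption: a leftover child (odd sample size) is assigned at random.
    if len(ranked) % 2 == 1:
        allocation[ranked[-1]] = rng.choice(["graduated prompts", "practice-only"])
    return allocation
```

With an even number of children, this yields equally sized conditions that are matched on the blocking score.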

Visual Exclusion
The RAKIT subtest Visual Exclusion (Resing, Bleichrodt, Drenth, & Zaal, 2012a) was administered to measure children's initial inductive reasoning ability. The children were asked to induce a rule to determine which of four figures did not belong with the others.

Modified Wisconsin Card Sorting Task
The M-WCST (Nelson, 1976; Schretlen, 2010) was used to assess cognitive flexibility. Using four stimulus cards (one red triangle, two green stars, three yellow crosses and four blue circles), the children were asked to sort 48 response cards according to color, shape or number. The children were informed whether or not their sort was correct, without any suggestion regarding the sorting criterion. The M-WCST was administered according to Nelson's criteria, implying that after six consecutive correct sorts the child was explicitly told that the sorting criterion had changed. The first and second sorting criteria chosen by the child were considered correct, implying that the third criterion was automatically established by the choice of the first two criteria. After the three sorting criteria had been correctly completed, the same three criteria were requested again in the same order. The procedure ended after completion of the three categories twice (possible by sorting 36 consecutive cards correctly) or after all 48 response cards had been sorted. Following Nelson's (1976) criterion, the percentage of perseverative errors was used as an index of cognitive flexibility; a perseverative error occurs when the child persists in sorting according to the same criterion as the immediately preceding incorrect sort. Errors made when the child did not switch sorting criterion after being told that the criterion had changed were also counted as perseverative (Ciancetti, Corona, Foscoliano, Contu, & Sannio-Fancello, 2007).
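As an illustration of this scoring rule, the sketch below counts perseverative errors in a recorded sequence of sorts. The `Trial` record, its field names and the choice of denominator (perseverative errors as a percentage of all errors) are assumptions made for the example; the text above does not specify the denominator or the data format used in the study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    sorted_by: str                         # "color", "shape" or "number"
    correct: bool                          # feedback given to the child
    after_shift_announcement: bool = False  # first sort after "the rule has changed"


def percent_perseverative_errors(trials):
    """Percentage of errors that are perseverative (assumed denominator: all errors)."""
    errors = 0
    perseverative = 0
    for i, trial in enumerate(trials):
        if trial.correct:
            continue
        errors += 1
        if i == 0:
            continue  # no preceding sort to persevere on
        previous = trials[i - 1]
        same_criterion = previous.sorted_by == trial.sorted_by
        # Nelson's rule: repeating the criterion of the preceding incorrect sort.
        repeats_previous_error = (not previous.correct) and same_criterion
        # Added rule: keeping the old criterion after the change was announced.
        ignores_announced_shift = trial.after_shift_announcement and same_criterion
        if repeats_previous_error or ignores_announced_shift:
            perseverative += 1
    return 100.0 * perseverative / errors if errors else 0.0
```

A lower percentage corresponds to higher cognitive flexibility in the sense used in this study.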

Series Completion Task
A dynamic series completion test was used to measure children's inductive reasoning skills. The test used in the current study consisted of a selection of items from a more comprehensive, electric console version of the dynamic series completion test (Resing et al., 2015) and followed the same procedural guidelines and prompting protocol. The task was based on an analytic model of series completion (Sternberg & Gardner, 1983) and involved solving pictorial series completion items. In this study the items were presented as open-ended construction tasks without the use of the electric console. The construction principles and the analytic model on which the series completion test is based have been described in Resing and Elliott (2011). All items consisted of a schematic puppet series that the children were asked to complete (see Figure 1). The children were required to construct the last puppet in each series by encoding the different task elements of the series while simultaneously identifying the changing relationships between these task elements. The changes in task elements were represented by changes in the gender of the puppet (male/female), in the color of the different body parts (blue, green, yellow and pink) and in the design of the different body parts (stripes, dots, no design). The difficulty of the items was determined by the frequency of recurring patterns in the series (periodicity) and the number of recurring pattern transformations. Answers were constructed by choosing 8 plastic body parts (head, 2 arms, 3 belly-parts, 2 legs), available in every possible combination of body part and design (stripes, dots, no design), and placing them on a plasticized paper puppet shape. Both pre- and post-test consisted of 12 series completion items, increasing in difficulty.
The post-test was a parallel version of the pre-test regarding item difficulty; the items only differed in gender, color and design. During the pre- and post-test, the children did not receive any feedback or prompts regarding their performance. After construction of each answer the child was asked to explain his/her reasoning.

Series Completion: Dynamic Training
The two training sessions each consisted of six series completion items, increasing in difficulty. Children who encountered difficulties while solving the task received help according to a graduated prompting procedure (Resing, 2000; Resing et al., 2012b; Tunteler & Resing, 2010). This procedure consisted of small structured steps, gradually changing from very general to task-specific instructions. After one example series, each item was presented with a general instruction. The child responded by constructing his/her response with the plastic body parts and then received feedback on the response. If the child's response was incorrect, one prompt was provided according to the standardized protocol. This was repeated until the child constructed the correct answer or the final prompt had been given. The graduated procedure started with a metacognitive prompt, followed by two more specific cognitive hints and finally a step-by-step scaffold to solve the problem. After each correct answer the child was asked to explain the solution. Qualified undergraduate psychology students, trained in advance in all testing and training procedures, implemented the prompting procedure.
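The control flow of this prompting procedure can be summarized in a short sketch. The prompt labels follow the four steps named above; the function name and the `get_response` callback, which stands in for the child constructing an answer, are hypothetical, and the actual prompt wording of the standardized protocol is not reproduced here.

```python
# Prompt sequence as described in the text: general to task-specific.
PROMPTS = (
    "metacognitive prompt",
    "cognitive hint 1",
    "cognitive hint 2",
    "step-by-step scaffold",
)


def administer_item(get_response, correct_answer, prompts=PROMPTS):
    """Present one training item and return (prompts_used, solved).

    get_response: callable taking the prompt just given (None for the
    initial general instruction) and returning the child's answer.
    """
    prompts_used = 0
    response = get_response(None)  # first attempt after the general instruction
    while response != correct_answer and prompts_used < len(prompts):
        prompt = prompts[prompts_used]
        prompts_used += 1
        response = get_response(prompt)  # new attempt after one more prompt
    return prompts_used, response == correct_answer
```

The number of prompts returned per item is the quantity that is summed over items as the measure of instructional need.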

Scoring
To evaluate children's performance, two measures were obtained: (1) whether the solution was correct or incorrect and (2) the number of prompts required per training item. The overall difference in number of correct responses between pre- and post-test (the overall gain score) was used to determine training effectiveness. The number of correct responses on pre- and post-test and the corresponding standard deviations were used to determine the child's learner status. The total number of prompts required per item was used to determine the amount of help required to complete the training.

Psychometric Properties
Internal consistency (Cronbach's α) was .74 on the pre-test. Internal consistency on the post-test was calculated separately for the practice-only and training conditions, with α = .81 and α = .66, respectively. The test-retest reliability, measured as the correlation between the pre-test and post-test total number correct in the practice-only condition, was r = .58, p < .001.
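For readers who wish to reproduce such reliability estimates, the sketch below implements the standard formula for Cronbach's α from a matrix of item scores (one row per child, one column per item). The function name and data layout are illustrative assumptions; the study's own data and software are not shown.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows, each row holding one child's item scores."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across children.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the total score across children.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

For the 12-item pre-test, each row would contain twelve 0/1 scores (incorrect/correct).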

Training Effectiveness in Improving Series Completion Performance
Our first research question concerned the effect of the graduated prompts training in improving children's performance on the series completion task. Analyses regarding effectiveness were conducted using (1) pre- to post-test progression and (2) children's learner status.

Pre-Test to Post-Test Progression
We expected that graduated prompt techniques would lead to greater improvement in series completion scores (hypothesis 1). This was investigated using a repeated measures analysis of variance (RM ANOVA) with series completion performance scores per session as dependent variable, with Session as within-subjects factor and Condition (graduated prompts vs. practice-only) as between-subjects factor (see Table 2 for basic statistics). The main effect for Session was significant (Wilks's λ = .65, F(1,93) = 49.51, p < .001, ηp² = .35), showing that children, on average, progressed in series completion performance across sessions. The significant Session × Condition interaction (Wilks's λ = .76, F(1,93) = 29.53, p < .001, ηp² = .24) indicates that children in the two conditions differed in their degree of progression. As can be seen in Figure 2, children in the graduated prompts condition showed a greater gain in accuracy in solving series problems than children in the practice-only condition, supporting our first hypothesis.
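The reported effect sizes can be checked against the F statistics with the usual conversion for partial eta squared; for an effect with one degree of freedom this value also equals 1 minus Wilks's λ. The short sketch below is only a consistency check on the published numbers, with a hypothetical function name.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Session and the Session x Condition interaction, F(1, 93).
session_effect = partial_eta_squared(49.51, 1, 93)
interaction_effect = partial_eta_squared(29.53, 1, 93)
print(round(session_effect, 2), round(interaction_effect, 2))
```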

Learner Status
The number of correct responses on pre- and post-test was used to classify children according to their learner status. Schöttke et al. (1993) described an algorithm that identifies learners as those subjects who improve their performance from pre-test to post-test by at least 3.63 correct answers (1.5 SD). High-scorers are identified as those children whose pre-test score lies between the pre-test upper level of 10 and a lower level of 6.37 (upper level minus 1.5 SD) correct responses. Non-learners do not meet either criterion. According to this classification system, in the current study 35 participants were classified as learner, 8 participants as high-scorer and 52 participants as non-learner.
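The classification rule can be written out as a small decision function. The cut-offs are the ones reported above (a gain of 3.63, and a pre-test band from 6.37 to 10). The order in which the two criteria are checked is an assumption of this sketch, since the text does not state how a child meeting both criteria is handled, and the function name is hypothetical.

```python
def classify_learner_status(pre, post, gain_criterion=3.63,
                            high_lower=6.37, high_upper=10):
    """Classify a child from pre- and post-test number correct.

    Assumption: the high-scorer criterion is checked before the learner
    criterion; the source does not specify the precedence.
    """
    if high_lower <= pre <= high_upper:
        return "high-scorer"
    if post - pre >= gain_criterion:
        return "learner"
    return "non-learner"
```

For example, a child moving from 2 to 6 correct responses would be classified as a learner under these cut-offs, whereas a child moving from 3 to 5 would be classified as a non-learner.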
Multinomial logistic regression analyses with Learner Status (learner, non-learner or high-scorer) as dependent variable and Condition as factor showed that condition significantly predicted whether children were classified as learner or as non-learner (b = -2.03, Wald χ²(1) = 16.29, p < .001). The odds ratio showed that as condition changed from practice-only (0) to graduated prompts (1), the change in the odds of being a learner rather than a non-learner was 0.13. In other words, the odds of being a learner rather than a non-learner were 1/0.13 = 7.69 times higher for a child in the graduated prompts condition than for a child in the practice-only condition. Condition did not significantly predict whether children were classified as high-scorer or as non-learner (b = -1.32, Wald χ²(1) = 2.80, p = .09). The odds of being a high-scorer rather than a non-learner were not significantly different for a child in the graduated prompts condition than for a child in the practice-only condition. The same non-significant result applied to classifying a child as a high-scorer or a learner based on condition (b = 0.71, Wald χ²(1) = 0.72, p = .40), implying that the odds of being a high-scorer instead of a learner did not differ significantly between the graduated prompts condition and the practice-only condition (see Table 3).
In sum, the graduated prompts training positively influenced children's series completion ability, as training significantly increased the odds of a child being a learner, whereas high-scorers seem uninfluenced by the effects of training. These results support our expectations regarding training effectiveness.
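The odds ratios discussed in this section follow from exponentiating the logistic regression coefficients. The sketch below reproduces the arithmetic for the learner versus non-learner contrast; the function name is illustrative.

```python
import math


def odds_ratio(b):
    """Odds ratio corresponding to a logistic regression coefficient b."""
    return math.exp(b)


learner_vs_non_learner = odds_ratio(-2.03)
print(round(learner_vs_non_learner, 2))  # odds ratio for the reported coefficient
print(round(1 / 0.13, 2))                # inverse of the rounded odds ratio
```

Inverting the rounded odds ratio of 0.13 gives the factor of 7.69 used in the text.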

Role of Cognitive Flexibility in Learning Potential
The second main research question pertained to the role of cognitive flexibility in learning potential, specifically its role in children's learner status and instructional needs. The aim was to analyze whether the graduated prompts training would moderate the relationship between cognitive flexibility and improvement in series completion performance. It was expected that higher flexibility scores would predict better pre-test performance and a more efficient learner status (i.e., learner or high-scorer), whereas lower flexibility scores would predict a weaker pre-test performance and a less efficient learner status (i.e., non-learner) (hypothesis 2a). Secondly, it was expected that cognitive flexibility would moderate the relationship between condition and series completion performance: lower flexibility performance would be related to a less efficient learner status in the practice-only condition but not in the trained condition (hypothesis 2b). Regarding instructional needs, it was expected that higher flexibility scores would negatively predict the number of required prompts during training, such that children with lower flexibility performance would require significantly more prompts than children with higher flexibility performance (hypothesis 2c).

Pre-Test Performance
A linear regression analysis with number of correct responses on pre-test as dependent variable and flexibility performance as independent variable showed a significant result (F(1,94) = 6.87, p = .004), in which flexibility performance accounted for 7% (R² = .069) of the variability in the number of correct responses and had a weak but significant relationship with the number of correct responses on the pre-test (b = -0.04, t = -2.62, p = .004).

Learner Status
Multinomial logistic regression analyses with Learner Status as dependent variable, Condition as factor and Flexibility scores as covariate revealed that flexibility scores significantly predicted whether children were classified as non-learner or as high-scorer (b = -0.08, Wald χ²(1) = 5.91, p = .023). The odds ratio showed that as perseverative errors decreased by one point, the change in the odds of being a high-scorer rather than a non-learner was 0.28, indicating that children were more likely to be a high-scorer when their flexibility scores were higher. Significant results applied to classifying a child as high-scorer instead of learner as well (b = -0.06, Wald χ²(1) = 3.36, p = .015), indicating that as perseverative errors decreased by one point, the change in the odds of being a high-scorer (rather than a learner) was 0.94. Flexibility scores did not significantly predict whether children were classified as non-learner or learner (b = -0.02, Wald χ²(1) = 1.70, p = .19), implying that the odds of being a learner (rather than a non-learner) did not differ significantly between children with fewer and children with more perseverative errors (see Table 4).
However, these influences of cognitive flexibility on children's learner status did not depend on whether the children received the graduated prompts training; no significant interaction between condition and flexibility performance was found in the logistic regression model. The graduated prompts training does not seem to moderate the relationship between cognitive flexibility as measured with the M-WCST and series completion performance.

Instructional Needs
A univariate ANOVA was conducted to determine whether the total number of required prompts during training (dependent variable) was related to learner status (between-subjects factor). The results showed significant differences in number of required prompts (F(2,46) = 11.35, p < .001) between the three learner types. Post hoc comparisons using the Tukey HSD test indicated that the mean need for prompts for the non-learner group (M = 20.56, SD = 9.03) was significantly higher than the mean score for the learner group (M = 11.08, SD = 6.03) and the high-scorer group (M = 5.60, SD = 8.79). The learner group did not significantly differ from the high-scorer group.
A linear regression analysis with number of required prompts during training as dependent variable and flexibility performance as independent variable showed a significant result (F(1,45) = 10.37, p = .006), in which flexibility performance accounted for 18% (R² = .18) of the variability in the number of required prompts and had a significant, moderate relationship with the number of required prompts during training (b = 0.43, t = 3.22, p = .006).

Discussion
The main aim of this study was to explore the role of cognitive flexibility in children's instructional needs and responsiveness to training during a dynamic test of series completion skills. Dynamic testing aims to establish how much a child learns from a short training procedure, in order to provide insight into the child's potential for learning. Progress in series completion skills was compared between children who were trained and children who only practiced without guidance. In line with previous studies utilizing the dynamic series completion test (e.g., Resing & Elliott, 2011; Resing et al., 2012b), we found an overall improvement in performance, regardless of condition, and trained children showed greater progression in series completion performance than practice-only children. To avoid reliance on statistically unreliable gain scores, we assessed learning potential with a typological model of learner status classification (e.g., Budoff, 1968; Wiedl & Wienöbst, 1999), describing the degree of performance change from pre-test to post-test at subgroup level, where the post-training score was adjusted for pre-test level. The results indicated that training increased children's odds of being a learner instead of a non-learner, supporting the effectiveness of the series completion training. However, the training did not differentiate between non-learners and high-scorers, possibly indicating that non-learners may not have learned regardless of condition. The graduated prompts approach used may not be sensitive enough for these children. Previous research has shown that non-learners who do not profit from the usual dynamic intervention do profit from other training based on principles of errorless learning (e.g., Kern, Liberman, Kopelowicz, Mintz, & Green, 2002).
Errorless learning, a learning approach in which the negative effects of making incorrect choices are reduced, has previously been demonstrated to be effective for typical children and for children who have difficulty adapting to a change in cognitive rules or behavioral repertoires (Schreibman, 1975; Venn et al., 1993). In addition, some children may not need training because they are consistent high-scorers.
Regarding the influence of cognitive flexibility on children's learner status, we investigated whether perseverative behavior was a source of subgroup differences. Previous research with children has shown that executive functions (working memory, inhibition control and cognitive flexibility) are to a certain degree related to fluid intelligence and inductive reasoning (e.g., Duncan et al., 2008; Roca et al., 2010). Our results support this, as cognitive flexibility performance (i.e., perseverative behavior) predicted children's initial (static) pre-test performance. With regard to dynamic test performance, perseverative behavior played a significant role in children's instructional needs: less perseverative behavior predicted fewer prompts required during training. This finding is in line with research conducted by Resing et al. (2012b) and Stevenson et al. (2013a), in which relationships were found between executive functioning, working memory in particular, and dynamic test outcomes. The substantial relationship found between cognitive flexibility and instructional needs is consistent with the extensive literature describing cognitive flexibility as "being flexible enough to adjust to changed demands or priorities" (Diamond, 2013, p. 149) and "utilization of feedback" (Anderson, 2002, p. 72). Our results appear to show that this cognitive construct plays a role in the ability to profit from a short graduated prompting procedure and support our hypothesis that cognitive flexibility is related to children's instructional needs. This is an important issue, as it points to the need to differentiate when designing training for practical in-classroom applications.
Inductive reasoning ability and cognitive flexibility are both well-known constructs in intellectual ability tests and appear to be related to a certain degree. Performance change due to training (the child's learner status in this particular study) is less often included in the assessment of cognitive abilities. Our surprising finding that cognitive flexibility did not moderate the effect of training raises the suspicion that the assessment of cognitive flexibility with the M-WCST might not have been optimal. The M-WCST, as compared to the original WCST, contains a regular announcement of the change of category, which is by itself a dynamic intervention that for some subjects compensates for low flexibility (Wiedl, 1999). In addition, the effect of flexibility may have been attenuated in both conditions of our dynamic test, because the effects of (un)guided training could compensate for differences between children that are due to differences in flexibility. The instruction to explain the reasons for their solutions during pre- and post-test, given in both groups, might be considered a dynamic intervention by itself, which may improve performance for some of the children (e.g., Carlson & Wiedl, 1992). As a consequence, it remains an open question which internal characteristics of children make them a learner or a non-learner. With regard to the classification of children according to learner status, a comprehensive typology encompassing more subtypes (e.g., Waldorf, Wiedl, & Schöttke, 2006) might have provided better insight into the differentiating effect of flexibility on children's learner status.
In sum, the dynamic series completion test distinguishes between non-learners and learners based on their fluid reasoning ability. Analyzing the results at subgroup level helped to identify the need for special interventions in both the non-learner group and the high-scorer group. Cognitive flexibility appears to influence children's series completion performance, as it plays a role in children's initial performance and predicts instructional needs during training. In future studies it would be interesting to further investigate instructional aspects of dynamic versus static testing in relation to the effects of executive functioning on children's learner status. This may provide further insights into children's potential to learn as measured during dynamic testing and into the application of assessment information in educational practice.