Assessment of temperament in children with profound intellectual and multiple disabilities. A pilot study into the role of motor disabilities in instruments to measure temperament

Abstract Research on temperament has evolved substantially in recent years. Assessing temperament in a child gives information about why the child reacts differently in different situations and can be seen as one of the variables playing a role in determining adaptive and maladaptive outcomes. Insight into the temperament of the child, therefore, facilitates the adaptation of support or child-rearing practices to the specific needs and wishes of the child. The current study aimed at reviewing existing temperament instruments for use with young children with profound intellectual and multiple disabilities (PIMD). An inventory of the existing instruments that can determine temperament was made based on a literature review. A total of 138 articles were found in which temperament was measured. None of these studies included children with PIMD. The Infant Behavior Questionnaire-Revised (IBQ-R) very short form and the Child Behavior Questionnaire (CBQ) very short form seem to be the most appropriate forms to measure temperament. Because motor disabilities are one of the main characteristics of these children, assessment instruments must be accommodated to minimize impairment bias, without altering what the test measures. Therefore, a pilot study with 12 children with PIMD (aged between 1.8 and 4.9 years) was conducted to analyze the bias introduced by motor disabilities in these instruments. Results showed that seven (19.4%) of the CBQ items and nine (24.3%) of the IBQ-R items contained motor behavior, which biased the validity of the instrument. A proposal is made regarding the adaptation of the nine IBQ-R items.


ABOUT THE AUTHORS
At the Department of Special Needs Education and Youth Care (University of Groningen, the Netherlands), we carry out research into the support of children and adults with profound intellectual and multiple disabilities (PIMD) (supervised by Annette van der Putten). The main focus is on the development of interventions to increase the quality of support for these people. However, we also focus on more fundamental knowledge, such as the assessment of pain and the prevalence of challenging behavior in these people. Finally, projects aim at collaboration with parents, families, and health care professionals in practice.
The current study elaborates on the question of how to optimize assessment in people with PIMD. The study is one of the projects related to the research line Science in Motion. Science in Motion aims to optimize the outcomes of movement-oriented activities and motor behavior and functioning in children and adults with PIMD, who are characterized by severe or profound motor disabilities.

PUBLIC INTEREST STATEMENT
Children with profound intellectual and multiple disabilities (PIMD) have severe motor disabilities. Their mobility skills are limited and they can hardly use their hands functionally. Assessment plays an important role, as it yields information about strengths and weaknesses to focus on in the support. In assessment procedures, one has to take these motor disabilities into account to get a reliable and valid picture of the skills measured. Therefore, standard instruments must be accommodated. Many tests, however, consist of items that are related to motor functioning. This also holds true for temperament instruments. The current study reviewed temperament instruments for use with children with PIMD. A pilot study analyzed the bias of motor disabilities in two commonly used instruments. Results showed that in both instruments nearly 25% of the items contained motor behavior. A proposal is made regarding the adaptation of one of the instruments.

Introduction
During the last decades, research on temperament has evolved substantially and the importance of incorporating temperament as a variable in child development is widely acknowledged. Although many classificatory schemes for temperament have been developed, there is no general consensus. Shiner et al. (2012) define it as follows: "temperament traits are early emerging basic dispositions in the domains of activity, affectivity, attention and self-regulation, and these dispositions are the product of complex interactions among genetic, biological and environmental factors across time" (Shiner et al., 2012, p. 437).
Assessing temperament in a child gives information about why the child reacts differently in different situations rather than the way it reacts (Hepburn, 2003) and can be seen as one of the variables playing a role in determining adaptive and maladaptive outcomes (Rothbart, 2011). Insight into the temperament of the child facilitates the adaptation of support or child-rearing practices to the specific needs and wishes of the child. In temperament research, this is described as the "goodness of fit theory": attuning child-rearing practices to the temperament of the child can, for example, reduce challenging behavior in difficult situations. Support and attunement to the needs of a child are specifically relevant if a child has developmental disabilities. A specific group of children with developmental disabilities are children with profound intellectual and multiple disabilities (PIMD). These children have an estimated intelligence quotient of less than 25 points or an estimated developmental level of up to 24 months at a higher calendar age (Nakken & Vlaskamp, 2007). Furthermore, due to their brain damage, they have severe or profound motor disabilities. This means that they have limited functional use of their arms, hands and legs and are restricted in their mobility skills (Nakken & Vlaskamp, 2007). Moreover, they have sensory problems and general health problems (van Timmeren, van der Putten, van Schrojenstein Lantman-de Valk, van der Schans, & Waninge, 2016).
Assessment plays an important role in the support of these children, as it yields information about relative strengths and weaknesses to focus on in the support, as well as needs that need to be taken into account. In the assessment of children with PIMD, the instruments and tests used should be adapted to the child's (dis)abilities in order to get a valid and reliable test result. Therefore, standard instruments must be accommodated to minimize impairment bias, without altering what the test measures (Visser, Ruiter, van der Meulen, Ruijssenaars, & Timmerman, 2014). In the assessment of temperament in children with PIMD, this means that the influence of the motor disabilities on test results needs to be eliminated to ensure construct validity.
To our knowledge, hardly any studies have been carried out into the assessment of temperament in children and adults with PIMD. The current study focuses on the assessment of temperament in children with PIMD and analyzes the role of motor disabilities in this assessment. We focus on young children (under the age of five years), because suitable support appears especially effective at this young age (Shonkoff & Phillips, 2000). Because early diagnosis in these children is often difficult, we focus on young children with significant cognitive and motor delays. We refer to these children as "young children with PIMD".
This study is a first step in the development of an instrument with sound psychometric properties to assess temperament in young children with PIMD. The main goal of the current study is to increase the construct validity of temperament instruments by reducing the influence of motor behavior on the assessment result. The central research question of this study is: "How can we measure temperament in young children (0.6-5.0 years) with PIMD?" We focus on the following questions: (1) Which instruments are available to measure temperament in young children? (2) Which items contain motor behavior? and (3) How can we accommodate these items to eliminate the influence of the motor disabilities when assessing temperament?
The current study consists of two parts: a literature review and a pilot study.

Method
The aim of the review was to get an overview of available instruments to measure temperament in children aged between 0.6 and 5.0 years (research question 1).

Databases and search terms
In the literature review, we searched for national (Dutch language) and international (English) manuscripts in the database EBSCOhost complete. We used five sets of search terms, Dutch and English. Table 1 presents the English terms used.
Using the Boolean search term AND, we combined set 1 with set 2, set 1 with set 3, set 2 with set 3, set 2 with set 4 and set 2 with set 5. Terms were separately combined if a set consisted of multiple terms. The following criteria were used in order to select the manuscripts: (1) in the manuscript, an instrument that measures temperament was described, (2) participants were children aged between 0.6 and 5.0 years, (3) the manuscript is peer-reviewed, (4) full text available, (5) published between 2000 and 2015. After duplicates were deleted, manuscripts were selected by title, abstract and full content, respectively.
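The combination scheme described above can be sketched in a few lines of Python. This is only an illustration: the terms below are hypothetical placeholders (the actual terms are those listed in Table 1), and the query syntax is an assumption rather than the exact strings entered in the database.

```python
from itertools import product

# Hypothetical placeholder terms; the real terms are listed in Table 1.
search_sets = {
    1: ["term_1a", "term_1b"],
    2: ["term_2a"],
    3: ["term_3a", "term_3b"],
    4: ["term_4a"],
    5: ["term_5a"],
}

# Set pairings as reported in the text.
set_pairs = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5)]


def build_queries(sets, pairs):
    """Return one 'A AND B' query for every term pair of every set pairing."""
    queries = []
    for left, right in pairs:
        # Terms are combined separately when a set holds several terms.
        for term_a, term_b in product(sets[left], sets[right]):
            queries.append(f'"{term_a}" AND "{term_b}"')
    return queries


queries = build_queries(search_sets, set_pairs)
for query in queries:
    print(query)
```

With these placeholder sets, each pairing contributes as many queries as the product of the two set sizes.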
On the basis of the retrieved articles, we developed an overview of available instruments to measure temperament in children aged between 0.6 and 5.0 years.

Temperament instruments found in the literature
A total of 2,497 hits were found, of which 306 were duplicates. Based on title, 1,946 were excluded. Based on abstract and full text, another 56 and 32 were excluded, respectively, and for 19 the full text was not available. This resulted in 138 manuscripts. None of the 138 manuscripts involved children with PIMD. The manuscripts focused on children without developmental disabilities (n = 124; 89.9%) and on children with various developmental disabilities or risk factors: autism (n = 7), language disorders (n = 3), cerebral palsy (n = 1), Fragile X syndrome (n = 1), Down's syndrome (n = 1) and dysmaturity at birth (n = 1). In none of the manuscripts were adaptations made to the instruments to measure temperament, nor did the authors refer to the psychometric properties for the population involved in the studies.
In total, 25 different instruments to measure temperament were described and/or used in these manuscripts. Ten of these instruments are described in only one study; six are used in eight or more manuscripts. The most frequently used instrument is the Infant Behavior Questionnaire (IBQ); this instrument is described in 23 of the 138 (17.0%) manuscripts. Ten (13.8%) of the 138 manuscripts focused on the psychometric quality of an instrument. In the remaining 128 manuscripts (86.2%), temperament was measured as a variable. All 25 described instruments were developed in the English language.
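The selection flow reported above can be retraced with a short calculation. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Counts taken from the text; names are illustrative.
hits = 2497
exclusions = [
    ("duplicates", 306),
    ("excluded on title", 1946),
    ("excluded on abstract", 56),
    ("excluded on full text", 32),
    ("full text not available", 19),
]

remaining = hits
for reason, count in exclusions:
    remaining -= count
    print(f"{reason:<24} -{count:>5} -> {remaining} left")

included = remaining

# Population breakdown of the included manuscripts.
without_disabilities = 124
with_disabilities = sum([7, 3, 1, 1, 1, 1])  # autism, language, CP, Fragile X, Down, dysmature
share_without = round(100 * without_disabilities / included, 1)
print(f"included: {included}; without disabilities: {share_without}%")
```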
Of the 25 instruments found, we analyzed seven in further detail (see Table 2). We selected the six instruments that were described in at least eight manuscripts, as this forms an indication of the psychometric quality: the Infant Behavior Questionnaire (IBQ), Infant Characteristics Questionnaire (ICQ), Carey Temperament Scales (CTS), Child Behavior Questionnaire (CBQ), Lab-TAB and Emotionality, Activity, Sociability Questionnaire/Survey (EAS). In addition, we selected one instrument for which sufficient psychometric properties were described in at least one manuscript: the Behavior Inhibition Questionnaire (BIQ).
To identify the most suitable instrument(s) in this overview for measuring temperament in young children, we used the following criteria: (1) the psychometric properties are available and described in at least one of the manuscripts; (2) the instrument is available in English or in Dutch; (3) the instrument is freely available in full form; (4) the instrument measures temperament via a questionnaire that is filled in by parents. Based on these criteria, we selected the most appropriate instrument to measure temperament in children with PIMD (see Table 3).
Both the IBQ and the CBQ fulfilled all criteria. Moreover, both instruments were available in a short form: the IBQ-R very short form and the CBQ very short form.

Method
Based on the literature review, the IBQ-R very short form and the CBQ very short form were analyzed in a pilot study (in November 2015). In this pilot study, two of the authors independently judged for each item of these two instruments whether it contained motor behavior and hence whether assessing temperament could be biased by the motor disabilities of the child. The inter-rater reliability was calculated with Cohen's Kappa and interpreted as κ < 0 (no agreement); 0-0.20 (slight); 0.21-0.40 (fair); 0.41-0.60 (moderate); 0.61-0.80 (substantial) and 0.81-1 (almost perfect agreement) (Landis & Koch, 1977). As a next step, the same two authors independently scored the items with an insufficient reliability with use of video observations of 12 children with PIMD. There were seven girls and five boys and the mean age was 3.2 years (SD 1 year; range 1.8-4.9 years). The age of one child was unknown. The children all had severe developmental problems and a high risk of developing PIMD. Four children had a genetic syndrome (1p36 deletion, Rett, Allan-Herndon-Dudley, Phelan-McDermid), one child had had an accident, two children had oxygen shortage during birth, and for the other children, the cause of the problems was unknown. Most of the children had a motor impairment in the form of hypotonia; three children had spasticity. Three children also had a visual impairment, one had an auditory impairment, four did not have known sensory impairments, and for four children, this information was not available. All of the children also had medical problems, such as epilepsy, gastrointestinal problems and breathing problems.
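For readers less familiar with the statistic, the following minimal sketch shows how Cohen's Kappa can be computed for two raters and mapped onto the Landis and Koch (1977) labels. The ratings in the example are made up for illustration (1 = item contains motor behavior, 0 = it does not) and are not the ratings from the pilot study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items with identical judgments.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value (Landis & Koch, 1977)."""
    if kappa < 0:
        return "no agreement"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Made-up example: ten items, the raters disagree on one of them.
rater_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2), landis_koch(kappa))
```

Because kappa corrects for chance agreement, a single disagreement in a short, unbalanced item set lowers the value considerably more than the raw agreement of 90% would suggest.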

On the basis of short videos of the 12 children, both raters judged for each item whether children with PIMD are possibly able to perform the particular skill motorically. Cohen's Kappa was calculated in order to assess the inter-rater reliability. For items about which both researchers agreed that children with PIMD cannot master the motor skills needed, they developed an alternative formulation. By this accommodation of items, we aimed to reduce the motor component in the items of the instrument, without actually changing the content of the item (Alant & Casey, 2005). Items that were formulated in a positive way in the original instrument (e.g. grasps something) were also formulated in a positive way in the alternative formulation and vice versa (does not grasp).

Motor items in the IBQ-R and CBQ
The IBQ-R very short form consists of 37 items (Putnam, Helbig, Gartstein, Rothbart, & Leerkes, 2014). There was almost perfect agreement between the two raters in their judgment of whether an item contains a motor component (Cohen's Kappa 0.86). For two items (11 and 20), the researchers disagreed. After observing videos of the 12 young children with PIMD, the researchers reached consensus: neither item contained any motor behavior. In the end, nine of the 37 items of the IBQ-R (24.3%) were judged as containing motor behavior (see Table 4).
The CBQ very short form consists of 36 items (Putnam & Rothbart, 2006). Again, there was almost perfect agreement between the two raters in their judgment of whether an item contains a motor component (Cohen's Kappa 0.84). For two items (18 and 30), the researchers disagreed. After observing the videos, they decided that neither item contained motor behavior. This resulted in seven items of the CBQ very short form (19.4%) containing motor behavior.
Because children with PIMD have a developmental age below 24 months (Nakken & Vlaskamp, 2007), the IBQ-R (age range 0.6-1.0 year) would be more applicable for these children than the CBQ, which has an age range of 3.0-8.0 years. Therefore, we decided to adapt the items containing motor behavior of the IBQ-R very short form instead of the CBQ.
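The proportions of motor items reported above follow directly from the item counts, as this small check illustrates (the dictionary layout is just one convenient way to hold the numbers).

```python
# (items judged to contain motor behavior, total items), as reported in the text.
motor_items = {
    "IBQ-R very short form": (9, 37),
    "CBQ very short form": (7, 36),
}

percentages = {
    name: round(100 * motor / total, 1)
    for name, (motor, total) in motor_items.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of items contain motor behavior")
```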

Adapting the motor items of the IBQ-R
Using video observations of 12 young children with PIMD, both researchers scored the nine items of the IBQ-R very short form (see Table 4). For each child, it was scored whether or not the child could master the motor behavior of the particular item. Cohen's Kappa as a measure of agreement between both raters was 0.84.
The results varied between children and between items. Item 15 was mastered by 11 of the 12 children, whereas item 7 was mastered by only one child. Items 1 (squirm and/or try to roll away), 13 (squirm/turn body when placed on his/her back) and 37 (squirm and turn body when placed in a seat or car seat) refer to the same motor skills: if a child is able to master item 1 motorically, he or she will also be able to master items 13 and 37. Consequently, each child will have the same score on these three items. The same holds for item 4 (cling to a parent when introduced to an unfamiliar adult) and item 33 (cling to a parent when in the presence of several unfamiliar adults).
To minimize impairment bias, we accommodated the items of the IBQ-R very short form. As a basis for these accommodations, we used our observations of the video recordings. The accommodations consisted of replacing all motor behavior described in the items by alternative, nonmotor behavior that expresses the same temperament characteristics as the motor behavior. The construct the item aims to measure thus remained unchanged for all items.
Our observations of the video recordings made clear that nonmotor behavior includes facial expressions and nonverbal behavior such as sounds, moaning, groaning, cooing, smiling and grimacing. The videos showed that these children are hardly able to effectively aim for an object if, for example, a change in body position is needed. The children do attempt to rotate, but usually do not manage to do so as a consequence of their impaired arm and leg movement or reduced strength. Instead, they follow objects by turning their head and/or by following them with their eyes. Another factor that one should be aware of is the delayed, slow and possibly different reaction to auditory stimuli (e.g. sounds of a rattle near the left ear when the child is looking toward the right side). Mostly, it takes some time before the child responds by, for example, turning his or her head towards the sound. Neglecting this delayed reaction can bias the temperament scores of the child.
In Table 4, we have formulated an accommodated version of the nine items that contain motor behavior. The items are formulated in a more general way instead of describing specific motor behavior to take the unique behavior of these children and their heterogeneity into account (Hogg, Reeves, Roberts, & Mudford, 2001). We thereby replaced the motor components by nonmotor behavior as described earlier. Also, we removed the time component if a quick response of the child was required, as in item 7. In accordance with the original version of the IBQ-R very short form, each item can be scored on a 7-point Likert scale ranging from "never" (score 1) to "always" (score 7) (see Table 4 for details).

Table 4. Original and accommodated items of the IBQ-R very short form

No. | IBQ-R very short form | Adaptation
1 | When being dressed or undressed during the last week, how often did the baby squirm and/or try to roll away? | How often did your child struggle when you (un)dressed him or her during the last week?
4 | When introduced to an unfamiliar adult, how often did the baby cling to a parent? | How often did your child look for protection when he or she was introduced to an unknown person?
6 | How often during the last week did the baby play with one toy or object for 5-10 min? | How often did your child show interest in a toy or other object for 5-10 min during the last week?
7 | How often during the week did your baby move quickly toward new objects? | How often did your child show interest in a toy or other object during the last week?
13 | When placed on his/her back, how often did the baby squirm and/or turn body? | How often did your child struggle when he or she was lying on his or her back during the last week?
15 | How often does the infant look up from playing when the telephone rings? | How often did your child respond when a phone was ringing?
28 | When introduced to an unfamiliar adult, how often did the baby refuse to go to the unfamiliar person? | How often did your child refuse in any way when he or she was introduced to an unknown person?
33 | When in the presence of several unfamiliar adults, how often did the baby cling to a parent? | How often did your child look for protection with you or the other parent when unknown people were around?
37 | When placed in an infant seat or car seat, how often did the baby squirm and turn body? | How often did your child struggle when he or she was placed in an (adapted) child seat or car seat?

Notes: In accordance with the original version of the IBQ-R very short form, each item can be scored on a 7-point Likert scale: 1 = Never; 2 = Very rarely; 3 = Less than half the time; 4 = About half the time; 5 = More than half the time; 6 = Almost always; 7 = Always and NA = Does not apply.

Discussion and conclusion
The current study focused on the assessment of temperament in young children with PIMD. The aim was to minimize the test bias that is caused by the motor disabilities of these children. A literature review revealed 25 instruments to measure temperament in young children (0.6-5.0 years). None of the instruments had been specifically accommodated for children with PIMD. A number of manuscripts found in the literature review focused on children with other types of disabilities, including cerebral palsy and Down's syndrome. However, these manuscripts did not contain information about the suitability of the temperament instrument for the target group concerned, nor was the instrument accommodated in any way for the target group.
The IBQ-R very short form and the CBQ very short form seemed to be the most appropriate instruments to measure temperament in young children with PIMD. The results of the pilot study showed that 19.4% of the CBQ items and 24.3% of the IBQ-R items contained motor behavior, which threatened the validity of the instrument for children with PIMD. We have accommodated the nine IBQ-R items containing a motor component, without altering what the items measure.
Given its focus on motor behavior, our study adds to the small number of studies into accommodating assessment instruments for motor impairment. Another recent example in this field concerned the Dutch version of the Bayley-III (Bayley, 2006; Van Baar, Steenis, & Verhoeven, 2014). The accommodated version, called the Low Motor/Vision version, appeared to have improved validity for children with motor and/or visual impairments (Visser, Ruiter, Van der Meulen, Ruijssenaars, & Timmerman, 2013; Visser et al., 2014). That study also showed large individual differences in the need for accommodations, which is reflected in the results of the current study as well.

Limitations of the current study
The current study has some methodological strengths and weaknesses. The literature review gives an extensive and thorough overview of the instruments (n = 25, described in 138 manuscripts) available to assess temperament in children with and without disabilities. The number of participants included in the pilot study was, however, rather low (n = 12). As a reference: the total population of people with PIMD in the Netherlands only consists of around 10,000 adults and 4,000-6,000 children (Vlaskamp, Poppes, & van der Putten, 2015). The small sample limits the generalizability of the results found. Therefore, one should keep in mind that this study is a pilot study. However, the high inter-rater reliability of the judgment of whether a particular item of the CBQ or IBQ-R contains motor behavior is promising.

Future research
Further studies into the psychometric properties of the accommodated version of the IBQ-R are needed. For example, the feasibility can be analyzed by interviewing parents and/or health care professionals working with children with PIMD and adding or adapting information about the temperament of these children. Furthermore, the agreement between mothers and fathers could be studied as part of the research into the inter-rater reliability. The current study only focused on the possible bias related to the motor disabilities that are prevalent in children with PIMD. However, visual and auditory problems and general health problems can also affect the way these children are able to show behavior that can be linked to their temperament. Further studies should therefore include larger samples, which allow analyzing subgroups of children within the heterogeneous group of children with PIMD (e.g. related to age and/or functional abilities) (Nakken & Vlaskamp, 2007). Then, we can analyze, for example, whether the adapted version of the temperament measure is sample-independent.

Implications for daily practice
By assessing temperament in a reliable and valid way, we can analyze the relation between different temperament types and motor, communicative, social and emotional development. Until now, information about the relation between these variables has been lacking. Furthermore, developing an instrument to assess temperament with sound psychometric properties for children