of Child & Adolescent Behavior “Show Me How Your Wrist Moves, and I Will Tell You How Active You Are!” Towards an Objective Measure of Preschoolers’ Motor Activity

Objective: Although a certain level of motor activity is considered to be typical in preschoolers, in the most severe cases it interferes with the child’s social and academic development. A valid procedure for assessing children’s motor activity is therefore an important issue. The current study aims to validate the Triaxial Accelerometry for Preschoolers (3AAP), a method that uses the measurement of children’s wrist acceleration to estimate their motor activity. Method: Data were collected from a community sample of 226 preschoolers and from a sample of 32 preschoolers clinically referred for externalizing behavior concerns. The participants’ motor activity was assessed using a triaxial accelerometer (a sensor worn on the wrist) in three different conditions of assessment, i.e. at school, in a lab session and during a computerized task administration. Results: The 3AAP variables, i.e. the peak, the mean level, the intra-individual variability, and the median of motor activity as well as the percentage of time spent in the lower range and conversely in the higher range of motor activity, were highly intercorrelated and normally distributed. They were significantly correlated with externalizing behavior-related scales from the CBCL, the SDQ and the UCG, whereas correlations with internalizing behavior-related scales from the same instruments were low. Test-retest correlations after a 10-week interval were moderate to high. Significant differences were found between the three conditions of assessment as well as between referred and normally-developing preschoolers. Conclusion: The 3AAP scores are good candidates for an objective, low-cost and reliable measurement of preschoolers’ motor activity that could be helpful both for research and clinical purposes.


Introduction
Although a certain level of motor activity is considered to be typical in preschoolers [1], in the most severe cases it interferes with the child's social and academic development [2]. The assessment of children's motor activity is therefore an important issue. In the usual distinction between internalizing and externalizing behavior problems, motor overactivity is viewed as a core component of externalizing behavior problems (EB) [3]. The current study aims to validate the Triaxial Accelerometry for Preschoolers (3AAP) among normally-developing preschoolers and among preschoolers who had been referred for externalizing problems. The 3AAP is a method that uses the measurement of children's wrist acceleration to estimate their motor activity.
The most widely employed assessment of motor activity in preschoolers consists of parents' or teachers' questionnaires containing EB subscales or questionnaires related to externalized syndromes such as Attention Deficit Hyperactivity Disorder (ADHD). For instance, the Child Behavior Checklist [4], the Strengths and Difficulties Questionnaire [5] and the Conners scale [6] are frequently used in research. An alternative to questionnaires is provided by observational paradigms. Rather than measuring life-time motor activity as in the case of questionnaires, they assess current motor activity [7]. Observational paradigms are intentionally structured to increase the likelihood that a range of clinically relevant motor activity will emerge. An example is the Snap Game, a rigged competitive card game between two children designed to expose them to the threat of losing [8]. The Snap Game has been designed to elicit spontaneous agitation as well as negative affect and aggression in a realistic context. Such paradigms are very helpful in the context of research where a multi-informant, multi-method procedure is needed [9]. Limited agreement between informants and methods has however been consistently reported in previous literature [10]. For example, the correlations between the Snap Game and the Child Behavior Checklist [11] ranged from .09 to .16 when completed by mothers and from .16 to .21 when completed by teachers [11]. Variations in motor activity assessment according to informants and methods can be explained not only by the variations of children's motor activity across contexts or by the focus of the instruments on motor activity in daily life or on current motor activity, but also by informants' subjectivity or bias [12,13]. The 3AAP is presented here not only as a supplement to existing methods but also as a good candidate to provide an objective measure of motor activity that affords a benchmark for conventional assessment.
The accelerometry method requires the use of recorders, which are mainly designed as wrist- or ankle-worn sensors. They can easily be used for a few hours during the day in diverse relevant settings or during the night. Accelerometry, also called actigraphy, has been predominantly used in adults and children for the determination of sleep and wake cycles. It has for example been used for the study of sleep patterns in children [14], subjects suffering from chronic fatigue syndrome [14][15][16][17], disruptive children [18], and ADHD patients [19]. Its predictive power was highlighted in prospective research studying the effect of sleep quality on executive functioning in ADHD children [20], later internalizing and externalizing behavior among preschoolers and children's psychological adjustment [20][21][22][23][24].
Another frequent use of accelerometry is the study of motor activity in neonates [25], infants and children [25], adolescents with chronic pain [26] or obesity [27][28][29][30]. Since overactivity is conceptualized as one of the core features of the DSM-IV ADHD diagnosis [31], it is not surprising that accelerometry, which is specifically designed to capture motor activity, is also employed among diagnosed ADHD subjects. Two main research types are found in this context. Some studies focus on diagnostic issues, in particular to what extent accelerometry could be relevant to identify ADHD symptoms. Other studies focus on clinical issues, in particular treatment efficacy for reducing ADHD symptoms. With regard to the diagnostic issues, comparative studies report significant differences between ADHD participants and controls in motor activity measured with accelerometers. A high level of accuracy in categorizing ADHD participants and controls is also reported [32][33][34][35][36]. These studies tend to confirm that an objective screening of ADHD patients can be obtained with a simple sensor worn for just a few hours. They document the usefulness of additional information about intra-individual temporal stability of motor activity for the identification process of ADHD patients [36][37][38]. These studies also report on the sensitivity of accelerometry to the features of the setting where the measurement is conducted. Intra-individual variations have for example been found between three school courses, i.e. mathematics, native language and arts [38], as well as between school and playground [39]. Finally, they argue for the inclusion of an objective assessment within a multi-informant, multi-method approach as a good method for generating a quantitative, reliable trait of hyperactivity [40].
With regard to the clinical issues, several studies document drug efficacy for the reduction of objective motor activity in ADHD patients in a clinical setting during test sessions [26], in a school setting [41], or at the playground [39] as well as during the night [42][43][44][45]. Other studies demonstrate the reduction of objective motor activity among ADHD children after intervention programs such as yoga sessions [21]. Finally, accelerometry has been shown to be clinically useful. It has been employed to give activity-level feedback to 8-to-9-year-old boys with ADHD in a classroom setting, leading to a reduction of 20 to 47% of baseline levels for the majority of the participants [41].
Far less research has been conducted among undiagnosed preschoolers, for whom we still need to accumulate evidence for the validity of an objective measure of motor activity. This could be helpful for the early identification of EB problems through its inclusion in a multi-method, multi-informant approach and by providing a benchmark for conventional questionnaire-based assessment, which is affected by respondents' subjectivity [23]. It may also be helpful for objectively estimating treatment or intervention efficacy [26].

Method

Sample
This study was part of the H2M (Hard-t(w)o-Manage) research program conducted at the University of Louvain, which received the approval of the Ethics Committee of the Psychological Sciences Research Institute. Data were collected from a community sample of 226 children as well as from a sample of 32 children referred by their parents for EB concerns. Informed consent was obtained from the parents of children participating.
The community sample (N=226) was recruited when the children were in the first to third kindergarten years in several elementary schools. The community sample comprised three subsamples corresponding to three different conditions of assessment. For the first condition, at school (N=103), parents were informed about the research program through leaflets distributed in surrounding elementary schools. For the second condition, a lab session at the university (N=47), parents were informed about the research program through posters, a website and a Facebook page created for this study. For the third condition, a standardized computerized task administration (N=76), parents were informed about the research program through leaflets distributed in surrounding elementary schools. The referred sample (N=32; 30% girls) was recruited from pediatric units. Exclusion criteria were used in order to select children whose EB was the core mental health problem. We therefore excluded children with overall developmental delay or intellectual disability. This applied to children born prematurely (before 37 weeks), or with autism, dysphasia or substantial language delay according to an examination by a speech therapist, or with an IQ below 80 tested using four subtests, i.e. Information, Matrix reasoning, Block design, and Picture concepts of the WPPSI-III (Wechsler, 2004). Mean IQ was 10.44 (sd=2.36). Note that all children were referred without medication. Sociodemographic information about the samples and statistical comparisons between samples are given in Table 1. Statistical comparisons show that in the community sample, the control participants in the three conditions were comparable with regard to children's age and gender, as well as to the mothers' and fathers' educational level when data were available.
Statistical comparisons showed that control and referred children, both assessed in the lab session condition, were comparable with regard to the parents' educational level. However, the referred children were slightly younger than the controls and the proportion of girls was lower in the referred group than among the controls.

Procedure
Five research assistants who had been intensively trained in the sampling procedure undertook the data collection. The degree of motor activity of 103 children was estimated during a one-hour session at school (mean time 55.00 minutes, sd=12.78). In this first condition, children wore the wrist-worn sensor from the beginning of the activities in the classroom at 9:00 in the morning until the first break of the day, when they went into the playground at 10:00. For 47 others, motor activity was estimated while they were interacting with their mother in a 30-minute laboratory session (mean time 26.25 minutes, sd=7.23). In this second condition, children were given the wrist-worn sensor to wear as soon as they went into the lab, and kept it on while they played with their mothers following a standardized procedure [10]. For 76 others, motor activity was estimated during a computerized 10-minute task [36] administered during a school visit in an isolated, quiet room, in the presence of a research assistant. In this third condition, children were given the wrist-worn sensor to wear as soon as they began the task, and kept it on until they completed the task (mean time 7.45 minutes, sd=1.28). After a 10-week interval, the same computerized task was readministered to the same subsample (mean time 7.54 minutes, sd=1.15) for test-retest purposes; there was one drop-out (N=75). The 32 referred children's motor activity was evaluated while they were interacting with their mother in a laboratory session which was similar in all respects to the second condition of assessment described above for the community sample (mean time 18.25 minutes, sd=4.25).

Instruments

Accelerometry
In order to have maximum control over the acquisition of the acceleration data, we designed our own measurement platform. A miniaturized module, fixed like a watch by a bracelet to the wrist of the child, worked as a data logger. It sampled the acceleration on three axes and stored the data in a non-volatile memory. A USB interface made it possible to program the module and to transfer the recorded data to a host computer. The sensor used was the LIS3DH from STMicroelectronics, an ultra-low-power, high-performance, triaxial linear accelerometer, capable of sampling acceleration data at rates from 1 Hz to 5 kHz with a selectable full scale of ±2, ±4, ±8 or ±16 g at 12-bit resolution. It was configured to send its data via an I2C or SPI bus. A Microchip PIC16F648 microcontroller transferred the data from the LIS3DH to a bank of four 1-Mbit EEPROMs using the I2C bus. When connected to the host computer via a USB interface, the microcontroller stored several parameters specific to the LIS3DH, such as sample rate and full range sensitivity, as well as parameters specific to the experiment such as the identity of the subject and the recording start time. Finally, the sensor and the microcontroller synchronized their clocks with a precision of ±1 millisecond, so as to be able to merge the acceleration data with triggers recorded by the computer during the experiment. At the end of the recording, the module was reconnected to the USB interface and transferred the data to the computer.
According to the recommendations of STMicroelectronics, every module was tested and calibrated to make sure that it produced the same values for each of the three axes and measured -1g and +1g when the axis was in the vertical orientation and 0g in the horizontal orientation. In this study, the sample rate was fixed at 20 Hz and full range sensitivity at ±2g (±19.62 m/s²). A software program written in C++ in a Windows environment acted as an API between the module and the experimenter. It managed communication with the module, stored the collected data in files, carried out signal processing and exported data to be used with any other software.
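The calibration check described above (each axis reading -1g and +1g when vertical and 0g when horizontal) can be illustrated with a generic two-point calibration. The sketch below is only an assumption about how such a correction might be computed; the function names and the example raw readings are hypothetical and are not taken from the authors' C++ software or from the sensor documentation.

```python
# Hypothetical two-point calibration for one accelerometer axis.
# Not the authors' procedure: a generic sketch under stated assumptions.

G_M_S2 = 9.81  # m/s^2 per g, consistent with +/-2 g = +/-19.62 m/s^2 in the text


def two_point_calibration(reading_plus_1g, reading_minus_1g):
    """Return (offset, gain) from raw readings taken with the axis
    pointing up (+1 g) and down (-1 g)."""
    offset = (reading_plus_1g + reading_minus_1g) / 2.0  # raw value at 0 g
    gain = (reading_plus_1g - reading_minus_1g) / 2.0    # raw units per g
    if gain == 0:
        raise ValueError("identical readings: the axis cannot be calibrated")
    return offset, gain


def to_g(raw, offset, gain):
    """Convert a raw reading to acceleration in g."""
    return (raw - offset) / gain


def to_m_s2(raw, offset, gain):
    """Convert a raw reading to acceleration in m/s^2."""
    return to_g(raw, offset, gain) * G_M_S2
```

With hypothetical raw readings of 1030 at +1g and -1010 at -1g, the offset is 10 raw units and the gain is 1020 raw units per g, so a raw reading of 10 maps to 0g.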

Signal processing
1) On each of the three components X, Y and Z, a median filter with a window of 1.5 sec (30 samples at 20 Hz) was used to isolate the gravitational acceleration and subtract it from the original signal.
2) The modulus or magnitude (M) of the resultant X, Y, Z acceleration vector was then calculated by the formula: $M = \sqrt{X^2 + Y^2 + Z^2}$.
3) Interesting periods in the signal, delimited by boundaries based on the triggers or on the timing noted by the experimenter, were then cropped and processed.
4) This phase of the processing consisted of calculating different 3AAP scores, which we hoped could be used to characterize the level of motor activity of the child during the selected period and which were used for the statistical analyses described in this article: the mean, standard deviation (intra-individual variability around the subject's mean), and median of all samples, the minimum and the maximum sample values (the minimum was always 0, and we consider the maximum value as the peak score), and the percentage of time spent at a low, medium, and high level of activity. Cut-points that fixed the limits for these three levels were determined beforehand by merging and sorting all samples in the same vector, V[0..k]. The acceleration value at index k/3 was used as the low cut-point (LowCut), and the acceleration value at the nearest integer value of index 2*k/3 was used as the high cut-point (HighCut). For each subject, we then scanned all samples of the M[0..n] vector by incrementing three variables, one for each activity level (low, medium, high), according to where each sample fell relative to LowCut and HighCut. These cut-points were estimated separately for each of the three conditions, except for the study of age- and gender-related effects, where the condition was used as a predictor.
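The processing steps listed in this section can be sketched in a few lines of Python. This is an illustrative reading of the description, not the authors' C++ implementation. Several details are assumptions: the median filter is centered and truncated at the edges (so an even window of 30 spans up to 31 samples), the index k/3 is taken as an integer division, samples exactly equal to a cut-point are counted as medium, and the sample (n-1) standard deviation is used. All function names are hypothetical.

```python
import math
from statistics import mean, median, stdev


def running_median(signal, window=30):
    """Centered running median used as the gravity estimate (step 1).
    Assumption: the window is truncated at both ends of the signal."""
    half = window // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def magnitude(x, y, z, window=30):
    """Steps 1 and 2: subtract the per-axis gravity estimate, then
    compute M = sqrt(X^2 + Y^2 + Z^2) for every sample."""
    gx, gy, gz = (running_median(axis, window) for axis in (x, y, z))
    return [math.sqrt((xi - a) ** 2 + (yi - b) ** 2 + (zi - c) ** 2)
            for xi, yi, zi, a, b, c in zip(x, y, z, gx, gy, gz)]


def cut_points(all_samples):
    """LowCut and HighCut from the merged, sorted vector V[0..k].
    Assumption: k/3 is an integer division; 2*k/3 is rounded to nearest."""
    v = sorted(all_samples)
    k = len(v) - 1
    return v[k // 3], v[round(2 * k / 3)]


def scores(m, low_cut, high_cut):
    """Step 4: summary scores for one child's magnitude vector M[0..n].
    Assumption: values equal to a cut-point count as medium."""
    n = len(m)
    low = sum(1 for s in m if s < low_cut)
    high = sum(1 for s in m if s > high_cut)
    medium = n - low - high
    return {
        "peak": max(m),
        "mean": mean(m),
        "variability": stdev(m),  # assumption: sample (n-1) standard deviation
        "median": median(m),
        "pct_low": 100.0 * low / n,
        "pct_medium": 100.0 * medium / n,
        "pct_high": 100.0 * high / n,
    }
```

Because the cut-points are tertiles of the pooled samples of a condition, the three percentages average roughly one third each across that condition; an individual child's percentages show how far his or her distribution departs from the pooled one.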

Questionnaire-based assessment
The French preschool version of the Child Behavior Checklist (CBCL) [4] was administered in the community sample to the parents of 113 of the 123 children involved in the second (N=40) and third (N=73) assessment conditions as well as to the parents of the 32 referred children. The CBCL provides three-point Likert scales: not at all present, moderately present, or often present. For the current study, the data collection was limited to two first-order scales of the CBCL, i.e. the "attention problems" and "aggressive behavior" scales, enabling us to calculate an externalizing behavior total score to build the second-order "externalizing behavior" scale. We also considered the first-order "anxiety" scale, relating to internalizing behavior (IB), with a view to establishing a contrast between the moderate correlations that were expected between EB-related scales (attention problems and aggressive behavior) and 3AAP scores on the one hand, and the low or insignificant correlations expected between this IB-related scale and 3AAP scores on the other hand. The CBCL is a widely used instrument with good psychometric properties. In our sample, the internal consistency was good both in the community sample and among the referred subjects, with α values of 0.88 and 0.76 respectively for the "aggressive behavior" scale, 0.70 and 0.70 for the "attention problems" scale, 0.89 and 0.75 for the "externalizing behavior" second-order scale, and 0.71 and 0.70 for the "anxiety" scale.
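The internal consistency values reported here are Cronbach's α coefficients. As a reminder of how such a coefficient is obtained, the following minimal sketch applies the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the input layout (one row of item ratings per respondent) are assumptions made for illustration; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each given as a
    sequence of item ratings (e.g. 0, 1 or 2 on a three-point scale)."""
    items = list(zip(*rows))  # transpose: one tuple of ratings per item
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)
```

When all items rank the respondents identically, the coefficient reaches 1; weakly related items push it towards 0.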
The French version of the Strengths and Difficulties Questionnaire (SDQ) [15] was completed by the preschool teachers for 102 of the 103 children involved in the first school condition. The SDQ provides three-point Likert scales: not true at all, somewhat true, or completely true. For the current study, the data collection was limited to three of the five scales of the SDQ. The "hyperactivity" scale was of particular interest, but we also considered the "emotional symptoms" and "prosocial" scales with a view to contrasting the moderate correlations expected between the EB-related scale, i.e. hyperactivity, and 3AAP scores on the one hand, and the low or insignificant correlations expected between the IB-related scale, i.e. emotional symptoms, and 3AAP scores, and the low or insignificant associations with prosociality on the other hand. The SDQ is a widely used instrument with good psychometric properties [1,14,30]. In our sample, the internal consistency was good, with α=0.73 for the "hyperactivity" scale, α=0.74 for the "emotional symptoms" scale, and α=0.89 for the "prosocial" scale.

Paradigm-based assessment
The Unfair Card Game (UCG) [36] is a computerized and standardized frustrating game designed to elicit spontaneous agitation, inattention, and negative and positive affect in the context of play interaction with a virtual peer. The administration of the UCG is video-recorded and coded following standardized guidelines set out in a manual. The UCG provides ordinal scores ranging from 1 to 5 for positive affect, negative affect, agitation and inattention. Good psychometric properties have been reported in the validation study, with four factors having been extracted that fit the four scales perfectly and explain 45.20% of the variance, and high inter-rater agreement (intra-class correlations) ranging from 0.77 to 0.94. In the current study, the children's observed behavior was coded by trained research assistants for the 76 children involved in the computerized task condition.

Results
The statistical analyses were conducted with SPSS 22.

Preliminary analyses
Correlations between the 3AAP scores computed in the total sample (N=229) are displayed in Table 2. The results show that the peak of motor activity, the mean level, the intra-individual variability, and the median are strongly correlated to each other, with r ranging from 0.72 to 0.96. The percentage of time spent in the lower range of motor activity was also highly negatively correlated with the peak, the mean level, and the intra-individual variability (r from -0.77 to -0.92), while the reverse was true for the percentage of time spent in the higher range of motor activity (r from 0.79 to 0.97).

Sensitivity to assessment condition
The comparisons between the 3AAP variables collected in the school setting, during the lab session and during the computerized task administration were conducted with one-way ANOVAs. Significant inter-individual differences were displayed, indicating that the peak was higher in the school setting than during the laboratory session, where it was in turn higher than during the computerized task, F (2,227)=226.73, p<.001. The same was true for the mean level, F (2,227)=156.37, p<.001, the intra-individual variability, F (2,227)=155.84, p<.001, and the median, F (2,227)=106.82, p<.001. Post-hoc tests indicated that for all of these 3AAP variables, the three conditions were significantly different from each other. With cut-points estimated separately for each condition, no significant effect was found for the percentage of time spent in the higher range of motor activity, F (2,227)=.66, p>.05, and for the percentage of time spent in its lower range, F (2,227) =.29, p>.05. Descriptive statistics of the 3AAP scores according to the three conditions are presented in Table 2.
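For readers less familiar with the test, the F ratio of a one-way ANOVA compares the variance between condition means with the variance within conditions. The sketch below computes it from raw scores; it is a generic textbook computation with hypothetical names, not a reproduction of the SPSS analysis, and it does not compute the p value.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups,
    each group being a list of scores (e.g. peak scores per condition)."""
    all_scores = [s for g in groups for s in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group size times squared deviation
    # of the group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: deviations from each group's own mean.
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

With three conditions, the between-groups degrees of freedom equal 2 and the within-groups degrees of freedom equal the number of children minus 3, which matches the form F (2,227) reported above.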

Normality tests
Tests for normality and homogeneity of variances were conducted on the 3AAP variables separately for the three conditions of assessment. The Kolmogorov-Smirnov test as well as extra data plots were conducted in order to make a decision about the extent of non-normality, and

Note: Peak is the peak of motor activity variable; Mean is the mean level of motor activity variable; Variability is the intra-individual variability in motor activity variable; Median is the most represented value; Low range is the percentage of time spent in the lower range of motor activity variable; High range is the percentage of time spent in the higher range of motor activity variable.

Criterion-related validity
As expected, correlations between the 3AAP scores and EB-related scales were significant for the SDQ "hyperactivity" scale, with coefficients ranging from 0.22 to 0.26, p<.05, with an exception for the peak. Conversely, low and insignificant correlations were displayed for the internalizing behavior-related "emotional symptoms" scale, as well as for the "prosocial" scale, with coefficients ranging from -0.16 to 0.10, p>.05. A negative significant correlation of -0.22 was also found between "emotional symptoms" and the intra-individual variability in motor activity.
Significant moderate correlations were found between the 3AAP scores and the CBCL, with coefficients ranging from 0.35 to 0.41 for the first-order "aggressive behavior" and the second-order "externalizing behavior" scales. Lower coefficients were found for the first-order "attention problems" scale, with coefficients ranging from .20 to 0.25, and for the "anxiety" scale, with coefficients ranging from 0.13 to 0.22.
Finally, with only a few exceptions, significant moderate correlations were found between the 3AAP scores and the UCG EB-related scores, with coefficients ranging from 0.21 to 0.41 for the "agitation" scale, from 0.26 to 0.29 for the "negative affect" scale, and from .30 to .45 for the "inattention" scale. Only the percentage of time spent in the lower and in the higher range of motor activity was weakly associated with negative affect, with r of -0.12 and 0.16. With regard to the "positive affect" scale, coefficients showed weaker associations, with correlations ranging from 0.03 to 0.15. Only the percentage of time spent in the lower and in the higher range of motor activity was associated with positive affect, with r of -.20 and .26. Correlations for conceptual validity are displayed in Table 3.

Test-retest Analysis
Test-retest correlations were computed after a 10-week interval. They were moderate to high, ranging from .53 to .77.
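A test-retest coefficient of this kind can be obtained by correlating each child's score at the first administration with the score at the second one. The sketch below assumes a Pearson product-moment correlation, which the text does not state explicitly; the function name is hypothetical.

```python
import math
from statistics import mean


def pearson_r(first, second):
    """Pearson correlation between paired scores from two sessions."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("two paired sequences of equal length are required")
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    var1 = sum((a - m1) ** 2 for a in first)
    var2 = sum((b - m2) ** 2 for b in second)
    return cov / math.sqrt(var1 * var2)
```

Only children with scores at both sessions enter the computation, which is why the single drop-out reduces the retest subsample to 75.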

Discriminant properties
Initially, differences in externalizing behavior between the control and referred participants were tested with the CBCL scales. As expected given the recruitment procedure, the two samples were significantly different from each other with regard to the "externalizing behavior" second-order scale, F (1;144)= 63.82, p<.001, but not to the "anxiety" scale, F (1;144)= 1.56, p>0.05. In particular, the referred children had higher EB problems (M=27.71, sd=5.70) than the controls (M=14.89, sd=8.54) but they displayed a similar level of anxiety (M=4.62, sd=2.92) to the controls (M=3.94, sd=2.64).
Discriminant properties of the 3AAP scores were appraised with one-way ANOVAs comparing the referred and control children. The results showed that referred children scored significantly higher on the mean and the median. When the cut-points of the lab session condition were considered, which had been estimated in the community sample where the 3AAP scores were seen to be normally distributed, the referred children were also seen to spend more time in the higher range but less time in the lower range of motor activity. Effect sizes were large. Descriptive statistics and the results of ANOVAs are presented in Table 4.

Discussion
The main objective of the current study was to validate the 3AAP as a method using the measurement of children's wrist acceleration to estimate their motor activity. These results led to the main conclusion that the 3AAP scores are good candidates for an objective measurement of preschoolers' motor activity. In particular, they were consistently related to each other, and sensitive to the content of the tasks that were set for the children in the three conditions. Higher motor activity was found to occur when children were interacting with their classmates in natural school settings than in a space-limited laboratory where children and their mothers were interacting. Motor activity was also more limited in the condition where children were in an isolated quiet room in front of a screen, interacting with a virtual peer in the presence of an experimenter. However, even in the space-limited laboratory condition, the 3AAP scores made it possible to distinguish between the referred and control children. Although it cannot be completely ruled out that differences between the referred children and their counterparts were due to age- or gender-related differences between the two samples, these results suggest that the 3AAP is an assessment procedure that enables us to identify young children displaying overactivity and to differentiate them from normally-developing peers. Besides this diagnostic purpose, the existence of a valid objective measure of preschoolers' motor activity which is context-sensitive should also help us to document intra-individual variability across ecological developmental niches as a function of their specific situational demands.
In addition, the fact that the 3AAP scores were normally distributed within the three conditions with only a few exceptions supports the view that EB is continuously spread among preschoolers. From this point of view, preschoolers' EB, in particular overactivity, should not be considered as a diagnostic category in itself, but rather as a relatively intense and frequently observed level of motor activity. In this way, the data collected for the 229 children involved in the current study may constitute preliminary norms for the 3AAP scores. Norms are of particular interest for clinical use, as they help situate a target child's motor activity in comparison with representative peers.
The pattern of correlations found not only for EB-related but also for non-EB-related scales is in favor of the inclusion of the 3AAP assessment procedure in a multi-method, multi-informant approach [12]. Limited agreement between informants and methods has been consistently reported in previous literature [2,16]. The correlations that have been reported in the current study between the 3AAP scores and EB-related scales were slightly higher than those usually found between methods and informants, with only a few exceptions. In particular, unexpected associations were found with UCG scores for positive affect, which may reveal that in a computer game of this kind, a certain overlap exists between children's pleasure expression and motor activity.
Thanks to these interesting patterns of results found in the validation analyses, the 3AAP assessment procedure may serve as a benchmark, making it possible to control for the risk of false positive or negative identifications of young children at risk of developing severe EB symptoms, or to estimate the extent to which caregivers' representations are affected by subjectivity.
Finally, accelerometry can probably be considered an assessment procedure with low cultural sensitivity in comparison with questionnaire- or paradigm-based approaches, and this should stimulate cross-cultural research in the field of preschoolers' EB, and more generally in the field of developmental psychopathology research.
While important from both clinical and research perspectives, this study is by no means definitive. It is important to note a few practical limits of the use of the 3AAP. For instance, some children displaying oppositional behavior refused to wear the bracelet or removed it during the assessment procedure. Also, some children with attention disorders were easily distracted by the bracelet on their wrist. In the future, replication of the current study is needed with older children or even adolescents, not only to test age-related differences but also to point to specificities in each developmental period with regard to the relevance of the 3AAP assessment procedure and to provide additional reference norms. Future research should also answer the crucial question of the optimal length of time which guarantees the validity of accelerometry measurement in each developmental period. Another possible line of research would be to study the cultural invariance in motor activity development in childhood and adolescence. In short, in line with Kam et al., any kind of effort contributing to the elaboration of low-cost and reliable screening procedures is to be welcomed, in view of both the numerous research perspectives such procedures open up and their importance from a public health perspective [23].