Reliability and Concurrent Validity of a Chinese Version of the Alberta Infant Motor Scale Administered to High-Risk Infants in China

The Alberta Infant Motor Scale (AIMS) is widely used to screen for delays in motor development in high-risk infants, but its reliability and validity in Chinese infants have not been investigated. To examine the reliability and concurrent validity of AIMS in high-risk infants aged 0-9 months in China, this single-center study enrolled 50 high-risk infants aged 0-9 months (range, 0.17-9.27 months; mean ± SD, 4.14 ± 2.02 months), who were divided into two groups: 0-3 months (n=23) and 4-9 months (n=27). A physical therapist evaluated the infants with AIMS, with each evaluation video-recorded. To examine interrater reliability, two other evaluators calculated AIMS scores by observing the videos. To measure intrarater reliability, the two evaluators rescored AIMS after >1 month, using the videos. Concurrent validity was assessed by comparing results between AIMS and the Peabody Developmental Motor Scale-2 (PDMS-2). For all age groups analyzed (0-3, 4-9, and 0-9 months), intraclass correlation coefficients (ICCs) for AIMS total score were high for both intrarater comparisons (0.811-0.995) and interrater comparisons (0.982-0.997). AIMS total scores were well correlated with all PDMS-2 subtest scores (ICC=0.751-0.977 for the reflexes, stationary, locomotion, grasping, and visual-motor integration subtests). However, classification by the fifth percentile of AIMS total score agreed only moderately with classification by the gross motor quotient, fine motor quotient, and total motor quotient of PDMS-2 (kappa=0.580, 0.601, and 0.724, respectively). AIMS has acceptable reliability and concurrent validity for screening of motor developmental delay in high-risk infants in China.


Introduction
Certain infants are considered to be at high risk of growth and developmental delay during the prenatal, intrapartum, and postnatal periods. These high-risk infants are particularly vulnerable to cerebral injury and abnormal brain development [1,2], which can result in permanent sequelae such as cerebral palsy and intellectual disability. Neuronal plasticity is enhanced in the developing brain, particularly during the first year of an infant's life [3]. The early detection of motor deficits and developmental delay in an infant allows for appropriate interventions to be instigated at the earliest possible opportunity, maximizing the potential for clinical benefit. Indeed, early intervention with suitable programs has been reported to have a positive effect on motor development in infants with or at high risk of developmental delay [4]. Clinical assessments of high-risk infants provide important guidance and support to the caregivers during the critical first year of an infant's life. However, it is crucial that these assessments are able to identify infants with developmental delay as accurately and as early as possible.
Peabody Developmental Motor Scales-2 (PDMS-2) is an alternative assessment for children aged 0-6 years [5] that is standardized and normed for infants/toddlers/children in the USA. PDMS-2 is a comprehensive motor function assessment scale widely used in high-risk infants. Previous studies have provided evidence for the reliability and concurrent validity of PDMS-2 when used to assess motor development and identify motor deficits in high-risk infants [6][7][8]. PDMS-2 has also been used to assess high-risk infants under Chinese socioeconomic and cultural constraints [9]. Since PDMS-2 has been shown to be a reliable and validated tool [5,10-13], it is widely used in China as a discriminative measure for evaluating motor development in the clinical setting. However, PDMS-2 contains a large number of assessment items and requires a long administration time of 45-60 minutes. This is a major disadvantage in China, where the population of high-risk infants has gradually increased [14] and the demand for assessment is high. Indeed, the limited medical and social resources in China mean that it would be practically challenging to use PDMS-2 to screen all infants at high risk of motor development delay. Thus, alternative screening tools are needed that are not only simple and quick to administer but also reliable and validated in the clinical setting.
The Alberta Infant Motor Scale (AIMS) is an assessment tool that measures the motor maturation of infants from birth to the age of independent walking and incorporates the neuromaturational concept and dynamic systems theory [15]. AIMS was originally developed to screen motor development in Canadian full-term and preterm infants [15]. Although not as comprehensive as PDMS-2, major advantages of AIMS are that it is straightforward to administer, its use involves only observation, and the assessment can be completed within 20 minutes. Because of its practicality and psychometric characteristics, AIMS is widely used in the clinical setting [16]. The reliability and validity of AIMS have been investigated in infants in Canada, Brazil, Japan, and Taiwan [16][17][18][19][20]. However, no previous studies have examined the reliability and concurrent validity of AIMS when used to screen high-risk infants in China.
The purpose of this study was to investigate the intrarater and interrater reliabilities and concurrent validity (compared to PDMS-2) of AIMS when used by physical therapists to screen high-risk infants in China.

Study Design.
This was a prospective study carried out at the Pediatric Rehabilitation Outpatient Department of the Children's Hospital Affiliated to Zhejiang University School of Medicine, Hangzhou, Zhejiang, China, between October 2013 and December 2013. A total of 50 infants aged 0-9 months, who were at high-risk of developmental delay, were enrolled in the study based on the inclusion and exclusion criteria. The inclusion criteria were (i) age (corrected for gestation) ≤9 months and (ii) being at high risk of developmental delay due to the presence of one or more of the following factors: low birth weight (<2500 g), prematurity (gestational age <37 weeks), polyembryony, intrauterine infection, intrauterine hypoxia, hypoxic-ischemic encephalopathy, asphyxia, neonatal hyperbilirubinemia, neonatal intracranial hemorrhage, or use of a ventilator due to lung dysplasia. The exclusion criteria were (i) visual or auditory impairment, (ii) hereditary metabolic diseases or myogenic diseases, and (iii) congenital malformations or severe congenital heart disease.
The study was approved by the Ethics and Human Research Committees of the Children's Hospital of Zhejiang University School of Medicine. Written informed consent was obtained from the parents/guardians of all infants before their inclusion in the study. The total sample consisted of 21 girls and 29 boys, with corrected ages ranging from 5 days to 9 months. For assessments of the reliability of AIMS, the study participants were divided into two groups based on age (0-3 months, n = 23; 4-9 months, n = 27) to ensure a relatively equal representation of different levels of motor performance. For the purposes of this study, infant age was taken as the age corrected for gestation.

Participants
The following demographic and clinical characteristics were recorded: gender, age, gestational age, birth weight, length of hospital stay, and the presence/absence of polyembryony, birth asphyxia, intracranial hemorrhage, hyperbilirubinemia, and use of a ventilator.

Selection of Raters
The three raters (A, B, and C) were rehabilitation therapists with more than 3 years of professional experience, including extensive experience with pediatric patients and with the administration of developmental assessment scales. All raters had received specialized training at the Rehabilitation Department in infant motor development theories and the use, application, and scoring of AIMS. During training, the raters were provided with instructions and demonstrations of the AIMS testing procedures and rating criteria. Following the training session, the raters were required to administer the AIMS to 10 high-risk infants (the data obtained from the infants examined during the training sessions were not included in the final analyses of reliability and validity); the consistency of the AIMS scores exceeded 0.8, which was deemed satisfactory.
The PDMS-2 assessment was carried out by a chief physician in child rehabilitation, who had 8 years of work experience, extensive knowledge of child development theory and child assessment theory, and substantial practical experience in the administration of PDMS-2.

Process of Infants' Evaluation with Both Tools

Overall Procedure.
All infants were initially given AIMS on-site by rater A, while a videographer video-recorded the infant's performance throughout the examination. Following this, on the same day, a rehabilitation physician completed the PDMS-2 assessment [5], which was used to investigate concurrent validity. These evaluations were carried out in a quiet, undisturbed, well-lit room at a temperature of 20-30 °C; the children were clothed in one or two layers and encouraged to play at their best level. Subsequently, raters B and C (who were blinded to the evaluation of rater A) independently evaluated the AIMS score for each infant by observation of the video recordings; these scores and the scores by rater A were used in the assessment of interrater reliability. Raters B and C reevaluated the AIMS score for each infant one month later (again by observation of the video recordings); these scores were used in the assessment of intrarater reliability. A time interval of one month was considered long enough to minimize the memory bias of the rater. Due to the use of a video-recorded evaluation, raters B and C did not have to handle the child, which eliminated one potential source of error.

AIMS.
AIMS is a behavioral motor assessment tool that requires careful observational techniques and minimal infant handling. The scale consists of 58 items that are categorized by four subscales: prone (21 items), supine (9 items), sitting (12 items), and standing (16 items) [15]. For each test item, the examiner must identify and observe three key descriptors: weight bearing, posture, and antigravity movement. The sum of the observed criteria for each subscale comprises the total raw score (0-58 points). The final raw scores can be converted into percentile ranks and compared with the ranks of age-matched peers. Infants below the fifth percentile are identified as having movement dysplasia [17,21]. The Chinese version of AIMS used in the present study had been translated from the English version but had not been subjected to cultural adaptation.
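The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the names `AimsScore`, `SUBSCALE_MAX`, and `flag_below_cutoff` are hypothetical, the percentile rank is assumed to be looked up elsewhere from the published norms (which are not reproduced here), and treating "below the fifth percentile" as a strict inequality is an assumption about boundary handling.

```python
from dataclasses import dataclass

# Item counts per AIMS subscale, as listed in the text (21 + 9 + 12 + 16 = 58).
SUBSCALE_MAX = {"prone": 21, "supine": 9, "sitting": 12, "standing": 16}


@dataclass(frozen=True)
class AimsScore:
    """Hypothetical container for one infant's subscale raw scores."""
    prone: int
    supine: int
    sitting: int
    standing: int

    def __post_init__(self):
        # Reject scores outside the possible range of each subscale.
        for name, maximum in SUBSCALE_MAX.items():
            value = getattr(self, name)
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} score {value} is outside 0-{maximum}")

    @property
    def total(self) -> int:
        """Total raw score (0-58): the sum of the four subscale scores."""
        return self.prone + self.supine + self.sitting + self.standing


def flag_below_cutoff(percentile_rank: float, cutoff: float = 5.0) -> bool:
    """Return True when a percentile rank falls below the screening cutoff.

    The percentile rank must come from the age-matched normative tables;
    this function does not perform that conversion.
    """
    return percentile_rank < cutoff


# Illustrative values only, not data from the study.
example = AimsScore(prone=5, supine=4, sitting=3, standing=2)
print(example.total)
```

Keeping the percentile lookup outside the sketch avoids implying that the raw score alone determines the classification; the classification depends on the infant's corrected age and the normative sample.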

PDMS-2
We used PDMS-2 as a standard against which to examine the concurrent validity of AIMS. PDMS-2 consists of a gross motor scale and a fine motor scale, each of which is divided into skill subtests that assess motor tasks typical for each age. Test items are scored on a scale of 0-2 points, with a score of 1 indicating partial success. Test performance is summarized using motor quotients, which are derived by adding the subtest standard scores and converting the sum to a quotient that has a mean value of 100 and a standard deviation of 15. The motor quotients include the gross motor quotient (GMQ), fine motor quotient (FMQ), and total motor quotient (TMQ, comprising both GMQ and FMQ). The GMQ includes the reflexes (RE, for infants aged 0-12 months) or object manipulation (OB, for infants aged >12 months), stationary (ST), and locomotion (LO) subtests, while the FMQ includes the grasping (GR) and visual-motor integration (VI) subtests. Motor quotient scores <90 were interpreted as indicative of movement dysplasia.

Data Analysis
Statistical analysis was performed using SPSS 16.0 (SPSS Inc., Chicago, IL, USA), with MedCalc Software (MedCalc 9.2.10, Belgium) used for Bland-Altman analysis. Data were tested for normality. Normally distributed data are presented as means ± standard deviations (SDs), nonnormally distributed data as medians and ranges or interquartile ranges (IQRs), and categorical data as n (%). Intrarater and interrater reliability were examined by calculation of the intraclass correlation coefficient (ICC) and the 95% confidence interval (95%CI) of the ICC. Interrater ICCs were calculated for the subsection and total scores of AIMS in each age group, using the scores given by the three raters (A, B, and C) to the same infant; the ICC value between each pair of raters was also calculated. Intrarater ICCs were calculated from the repeated scorings made at a one-month interval by rater B and rater C. Interrater and intrarater reliability were also analyzed using Bland-Altman plots. Because of the high degree of correlation between the development of gross motor and fine motor skills [22], concurrent validity was also assessed by calculation of the ICC and the 95%CI of the ICC between the AIMS total raw score and the PDMS-2 raw score for each subtest (including gross motor and fine motor). The kappa concordance coefficient was calculated to analyze the qualitative consistency between the AIMS percentile and the PDMS-2 GMQ, FMQ, and TMQ scores. As described by Portney

Demographic and Clinical Characteristics of the Study Participants.
A total of 50 infants (21 females and 29 males) with an average age of 4.14 ± 2.02 months (range, 0.17-9.27 months) were included in the study. The demographic and clinical characteristics of the study participants are listed in Table 1.

Interrater Reliability.
The AIMS total and subsection (prone, supine, sitting, and standing) scores recorded independently by each rater (A, B, and C) and analyzed for all patients as well as for subgroups based on corrected age (0-3 months and 4-9 months) are shown in Table 2. The interrater ICC values are also listed for all patients. The reliability of AIMS was further examined between each pair of raters (Table 3). The ICC value between each pair of raters exceeded 0.9 across all age groups for all AIMS subsections, except for the standing subsection scores (across all age subgroups) for rater C versus rater A or B. The overall ICC (total AIMS score for all patients)
Interrater reliability of the AIMS total score for all patients between each pair of raters was further examined using Bland-Altman analysis (Figures 1, 2, and 3). For rater A and rater B, the mean difference was 0.08, the SD of the differences was 0.70, and the lower and upper limits were -1.28 and 1.44, respectively. For rater A and rater C, the mean difference was 0.38, the SD of the differences was 1.35, and the lower and upper limits were -2.27 and 3.03, respectively. For rater B and rater C, the mean difference was 0.30, the SD of the differences was 1.33, and the lower and upper limits were -2.31 and 2.91, respectively.
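For readers who want to see how an ICC of this kind is obtained from a subjects-by-raters score table, the following Python sketch computes one common form, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater). The text does not state which ICC model was selected in SPSS, so the choice of ICC(2,1) is an assumption, and the function name `icc_2_1` and the example scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per infant, one column per rater.
    This is an illustrative choice of ICC form, not necessarily the model
    used in the study.
    """
    n = len(scores)                      # number of subjects
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(scores[0])                   # number of raters
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with at least two raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                  # between-subject mean square
    ms_cols = ss_cols / (k - 1)                  # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical AIMS total scores for five infants rated by raters A, B, and C.
example_scores = [
    [4, 4, 5],
    [9, 10, 9],
    [15, 15, 16],
    [22, 21, 22],
    [30, 30, 31],
]
print(round(icc_2_1(example_scores), 3))
```

Because the absolute-agreement form penalizes systematic offsets between raters, it is a stricter criterion than a consistency-type ICC; a rater who always scores one point higher lowers ICC(2,1) even when the ranking of infants is identical.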

Intrarater Reliability.
The intrarater ICC values for AIMS total score and subsection scores (prone, supine, sitting, and standing) are shown in Table 4 for rater B and Table 5 for rater C. For raters B and C, respectively, intrarater ICC values for total AIMS score were 0.872 (95%CI: 0.699-0.946) and The ICC values for both raters were generally lowest for the standing subsection of AIMS in both age groups. Furthermore, the ICC values for the subsection scores were generally lower for the 0-3 months' age group (0.580-0.869) than for the 4-9 months' age group (0.728-0.991).
Intrarater reliability of the AIMS total score for all patients was further examined using Bland-Altman analysis (Figures 4 and 5). For rater B, the mean difference between ratings was -0.24, the SD of the differences was 1.25, and the lower and upper limits were -2.70 and 2.22, respectively. For rater C, the mean difference between ratings was -0.50, the SD of the differences was 1.37, and the lower and upper limits were -3.19 and 2.19, respectively.
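The Bland-Altman limits reported above follow from the mean difference and the SD of the differences as mean ± 1.96 × SD. The sketch below shows that calculation in Python; the function names are hypothetical, and the multiplier 1.96 is the conventional choice for 95% limits of agreement, assumed here because the text does not state it.

```python
from statistics import mean, stdev


def limits_from_summary(bias, sd, z=1.96):
    """Lower and upper limits of agreement from a mean difference and SD."""
    return bias - z * sd, bias + z * sd


def bland_altman(first, second, z=1.96):
    """Bias, SD of differences, and limits of agreement for paired scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)          # sample SD (n - 1 in the denominator)
    lower, upper = limits_from_summary(bias, sd, z)
    return bias, sd, lower, upper


# Check against the summary values reported for rater C (intrarater).
lower, upper = limits_from_summary(-0.50, 1.37)
print(round(lower, 2), round(upper, 2))   # -3.19 2.19
```

Applying the same function to the rounded summary values for rater B (-0.24 and 1.25) gives limits that differ from the reported ones in the second decimal place, which is expected when the limits were computed from unrounded inputs.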

Concurrent Validity.
PDMS-2 was administered to 47 infants, for whom the AIMS total score was 13.68 ± 8.95; 3 infants were not assessed with PDMS-2 because they were <1 month of age. The PDMS-2 raw scores for these 47 infants and the ICC value between the AIMS total score and the PDMS-2 raw scores for the RE, ST, LO, GR, and VI subtests are shown in Table 6. The ICC value exceeded 0.9 for all PDMS-2 subtests, except for the RE subtest scores (ICC: 0.751, 95%CI: 0.553-0.861). Correlation analysis suggested a good positive correlation (all ICCs >0.75).
The correlation coefficients between the fifth percentile of the AIMS total score and each of GMQ, FMQ, and TMQ are presented in Table 7. In this assessment of qualitative consistency, 17 high-risk infants were on or below the fifth percentile of the AIMS total score, while 30 infants were above the fifth percentile. GMQ was <90 in 16 infants and ≥90 in 31 infants; FMQ was <90 in 11 infants and ≥90 in 36 infants; and TMQ was <90 in 17 infants and ≥90 in 30 infants. The kappa concordance correlations of the AIMS fifth percentile with GMQ, FMQ, and TMQ were 0.580, 0.601, and 0.724, respectively, suggesting a moderate correlation.
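The kappa statistic used above measures agreement between two binary classifications after correcting for chance. The Python sketch below computes Cohen's kappa from a 2×2 cross-tabulation. The function name is hypothetical, and the cell counts in the example are also hypothetical: the text reports only the marginal totals (17 versus 30 for AIMS and 17 versus 30 for TMQ), so the cells were chosen to match those margins and are not taken from Table 7.

```python
def cohen_kappa_2x2(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two binary classifications of the same subjects.

    both_pos:    flagged by both tools
    first_only:  flagged by the first tool only
    second_only: flagged by the second tool only
    both_neg:    flagged by neither tool
    """
    n = both_pos + first_only + second_only + both_neg
    if n == 0:
        raise ValueError("empty table")

    observed = (both_pos + both_neg) / n          # observed agreement

    first_pos = both_pos + first_only             # margin of the first tool
    second_pos = both_pos + second_only           # margin of the second tool
    expected = (
        first_pos * second_pos + (n - first_pos) * (n - second_pos)
    ) / n ** 2                                    # agreement expected by chance

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical cells consistent with the reported margins for AIMS
# (17 at or below the fifth percentile) and TMQ (17 with a quotient <90).
kappa = cohen_kappa_2x2(both_pos=14, first_only=3, second_only=3, both_neg=27)
print(round(kappa, 3))
```

With these illustrative cells, six of 47 infants are classified differently by the two tools and kappa comes out at about 0.72, which shows how a modest number of discordant classifications is enough to hold kappa well below 1 even when raw agreement is high.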

Discussion
An important finding of our study was that the intrarater and interrater reliability of the AIMS total and subsection scores were high in infants with a corrected age of 9 months or less. In addition, AIMS total score was well correlated with the various PDMS-2 subtest scores (ICC: 0.751-0.977), although classification by the fifth percentile of AIMS total score agreed only moderately with the GMQ, FMQ, and TMQ of PDMS-2 (kappa values of 0.580-0.724). Overall, our study provides evidence that AIMS shows excellent reliability and concurrent validity when used to screen for motor developmental delay in infants aged ≤9 months in China.
The findings of our study regarding the reliability of AIMS were similar to those reported previously in many countries, not only for normally developing infants [15,19,20,[24][25][26] but also for high-risk infants [27]. Thus, numerous previous investigations have demonstrated high levels of intrarater and interrater reliability when AIMS is used to evaluate motor development in both normally developing infants and high-risk infants. It was reported previously that the AIMS scores of infants at dual risk of motor delays or disabilities were very similar between novice examiners and experienced examiners in the USA (ICC values of 0.98-0.99) [28]. This suggests that it is relatively straightforward to rapidly train medical staff in the correct administration of AIMS. Our study also indicated that AIMS could be reliably and easily administered by physical therapists to high-risk infants in China after only a short training course in the theories of motor development and the use of AIMS.
When each of the AIMS subtests (prone, supine, sitting, and standing) was examined independently for the different age groups (0-3 months, 4-9 months, and 0-9 months), the interrater ICC values between any two raters were all >0.92 (and many values were ≥0.98), with the exception of interrater ICC values for the standing subtest, which were notably lower (rater C versus A or B: 0.880 for ages of 0-3 months, 0.814 for ages of 4-9 months, and 0.867 for ages of 0-9 months). Intrarater reliability for the standing subtest was also low for both raters, particularly for younger infants (0.580-0.754 for ages of 0-3 months, 0.728-0.922 for ages of 4-9 months, and 0.819-0.923 for ages of 0-9 months). These findings are consistent with those reported previously in Canada [15] and China [26]. The lower reliability for the standing subtest of AIMS in younger infants, particularly those aged 0-3 months, may reflect the difficulty in assessing standing movements in young infants and thus greater variability between scores than for other subtests. Furthermore, younger infants are able to perform fewer items than older infants, which may also contribute to the weaker correlations observed in infants aged 0-3 months. Nonetheless, the overall interrater and intrarater ICC values exceeded 0.99, confirming that AIMS is a very stable and reliable method for evaluating motor development in high-risk infants.

BioMed Research International
A validity study in Canada found a high correlation between AIMS and PDMS-2 (a correlation coefficient of 0.99) when the tests were applied concurrently to full-term infants [15]. Similarly, a report in China determined that the correlation coefficient between AIMS and PDGMS-2 in high-risk infants aged 1-9 months reached 0.91 [29]. Our results also showed high degrees of correlation between the AIMS score and all the PDMS-2 subscale scores in high-risk infants. The correlation coefficients between total AIMS score and each of the PDMS-2 subtest scores (RE, ST, LO, GR, and VI) were all above 0.75, with the highest correlation being between the PDMS-2 LO subscale and AIMS total score. These findings support the concurrent validity of AIMS and PDGMS-2, particularly for the LO subscale, in agreement with a previous study in the USA [28]. Gross and fine motor skills in infancy are important parts of human intelligence and depend on the development of feeling and cognition [30]. The development of gross motor skill is assessed by the RE, ST, and LO subtests of PDMS-2. RE represents a fundamental basis of gross motor development in infants because of its complex involvement in the regulation of ST and LO by the nervous system. Fine motor skills are developed by accessing basic ST and LO abilities. Visual function also affects the development of ST and LO, which together promote the development of fine motor abilities [22]. Therefore, gross motor development is closely related to fine motor development, with each promoting the other. A high degree of correlation between AIMS and PDFMS-2 was found in our study, which suggests that AIMS is a reliable motor assessment scale for high-risk infants. However, the correlations between the fifth percentile of AIMS and the motor quotients of PDMS-2 (GMQ, FMQ, and TMQ) were only moderate-to-good. This discrepancy may be due to sampling bias.
Standardization of AIMS was established in infants in Canada, and being below the fifth percentile was considered to be indicative of motor dysplasia, while PDMS-2 was designed using American norms, with the lowest 12% considered as having motor dysplasia. Furthermore, different approaches to correcting for gestational age between the two scales (40 weeks for AIMS and 37 weeks for PDMS-2) may also have contributed to the moderate degree of correlation between the fifth percentile of AIMS and the motor quotients of PDMS-2.
Screening motor development in high-risk infants in China is a challenging task because of the large population and relative lack of medical resources. Therefore, it is imperative to select an infant motor assessment scale that can monitor motor development and detect motor disorders in high-risk infants with high sensitivity. AIMS has the important advantages that it is straightforward to administer and requires observation of the infant only. Furthermore, AIMS is sensitive in identifying children with subtle movement problems and could potentially identify motor deficits in high-risk infants at an early stage. The reliability and concurrent validity of AIMS determined in our study suggest that physical therapists could choose either AIMS or PDMS-2 for evaluating the motor development of high-risk infants. AIMS might better fulfill the current need in the field of high-risk infant motor assessment because the process and quality of movement as well as the achievement of specific milestones are considered. Furthermore, the ease of administration of AIMS and the relatively short time required may make this instrument more feasible for use in follow-up clinics for infants at risk of motor delays.

Limitations of this Study
This was a single-center study, so the generalizability of our findings remains unknown. Furthermore, the small sample size limits the statistical power of the study. Since our evaluators participated in a training session and practiced so as to achieve a certain level of agreement, our results may not be representative of those that would be obtained by therapists in general practice. Due to the use of a video-recorded evaluation, raters B and C did not have to handle the infant, eliminating a potential source of error; in general practice, differences in handling skills between therapists may lead to lower reliability. In our study, AIMS and PDMS-2 were administered to each infant only once, so longitudinal data for motor function, including assessments made after therapeutic intervention, were not available. Information regarding the predictive validity of AIMS was also limited. In a future study, criterion-related validity could be investigated by performing longitudinal follow-up of motor development in infants at high risk of motor delays, with the infants evaluated with AIMS and PDMS-2 at corrected ages of 6 and 12 months.

Conclusion
Our results indicate that AIMS is a reliable and stable instrument for evaluating motor function in high-risk infants in China. Because of its straightforward administration and low cost, AIMS could be used to monitor motor development during follow-up of high-risk infants. Furthermore, AIMS can guide early intervention for developmental disorders.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no conflicts of interest.