Behavioral interventions for reducing head motion during MRI scans in children

A major limitation to structural and functional MRI (fMRI) scans is their susceptibility to head motion artifacts. Even submillimeter movements can systematically distort functional connectivity, morphometric, and diffusion imaging results. In patient care, sedation is often used to minimize head motion, but it incurs increased costs and risks. In research settings, sedation is typically not an ethical option. Therefore, safe methods that reduce head motion are critical for improving MRI quality, especially in high-movement individuals such as children and neuropsychiatric patients. We investigated the effects of (1) viewing movies and (2) receiving real-time visual feedback about head movement in 24 children (5–15 years old). Children completed fMRI scans during which they viewed a fixation cross (i.e., rest) or a cartoon movie clip, and during some of the scans they also received real-time visual feedback about head motion. Head motion was significantly reduced during movie watching compared to rest and when receiving feedback compared to receiving no feedback. However, these results depended on age, such that the effects were largely driven by the younger children. Children older than 10 years showed no significant benefit. We also found that viewing movies significantly altered the functional connectivity of fMRI data, suggesting that fMRI scans during movies cannot be equated to standard resting-state fMRI scans. The implications of these results are twofold: (1) given the reduction in head motion with behavioral interventions, these methods should be tried first for all clinical and structural MRIs in lieu of sedation; and (2) for fMRI research scans, these methods can reduce head motion in certain groups, but investigators must keep in mind the effects on functional MRI data.
Highlights
In young children, movie watching during MRI scans reduces head motion.
Real-time head motion feedback also reduces motion during MRI scans in young children.
Motion effects were specific to younger (5-10 years) not older children (11-15 years).
Movies, but not feedback, significantly alter functional connectivity MRI data.


Introduction
The advent of MRI has revolutionized human brain imaging, for both clinical and research purposes. Given its high spatial resolution, absence of radiation, and numerous sequence options that can resolve different structural and functional brain properties (e.g., T1, T2, FLAIR, DTI, BOLD), MRI has become a preferred diagnostic and investigative tool for neurologists, radiologists, psychiatrists, neurosurgeons, and neuroscientists. One of the biggest limitations of brain MRI is patient/participant motion during scanning, which causes artifacts in both structural and functional MRI (fMRI) data, effectively blurring the images.
Children and patients have the highest head motion during MRI scans. Thus, current clinical practice in pediatrics commonly uses sedation to ensure that patients hold still during scans. Sedation incurs the time and costs of a trained anesthetist and carries risks for the patient. Time-sensitive tests may suffer due to unavailability of an anesthesiologist. Moreover, there is evidence to suggest that anesthesia can have negative effects on neurodevelopment in young children (Sanders et al., 2013; Andropoulos and Greene, 2017; Coleman et al., 2017). Therefore, it would be highly preferable to obtain high-quality, minimal-motion brain MRIs in all patients, even children, without the need for sedation.
For most research brain MRIs, the medical risks of sedation make its use unethical. Therefore, MRI data collected for research purposes are often contaminated to some degree by motion artifact. Researchers studying populations that tend to have high amounts of movement (e.g., children, neuropsychiatric patients) often must discard vast amounts of MRI data contaminated by head motion (e.g., Greene et al., 2016b, 2017). Given the high cost of MRIs (e.g., > $600/hour at many institutions in the U.S.), such data loss is extremely uneconomical. Some cohorts, such as children under the age of five years, have so much head motion during awake MRI scans that they have essentially been excluded from fMRI research.
Recent work has demonstrated that the effects of head motion are even more problematic than previously understood. Even small amplitude micro-movements of the head from one data frame to the next lead to systematic distortions in quantitative MRI analyses, including functional connectivity (Power et al., 2012; Satterthwaite et al., 2012; Van Dijk et al., 2012; Ciric et al., 2017), volumetric (Reuter et al., 2015), and diffusion imaging (Yendiki et al., 2013) approaches. In functional data, motion can be indexed by calculating absolute displacement of an fMRI frame away from a reference frame, as in root mean squared head position change (RMS), or by calculating relative displacement from one frame to the next, as in framewise displacement (FD). While RMS was commonly used to assess motion for the purposes of quality control and group matching, it does not distinguish between qualitatively different types of movement that can affect the data differently, and thus is poorly correlated with motion-related artifacts (Power et al., 2012). By contrast, FD is closely related to motion artifacts, making it a much more useful estimate of problematic motion (Power et al., 2012; Satterthwaite et al., 2012; Van Dijk et al., 2012; Ciric et al., 2017). Multiple data processing approaches have been developed to reduce the effects of head motion on data quality, some based on FD (Tisdall et al., 2012; Power et al., 2014; Ciric et al., 2017). While implementing such methods is necessary for minimizing artifact in motion-contaminated data, they either fail to completely remove head motion artifacts or do so by removing large amounts of data.
Methods to reduce head motion during the acquisition stage of MRI scanning are critically important for improving the availability and safety of diagnostic and research brain MRIs. While a number of approaches for encouraging children and patients to hold still in the scanner have been tried, the empirical evidence is limited (Greene et al., 2016a). For example, there are several physical head restraint methods, including head molds, bite bars, thermoplastic masks, vacuum packs, and cushioning, each with varying levels of tolerability for the patient. Yet few studies test the efficacy of these methods for human MRI, and those that do have only assessed gross movement artifact by visual inspection or repositioning error (Bettinardi et al., 1991; Schultke et al., 2013). In addition, some of these devices (e.g., thermoplastic masks) do not fit inside the 32- and 64-channel head coils that are becoming standard with newer MRI scanners.
Aside from physical restraints, there is some evidence that head motion as measured by FD in children is lower during engaging, fast-paced tasks compared to less engaging tasks or to rest (i.e., lying awake, doing no goal-directed task in the scanner) (Engelhardt et al., 2017). Movies can be highly engaging and fast-paced, and therefore, may reduce head motion during scans. In fact, showing movies to patients and research participants is often touted anecdotally as useful for improving MRI data quality and is used in many clinical settings during scanning of children. However, we are aware of only a small number of studies that experimentally examined movement during movie watching. One study collected fMRI data as children, ages 4-10 years old, viewed clips from the television show Sesame Street and as they performed a traditional behavioral matching task (Cantlon and Li, 2013). While the main purpose of the study was to investigate brain development during naturalistic, educational stimuli, the authors also compared head motion during these two conditions. They found significantly less head motion (translation and rotation) when viewing the Sesame Street clips than when performing the task. Another study that specifically aimed to investigate motion during movies compared to rest found lower mean FD during movies than rest in adults and children (4-7 years old) (Vanderwal et al., 2015). For the children, however, head motion was measured using a sensor system (MoTrak Head Motion Tracking System, Psychology Software Tools, Inc.) in a mock scanner, rather than using motion estimates from actual fMRI images; these sensor systems are susceptible to movements of the muscles of facial expression, such as raising one's eyebrows. While this study is arguably the best to date in evaluating the effects of movie watching on head motion in children, more research is needed during actual MRI scanning and over an expanded age range.
Another potential method for reducing head motion is providing feedback to the individual being scanned about his/her motion. Until recently, scanner operators almost never read out accurate motion estimates and provided real-time feedback about motion to the individual being scanned. Setting up a system for real-time feedback about head motion in the scanner is non-trivial, as it requires particular hardware and software infrastructure, and as such is uncommon, especially in clinical settings. Scanner operators sometimes provide verbal feedback between scans to let participants know that they seem to be moving too much, albeit based on visual detection of frame-to-frame motion on the scanner console, which must be large enough to detect by eye.
Many research groups attempt to train participants to hold still prior to scanning in a mock scanner using software such as MoTrak to provide online feedback about head movement. Yet, standard MoTrak is based on absolute head displacement rather than frame-to-frame displacement, and the sensor systems are not ideal, as mentioned previously. A study by Lal et al. did investigate the effects of watching one's own motion traces on subsequent resting state fMRI scans (Lal et al., 2016). In a sample of 7 adults, they found no (or inconsistent) improvement in the number of spikes during resting state scans before and after feedback. On the other hand, another study found a significant reduction in head motion when adding real-time feedback to an N-back task during fMRI in 12 adult males (Yang et al., 2005). Thus, there is some indication that real-time feedback regarding head motion may help participants hold still during resting-state fMRI scans.
In the present study, we measured frame-to-frame in-scanner head motion in children and adolescents between 5 and 15 years old, while viewing movies and/or during real-time head motion (FD) feedback. We used a newly developed software package, Framewise Integrated Real-time MRI Monitoring (FIRMM), that allows for real-time computation of FD during scans, and developed a method to feed the FD information back visually to participants. We predicted that both movies and feedback would reduce head motion during scans in this population, yet we found that these results were strongly dependent on age. We also tested whether or not out-of-scanner movement and sleep could predict a child's ability to hold still during MRI scanning. Using out-of-scanner movement data collected from accelerometers worn on the wrist, we examined relationships between real-world daily activity counts and sleep and in-scanner head motion, but failed to detect any significant relationships. By developing, validating, and promoting safe methods for reducing head motion during MRI, and testing for predictors of in-scanner head motion, we aim to increase the quality of unsedated clinical and research brain MRIs, while simultaneously reducing their costs.

Participants
Twenty-four children and adolescents (10 female, 5-15 years old, mean age 11.1 years) recruited from the local Washington University community participated in this study. Participant characteristics are summarized in Table 1. Participants completed the Tics, Obsessive Compulsive Disorder (OCD), and Autism Spectrum Disorders (ASD) modules of the Kiddie Schedule for Affective Disorders and Schizophrenia (KSADS) (Kaufman et al., 1997), current and lifetime attention-deficit/hyperactivity disorder (ADHD) Rating Scale (Conners et al., 1998), the Multidimensional Anxiety Scale for Children (MASC) (March et al., 1997), the Social Responsiveness Scale (SRS) (Constantino et al., 2003), the Kaufman Brief Intelligence Test II (K-BIT II) (Kaufman and Kaufman, 2004), the Barratt Simplified Measure of Social Status (BSMSS), and the Edinburgh Handedness Inventory (Oldfield, 1971). Assessments were collected using REDCap [Research Electronic Data Capture] hosted at Washington University (Harris et al., 2009). Of the 24 participants, 6 did not complete the KSADS, 1 did not complete the K-BIT, and 3 did not complete the ADHD Rating Scale, SRS, MASC, or BSMSS, all due to time constraints.
Participants were excluded for parent-reported psychosis, mania, ASD, cerebral palsy, epilepsy, intellectual delay/disability and cortical visual impairment. Participants were also excluded for any contraindications to MRI, including a history of abnormal heart rhythm, pacemaker, metallic object(s) in body, extensive dental work, claustrophobia (as determined by asking the child whether he/she has ever experienced symptoms of claustrophobia such as feelings of anxiety/panic when in a confined space), and concussion with loss of consciousness > 5 min. Participants were not excluded for tic disorders, anxiety disorders, ADHD, taking psychoactive medications, or handedness. Two of the participants had a previous diagnosis of ADHD, both of whom were taking stimulant medications. No other children were taking psychoactive medications. One participant met diagnostic criteria for OCD and one met diagnostic criteria for Provisional Tic Disorder after the KSADS.
A parent or guardian gave informed consent and all children and adolescents assented. All participants were compensated for their participation. The Washington University Human Research Protection Office approved the study.

Image acquisition
Participants were scanned on a Siemens Tim Trio 3.0 T MAGNETOM scanner (Siemens Medical Solutions, Erlangen, Germany) with a Siemens 12-channel Head Matrix Coil. A high-resolution T1-weighted MPRAGE structural image (resolution = 1 × 1 × 1 mm) was acquired for each participant. Functional images were acquired using a BOLD contrast-sensitive echo-planar sequence (TE = 27 ms, flip angle = 90°, in-plane resolution 4 × 4 mm; volume TR = 2.5 s). Whole-brain coverage was obtained with 32 contiguous interleaved 4 mm axial slices. Participants completed seven 6-min 50-s long BOLD runs.

Experimental design
Participants completed Rest runs, during which they viewed a fixation crosshair, and Movie runs, during which they viewed movie clips. For each of these Stimulus conditions (Rest, Movie), they received three Feedback conditions: None, Fixed, and Adaptive. During the Fixed and Adaptive Feedback conditions, participants received online feedback about their head motion. Thus, the experiment consisted of a 2 (Stimulus) X 3 (Feedback) design, resulting in six conditions. The first BOLD run always consisted of a baseline Rest run in order to obtain a baseline assessment of each participant's movement during a standard eyes-open resting state scan. The following six runs consisted of the six experimental conditions, the order of which was counterbalanced across participants.
Participants were instructed to relax and hold as still as possible during all scans. During Rest scans, they were told to look at the "plus sign" and during Movie scans, they were told to watch the movie. For the feedback scans, participants were told that a game was added such that the scanner will tell them if they are moving too much with a yellow/red plus sign (Rest) or box (Movie), and their goal was to keep the plus sign white (Rest) or keep the boxes away (Movie). For the Adaptive Feedback condition, they were also told that when they hold still well, the scanner will take the game to the next level and make it a little harder.

Stimuli
Stimuli were presented using the Psychophysics Toolbox Version 3 in Matlab, and back-projected onto an MR-compatible rear-projection screen at the end of the scanner bore, which the participants viewed through a mirror mounted onto the head coil. The screen size was 1024 × 768 pixels. MR-compatible headphones were worn to dampen the noise of the scanner and to listen to the movies during the Movie conditions.
A schematic of the stimuli is shown in Fig. 1. During the Rest conditions, stimuli consisted of a centrally presented white crosshair (subtending <1° visual angle) on a black background. For the Rest conditions that included feedback, the feedback consisted of the crosshair changing color to yellow for "medium" motion or red for "high" motion. Motion was determined using framewise displacement (FD; see below for description). The criteria for medium and high motion were tailored to the individual by extracting the individual participant's FDs during the baseline rest scan. The FDs for each frame of the baseline rest scan were sorted highest to lowest; the FD corresponding to the top 10% of frames was used as the high motion threshold and the FD corresponding to the top 25% of frames was used as the medium motion threshold. Floor thresholds were set to 0.3 mm (high) and 0.2 mm (medium). For the Fixed Feedback condition, the thresholds were held constant for the duration of the run. For the Adaptive Feedback condition, the thresholds were held at these starting values for the first 20 frames of the run, after which they were recalculated according to the same criteria (10 and 25%) using the previous 20 frames of the current scan, and recalculated for each subsequent frame based on the previous 20 frames. New FD threshold values replaced the previous FD threshold values only if they were lower than the current ones (i.e., stricter). Thus, participants could decrease the FD threshold values until the end of the run or until reaching the floor thresholds of 0.3 and 0.2 mm.
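The threshold rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names, the exact index used for the "top 10%/25%" cutoff, and the use of `>=` when comparing a frame's FD to a threshold are all assumptions, as are the color labels returned for each frame.

```python
FLOORS = {"high": 0.3, "medium": 0.2}        # floor thresholds in mm (from the text)
FRACTIONS = {"high": 0.10, "medium": 0.25}   # top 10% and top 25% of frames (from the text)


def top_fraction_cutoff(fd_values, fraction):
    """FD at the edge of the top `fraction` of frames, sorted highest to lowest."""
    ranked = sorted(fd_values, reverse=True)
    # Assumed convention: take the last frame that still falls inside the top fraction.
    index = max(int(len(ranked) * fraction) - 1, 0)
    return ranked[index]


def thresholds_from(fd_values):
    """Individualized thresholds, never allowed to drop below the floors."""
    return {
        level: max(top_fraction_cutoff(fd_values, FRACTIONS[level]), FLOORS[level])
        for level in FLOORS
    }


def classify(fd, thresholds):
    """Map one frame's FD to a feedback color (comparison operator is an assumption)."""
    if fd >= thresholds["high"]:
        return "red"
    if fd >= thresholds["medium"]:
        return "yellow"
    return "white"


def adaptive_feedback(baseline_fd, run_fd, window=20):
    """Adaptive condition: start from baseline thresholds, then only ever tighten them."""
    thresholds = thresholds_from(baseline_fd)
    colors = []
    for i, fd in enumerate(run_fd):
        if i >= window:
            # Recompute from the previous `window` frames of the current run.
            candidate = thresholds_from(run_fd[i - window:i])
            for level in thresholds:
                # Replace only if the new value is lower (stricter).
                thresholds[level] = min(thresholds[level], candidate[level])
        colors.append(classify(fd, thresholds))
    return colors, thresholds
```

In this sketch the Fixed Feedback condition corresponds to calling `thresholds_from` once on the baseline run and passing the result to `classify` for every frame.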
During the Movie conditions, stimuli consisted of clips of cartoon blockbuster movies edited for our specific research purposes (our custom video clips are available for research purposes at www.dosenbachlab.wustl.edu). Three movies were used to make a total of seven movie clips that were shown to participants in a randomized order. Two clips were taken from Big Hero 6 (Disney Movies), two were from Despicable Me (Illumination Entertainment, Universal Pictures), and three were from Finding Nemo (Disney Movies, Pixar). Clips were chosen on the basis of being engaging, but not overly exciting or upsetting, as determined by the experimenters (authors VW and NUFD). For each participant, a different clip was shown for each Movie condition. For the Movie conditions with feedback, the feedback consisted of a yellow rectangle centered on the screen (500 × 375 pixels) for medium motion, or a larger red rectangle centered on the screen (800 × 600 pixels) for high motion that occluded the movie while it continued to play. The criteria for feedback during the Fixed and Adaptive Feedback conditions were the same as those for the Rest Feedback conditions.

Framewise Integrated Real-time MRI monitoring (FIRMM) software
To monitor head motion and to present feedback based on real-time calculation of head motion, we used a software suite recently developed in our laboratories, FIRMM. For details about the software, see Dosenbach et al. (2017). Briefly, FIRMM reads in the DICOM files from the scanner as the frames are acquired, converts the files into 4dfp format, and realigns the data using a speed-optimized version of the 4dfp cross_realign3d_4dfp algorithm (Smyser et al., 2010). FD is then calculated from the head realignment parameters as in Power et al. (2012) for each frame. FIRMM optimizes these steps for computational speed, allowing for real-time feedback of FD to the scanner operator and to the participant when desired (i.e., during Feedback conditions). FIRMM has been most extensively used with self-built computers running Linux (Ubuntu 14.04 LTS) and the following hardware specifications: Intel Core i7 4790K 4.0 GHz processor, 16 GB DDR3 memory, Samsung 850 EVO 120 GB SSD and NVIDIA GTX 960 GPU. For additional details regarding system requirements, visit www.firmm.io.
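As a rough illustration of the FD calculation referred to above (this is not FIRMM's code), the sketch below follows the definition in Power et al. (2012): FD for a frame is the sum of the absolute frame-to-frame changes in the six rigid-body realignment parameters, with rotations converted to millimeters of arc on a sphere. The 50 mm sphere radius, the parameter order (three translations, then three rotations), the degree units for rotations, and the function name are assumptions of this sketch.

```python
import math

HEAD_RADIUS_MM = 50.0  # assumed sphere radius for converting rotations to mm


def framewise_displacement(params, rotations_in_degrees=True):
    """Return one FD value per frame.

    params: list of (dx, dy, dz, rx, ry, rz) tuples, one per frame, with
    translations in mm. The first frame has no predecessor, so its FD is 0.
    """
    fd = [0.0]
    for prev, curr in zip(params, params[1:]):
        total = 0.0
        for axis in range(6):
            delta = curr[axis] - prev[axis]
            if axis >= 3:
                # Convert a rotation change to arc length on the sphere surface.
                if rotations_in_degrees:
                    delta = math.radians(delta)
                delta *= HEAD_RADIUS_MM
            total += abs(delta)
        fd.append(total)
    return fd
```

Because FD is built from differences between consecutive frames, a slow drift away from the reference frame contributes little to it, whereas a brief jerk contributes a lot; this is the distinction from absolute displacement drawn in the Introduction.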

Image analysis

Preprocessing
Functional images from each participant were preprocessed to reduce artifacts (Shulman et al., 2010), including (i) sinc interpolation of all slices to the temporal midpoint of the first slice, accounting for differences in the acquisition time of each individual slice, (ii) correction for head motion within and across runs, and (iii) intensity normalization to a whole brain mode value (across voxels and TRs) of 1000 for each run. Atlas transformation of the functional data was computed for each individual using the MPRAGE T1-weighted scan. For one participant, the T1-weighted scan contained too much motion artifact for adequate registration, and thus, a T2-weighted image was used. Each functional run was resampled in atlas space on an isotropic 3 mm grid combining movement correction and atlas transformation in a single interpolation (Shulman et al., 2010). The target atlas (described in Greene et al., 2014) was previously created from MPRAGE scans of thirteen 7-9 year old children (seven males) and twelve 21-30 year old adults (six males), collected on the same Siemens 3T Trio used in this study. This atlas was made to conform to the Talairach atlas space using the spatial normalization method of Lancaster et al. (1995).

Functional connectivity preprocessing
For resting-state functional connectivity MRI analyses, additional preprocessing steps were used to reduce spurious variance unlikely to reflect neuronal activity (Fox et al., 2009). These steps included (i) demeaning and detrending, (ii) multiple regression of nuisance variables from the BOLD data (nuisance variables included motion regressors derived by Volterra expansion (Friston et al., 1996), individualized ventricular and white matter signals constructed using Freesurfer's segmentation, brain signal averaged across the whole brain, and the derivatives of these signals), (iii) temporal band-pass filtering (0.009 Hz < f < 0.08 Hz), and (iv) spatial smoothing (6 mm full width at half maximum). For the one participant with excessive movement contaminating the T1 image, the T2-weighted image was used for creation of the nuisance regressor masks using FSL's fast segmentation.

Motion censoring
As the goal of this paper is not to tackle different denoising strategies, we selected a method used in our lab that has been shown to effectively minimize motion artifact, namely global signal regression + volume censoring. In fact, the combination of these procedures has been shown to best account for the motion artifact that contaminates studies of group or individual differences, making this approach ideal for developmental studies (Satterthwaite et al., 2017). Here, volumes with FD > 0.3 mm were identified and censored from the data. The threshold of 0.3 mm was chosen because at this movement threshold, even the best performing subjects received the "red" warning that movement was too high during the feedback conditions. Given this approach, we were able to index head motion by calculating both mean FD and the number of frames retained after censoring.
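The two motion indices can be computed from a run's FD trace as in the sketch below. It assumes that mean FD is taken over all frames of the run before censoring and that frames with FD exactly equal to 0.3 mm are retained; neither detail is spelled out in the text, and the function name is hypothetical. The TR of 2.5 s comes from the acquisition parameters and is used only to convert retained frames into retained scan time.

```python
FD_THRESHOLD_MM = 0.3   # censoring threshold (from the text)
TR_SECONDS = 2.5        # volume TR (from the acquisition parameters)


def summarize_motion(fd_values, threshold=FD_THRESHOLD_MM, tr=TR_SECONDS):
    """Return the censoring mask, mean FD, frames retained, and seconds retained."""
    # True marks a frame that survives censoring (FD not above the threshold).
    keep = [fd <= threshold for fd in fd_values]
    frames_retained = sum(keep)
    return {
        "keep": keep,
        "mean_fd": sum(fd_values) / len(fd_values),   # assumed: computed before censoring
        "frames_retained": frames_retained,
        "seconds_retained": frames_retained * tr,
    }
```

At this TR, 72 retained frames correspond to 72 × 2.5 s = 180 s, that is, 3 min of data.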

Seed maps
Imaging data were analyzed from 17 participants, all of whom retained at least 72 frames (3 min) of data in each condition after motion censoring (note that we do not have enough data per participant for reliable estimates at the level of the individual (Laumann et al., 2015; Gordon et al., 2017)). The other participants did not have enough data in one or more conditions for analysis. Importantly, the amount of data and mean FD post motion censoring did not differ significantly between conditions in these 17 participants (all p's > 0.1). We first aimed to test whether or not the data yielded expected functional connectivity results across the different conditions. Therefore, seed maps were constructed for six canonical seed regions: left motor cortex (Talairach coordinates: −38, −29, 57), right motor cortex (39, −19, 56), left angular gyrus (−46, −63, 31), left precuneus (9, −56, 16), right ventromedial prefrontal cortex (7, 37, 0), and dorsal anterior cingulate cortex (−1, 10, 46). Seeds with a 10 mm diameter centered on the canonical coordinates were created, and the timecourses in the seed regions were then cross-correlated with all other voxels in the brain. Seed maps were generated for each condition (Rest No Feedback, Rest Fixed Feedback, Rest Adaptive Feedback, Movie No Feedback, Movie Fixed Feedback, Movie Adaptive Feedback) and were compared using paired-samples t-tests.

Network construction
Functional connectivity (FC) correlation matrices were constructed for the 17 subjects with adequate imaging data. For each participant, FC timecourses were extracted from 264 previously defined regions of interest (ROIs) (Power et al., 2011). The cross correlations between all 264 ROIs (10 mm diameter spheres) were computed. These correlations can be viewed in matrix form, with the regions organized according to a previously described functional network scheme (Power et al., 2011). Correlation matrices were constructed for each participant for each condition and normalized using Fisher r-to-z transform. Matrices were averaged across participants to check for the expected block structure (i.e., strong within network correlations) in each condition.
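A minimal sketch of this matrix construction is given below, assuming Pearson correlation between ROI timecourses (the text says only "cross correlations"). The function names are hypothetical. Two choices are conveniences of this sketch rather than details from the text: the diagonal is set to 0 instead of transforming r = 1, and correlations are clipped slightly inside (−1, 1) so that the Fisher transform stays finite.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant timecourses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fc_matrix(timecourses):
    """Fisher z-transformed correlation matrix for a list of ROI timecourses."""
    n_rois = len(timecourses)
    matrix = [[0.0] * n_rois for _ in range(n_rois)]
    for i in range(n_rois):
        for j in range(i + 1, n_rois):
            r = pearson(timecourses[i], timecourses[j])
            r = max(-0.999999, min(0.999999, r))   # guard: atanh is undefined at +/-1
            matrix[i][j] = matrix[j][i] = math.atanh(r)  # Fisher r-to-z
    return matrix


def average_matrices(matrices):
    """Element-wise mean across participants' z-matrices."""
    n = len(matrices)
    size = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / n for j in range(size)]
        for i in range(size)
    ]
```

Averaging in z rather than r space is the usual reason for applying the Fisher transform before combining participants.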

Comparing FC across conditions
In order to test whether or not the behavioral interventions significantly affected FC, we statistically compared the correlation matrices across conditions. First, paired-samples t-tests were performed on each of the 34,584 functional connections represented in the 264 × 264 correlation matrices, excluding connections with |r| < 0.1. False Discovery Rate correction was applied to account for multiple comparisons (p(FDR) < 0.01). Given the many tests and strict multiple comparisons correction, we also used an omnibus approach for comparing "connectomes" as a whole. Specifically, we used a paired version of object-oriented data analysis (OODA), a method for contrasting connectomes described in La Rosa et al. (2012) and La Rosa et al. (2016). Briefly, OODA treats each correlation matrix as a single object and computes average weighted matrices (G*) following the Gibbs distribution for each condition. Then the matrices are contrasted by computing the Euclidean distance between them. To assign a p-value to the observed differences, the samples are bootstrapped (N = 1000 times), creating a distribution of distances. Thus, we can assess significance of the differences between conditions for the matrices as a whole. To interrogate the specific network-to-network blocks of the matrix contributing to the omnibus effect, we implemented a post-hoc permutation analysis approach. For each paired comparison that was significant at the omnibus level using OODA, the condition labels (e.g., Rest No Feedback, Movie No Feedback) were randomly permuted (N = 1000 times). The average absolute difference for each block of the matrix was computed for all permuted pairs, creating a distribution of differences for each block. The true condition comparison was then compared to this distribution to obtain a p-value. FDR correction was applied to correct for multiple comparisons across blocks.
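The block-level permutation step might look roughly like the following sketch; it does not attempt to reproduce OODA itself. Several details are assumptions because the text does not specify them: condition labels are swapped within each participant (equivalent to flipping the sign of that participant's difference), the block statistic is the mean over edges of the absolute group-mean difference, the p-value uses the common (count + 1)/(N + 1) form, and FDR is implemented as the Benjamini-Hochberg procedure with a caller-supplied level `q`. All function names are hypothetical.

```python
import random


def block_statistic(diffs):
    """Mean over edges of |group-mean difference| within one network block.

    diffs: one list per participant holding edge-wise differences
    (condition A minus condition B) for the edges in the block.
    """
    n_subjects = len(diffs)
    n_edges = len(diffs[0])
    return sum(
        abs(sum(d[e] for d in diffs) / n_subjects) for e in range(n_edges)
    ) / n_edges


def paired_permutation_p(diffs, n_perm=1000, seed=0):
    """P-value for one block from within-participant label swaps."""
    rng = random.Random(seed)
    observed = block_statistic(diffs)
    exceed = 0
    for _ in range(n_perm):
        # Swapping A and B for a participant flips the sign of their differences.
        flipped = []
        for d in diffs:
            sign = rng.choice((1, -1))
            flipped.append([sign * v for v in d])
        if block_statistic(flipped) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, q):
    """Return a list of booleans: which tests survive FDR correction at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank   # largest rank meeting the BH criterion
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

One p-value per block from `paired_permutation_p` would then be passed together to `benjamini_hochberg` to correct across blocks.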

Accelerometry
Accelerometry data outside the scanner were also collected for 19 of the 24 participants. Water-resistant tri-axial accelerometers worn on the wrist were used to measure upper extremity physical activity. A total of 100 h of data were collected for each participant in 25 h blocks over 4 sessions (Lang et al., 2007; Bailey and Lang, 2013). Children wore the wristwatch-sized devices bilaterally just proximal to the ulnar styloid to increase successful data collection from at least one upper limb.
The accelerometers are sensitive to ±6 g-force and detect linear movement recorded at 30 Hz (ActiGraph, wGT3X-BT; ActiGraph LLC, Pensacola, FL). Data were stored as activity counts where 1 count = 0.001664 g and binned into 1 s epochs. For each epoch, activity counts were combined across three axes ($\sqrt{x^2 + y^2 + z^2}$) into a single vector magnitude. A custom algorithm analyzed data in 30-min increments to determine if the accelerometer failed to collect data at any point during the wearing period. Activity counts were used to calculate the total hours of use and sleep quality of each child. Total physical activity was characterized with the sum of seconds where the activity count >2 (Uswatte et al., 2000; Bailey et al., 2014).
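For concreteness, the vector magnitude and the activity-time summary can be written as below. This is a sketch rather than the custom algorithm used in the study; it assumes that the ">2" activity criterion is applied to the per-second vector magnitude, and the function names are hypothetical.

```python
import math


def vector_magnitude(x, y, z):
    """Combine the three axis counts of one epoch into a single magnitude."""
    return math.sqrt(x * x + y * y + z * z)


def activity_summary(epochs, active_above=2):
    """Summarize 1-s epochs of tri-axial activity counts.

    epochs: list of (x, y, z) activity counts, one tuple per second.
    Returns the per-second magnitudes and the number of seconds whose
    magnitude exceeds `active_above` (assumed to be the activity criterion).
    """
    magnitudes = [vector_magnitude(x, y, z) for x, y, z in epochs]
    active_seconds = sum(1 for m in magnitudes if m > active_above)
    return magnitudes, active_seconds
```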
We tested for relationships between in-scanner movement, as measured by mean FD and number of frames retained after censoring, and accelerometry metrics. Metrics of daily movement included total activity counts, mean activity counts per second, variance of activity counts per second, percent of time in movement, mean activity counts per second during times of movement (defined as seconds with 10+ counts), and variance of activity counts per second during times of movement. Sleep metrics included sleep efficiency (amount of time spent awake during the longest sleep period), total time in bed, total sleep time, number of awakenings, and amount of time during awakenings. For each metric, measurements were calculated separately for the left and right hands.

Mean FD
To test the effects of real-time feedback and movie watching on FD, we ran a repeated-measures ANOVA with mean FD as the dependent variable and with the within-subjects factors Stimulus (Rest, Movie) and Feedback type (None, Fixed, Adaptive). There was a significant main effect of Stimulus, with lower FD for Movie (M = 0.28, SD = 0.30) than for Rest (M = 0.60, SD = 0.91), F(1, 23) = 4.77, p = .039. There was a significant main effect of Feedback type, with the lowest FD for the Fixed condition (M = 0.26, SD = 0.23), then the Adaptive condition (M = 0.45, SD = 0.61), and highest for No Feedback (M = 0.61, SD = 0.98), F(2, 46) = 3.8, p = .03. The Stimulus x Feedback type interaction was not significant (p = .15).
Given the potential effects of age and sex on in-scanner head motion, we conducted the same Stimulus x Feedback type ANOVA with the additional between-subjects factors Age group (younger [5-10 years old, n = 11], older [11-15 years old, n = 13]) and Sex (male [n = 14], female [n = 10]). Again, there were significant main effects of Stimulus, F(1, 20) = 8.26, p = .009, and Feedback type, F(2, 40) = 4.95, p = .012. There was also a significant main effect of Age group, such that the younger group (M = 0.74, SD = 0.79) had higher FD than the older group (M = 0.18, SD = 0.73), F(1, 20) = 6.36, p = .02. There was no main effect of Sex (p = .995). There was also a significant Stimulus x Age group interaction, F(1, 20) = 8.92, p = .007, and a significant Feedback x Age group interaction, F(2, 40) = 3.61, p = .036. No interactions with Sex were significant. Finally, the Stimulus x Feedback x Age group interaction was close to significant, F(2, 40) = 3.14, p = .054. Fig. 2a depicts the nature of this interaction, demonstrating that the effects of movie watching and feedback on FD were driven by the younger children (for individual subject results, see Supplementary Fig. 1). Post-hoc t-tests on the simple effects demonstrated in the younger age group a significant difference between Rest No Feedback and Movie No Feedback, t(10) = 2.4, p = .037, and a trend toward a significant difference between Rest No Feedback and Rest Fixed Feedback, t(10) = 2.2, p = .053. In the older age group, no simple effects were significant.
Though we counterbalanced the order of the conditions, we tested for an effect of time in the scanner by conducting a one-way ANOVA with Run as the within-subjects factor (7 levels for 7 runs, the first was the Baseline Rest run). There was no significant effect of Run (p = .67).
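The one-way within-subjects test of Run can be sketched as follows. This is a textbook one-way repeated-measures ANOVA under the assumption of a complete subjects-by-runs table of mean FD; it is not necessarily the software or the exact model used in the study. The function name `rm_anova_oneway` and the example values are hypothetical, and the p-value (which requires the F distribution) is not computed.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data[s][j] is the score of subject s in condition j (here: run j).
    Returns (F, df_condition, df_error).
    """
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 conditions")

    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Error term: variability left after removing condition and subject effects.
    ss_err = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_err = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_err / df_err)
    return f_stat, df_cond, df_err

# Made-up mean FD (mm) for 4 hypothetical subjects across 7 runs; NOT study data.
fd_by_run = [
    [0.30, 0.28, 0.35, 0.31, 0.29, 0.33, 0.30],
    [0.55, 0.60, 0.52, 0.58, 0.61, 0.50, 0.57],
    [0.20, 0.22, 0.19, 0.25, 0.21, 0.23, 0.20],
    [0.90, 0.85, 0.95, 0.80, 0.88, 0.92, 0.86],
]
f_stat, df1, df2 = rm_anova_oneway(fd_by_run)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Because each subject serves as his or her own control, stable between-subject differences in motion are removed from the error term before the effect of Run is evaluated.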

Number of frames retained after frame censoring
We also evaluated the effects of viewing movies and receiving online feedback on the number of frames retained (i.e., with FD < 0.3 mm) using the frame censoring approach described in the Methods. We ran repeated-measures ANOVAs with number of frames retained as the dependent variable and with the within-subjects factors Stimulus (Rest, Movie) and Feedback type (None, Fixed, Adaptive). We also included Age group and Sex as between-subjects factors. Again, we found a significant main effect of Stimulus, F(1, 20) = 22.73, p < .001, and Feedback type, F(2, 40) = 4.26, p = .021. There was also a significant main effect of Age group, such that fewer frames were retained in the younger group (M = 117.7, SD = 44.3) than in the older group (M = 146.1, SD = 40.7), F(1, 20) = 5.34, p = .032. There was no main effect of Sex (p = .32). The Stimulus x Age group interaction was significant, F(1, 20) = 22.74, p < .001. Fig. 2b shows that, as with mean FD, the effects were driven by the younger children. Post-hoc t-tests on the simple effects were consistent with the post-hoc results on mean FD. In the younger age group, there were significant differences between Rest No Feedback and Movie No Feedback, t(10) = 3.67, p = .004, and between Rest No Feedback and Rest Fixed Feedback, t(10) = 2.37, p = .039. In the older age group, no simple effects were significant.
As with mean FD, we ran a one-way ANOVA with Run as the within-subjects factor, and found no effect of Run (p = .62).
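The two outcome measures used in this section (mean FD and number of frames retained at FD < 0.3 mm) can be sketched as below. The sketch assumes the commonly used definition of framewise displacement from Power et al. (2012): the sum of absolute frame-to-frame changes in the six rigid-body parameters, with rotations converted to millimeters on a sphere of 50 mm radius. The exact computation described in the Methods may differ. The input layout, the function names, and the convention of assigning FD = 0 to the first frame are assumptions made for illustration.

```python
HEAD_RADIUS_MM = 50.0   # assumed sphere radius for converting rotations to mm
FD_THRESHOLD_MM = 0.3   # censoring threshold stated in the text

def framewise_displacement(params):
    """FD per frame from rows of (dx, dy, dz, rot_x, rot_y, rot_z).

    Translations are assumed to be in mm and rotations in radians.
    The first frame has no predecessor, so its FD is set to 0 here.
    """
    fd = [0.0]
    for prev, cur in zip(params, params[1:]):
        trans = sum(abs(c - p) for p, c in zip(prev[:3], cur[:3]))
        # Arc length on the sphere approximates displacement due to rotation.
        rot = sum(abs(c - p) * HEAD_RADIUS_MM for p, c in zip(prev[3:], cur[3:]))
        fd.append(trans + rot)
    return fd

def summarize_run(fd, threshold=FD_THRESHOLD_MM):
    """Return (mean FD, number of frames with FD below the threshold)."""
    retained = sum(1 for value in fd if value < threshold)
    return sum(fd) / len(fd), retained

# Made-up realignment parameters for a 4-frame run; NOT study data.
motion = [
    (0.0, 0.0, 0.0, 0.000, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.000, 0.0, 0.0),
    (0.1, 0.5, 0.0, 0.000, 0.0, 0.0),
    (0.1, 0.5, 0.0, 0.002, 0.0, 0.0),
]
mean_fd, frames_retained = summarize_run(framewise_displacement(motion))
print(f"mean FD = {mean_fd:.3f} mm, frames retained = {frames_retained}")
```

Mean FD summarizes the overall amount of motion, whereas the number of frames retained reflects how much usable data survives censoring; the two can diverge when motion is concentrated in a few large movements.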

No correlations between in-scanner motion & activity metrics (accelerometry)
Since age was significantly correlated with amount of sleep (total time in bed, p = .011; total sleep time, p = .048), we included age as a covariate when calculating correlations between sleep metrics and in-scanner motion. We found no significant correlations between any accelerometry metrics of daily movement or sleep and overall mean FD, mean FD during Rest runs, mean FD during Movie runs, overall number of frames retained, number of frames retained during Rest runs, and number of frames retained during Movie runs (all p's > 0.1). Thus, we did not find evidence that these real-world measures of daily movement or sleep predicted in-scanner head motion.
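Including age as a covariate in a correlation, as described above, amounts to a partial correlation. A minimal sketch is shown below, assuming a single linear covariate: both variables are regressed on age, and the residuals are correlated. The function names and example values are hypothetical and are not study data; significance testing is omitted.

```python
import math
from statistics import mean

def residuals(y, x):
    """Residuals of y after simple linear regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length samples."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def partial_corr(x, y, covariate):
    """Correlation between x and y after removing the linear effect of covariate."""
    return pearson(residuals(x, covariate), residuals(y, covariate))

# Made-up values for 8 hypothetical children; NOT study data.
age = [5, 6, 7, 8, 9, 10, 12, 14]
sleep_hours = [10.5, 10.0, 10.2, 9.4, 9.6, 9.0, 8.7, 8.1]
mean_fd = [0.90, 0.50, 0.70, 0.40, 0.45, 0.20, 0.25, 0.15]

print(f"raw r     = {pearson(sleep_hours, mean_fd):.2f}")
print(f"partial r = {partial_corr(sleep_hours, mean_fd, age):.2f}")
```

In data like these, where both sleep and head motion decline with age, the raw correlation can be large even when little association remains once age is accounted for, which is why the covariate matters.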

Seed maps and network structure are preserved qualitatively across conditions
FC maps of the six predefined, canonical seed regions exhibited the expected FC profiles. For example, a seed placed in the left angular gyrus produced correlations with other regions belonging to the default-mode network, including the homotopic angular gyrus and posterior cingulate cortex. Fig. 3 displays group averaged seed maps for the left angular gyrus and for the right motor cortex seed regions (see Supplementary Fig. 2 for all seed maps). The RSFC seed maps looked qualitatively similar across scan conditions. Direct statistical comparisons of the seed maps across conditions yielded no significant differences.
Correlation matrices constructed from 264 previously defined ROIs demonstrated the expected network structure, with strong within-network correlations and lower between-network correlations. This network structure was present in all conditions (Fig. 4).
[...] -19, 56) shown for the 17 subjects with useable FC data in every condition; additional seed maps in Supplementary Fig. 2.
D.J. Greene et al. NeuroImage 171 (2018) 234-245
[...] large number of tests and the need for multiple comparisons correction, these analyses were very conservative and may not reveal all true differences. OODA allows us to directly compare the correlation matrices as a whole between conditions, and therefore, may be more sensitive to detecting differences. These analyses revealed a significant difference [...] Fig. 5 displays the differences in FC between conditions. In order to interrogate the nature of the significant difference between Rest No Feedback and Movie No Feedback conditions, we ran post-hoc permutation analyses to identify specific network-to-network blocks that differed. Fig. 6 displays the results, demonstrating specific and systematic effects of movie watching. There were significant differences involving frontoparietal network FC with many other networks, including sensorimotor processing networks (somatomotor, auditory, visual), top-down control networks (cingulo-opercular, dorsal attention), and the default-mode network. There were also differences in FC within and between the visual network and between the auditory network and other networks. These results demonstrate that watching a movie alters FC within and between specific functional networks, involving both sensorimotor processing and top-down control.
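To make the idea of a block-wise permutation comparison concrete, the following is a minimal sketch of a paired sign-flip permutation test on one network-to-network block. The permutation scheme actually used for the post-hoc analyses is not specified in this section, so the sign-flip procedure, the function names, and the example values are all assumptions made for illustration. The sketch takes, for each subject, the mean FC of one block in two conditions and tests whether the mean within-subject difference departs from zero. Correction across blocks is not shown.

```python
import random

def block_mean(matrix, rows, cols):
    """Mean of matrix entries over the given row and column indices."""
    values = [matrix[r][c] for r in rows for c in cols]
    return sum(values) / len(values)

def sign_flip_p(diffs, n_perm=10000, seed=0):
    """Two-sided paired permutation p-value for mean(diffs) differing from 0.

    Under the null hypothesis the sign of each within-subject difference is
    exchangeable, so signs are flipped at random to build the null distribution.
    """
    rng = random.Random(seed)
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / n) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Made-up block-mean FC (Fisher z) for 17 hypothetical subjects; NOT study data.
rest = [0.21, 0.18, 0.25, 0.30, 0.15, 0.22, 0.27, 0.19, 0.24,
        0.20, 0.26, 0.17, 0.23, 0.28, 0.16, 0.22, 0.25]
movie = [0.12, 0.15, 0.14, 0.20, 0.10, 0.18, 0.16, 0.11, 0.19,
         0.13, 0.17, 0.12, 0.15, 0.21, 0.09, 0.14, 0.18]

diffs = [m - r for r, m in zip(rest, movie)]
print(f"p = {sign_flip_p(diffs):.4f}")
```

In practice `block_mean` would be applied to each subject's 264 x 264 correlation matrix using the ROI indices belonging to two networks, and the test would be repeated for every block with an appropriate correction for multiple comparisons.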

Discussion
Presenting engaging movies and providing real-time visual head motion feedback during MRI scanning reduces head motion in children. These effects were dependent on age, such that both of our outcome measures of motion (mean FD, number of frames with FD < 0.3 mm) were improved (lower mean FD, increased number of frames retained) in children younger than 11 years old, but not in older children. These results validate the anecdotal lore about the effectiveness of movies, demonstrate the success and feasibility of providing real-time head motion feedback, and provide insight into the ages that benefit. In addition, we found that movies significantly altered functional connectivity data compared to standard "pure" rest, while head motion feedback did not. Finally, we did not find that our measures of out-of-scanner daily movement or sleep predicted in-scanner head motion.
Fig. 4. Correlation matrices display functional connectivity between 264 previously defined regions of interest organized by network. Data are shown for the 17 subjects with useable FC data in every condition. The expected block structure is present for all conditions, demonstrating higher within than between network correlations. Aud = auditory; CB = cerebellum; CO = cingulo-opercular; DAN = dorsal attention network; DMN = default mode network; FP = frontoparietal; PMN = parietal memory network; Sal = salience; SC = subcortical; SM = somatomotor; SM(lat) = somatomotor lateral; VAN = ventral attention network; Vis = visual.

Clinical MRI
In clinical settings, nearly all brain MRIs are ordered for the purpose of identifying or ruling out clinically significant anatomical alterations (e.g., infarct, tumor) using structural MRI (i.e., T1, T2, etc.). In children and many other patient populations, images acquired when the patient is unsedated contain significant motion artifact. Thus, pharmacological sedation has become common practice, particularly in pediatrics. Our findings show that movie watching and real-time head motion feedback significantly reduce movement during MRI scans in younger children, suggesting that these behavioral strategies should be utilized during clinical MRIs in order to maximize the number of children who can undergo brain MRI without the risks of sedation. Interestingly, we did not find a significant compounding effect of combining movies and feedback. However, we did not evaluate the full search space of feedback parameters, and optimized parameters may further reduce head movement. Even so, selecting one strategy may be sufficient; alternatively, a flexible approach could be implemented in which one or the other method (or both) is tried until an adequate image is acquired. We find it quite encouraging that a strategy as simple as presenting a movie could reduce the need for sedation in some children, as it is a pleasant, cheap and completely safe method for acquiring better quality MRI images.
The use of sedation leads to less flexibility in potentially time-sensitive tests, as the scans cannot be conducted outside of working hours for the anesthesiologist or when the anesthesiologist is busy with another patient. In addition, repeated and prolonged exposure to anesthesia may have adverse effects on neurodevelopment (Sanders et al., 2013; Coleman et al., 2017). In light of these data and a recent FDA "Drug Safety Communication" warning (www.fda.gov/Drugs/DrugSafety/ucm532356.htm; Andropoulos and Greene, 2017), we propose that the clinical standard to conduct any brain MRI, particularly in pediatrics, should include presentation of an age-appropriate movie. Anecdotally, our clinical experience indicates that some adolescents prefer to listen to music during the scan. Thus, it would be worth testing the effects of listening to music on head motion and including music as an option in clinical scans. This observation further emphasizes the need for age-appropriate stimulation, which may be auditory rather than a movie.
Many pediatric radiology facilities currently have the capability to present movies and music and already do so. However, some hospitals and MRI centers are not yet set up for visual and auditory presentation. Therefore, our findings suggest that all MRI facilities that scan children should be equipped with visual and auditory presentation capabilities. Otherwise, brain MRIs for children under the age of 11 years will be unnecessarily degraded in quality or require sedation, incurring unnecessary risks. The benefits far exceed the small up-front cost of setting up audiovisual presentation capabilities, and will ultimately save money, allow for more flexibility in time of scans, and most importantly, be safer for the patient.
Fig. 5. Differences in FC between key conditions. Data are shown for the 17 subjects with useable FC data in every condition. Differences between movies and rest were structured (and significant); less (and not significant) differences between feedback conditions. Aud = auditory; [...]

Research MRI
Most research institutions are equipped to present visual and auditory stimuli during MRI acquisition, and most investigators allow research participants to watch a video or listen to music during structural brain MRI scans. Anecdotally, many investigators claim that presenting visual and/or auditory stimuli improves tolerability of the scans and helps to reduce head motion. Our data validate the efficacy of using movies to help with image quality in children 5-10 years old, providing empirical evidence to support some of these subjective claims. Interestingly, we did not find a similar benefit in children older than 10 years, though we might expect the benefits to extend to older ages in neuropsychiatric conditions known to increase movement (e.g., ADHD, Tourette syndrome, ASD). Of course, there is no apparent detriment to presenting a movie to older children as well as to adults during structural MRI scans, and making the scan experience as pleasant as possible is important.
For functional scans, there are additional factors to consider. Watching a movie or receiving real-time feedback will influence brain function to some degree, making the decision to use such strategies less straightforward than for structural MRI. Here, we compared movement during movie watching and feedback to movement during "rest" (i.e., relaxing with eyes open, viewing a fixation cross). Resting state functional connectivity (RSFC) MRI has become an increasingly popular approach for studying functional brain networks; the method measures correlations of the fMRI signal between brain regions while participants are at rest. Unfortunately, RSFC is vulnerable to motion artifacts (Power et al., 2012; Satterthwaite et al., 2012; Van Dijk et al., 2012), which is particularly problematic when studying populations prone to movement (e.g., children, clinical populations; see Fair et al., 2013). Therefore, our findings are likely to be exciting for investigators who study FC networks in younger children and potentially in other populations with increased movement.
Even though we found that standard FC seed maps showed the expected connectivity patterns and that correlation matrices demonstrated the expected block structure across conditions, FC differences between conditions emerged. Comparing rest to movie scans revealed significant, systematic differences in the correlation matrices, demonstrating that certain within and between network connections were more affected by movie watching than others. Specifically, visual and auditory network FC was altered, which is not surprising given that participants were watching a movie vs. looking at a fixation crosshair. In addition, frontoparietal network FC with many other networks was altered, reflecting changes in top-down control during movie watching (Dosenbach et al., 2006, 2007, 2008). It is worth noting that these results included imaging data from both the younger and older children in our sample, as each group was not large enough for separate analyses. Thus, it is possible that differences between age groups in FC may be revealed with enough power. Nevertheless, our findings are consistent with previous reports of FC differences between movies and rest in children and adults (Betti et al., 2013; Emerson et al., 2015), and between task and rest, despite preserved network organization (Gratton et al., 2016). Given these alterations in FC, researchers should keep in mind that FC during movie watching cannot be equated to FC during standard rest.
Interestingly, there were no significant differences in the correlation matrices between the no feedback and feedback conditions. In our data processing, we removed data points during feedback using a volume censoring approach, and in doing so, likely removed much of the effects of feedback on brain function. Still, head-motion feedback may place the participant in a feedback task state that is not identical to pure rest, but we demonstrated that the effect on FC was not significant when volume censoring was applied.
Our results have important implications for future developmental neuroimaging studies. For structural MRI scans, there is no apparent reason not to use these behavioral methods to reduce head motion. For functional scans, investigators must make carefully informed decisions. If measuring FC during pure rest (viewing a fixation cross only) is the goal, presenting movies will help reduce motion artifact in younger children, but will also affect the functional data itself, whereas real-time head motion feedback may not. Given that we did not find a significant benefit in children older than 10 years old, studies in typically developing children aged 11+ may not need to present movies or feedback and can use pure rest to avoid potential effects on FC.

Predicting in-scanner head motion
Our accelerometry-based measures of real-world activity (movement and sleep) failed to predict in-scanner head motion. Being able to predict which children will and will not be able to hold still well enough to produce high quality MRI images would be quite useful. From our null results, we cannot conclude that real-world motor activity and in-scanner head motion in children are completely independent, since we may have simply been underpowered. If our null results were to be confirmed by larger studies, it would go against the notion that there is a head motion endophenotype, which contends that some differences between higher and lower-motion subjects are due to systematic brain differences related to their general propensity to move (Zeng et al., 2014). However, one could propose that this endophenotype is limited to in-scanner motion only, without any relationship to real-world motor activity.

Limitations and future directions
The youngest child in the present study was 5 years old. Future work should test behavioral strategies for holding still in even younger children. It is important to test whether such strategies make unsedated MRIs more feasible in 3- and 4-year-old children. The need for clinical MRIs does not discriminate by age, and MRI research almost entirely excludes children younger than 5 years (unless sleeping, which brings its own issues and confounds). If something as simple as playing an age-appropriate movie with or without feedback training would make unsedated clinical scans and research scans feasible in even a fraction of very young patients/research participants, it would be well worth it. In addition, future work can test the generalizability of our results across different demographics, including aging cohorts and different patient populations.
While our sample included some children with neurodevelopmental disorders (ADHD, OCD, tics), most were typically developing and high functioning. The participants' average IQ was above average and their socio-economic status (measured by an assessment of social status) was relatively higher than average. Therefore, it is possible that the older children who did not show a significant benefit from movies and feedback were already particularly good at following the direction to hold still in the scanner. In addition, it is likely that our sample differs demographically from patients. Thus, we might expect to find a larger reduction in head motion during movies and feedback, or even a compounded effect of both strategies, in children undergoing clinical MRIs. Future studies should test the effects of movies and head motion feedback on clinical populations across a range of ages in order to fully explore the clinical utility of these methods.
Our analyses were based on head motion during functional, not structural, MRI scans. Yet, we do not see a reason to assume that motion would differ for structural scans. In fact, volumetric navigator sequences, developed for prospective motion correction in T1 scans, insert functional volumes in between structural data collection in order to track motion (Tisdall et al., 2012). Therefore, we contend that extending our results to structural MRI is appropriate, feasible, and valid.
Future work must strive to further optimize behavioral strategies for reducing motion during MRI scans. The movie clips used in the present study were chosen from recent cartoon blockbusters based on our intuitions for being engaging, but not too exciting or upsetting. Testing a variety of movie clips will help home in on specific segments of movies that are the most conducive to holding still. In addition, listening to music is another strategy worth testing, and it may be more effective for the adolescent age range. With respect to head motion feedback, the FD thresholds that triggered feedback were based on the FDs from each individual's baseline resting state scan. There may be certain feedback thresholds that optimize the ability to hold still in different populations. Interestingly, when testing the simple effects of feedback in the younger child group (5-10 years old), Fixed Feedback (i.e., set FD thresholds throughout the scan) significantly reduced head motion, but Adaptive Feedback (i.e., FD thresholds that adapt to the participant's behavior as the scan proceeds) did not. If initial head motion was very large, it might take a long time for the thresholds to drop into the high-quality range in the Adaptive condition. Therefore, starting the Adaptive Feedback scans with a more aggressive threshold might improve their performance. So far, we have only touched on a tiny corner of the full parameter space to be searched.
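The reasoning about slow threshold descent in the Adaptive condition can be illustrated with a toy calculation. The update rule below is purely hypothetical: it assumes that the feedback threshold shrinks by a fixed proportion each time the participant succeeds, which is not the rule used in the study (the actual adaptive algorithm is not described in this section). The function name, the 10% shrink factor, and the starting values are assumptions chosen only to show how the starting threshold affects the time needed to reach the 0.3 mm range.

```python
def steps_to_target(start_mm, target_mm=0.3, shrink=0.9):
    """Count multiplicative shrink steps until the threshold is <= target.

    Hypothetical rule: after each successful block the threshold is
    multiplied by `shrink` (e.g., 0.9 = 10% tighter).
    """
    if not 0 < shrink < 1:
        raise ValueError("shrink must be between 0 and 1")
    if start_mm <= 0 or target_mm <= 0:
        raise ValueError("thresholds must be positive")
    steps = 0
    threshold = start_mm
    while threshold > target_mm:
        threshold *= shrink
        steps += 1
    return steps

# Hypothetical starting thresholds (mm) derived from a baseline scan.
for start in (2.0, 0.6):
    print(f"start = {start} mm -> {steps_to_target(start)} successful steps needed")
```

Under this assumed rule, a child whose baseline motion yields a high starting threshold needs many more consecutive successes before the feedback becomes informative about high-quality frames, which is consistent with the suggestion to begin Adaptive Feedback with a more aggressive threshold.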
It is possible that real-time head motion feedback could have a beneficial effect on subsequent scans. We were underpowered to test for such effects, and we used a counterbalanced design to control for order effects. However, it would be valuable to test whether real-time feedback can be used to train individuals to hold still on future scans. If this approach works, one could begin with real-time head motion feedback and then switch to scans without feedback once sufficiently trained. In research, this type of approach would be beneficial to fMRI studies aimed at measuring brain function during task or those investigating FC during rest. In addition, if effective, conducting such training outside the scanner first would result in large cost savings of actual MRI time. Currently, training outside the scanner relies on head motion sensors, which are problematic because they can be misled by scalp motion, as discussed above.
We plan to add real-time, visual feedback capabilities to our FIRMM software in a way that allows experimenters and clinicians to choose their own feedback parameters with the ultimate goal of increasing our ability to scan pediatric patients.