The multivariate physical activity signature associated with metabolic health in children and youth: An International Children's Accelerometry Database (ICAD) analysis.

There is solid evidence for an association between physical activity and metabolic health outcomes in children and youth, but for methodological reasons most studies describe the intensity spectrum using only a few summary measures. We aimed to determine the multivariate physical activity intensity signature associated with metabolic health in a large and diverse sample of children and youth, by investigating the association pattern for the entire physical activity intensity spectrum. We used pooled data from 11 studies and 11,853 participants aged 5.8-18.4 years included in the International Children's Accelerometry Database. We derived 14 physical activity variables from accelerometry (ActiGraph) covering the intensity spectrum (from 0-99 to ≥8000 counts per minute). To handle the multicollinearity among these variables, we used multivariate pattern analysis to establish the associations with indices of metabolic health (abdominal fatness, insulin sensitivity, lipid metabolism, blood pressure). A composite metabolic health score was used as the main outcome variable. Associations with the composite metabolic health score were weak for sedentary time and light physical activity, but gradually strengthened with increasing time spent in moderate and vigorous intensities (up to 4000-5000 counts per minute). Association patterns were fairly consistent across sex and age groups, but varied across different metabolic health outcomes. This novel analytic approach suggests that vigorous intensity, rather than less intense activities or sedentary behavior, is related to metabolic health in children and youth.


https://doi.org/10.1016/j.ypmed.2020.106266 Received 24 May 2020; Received in revised form 24 August 2020; Accepted 21 September 2020

Introduction
There is clear evidence of favorable associations between physical activity (PA) and metabolic health outcomes in children. While associations are evident for moderate-to-vigorous PA (MVPA) and vigorous PA (VPA), associations appear to be weak for light PA (LPA) and sedentary time (SED) (Ekelund et al., 2012;Andersen et al., 2006;Janssen and LeBlanc, 2010;Poitras et al., 2016;Cliff et al., 2016;Aadland et al., 2018a). However, few studies include the entire PA intensity spectrum in their analyses and many studies summarize all intensities above walking into one category (MVPA), which limits information about the importance of specific intensities in the moderate to vigorous range. Capturing the entire intensity spectrum is important to avoid loss of information and residual confounding (Poitras et al., 2016;Aadland et al., 2018a;van der Ploeg and Hillsdon, 2017). Accordingly, associations across the entire PA intensity spectrum, including SED, should be examined to obtain a complete picture and to ease interpretation of associations between PA and health outcomes. This aim has traditionally been difficult to address, as researchers have mainly relied on statistical methods that cannot handle multicollinearity among the explanatory variables. Aadland et al. (2018a) recently applied multivariate pattern analysis to address the multicollinearity challenge of accelerometer-derived PA data. This analytical approach provides a solution to limitations imposed by traditional statistical approaches, as it can model any number of completely multicollinear variables (Wold et al., 1984).
Thus, multivariate pattern analysis allows multiple variables across the entire PA intensity spectrum to be modelled and hence uses the rich information embedded in the acceleration signal, which can greatly improve the information obtained from accelerometry (Aadland et al., 2018a;Aadland et al., 2019a).
The recent application of multivariate pattern analysis to the field of PA epidemiology provides promising results in terms of how researchers may better exploit and model accelerometry-derived PA data. However, the previous studies (Aadland et al., 2018a;Aadland et al., 2019a) only included one cohort of 10-year-old children. Thus, these findings need verification and extension using a larger and more diverse sample of children. Therefore, the aim of the present study was to determine the PA intensity signatures associated with metabolic health outcomes in the International Children's Accelerometry Database (ICAD), which includes a large sample of children aged 6-18 years from culturally diverse settings.

Study design
The International Children's Accelerometry Database (ICAD) (http://www.mrc-epid.cam.ac.uk/research/studies/icad/) is a database that contains pooled data on accelerometer-determined PA, SED, and related health outcomes in children and adolescents from 21 studies from 10 different countries. The aims, selection and design of studies, as well as data reduction procedures and methods of the ICAD database have been described elsewhere (Sherar et al., 2011).

Participants
In the present analyses, we used data from children and adolescents aged 6-18 years from 11 studies from Europe (EYHS Denmark, Estonia, Norway, and Portugal (Andersen et al., 2006), ALSPAC (Golding et al., 2001), CoSCIS (Eiberg et al., 2005), KISS (Zahner et al., 2006), PANCS (Kolle et al., 2010)), the United States (NHANES 2003-2004 (National Health and Nutrition Examination Survey, 2005), NHANES 2005 (National Health and Nutrition Examination Survey, 2010)), and Brazil (Pelotas (Victora et al., 2007)). Data were collected between 1997 and 2007, and studies included cross-sectional, longitudinal, and intervention designs. A detailed overview of the studies is provided by Sherar et al. (2011). When several waves of data were available (i.e., when participants were measured at multiple time points), we included only the first wave to limit the sample to unique observations. The included studies provided data on PA and at least one of the metabolic risk factors of interest. All participants and/or their parents/legal guardian provided informed consent and all study protocols were approved by local ethical committees.

Physical activity
A detailed description of the assessment and data reduction procedures of PA has been published previously (Sherar et al., 2011). Briefly, accelerometer data for the vertical axis from all studies were reprocessed and reanalyzed for unification across studies using the KineSoft software version 3.3.20 (Loughborough, UK). Data were reintegrated to 60-s epochs and non-wear periods of at least 60 min of consecutive zeros (allowing for two minutes of non-zero interruptions) were excluded. Inclusion criteria were a valid wear time of 10-16 h/day (i.e., excluding individuals with overnight wear) and ≥ 4 days/week.
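The data reduction described above can be illustrated with a minimal sketch that counts minutes per intensity bin and applies the wear-time inclusion criteria. The bin edges, function names, and example values below are assumptions for illustration only; the source states only that 14 variables span 0-99 to ≥8000 counts per minute, and non-wear removal is assumed to have been done already.

```python
from bisect import bisect_right

# Hypothetical lower edges (counts per minute) of 14 intensity bins.
# The source only says the bins run from 0-99 to >=8000 cpm.
BIN_EDGES = [0, 100, 500, 1000, 1500, 2000, 2500, 3000,
             3500, 4000, 4500, 5000, 6000, 8000]


def minutes_per_bin(counts_per_minute):
    """Count the 60-s epochs (minutes) that fall in each intensity bin."""
    minutes = [0] * len(BIN_EDGES)
    for cpm in counts_per_minute:
        # bisect_right finds how many edges are <= cpm; the last bin is open-ended.
        minutes[bisect_right(BIN_EDGES, cpm) - 1] += 1
    return minutes


def is_valid_day(wear_minutes):
    """A valid day has 10-16 h of wear time (600-960 minutes)."""
    return 600 <= wear_minutes <= 960


def is_included(daily_wear_minutes, min_days=4):
    """A participant is included with at least four valid days."""
    return sum(is_valid_day(m) for m in daily_wear_minutes) >= min_days
```

Dividing each bin total by the number of valid days would give minutes/day per intensity, which is the form the explanatory variables take in the analyses.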

Metabolic health measures
Height and weight were measured using standardized methods in all studies. We calculated body mass index (BMI; kg/m²). For descriptive purposes, we further reported the proportions of individuals being overweight and obese based on the age- and sex-specific cut-offs suggested by Cole et al. (2000).
We used seven cardio-metabolic variables as outcomes; abdominal adiposity (waist circumference (WC)) and resting systolic blood pressure (SBP) from 11 studies (Andersen et al., 2006;Golding et al., 2001;Eiberg et al., 2005;Zahner et al., 2006;Kolle et al., 2010;National Health and Nutrition Examination Survey, 2005;National Health and Nutrition Examination Survey, 2010;Victora et al., 2007), lipid metabolism (triglycerides (TG), total cholesterol (TC) and high-density lipoprotein (HDL)-cholesterol) from 10 studies (Andersen et al., 2006;Golding et al., 2001;Eiberg et al., 2005;Zahner et al., 2006;Kolle et al., 2010;National Health and Nutrition Examination Survey, 2005;National Health and Nutrition Examination Survey, 2010), and glucose metabolism (insulin and glucose) from nine studies (Andersen et al., 2006;Eiberg et al., 2005;Zahner et al., 2006;Kolle et al., 2010;National Health and Nutrition Examination Survey, 2005;National Health and Nutrition Examination Survey, 2010). WC was measured using an anthropometric measurement tape at the height of the umbilicus at the end of a normal expiration, except in the National Health and Nutrition Examination Survey (NHANES) where WC was measured just above the iliac crest at the mid-axillary line (National Health and Nutrition Examination Survey, 2010). WC:height ratio was used for analysis. Blood pressure was measured during rest using manual (National Health and Nutrition Examination Survey, 2005;National Health and Nutrition Examination Survey, 2010) or automatic (Andersen et al., 2006;Golding et al., 2001;Eiberg et al., 2005;Kolle et al., 2010;Victora et al., 2007) methods. The average of two, three or four recordings was used for analysis. All blood samples were drawn from fasting individuals. We calculated the TC:HDL ratio and homeostasis model assessment of insulin resistance (HOMA) (Matthews et al., 1985), which were used for the association analyses.
We calculated a composite metabolic health score as the mean of five variables (WC:height ratio, SBP, TG, TC:HDL ratio, and HOMA). The score was constructed after adjustment of all variables for sex and age by obtaining standardized residuals from linear regression. Similar approaches have been used previously (Andersen et al., 2006;Aadland et al., 2018a). We regard this composite score as the main outcome. Additionally, we performed a sensitivity analysis using a composite score excluding WC:height ratio, to reduce the influence of fatness on the model. We also analyzed each of the five risk factors individually.
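As a rough illustration of how such a composite score can be built, the sketch below regresses each risk factor on age and sex, standardizes the residuals, and averages them per participant. It is an assumption-laden sketch, not the authors' code: the function names are hypothetical, sex is assumed to be coded 0/1, residuals are scaled by their population SD, and every participant is assumed to have all risk factors measured.

```python
from statistics import mean, pstdev


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def standardized_residuals(y, age, sex):
    """Residuals of y regressed on intercept, age, and sex, scaled to SD 1."""
    X = [[1.0, a, s] for a, s in zip(age, sex)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    beta = solve(XtX, Xty)  # ordinary least squares via the normal equations
    res = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sd = pstdev(res)
    return [e / sd for e in res]


def composite_score(risk_factors, age, sex):
    """Mean of the standardized residuals across risk factors, per participant."""
    z = [standardized_residuals(y, age, sex) for y in risk_factors]
    return [mean(values) for values in zip(*z)]
```

Because every risk factor is oriented so that higher values are less favorable, a lower composite score indicates better metabolic health.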

Statistical analyses
Descriptive characteristics were reported as frequencies, means, standard deviations (SD), and medians (time spent in PA intensities). Associations between PA and metabolic health were determined using multivariate pattern analysis. All analyses were adjusted for age and sex by using residuals from linear regression for all outcome variables including age and sex as independent variables. We also included sensitivity analyses adding cohort as a random effect in a linear mixed model to further adjust for potential differences between studies.
We used partial least squares (PLS) regression (Wold et al., 1984) to determine the multivariate association pattern between metabolic health measures (outcome variables) and the PA intensity spectrum (explanatory variables), as shown previously (Aadland et al., 2018a;Aadland et al., 2019b). Briefly, PLS regression decomposes the explanatory variables into a few orthogonal PLS components (latent variables), while maximizing the covariance with the outcome variable. This procedure is able to handle completely multicollinear variables (Wold et al., 1984). Given the strong correlations among the explanatory variables when using a spectrum description of PA (Aadland et al., 2019c), each variable provides limited unique information about the outcome. Thus, their unique contribution to the outcome is neither meaningful nor possible to estimate. Association estimates are therefore not independent of each other (Aadland et al., 2019b), which means the interpretation of associations differs from those of ordinary linear regression.
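To make the decomposition concrete, the following is a minimal sketch of single-outcome PLS regression (PLS1) written with NumPy. It is not the Sirius implementation used in the study; the function name and the choice of algorithm variant are assumptions. It shows why collinear explanatory variables pose no problem: each component is a weighted sum of all variables, and no inversion of the full covariance matrix of the explanatory variables is required.

```python
import numpy as np


def pls1(X, y, n_components):
    """PLS1 regression; returns coefficients for mean-centered X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    Xk = Xc.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc                 # weights maximize covariance with the outcome
        w /= np.linalg.norm(w)
        t = Xk @ w                    # component scores (latent variable)
        p = Xk.T @ t / (t @ t)        # loadings of the explanatory variables
        q.append(yc @ t / (t @ t))    # loading of the outcome
        Xk = Xk - np.outer(t, p)      # deflate so the next component is orthogonal
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Only a small (components x components) system is solved here.
    return W @ np.linalg.solve(P.T @ W, q)
```

With two perfectly collinear columns in `X`, ordinary least squares has no unique solution, whereas this sketch returns finite and identical coefficients for both columns. That also illustrates why the coefficients cannot be read as independent contributions.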
We validated all models using Monte Carlo resampling (Kvalheim et al., 2018) with 100 repetitions, randomly selecting 50% of the observations as an external validation set in each repetition. For each PLS model, we used target projection (Kvalheim and Karstang, 1989;Rajalahti and Kvalheim, 2011) followed by reporting of selectivity ratios with 95% confidence intervals (CIs). These estimates show the direction and explained variance (R²) for each PA intensity variable with the predicted outcome in the multivariate space (Aadland et al., 2019b;Rajalahti et al., 2009a;Rajalahti et al., 2009b). For example, a selectivity ratio of 0.50 combined with a total model R² of 10% means that the variable explains 5% of the variance in the actual outcome. Additionally, we reported the association using unstandardized estimates (Aadland et al., 2019b) to allow for an interpretation of the importance of a higher or lower duration (in minutes/day) among PA intensities.
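The resampling scheme and the worked example above can be sketched as follows. The function names, the fixed seed, and the use of index lists are assumptions for illustration; the model fitting that would happen inside each repetition is omitted.

```python
import random


def monte_carlo_splits(n_obs, repetitions=100, validation_fraction=0.5, seed=0):
    """Yield (calibration, validation) index lists for repeated random splits."""
    rng = random.Random(seed)
    n_val = int(n_obs * validation_fraction)
    for _ in range(repetitions):
        idx = list(range(n_obs))
        rng.shuffle(idx)
        # The first n_val shuffled indices form the external validation set.
        yield idx[n_val:], idx[:n_val]


def explained_share(selectivity_ratio, model_r2):
    """Share of outcome variance explained by one variable: |SR| x model R^2."""
    return abs(selectivity_ratio) * model_r2


# Worked example from the text: SR = 0.50 and model R^2 = 10%.
share = explained_share(0.50, 0.10)  # 0.05, i.e., 5% of the outcome variance
```

In each repetition a model would be fitted on the calibration indices and evaluated on the validation indices, which guards against choosing too many PLS components.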
The association patterns related to metabolic health were compared by age group and sex (5.8-11.9-year-old boys and girls and 12.0-18.4-year-old boys and girls) by performing the analyses separately for these four subgroups. The multivariate PA signatures were compared among groups by correlating association patterns using Pearson's r.
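Comparing two signatures in this way amounts to correlating their per-intensity estimates. A minimal sketch is given below; the function name and the two example patterns are hypothetical and are not results from the study.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equally long association patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical selectivity ratios across five intensity bins for two subgroups.
pattern_group_1 = [0.02, -0.05, -0.20, -0.45, -0.30]
pattern_group_2 = [0.05, -0.02, -0.25, -0.40, -0.35]
similarity = pearson_r(pattern_group_1, pattern_group_2)
```

A value close to 1 indicates that the two subgroups share the same shape of association across the intensity spectrum, even if the explained variance differs.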
The multivariate pattern analyses were performed by means of the commercial software Sirius version 11.0 (Pattern Recognition Systems AS, Bergen, Norway).
Fig. 1. The multivariate physical activity signature associated with metabolic health in children and youth. The composite score includes waist circumference to height ratio, systolic blood pressure, homeostasis model assessment of insulin resistance, total to high-density lipoprotein cholesterol ratio, and triglyceride (a lower score is more favorable). The PLS regression model includes five components and is adjusted for age and sex. The selectivity ratio for each variable is the explained to total variance of the predictive (target projected) component. A negative bar implies that increased physical activity is associated with better metabolic health. R² = explained variance.

Participants' characteristics
We included 11,853 participants in the analyses who provided valid data on age, sex, PA, and at least one outcome variable (Table 1). The overall number of participants varied from 4185 to 11,735 across models for single risk factor outcomes, whereas 4105 children provided data for the composite score (n = 917-1127 for age- and sex-specific groups) (Supplemental Table 1). Total accelerometer wear time was mean (SD) 780 (67) minutes/day (mean 771-795 min/day across sex and age groups) accumulated across a median of six wear days. Time accumulated across the intensities is shown in Supplemental Table 2.
Fig. 1 shows the association pattern between the entire PA spectrum and the composite metabolic health score (R² = 4.2%). Associations were very weak for intensities lower than 1000 cpm, but gradually strengthened for intensities from 1000-1499 cpm to 4000-4499 cpm, for which more time spent in PA was associated with better metabolic health. Associations weakened for intensities higher than 4500 cpm. Sensitivity analyses including adjustment for study (R² = 2.7%, Supplemental Fig. 1) or with exclusion of WC:height ratio from the composite score (R² = 3.4%, Supplemental Fig. 2) did not alter the association patterns (r between these association patterns and the pattern shown in Fig. 1 = 0.98 and 0.94, respectively).

Associations between PA and metabolic health
Association patterns between the entire PA spectrum and the composite metabolic health score were fairly consistent across sex and age groups (R² = 3.3-7.0%; r for association patterns = 0.76-0.95 across subgroups) (Fig. 2). However, a somewhat stronger unfavorable association for 0-99 cpm was found for boys than for girls, and a higher explained variance was found for the 6-12-year-old girls compared to other groups. Adjustment for study had a minor impact on association patterns (r = 0.81-0.98 for patterns adjusted and unadjusted for study), but reduced the explained variance for 6-12-year-old girls from 7.0 to 3.8% (i.e., the results became more similar to other groups).
We found some variation in associations for the five single risk factors (Fig. 3). For SBP we did not find a significant predictive association pattern, whereas explained variances were 1.7, 1.7, 2.7 and 4.2% for TG, TC:HDL ratio, HOMA, and WC:height ratio, respectively.
Associations for WC:height ratio and HOMA gradually strengthened up to 4000-4999 cpm and thereafter decreased. For TG and TC:HDL ratio, associations gradually strengthened up to 1500-2499 cpm, then declined (TG) or plateaued (TC:HDL ratio), before associations strengthened again and peaked at 6000-7999 cpm. Adjustment for study had a minor impact on association patterns (r = 0.85-0.99 for patterns adjusted and unadjusted for study), though we did not find a predictive model for TG.
The relative importance of each minute of PA in different intensities for the composite metabolic health score is shown in Supplemental Fig. 3. Whereas more time spent in 0-99 cpm was associated with a deterioration of metabolic health (0.00035 SDs per min/day), more time spent in other intensities was associated with improved metabolic health. Associations gradually strengthened for intensities from 100-499 cpm (−0.00066 SDs per min/day) up to 4500-4999 cpm (−0.05207 SDs per min/day), and thereafter weakened.

Discussion
To handle many strongly correlated PA intensity variables from accelerometry, we investigated the multivariate PA signature associated with metabolic health in a large and diverse sample of children by means of multivariate pattern analyses. Extending previous findings using this type of analysis applied to PA data (Aadland et al., 2018a;Aadland et al., 2019a), this novel approach shows how the whole intensity spectrum of PA is associated with metabolic health in childhood. Our results show the strongest associations with metabolic health for vigorous intensities, whereas associations were weaker for lower intensities, in particular for time spent sedentary.
E. Aadland, et al. Preventive Medicine 141 (2020) 106266
Consistent with previous studies and recommendations (Ekelund et al., 2012;Janssen and LeBlanc, 2010;Poitras et al., 2016;Cliff et al., 2016;Aadland et al., 2018a;Aadland et al., 2019a), our findings support the recommendation that children and youth should spend time in moderate to vigorous intensities to improve their metabolic health. However, our findings suggest that vigorous intensities are more important than previously believed. The strongest association with metabolic health was found for an intensity of 4000-5000 cpm, which is suggested as an appropriate threshold for classification of vigorous intensity (Trost et al., 2011). This accelerometer output is achieved for brisk walking or slow running at ≈ 6 km per hour in children and adolescents (Supplemental Table 3). However, in the present study, participants' PA was summed over 60 s. Since children's PA is characterized by sporadic and intermittent bursts of activity most often lasting less than 10 s (Sanders et al., 2014;Aadland et al., 2018b), summation of PA over longer periods ("epochs") misclassifies and masks vigorous activities like running and jumping (Aadland et al., 2019a). A recent study compared the PA signatures associated with metabolic health in children derived from 1-, 10-, and 60-s epoch data and found that the strongest associations were observed for 7000-8000, 5500-6500, and 4000-5000 cpm, respectively (Aadland et al., 2019a). Thus, when using longer as compared to shorter epoch periods, association patterns were substantially biased towards lower intensities. Interestingly, when using 60-s epochs, the association patterns were similar in the previous (Aadland et al., 2019a) and the present study. Unfortunately, the ICAD data are only available with 60-s epochs. A similar misclassification could be a reality in much of the prevailing literature, as epoch periods of 10-60 s are most commonly used (Cain et al., 2013;Migueles et al., 2017). Consistent with research on children and youth's activity patterns (Sanders et al., 2014;Aadland et al., 2018b), we expect that most individuals do not obtain their PA from brisk walking. Rather, the stronger associations for higher intensities when using a short epoch probably show that the health effect of PA is achieved during intermittent vigorous intensity activities involving running and jumping.
Shifting the focus to the lower end of the intensity spectrum, we observed a very weak association between SED (i.e., 0-99 cpm) and metabolic health, which is consistent with current evidence (Ekelund et al., 2012;Cliff et al., 2016;Aadland et al., 2018a;Hansen et al., 2018). This finding seems to be consistent across epoch settings (Aadland et al., 2019a).
Similarly, and also consistent with previous findings (Poitras et al., 2016;Aadland et al., 2018a;Aadland et al., 2018b;Hansen et al., 2018), LPA intensities (i.e., ≈ 100-1999 cpm) showed weak associations with metabolic health, especially considering the biased association profile resulting from the 60-s epoch setting (Aadland et al., 2019a). As shown previously (Aadland et al., 2019a), VPA is partly captured as moderate PA (MPA) and MPA is partly captured as LPA when using a 60- versus a 1-s epoch setting; the association for LPA shown herein is therefore likely overestimated by misclassification of MPA. Taken together, our findings show no meaningful associations for time spent in SED and LPA with metabolic health in children and youth.
Fig. 2. The multivariate physical activity signatures associated with metabolic health by sex and age. The composite score includes waist circumference to height ratio, systolic blood pressure, homeostasis model assessment of insulin resistance, total to high-density lipoprotein cholesterol ratio, and triglyceride (a lower score is more favorable). The PLS regression models are adjusted for age and sex and include two, one, four, and one components, respectively, for 6-12-year-old boys, 12-18-year-old boys, 6-12-year-old girls, and 12-18-year-old girls. The selectivity ratio for each variable is the explained to total variance of the predictive (target projected) component. A negative bar implies that increased physical activity is associated with better metabolic health. R² = explained variance.
Our findings are generally consistent with previous studies from the ICAD database suggesting that substituting time spent in SED and LPA with time in MVPA is favorably associated with metabolic health (Ekelund et al., 2012;Hansen et al., 2018;Wijndaele et al., 2019;Tarp et al., 2018). The exception is the association for SBP: While we did not find a predictive association pattern, consistent with a previous study using similar methodology (Aadland et al., 2018a), weak significant associations with MVPA have been observed in previous studies (Ekelund et al., 2012;Wijndaele et al., 2019), but only in a subgroup of adolescents (Hansen et al., 2018). This could be a result of our thorough validation of regression models. Since the previous studies used predefined intensity categories of SED, LPA, and MVPA (Ekelund et al., 2012;Hansen et al., 2018;Wijndaele et al., 2019) or accumulated time above 500, 1000, 2000, and 3000 cpm (Tarp et al., 2018), they do not provide detailed knowledge of specific intensities' association with metabolic health. While a more detailed intensity spectrum and PLS regression can provide more nuanced information on association patterns across the intensity spectrum, as shown herein, a direct interpretation of our findings with respect to the number of minutes/day children should spend in specific intensities for improved metabolic health is challenging (Aadland et al., 2019b). For this purpose, a traditional PA description and isotemporal substitution models may be useful (Aadland et al., 2019b). For example, based on the ICAD data, it is estimated that substituting 30 min/day of SED with MVPA is associated with a 1.5 cm reduced WC (Wijndaele et al., 2019), and that this association strengthens with increased age (Hansen et al., 2018). Importantly, the PA does not need to be accumulated in prolonged bouts (Aadland et al., 2018b;Tarp et al., 2018).
Thus, the different methodological approaches may complement each other in informing the evidence base of PA epidemiology and PA guideline development.
Compared to previous studies that have modelled the associations between the whole intensity spectrum and metabolic health in children (Aadland et al., 2018a;Aadland et al., 2019a), the explained variance was considerably lower in the present study (4.2 versus 10.8, 13.4, and 17.0% explained variance for 60-, 10-, and 1-s datasets, respectively). One possible explanation for the further weakening of the association (R² = 4.2 versus 10.8% using 60-s epoch) may be the lack of aerobic fitness in the composite score in the present study. Among the six single risk factors included by Aadland et al. (Aadland et al., 2018a;Aadland et al., 2019a), aerobic fitness was most strongly associated with PA. Another possible reason for the attenuation could be measurement error due to the application of different measures and protocols across studies. However, this factor does not seem to be important, as accounting for cohort in our analysis did not improve model fit. Furthermore, most of the cohorts included in the ICAD have applied older ActiGraph models, more specifically the AM7164, which is more prone to drift and breakdown from wear and tear, compared to newer generations, for example the GT3X+, used in our previous studies (Aadland et al., 2018a;Aadland et al., 2019a).
Fig. 3. The multivariate physical activity signatures associated with different indices of metabolic health. The PLS regression models are adjusted for age and sex. WC:height ratio = waist circumference to height ratio (seven components); HOMA = homeostasis model assessment of insulin resistance (two components); TC:HDL ratio = total to high-density lipoprotein cholesterol ratio (five components); TG = triglyceride (six components). The SR for each variable is calculated as the ratio of explained to residual variance on the predictive (target projected) component. A negative bar implies that increased physical activity is associated with better metabolic health. R² = explained variance.

Strengths and limitations
The main strength of the present study was the use of multivariate pattern analysis to handle the dependency among the PA variables across the intensity spectrum. This method is a novel and promising alternative to ordinary least squares regression, because it can handle multicollinear data sets (Wold et al., 1984;Aadland et al., 2019c;Rajalahti and Kvalheim, 2011). Importantly, this approach does not require pre-defined accelerometer cut points and therefore provides a solution to the cut point conundrum, which confuses the field and hampers comparison across studies. Furthermore, with regard to generalizability, the inclusion of a large and diverse sample of children from the ICAD database is an important strength of this study over previous studies using similar methodology (Aadland et al., 2018a;Aadland et al., 2019a;Aadland et al., 2018b).
Accelerometers do not provide "true" PA levels, as behavior changes over time, some activities might be poorly captured by accelerometry, and several analytic choices, for example epoch length (Aadland et al., 2019a), can affect data considerably. Measurement error attenuates associations and increases the chance of type II errors (Hutcheon et al., 2010). As it is well known that frequency filtering (Brage et al., 2003;John et al., 2012) causes a leveling-off of ActiGraph counts for running at higher speeds, the attenuated associations for the highest PA intensities (≥ 5000 cpm) are likely a spurious finding caused by underestimation of these activities.
We adjusted only for age and sex in our primary analyses. In sensitivity analyses, we additionally adjusted for cohort and removed WC:height ratio from the composite metabolic health score to remove the influence of adiposity. As expected, these adjustments reduced the explained variance of the models, whereas association patterns were robust. Further adjustment for maturation and parents' education level did not change any findings (results not shown). We argue these findings show that our association patterns are stable, though residual confounding by, for example, diet, could influence the results.
Because our results are derived from a cross-sectional analysis, causality cannot be inferred from our findings. However, as argued previously (Aadland et al., 2018a), PA guidelines are largely based on population studies of free-living total PA, whereas experimental studies investigate effects of PA added to everyday activities. Moreover, due to the rigorous design, exercise prescription and supervision, and the large number of groups and participants required, it would be very complex to obtain experimental evidence that informs the field in the way the present paper does. Finally, it is biologically plausible that PA affects the metabolic risk factors, whereas it is less likely that metabolic risk factors affect PA levels, except for overweight and obesity. Therefore, we argue the results presented herein have implications for children's PA guidelines when it comes to metabolic health.

Conclusion
When incorporating the entire PA intensity spectrum in the analysis of associations with metabolic health, our findings suggest the strongest associations are found for VPA, whereas associations for SED are weak. Though our results are cross-sectional, our findings suggest that PA guidelines, as well as future surveillance and intervention studies, should increase their focus on VPA and reduce their focus on SED to target the strongest PA markers of childhood metabolic health. We recommend that future studies apply shorter epochs during measurement of PA and a multivariate analytic approach to develop future understanding in the field of PA epidemiology.

Ethical approval and consent to participate
All procedures performed in the original studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Availability of data and materials
The specific data sets generated and analyzed during the current study are not publicly available. However, a new data set including the same variables can be applied for through an individual project agreement with ICAD (http://www.mrc-epid.cam.ac.uk/research/studies/icad/).

Consent for publication
All participants and/or their legal guardian provided informed consent and local ethical committees approved the study protocols. Prior to sharing data, data-sharing agreements were established between contributing studies and MRC Epidemiology Unit, University of Cambridge, UK.