Estimation of Newborn Risk for Child or Adolescent Obesity: Lessons from Longitudinal Birth Cohorts

Objectives Prevention of obesity should start as early as possible after birth. We aimed to build clinically useful equations estimating the risk of later obesity in newborns, as a first step towards focused early prevention against the global obesity epidemic. Methods We analyzed the lifetime Northern Finland Birth Cohort 1986 (NFBC1986) (N = 4,032) to draw predictive equations for childhood and adolescent obesity from traditional risk factors (parental BMI, birth weight, maternal gestational weight gain, behaviour and social indicators), and a genetic score built from 39 BMI/obesity-associated polymorphisms. We performed validation analyses in a retrospective cohort of 1,503 Italian children and in a prospective cohort of 1,032 U.S. children. Results In the NFBC1986, the cumulative accuracy of traditional risk factors predicting childhood obesity, adolescent obesity, and childhood obesity persistent into adolescence was good: AUROC = 0·78[0·74–0.82], 0·75[0·71–0·79] and 0·85[0·80–0·90] respectively (all p<0·001). Adding the genetic score produced discrimination improvements ≤1%. The NFBC1986 equation for childhood obesity remained acceptably accurate when applied to the Italian and the U.S. cohort (AUROC = 0·70[0·63–0·77] and 0·73[0·67–0·80] respectively) and the two additional equations for childhood obesity newly drawn from the Italian and the U.S. datasets showed good accuracy in respective cohorts (AUROC = 0·74[0·69–0·79] and 0·79[0·73–0·84]) (all p<0·001). The three equations for childhood obesity were converted into simple Excel risk calculators for potential clinical use. Conclusion This study provides the first example of handy tools for predicting childhood obesity in newborns by means of easily recorded information, while it shows that currently known genetic variants have very little usefulness for such prediction.


Introduction
Childhood and adolescent overweight and obesity, which are leading causes of early type 2 diabetes and cardiovascular disease, have become major public health problems both in westernized and more recently in developing countries [1]. Traditional approaches for the management of overweight and obesity have had poor long term efficacy and therefore prevention is currently the most promising strategy for controlling the obesity epidemic [1].
Prevention of obesity should start as early as possible after birth. Longitudinal studies have shown a strong association between early infancy weight gain rate or adiposity and childhood and even adult body weight, fat mass and body mass index (BMI) [2][3]. Moreover, the efficacy of preventive behavioural and nutrition interventions targeting school children, either in primary schools or at home, is very limited [4][5]. Finally, in many countries preschool and school children are already burdened by a high prevalence of overweight or obesity [4].
Assessing the risk for future overweight or obesity in newborns may be a basis for focused preventive interventions for at-risk individuals during the very first months of their life. Even though several sociodemographic and anthropometric predictors, as well as several common genetic variants, have been associated with childhood overweight/obesity, no longitudinal study has attempted to explore the cumulative predictive properties of these known early life risk factors, or to propose possible tools to predict childhood obesity at birth [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20].
We aimed to build such predictive algorithms for the early identification of newborns at an increased risk for childhood and adolescent overweight/obesity. For this purpose, we estimated the ability of clinical, socio-demographic, and genetic risk factors to predict childhood and adolescent overweight/obesity in a large Finnish birth cohort. We then confirmed the promising usefulness of socio-demographic and anthropometric factors in predicting childhood obesity in two independent paediatric cohorts.

Ethics Statement
The study conducted on the NFBC1986 cohort was approved by the Ethical Committee of Northern Ostrobothnia Hospital District. The retrospective study of the Veneto cohort was approved by the Ethical Committee of the University of Verona and Project Viva was approved by the Human subjects Committees of Harvard Pilgrim Health Care, Brigham and Women's Hospital, and Beth Israel Deaconess Medical Center.
Written informed consent was obtained from parents or guardians of all participants and all clinical investigations were conducted according to the principles expressed in the Declaration of Helsinki.

Subjects
Development sample. The Northern Finland Birth Cohort 1986 (NFBC1986) (http://kelo.oulu.fi/NFBC) was followed prospectively from 12 th gestational week and several well known early risk factors for childhood obesity were recorded systematically. Participants who had their weight and height recorded at seven and sixteen years of age and met data completeness criteria (see below, N = 4,032) were used to build the models. We separately predicted childhood obesity (obesity at 7 years of age), childhood overweight/obesity (overweight or obesity at 7 years of age), adolescent obesity (obesity at 16 years of age), adolescent overweight/obesity (overweight or obesity at 16 years of age), and the severe sub-phenotypes of childhood obesity persistent into adolescence (obesity at 7 and 16 years of age) and childhood overweight/obesity persistent into adolescence (overweight or obesity at 7 and 16 years of age) ( Table 1, Table S1-S2). Overweight and obesity were defined by the IOTF BMI cut-offs [21].
Five SNPs were discarded during the genotyping procedure, since they did not pass the genotyping quality control criteria, leaving 39 SNPs. All 39 SNPs were in Hardy-Weinberg equilibrium (P.0.05). We assumed an additive model and constructed a cumulative genotype score by summing the number of risk alleles (0-78).
Validation samples. We used a school-based retrospective sample of 1,503 children aged 4-12 from Veneto, Italy, as one of the two validation samples to explore whether results from the NFBC1986 could be applied to a European paediatric cohort contemporary to the NFBC1986, with similar obesity prevalence (4%) but different cultural background [22]. The second validation set used was a prospective sample of 1032 children (7 years) from Massachusetts (United States) from the Project Viva (http://www. dacp.org/viva/index.html) to explore whether results would remain valid when applied to a very recent U.S. child cohort, with higher obesity prevalence (8%) and very different cultural background. Genetic variants were not available for the validation analyses. All children meeting the international criterion for obesity definition at the time of recruitment in the Italian sample and at 7 years of age in the U.S. sample were classified as affected by childhood obesity [21].

Statistical Analysis
Development phase. Predictive models were fitted by stepwise logistic regression analysis (criterion for variable entry: p,0.05, for variable removal: p.0.10) using traditional risk factors only, genetic score only and traditional risk factors plus genetic score for each obesity outcome. Each risk factor entering the analysis as continuous or ordinal scale variable showed a linear relationship with the logit-risk of childhood obesity in a preliminary linear regression analysis. For persistent childhood obesity, not all the a priori selected traditional predictors were used for the stepwise analysis but only the five with the strongest association with persistent childhood obesity in a preliminary univariate analysis, in order to avoid possible model over-fitting due to the relatively small number (forty-seven) of outcome events.
The discrimination accuracy of each model was evaluated by the area under the receiver operating curve (AUROC) of the modeled risk [23]. Models with AUROCs larger than 0.7 were considered potentially clinically useful and those with AUROCs larger than 0.8 were considered to have excellent accuracy [23].
The model calibration, that is the ''precision'' or correlation between the predicted and observed event rate, was assessed by the Hosmer-Lemeshow test [23]. The possible accuracy improvement associated with adding the genetic score to the traditional risk factors was evaluated by calculating the integrated discrimination improvement (IDI) compared to the traditional risk factors alone [24].
For each model a risk threshold was arbitrarily adopted at the 75 th percentile of the modeled risk, identifying the top 25% as being at increased risk and the thresholds' predictive properties (sensitivity, specificity and predictive values) were calculated.
An average of 1.67% (0-11.4%) of data was missing for each traditional risk factor, while an average of 0.72% (0-4.95%) of genotypes was missing for each SNP. We included participants with zero or one missing traditional baseline variable and three or fewer missing SNPs. Multiple imputation was performed for the remaining missing values, in order to avoid possible bias associated with missing potentially important information [25]. Win MICE (Multiple Imputation by Chain Equations) V0.1. was used for multiple imputation [25]. By the MICE procedure, imputed values for missing data are drawn from modelling them on the basis of the other considered variables, with logistic regression if the variable to impute is dichotomous, polytomous logistic regression if it is categorical with three or more categories and with linear regression if it is continuous [25]. So each missing value is replaced by an estimated value modelled on the other variables. Indeed, the method estimates a distribution of each missing variable, taking all aspects of uncertainty in the imputations into account. From this distribution, values are sampled and filled in for the missing data. So every imputation cycle produces, for each missing data, one estimated value sampled among several possible ones, giving rise to a unique dataset which can not be reproduced by following imputation cycles [25].
Five imputation cycles were run so that five values were imputed for each missing datum to get variation in the imputed values, thus reflecting the uncertainty introduced by imputation itself. Inference was based on the five resulting datasets [25]: areas under AUROCs were obtained by averaging the five single data sets coefficients, while 95% confidence intervals were delimited by the two overall most extreme boundaries, the lowest and the highest  [25]. All the coefficients, the AUROCs and the 95% C.I. boundaries were identical up to the first or second decimal for any considered variable across the five datasets. Validation and replication phase. Only the model developed for childhood obesity was used for validation because the model for prediction of childhood overweight/obesity was not considered accurate enough to be clinically useful and the models concerning adolescent phenotypes required older cohorts than Veneto and Project Viva. The NFBC1986 equation was applied to the validation cohorts after recalculation of the intercept according to the cohort-specific phenotype prevalence and mean values of predictors. In the Veneto sample, number of household members and gestational smoking were not available.
A replication analysis was also performed in which the model for childhood obesity was re-built in the two validation samples by stepwise logistic regression using the available traditional risk factors.

Results
Parental BMI, birth weight, maternal gestational weight gain, number of household members, maternal professional category and smoking habits were independent predictors of all or most of the six obesity outcomes ( Table 2-3).
The equations to estimate the risk for the obesity outcomes from these traditional risk factors are represented in supporting information (Dataset S1).
Discrimination accuracy of the risk calculation from traditional risk factors was excellent for persistent childhood obesity Parental BMI was the main contributor to discrimination accuracy while other predictors contributed moderately to the model discrimination effectiveness but increased the overall model calibration ( Table 2-3).
For any given pair of parental BMIs, estimation of the probability of childhood obesity varied greatly, depending on the combination of other predictors ( Figure 1).
Genetic score was an independent predictor of all of the six considered outcomes, with ORs associated with unitary score increase ranging from 1.05[1.03-1.08] to 1.09[1.03-1.14] (0.05.  (Table S7). Adding the genetic score to the traditional risk factors did not produce better AUROCs than using traditional risk factors alone and was associated with modest IDIs not larger than 1% ( Figure S1). The genetic score composed of only the twenty SNPs identified for childhood obesity traits exhibited similar associations with early obesity phenotypes (Table S8). Then only the models developed from traditional risk factors were taken into consideration for further analyses. Predictive properties of the risk thresholds corresponding to the highest risk quartile for each obesity phenotype are represented in Table 4. Positive predictive The Project Viva equation, i.e., the equation to predict childhood obesity issued from the U.S. sample (model replication), included parental BMI, race, gestational smoking and gestational weight gain (Dataset S1), had an AUROC of 0.79[0.73-0.84] (p,0.001) in the Project Viva sample and was adequately calibrated (p for Hosmer-Lemeshow test = 0.91).
The three equations predicting childhood obesity in the three studied cohorts were converted in an electronic automatic risk calculator for potential clinical use (Dataset S2).

Discussion
Our study provides the first example of predictive tool for assessing the risk of developing early obesity phenotypes, based on readily available traditional risk factors about newborns. The potential inclusion of genetic variants was explored, but due to their modest contribution to predictive accuracy, they were not included in the final models.
Analysis of the NFBC1986 showed that traditional risk factors performed better in prediction of severe rather than mild obesity phenotypes. Importantly, the predictive accuracy of the models did not decline from childhood to adolescence, suggesting that the association between the traditional risk factors and obesity is stable until early adulthood. This is consistent with recent evidence about the relationship between single early risk factors and adolescent and adult obesity [6,9,10]. The risk of childhood obesity was largely driven by parental BMI. However, other predictors moderately improved the discrimination accuracy and increased the exactitude of risk estimation. They also produced large ranges of possible risk estimates for any given parental BMI, significantly improving risk classification at any level of parental BMI (Table 2-3, Figure 1 and Dataset S2).
Predictive tools need to satisfy important requisites before they can be applied in clinical settings. First, significant preventive advantages should derive from prediction. Although medical societies have been called on to provide reasonable guidance on prevention based on available data and the American Academy of Paediatrics has recently underlined the emergent need of finding effective clinical tools to enable primary care providers to contribute to obesity prevention [26][27], there is no compelling evidence of any efficient obesity preventive strategy involving infancy. Then, robust trials proving the effectiveness of strategies of early prevention are still needed to justify the adoption of early obesity prediction in the everyday clinical practice. Should trials prove the efficacy of preventive strategies implying special interventions going beyond paediatric counselling and public health campaigns routinely provided to the general population, a predictive tool like that proposed here would offer the important advantage to exclude a large proportion of infants from such interventions, thanks to its good negative predictive value. This would improve the cost/effectiveness ratio of preventive actions.
However few available controlled prevention trials suggest that interventions directly involving parents of pre-school children outside education settings are more effective than school or community-based interventions targeting later ages, supporting the hypothesis that involving parents in the prevention of their offspring's obesity as early as possible is likely to be a good strategy [1]. In this view, it has been suggested that « Let's Move » against child obesity campaign, which is a U.S. government-sponsored obesity prevention program targeting children aged 2-10, might be more effective if children under 2 could be identified as prevention targets [4].
Parents of newborns are particularly sensitive to information given about their child's health. Once informed of their baby's increased risk for obesity, they might be more receptive to routine advice provided from birth during the first two years of life within population-wide prevention: breastfeeding, feeding on demand, weaning no earlier than the sixth month with recommended meal patterns and food portions, avoiding of television and sugarsweetened beverages [28]. Moreover, families of newborns at risk could be enrolled in more intensive schedules of growth monitoring and nutritional counselling than those offered to general population, in order to avoid excessive weight gain in infancy. Encouraging strategies aiming at significantly decreasing energy intake in infants should be avoided however, both because of the well known difficulties encountered by parents in doing it and because of potential, unknown harmful effects of an early caloric restriction. In contrast, recent evidence suggests that some preventive strategies prevention of obesity based on educating mothers could be useful to limit excessive infant weight gain promoting appropriate maternal responses to satiety cues and decreasing non-responsive feeding behaviours which over-ride satiety cues, such as food rewards, non food rewards to encourage infant to eat, etc… [29]. Such strategies do not imply a direct food restriction, but rather a limitation of ''passive'' (not hunger-driven) infant over-eating.
Obviously, even in case of proved efficacy of early obesity prevention, the targeted approach should also be carefully assessed by means of trials with a ''focused intervention'' design, before any dissemination of the early obesity prediction into broad clinical practice. In fact, targeted approach might also imply deleterious effects, among which, for example, stigmatization of families of infants classified as ''at risk'' or false reassurance of other families. Indeed, early prediction should not mean a ''diagnostic'' attitude towards any of the two categories of families. In particular, the assessment of age and BMI at adiposity rebound, which are good predictors of childhood and adult obesity, should be carried on in young children in order to optimize the overall detection rate of those likely to become obese and possibly sensitize families previously ''missed'' by the neonatal score [30].
Accuracy is another important requisite for a predictive tool. The model predicting persistent obesity had excellent accuracy (AUROC = 0.85) while the models predicting obesity and persistent overweight had clinically useful discrimination accuracy (AUROCs = 0.75 to 0.78) [23], similar to that of widely used tools for predicting multifactor medical conditions, such as the Framingham risk score for coronary heart disease (AUROC = 0.74 to 0.77 depending on gender and type of scoring adopted) [31]. Due to low prevalence of the obesity phenotypes in the NFCB1986, the fourth quartiles of predicted risk had low to moderate prevalence of cases even if they ''captured'' most or a high percentage of cases (low positive predictive value despite good sensitivity) ( Table 4). This represents a possible drawback of preventive strategies based on risk assessment [32]. Nevertheless, risk thresholds conceived for prediction and focused prevention are not required to be ''diagnostic'' but rather cost-effective. Thus, the criteria we propose to select newborns at risk for obesity, could have a strong impact on public health, despite their low specificity/positive predictive value, because they could justify cost-effective preventive strategies on a subsection of the general population, similarly to several sensitive though little specific selective criteria used for widespread preventive interventions, such as: age higher than 30 years as criterion to recommend pap test against cervical cancer, age higher than 50 years as criterion to recommend the faecal occult blood test against colon cancer, etc… [33][34]. The adequate discrimination and calibration accuracy achieved by the equations presented in the manuscript imply that a high percentage of future obese children (more than two-thirds), is included in the highest quartile of calculated risk. Thus, using the highest risk quartile of calculated risk as selective criterion would allow focused preventive strategies to reach 70-75% of potential future cases though involving only 25% of newborns. Should these strategies have just about 50% effectiveness, the number of future obese children would have a 35-38% decrease, which would represent much greater success compared with results obtained to date by large scale preventive strategies involving later infancy and childhood [5] The models using traditional risk factors had good calibration, which suggests that it may be possible to use the newborns' calculated risks in addition to the two risk categories. This would add precision to prediction and potential further effectiveness to related prevention Finally, the equations we present use easily accessible information, do not incur additional costs to clinical care, and only require minimal time to calculate, if converted into simple automatic calculators like those we propose in the Supporting Information. Such electronic risk calculators could be part of an electronic medical record system and/or be housed within computer-assisted standardised programs of obesity prevention, which are promising tools for the prevention and care of paediatric obesity [35].
The results of the validation/replication analyses allow for important considerations. First of all, traditional risk factors have a good cumulative accuracy (AUROC = 0.79) in the recent U.S. paediatric cohort, which has a significantly higher prevalence of childhood obesity than the NFBC1986. This demonstrates that the environmental pressure towards obesity has not weakened the role of early risk factors. Moreover, it supports the hypothesis that, at the current phase of the obesity pandemic, the use of ''familial and personal'' risk factors for early prediction may be useful, in addition to population wide interventions, in those regions, like Massachusetts, where the prevalence of obesity is still moderate and characterised by ethnic and social disparities rather than influenced by country-related risk factors [36]. In these regions, focused preventive strategies based on personal risk stratification may effectively integrate large scale interventions based on nation wide characteristics [32,36]. Interestingly, since 2010 the U.S Government has been supporting a preventive strategy against childhood obesity involving low-income children from Boston (http://www.cdc.gov/CommunitiesPuttingPreventiontoWork/ communities/profiles/both-ma_boston.htm), indicating efforts towards focused prevention. Employing focused strategies involving newborns whose risk is high according to diverse factors beyond social parameters, could lead to earlier, more effective prevention of overweight/obesity in children.
The NFBC1986 equation for childhood obesity proved to keep acceptably discriminative when applied to both the validation cohorts, but showed a lost of calibration when applied to the Viva cohort, suggesting that its adoption in the U.S. would have acceptable validity to discriminate newborns at risk for early obesity but not to perform exact risk estimations. This is probably due to inconsistency of some predictors, such as maternal professional category and number of household members. Accordingly, the Project Viva equation lacks these variables while it includes race, which is not present among obesity predictors in the NFBC1986 equation, because of the high ethnical homogeneity of the NFBC1986. Inconsistency of the role of SES variables across different populations is expected and it is the main reason why it would be very difficult to build a highly accurate and calibrated score that also has complete widespread validity [36].
Overall, the validation analysis suggests that ''local'' equations, including parental BMI but also other locally important early predictors, may have good accuracy in predicting childhood obesity at birth, even in countries like the U.S., with high environmental pressure towards early weight excess, and should be preferred, whenever possible, to the universal adoption of the NFBC1986 equation. Interestingly, parental BMI, which partly reflects the degree of familial genetic predisposition to obesity, had very similar effect size and accuracy in the three studied cohorts, consistently with the evidence that the growing obesity epidemic has not lowered the heritability of childhood adiposity [37].
Our study also explored, with the largest list of obesity-SNPs ever used, the performance of genetics in predicting early obesity phenotypes, showing very modest predictive accuracy of the assessed genetic variants, consistently with previous evidence on adult obesity [20]. Even if a modest predictive accuracy of the studied genetic variants was expected, the accuracy estimates obtained in this study rule out, for the first time, the hypothesis that genetics may perform a little better in predicting early obesity than adult obesity, due to presumed lower impact of environmental determinants during childhood than later in life. This result is consistent with recent evidence that polygenic risk and BMI show substantially similar correlation coefficients between childhood and adulthood and further contributes to the growing evidence that common genetic variants are not yet ''ready for use'' for the prediction of several complex diseases, due to the still small proportion of heritability explained by the newly discovered variants [30,38]. It is possible that next-generation sequencing techniques will reduce significantly the gap of ''missing heritability'' of obesity, identifying rare causative variants and clarifying the role of epigenetics by the genome-wide characterisation of DNA methylation patterns in foetuses or infants developing later obesity or not [39].
Finally, the most important evidence obtained by including currently known SNPs in our analyses is that not only common genetic variants have very low accuracy in predicting early obesity but also they produce a very little improvement of the prediction when combined with clinical factors. This is particularly important because although the notion that genetic variants have poor value in predicting common diseases is quite well established, the possible utility of including polygenic risk scoring within management strategies for complex diseases is a topical subject of current research and genetic testing services including obesity are being offered to consumers by private companies [39][40].
The main limitations of our manuscript are the lack of external validation for the equations predicting adolescent and persistent obesity, due to the young age of our validation cohorts and the use, in one of the validation analyses, of a retrospective paediatric cohort with some variables lacking and an age of assessment not perfectly corresponding to that of the original cohort (4-12 years versus 7 years).
The main strengths include: the novelty and the potential strong public health impact of multivariate obesity predicting tools valid for newborns; the optimization of results reliability and robustness by the adoption of several recommended methods shown recently to be lacking in several recent high impact prediction studies [41]: external geographical and temporal validation (for the model predicting childhood obesity), use of multiple imputation for missing values, avoidance of predictor dichotomisation, assessment of models calibration accuracy, avoidance of model over-fitting.
In summary, our study provides the first example of at birth prediction of early obesity by means of traditional, routinely available risk factors and should guide future efforts towards randomized trials of very early preventive approaches for identified high risk individuals to help combat the obesity epidemic. Figure S1 ROC curves of combined traditional risk factors (blue), genetic score (beige) and traditional risk factors + genetic score (green) predicting six obesity outcomes in the NFBC1986. Integrated discrimination improvements (IDIs) associated with adding the genetic score to the traditional risk factors are also provided. (TIF)