Lifestyle correlates of overweight in adults: a hierarchical approach (the SPOTLIGHT project)

Background Obesity-related lifestyle behaviors usually co-exist but few studies have examined their simultaneous relation with body weight. This study aimed to identify the hierarchy of lifestyle-related behaviors associated with being overweight in adults, and to examine subgroups so identified. Methods Data were obtained from a cross-sectional survey conducted across 60 urban neighborhoods in 5 European urban regions between February and September 2014. Data on socio-demographics, physical activity, sedentary behaviors, eating habits, smoking, alcohol consumption, and sleep duration were collected by questionnaire. Participants also reported their weight and height. A recursive partitioning tree approach (CART) was applied to identify both main correlates of overweight and lifestyle subgroups. Results In 5295 adults, mean (SD) body mass index (BMI) was 25.2 (4.5) kg/m2, and 46.0 % were overweight (BMI ≥25 kg/m2). CART analysis showed that among all lifestyle-related behaviors examined, the first identified correlate was sitting time while watching television, followed by smoking status. Different combinations of lifestyle-related behaviors (prolonged daily television viewing, former smoking, short sleep, lower vegetable consumption, and lower physical activity) were associated with a higher likelihood of being overweight, revealing 10 subgroups. Members of four subgroups with overweight prevalence >50 % were mainly males, older adults, with lower education, and living in greener neighborhoods with low residential density. Conclusion Sedentary behavior while watching television was identified as the most important correlate of being overweight. Delineating the hierarchy of correlates provides a better understanding of lifestyle-related behavior combinations which may assist in targeting preventative strategies aimed at tackling obesity.


Background
Excess body weight is determined by multiple factors acting in combination, including genetic, metabolic and behavioral factors, as well as more upstream socioeconomic influences and built environment characteristics [1]. Those that are modifiable provide important potential targets for preventive interventions [2]. Diet and physical activity are recognized as the most proximal determinants of energy balance [3] but there is growing recognition of the role of sedentary behaviors (e.g., sitting time), independent of physical activity [4][5][6][7]. The influences of smoking and alcohol intake on body weight are also well documented [8][9][10]. More recently, a role has also been suggested for sleep duration [11][12][13].
The inter-relationship of these obesity-related lifestyle behaviors has stimulated interest in co-occurrence patterns [14,15]. Several studies have used explorative data-driven methods, such as cluster analysis or latent class analysis, to examine the relations between diet, physical activity, and sedentary behaviors, independently of the health outcome of interest [6,16,17]. Smoking status and alcohol consumption have been included in some analyses [18][19][20]. The variety of methodologies used makes it difficult to ascertain how these factors correlate with each other and what this means for body weight and health. Additionally, previous studies have not considered contextual factors such as socioeconomic characteristics and the built environment, increasingly recognized as major upstream determinants of overweight [21].
A recursive partitioning method, the classification and regression tree (CART) approach [22], makes it possible to examine how a set of risk factors jointly influences the risk of an outcome such as overweight. This approach has previously been used to assess the risk of overweight in children [23,24] and the risk of reduced mobility in older obese adults [25].
This study sought to identify the hierarchy of lifestyle-related behaviors associated with overweight in European adults, and to examine how the subgroups identified differed by socio-demographic and built environment characteristics.

Study design and sampling
This study, part of the EU-funded SPOTLIGHT project [26], was conducted in five European urban regions: Ghent and suburbs (Belgium), Paris and inner suburbs (France), Budapest and suburbs (Hungary), the Randstad (a conurbation including Amsterdam, Rotterdam, the Hague and Utrecht in the Netherlands) and Greater London (United Kingdom). Sampling of neighborhoods and recruitment of participants have been described in detail elsewhere [27]. Briefly, neighborhood sampling was based on a combination of residential density and socio-economic status (SES) data at the neighborhood level. This resulted in four pre-specified neighborhood types: low SES/low residential density, low SES/high residential density, high SES/low residential density and high SES/high residential density. In each country, three neighborhoods of each neighborhood type were randomly sampled (i.e. 12 neighborhoods per country, 60 neighborhoods in total). Subsequently, adult inhabitants (≥18 years) were invited to participate in a survey. A total of 6037 individuals participated in the study between February and September 2014. The study was approved by the corresponding local ethics committees of participating countries and all participants in the survey provided informed consent.

Body mass index
Body mass index (BMI) was calculated by dividing self-reported weight (kg) by the square of the self-reported height (m²). Adults were categorized as overweight if their BMI was ≥25 kg/m² [1].
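As an illustration of the calculation just described, the following is a minimal sketch of the BMI formula and the ≥25 kg/m² cut-off. The function names and the example respondent are hypothetical and are not taken from the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(weight_kg: float, height_m: float, cutoff: float = 25.0) -> bool:
    """True when BMI is at or above the cutoff (>= 25 kg/m^2 in the text)."""
    return bmi(weight_kg, height_m) >= cutoff


# Hypothetical respondent (not study data): 80 kg, 1.75 m
print(round(bmi(80, 1.75), 1))   # 26.1
print(is_overweight(80, 1.75))   # True
```

Because both weight and height were self-reported, any reporting error in height is squared in the denominator, which is one reason the limitation on self-report matters for the overweight classification.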

Socio-demographic data
Socio-demographic variables included age, gender and educational level (defined as 'lower' [from less than primary to higher secondary education] and 'higher' [college or university level] to allow comparison between country-specific education systems).

Physical activity
Physical activity during the last 7 days was documented using questions from the long version of the validated International Physical Activity Questionnaire (IPAQ) [28]. Good reliability (Spearman correlation coefficients ranged from 0.46 to 0.96) and acceptable criterion validity (median ρ of about 0.30) have been found for this questionnaire in a 12-country study [28]. Transport-related and leisure time physical activity were estimated (in minutes per day, min/d) by multiplying the frequency (number of days in the last 7 days) and duration (average time/d).
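The frequency-by-duration calculation can be sketched as follows. The text states only that frequency and duration were multiplied; dividing the weekly total by 7 to express it per day is an assumption made here, and the function name and example values are hypothetical.

```python
def minutes_per_day(days_active: int, minutes_per_active_day: float) -> float:
    """Average daily minutes of activity over the last 7 days.

    days_active: number of days (0-7) on which the activity was done.
    minutes_per_active_day: average duration on those days, in minutes.
    """
    if not 0 <= days_active <= 7:
        raise ValueError("days_active must be between 0 and 7")
    if minutes_per_active_day < 0:
        raise ValueError("duration cannot be negative")
    weekly_minutes = days_active * minutes_per_active_day
    # Assumption: the weekly total is spread over 7 days to obtain min/d.
    return weekly_minutes / 7


# Hypothetical respondent: cycling for transport on 3 days, 40 min each time
print(round(minutes_per_day(3, 40), 1))  # 17.1
```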

Sedentary behavior
The validated Marshall questionnaire was used to collect sedentary behavior data during the last 7 days [29]. Acceptable criterion validity (Spearman correlation coefficient greater than or equal to 0.50 for watching TV, and using a computer at home during weekdays) has been demonstrated. Lowest validity coefficients were found for other leisure-time activities and transport-related sedentary behaviors during weekend days (correlation coefficients ranged from 0.15 to 0.42) [29]. Time spent (min/d) sedentary for travel, television (TV), computer and other leisure time activities (e.g., socializing, movies but not including TV and computer use) was averaged over a week.

Eating habits
Current eating habits were assessed using common food frequency questions on consumption of fruit, vegetables, fish, sweets, fast-food, sugar-sweetened beverages, and alcohol. Response options were 'once a week or less', '2 times a week', '3 times a week', '4 times a week', '5 times a week', '6 times a week', '7 times a week', 'twice a day', and 'more than twice a day'.

Smoking status
Participants reported their smoking status: current, former or never.

Sleep duration
Participants provided information on their hours of sleep during an average night. The response options ranged from 4 to 16 h/night (in half-hour intervals).

Neighborhood clusters
Four neighborhood clusters were previously identified based on data related to food and physical activity features of the built environment collected by a Google Street View-based virtual audit performed in 59 study neighborhoods [30]. The clusters were labeled: cluster 1 (n = 33) 'green neighborhoods with low residential density', cluster 2 (n = 16) 'neighborhoods supportive of active mobility', cluster 3 (n = 7) 'high residential density neighborhoods with food and recreational facilities', and cluster 4 (n = 3) 'high residential density neighborhoods with low level of aesthetics'.

Data analysis

CART approach
Recursive partitioning was used to identify the hierarchy and combinations of all lifestyle-related behaviors described in the Measures section that best differentiated overweight (≥25 kg/m²) vs. non-overweight (<25 kg/m²) participants.
Recursive partitioning is an algorithm of the CART nonparametric statistical method [22]. This approach has been used in different research fields, such as genetic epidemiology [31], and has produced greater homogeneity in subgroups than has been achieved with other approaches, such as regression models [32]. Recursive partitioning is a step-by-step process by which a decision tree is built by either splitting or not splitting each node of the tree into two daughter nodes. Each possible split among all variables present at each node is considered. The tree is constructed by the algorithm asking a sequence of hierarchical Boolean (yes/no) questions (e.g., is X_i ≤ θ_j?, where X_i is a candidate variable and θ_j is a cut-off), generating descendant nodes [33]. The cut-off in the candidate variable that produces the maximal differentiation between individuals is retained and used to split the sample into two subgroups (i.e. two daughter nodes). This process is repeated for each new subgroup found. Every variable is a potential candidate at each stage in growing the tree, so some variables may appear several times, using different cut-offs. The best way to split the data is determined by the Gini impurity index. This index equals 0 for a pure node (i.e. all observations within the node assigned to a single target class; e.g., a node with a class distribution [0;1]) and is highest for an impure node with evenly mixed target classes (e.g., a node with a class distribution [0.5;0.5], for which the standard two-class index 1 − Σ p_k² equals 0.5). The complete tree is pruned by a sequential node-splitting process to avoid over-fitting the data; a sequence of sub-trees is generated and compared. The optimum tree is obtained using both cross-validation and the cost-complexity pruning method. The cost-complexity pruning method assesses the balance between misclassification costs and complexity of the sub-tree. Additionally, each terminal node was set to require a minimum of 200 subjects.
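To make the splitting step concrete, the following is a minimal sketch of how a single cut-off θ could be chosen for one candidate variable by minimizing the weighted Gini impurity of the two daughter nodes. It assumes the standard definition 1 − Σ p_k² (under which an evenly mixed two-class node scores 0.5). It is not the software used in the study, it omits pruning and cross-validation, and the function names and toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2); 0 for a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels, min_leaf=1):
    """Find the cut-off for the question 'x <= threshold?'.

    Returns (threshold, weighted_impurity) for the split with the lowest
    weighted Gini impurity, or None when no valid split exists.
    min_leaf mirrors the idea of a minimum terminal node size.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a cut-off
        if i < min_leaf or n - i < min_leaf:
            continue  # a daughter node would be too small
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (i * gini(left) + (n - i) * gini(right)) / n
        if best is None or score < best[1]:
            best = ((low + high) / 2, score)  # midpoint as the cut-off
    return best


# Toy data (hypothetical): TV minutes per day and overweight status (1 = yes)
tv_minutes = [30, 60, 90, 150, 200, 240]
overweight = [0, 0, 0, 1, 1, 1]
print(gini(overweight))                    # 0.5
print(best_split(tv_minutes, overweight))  # (120.0, 0.0)
```

In a full tree, this search would be repeated over every candidate variable at every node, and the variable and cut-off with the lowest weighted impurity would define the split; the same variable can therefore reappear lower in the tree with a different cut-off.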

Lifestyle subgroups
Characteristics of the subgroups identified through the CART analysis were compared. All variables included in the CART analysis were considered, in addition to socio-demographic and built environment characteristics (i.e. urban region, neighborhood type [pre-specified neighborhood type, and residential density and SES levels examined separately], and neighborhood cluster).
Chi-squared tests, and Kruskal-Wallis tests with post hoc Bonferroni-Dunn tests, were used to examine differences between subgroups.
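As a rough illustration of the subgroup comparisons, the sketch below computes a Pearson chi-squared statistic for a contingency table and a Bonferroni-adjusted significance threshold for pairwise comparisons. The 0.05 significance level, the assumption that all pairwise comparisons were made, and the example counts are illustrative only and are not taken from the study.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("table has an empty row or column")
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def pairwise_comparisons(k):
    """Number of distinct pairs among k subgroups."""
    return k * (k - 1) // 2


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical counts: rows = two subgroups, columns = (female, male)
stat, dof = chi_squared([[120, 80], [90, 110]])
print(round(stat, 2), dof)

# With 10 subgroups there are 45 possible pairwise comparisons
m = pairwise_comparisons(10)
print(m, bonferroni_alpha(0.05, m))
```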

Multilevel regression analyses
Because participants were nested within neighborhoods, the likelihood of being overweight for each partitioning variable was estimated by a multilevel logistic regression model (neighborhood identifier included as a random effect) adjusted for potential confounders (gender, age, education level, and neighborhood type).

Characteristics of the study population
Results are given for 5295 individuals for whom BMI was available. The study population comprised 55.8 % females, with a mean (standard deviation, SD) age of 51.7 (16.4) years; 54 % were highly educated. Mean BMI was 25.2 (4.5) kg/m², and 46.0 % of adults were overweight.
Compared to non-overweight subjects, overweight adults were more likely to be male, older, less educated, former smokers, short sleepers, less physically active, eating less fruit and vegetables, and spending more time sitting, especially when viewing TV. The prevalence of overweight ranged from 38.3 % in Greater Paris to 53.2 % in Greater Budapest (Table 1).

CART analysis
The final tree contained 10 terminal nodes (i.e. 10 subgroups) and had a classification error of 35.4 %. The variables retained as the most important for discriminating overweight status were, in the following order: sedentary time while watching TV, smoking status, sleep duration, leisure time physical activity, and vegetable intake (Fig. 1).
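One way to read the 35.4 % classification error is to compare it with a trivial rule that always predicts the majority class. With 46.0 % of adults overweight, such a rule would misclassify 46.0 % of the sample. The sketch below shows both calculations; the function names are hypothetical, and the comparison is offered as an interpretive aid rather than as part of the original analysis.

```python
def misclassification_rate(y_true, y_pred):
    """Share of observations whose predicted class differs from the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def majority_baseline_error(prevalence):
    """Error of always predicting the more common of two classes."""
    return min(prevalence, 1 - prevalence)


# Figures reported in the text
tree_error = 0.354
baseline = majority_baseline_error(0.46)
print(baseline)                         # 0.46
print(round(baseline - tree_error, 3))  # 0.106

# Toy illustration of the error rate itself (hypothetical labels)
print(misclassification_rate([1, 0, 1, 0], [1, 1, 1, 0]))  # 0.25
```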
The odds of being overweight were 61 % (41-85 %) higher for those reporting longer time watching TV (≥142 min/d) than for others.
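Odds that are 61 % higher correspond to an odds ratio of 1.61 (interval 1.41 to 1.85), which is not the same as a 61 % higher probability. The sketch below converts the reported percentages to odds ratios and shows the effect on a probability for a baseline prevalence of 40 %; that baseline is a hypothetical value chosen for illustration, and the function names are not from the study.

```python
import math


def percent_higher_to_odds_ratio(percent):
    """Convert 'odds are X % higher' into an odds ratio."""
    return 1 + percent / 100


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability obtained after multiplying the baseline odds by an odds ratio."""
    odds = baseline_probability / (1 - baseline_probability) * odds_ratio
    return odds / (1 + odds)


# Reported estimate and interval bounds, expressed as odds ratios
for pct in (61, 41, 85):
    odds_ratio = percent_higher_to_odds_ratio(pct)
    # The log odds ratio is the scale of a logistic regression coefficient
    print(odds_ratio, round(math.log(odds_ratio), 3))

# Hypothetical baseline: 40 % overweight among those watching less TV
print(round(apply_odds_ratio(0.40, 1.61), 3))  # 0.518
```

Under that hypothetical baseline, the probability rises from 0.40 to about 0.52, a relative increase of roughly 29 % rather than 61 %, which illustrates why odds ratios overstate relative risk when the outcome is common.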
Longer time spent watching TV (≥142 min/d) and being a former smoker were important correlates of overweight. Current or non-smokers who spent a long time watching TV and were less physically active during leisure time were also at risk of being overweight.
Among adults watching less TV (<142 min/d) and being former smokers, those who were short sleepers (<7 h/night) were more likely to be overweight compared to long sleepers. Protective factors against being overweight among current and non-smokers included: short time watching TV, being physically active during leisure time, and eating vegetables every day.
[...] [7.9] min/day, median: 0 min/day), the highest mean frequency of eating fruits and vegetables. The highest percentage of participants living in neighborhoods that were characterized by high SES and high residential density was observed in this subgroup, as was the lowest percentage of participants living in 'green neighborhoods with low residential density'.

Discussion
This study investigated the hierarchy and combination of lifestyle-related behaviors in relation to the prevalence of overweight in European adults. Prolonged sitting while watching TV, being a former smoker, short sleep, lower levels of physical activity and lower vegetable consumption were the lifestyle-behaviors that identified the subgroups with highest likelihood of being overweight. High-risk subgroups included mainly males, older and less well educated adults living in greener neighborhoods with low residential density.
Although it is well recognized that overweight and obesity are multifactorial in origin [1,2], few studies have examined the joint relation of lifestyle-related behaviors with overweight in adults. In this study, a hierarchy of lifestyle-related behaviors in identifying subgroups at risk was established through a visual chart showing how risk factors are inter-related. The tree indicated that the most important factor was sitting while watching TV. This variable appeared several times at different levels of the tree, underlining its importance. The variable that followed was smoking status, in both tree branches, and no additional variable appeared to explain the risk for overweight in former smokers (among those with longer duration of watching TV), suggesting its very high impact. Sleep duration, leisure time physical activity and vegetable intake appeared at later stages in the tree, suggesting they would have less importance compared to sedentary behavior and smoking status. Relations between the lifestyle-related behaviors and overweight status were confirmed in multilevel regression analyses taking into account potential confounding factors. The findings also suggested nonlinear relations between lifestyle-related behaviors and overweight. Indeed, subgroups who watched TV a lot (>180 min/d) had lower odds of being overweight than subgroups who watched less TV (between 24 min/d and 142 min/d).
Although it has been suggested that a combination of several sedentary behavior variables is appropriate to capture sedentary lifestyle [36], only TV viewing was retained among several variables related to sedentary time. The greater importance of TV viewing has been previously suggested in cross-sectional studies [37][38][39]. Given the lack of evidence from prospective studies, the issue of bidirectional or reverse causality has been raised [40]. In the Nurses' Health study, each 2 h/d increment in TV watching was associated with a 23 % [17-30 %] increased risk of obesity. However, the risk of developing obesity was attenuated after adjustment for baseline BMI [5]. These findings may suggest that, even at baseline, women who watched more TV were already on a trajectory to become obese [5]. Heavier individuals at baseline could have a preference for sedentary habits due to their higher body weight. TV viewing is not only an indicator of sedentary behavior but may represent a potential surrogate of other behaviors affecting the energy balance, e.g., via increased snacking behavior [7,41].
Former smokers were more likely to be overweight than both current and never smokers. These results are consistent with previous findings [10][42][43][44]. Weight gain after quitting smoking has been related to the fact that nicotine acts as an appetite suppressant and quitting may be associated with increased energy intake [45,46]. The average weight gain is about 4.5 kg, 1 year after quitting [46]. In the NHANES survey, weight gained over 10 years was significantly higher in former smokers compared to current smokers (8.4 kg vs. 3.5 kg, after adjustment for age, gender, ethnicity, education level) [44]. A recent study has estimated that smoking cessation leads to an average increase of 1.5-1.7 BMI units and that the drop in smoking may explain up to 14 % of the rise in obesity prevalence in recent decades [47]. Weight gain after smoking cessation was less pronounced when the number of years since smoking cessation increased [43], and was negatively associated with socio-economic status [48].
Short sleep duration was found to be associated with an increased risk of overweight. The hypothesized underlying mechanisms include thermoregulation, hunger hormone regulation changes, and/or an impact on physical activity and sedentary behaviors [49][50][51][52]. Short sleep duration has been associated with other lifestyle-related behaviors, such as TV or computer use [52,53], and a correlation between time spent sleeping, physical activity and sedentary behavior has been documented [54]. High leisure time physical activity and intake of vegetables were associated with lower prevalence of overweight. These behaviors, which tend to co-occur, are both well recognized as healthy lifestyle behaviors [55,56]. Interestingly, some cut-offs found are close to thresholds previously reported and/or recommended guidelines (e.g., 2 h/d watching TV [5,38], 7-9 h of sleep [57]). In addition, at least one variable from each component of lifestyle (physical activity, sedentary behavior, sleep duration, eating habits, and smoking status) was identified as a correlate of overweight. Moreover, subgroups at high risk of overweight were characterized by at least one unhealthy lifestyle behavior. These findings emphasize that all components of lifestyle are important to consider and that a combination of unfavorable lifestyle factors may predict overweight in adults.
Lifestyle subgroups identified by CART differed in terms of socio-demographic factors. The subgroup with the highest prevalence of overweight comprised mainly males, older adults and adults with lower education. These findings are in line with previous studies [58,59]. Individuals with higher educational background may be more informed about the health consequences of their lifestyles and have the resources to take action, leading to healthier lifestyle behaviors [60]. The subgroups identified also varied across urban regions: 72.2 % of French respondents were in the subgroups with lower overweight prevalence (subgroups 1-6). Looking at differences at neighborhood level, as previously documented [61], some neighborhoods seem more obesogenic than others, especially low SES and low residential density neighborhoods. Socio-spatial disparities in obesity prevalence at census-tract level have been previously documented, with lower prevalence in neighborhoods with high median home values [61]. Low SES neighborhoods have been shown to have less supportive environmental conditions for active transportation [21]. Moreover, a greater percentage of 'green neighborhoods with low residential density' was observed in subgroups with high overweight prevalence. Greener neighborhoods with low residential density may be less supportive of active transport and more oriented towards motorized transport. Use of motorized transportation may be linked to weight gain [62]. Conversely, in high residential density neighborhoods, many destinations are easily accessed since they are located at shorter distances, and parking a car may be more difficult, therefore encouraging active transportation (e.g., walking, cycling, public transport) [21]. Thus, adults living in neighborhoods unsupportive of physical activity and far away from destinations may be more likely to remain indoors and watch TV.
This study has several strengths: a relatively large sample size, assessment of a number of lifestyle-related variables using standardized procedures, a survey performed in different geographical areas across Europe, and the use of a nonparametric method (CART) providing a visual representation of lifestyle-related behavior inter-relationships. This study also has some limitations, and caution is thus needed when interpreting and generalizing the results. Due to its cross-sectional nature, temporal relations between overweight and lifestyle behaviors cannot be assessed. As data were self-reported, potential (recall) bias, and possible underestimation or overestimation of variables (e.g., weight/height [63], sedentary behaviors [29,64], physical activity [64][65][66]) cannot be excluded. Although behaviors such as eating habits were not recorded in enough detail to assess the role of more detailed dietary aspects, such as macronutrient intake, many aspects of lifestyle currently thought to be associated with body weight were covered (sedentary behavior, physical activity, eating habits, alcohol consumption, smoking status, and sleep duration). The CART method is data-driven, and the misclassification error was about 35 %. In the literature, it is not uncommon to report a misclassification error around 30 % and this might be higher for health promotion-based intervention strategies [67].

Conclusions
Low levels of TV viewing, non-smoking, high leisure time physical activity, high vegetable consumption, and longer sleep duration were identified as components of a healthy lifestyle associated with decreased risk of excess weight in adults. The results specifically point to the importance of sedentary habits as a key component to focus on when addressing the multiple factors associated with excess weight in preventive interventions.

Abbreviations
BMI: Body mass index; CART: Classification and regression tree; SD: Standard deviation; SES: Socio-economic status; TV: Television