Development and validation of a simple-to-use nomogram for self-screening the risk of dyslipidemia

This study aimed to help healthy adults self-screen for dyslipidemia by analyzing the quantitative relationship between body composition indices (BMI, waist-to-hip ratio, etc.) and dyslipidemia and by establishing a logistic risk prediction model for dyslipidemia. We performed a cross-sectional study and collected data from 1115 adults between November 2019 and August 2020. Least absolute shrinkage and selection operator (LASSO) regression analysis was performed to select the best predictor variables, and multivariate logistic regression analysis was used to construct the prediction model. A graphic tool including 10 predictor variables (a "nomogram," see the precise definition in the text) was constructed to predict the risk of dyslipidemia in healthy adults. A calibration diagram, receiver operating characteristic (ROC) curve, and decision curve analysis (DCA) were used to verify the model's utility. The proposed dyslipidemia nomogram showed good discriminative ability, with a C-index of 0.737 (95% confidence interval, 0.701–0.773). In the internal validation, a C-index of 0.718 was achieved. DCA showed that the nomogram offered a net benefit at threshold probabilities of 2–45%, supporting its value for clinical application. This nomogram may be useful for self-screening the risk of dyslipidemia in healthy adults.

www.nature.com/scientificreports/
absolute shrinkage and selection operator (LASSO) analysis, an appropriate tool for selecting more favorable variables by re-weighting the LASSO penalty for each variable. Based on this nomogram, healthy adults could self-screen for the risk of dyslipidemia, and dyslipidemia patients could design an exercise plan to maintain a healthy body size and shape to reduce the risk of dyslipidemia.

Study population. This cross-sectional study was approved by the Institutional Review Board (IRB) of Wuhan Sports University and conducted in accordance with the principles of the latest version of the Declaration of Helsinki. It was conducted at the Hubei Institute of Sport Science from November 18, 2019, to August 11, 2020, and aimed to observe the effect of body composition indices on dyslipidemia. A total of 1115 adults volunteered to participate and provided written informed consent. From these 1115 participants, we excluded those who (1) had been diagnosed with dyslipidemia (because exposure and outcome are measured simultaneously in cross-sectional studies, the outcome of interest in this study is undiagnosed dyslipidemia), cardiovascular diseases (including hypertension, myocardial infarction, coronary artery disease, heart failure, peripheral artery disease, or stroke), diabetes, abnormal liver function, abnormal renal function, abnormal thyroid function, malignancy, and/or active inflammatory diseases (n = 15); (2) had no data on their general characteristics (n = 59), including age, marital status, educational level, type of occupation, smoking status, frequency of physical activity, previous medical history, and/or annual income; or (3) had no data on their body composition measurements (n = 26). The final study sample included 1015 participants (495 males and 520 females) aged 19-68 years.
General characteristics. General characteristics were obtained through a questionnaire completed by each participant, which covered the following information: name, sex, age, marital status, educational level, type of occupation, smoking behavior, drinking status, frequency of physical activity, previous medical history, and annual income. According to the latest World Health Organization (WHO) classification criteria for age groups, participants were grouped into youth (age < 44 years), middle-aged (44 years ≤ age ≤ 60 years), and elderly (age > 60 years) categories [13]. Marital status was classified into two groups, namely living alone (single, divorced, separated, or widowed) and living as a couple (married, cohabitant, or other relationships). Educational level was divided into two groups, namely those with tertiary education (university education or higher) and those without. Type of occupation was self-reported as manual or non-manual. Smoking behavior was classified as never smoking, formerly smoking (having not smoked for more than 6 months), and currently smoking (having smoked at least one cigarette within the past 6 months). Alcohol intake was classified into two groups, i.e., currently consuming alcohol or not (drinking alcohol in the past or never drinking). Physical activity was classified into three categories: none, irregular (≤ 2 episodes/week), and regular (≥ 3 episodes/week). Annual income was categorized as < 100,000 RMB or ≥ 100,000 RMB.
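The grouping rules above can be written out explicitly. The following is a minimal sketch of the age, physical activity, and income categories as stated in the text; the function names are hypothetical and are not taken from the study's analysis code.

```python
def age_group(age_years):
    """Age bands from the text: < 44 youth, 44-60 middle-aged, > 60 elderly."""
    if age_years < 44:
        return "youth"
    if age_years <= 60:
        return "middle-aged"
    return "elderly"


def activity_group(episodes_per_week):
    """Physical activity: none, irregular (<= 2 episodes/week), regular (>= 3)."""
    if episodes_per_week <= 0:
        return "none"
    if episodes_per_week <= 2:
        return "irregular"
    return "regular"


def income_group(annual_income_rmb):
    """Annual income split at 100,000 RMB, as in the questionnaire coding."""
    return ">= 100,000 RMB" if annual_income_rmb >= 100_000 else "< 100,000 RMB"
```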
Body composition indices measurements. Height, weight, hip circumference (HC), waist circumference (WC), fat mass, and body fat percentage were measured as body composition indices by trained staff following standard procedures, with participants dressed in light clothing and without shoes. Height was measured to the nearest 0.1 cm using a stadiometer (Seca). WC and HC were measured to the nearest 0.1 cm with a nonelastic measuring tape. Weight and body fat percentage were measured using a direct segmental multi-frequency bioelectrical impedance analyzer (InBody 770). All measurements were taken three times, and the mean value was used in this study. We also calculated body mass index (BMI) as weight (kg) divided by height squared (m²); waist-to-hip ratio (WHR) as WC (cm) divided by HC (cm); waist-to-height ratio (WHtR) as WC (cm) divided by height (cm); and hip-to-height ratio (HHtR) as HC (cm) divided by height (cm). WHR, WHtR, HHtR, and body fat percentage were categorized based on the best cut-off points obtained from receiver operating characteristic (ROC) curve analysis. BMI was classified as < 18.5, 18.5-24.9, 25.0-29.9, and ≥ 30 using the WHO international standards [14].
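As an illustration of the derived indices defined above, the following sketch computes BMI, WHR, WHtR, and HHtR from the four raw measurements and assigns the BMI band. The function names and the example values are hypothetical; only the formulas and the BMI bands come from the text.

```python
def body_indices(weight_kg, height_cm, waist_cm, hip_cm):
    """Return BMI, WHR, WHtR, and HHtR from raw measurements."""
    height_m = height_cm / 100.0  # BMI uses height in metres
    return {
        "BMI": weight_kg / height_m ** 2,
        "WHR": waist_cm / hip_cm,
        "WHtR": waist_cm / height_cm,
        "HHtR": hip_cm / height_cm,
    }


def bmi_band(bmi):
    """Assign the BMI band used in the text (< 18.5, 18.5-24.9, 25.0-29.9, >= 30)."""
    if bmi < 18.5:
        return "< 18.5"
    if bmi < 25.0:
        return "18.5-24.9"
    if bmi < 30.0:
        return "25.0-29.9"
    return ">= 30"


# Hypothetical participant: 70 kg, 175 cm tall, waist 85 cm, hip 100 cm.
example = body_indices(70.0, 175.0, 85.0, 100.0)
```

For this hypothetical participant, the BMI is about 22.9 (band 18.5-24.9) and the WHR is 0.85.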
Serum lipid measures and the definition of dyslipidemia. After an overnight fast, peripheral blood samples were collected from participants to measure total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C). All measurements were performed at the Hubei Provincial Hospital of Integrated Chinese and Western Medicine using the same standard procedures. According to the 2016 Chinese Guidelines for the Management of Dyslipidemia in Adults [15], dyslipidemia was defined as having TC ≥ 6.2 mmol/L, TG ≥ 2.3 mmol/L, LDL-C ≥ 4.1 mmol/L, and/or HDL-C ≤ 1.0 mmol/L.
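The outcome definition above is an "any criterion met" rule. A minimal sketch of it follows, with a hypothetical function name and all inputs in mmol/L; the cut-offs are those quoted in the text.

```python
def has_dyslipidemia(tc, tg, ldl_c, hdl_c):
    """Return True if any lipid criterion from the text is met (values in mmol/L)."""
    return (
        tc >= 6.2        # total cholesterol
        or tg >= 2.3     # triglycerides
        or ldl_c >= 4.1  # LDL cholesterol
        or hdl_c <= 1.0  # HDL cholesterol (low values are the abnormal direction)
    )
```

Because the criteria are combined with "and/or", a single abnormal value is sufficient to classify a participant as having dyslipidemia.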

Statistical analysis. All data analyses were performed with R software (version 4.0.3; https://www.R-project.org). Univariate logistic regression analysis was performed to compare the non-dyslipidemia and dyslipidemia groups and to calculate odds ratios (OR) and p-values. Before the predictive model was built, the LASSO method was used to select the optimal variables and eliminate redundant ones [16]. Then, based on the variables selected by the LASSO regression model, multivariable logistic regression analysis was performed and a visual nomogram was constructed as the predictive model. Two methods were used to assess the nomogram. First, the C-index was measured, and internal validation was performed by bootstrapping (1000 bootstraps) to quantify the discrimination of the proposed dyslipidemia nomogram [17]. Second, decision curve analysis (DCA) was used to evaluate the clinical effect of the nomogram and calculate its net benefit [18]. Unless otherwise stated, p-values < 0.05 were considered significant.
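For reference, the LASSO step for a binary outcome can be written as a penalized logistic regression. The notation below is an assumption introduced here for illustration and does not appear in the original text: $y_i \in \{0, 1\}$ is the dyslipidemia indicator for participant $i$, $x_i$ is the vector of candidate predictors, $n$ is the number of participants, and $\lambda \ge 0$ is the tuning parameter. The $1/n$ scaling is one common convention.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

Larger values of $\lambda$ shrink more coefficients exactly to zero; the predictors whose coefficients remain non-zero at the chosen $\lambda$ are the ones carried forward into the multivariable logistic regression.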
Feature selection. Of the candidate variables, 10 were retained in the LASSO binary logistic regression model at the value of lambda selected by the minimum criterion (Fig. 1): sex, age, marital status, educational level, physical exercise, annual income, WHR, WHtR, BMI, and body fat percentage.
Construction of the prediction model. Table 2 presents the results of the multivariable logistic regression analysis for sex, age, marital status, education level, physical exercise, annual income, WHR, WHtR, BMI, and body fat percentage. A model was developed by introducing the above-mentioned independent variables and presented in a nomogram (Fig. 2).
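A nomogram is a graphical way of evaluating the fitted logistic model: each predictor contributes to a linear predictor, which is then mapped to a probability. The sketch below shows that mapping. The intercept, the coefficient values, and the predictor names are HYPOTHETICAL placeholders chosen for illustration; they are not the fitted values reported in Table 2.

```python
import math

# HYPOTHETICAL values for illustration only; NOT the coefficients from Table 2.
INTERCEPT = -2.0
COEFS = {
    "male": 0.5,
    "age_44_or_more": 0.4,
    "whr_0.85_or_above": 0.3,
}


def predicted_risk(indicators):
    """Map 0/1 predictor indicators to a predicted probability of dyslipidemia.

    Predictors missing from `indicators` are treated as 0 (reference level).
    """
    linear_predictor = INTERCEPT + sum(
        COEFS[name] * indicators.get(name, 0) for name in COEFS
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


baseline = predicted_risk({})
all_present = predicted_risk({name: 1 for name in COEFS})
```

With these placeholder values, the predicted risk rises from the baseline as more risk indicators are present, which is the behavior that a nomogram's point scale encodes visually.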
Performance of the dyslipidemia nomogram. Calibration curves were plotted to assess the dyslipidemia nomogram, and the calibration curve for predicting dyslipidemia risk in healthy adults suggested good performance (Fig. 3). To quantify discrimination, the C-index was calculated, and the nomogram was subjected to bootstrapping validation (1000 bootstraps) to obtain a corrected C-index. The C-index for the prediction nomogram was 0.737 (95% CI 0.701-0.773), which suggested good discrimination. Good calibration and discrimination were also obtained in the internal validation, with a C-index of 0.718.
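For a binary outcome, the C-index is the proportion of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half. The sketch below computes this apparent C-index on toy data; it does not reproduce the bootstrap correction used in the study, and the function name and data are hypothetical.

```python
def c_index(predicted, observed):
    """Concordance for a binary outcome: P(risk of case > risk of non-case)."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    non_cases = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for risk_case in cases:
        for risk_non_case in non_cases:
            if risk_case > risk_non_case:
                concordant += 1.0
            elif risk_case == risk_non_case:
                concordant += 0.5  # tied predictions count as half
    return concordant / (len(cases) * len(non_cases))


# Toy data: four participants, two with dyslipidemia (observed = 1).
toy_c = c_index([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs concordant
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.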
Clinical use. DCA was conducted to determine the clinical usefulness of the dyslipidemia nomogram by quantifying the net benefits at different threshold probabilities, which demonstrated that there was more benefit than either the treat-all or treat-none scheme when using the dyslipidemia nomogram at a threshold probability of 2-45% (Fig. 4).
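The net benefit plotted in a decision curve follows a standard formula: at threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), where TP and FP are the numbers of true and false positives when everyone with predicted risk ≥ p_t is flagged. The sketch below applies this formula to toy data; the function names and data are hypothetical and are not the study's implementation.

```python
def net_benefit(predicted, observed, threshold):
    """Net benefit of flagging everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(observed)
    true_pos = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(observed, threshold):
    """Net benefit of the reference strategy that flags every participant."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(observed) / len(observed)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data: the treat-none strategy always has a net benefit of 0.
toy_pred, toy_obs = [0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]
nb_model = net_benefit(toy_pred, toy_obs, 0.3)
nb_all = net_benefit_treat_all(toy_obs, 0.3)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat-none), which is the comparison summarized for the 2-45% range above.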

Conclusions
Obesity and the accompanying dyslipidemia are major risk factors for atherosclerotic cardiovascular disease (ASCVD), which could be modified by exercise [18]. In this study, we aimed to reveal the quantitative relationship between body composition indices of obesity and dyslipidemia. Based on our results, self-reported healthy adults could screen for the risk of dyslipidemia themselves and thus make an exercise plan to maintain a healthy body size and shape to reduce the risk of dyslipidemia. The risk factors used in our nomogram, including sex, age, marital status, educational level, physical exercise, annual income, WHR, WHtR, BMI, and body fat percentage, are easy to obtain. Previous studies have concluded that a C-index below 0.7 indicates limited accuracy, whereas a C-index of 0.7 or above indicates good accuracy [19]. Internal validation in this study showed good calibration and discrimination ability of our proposed nomogram (C-index of 0.737 for the dyslipidemia nomogram and 0.718 for bootstrapping validation; both p < 0.001). The accuracy of our proposed dyslipidemia nomogram was demonstrated by the calibration curves.
Of the 1015 self-reported healthy participants in the study, 21.48% (n = 218) were first-recorded dyslipidemia cases. In the risk factor analysis, male sex, age of 44 years or more, living as a couple, completing tertiary education, exercising frequently, an annual income of 100,000 RMB or more, a WHR of 0.85 or above, a WHtR of 0.52 or above, a higher BMI, and a body fat percentage of 18% or above were the key individual factors associated with the risk of dyslipidemia.
Sex, age, obesity, and BMI were identified as common risk factors, consistent with previous studies [3,20]. Compared with the prediction model constructed by Zhang et al., the risk factors included in our model are much easier to obtain and can be used for disease screening in the general population [20]. The demographic risk factors of sex and age, which have been reported in numerous studies [7,21,22], are non-modifiable but essential in predicting the risk of dyslipidemia. Living as a couple, finishing tertiary education, exercising frequently, and having an annual income of 100,000 RMB or more may indicate a better living situation, in which a high-fat, high-calorie diet is more likely, thus contributing to increased blood lipid levels [23]. Unexpectedly, unlike in previous studies, exercise tended to be a risk factor rather than a protective factor in our study. Several reasons may explain this finding. First, reverse causality may have occurred. Second, people who exercise frequently may have better living conditions and greater exposure to a high-fat and high-calorie diet, which may increase the proportion of individuals with this less-than-healthy diet and thereby make dyslipidemia more common. Third, differences in the type, intensity, and duration of exercise can have different effects on health: some people exercise frequently, but the duration and intensity of their exercise do not meet the recommended standard, which reduces the health benefits of exercise. Unfortunately, we did not collect data on the type, intensity, and duration of exercise. The body composition indices of WHR, WHtR, BMI, and body fat percentage have been linked to dyslipidemia in numerous previous studies [24][25][26][27].
We found that the ability of BMI and body fat percentage to predict the risk of dyslipidemia was superior to that of WHR and WHtR.
Dyslipidemia is a major risk factor for the development of type 2 diabetes (T2DM), atherosclerosis, stroke, and CVDs [28][29][30]. The nomogram prediction tool that we developed may facilitate appropriate measures, when there is more benefit than under either the treat-all or treat-none scheme, to control blood lipid levels and attenuate the progression of dyslipidemia-related chronic diseases. Furthermore, unlike in other studies, all the risk factors included in our study could be obtained by the participants themselves, making the tool very easy to use.
Our study also has some limitations that warrant consideration. First, data from a small percentage of the population in one region are not representative of all Chinese people. Second, our study was cross-sectional; thus, we could not determine causal relationships, and further validation in prospective studies is needed. Third, not all potential factors that could affect blood lipid levels, such as eating habits and sleep disorders, were included in the risk factor analysis. Finally, although the robustness of our nomogram was tested by bootstrapping validation, there was no external validation in our study to demonstrate the generalizability of this model to populations in other regions and countries.
In conclusion, our proposed nomogram for dyslipidemia based on multivariate logistic regression analysis of 10 risk factors is suitable for predicting dyslipidemia risk, as evaluated by bootstrap validation. This visual and simple-to-use tool may be useful for self-reported healthy adults in self-screening the risk of dyslipidemia. It may also be helpful for dyslipidemia patients in making exercise plans to maintain a healthy body size and shape to reduce the risk of dyslipidemia.

Data availability
Datasets generated and/or analyzed during the current study are not publicly available due to private information about patients but are available from the corresponding author upon reasonable request.

Figure 4. Decision curve analysis for the proposed nomogram for dyslipidemia. The decision curve showed that, when the threshold probability ranged from 0.02 to 0.45, more benefit could be obtained from our model than from the intervention-all-patients scheme or the intervention-none scheme.