Development of a Predictive Model of Cardiovascular Risk in a Male Population from the Peruvian Amazon

Background: The coexistence of malnutrition due to over- and under-nutrition in the Peruvian Amazon increases chronic diseases and cardiovascular risk. Methods: A cross-sectional study of a male population in which anthropometric, clinical, and demographic variables were obtained to create a binary logistic regression predictive model of cardiovascular risk. Results: We compared two models with good predictive results, finally choosing Model 4 (r2 = 0.57, sensitivity 73.68%, specificity 95.35%, Youden index 0.69, and validity index 94.21%), with non-invasive variables such as blood pressure (p < 0.001), hip circumference (p < 0.001), and FINDRISC test result (p < 0.05). Conclusions: We developed a cheap, fast, and non-invasive tool to determine cardiovascular risk in the population of this endemic area.


Introduction
The coexistence of under- and over-nutrition is known as the double burden of malnutrition [1]. Because it is related to many pathologies derived from these two nutritional statuses, the double burden of malnutrition is becoming an emerging crisis for all middle- and low-income countries [2]. Overweight and obesity have increased substantially in these countries, reaching a prevalence of 21.1% and rising even more in Peru, to 47.9% [3]. This problem is due to the changes in food patterns that occurred throughout South America, where traditional food models based on pre-Columbian cultures evolved into Western ones. These changes increase the rates of high blood pressure (HBP), type 2 diabetes mellitus (DMT2), and dyslipidemia [4]. The situation is made worse by constant migration from rural areas to urban peripheries, where the prevalence of other risk factors such as illiteracy, violence, stress, or malnutrition is higher [5].
The prevalence of HBP in Latin America is 45.5%, and in Peru it reaches 14.5%, with a rate of 26.6% in the jungle, according to the TORNASOL II study [6]. Regarding DMT2, the prevalence in the urban population of Latin America is between 4% and 8%. However, the data are relatively scarce, and the percentage of patients without a diagnosis is around 30-50%. In addition, it is estimated that the prevalence could be much higher in rural areas. In the case of Peru, this percentage of people with diabetes was 25.6% [4]. In terms of dyslipidemia, the data showed 17.4% hypercholesterolemia and 14.9% hypertriglyceridemia [7]. This research was conducted in Iquitos, specifically in Pueblo Libre, a slum located in the district of Belen that is characterized by periodic flooding between February and June. The geographical location of Iquitos gives it a transitional character between the communities near the rivers of the jungle and the city itself [8].
Few studies have been conducted to evaluate the risk of DMT2, HBP, and dyslipidemia in this area, probably due to the difficulty in performing laboratory analysis and clinical tests in disadvantaged zones. These problems are linked to the unavailability of material and human resources and poor conditions [9]. For these reasons, we considered it necessary to develop a non-invasive, inexpensive, and easy-to-use diagnostic or screening method for these chronic conditions. This method could provide two advantages: (i) it does not require laboratory tests, and (ii) it makes it possible to identify which individuals at risk (with a positive result) could benefit from laboratory testing.
In 2003, Lindström and Tuomilehto proposed the FINDRISC questionnaire to predict the risk of developing DMT2 within ten years. Although it was designed for the residents of Finland, several studies have used it in different populations around the world, which shows that it is a flexible instrument for screening individuals at high risk of diabetes and for supporting prevention [10].
In 2016, our research group conducted a study among the women of this population because of their high vulnerability [8]. Three years later, we returned to study the men of the same population in order to develop a non-invasive method to predict cardiovascular risk (CR) and compare results.

Materials and Methods
A cross-sectional study was conducted in the slum of Pueblo Libre (Iquitos) from February to May 2019. The total number of residents was 6042. From these, we selected those over 16 years old, obtaining a total of 3017 subjects, of whom 1424 were men. Using Epidat 4.2, we determined that the sample size required for an absolute precision of 3%, a confidence level of 95%, and an expected prevalence of 3% and 11.3% for DM and HBP, respectively, was 329 participants. Therefore, a sample of 363 participants, stratified by sector within the population and by age, was selected. Cardiovascular risk (CR) was defined as the co-presence of at least two of the following conditions: HBP, DM, or obesity.
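The sample-size figure can be checked with the standard formula for estimating a proportion in a finite population. The sketch below is only an illustration: it assumes that the reference population is the 1424 eligible men and that a normal-approximation formula with finite population correction applies; the exact method used by Epidat 4.2 is not stated in the text, and the function name is hypothetical.

```python
from statistics import NormalDist


def sample_size_for_proportion(population, expected_prevalence, precision, confidence=0.95):
    """Sample size to estimate a proportion in a finite population.

    Uses n = N*z^2*p*q / (d^2*(N-1) + z^2*p*q), where d is the absolute precision.
    Assumption: this is a textbook formula, not necessarily Epidat's algorithm.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    variance_term = z ** 2 * expected_prevalence * (1 - expected_prevalence)
    return population * variance_term / (precision ** 2 * (population - 1) + variance_term)


# Inputs taken from the text: 1424 men, 3% precision, 95% confidence.
n_hbp = sample_size_for_proportion(1424, 0.113, 0.03)  # expected HBP prevalence 11.3%
n_dm = sample_size_for_proportion(1424, 0.03, 0.03)    # expected DM prevalence 3%

# The larger of the two requirements determines the minimum sample.
print(round(max(n_hbp, n_dm)))
```

Under these assumptions the HBP prevalence is the more demanding requirement and yields a value of about 329, which agrees with the figure reported above.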
The researchers followed the international standards for anthropometric assessment (ISAK) [14] to collect the anthropometric data. All measurements were performed by specifically trained staff, who took each measure three times to reduce the variation coefficients. Finally, the mean of these three measures was registered.
Weight was assessed to the nearest 0.1 kg using a Tanita BC545NSV20 electronic scale (Tanita, Tokyo, Japan). Then, a digital measuring stadiometer with an accuracy of 0.1 cm, model AC 1200D (Davi & Cia, Barcelona, Spain), was used to measure height.
A Lufki W606PM anti-elastic metallic tape (Lufki, Missouri City, TX, USA) with an accuracy of 0.1 cm was used to measure waist and hip circumference. Waist circumference was measured at the midpoint between the lower edge of the last rib and the iliac crest. Hip circumference was measured with the tape placed around the hips at the level of the greater trochanter. Both circumferences were measured at the end of a normal exhalation, with participants standing upright, feet together and arms hanging at their sides.
Arterial blood pressure was measured using an OMRON M4 blood pressure monitor (OMRON, Kyoto, Japan) [15] following a rest period of 10 min. Three readings were taken, with a one-minute interval between them [16].

Ethical Considerations
This study strictly followed the guidelines of the Declaration of Helsinki on ethical principles in medical research. All participants were informed in person, both verbally and in writing, of the aims of the study. Researchers also informed them of the risks and benefits of their involvement in this project. All informed consent forms were signed and preserved.

Statistical Analysis
SPSS 22 software (IBM Corp., New York, NY, USA) was used for statistical analysis. Quantitative variables were expressed as mean and standard deviation, and qualitative variables as percentages. Depending on the normality of the data, the parametric Student's t-test or the nonparametric Mann-Whitney U test was applied to compare two means. To compare three or more means, analysis of variance (ANOVA) was used as the parametric test, with the Bonferroni method for post-hoc contrasts, and the Kruskal-Wallis test was used as the nonparametric test. Goodness of fit to a normal distribution was assessed with the Kolmogorov-Smirnov test with the Lilliefors correction (n > 50), supported by histograms and Q-Q and P-P plots; for samples of fewer than 50 individuals, the Shapiro-Wilk test was used. When necessary, the chi-square test and Fisher's exact test were used to compare the percentages of qualitative variables.
Binary logistic regression was also performed, determining crude odds ratios (ORs) for each independent variable and adjusted ORs for the variables in the final model. The Wald test was applied as the statistical contrast test, and the Hosmer-Lemeshow test was used to establish the model's goodness of fit. Finally, the deviance and the Cox-Snell and Nagelkerke coefficients of determination were used to determine the model's predictive validity.
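As an illustration of how the Cox-Snell and Nagelkerke coefficients relate to a fitted logistic model, the following minimal sketch computes both from the log-likelihoods of the null (intercept-only) and fitted models. The function names and the example log-likelihood values are hypothetical; the study's own values were produced by SPSS and are not reproduced here.

```python
import math


def cox_snell_r2(ll_null, ll_model, n):
    """Cox-Snell pseudo-R2: 1 - (L_null / L_model)^(2/n), using log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2: Cox-Snell rescaled by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell


# Hypothetical log-likelihoods for a sample of 363 participants (illustration only).
example_cs = cox_snell_r2(ll_null=-100.0, ll_model=-60.0, n=363)
example_nk = nagelkerke_r2(ll_null=-100.0, ll_model=-60.0, n=363)
print(f"Cox-Snell: {example_cs:.3f}, Nagelkerke: {example_nk:.3f}")
```

Because the Cox-Snell coefficient cannot reach 1 for a binary outcome, the Nagelkerke value is always at least as large, which is why it is usually the one quoted as the model's coefficient of determination.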
We also plotted receiver operating characteristic (ROC) curves and calculated the area under the curve (AUC) for each independent variable to determine the model's predictive power. Finally, the sensitivity, specificity, positive and negative predictive values, and Youden index of the two final models were determined.
The significance level was set at an alpha error of 5% for all tests, and confidence intervals were calculated at the 95% confidence level.

Population and Sample
The sample consisted of 363 men ranging from 18 to 84 years of age. In addition, 75.5% were married, and 76.6% had no formal education or only primary education.
Regarding their nutritional status, the mean BMI was 26.5 (4.5) kg/m²; 42.3% had a healthy weight, 37.6% were overweight, and 20.3% were obese. Table 1 summarizes the remaining characteristics of the sample.

Bivariate Analysis and Logistic Regression of HBP, DMT2, and CR
Based on the demographic and anthropometric variables, the items included in FINDRISC, and the presence of the different chronic diseases studied, associations were analyzed through bivariate analysis and logistic regression (Tables 2-4). DMT2 was present in 1.6% (95% CI 0.2-3.1) of the subjects. No significant differences were found for demographic or anthropometric variables except for BAI (p < 0.05). Table 3 shows the results regarding HBP. The prevalence of this disease was 22.5% (95% CI 18.9-26.95%) (p < 0.001).
The results for the CR variable are shown in Table 4, in which the relationship with employment status stands out (p < 0.05), with CR being more prevalent among the unemployed (10.3%). Differences were also found by nutritional status (p < 0.001), with a notably high prevalence of CR in obese participants (39.2%). All the anthropometric variables showed significant differences except for ABSI. The number of significant variables remained constant in the logistic regression.
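For readers unfamiliar with the anthropometric indices named in this section, the sketch below shows how they are commonly calculated from the measurements described in Materials and Methods. It assumes the commonly published definitions of the body adiposity index (BAI) and a body shape index (ABSI); the text does not state which formulas were applied, and the participant values are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm, hip_cm):
    """WHR: waist circumference divided by hip circumference."""
    return waist_cm / hip_cm


def waist_to_height_ratio(waist_cm, height_cm):
    """WHtR: waist circumference divided by height (same units)."""
    return waist_cm / height_cm


def body_adiposity_index(hip_cm, height_m):
    """BAI (assumed definition): hip (cm) / height (m)^1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18.0


def a_body_shape_index(waist_m, bmi_value, height_m):
    """ABSI (assumed definition): waist (m) / (BMI^(2/3) * height (m)^(1/2))."""
    return waist_m / (bmi_value ** (2.0 / 3.0) * height_m ** 0.5)


# Hypothetical participant, for illustration only.
weight_kg, height_m, waist_cm, hip_cm = 70.0, 1.65, 92.0, 96.0
participant_bmi = bmi(weight_kg, height_m)
indices = {
    "BMI": participant_bmi,
    "WHR": waist_to_hip_ratio(waist_cm, hip_cm),
    "WHtR": waist_to_height_ratio(waist_cm, height_m * 100.0),
    "BAI": body_adiposity_index(hip_cm, height_m),
    "ABSI": a_body_shape_index(waist_cm / 100.0, participant_bmi, height_m),
}
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

All five indices require only a scale, a stadiometer, and a tape measure, which is consistent with the non-invasive aim of the study.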

Comparison of Adjusted Models and Diagnostic Accuracy for CR
Finally, four adjusted logistic regression models were calculated to predict CR, built from the significant variables of the crude model (Table 5). Of the four, Models 1 and 4 were compared (Table 6), as they showed the best goodness of fit: r2 = 0.62 and r2 = 0.57, respectively. Model 1 included HBP as a qualitative variable (OR = 56.8, 95% CI 15.15-214.21, p < 0.001) and waist circumference (WC) as a quantitative variable (OR = 1.26, 95% CI 1.15-1.38, p < 0.001); it was excluded because of the very high odds ratio and wide confidence interval obtained for HBP. Model 4 included high/low FINDRISC as a qualitative variable (OR = 7.86, 95% CI 1.42-43.5) and two quantitative variables: systolic blood pressure (SBP) (OR = 1.08, 95% CI 1.05-1.12, p < 0.001) and hip circumference (HC) (OR = 1.24, 95% CI 1.14-1.35, p < 0.001).
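To make the adjusted odds ratios of Model 4 easier to interpret, the following sketch converts them to logistic regression coefficients and shows how the odds of CR scale with multi-unit changes in a quantitative predictor. It assumes that the ORs for SBP and HC are expressed per 1 mmHg and per 1 cm, respectively, which is the usual convention but is not stated explicitly. The model intercept is not reported, so absolute probabilities cannot be computed; the names used here are hypothetical.

```python
import math

# Adjusted odds ratios reported for Model 4.
MODEL_4_ODDS_RATIOS = {
    "findrisc_high": 7.86,  # high vs. low FINDRISC category
    "sbp_mmhg": 1.08,       # assumed to be per 1 mmHg of systolic blood pressure
    "hc_cm": 1.24,          # assumed to be per 1 cm of hip circumference
}


def coefficient(odds_ratio):
    """Logistic regression coefficient (log-odds change) for an odds ratio."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, delta_units):
    """Factor by which the odds change when the predictor changes by delta_units."""
    return odds_ratio ** delta_units


coefficients = {name: coefficient(value) for name, value in MODEL_4_ODDS_RATIOS.items()}
sbp_plus_10 = odds_multiplier(MODEL_4_ODDS_RATIOS["sbp_mmhg"], 10)
hc_plus_5 = odds_multiplier(MODEL_4_ODDS_RATIOS["hc_cm"], 5)
print(f"Odds multiplier for +10 mmHg SBP: {sbp_plus_10:.2f}")
print(f"Odds multiplier for +5 cm HC: {hc_plus_5:.2f}")
```

Under these assumptions, a 10 mmHg higher SBP roughly doubles the odds of CR, and a 5 cm larger hip circumference roughly triples them, holding the other variables constant.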
ROC curves were plotted to determine the discriminant ability of Model 4 (Figure 1). From these, cutoff values were determined to calculate diagnostic accuracy indicators. Our model achieved a sensitivity of 73.68%, a specificity of 95.35%, and a Youden index of 0.69. The validity index was 94.21%.
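The diagnostic accuracy indicators reported here follow directly from a 2x2 classification table. The sketch below shows the calculations and checks the Youden index against the reported sensitivity and specificity. The confusion-matrix counts in the second example are hypothetical, since the paper's own table is not reproduced in the text, and "validity index" is assumed to mean overall accuracy.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def diagnostic_indicators(tp, fp, fn, tn):
    """Standard accuracy indicators from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                         # positive predictive value
        "npv": tn / (tn + fn),                         # negative predictive value
        "validity_index": (tp + tn) / (tp + fp + fn + tn),  # assumed: overall accuracy
        "youden": youden_index(sensitivity, specificity),
    }


# Check against the values reported for Model 4.
reported_youden = youden_index(0.7368, 0.9535)
print(f"Youden index from reported values: {reported_youden:.2f}")

# Hypothetical counts, for illustration only.
example = diagnostic_indicators(tp=30, fp=20, fn=10, tn=140)
print(example)
```

A Youden index close to 1 indicates that the chosen cutoff separates cases from non-cases well; here the high specificity contributes most of the value.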

Discussion
During our second stay in this area, this screening was conducted to determine the prevalence of non-communicable diseases (NCDs) such as HBP and DM, nutritional status, and CR. This time, we completed the study with the 363 men for whom we did not obtain data during our first stay. The prevalences of overweight and obesity were 37.5% and 20.3%, respectively, lower than the 39.1% and 21.3% given in the latest report published by the Peruvian Health Institute and its Food Surveillance System by Life Stages (VIANEV) [18].
Regarding chronic diseases, the prevalence found was 1.6% for DMT2 and 22.5% for HBP, again lower than the national averages for men of 12.5% and 33.6%, respectively. CR, our main outcome variable, which we define as the joint presence of two or more of these NCDs (DMT2, HBP, obesity), reached a prevalence of 8.24%, well below the 43% found for the South American and Caribbean population in the meta-analysis by Huaquía-Díaz et al. [19].
The difference in the prevalence of DMT2 compared to the national average is remarkable. We hypothesized that the highly endemic character of the population, in which indigenous individuals account for 21.3% of men in the province of Maynas, where Pueblo Libre is located, could have led to percentages different from the national or regional averages, in which all phenotypes and ethnicities are considered collectively [20].
At first sight, the health situation appeared better than in the rest of the country. However, indicators of abdominal obesity such as the waist-to-height ratio (WHtR) and waist-to-hip ratio (WHR) were above average, at 0.56 and 0.96, respectively. This indicated the existence of cardiovascular risk in the local population, as also reported by Paz-Krumdiek et al. for other areas of the Peruvian Amazon [21].
In addition, based on the previous experience of other authors with Mediterranean populations from Spain [22], we related the items that make up the FINDRISC test and its cutoff points to factors that predispose to the presence of, or an increase in, CR. When the test scores were grouped dichotomously into high and low risk, the association with HBP, DMT2, obesity, and CR was highly significant.

Importance of a Predictive Model for the Amazonian Population
The prevalence of the different NCDs in Pueblo Libre is much lower than the data collected in national reports. This fact indicates the need to modify the cutoff points of the scales or variables used in diagnosing NCDs, adapting them to the phenotypic characteristics of the population. Other authors have already shown this in other contexts [23]. In addition to the lack of adaptation of existing models to the characteristics of this community, there was a complete lack of predictive instruments developed for our population or for any other group with similar characteristics. Rodrigo M. Carrillo-Larco et al. [24] emphasized the same deficit in their research, which went beyond the countries bordering northern Peru to the rest of Latin America and the Caribbean.
Considering this deficiency, we developed several predictive models for CR based on logistic regressions that included the study's independent variables. Among these, Model 4 stood out, incorporating the quantitative variables SBP and HC and the dichotomous qualitative variable low/high FINDRISC. This model showed an adjusted coefficient of determination R2 = 0.57, a Youden index of 0.69, 73.68% sensitivity, and 95.35% specificity.
In other words, we believe that the proposed model improves the ability to predict CR in the Amazonian population. Furthermore, since it has only three explanatory variables, obtained by interview and with non-invasive, manual techniques, it is notably simple to apply. Any health professional could implement the proposed model in any health context (hospital, community, educational, military, or social). However, this non-invasive approach could run up against the reality of health policy in most low- and middle-income countries, since there is a gap between the development of laws that include tools for controlling NCDs and their actual applicability in the target population [25,26].

Comparison with Other Models
Previous models developed to predict CR included behavioral, and therefore subjective, variables and laboratory-analyzed variables that required a blood sample and more resources and expenditure. These were easily applicable in high-income countries where access to private or public health care resources was readily available. Such is the case of the model developed by Li Yang et al., with an AUC = 0.781, based primarily on clinical variables such as HDL cholesterol, LDL, triglycerides, basal blood glucose, etc. [27]. There were also models obtained through artificial intelligence techniques such as machine learning, using information from databases containing complete medical histories (previous diseases, treatments, surgical tests, diagnostic tests, etc.) of patients with no prior CR, that obtained an AUC = 0.781 [28]. Even the Framingham test, widely used by the scientific community and with an AUC = 0.721 [29], included data on dyslipidemia that would be difficult to obtain in a setting such as the Peruvian Amazon.
However, another predictive model found in the literature reinforces the main arguments for our model: simplicity, speed, and non-invasiveness. It was developed for the early detection of metabolic syndrome (MS) and even shares some of the variables used for prediction, such as HBP [30].

Applicability in the Amazonian Context
Peru is a country that has, in macroeconomic terms, moved over the last few years from the group of low-income countries toward that of upper middle-income countries [31]. However, the Loreto region, where our study population is located, is far from the national average, and its characteristics are much closer to those of a low-income country [32]. Indeed, the lower level of resources in this area highlights the value of assessment tools such as the one developed in our study, since the combination of silent diseases and altered nutritional states, which can occur in high-income settings as well as in low-income settings such as the city of Iquitos, has different consequences in each. In other words, its impact on quality of life and life expectancy is unequal, because high-income settings have greater resources available for primary, secondary, and tertiary care [33]. Therefore, the tools and approaches used in Western populations should be adapted to people with the same characteristics as our sample, since their dietary patterns, lifestyles, and access to health care are radically different [34,35], so that early diagnosis helps to avoid the need for care and treatment that are difficult to obtain.

Study Limitations
After analyzing the data, we considered that a larger sample containing data from both men and women would be needed to improve our model's accuracy. The ethnic and socio-cultural characteristics of such an endemic area meant that, in general, the prevalence of the variables detected differed from what was expected. In addition, the percentage of DMT2 was very low, which probably explains why, despite the good AUC obtained, the confidence intervals were wide and the R2 was not greater than 0.6. We expect that a larger sample size would overcome these limitations and would also help to determine the causes of the low prevalence of DMT2, which is especially striking given the high intake of carbohydrates and sugars.

Conclusions
A model for predicting CR in a specific population of the Peruvian Amazon was developed based on independent anthropometric variables, nutritional status, and the presence of HBP or DMT2. The model's advantages are that it can be applied easily by any health professional and, above all, that it requires no blood tests or unavailable resources. The tool was used to measure CR, with results below expectations. In future studies, we will try to reach a larger sample size to test the hypothesis that the cutoff points proposed by the WHO to detect NCDs and obesity are not equally accurate for this population.