High blood pressure prediction based on AAA++ using machine-learning algorithms

Abstract: The heart pumps blood around the body to supply energy and oxygen to all of its tissues. To do so, the heart pushes blood against the walls of the arteries, which creates pressure inside the arteries called blood pressure (BP). If this pressure is above the desired level, it is treated as high blood pressure (HBP). The number of people with HBP is growing across the globe. BP may be elevated by a change in the biological or psychological state of a person. In this paper, we consider the attributes age, anger, and anxiety (AAA), together with obesity (+) and cholesterol level (+), to predict whether a person is prone to HBP. Obesity and total blood cholesterol are treated as post-increments of AAA, one + each, because experimental results reveal that their impact is smaller than that of AAA. We use different classifiers for prediction, where each classifier considers the impact of each A in AAA along with the obesity and cholesterol level of a person to predict whether that person becomes a victim of HBP. The random forest algorithm achieved 87.5% accuracy in prediction.


ABOUT THE AUTHOR
Mr. Satyanarayana Nimmala is presently pursuing his Ph.D. at Osmania University, Hyderabad, India. He received his B.Tech from Kakatiya University, Kothagudem, India, and his M.Tech from JNTUH, Hyderabad, India. He is presently working as an associate professor in the Department of CSE, CVR College of Engineering, Hyderabad, India. His research interests are data mining, bioinformatics, and machine learning.

PUBLIC INTEREST STATEMENT
Before diagnosing and treating most diseases, doctors measure the blood pressure of a person. High blood pressure is a key factor in many diseases, such as heart stroke, brain stroke, kidney failure, and eye damage. In our research work, we focused on why the blood pressure of a person is elevated by considering the age, obesity level, cholesterol level, anger level, and anxiety level of a person. We examined the data of 1000 patients and used the parameters listed above to predict whether a person becomes a victim of high blood pressure. The results reveal that anger level, anxiety level, and obesity level play a vital role in elevating the blood pressure of a person.

Introduction
Blood pressure (BP) is represented as systolic blood pressure (SBP) over diastolic blood pressure (DBP). If SBP exceeds 140 mm Hg or DBP exceeds 90 mm Hg on repeated measurements, it is treated as high blood pressure (HBP) (Alwan, 2011). Nowadays, HBP is one of the prime causes of heart stroke and brain stroke. It may not be a serious problem if it is diagnosed and treated early, but undiagnosed HBP may cause serious health problems. Many factors may elevate BP, such as an unhealthy diet, lack of physical exercise, excess bad cholesterol, obesity, age, anger, and anxiety (Anchala, Kannuri, & Pant, 2014; Forouzanfar et al., 2015). In this paper, we focus on the impact of age, anger, anxiety, obesity level, and total blood cholesterol level in elevating BP. BP is mainly affected by the cardiac output (CO) and total peripheral resistance (TPR) (Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013). Mathematically, it can be written as

$$BP = CO \times TPR$$

where CO is affected by increased venous return, stroke volume, heart rate, or sympathetic activity (Peltokangas, Vehkaoja, & Verho, 2017). TPR is affected by the resistance that acts against the blood flow in the arteries (National Heart, Lung and Blood Institute). The arteries may resist blood flow because of a blood clot, the presence of fat inside the blood vessels, or damaged blood vessels. CO affects the SBP, whereas TPR affects the DBP (Gupta, Lo Gerfo, Raingsey, & Al, 2013).

We collected the age, obesity level, and total blood cholesterol level of each person from Doctor C, in a medical diagnostic center in Hyderabad, India. Although stress, anger, and anxiety may not raise BP for a long duration (Peltokangas et al., 2017), uncontrolled anger may affect relationships, career, and mental and physical health. Anger and anxiety levels are measured using the responses of an individual to a set of predefined questions. We set 10 predefined questions for anger measurement and 20 predefined questions for anxiety measurement. A total of 1000 people were interviewed, and their responses were noted on a scale of 0 to 3. The mean value of the responses to all the questions on anger and on anxiety is used in the prediction process along with age, obesity, and total blood cholesterol level. Table 1 represents the levels of hypertension (Alwan, 2011; Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013).

Background work
This section reviews the existing work on each parameter in AAA++, which together are the main reasons for the elevation of BP. Age: Arteries become stiff and narrowed due to aging, and their elasticity also decreases (Takazawa et al., 1998; Vishram, Borglykke, Andreasen, Jeppesen, & Ibsen, 2012). Obesity: HBP problems are more common among patients with schizophrenia, mainly due to weight gain or obesity (Millasseau, Kelly, & Ritter, 2003). Cholesterol: An increase in cholesterol level is able to influence BP, at least during sympathetic stimulation (Sakurai et al., 2011). Anger and anxiety: The American Institute of Stress reports that "stress," "pressure," "tension," and "anxiety" are often used synonymously. The US National Library of Medicine reports that the body produces a surge of hormones when we are in an anxious situation. These hormones increase BP by causing the heart to beat faster and the blood vessels to narrow (Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013). Although several researchers have addressed how BP is elevated by physical and psychological factors, these approaches suffer from the following drawbacks.
• Fail to find the exact risk of age in elevating BP (Peltokangas et al., 2017; Vishram et al., 2012)
• Fail to find the exact risk of obesity in elevating BP (Richard, 2009; World Health Organization)
• Fail to find the exact risk of total blood cholesterol in elevating BP (Ferrara, Guida, Iannuzzi, Celentano, & Lionello, 2002; Kalyan & Kanitkar, 2015)
• Fail to find the exact risk of the combined effect of age, obesity, and total blood cholesterol in elevating BP (A Global Brief on Hypertension, 2013; Sugathan, Soman, & Sankaranarayanan, 2008)
• Fail to find the exact risk of psychological factors such as anger and anxiety in elevating BP (Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013; World Health Organization, 2018)
In this paper, our focus is on the combined effect of AAA++ in elevating the BP.

Impact factors
The BP of a person is elevated because of biological and psychological changes. Biological changes include aging and increases in obesity level and total blood cholesterol. Psychological changes include anger, anxiety, stress, depression, and fear. The exact influence of each of these factors remains an open research question. In this paper, we consider AAA++ to predict whether a person is prone to HBP. The rest of this section discusses how each A and + in AAA++ elevates the BP.

Impact of age
Aging is inevitable, even if a person has a healthy diet and exercises regularly. It affects the performance of the heart in pumping blood. Heartbeats are regulated by the natural pacemaker system of the heart. As a person ages, fat is deposited in the pathways of the heart's pacemaker system, which affects the performance of the heart while pumping blood (Vishram et al., 2012). With age, the elasticity of the arteries also decreases and they become stiff (Millasseau et al., 2003; Takazawa et al., 1998). In such a situation, the heart has to push the blood with more force to pump it through the arteries to the whole body, which may in turn elevate the BP.

Impact of anger
Anger may be the result of impatience, frustration, irritation, and many other causes. It may sometimes be a positive emotion, but most of the time it is not good for health or state of mind. The way anger is handled has a significant effect on the heart and mind. Frequent explosive anger may lead to serious consequences such as elevated BP and a rise in heartbeat and pulse rate. If a person gets angry, the fight-or-flight mode of the sympathetic nervous system is activated (Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013). In response, more blood is sent to the muscles and brain, which elevates the BP (Gupta et al., 2013). Suppressing or ignoring anger is not good for health, but neither is letting it out unchecked. Everyone should therefore learn to manage anger so that its impact is as small as possible. The questionnaire used for anger measurement is shown in Table 2.

Impact of anxiety
Stress and anxiety are slightly different, even though they are represented on the same scale. The active form of stress may be considered anxiety, and the active form of anxiety may be considered depression. Nowadays, stress is one of the key factors that affect the quality of daily life. If stress is chronic and occurs frequently, it may become anxiety. An anxiety response creates a specific thought pattern in the mind, which is executed repeatedly (Global action plan for the prevention and control of noncommunicable diseases 2013-2020, 2013). A person who suffers from anxiety thinks again and again about the worst possible outcome of an ambiguous situation, even when the best outcome is more likely. Factors such as negative thinking, fear, insecurity, a sense of lacking something compared to others, worrying about a specific thing that may happen in the future, lack of confidence, doing unethical things, things not going as one perceives they should, and expectations from friends, relatives, and close ones may trigger anxiety. When a person is anxious, the fight-or-flight mode of the sympathetic nervous system is activated (A Global Brief on Hypertension, 2013; Sugathan et al., 2008). As noted earlier, this elevates the BP. However, anxiety and long-term HBP may not be linked. The body produces a surge of hormones such as adrenaline and cortisol when we are in an anxious situation, which in turn may tighten the arteries. Our experimental analysis also reveals that anxiety has a significant impact in raising the BP. The questionnaire used for anxiety measurement is shown in Table 3.

Impact of obesity
Body mass index (BMI) is used to find where a person falls on the scale of obesity. BMI is a measure of weight in proportion to height. A BMI value between 18.5 and 24.9 is treated as normal. A BMI value greater than 25 and less than or equal to 30 is treated as overweight (National Heart, Lung and Blood Institute, 2016). If the BMI value is more than 30, the person is treated as obese. Obesity is considered to be an increase of fatty tissue in the body (Kalyan & Kanitkar, 2015; Mertens & Van Gaal, 2000; Richard, 2009). To sustain the increased fatty tissue, the heart pumps blood with additional force to reach the newly formed body tissues, which may raise the BP.
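The BMI ranges above can be expressed as a small lookup. The following is a minimal sketch, not the procedure used in the study: the function names are hypothetical, the "underweight" label for values below 18.5 is an assumption (the text does not name that range), and values between 24.9 and 25 are assumed to count as normal.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def obesity_category(bmi_value: float) -> str:
    """Map a BMI value to the categories described in the text."""
    if bmi_value < 18.5:
        return "underweight"  # assumption: this range is not named in the text
    if bmi_value <= 25:
        return "normal"       # assumption: the 24.9-25 gap is treated as normal
    if bmi_value <= 30:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    value = bmi(70.0, 1.75)
    print(round(value, 1), obesity_category(value))
```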

Impact of cholesterol
Although cholesterol is needed for the formation and development of body tissues, too much cholesterol is not good for the well-being of the human body. Lipoproteins (small packages) transport cholesterol in the human body. There are two types of lipoproteins. LDL (low-density lipoprotein) cholesterol is called bad cholesterol and is not needed by the body (Kanai et al., 1990). HDL (high-density lipoprotein) cholesterol is called good cholesterol. HDL is the cholesterol most required for the functioning of many hormones of the human body. HDL carries cholesterol from all parts of the body back to the liver, where it is filtered and removed from the body. If LDL is high, it forms a fatty substance inside the arteries. This fatty substance reduces the diameter of the arteries and raises the BP.

Proposed methodology
Although anger and anxiety elevate the BP temporarily, repeated activation of these two may also lead to a long-term elevation of BP. The different factors that influence the BP of a person directly or indirectly are shown in Figure 1. It represents how each A and + in AAA++ elevates BP. It also shows that an increase in blood volume, heart rate, or stroke volume increases BP. These are normally influenced by the sympathetic and parasympathetic nervous systems of the human body. In this paper, we use a data-mining classification technique. Classification is the technique used to predict the class label of a data record or to give a descriptive analysis of a data record for making effective decisions (Satyanarayana, Ramalingaswamy, & Ramadevi, 2014). It is also called a supervised approach. The classification model consists of two stages. In stage 1, the training stage, the model is trained on a set of records whose class labels are already known. In stage 2, the testing stage, the model predicts the class labels of a set of records whose class labels are unknown, also called test records. There are various classifiers, but for experimental analysis we used the classifiers supported by WEKA (Waikato Environment for Knowledge Analysis). WEKA supports various machine-learning (ML) algorithms. ML algorithms can be broadly classified into two groups: supervised and unsupervised algorithms. Supervised algorithms are categorized as classification and regression algorithms. As we compared our experimental results with the J48, Naïve Bayes, and simple logistic regression classifiers, the rest of this section discusses these ML algorithms (Satyanarayana et al., 2014).
Experimental analysis is done on a real dataset of 1000 records, collected from Doctor C, in a medical diagnostic center in Hyderabad, India. Each record consists of the age, anger level, anxiety level, obesity level, total blood cholesterol level, SBP, and DBP of a person. We used 60% of the records to train the model and 40% of the records to test it. The random forest algorithm achieved 87.5% accuracy, which is higher than that of the other ML algorithms.
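The 60/40 partition of the records can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the split was random or how WEKA was configured, so the shuffling, the seed, and the function name `split_records` are hypothetical.

```python
import random


def split_records(records, train_fraction=0.6, seed=0):
    """Shuffle the records and split them into training and test parts.

    Assumption: a seeded random shuffle; the paper does not describe
    how the 60% and 40% portions were selected.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Record identifiers stand in for the 1000 patient records.
    train, test = split_records(range(1000))
    print(len(train), len(test))
```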

Anger measurement
According to the literature, anger and BP may not be associated over a long period of time (World Health Organization). Our experimental results, however, show that anger, along with anxiety, has a significant effect in elevating the BP of a person. Our proposed technique measures the level of anger using the responses obtained from the predefined questionnaire listed in Table 2. Sample statements used for anger measurement include "waiting for anything annoys me," "I get angry at a delay in completing any assignment," "I get angry if things do not go my way," and "I find it difficult to forgive people who did wrong to me." We used 10 such questions, and for each question the answer is marked as one of the following options: (a) no, never; (b) yes, rarely; (c) yes, often; (d) yes, most of the time. In the data-preprocessing phase, option a is scored as 0, option b as 1, option c as 2, and option d as 3. We take the mean of all the answers and use either its floor value or its ceil value for experimental analysis.

Anxiety measurement
Our proposed technique measures the level of anxiety using the responses obtained from the predefined questionnaire listed in Table 3. Sample items used for anxiety measurement include the existence of constant fear about something, frequent breathing difficulty, a feeling of not having the desired things in life, often being scared without a clear reason, often being aware of the heartbeat without doing physical exercise, and a sense of dryness in the mouth. We used 20 such questions, and for each question the answer is marked as one of the following options: (a) no, never; (b) yes, rarely; (c) yes, often; (d) yes, most of the time. In the data-preprocessing phase, option a is scored as 0, option b as 1, option c as 2, and option d as 3. We take the mean of all the answers and use either its floor value or its ceil value for experimental analysis.
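The scoring of the anger and anxiety questionnaires can be sketched in a few lines. This is a minimal illustration, assuming the rounding rule stated in the experimental section (floor when the mean is below 1, ceil when it is above 1); the names `OPTION_SCORES` and `questionnaire_level` are hypothetical, and the sample answers are made up.

```python
import math

# Option letters mapped to scores, as described in the data-preprocessing phase.
OPTION_SCORES = {"a": 0, "b": 1, "c": 2, "d": 3}


def questionnaire_level(answers):
    """Convert a list of option letters into a single level from 0 to 3.

    Assumption: the mean is floored when it is below 1 and ceiled otherwise
    (a mean of exactly 1 gives 1 under either rule).
    """
    scores = [OPTION_SCORES[answer] for answer in answers]
    mean = sum(scores) / len(scores)
    return math.floor(mean) if mean < 1 else math.ceil(mean)


if __name__ == "__main__":
    # Ten made-up answers to an anger questionnaire: mean 1.3, level 2.
    sample = ["c", "b", "c", "b", "b", "c", "b", "b", "b", "b"]
    print(questionnaire_level(sample))
```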

J48 algorithm
J48 is the WEKA implementation of the decision-tree-based C4.5 classification algorithm. A decision-tree-based classifier classifies an input instance by passing it through the tree, starting at the top and moving down to a leaf node (Satyanarayana et al., 2014). The value of the leaf node represents the predicted output value for the given input instance. Initially, the information gain (IG) is calculated for each attribute of the input instances. The attribute with the highest IG is selected as the splitting attribute. A recursive approach is used to divide the remaining instances at each node. The IG of an attribute A is calculated at the selected node using

$$IG(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v)$$

where S is the set of instances at that node, |S| is its cardinality, and S_v is the subset of S for which attribute A has value v. The entropy of the set S is calculated using the following equation:

$$\mathrm{Entropy}(S) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where p_i is the proportion of instances in S that belong to the ith class, and n is the number of classes.
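The entropy and information gain computation described above can be sketched as follows. This is an illustrative sketch, not WEKA's J48 code: it handles only categorical attribute values, and the attribute name and toy records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy of the whole set minus the weighted entropy of each subset S_v."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(
        (len(subset) / total) * entropy(subset) for subset in subsets.values()
    )
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Toy records (made up): anger level perfectly separates the two classes.
    rows = [{"anger": 3}, {"anger": 3}, {"anger": 0}, {"anger": 0}]
    labels = ["YES", "YES", "NO", "NO"]
    print(entropy(labels), information_gain(rows, labels, "anger"))
```

The attribute with the largest value returned by `information_gain` would be chosen as the splitting attribute at that node.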

Naïve Bayes classifier
This is a simple probabilistic classifier based on Bayes' theorem. It constructs a classification model by learning the conditional probabilities of each input attribute (Satyanarayana et al., 2014). The same model is used to predict the class membership of an input instance using the following equation:

$$P(x \mid y) = \frac{P(y \mid x)\, P(x)}{P(y)}$$

where P(x|y) is defined as the probability of observing x given that y occurs. P(x|y) is called the posterior probability, P(y|x) is the likelihood of y given x, and P(x) and P(y) are the prior probabilities of x and y. In classification, x is the class label and y is the set of observed attribute values.
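A minimal categorical version of this classifier can be sketched as follows. It is an illustration under stated assumptions, not WEKA's implementation: attributes are treated as discrete values, no smoothing is applied, the evidence term is dropped because it is the same for every class, and the function names and toy records are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
    for row, label in zip(rows, labels):
        for attribute, value in row.items():
            value_counts[(label, attribute)][value] += 1
    return class_counts, value_counts


def predict(model, row):
    """Return the class with the largest prior times product of likelihoods."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for attribute, value in row.items():
            # Conditional probability of this attribute value given the class.
            score *= value_counts[(label, attribute)][value] / count
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy training records (made up for illustration).
    rows = [
        {"anger": 3, "anxiety": 3},
        {"anger": 3, "anxiety": 2},
        {"anger": 0, "anxiety": 0},
        {"anger": 1, "anxiety": 0},
    ]
    labels = ["YES", "YES", "NO", "NO"]
    model = train_naive_bayes(rows, labels)
    print(predict(model, {"anger": 3, "anxiety": 3}))
```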

Simple logistic regression algorithm
The linear regression algorithm of WEKA uses standard least squares to find a linear relationship in the training data. Standard linear regression is applied to the input attributes to get the predicted output, which is calculated as

$$w_0 + w_1 a_1 + \cdots + w_k a_k$$

where a_j are the input attributes, w_j are the weights associated with them, and k is the number of input attributes (Vishram et al., 2012).
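The weighted sum of input attributes can be evaluated as in the short sketch below. The weights and attribute values are hypothetical and chosen only to make the arithmetic easy to follow; they are not fitted to the study's data.

```python
def linear_prediction(weights, attributes):
    """Compute w0 + w1*a1 + ... + wk*ak.

    `weights` holds the intercept w0 followed by one weight per attribute.
    """
    intercept, *coefficients = weights
    if len(coefficients) != len(attributes):
        raise ValueError("expected one weight per attribute plus an intercept")
    return intercept + sum(w * a for w, a in zip(coefficients, attributes))


if __name__ == "__main__":
    # Hypothetical weights and two attribute values: 1.0 + 2.0*3.0 + 0.5*4.0
    print(linear_prediction([1.0, 2.0, 0.5], [3.0, 4.0]))
```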

Performance measures used for classifier evaluation
Classifier performance is measured using the measures listed in Table 4. Accuracy is the proportion of correct classifications out of the overall number of cases. Error rate is the percentage of wrong predictions. Precision is the proportion of correct positive classifications out of the cases that are predicted positive. Recall is the proportion of correct positive classifications out of the cases that are actually positive. F-measure is a weighted harmonic mean of precision and recall.
In Table 4, P is the total number of positive records, N is the total number of negative records, TP refers to the positive records which are correctly labeled by the classifier, TN is the negative records which are correctly labeled by the classifier, FP is the negative records which are improperly labeled as positive, and FN is the positive records which are incorrectly labeled as negative.
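These measures can be computed directly from the four counts. The sketch below uses the standard definitions, since Table 4 is not reproduced in this text; the equal weighting of precision and recall in the F-measure is an assumption, and the counts in the example are made up.

```python
def classifier_measures(tp, tn, fp, fn):
    """Standard evaluation measures from confusion-matrix counts."""
    positives = tp + fn  # P: records that are actually positive
    negatives = tn + fp  # N: records that are actually negative
    total = positives + negatives
    precision = tp / (tp + fp)
    recall = tp / positives
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        # Assumption: precision and recall are weighted equally (F1).
        "f_measure": 2 * precision * recall / (precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in classifier_measures(tp=40, tn=45, fp=5, fn=10).items():
        print(f"{name}: {value:.3f}")
```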

Experimental results and analysis
We collected data from 1000 people. Each person's data are treated as one record; for each record, the age, anger level, anxiety level, obesity level, total blood cholesterol, SBP, and DBP of the person are recorded. If the anger level or anxiety level is below 1, we take its floor value, and if it is above 1, we take its ceil value for better prediction. Table 5 is used to convert the SBP and DBP values into the class label attribute. The details of the dataset used to evaluate classifier performance (Satyanarayana et al., 2014) are shown in Tables 6-10. For anger and anxiety measurement, data were collected manually by interviewing people with the predefined questionnaires shown in Tables 2 and 3. We used the data-mining tool WEKA (Waikato Environment for Knowledge Analysis) for experimental analysis. It is open-source software that includes many ML and data-mining algorithms. WEKA reads input data in ARFF (Attribute-Relation File Format), so the collected data were converted into an ARFF file in the data-preprocessing phase.

Figure 2 plots age on the X-axis, where the minimum age is 20 and the maximum age is 65, and anger level on the Y-axis, where the minimum anger level is 0 and the maximum anger level is 3. Figure 3 plots age on the X-axis, with the same range, and anxiety level on the Y-axis, where the minimum anxiety level is 0 and the maximum anxiety level is 3. Figure 4 plots anxiety level on the X-axis, where the minimum anxiety level is 0 and the maximum is 3, and anger level on the Y-axis, where the minimum anger level is 0 and the maximum is 3. Figure 5 plots obesity level on the X-axis, where the minimum obesity level is 15.5 and the maximum is 37, and cholesterol level on the Y-axis, where the minimum cholesterol level is 102 and the maximum is 258. In each figure, a red "x" represents a data record predicted as YES, and a black "x" represents a data record predicted as NO.
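The preprocessing steps of this section, deriving a class label from SBP and DBP and writing the records as ARFF, can be sketched as follows. This is a hedged illustration: Table 5 is not reproduced in this text, so the label rule assumes the 140/90 mm Hg thresholds given in the introduction, and the attribute names, relation name, and sample record are hypothetical.

```python
ATTRIBUTES = ["age", "anger", "anxiety", "obesity", "cholesterol"]


def hbp_label(sbp, dbp):
    """Assumed rule: HBP when SBP exceeds 140 mm Hg or DBP exceeds 90 mm Hg."""
    return "YES" if sbp > 140 or dbp > 90 else "NO"


def to_arff(records, relation="hbp"):
    """Render records as ARFF text with numeric attributes and a nominal class."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in ATTRIBUTES]
    lines += ["@attribute hbp {YES,NO}", "", "@data"]
    for record in records:
        values = [str(record[name]) for name in ATTRIBUTES]
        values.append(hbp_label(record["sbp"], record["dbp"]))
        lines.append(",".join(values))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One made-up record for illustration.
    sample = {"age": 45, "anger": 2, "anxiety": 3, "obesity": 31.2,
              "cholesterol": 240, "sbp": 150, "dbp": 95}
    print(to_arff([sample]))
```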

Conclusion and future work
In this paper, we used the age, anger, anxiety, obesity, and cholesterol levels of a person to predict whether that person is prone to HBP. We used different classifiers for this prediction. Among all the classification algorithms used for experimental analysis, the random forest algorithm (Table 11) showed the best accuracy. In particular, it performed better in classifying negative records. It also outperformed the other classification algorithms, namely simple logistic regression, Naïve Bayes, J48, and REP tree, in terms of precision, recall, and F-measure. In future, we would like to consider other attributes such as gender, smoking, alcohol consumption, job satisfaction, and marital status to improve the prediction performance of the classifiers.

Funding
The authors received no direct funding for this research.