Article

Prediction of Mental Illness in Heart Disease Patients: Association of Comorbidities, Dietary Supplements, and Antibiotics as Risk Factors

1
Department of Systems Science and Industrial Engineering, The State University of New York at Binghamton, Binghamton, NY 13902, USA
2
Department of Biological Sciences, The State University of New York at Binghamton, Binghamton, NY 13902, USA
3
Health and Wellness Studies Department, The State University of New York at Binghamton, Binghamton, NY 13902, USA
4
Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, Gujarat 382470, India
*
Author to whom correspondence should be addressed.
J. Pers. Med. 2020, 10(4), 214; https://doi.org/10.3390/jpm10040214
Submission received: 11 September 2020 / Revised: 23 October 2020 / Accepted: 5 November 2020 / Published: 9 November 2020

Abstract

Comorbidities, dietary supplement use, and prescription drug use may negatively (or positively) affect mental health in cardiovascular patients. Although the effect of mental illnesses, such as depression, anxiety, and schizophrenia, on cardiovascular disease is well documented, mental illnesses resulting from heart disease are not well studied. In this paper, we explore the risk factors of mental illnesses and develop a prediction framework for mental illness that uses comorbidities, dietary supplements, and drug usage in heart disease patients. Specifically, the data used in this study consist of the records of 68,647 patients with heart disease, including each patient’s mental illness information, comorbidities, and intake of dietary supplements and antibiotics. Age groups < 61, gender differences, and drug intakes, such as Azithromycin, Clarithromycin, Vitamin B6, and Coenzyme Q10, were associated with mental illness. For predictive modeling, we apply various state-of-the-art machine learning techniques with tuned parameters and obtain the following: Depression: 78.01% accuracy, 79.13% sensitivity, 72.65% specificity, and 86.26% Area Under the Curve (AUC). Anxiety: 82.93% accuracy, 82.86% sensitivity, 83.35% specificity, and 88.45% AUC. Schizophrenia: 87.59% accuracy, 87.70% sensitivity, 85.14% specificity, and 92.73% AUC. Disease: 86.63% accuracy, 95.50% sensitivity, 77.76% specificity, and 91.59% AUC. From the results, we conclude that using heart disease information, comorbidities, dietary supplement use, and antibiotics enables us to accurately predict the mental health outcome.

1. Introduction

Heart disease is one of the most prevalent conditions in developed countries. It is the leading cause of death in men and women in the United States. Every 37 s, a person dies of Cardiovascular Disease (CVD) [1]. Around 647,000 Americans die of heart disease every year, which is 1 in every 4 deaths. Coronary Artery Disease (CAD) is the most common type of heart disease and killed 365,914 people in 2017. About 18.2 million people aged 20 years and older have CVD [2,3]. Patients with both depression and CVD have three times the risk of mortality of the general population [4,5]. The more severe the depression is, the higher the risk of mortality and further complications from CVD events. The association between depression and anxiety and CVD is bidirectional [6]. This means that the risk of developing CVD in patients with mental illnesses is high, as is the risk of developing depression and anxiety in patients with CVD, which may worsen prognosis [7]. Excess mood disorders and anxiety have been found among people with heart disease irrespective of country and of the national prevalence rate of mental illness [8].
Several types of research have been conducted to study the risk of heart disease in patients with mental illnesses. In a study [9] that included 28,734 patients with and without the diagnosed psychotic disorder, it was revealed that those with diagnosed disorders have a significantly higher risk for CVD compared to patients without any diagnosed mental illness. Additionally, in a study aimed to identify CVD risk among veterans, anxiety was associated with CVD mortality in men, and depression was associated with CVD complications among women [10]. Patients with depressive disorders and schizophrenia have an increased risk of developing CHD. Anxiety, along with post-traumatic stress disorder (PTSD), is associated independently with the risk of developing CHD [11,12]. In a case-control study with 3,211,768 patients and 113,383,368 controls, those with schizophrenia had a higher risk for CHD, CVD, and congestive heart failure [13].
Among middle-aged women, depression is strongly associated with obesity, lower physical activity, and a high-calorie diet [14]. Obesity increases the risk of developing depression, as obese individuals have a higher prevalence of depressive disorders. Additionally, depression was found to be predictive of obesity [15]. A cross-sectional study on 151,389 patients (age ≥ 18 years) with one or more types of anxiety identified an association between anxiety and hypertension [16]. In contrast, a cross-sectional cohort study on 2028 depressed or anxious patients [17] concluded that depression is associated with low systolic blood pressure and less hypertension. However, hypertension as well as systolic and diastolic blood pressure increased with the use of antidepressants.
Recurrent antibiotic use is associated with a higher risk of depression and anxiety. Studies have concluded that clarithromycin can induce manifestations of psychosis among adult and pediatric patients [18,19]. In addition, clarithromycin can induce anxiety, hallucinations, and emotional lability [18,20], while [21] reported dizziness, derealization, and a sense of running very fast with no perception of rotation, in association with emotional lability, panic-anxiety, and unmotivated crying. In an FDA-sponsored study, clarithromycin showed a 10% increased risk of death from any cause and a 19% increased risk of developing CVD [22]. Erythromycin-associated psychosis was reported in [23]. When a 28-year-old male with schizophrenia was given erythromycin for pityriasis rosea, he suffered from akathisia soon after he took the drug. The report concluded that akathisia may be induced or precipitated by erythromycin through interference with other drugs [24].
Although several studies have been conducted to identify the risk factors and the bidirectional association of mental illnesses and heart disease, the prediction of mental illness among patients with heart disease in terms of comorbidities, dietary supplement use, and drug usage is sparse. Therefore, this research serves as an exploratory study to identify the risk factors of mental illnesses using comorbidities, dietary supplement use, and antibiotic usage in heart disease patients. In addition to that, the purpose of this research is to accurately predict mental illness (depression, anxiety, and schizophrenia) in heart disease patients and make use of the data-driven model for a real-time prediction system in the health care setting. The datasets in the medical application are not well balanced most of the time, which incurs a highly disproportionate outcome distribution. Our prediction modeling framework deals with the class imbalance problem. Depression, anxiety, and schizophrenia are considered as separate targets, while an aggregate of these three mental illnesses as an outcome for the prediction was also employed in our research.

2. Materials and Methods

2.1. Variable Selection

Logistic Regression (LR) is used to select the statistically significant variables, i.e., those with $p$-value $< 0.05$. The significant variables are then used to train the methods described below. In addition, adjusted $R^2$ ($AR^2$) is used for model selection. A good model with useful variables will have a high adjusted $R^2$, whereas adding a useless variable reduces it. It is given by $R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$, where $n$ is the number of data samples, $p$ is the number of variables in the model, and $R^2 = 1 - \frac{D}{D_0}$ for a binomial outcome variable. $D$ represents the deviance of the fitted model, $D = -2 \log L(\hat{\beta})$, and $D_0$ represents the deviance of the null model, $D_0 = -2 \log L(\hat{\beta}_0)$. The Bayesian Information Criterion (BIC) selects the best subset of variables by penalizing the fitted model based on the number of predictors $p$. It is a function of the likelihood of the fitted model: $BIC = -2 \log L(\hat{\beta}) + p \log n$.
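The criteria above can be sketched in a few lines of Python. This is only an illustration of the formulas: the function names and the example numbers are assumptions made here, and the study itself reports using the leaps package in R for this step.

```python
import math


def deviance_r2(deviance, null_deviance):
    """R^2 = 1 - D / D0 for a binomial outcome."""
    return 1.0 - deviance / null_deviance


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p variables in the model."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def bic(log_likelihood, n, p):
    """BIC = -2 log L + p log n (lower is better)."""
    return -2.0 * log_likelihood + p * math.log(n)


# Hypothetical values, chosen only to exercise the formulas.
r2 = deviance_r2(deviance=800.0, null_deviance=1000.0)
print(adjusted_r2(r2, n=1001, p=20))
print(bic(log_likelihood=-400.0, n=1001, p=20))
```

For a fixed $R^2$, increasing $p$ lowers the adjusted value, which is how a variable that adds no explanatory power is penalized.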
Four different approaches are implemented: no variable selection and no undersampling; no variable selection but undersampling; variable selection and no undersampling; variable selection and undersampling. For the approaches that involve variable selection, logistic regression that selects the statistically significant variables ($p$-value $< 0.05$), adjusted $R^2$, and BIC are all applied before the sampling procedure. A similar comparison was discussed in [25], where three different approaches were considered in terms of sampling and variable selection. In that work, Approach 1 selected the features after sampling, but the unsampled data was retained. Approach 2 selected the features after sampling, and the sampled data was retained. Approach 3 sampled after the feature selection. That work concluded that Approach 1 performed better. In our research, only Approach 3 was adopted.

2.2. Prediction

Machine learning, in its broad-spectrum, learns from the data using any computational algorithm applied to a data sample [26]. The computational model created is used to automatically improve the prediction through pattern recognition or function approximation by using training or historical data. The model created using training data is then tested on the test data set. Some of the model metrics to assess the trained model are explained in detail in the later section. Using these metrics, the best model of interest is chosen for prediction. Machine learning applications are widespread in many kinds of research, which include but are not limited to health care and other clinical studies.
Some of the machine learning algorithms used in this study for predicting mental illnesses are Random Forest (RF) [27], Decision Tree (DT) [28], Naïve Bayes (NB), Extreme Gradient Boosting (XGBoost) [29], LightGBM (LGBM) [30], and Artificial Neural Networks (ANN) [31]. Random Forest is an ensemble method that builds a multitude of decision trees at training time, each on a randomly selected bootstrap sample. Using bagging, the Random Forest classifier selects the best split for each decision tree among the samples considered during each training step. Prediction on new data aggregates all the decision trees using a majority vote. Decision Tree performs tree-based classification on the input features. Non-leaf nodes represent input features; leaf nodes represent target classes. Using an information criterion such as the Gini index or mutual information, the inputs are split so that they provide the most information about the target. The features are recursively split into nodes until the tree reaches a stopping criterion. Naïve Bayes is a probabilistic model whose name comes from the naïve assumption that all the features are independent. Using the conditional probability between the features and the target, the posterior probability is computed with Bayes’ theorem. Extreme Gradient Boosting is an ensemble learning method that uses boosting. It is similar to [32] but is faster and more scalable than the Gradient Boosting Machine (GBM). Like Random Forest, it is tree-based, but it uses a distributed gradient boosting technique. Weak learners are added to improve performance and form a strong learner. One weak learner might not work well on the data, but adding new weak learners progressively improves performance. 
Boosting determines which weak learner should be added next for the given data. The aggregation of weak learners during training becomes a strong learner and improves model performance. XGB implements this methodology in a distributed framework to make the computation fast and provide accurate predictions. LightGBM is a tree-based learning algorithm similar to XGB. LightGBM has a faster training speed and offers higher accuracy most of the time. It has low memory usage and supports parallel computing. It works well on big datasets with many dimensions and in cases where the dataset occupies a large amount of memory. ANN is a neural network model with one or more hidden layers between the input and output layers. It approximates the classification function for linear and non-linear features. Each layer is fully connected to the nodes of the previous layer, and each connection is associated with a weight. During training, based on the objective function used, the weighted input to a neuron reaches a certain threshold and the neuron fires, which means that the fired neurons contribute more weight to the prediction process.
The dataset is split into a 70% training set and a 30% test set. The dataset is highly imbalanced; therefore, it is undersampled using the Synthetic Minority Over-sampling Technique (SMOTE) [33] so that the class sizes match that of the minority class. With undersampling, the information is still retained even though the sample size is reduced. The sampling procedure depends on the approach, which will be covered in the upcoming sections. Classification algorithms are fitted on the training set for four different targets using 5-fold cross-validation. The test set is used for prediction after the model fitting.
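The split-and-balance step can be illustrated with a small standard-library sketch. Note the assumptions: the text names SMOTE (via the imblearn package) as the balancing procedure, whereas the code below substitutes plain random undersampling of the larger class as a simple stand-in; the helper names, the seed, and the toy rows are all hypothetical.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (training set, test set) in a 70/30 ratio."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def random_undersample(rows, label_of, seed=0):
    """Randomly drop rows so every class has as many rows as the smallest class.

    Stand-in for the balancing step; this is NOT SMOTE.
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Toy data: (patient id, label) with one positive for every two negatives.
rows = [(i, 1 if i % 3 == 0 else 0) for i in range(300)]
train, test = split_70_30(rows)
balanced_train = random_undersample(train, label_of=lambda row: row[1])
```

Only the training portion is balanced here; the test portion keeps its original class distribution.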

2.3. Performance Measure

The aforementioned algorithms are assessed based on accuracy, recall (sensitivity), F1-score, specificity, and Area Under the ROC (Receiver Operating Characteristic) Curve (AUC). True Positive (TP) is the number of correctly classified non-mental illness cases. True Negative (TN) is the number of correctly classified mental illness cases. False Positive (FP) is the number of incorrectly classified non-mental illness cases as mental illness cases. False Negative (FN) is the number of incorrectly classified mental illness cases as non-mental illness cases. Accuracy measures the proportion of correctly classified non-mental and mental illness cases against all the samples. Accuracy is given by $\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$. Recall measures the proportion of actual non-mental illness cases classified correctly against all the non-mental illness case samples. The recall is given by $\mathrm{Recall\ (Sensitivity)} = \frac{TP}{TP + FN}$. F1-score is the harmonic mean of precision and recall. It is given by $\mathrm{F1\ Score} = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}}$, where $\mathrm{Precision} = \frac{TP}{TP + FP}$ measures the proportion of correctly classified non-mental illness cases against all the cases classified as non-mental illness. If the F1-score is high, then both precision and recall are high. Specificity is the proportion of correctly classified mental illness cases against all the mental illness cases in the sample. Specificity is given by $\mathrm{Specificity} = \frac{TN}{TN + FP}$. The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different thresholds. The AUC is the area under this ROC curve. The higher the AUC, the better the model differentiates between mental and non-mental illness cases.
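The measures defined above can be computed directly from the four confusion-matrix counts. The sketch below is illustrative: the function names and the example counts are assumptions, and the AUC helper uses the pairwise-ranking formulation (the fraction of positive/negative pairs ranked correctly, with ties counted as one half), which equals the area under the ROC curve.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, precision, specificity, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * recall * precision / (recall + precision)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
    }


def auc_from_scores(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked in the right order."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts, used only to exercise the formulas.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
```

The functions are agnostic about which class is treated as positive; the counts simply have to follow one convention consistently.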
Many researchers inculcated machine learning and artificial neural networks in the health care domain. Machine learning finds its application in a wide range of healthcare-related problem-solving capabilities. Some of the key research has been done in the area of heart disease prediction and mental illness’ association with heart disease. Authors in [34] generated risk prediction models for patients with severe mental illness, such as schizophrenia, along with gender, age, diabetes, Body Mass Index (BMI), as well as the use of antidepressants and other antipsychotic drugs. This risk prediction model proved to be better than the Framingham model [35]. For a large population in eastern China with high-risk CVD, a prediction model for a three-year risk assessment was implemented using random forest on 29,930 subjects [36]. The diagnosis of coronary artery disease was conducted using the Classification and Regression Tree (CART) in [37]. In [38], using the Cleveland heart disease dataset from the UCI machine learning repository, coronary artery disease was classified by Naïve Bayes. For the National Health and Nutrition Examination Survey (NHANES) dataset and the Framingham Heart Study CHS dataset, the XGB algorithm was implemented to predict cardiovascular disease [39]. A deep belief network, one of the most common deep learning classifiers, was used to diagnose the Coronary Artery Disease (CAD) using the 24-h ECG signal segments [40].
Python v3.7 and R version 3.6.3 using Jupyter IDE are used for the analysis. Pandas package in Python is used for data wrangling. Preprocessing of data was carried out using sklearn package in Python. The adjusted odds ratio is implemented using the package statsmodels in Python. The undersampling procedure is implemented using imblearn package in Python. Variable selection is implemented using leaps package in R programming. The variables selected in R programming are used to select the attributes in the dataframe in Python for creating a prediction model. The prediction models Random Forest, Decision Tree, and Naïve Bayes are created using sklearn package in Python. XGBoost is implemented using xgboost package in Python. LightGBM is implemented using lightgbm package in Python. The artificial neural network structure is created using keras package with TensorFlow backend in Python. The performance measures are implemented in Python.

3. Results

3.1. Dataset

The study protocol was reviewed and approved by the Internal Review Boards of United Health Services (UHS) and Binghamton University. Typically, patients at their first visit undergo a comprehensive medical screening for a history of medical and mental conditions, which is updated at follow-up visits. The dataset was deidentified prior to receipt, including only a few interest variables to the research team. The age-groups were aggregated to minimize the identification of the patients. A de-identified database of 68,647 heart disease patient records was provided by the Cardiology Group at UHS. Upon receipt, the data was stored on the computer of one of the principal investigators with a strong password. The database was only shared with individuals involved directly in the analysis of data. The database included gender, age bracket in 10-year increment, BMI, type of heart disease, list of comorbidities and supplement use, laboratory results, antibiotic use, and mental health. Heart diseases include Coronary Heart Disease (CHD), Cardiovascular Disease (CVD), Congenital Disease of Heart (CDH). The data representation for heart diseases is 0 for patients with no heart disease and 1 for patients with that specific heart disease. To better understand the associations, the attributes of CHD, CVD, and CDH are all considered as individual attributes. The categorical attributes are represented by 0 for no and 1 for yes. The mental illness included depression, anxiety, and schizophrenia, which were represented by 0 or 1; in which 1 represents a patient with that specific mental illness. The attributes are encoded and preprocessed using sklearn package in Python 3.7. Table 1 shows the number of observations and percentage distribution of the mental illness and non-mental illness patients segregated by the category of each attribute. A computed column, Disease, is added to the dataset as a target. 
Disease is 1 if depression = 1, anxiety = 1, or schizophrenia = 1 (including when all three are 1). It is 0 if all three mental illnesses are 0. Patients with at least one of the mental illnesses (Disease) account for 32.61% of the rows, and non-mental illness patients account for 67.39% of the dataset. The columns LAB and LabValue contain 97% missing data, while Gender had 1 row of unidentified gender listed in the data; therefore, these two columns and the row with unidentified gender were removed from the analysis. Of the total 68,646 rows, 437 rows contain missing values: 434 rows are missing the attribute BMI and three are missing the attribute Gender. Consequently, a total of 68,209 rows and 29 columns were used for the research. The target columns are depression, anxiety, schizophrenia, and disease. Table 2 shows the number of observations and the percentage distribution of combinations of mental illnesses. A summary of patients with one or more than one mental illness is listed in the table.
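The computed Disease target is a logical OR of the three illness indicators. A minimal sketch, assuming each record is a dictionary with hypothetical keys `depression`, `anxiety`, and `schizophrenia` (the actual column names in the dataset are not given in the text):

```python
def add_disease_column(records):
    """Set record["disease"] to 1 if any of the three illness flags is 1, else 0."""
    for record in records:
        has_any = (
            record["depression"] == 1
            or record["anxiety"] == 1
            or record["schizophrenia"] == 1
        )
        record["disease"] = 1 if has_any else 0
    return records


# Toy records for illustration only.
patients = add_disease_column([
    {"depression": 0, "anxiety": 0, "schizophrenia": 0},
    {"depression": 1, "anxiety": 0, "schizophrenia": 0},
    {"depression": 1, "anxiety": 1, "schizophrenia": 1},
])
```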

3.2. Association of Dietary Supplements, Comorbidities and Drug Usage in Mental Illness Patients

Previous studies use the odds ratio and the adjusted odds ratio to identify risk factors. Psychosis had the largest effect on CVD among male and female veterans with mental illnesses; that study used an adjusted odds ratio to determine the effect [10]. Patients with two or more anxiety symptoms were at risk of CHD and sudden death, which was identified using an adjusted odds ratio [12]. The meta-analysis conducted in [15] showed a reciprocal link between obesity and depression using the odds ratio. A cross-sectional study on 151,389 patients (age ≥ 18 years) with one or more types of anxiety used the final pooled odds ratio and the pooled adjusted hazard ratio to identify the association of hypertension with anxiety [18]. Based on the odds ratio and the associated confidence interval (CI) as well as the $p$-value, suicidal behavior risk factors are identified in [41]. Multivariate logistic regression was used to analyze the association between suicidal behavior and the factors mentioned above. In [18], a case-control study was carried out on a medical records database in the UK to study the association of antibiotic exposure with depression, anxiety, and psychosis. Table 3, Table 4, Table 5 and Table 6 show the odds ratios for depression, anxiety, schizophrenia, and disease. The tables show the number of patients with the mental illness, the lower and upper bounds of each attribute’s 95% confidence interval, and the adjusted odds ratio with its associated $p$-value. Each of the attributes listed is statistically significant at the 95% confidence level. The predictors with $p$-value $> 0.05$ are excluded from the analysis. The odds ratio listed is an adjusted odds ratio, which accounts for confounding variables. The disease represents all three mental illnesses as described in the previous section. 
Since age is not a continuous attribute, the 10-year increment in the age group is categorized with the age group 0–10 representing a reference level. BMI is categorized for the sake of understanding which BMI level is a risk factor for mental illnesses.
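For intuition about how an odds ratio and its 95% confidence interval are read, the following sketch computes the unadjusted odds ratio from a 2×2 table with a Wald interval on the log scale. This is a simplification: the tables in this study report adjusted odds ratios from a multivariable logistic regression, which this sketch does not reproduce, and the counts below are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with illness      b: exposed without illness
    c: unexposed with illness    d: unexposed without illness
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=80, c=10, d=90)
```

An odds ratio above 1 indicates higher odds of the illness in the exposed group relative to the reference group; if the interval includes 1, the association is not significant at the 5% level.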

3.3. Modeling the Mental Illness Prediction Framework

No hyperparameter tuning was involved; the models were trained using the default parameter settings for all the algorithms except ANN. For the decision tree, entropy was used as the information criterion. For ANN, the network is designed with an input layer of 24 neurons, one hidden layer of 16 neurons, and an output layer. Early stopping was based on AUC, and the network was trained with the Adam optimizer and a binary cross-entropy loss function. A drop-out layer with a probability of 0.5 was used to avoid overfitting. The illnesses are trained as separate targets using random forest, decision tree, Naïve Bayes, XGB, LightGBM, and ANN with 5-fold cross-validation. For the approaches that involve variable selection, the variables are selected using either adjusted $R^2$, logistic regression, or BIC before undersampling. Adjusted $R^2$ and BIC are implemented in R, while logistic regression is implemented in Python to select the best variable set. After undersampling using SMOTE, the sampled data is split into five folds, and in each fold the test set is predicted. The reported metrics are the average over the five folds. The best performing model is selected based on AUC. The threshold for identifying the positive and negative classes is 0.5 because the data is balanced using undersampling. Table 7 shows the best performing model using different approaches and the considered algorithms. The comparison of the baseline models and the models trained using the approaches is reported in the Supplementary material. Table 7 compares the four approaches for each mental illness. The variable selection column of the table lists the variable selection method used, when one is applied; when none is used, it is listed as no. The undersampling column (yes/no) indicates whether the undersampling procedure was applied to the respective model. 
A model with no variable selection and no undersampling is the baseline model for the targets. The top row in each of the mental illness categories is the best prediction model.
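The evaluation loop described above (five folds, one averaged metric, a 0.5 decision threshold) can be sketched without any particular model. The helper names and the per-fold values below are hypothetical, and folds are assigned by simple interleaving rather than by whatever splitter the study used.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def average_metric(per_fold_values):
    """Report the mean of a metric over the folds."""
    return sum(per_fold_values) / len(per_fold_values)


def to_class(probability, threshold=0.5):
    """Turn a predicted probability into a 0/1 class label."""
    return 1 if probability >= threshold else 0


splits = list(k_fold_indices(10, k=5))
# Hypothetical per-fold AUC values, for illustration only.
mean_auc = average_metric([0.86, 0.88, 0.87, 0.85, 0.89])
```

Each sample appears in exactly one test fold, so the averaged metric reflects predictions on every row of the sampled data.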

4. Discussion

4.1. Risk Factors of Mental Illness

The interpretation of the odds ratio is provided in the supplementary material. In Table 3, age brackets 11–20, 21–30, 31–40, 41–50, 51–60, 61–70, and > 70 are associated with depression. In addition, the female population has higher odds of depression compared to the male population. People who take Z_pak are more likely to have depression than those who do not. With $< 18.5$ kg/m$^2$ as the reference level for BMI, people with BMI $\geq 40$ kg/m$^2$ are likely to be diagnosed with depression. Although other listed attributes, such as osteoarthritis, coronary artery disease, obesity, and hypertension, have a lower association with depression, [11,12] mentioned coronary heart disease as one of the risk factors, which partially matches our analysis. In [14,15], obesity was found to be associated with depression. In our analysis, although no strong association was found, obesity is associated with lower odds of depression. Similarly, hypertension has lower odds of depression, as mentioned in [17].
In Table 4, the female population has higher odds of developing anxiety compared to the male population. People who take clarithromycin, Z_pak, and CoQ have higher odds of developing anxiety. Only the age groups 21–30 and 31–40 are susceptible to developing anxiety. The other predictors listed show lower odds of developing anxiety. Coronary artery disease is associated with lower odds of anxiety, which partially matches [11,12]. In addition, hypertension is associated with lower odds of anxiety, partially matching the conclusions from [16]. Clarithromycin is associated with higher odds of anxiety, which agrees with [18,20,21].
In Table 5, schizophrenia has a smaller sample size, so the variables selected show association with lower odds of schizophrenia.
Table 6 represents all three mental illnesses combined. All age groups except > 70 are associated with higher odds of developing mental illness. The female population is more susceptible to mental illness than the male population. Drug intake such as clarithromycin, Z_pak, VitB6, and CoQ is associated with higher odds of developing any of the mental illnesses.

4.2. Depression

The baseline model’s specificity for depression (0.61%) was very poor compared with that of the best model (72.65%). The low specificity of the baseline model is mainly due to the imbalanced distribution of the classes. The prediction of depression is best with variable selection using adjusted $R^2$ followed by undersampling. LightGBM with this approach outperformed the other models on all the metrics. AUC increased to 86.26% with variable selection, compared with 85.18% without it. The performance of the other approaches is provided in the supplementary material. The increase in AUC is obtained after removing Gender, CerebrovascularDisease, and elevatedESR. Having these variables in the model had little to no effect on the prediction. The final model had 21 variables. Since CVD was not one of the predictors of depression, the bidirectional association [6,7] could not be confirmed in our study. However, CAD is one of the predictors for depression, and this inference matches [10].

4.3. Anxiety

The specificity of the best prediction model for anxiety using LightGBM (83.35%) is better than the baseline model’s specificity using LightGBM (57.01%). The specificity of the baseline model is higher for anxiety than for the other mental illnesses because anxiety has the largest sample size in the dataset among the mental illnesses. Two models performed best for the prediction of anxiety, with very little difference in performance: LightGBM and XGB gave AUCs of 88.45% and 87.93%, respectively. Both were obtained using variable selection with adjusted $R^2$ followed by undersampling. The same set of variables emerged as in the case of depression. LightGBM was selected as the best performing model because it has higher specificity than XGB. The AUC improved from 87.75% (no variable selection with undersampling) to 88.45% with the use of variable selection. As with depression, CVD was not selected as a predictor for anxiety by the variable selection. This inference from our research contradicts the bidirectional association in [6,7]. Since CHD and CDH are both selected as predictors of anxiety, the results match [11,12].

4.4. Schizophrenia

The specificity of the best performing model for schizophrenia, which uses random forest, is 85.14%, while that of the baseline random forest model is 0.74%. Again, this is due to the class imbalance, because schizophrenia has a smaller sample size in the dataset than the other mental illnesses. Schizophrenia’s best prediction model was achieved by selecting the statistically significant variables using logistic regression and undersampling after the selection procedure. A total of seven variables emerged from this method. The selected variables are Gender, Age, Hypertension, Obesity, CoronaryArteryDisease, BMI, and Z_pak. The variables selected are consistent with the odds ratio interpretation for schizophrenia discussed in Section 4.1. Variables such as gender, age, and BMI are used to create the risk prediction model for schizophrenia in [34], which agrees with the risk prediction model created in our research. Since CAD is selected as one of the predictors of schizophrenia, the results completely agree with [10] and partially match [13]. The best model’s AUC using random forest is 92.73%, compared with 92.56% without variable selection.

4.5. Disease

For disease, no variable selection with undersampling is the best performing approach. Compared to the baseline model’s specificity using LightGBM (56.59%), the best model’s specificity using LightGBM increased (77.76%). Although there is no significant difference between the AUC values of LightGBM (91.59%) and XGB (91.41%), the specificity for LightGBM (77.76%) is higher than that for XGB (76.22%). Therefore, LightGBM with no variable selection and with undersampling is selected as the best performing disease prediction model.
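The selection rule used here (prefer the highest AUC, and fall back on specificity when the AUC values are close) can be written as a short sketch. The AUC and specificity figures are the ones reported above; the function name and the 0.5 percentage-point tolerance for "close" are assumptions made for illustration, since the text does not state a numeric cutoff.

```python
def pick_best(models, auc_tolerance=0.5):
    """Choose among models given as name -> (AUC %, specificity %).

    Models whose AUC is within `auc_tolerance` points of the top AUC are
    treated as tied, and the tie is broken by the higher specificity.
    """
    top_auc = max(auc for auc, _ in models.values())
    close = {
        name: values
        for name, values in models.items()
        if top_auc - values[0] <= auc_tolerance
    }
    return max(close, key=lambda name: close[name][1])


disease_models = {"LightGBM": (91.59, 77.76), "XGB": (91.41, 76.22)}
best = pick_best(disease_models)
```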
In all cases, high model performance was achieved through undersampling, because the dataset is highly imbalanced. One of the main advantages of LightGBM is its faster training time on massive datasets. It outperforms other gradient boosting techniques such as XGB, as well as the ANN model, due to its distributed, high-performance gradient boosting technique. This gradient boosting-based method captures the interactions between the predictors better than linear models. Because the data are well structured, LightGBM performed well for depression, anxiety, and disease. For schizophrenia, since the sample size is smaller, random forest worked better due to the random resampling used for training the model.
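As a minimal sketch of random undersampling, the function below drops rows from the larger classes until every class matches the smallest one. The 1:1 target ratio, the function name, and the toy data are assumptions for illustration; the text does not specify the sampling ratio or the implementation used.

```python
import random


def undersample(rows: list, labels: list, seed: int = 0) -> tuple[list, list]:
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class: dict = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())

    kept = []
    for label, group in by_class.items():
        # Sample without replacement so no row is duplicated.
        kept.extend((row, label) for row in rng.sample(group, target))
    rng.shuffle(kept)
    return [row for row, _ in kept], [label for _, label in kept]


# Toy data (hypothetical): 95 majority rows and 5 minority rows.
toy_rows = list(range(100))
toy_labels = [1] * 5 + [0] * 95
balanced_rows, balanced_labels = undersample(toy_rows, toy_labels, seed=42)
print(len(balanced_rows), balanced_labels.count(0), balanced_labels.count(1))
```

In a cross-validated setting, undersampling is normally applied to the training folds only, so that the held-out fold keeps the original class distribution.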
The best prediction model for depression is LightGBM with variable selection based on adjusted R² and undersampling; it gave 78.01% accuracy, 79.13% sensitivity, 72.65% specificity, and 86.26% AUC. A total of 21 variables emerged for depression from the variable selection. The best model for anxiety is LightGBM with variable selection based on adjusted R² and undersampling; it gave 82.93% accuracy, 82.86% sensitivity, 83.35% specificity, and 88.45% AUC. As with depression, 21 variables emerged. The best model for schizophrenia is random forest with variable selection based on logistic regression and undersampling; it gave 87.59% accuracy, 87.70% sensitivity, 85.14% specificity, and 92.73% AUC. Using logistic regression, seven variables were selected for schizophrenia. The best model for the disease, which aggregates all the mental illnesses, is LightGBM with undersampling and without variable selection; it gave 86.63% accuracy, 95.50% sensitivity, 77.76% specificity, and 91.59% AUC. Our best model achieved the highest accuracy for schizophrenia, followed by the disease, anxiety, and depression. The prediction of the disease using our best model accurately captured the presence of one of the mental illnesses from the patients' heart disease information, which includes antibiotics and dietary supplement intake.
Healthcare institutions employ automated prediction systems in their diagnostic procedures, drawing on a wide range of data. This is possible because of continuing innovation in computational efficiency and in accurate prediction systems. Faster and more accurate prediction of illnesses reduces the cost and time incurred in diagnosing each patient. In this research, a data-driven model to predict mental illness, which uses information on comorbidities, dietary supplements, and antibiotics, is implemented. The main application of this research is diagnosing the illness using comorbidities, dietary supplements, and antibiotics, an approach that is sparse in medical applications to date. The prediction helps identify the likelihood of mental illness based on the intake of drugs, dietary supplements, and antibiotics, along with diagnosed heart disease. This research can be useful when employed in healthcare settings because it helps propose prospective treatment and test procedures that can be time-consuming when done traditionally.

4.6. Strengths and Limitations of the Study

The study has several strengths and limitations. The large number of records studied and the use of several analytical methods are among its strengths. One limitation of this research is that the patient information comes from only one healthcare institution and does not represent the entire patient population. Diagnostic information from, and patients' visits to, healthcare institutions outside of this hospital are not known. Also, the patients' intake of other antibiotics and dietary supplements, for any purposes other than the aforementioned, is not captured in this research.

5. Conclusions

In summary, age groups > 10, female gender, BMI ≥ 40 kg/m², and intake of drugs such as Z_pak were associated with depression. Female gender, intake of drugs such as clarithromycin, Z_pak, and CoQ, and the age groups 21–30 and 31–40 were associated with anxiety. Z_pak is associated with lower odds of schizophrenia. All age groups except > 70, female gender, and intake of drugs such as clarithromycin, Z_pak, VitB6, and CoQ were associated with the disease as risk factors.
The best prediction model for depression gave 78.01% accuracy, 79.13% sensitivity, 72.65% specificity, and 86.26% AUC. The best model for anxiety gave 82.93% accuracy, 82.86% sensitivity, 83.35% specificity, and 88.45% AUC. The best model for schizophrenia gave 87.59% accuracy, 87.70% sensitivity, 85.14% specificity, and 92.73% AUC. The best model for the disease, which aggregates all the mental illnesses, gave 86.63% accuracy, 95.50% sensitivity, 77.76% specificity, and 91.59% AUC. We can predict mental illnesses accurately using the dietary supplements, comorbidities, and drug usage data of patients with heart disease.
For future work, the interaction effects of the predictors on depression, anxiety, and schizophrenia will be considered. By applying the odds ratio to the significant interactions and confounders, the risk associated with each interaction and its effect on mental illness will be interpreted. Because the variable selection removed only a few variables rather than yielding a compact subset for the prediction, traditional subset selection methodologies did not work well for the dataset. Therefore, another direction to pursue is to improve the prediction performance by applying Bayesian variable selection approaches. Bayesian networks have proved to be useful in many healthcare domains, and applying them to mental illness prediction will also be considered as part of the exploration.

Supplementary Materials

The following are available online at https://www.mdpi.com/2075-4426/10/4/214/s1, Table S1: Average Metrics of 5-fold CV for the Mental Illnesses with no Variable Selection and no Under Sampling; Table S2: Average Metrics of 5-fold CV for the Mental Illnesses with no Variable Selection and using Under Sampling; Table S3: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using Adjusted R² and no Under Sampling; Table S4: Variables Selected using Adjusted R² with no Under Sampling; Table S5: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using BIC and no Under Sampling; Table S6: Variables Selected using BIC with no Under Sampling; Table S7: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using Logistic Regression and no Under Sampling; Table S8: Variables Selected using Logistic Regression with no Under Sampling; Table S9: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using Adjusted R² and Under Sampling; Table S10: Variables Selected using Adjusted R² before Under Sampling; Table S11: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using BIC and Under Sampling; Table S12: Variables Selected using BIC before Under Sampling; Table S13: Average Metrics of 5-fold CV for the Mental Illnesses with Variable Selection using Logistic Regression and Under Sampling; Table S14: Variables Selected using Logistic Regression before Under Sampling.

Author Contributions

Conceptualization, J.S. and D.W.; methodology, J.S. and S.J.; software, J.S.; validation, J.S., L.B., and D.W.; formal analysis, J.S.; investigation, J.S. and S.J.; resources, L.B. and S.A.; data curation, S.A.; writing—original draft preparation, J.S.; writing—review and editing, J.S., L.B., and D.W.; supervision, D.W.; project administration, D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

We would like to thank United Health Services (UHS) for facilitating the acquisition of the database.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Heron, M. Deaths: Leading Causes for 2017. Natl. Vital Stat. Rep. 2019, 68, 1–77.
  2. Benjamin, E.J.; Muntner, P.; Alonso, A.; Bittencourt, M.S.; Callaway, C.W.; Carson, A.P.; Chamberlain, A.M.; Chang, A.R.; Cheng, S.; Das, S.R.; et al. Heart Disease and Stroke Statistics-2019 Update: A Report From the American Heart Association. Circulation 2019, 139, e56–e528.
  3. Fryar, C.D.; Chen, T.C.; Li, X. Prevalence of Uncontrolled Risk Factors for Cardiovascular Disease: United States, 1999–2010; US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics: Hyattsville, MD, USA, 2012.
  4. Chaddha, A.; Robinson, E.A.; Kline-Rogers, E.; Alexandris-Souphis, T.; Rubenfire, M. Mental Health and Cardiovascular Disease. Am. J. Med. 2016, 129, 1145–1148.
  5. Hare, D.L.; Toukhsati, S.R.; Johansson, P.; Jaarsma, T. Depression and Cardiovascular Disease: A Clinical Review. Eur. Heart J. 2014, 35, 1365–1372.
  6. Thomas, A.J.; Kalaria, R.N.; O’Brien, J.T. Depression and Vascular Disease: What Is the Relationship? J. Affect. Disord. 2004, 79, 81–95.
  7. Riba, M.; Wulsin, L.; Rubenfire, M. Psychiatry and Heart Disease: The Mind, Brain, and Heart; John Wiley & Sons: Hoboken, NJ, USA, 2012.
  8. Ormel, J.; Von Korff, M.; Burger, H.; Scott, K.; Demyttenaere, K.; Huang, Y.; Posada-Villa, J.; Pierre Lepine, J.; Angermeyer, M.C.; Levinson, D.; et al. Mental Disorders among Persons with Heart Disease—Results from World Mental Health Surveys. Gen. Hosp. Psychiatry 2007, 29, 325–334.
  9. Cunningham, R.; Poppe, K.; Peterson, D.; Every-Palmer, S.; Soosay, I.; Jackson, R. Prediction of Cardiovascular Disease Risk among People with Severe Mental Illness: A Cohort Study. PLoS ONE 2019, 14, e0221521.
  10. Vance, M.C.; Wiitala, W.L.; Sussman, J.B.; Pfeiffer, P.; Hayward, R.A. Increased Cardiovascular Disease Risk in Veterans with Mental Illness. Circ. Cardiovasc. Qual. Outcomes 2019, 12, e005563.
  11. De Hert, M.; Detraux, J.; Vancampfort, D. The Intriguing Relationship between Coronary Heart Disease and Mental Disorders. Dialogues Clin. Neurosci. 2018, 20, 31–40.
  12. Sesso, H.D.; Kawachi, I.; Vokonas, P.S.; Sparrow, D. Depression and the Risk of Coronary Heart Disease in the Normative Aging Study. Am. J. Cardiol. 1998, 82, 851–856.
  13. Correll, C.U.; Solmi, M.; Veronese, N.; Bortolato, B.; Rosson, S.; Santonastaso, P.; Thapa-Chhetri, N.; Fornaro, M.; Gallicchio, D.; Collantoni, E.; et al. Prevalence, Incidence and Mortality from Cardiovascular Disease in Patients with Pooled and Specific Severe Mental Illness: A Large-Scale Meta-Analysis of 3,211,768 Patients and 113,383,368 Controls. World Psychiatry 2017, 16, 163–180.
  14. Simon, G.E.; Ludman, E.J.; Linde, J.A.; Operskalski, B.H.; Ichikawa, L.; Rohde, P.; Finch, E.A.; Jeffery, R.W. Association between Obesity and Depression in Middle-Aged Women. Gen. Hosp. Psychiatry 2008, 30, 32–39.
  15. Luppino, F.S.; De Wit, L.M.; Bouvy, P.F.; Stijnen, T.; Cuijpers, P.; Penninx, B.W.J.H.; Zitman, F.G. Overweight, Obesity, and Depression: A Systematic Review and Meta-Analysis of Longitudinal Studies. Arch. Gen. Psychiatry 2010, 67, 220–229.
  16. Pan, Y.; Cai, W.; Cheng, Q.; Dong, W.; An, T.; Yan, J. Association between Anxiety and Hypertension: A Systematic Review and Meta-Analysis of Epidemiological Studies. Neuropsychiatr. Dis. Treat. 2015, 11, 1121–1130.
  17. Licht, C.M.M.; Geus, E.J.C.D.; Seldenrijk, A.; Hout, H.P.J.V.; Zitman, F.G.; Van Dyck, R.; Penninx, B.W.J.H. Depression Is Associated with Decreased Blood Pressure, but Antidepressant Use Increases the Risk for Hypertension. Hypertension 2009, 53, 631–638.
  18. Lurie, I.; Yang, Y.X.; Haynes, K.; Mamtani, R.; Boursi, B. Antibiotic Exposure and the Risk for Depression, Anxiety, or Psychosis: A Nested Case-Control Study. J. Clin. Psychiatry 2015, 76, 1522–1528.
  19. Nightingale, S.D.; Koster, F.T.; Mertz, G.J.; Loss, S.D. Clarithromycin-Induced Mania in Two Patients with AIDS. Clin. Infect. Dis. 1995, 20, 1563–1564.
  20. Elahi, F.; Bhamjee, M. A Case of Clarithromycin Psychosis. Ir. J. Psychol. Med. 2005, 22, 73–74.
  21. Negrín-González, J.; Peralta Filpo, G.; Carrasco, J.L.; Robledo Echarren, T.; Fernández-Rivas, M. Psychiatric Adverse Reaction Induced by Clarithromycin. Eur. Ann. Allergy Clin. Immunol. 2014, 46, 114–115.
  22. Voelker, R. Another Caution for Clarithromycin. JAMA 2018, 319, 1314.
  23. Šakić, B.O.; Babović, S.S.; Gajić, Z.M. Erythromycin-Induced Psychotic Decompensation in a Patient Affected by Paranoid Schizophrenic Psychosis. Klin. Psikofarmakol. Bul. 2014, 24, 368–370.
  24. Sachdeva, A.; Rathee, R. Akathisia with Erythromycin: Induced or Precipitated? Saudi Pharm. J. 2015, 23, 541–543.
  25. Gao, K.; Khoshgoftaar, T.M.; Napolitano, A. Combining Feature Subset Selection and Data Sampling for Coping with Highly Imbalanced Software Data. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE, Pittsburgh, PA, USA, July 2015; pp. 439–444.
  26. Mitchell, T.M. Machine Learning; IOP Publishing: Burr Ridge, IL, USA, 1997.
  27. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  28. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 2017.
  29. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: New York, NY, USA, 2016; pp. 785–794.
  30. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Long Beach, CA, USA, 2017; pp. 3147–3155.
  31. McCulloch, W.S.; Pitts, W. A Logical Calculus of the Ideas Immanent in Nervous Activity. Bull. Math. Biophys. 1943, 5, 115–133.
  32. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232.
  33. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
  34. Osborn, D.P.J.; Hardoon, S.; Omar, R.Z.; Holt, R.I.G.; King, M.; Larsen, J.; Marston, L.; Morris, R.W.; Nazareth, I.; Walters, K.; et al. Cardiovascular Risk Prediction Models for People With Severe Mental Illness. JAMA Psychiatry 2015, 72, 143.
  35. D’Agostino, R.B.; Vasan, R.S.; Pencina, M.J.; Wolf, P.A.; Cobain, M.; Massaro, J.M.; Kannel, W.B. General Cardiovascular Risk Profile for Use in Primary Care: The Framingham Heart Study. Circulation 2008, 117, 743–753.
  36. Yang, L.; Wu, H.; Jin, X.; Zheng, P.; Hu, S.; Xu, X.; Yu, W.; Yan, J. Study of Cardiovascular Disease Prediction Model Based on Random Forest in Eastern China. Sci. Rep. 2020, 10, 1–8.
  37. Ghiasi, M.M.; Zendehboudi, S.; Mohsenipour, A.A. Decision Tree-Based Diagnosis of Coronary Artery Disease: CART Model. Comput. Methods Programs Biomed. 2020, 192, 105400.
  38. Gupta, A.; Kumar, L.; Jain, R.; Nagrath, P. Heart Disease Prediction Using Classification (Naive Bayes). In Lecture Notes in Networks and Systems; Springer: Singapore, 2020; Volume 121, pp. 561–573.
  39. Rajliwall, N.S.; Davey, R.; Chetty, G. Cardiovascular Risk Prediction Based on XGBoost. In Proceedings—5th Asia-Pacific World Congress on Computer Science and Engineering, APWC on CSE; IEEE Computer Society: Los Alamitos, CA, USA, 2018; pp. 246–252.
  40. Altan, G.; Allahverdi, N.; Kutlu, Y. Diagnosis of Coronary Artery Disease Using Deep Belief Networks. Eur. J. Eng. Nat. Sci. 2017, 2, 29–36.
  41. Greenfield, B.; Henry, M.; Weiss, M.; Tse, S.M.; Guile, J.M.; Dougherty, G.; Zhang, X.; Fombonne, E.; Lis, E.; Lapalme-Remis, S.; et al. Previously Suicidal Adolescents: Predictors of Six-Month Outcome. J. Can. Acad. Child Adolesc. Psychiatry 2008, 17, 197–201.
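Table 1. Summary of drug usage, dietary supplements, and comorbidities for patients with mental illness (a).

| Variable | Level | Total n | Total % | Non-Disease n | Non-Disease % | Disease (=Yes) n | Disease (=Yes) % | Depression (=Yes) n | Depression (=Yes) % | Anxiety (=Yes) n | Anxiety (=Yes) % | Schizophrenia (=Yes) n | Schizophrenia (=Yes) % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total | | 68,209 | 100.0 | 45,968 | 67.39 | 22,241 | 32.61 | 10,080 | 14.72 | 15,229 | 22.24 | 258 | 0.38 |
| Gender (b) | Male | 37,959 | 44.35 | 23,552 | 51.26 | 6698 | 30.12 | 2912 | 28.89 | 4445 | 29.19 | 150 | 58.14 |
| | Female | 30,250 | 55.65 | 22,416 | 48.76 | 15,543 | 69.88 | 7168 | 71.11 | 10,784 | 70.81 | 108 | 41.86 |
| Age (Years) | 0–10 | 487 | 0.71 | 237 | 0.52 | 250 | 1.12 | 25 | 0.25 | 229 | 1.50 | 0 | 0.0 |
| | 11–20 | 2637 | 3.87 | 634 | 1.38 | 2003 | 9.01 | 858 | 8.51 | 1455 | 9.55 | 5 | 1.94 |
| | 21–30 | 4641 | 6.80 | 1264 | 2.75 | 3377 | 15.18 | 1312 | 13.02 | 2571 | 16.88 | 19 | 7.36 |
| | 31–40 | 5647 | 8.28 | 2235 | 4.86 | 3412 | 15.34 | 1402 | 13.91 | 2517 | 16.53 | 46 | 17.83 |
| | 41–50 | 7727 | 11.33 | 4368 | 9.50 | 3359 | 15.10 | 1521 | 15.09 | 2356 | 15.47 | 34 | 13.18 |
| | 51–60 | 13,491 | 19.78 | 9480 | 20.62 | 4011 | 18.03 | 1944 | 19.29 | 2632 | 17.28 | 72 | 27.91 |
| | 61–70 | 14,811 | 21.71 | 11,789 | 25.65 | 3022 | 13.59 | 1556 | 15.44 | 1787 | 11.73 | 61 | 23.64 |
| | >70 | 18,768 | 27.51 | 15,961 | 34.72 | 2807 | 12.62 | 1462 | 14.50 | 1682 | 11.04 | 21 | 8.14 |
| IDD | No | 67,481 | 98.93 | 45,365 | 98.69 | 22,115 | 99.43 | 10,013 | 99.34 | 15,157 | 99.53 | 253 | 98.06 |
| | Yes | 729 | 1.07 | 603 | 1.31 | 126 | 0.57 | 67 | 0.66 | 72 | 0.47 | 5 | 1.94 |
| HT | No | 48,777 | 28.49 | 5718 | 12.44 | 13,714 | 61.66 | 5847 | 58.01 | 9739 | 63.95 | 152 | 58.91 |
| | Yes | 19,433 | 71.51 | 40,250 | 87.56 | 8527 | 38.34 | 4233 | 41.99 | 5490 | 36.05 | 106 | 41.09 |
| OA | No | 65,501 | 96.02 | 43,933 | 95.57 | 21,567 | 96.97 | 9714 | 96.37 | 14,785 | 97.08 | 248 | 96.12 |
| | Yes | 2709 | 3.97 | 2035 | 4.43 | 674 | 3.03 | 366 | 3.63 | 444 | 2.92 | 10 | 3.88 |
| CM | No | 68,016 | 99.72 | 45,808 | 99.65 | 22,207 | 99.85 | 10,063 | 99.83 | 15,209 | 99.87 | 258 | 100.0 |
| | Yes | 194 | 0.28 | 160 | 0.35 | 34 | 0.15 | 17 | 0.17 | 20 | 0.13 | 0 | 0.0 |
| Obesity | No | 59,601 | 87.38 | 39,346 | 85.59 | 20,255 | 91.07 | 9019 | 89.47 | 13,974 | 91.76 | 226 | 87.60 |
| | Yes | 8609 | 12.62 | 6622 | 14.41 | 1986 | 8.93 | 1061 | 10.53 | 1255 | 8.24 | 32 | 12.40 |
| CDH | No | 68,153 | 99.92 | 45,918 | 99.89 | 22,234 | 99.97 | 10,076 | 99.96 | 15,225 | 99.97 | 258 | 100.0 |
| | Yes | 57 | 0.08 | 50 | 0.11 | 7 | 0.03 | 4 | 0.04 | 4 | 0.03 | 0 | 0.0 |
| HF | No | 66,069 | 96.86 | 44,189 | 96.13 | 21,879 | 98.37 | 9876 | 97.98 | 15,005 | 98.53 | 256 | 99.22 |
| | Yes | 2141 | 3.14 | 1779 | 3.87 | 362 | 1.63 | 204 | 2.02 | 224 | 1.47 | 2 | 0.78 |
| CVD | No | 68,115 | 99.86 | 45,899 | 99.85 | 22,215 | 99.88 | 10,063 | 99.83 | 15,216 | 99.91 | 257 | 99.61 |
| | Yes | 95 | 0.14 | 69 | 0.15 | 26 | 0.12 | 17 | 0.17 | 13 | 0.09 | 1 | 0.39 |
| AS | No | 68,145 | 99.90 | 45,915 | 99.88 | 22,229 | 99.95 | 10,071 | 99.91 | 15,222 | 99.95 | 258 | 100.0 |
| | Yes | 65 | 0.10 | 53 | 0.12 | 12 | 0.05 | 9 | 0.09 | 7 | 0.05 | 0 | 0.0 |
| CAD | No | 59,504 | 87.24 | 38,433 | 83.61 | 21,070 | 94.73 | 9455 | 93.80 | 14,523 | 95.36 | 245 | 94.96 |
| | Yes | 8706 | 12.76 | 7535 | 16.39 | 1171 | 5.27 | 625 | 6.20 | 706 | 4.64 | 13 | 5.04 |
| ND | No | 68,121 | 99.87 | 45,901 | 99.85 | 22,219 | 99.90 | 10,075 | 99.95 | 15,213 | 99.989 | 256 | 99.22 |
| | Yes | 89 | 0.13 | 67 | 0.15 | 22 | 0.10 | 5 | 0.05 | 16 | 0.10 | 2 | 0.78 |
| E-CRP | No | 67,969 | 99.65 | 45,800 | 99.63 | 22,168 | 99.67 | 10,059 | 99.79 | 15,167 | 99.59 | 256 | 99.22 |
| | Yes | 241 | 0.35 | 168 | 0.37 | 73 | 0.33 | 21 | 0.21 | 62 | 0.41 | 2 | 0.78 |
| E-ESR | No | 67,900 | 99.55 | 45,757 | 99.54 | 22,142 | 99.55 | 10,039 | 99.59 | 15,154 | 99.51 | 258 | 100.0 |
| | Yes | 310 | 0.45 | 211 | 0.46 | 99 | 0.45 | 41 | 0.41 | 75 | 0.49 | 0 | 0.0 |
| LTUA | No | 67,911 | 99.56 | 45,715 | 99.45 | 22,195 | 99.79 | 10,058 | 99.78 | 15,195 | 99.78 | 258 | 100.0 |
| | Yes | 299 | 0.44 | 253 | 0.55 | 46 | 0.21 | 22 | 0.22 | 34 | 0.22 | 0 | 0.0 |
| BMI (c) (mean ± std: 50.25 ± 1108.60) | Underweight (<18.5) | 1323 | 1.93 | 616 | 1.34 | 707 | 3.18 | 225 | 2.23 | 563 | 3.70 | 3 | 1.16 |
| | Normal (18.5–24.99) | 12,535 | 18.38 | 7062 | 15.36 | 5473 | 24.61 | 2142 | 21.25 | 4008 | 26.32 | 59 | 22.87 |
| | Overweight (25–29.99) | 18,757 | 27.50 | 12,939 | 28.14 | 5817 | 26.15 | 2532 | 25.12 | 4052 | 26.61 | 84 | 32.56 |
| | Obese (30–39.99) | 26,075 | 38.23 | 18,720 | 40.72 | 7532 | 33.06 | 3571 | 35.43 | 4803 | 31.54 | 88 | 34.11 |
| | Severe Obese (≥40) | 9523 | 13.96 | 6631 | 14.43 | 2892 | 13.00 | 1610 | 15.97 | 1803 | 11.84 | 24 | 9.30 |
| LAB (d) | CRP | 595 | 41.01 | 382 | 42.97 | 213 | 37.90 | 87 | 38.16 | 170 | 38.81 | 0 | 0.00 |
| | ESR | 856 | 58.99 | 507 | 57.03 | 349 | 62.10 | 141 | 61.84 | 268 | 61.19 | 1 | 100.0 |
| LabValue (e) (mean ± std: 17.79 ± 28.97) | | | | | | | | | | | | | |
| E_Mycin | No | 68,192 | 99.97 | 45,958 | 99.98 | 22,233 | 99.96 | 10,076 | 99.96 | 15,223 | 99.96 | 258 | 100.0 |
| | Yes | 18 | 0.03 | 10 | 0.02 | 8 | 0.04 | 4 | 0.04 | 6 | 0.04 | 0 | 0.00 |
| C_Mycin | No | 67,636 | 99.16 | 45,619 | 99.24 | 22,016 | 98.99 | 9988 | 99.09 | 15,066 | 98.93 | 257 | 99.61 |
| | Yes | 574 | 0.84 | 349 | 0.78 | 225 | 1.01 | 92 | 0.91 | 163 | 1.07 | 1 | 0.39 |
| Z_pak | No | 48,024 | 70.41 | 33,374 | 72.60 | 14,649 | 65.86 | 6437 | 63.86 | 9969 | 65.46 | 201 | 77.91 |
| | Yes | 20,186 | 29.59 | 12,594 | 27.40 | 7592 | 34.14 | 3643 | 36.14 | 5260 | 34.54 | 57 | 22.09 |
| Folate | No | 68,131 | 99.88 | 45,921 | 99.90 | 22,209 | 99.86 | 10,062 | 99.82 | 15,209 | 99.87 | 257 | 99.61 |
| | Yes | 79 | 0.12 | 47 | 0.10 | 32 | 0.14 | 18 | 0.18 | 20 | 0.13 | 1 | 0.39 |
| VitB6 | No | 68,050 | 99.77 | 45,862 | 99.77 | 22,187 | 99.76 | 10,054 | 99.74 | 15,194 | 99.77 | 258 | 100.0 |
| | Yes | 160 | 0.23 | 106 | 0.23 | 54 | 0.24 | 26 | 0.26 | 35 | 0.23 | 0 | 0.00 |
| CoQ | No | 67,589 | 99.09 | 45,490 | 98.96 | 22,098 | 99.36 | 10,017 | 99.37 | 15,137 | 99.40 | 258 | 100.0 |
| | Yes | 621 | 0.91 | 478 | 1.04 | 143 | 0.64 | 63 | 0.63 | 92 | 0.60 | 0 | 0.0 |
| O3FO | No | 68,026 | 99.73 | 45,823 | 99.68 | 22,202 | 99.82 | 10,058 | 99.78 | 15,202 | 99.82 | 257 | 99.61 |
| | Yes | 184 | 0.27 | 145 | 0.32 | 39 | 0.18 | 22 | 0.22 | 27 | 0.18 | 1 | 0.39 |

Abbreviations: AS, Atherosclerosis; BMI, Body Mass Index; CAD, Coronary Artery Disease; CDH, CongenitalDiseaseOfHeart; CM, CancerMalignant; CVD, Cardiovascular Disease; C_Mycin, Clarithromycin; CoQ, Coenzyme Q10; E-CRP, Elevated C-reactive Protein; E-ESR, Elevated Erythrocyte Sedimentation Rate; E_Mycin, Erythromycin; HF, HeartFailure; HT, Hypertension; IDD, InsulinDependentDiabetes; LTUA, LongTermUseOfAntibiotics; LV, LabValue; ND, NutritionDeficiency; O3FO, Omega3FishOil; OA, Osteoarthritis; VitB6, Vitamin B-6; Z_pak, Azithromycin. (a) Some variable names are abbreviated to display the complete table; (b) One unidentified gender and missing data values for three patients (0.006%); (c) Body Mass Index (BMI) missing data values for 434 patients (0.6%); (d) LAB missing data values for 67,195 patients (97.8%); (e) LabValue with miscellaneous values for six patients, < 0.3 for 10 patients, < 0.1 for three patients, and missing data values for 67,195 patients (97.8%).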
Table 2. Summary of non-disease patients and patients with mental illness (n = 68,209).

| Group | n | % |
|---|---|---|
| Disease | 22,241 | 32.61 |
| Non-Disease | 45,968 | 67.39 |
| Depression | 10,080 | 14.78 |
| Anxiety | 15,229 | 22.33 |
| Schizophrenia | 258 | 0.38 |
| (D, A) (a) | 3261 | 32.35 |
| (D, S) (b) | 36 | 0.36 |
| (A, S) (c) | 39 | 0.26 |
| (A, D, S) (d) | 10 | 0.31 |

Abbreviations: A, Anxiety; D, Depression; S, Schizophrenia. (a) The tuples indicate patients with both depression (D) and anxiety (A). (b) The tuples indicate patients with both depression (D) and schizophrenia (S). (c) The tuples indicate patients with both anxiety (A) and schizophrenia (S). (d) The tuples indicate patients who have anxiety (A), depression (D), as well as schizophrenia (S).
Table 3. Association of dietary supplements, antibiotics, and comorbidities for depression.

| Variable | Level | N (a) | LCL (b) | UCL (c) | aOR (d) | p-Value |
|---|---|---|---|---|---|---|
| age | 0–10 | Ref (e) | | | | |
| | 11–20 | 858 | 4.4327 | 10.2297 | 6.7335 | 0.0000 |
| | 21–30 | 1312 | 3.8046 | 8.7597 | 5.7730 | 0.0000 |
| | 31–40 | 1402 | 4.0315 | 9.2835 | 6.1178 | 0.0000 |
| | 41–50 | 1521 | 4.1224 | 9.4920 | 6.2551 | 0.0000 |
| | 51–60 | 1944 | 3.6822 | 8.4682 | 5.5840 | 0.0000 |
| | 61–70 | 1556 | 2.9935 | 6.8981 | 4.5440 | 0.0000 |
| | Over 70 | 1462 | 2.3545 | 5.4319 | 3.5762 | 0.0000 |
| Gender | Male | Ref | | | | |
| | Female | 7168 | 1.6113 | 1.7788 | 1.6930 | 0.0000 |
| Z_pak | | 3643 | 1.3227 | 1.4563 | 1.3879 | 0.0000 |
| BMI | <18.5 | Ref | | | | |
| | >=40 | 1610 | 1.0832 | 1.5228 | 1.2843 | 0.0040 |
| Osteoarthritis | | 366 | 0.6892 | 0.8760 | 0.7770 | 0.0000 |
| CoronaryArteryDisease | | 625 | 0.6284 | 0.7547 | 0.6887 | 0.0000 |
| Obesity | | 1061 | 0.4774 | 0.5561 | 0.5153 | 0.0000 |
| Hypertension | | 4233 | 0.2373 | 0.2653 | 0.2509 | 0.0000 |

(a) Number of patients with depression; (b) Lower Confidence Limit; (c) Upper Confidence Limit; (d) Adjusted Odds Ratio; (e) Reference level.
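As a worked illustration of how an adjusted odds ratio relates to its confidence limits, the sketch below assumes Wald intervals from a logistic regression, in which the interval is symmetric around the coefficient on the log scale. The 95% level (z = 1.96) and both function names are assumptions; the text does not state how the limits were computed.

```python
import math


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Return (odds ratio, lower limit, upper limit) for a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lcl: float, ucl: float, z: float = 1.96) -> tuple[float, float]:
    """Recover the coefficient and standard error implied by a Wald interval."""
    beta = (math.log(lcl) + math.log(ucl)) / 2
    se = (math.log(ucl) - math.log(lcl)) / (2 * z)
    return beta, se


# Female row of Table 3 (depression): LCL 1.6113, UCL 1.7788, reported aOR 1.6930.
beta, se = beta_se_from_ci(1.6113, 1.7788)
aor, lcl, ucl = odds_ratio_ci(beta, se)
print(f"beta={beta:.4f}, se={se:.4f}, aOR={aor:.4f}")
```

Under this assumption the odds ratio is the geometric mean of the two limits, which agrees with the reported value for this row to three decimal places. An interval that excludes 1 corresponds to a statistically significant association at the chosen level.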
Table 4. Association of dietary supplements, antibiotics, and comorbidities for anxiety.

| Variable | Level | N (a) | LCL (b) | UCL (c) | aOR (d) | p-Value |
|---|---|---|---|---|---|---|
| Gender | Male | Ref (e) | | | | |
| | Female | 10,784 | 1.7604 | 1.9275 | 1.8421 | 0.0000 |
| Clarithromycin | | 163 | 1.4329 | 2.1862 | 1.7699 | 0.0000 |
| Intercept | | | 1.4021 | 2.1573 | 1.7392 | 0.0000 |
| Z_pak | | 5260 | 1.4481 | 1.5880 | 1.5163 | 0.0000 |
| CoQ | | 92 | 1.1736 | 1.8996 | 1.4932 | 0.0011 |
| age | 0–10 | Ref | | | | |
| | 21–30 | 2571 | 1.1618 | 1.7834 | 1.4394 | 0.0009 |
| | 31–40 | 2517 | 1.0576 | 1.6216 | 1.3096 | 0.0134 |
| | 51–60 | 2632 | 0.5436 | 0.8317 | 0.6724 | 0.0003 |
| | 61–70 | 1787 | 0.3454 | 0.5304 | 0.4280 | 0.0000 |
| | >70 | 1682 | 0.2489 | 0.3827 | 0.3086 | 0.0000 |
| HeartFailure | | 224 | 0.6440 | 0.8746 | 0.7505 | 0.0002 |
| BMI | <18.5 | Ref | | | | |
| | 18.5–24.99 | 4,008 | 0.5725 | 0.7615 | 0.6603 | 0.0000 |
| | 25–29.99 | 4,052 | 0.4727 | 0.6286 | 0.5451 | 0.0000 |
| | 30–39.99 | 4,803 | 0.3818 | 0.5078 | 0.4403 | 0.0000 |
| | >=40 | 1,803 | 0.3542 | 0.4796 | 0.4121 | 0.0000 |
| Osteoarthritis | | 444 | 0.5357 | 0.6772 | 0.6023 | 0.0000 |
| CoronaryArteryDisease | | 706 | 0.5396 | 0.6439 | 0.5894 | 0.0000 |
| ElevatedCRP | | 62 | 0.3983 | 0.7934 | 0.5621 | 0.0011 |
| Obesity | | 1,255 | 0.3027 | 0.3519 | 0.3264 | 0.0000 |
| Hypertension | | 5,490 | 0.1857 | 0.2055 | 0.1954 | 0.0000 |

(a) Number of patients with anxiety; (b) Lower Confidence Limit; (c) Upper Confidence Limit; (d) Adjusted Odds Ratio; (e) Reference level.
Table 5. Association of dietary supplements, antibiotics, and comorbidities for schizophrenia.

| Variable | Level | N (a) | LCL (b) | UCL (c) | aOR (d) | p-Value |
|---|---|---|---|---|---|---|
| Z_pak | | 57 | 0.5168 | 0.9400 | 0.6970 | 0.018 |
| Gender | Male | Ref (e) | | | | |
| | Female | 108 | 0.3436 | 0.5766 | 0.4451 | 0.0000 |
| Hypertension | | 106 | 0.1567 | 0.2823 | 0.2103 | 0.0000 |

(a) Number of patients with schizophrenia; (b) Lower Confidence Limit; (c) Upper Confidence Limit; (d) Adjusted Odds Ratio; (e) Reference level.
Table 6. Association of dietary supplements, antibiotics, and comorbidities for the disease.

| Variable | Level | N (a) | LCL (b) | UCL (c) | aOR (d) | p-Value |
|---|---|---|---|---|---|---|
| age | 0–10 | Ref (e) | | | | |
| | 11–20 | 2003 | 2.0997 | 3.3290 | 2.6438 | 0.0000 |
| | 21–30 | 3377 | 2.4145 | 3.7744 | 3.0189 | 0.0000 |
| | 31–40 | 3412 | 2.1247 | 3.3040 | 2.6496 | 0.0000 |
| | 41–50 | 3359 | 1.5622 | 2.4222 | 1.9453 | 0.0000 |
| | 51–60 | 4011 | 1.0804 | 1.6695 | 1.3430 | 0.0079 |
| | >70 | 2807 | 0.5097 | 0.7887 | 0.6341 | 0.0000 |
| Intercept | | | 1.8792 | 2.9404 | 2.3507 | 0.0000 |
| Gender | Male | Ref | | | | |
| | Female | 15,543 | 2.0153 | 2.1931 | 2.1024 | 0.0000 |
| Clarithromycin | | 225 | 1.4559 | 2.1904 | 1.7857 | 0.0000 |
| Z_pak | | 7592 | 1.5693 | 1.7137 | 1.6398 | 0.0000 |
| VitB6 | | 54 | 1.0515 | 2.4383 | 1.6011 | 0.0283 |
| CoQ | | 143 | 1.2244 | 1.8736 | 1.5145 | 0.0001 |
| BMI | <18.5 | Ref | | | | |
| | 18.5–24.99 | 5473 | 0.5887 | 0.7924 | 0.6830 | 0.0000 |
| | 25–29.99 | 5817 | 0.4823 | 0.6479 | 0.5590 | 0.0000 |
| | 30–39.99 | 7352 | 0.4246 | 0.5702 | 0.4921 | 0.0000 |
| | >=40 | 2892 | 0.4342 | 0.5913 | 0.5067 | 0.0000 |
| ElevatedESR | | 99 | 0.4614 | 0.8791 | 0.6369 | 0.0061 |
| HeartFailure | | 362 | 0.5487 | 0.7188 | 0.6280 | 0.0000 |
| CoronaryArteryDisease | | 1171 | 0.4861 | 0.5663 | 0.5247 | 0.0000 |
| Osteoarthritis | | 674 | 0.4119 | 0.5139 | 0.4601 | 0.0000 |
| ElevatedCRP | | 73 | 0.1946 | 0.3936 | 0.2768 | 0.0000 |
| Obesity | | 1986 | 0.2164 | 0.2501 | 0.2327 | 0.0000 |
| Hypertension | | 8527 | 0.0987 | 0.1094 | 0.1039 | 0.0000 |
| InsulinDependentDiabetes | | 126 | 0.0778 | 0.1216 | 0.0973 | 0.0000 |

(a) Number of patients with disease; (b) Lower Confidence Limit; (c) Upper Confidence Limit; (d) Adjusted Odds Ratio; (e) Reference level.
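Table 7. Prediction performance of four different approaches for mental illnesses.

| Illness | Variable Selection | Under Sampling | Model | Accuracy | F1-Score | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|---|---|---|
| Depression | A-R² | Yes | LGBM | 0.7801 | 0.7913 | 0.8338 | 0.7265 | 0.8626 |
| Depression | No | Yes | LGBM | 0.7648 | 0.7745 | 0.8083 | 0.7214 | 0.8518 |
| Depression | No | No | LGBM | 0.8530 | 0.9199 | 0.9903 | 0.0615 | 0.7648 |
| Depression | LR | No | LGBM | 0.8524 | 0.9201 | 0.9974 | 0.0160 | 0.7567 |
| Anxiety | A-R² | Yes | LGBM | 0.8293 | 0.8286 | 0.8251 | 0.8335 | 0.8845 |
| Anxiety | No | Yes | LGBM | 0.8242 | 0.8231 | 0.8178 | 0.8306 | 0.8775 |
| Anxiety | No | No | LGBM | 0.8558 | 0.9100 | 0.9380 | 0.5701 | 0.8318 |
| Anxiety | LR | No | LGBM | 0.8550 | 0.9091 | 0.9331 | 0.5833 | 0.8289 |
| Schizophrenia | LR | Yes | RF | 0.8759 | 0.8770 | 0.8976 | 0.8514 | 0.9292 |
| Schizophrenia | No | Yes | RF | 0.8779 | 0.8799 | 0.8983 | 0.8544 | 0.9268 |
| Schizophrenia | No | No | XGB | 0.9962 | 0.9981 | 1.0000 | 0.0000 | 0.7423 |
| Schizophrenia | LR | No | XGB | 0.9962 | 0.9981 | 1.0000 | 0.0000 | 0.7361 |
| Disease | No | Yes | LGBM | 0.8663 | 0.8772 | 0.9550 | 0.7776 | 0.9159 |
| Disease | A-R² | Yes | XGB | 0.8670 | 0.8788 | 0.9646 | 0.7695 | 0.9130 |
| Disease | No | No | LGBM | 0.8565 | 0.9035 | 0.9971 | 0.5659 | 0.8522 |
| Disease | LR | No | LGBM | 0.8555 | 0.9028 | 0.9957 | 0.5656 | 0.8515 |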
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sivakumar, J.; Ahmed, S.; Begdache, L.; Jain, S.; Won, D. Prediction of Mental Illness in Heart Disease Patients: Association of Comorbidities, Dietary Supplements, and Antibiotics as Risk Factors. J. Pers. Med. 2020, 10, 214. https://doi.org/10.3390/jpm10040214
