Predicting inmate suicidal behavior with an interpretable ensemble machine learning approach in smart prisons

The convergence of smart technologies and predictive modelling in prisons presents an opportunity to revolutionize the monitoring of inmate behaviour, allowing early detection of signs of distress and effective mitigation of suicide risks. While machine learning algorithms have been extensively employed to predict suicidal behaviour, a critical aspect that has often been overlooked is the interpretability of these models. Most work on model interpretation for suicide prediction limits itself to feature reduction and highlighting the most important contributing features. To address this research gap, we used Anchor explanations to create human-readable statements based on simple rules, which, to our knowledge, have never been used before for suicide prediction models. We also overcome a limitation of anchor explanations, namely that they produce weak rules on high-dimensional datasets, by first reducing the data features with SHapley Additive exPlanations (SHAP). We further reduce the data features through anchor interpretations for the final ensemble model of XGBoost and random forest. Our results indicate a significant improvement over state-of-the-art models, with an accuracy of 98.6% and a precision of 98.9%. The F1-score for the best suicide ideation model was 96.7%.

Algorithms 1 and 2 correspond to the subsections "Dimensionality Reduction via SHAP" and "Interpretation & Dimensionality Reduction via Anchor", respectively, in the main paper.

In Algorithm 1, we generate feature importance values with SHAP using the XGBoost classifier (XGB). We take our two processed datasets, one containing suicide ideation features (Suicide D) and the other without such features (NonSuicide D), and perform the following steps on each. We split the dataset into a training set (T) and a testing set (S) using an 80-20 split ratio. The XGBoost classifier is trained on the training set (T) and used to make predictions on the testing set (S). To calculate feature importance, the input dataset (X) is converted into a DataFrame (xFrame) with the same column structure as the training and testing sets. A SHAP explainer is then created from the trained XGBoost classifier. The explainer computes the SHAP values, which quantify the contribution of each feature to the predictions made by the XGBoost model. The computed SHAP values (SHAPValues) are visualized in a bar plot that ranks the features by their impact on the model's predictions; the parameter "max display=27" limits the plot to the top 27 features. Finally, we create two reduced datasets based on these feature rankings.

Algorithm 2: Algorithm to obtain top anchored features using Anchor on XGBoost.

In Algorithm 2, for each of the two reduced datasets (RDS) from the SHAP analysis, we first split the data into a training set (T) and a testing set (S) using an 80-20 split ratio. The XGBoost classifier (XGB) is then trained on the training set (T) and used to make predictions on the testing set (S). To prepare the training set (T) for anchor explanations, we create a list of feature names (FeatureNames) from the columns of the training set and build a DataFrame (df) from the training set (T) and these feature names. Next, an AnchorTabularExplainer is created using the values of the training set (T) and the feature names (FeatureNames); this explainer generates anchor explanations for the XGBoost model. We iterate over each row (r) in the testing set (S). For each row, we retrieve the instance (INS), obtain the XGBoost model's prediction (P), and generate an explanation for the instance by calling the explainer's 'explain instance()' method with the prediction (P) as input. The generated explanation (genExp) includes the anchor names, precision, and coverage. We record and display the anchored rules, precision, and coverage for each generated explanation; this information helps to understand the rules or conditions under which the model makes its predictions. After iterating over all rows in the testing set (S), we analyze the recorded rules and create a further reduced dataset with 12 features for the suicide ideation dataset and 19 features for the dataset without suicide ideation features.
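Algorithm 1's SHAP-based ranking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact code: the function name `shap_feature_ranking`, the random seed, and the use of mean absolute SHAP values as the ranking criterion (rather than reading ranks off the bar plot) are our own choices.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def shap_feature_ranking(X: pd.DataFrame, y, top_k: int):
    """Rank features by mean |SHAP value| under an XGBoost classifier.

    Mirrors Algorithm 1: 80-20 split, train XGB, explain with SHAP,
    then keep the top_k most important features.
    """
    # 80-20 train/test split, as in the paper
    T_X, S_X, T_y, S_y = train_test_split(X, y, test_size=0.2, random_state=42)
    xgb = XGBClassifier(eval_metric="logloss")
    xgb.fit(T_X, T_y)

    # TreeExplainer computes per-row, per-feature contributions
    explainer = shap.TreeExplainer(xgb)
    shap_values = explainer.shap_values(X)  # shape: (n_rows, n_features)

    # Aggregate to a global importance score per feature
    importance = np.abs(shap_values).mean(axis=0)
    order = np.argsort(importance)[::-1]
    return [X.columns[i] for i in order[:top_k]]
```

The paper visualizes the same information as a bar plot limited to the top 27 features; here the ranking is simply returned as a list so the reduced dataset can be built programmatically.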
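The final rule-analysis step, which produces the counts reported in Tables 1 and 2, can be sketched in pure Python. The record format and helper names below are hypothetical stand-ins: each explanation record is assumed to hold the model's predicted class and the features used in the anchor rule, which is information the AnchorTabularExplainer output provides.

```python
from collections import Counter

def tally_anchor_features(explanations):
    """Tally how often each feature appears as an anchor.

    `explanations` is an iterable of (predicted_class, anchor_features)
    pairs, one per explained test row. Returns three Counters:
    T (total usage), T_NS (non-suicidal predictions), T_S (suicidal
    predictions), matching the columns of Tables 1 and 2.
    """
    T, T_NS, T_S = Counter(), Counter(), Counter()
    for predicted_class, anchor_features in explanations:
        for feature in anchor_features:
            T[feature] += 1
            # class 1 assumed to denote the suicidal class
            (T_S if predicted_class == 1 else T_NS)[feature] += 1
    return T, T_NS, T_S

def select_top_features(T, k):
    """Keep the k most frequently anchored features for the reduced dataset."""
    return [feature for feature, _ in T.most_common(k)]
```

With `k = 12` for the suicide ideation dataset and `k = 19` for the dataset without suicide ideation features, `select_top_features` yields the further-reduced feature sets described above.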

Figure 1: Pairwise correlation of the anchor-reduced features for the dataset without suicide ideation features.

Figure 2: Pairwise correlation of the anchor-reduced features for the dataset with suicide ideation features.

Figure 3: Feature contributions by SHAP values of the anchor-reduced features, including suicidal ideation features.

Figure 4: Feature contributions by SHAP values of the anchor-reduced features, excluding suicidal ideation features.

Figure 5: Confusion matrix for the anchor-reduced ensemble model with and without suicidal ideation (SI) features.

Table 1: Important features for the dataset with suicidal ideation features, along with the count of how many times each feature was used as an anchor in explanations. T: total usage count as an anchor; T_NS: usage count in non-suicidal class predictions; T_S: usage count in suicidal class predictions.

Table 2: Important features for the dataset without suicidal ideation features, along with the count of how many times each feature was used as an anchor in explanations. T: total usage count as an anchor; T_NS: usage count in non-suicidal class predictions; T_S: usage count in suicidal class predictions.