Explainable machine learning framework to predict the risk of work-related neck and shoulder musculoskeletal disorders among healthcare professionals

Objective This study aims to develop risk prediction models for neck and shoulder musculoskeletal disorders among healthcare professionals. Methods A stratified sampling method was employed to select employees from medical institutions in Nanning City, yielding 617 samples. The Boruta algorithm was used for feature selection, and various models, including Tree-Based Models, Single Hidden-Layer Neural Network Models (MLP), Elastic Net Models (ENet), and Support Vector Machines (SVM), were applied to predict the selected variables, utilizing SHAP algorithms for individual-level local explanations. Results The SVM model excels in both Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) and exhibits more stable performance when generalizing to unseen data. The Random Forest model exhibited relatively high overall performance on the training set. The MLP model emerges as the most consistent and accurate in predicting shoulder musculoskeletal disorders, while the SVM model shows strong fitting capabilities during the training phase, with occupational factors identified as the main contributors to WMSDs. Conclusion This study successfully constructs work-related musculoskeletal disorder risk prediction models for healthcare professionals, enabling a quantitative analysis of the impact of occupational factors. This advancement is beneficial for future economical and convenient work-related musculoskeletal disorder screening in healthcare professions.


Introduction
Work-related musculoskeletal disorders (WMSDs) refer to injuries or disorders of the muscles, nerves, tendons, joints, cartilage, and spinal disks that are associated with exposure to risk factors in the workplace (1). According to WMSD data from 2018 to 2020 published by the Chinese Center for Disease Control and Prevention, there are three high-prevalence groups in China: flight attendants, medical staff, and workers in vegetable greenhouses. Medical staff, in particular, are a high-risk group for WMSDs due to their heavy workloads accompanied by poor dynamic loads, static loads, physical loads, and ergonomic environments (2). Current research has revealed that WMSDs among medical staff are most commonly observed in the shoulder, neck, and lower back (3), with the highest prevalence occurring in the lower neck region (4).
Previous studies have predominantly utilized descriptive statistical analysis and logistic regression to analyze the influencing factors of musculoskeletal disorders among medical staff in terms of dynamic loading, static loading, physical loading, repetitive motion, ergonomic environment, and labor organization. Wang et al. employed logistic regression to analyze a sample of 1,017 medical staff in the department of obstetrics and gynecology and found that individual, postural, work-environmental, and psychosocial factors were the main contributors to musculoskeletal disorders (5). Krishnan et al. discovered that musculoskeletal disorders were associated with age, low education level, female gender, years of working experience, and lifestyle (6). Machine learning models have demonstrated significant advantages, such as high accuracy and resistance to overfitting. Consequently, they have been widely applied in predicting chronic diseases, infectious diseases, and tumors. However, the utilization of machine learning models in the study of WMSDs remains relatively limited.
Considering these research gaps, we utilized data on WMSDs from healthcare professionals in Nanning, Guangxi Zhuang Autonomous Region, to construct risk prediction models for shoulder and neck WMSDs. This approach quantitatively reveals the varying degrees of influence each variable has on the risk of developing work-related musculoskeletal disorders. A web calculator for the risk of neck and shoulder WMSDs was built on shinyapps.io and can be applied to the early detection and prevention of neck and shoulder WMSDs in healthcare workers. The risk prediction model for neck WMSDs is available at https://shoulderwmsdspred.shinyapps.io/neck/. The risk prediction model for shoulder WMSDs is available at https://shoulderwmsdspred.shinyapps.io/shoulder/.

Setting and participants
This study, funded by the Health Commission of Nanning, was conducted as part of a survey on musculoskeletal disorders among occupational populations in Nanning. The research was carried out from June 2022 to March 2023. Medical personnel from medical institutions in seven districts and five counties of Nanning were selected as the study participants using stratified sampling. The survey was conducted online using the QuestionStar platform, and 617 medical personnel from three tertiary hospitals, seven secondary hospitals, and three disease control centers participated by completing the questionnaires.

Research tools
The questionnaire comprised four sections: personal information, musculoskeletal disorder status, work stress, and occupational health literacy. Cronbach's alpha for this questionnaire was 0.741.
Musculoskeletal disorder status was assessed using the Chinese version of the "Musculoskeletal Disorder Questionnaire" provided by the Occupational Health and Poison Control Institute of the Chinese Center for Disease Control and Prevention, a tool developed with reference to the Nordic musculoskeletal disorder survey forms and adapted to the Chinese context (7). The survey assessed musculoskeletal disorders in nine areas: neck, shoulders, back, elbows, waist, wrists, hips, knees, and ankles/feet. Respondents reported occurrences of neck and shoulder WMSDs during the last 12 months, which were used as the dependent variables for the neck and shoulder WMSD predictive models.
The work stress scale utilized in this study was the Q17 Stress Test, which is widely applied to assess work stress in hospitals (8).
Occupational health literacy was evaluated with the 2021 National Health Commission's National Key Industry Occupational Health Literacy Monitoring Questionnaire. A correct response rate of at least 60% was considered to indicate adequate occupational health literacy.

Ethical consideration
This study obtained approval from the Ethics Committee of Guangxi Medical University (approval number: 2021002). The purpose and content of the research were explained to all participants, and informed consent was obtained from each of them.

The Boruta algorithm is a feature selection method well suited to machine learning tasks. Its primary objective is to identify the most important attributes in a dataset with many features, thereby improving model performance while mitigating the risk of overfitting.
As Figure 1 indicates, several variables are strongly interrelated. In light of this observation, the dataset was split into training and testing subsets at a 3:1 ratio. The target variables, namely the presence of neck and shoulder musculoskeletal disorders, were then used to train the machine learning algorithms. Using the Boruta algorithm, we examined the feature variables and removed those that made no meaningful contribution to the model. This process yielded 12 independent variables for the "Neck" category and 17 independent variables for the "Shoulder" category, as detailed in Figure 2 and Table 1.
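The core idea behind Boruta, comparing each real feature against shuffled "shadow" copies, can be illustrated with a deliberately simplified sketch. This is not the study's implementation: Boruta ranks features by random forest importance and applies a statistical test, whereas the sketch below substitutes absolute Pearson correlation as a stand-in importance score and a fixed hit-rate cutoff. The function names, the feature names, and the `min_hit_rate` parameter are hypothetical.

```python
import random
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def shadow_feature_selection(features, target, n_iter=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled ('shadow') copy in most rounds.

    Assumption: correlation replaces the random forest importance used by Boruta.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Shuffling a column destroys any real link between it and the target.
        shadow_scores = []
        for values in features.values():
            shuffled = list(values)
            rng.shuffle(shuffled)
            shadow_scores.append(abs_correlation(shuffled, target))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_correlation(values, target) > threshold:
                hits[name] += 1
    return [name for name, count in hits.items() if count / n_iter >= min_hit_rate]


# Hypothetical toy data: one informative item and one uninformative item.
target = [0, 1] * 20
noise_rng = random.Random(1)
features = {
    "neck_forward": list(target),
    "noise_item": [noise_rng.random() for _ in target],
}
selected = shadow_feature_selection(features, target)
```

A feature is retained only if it repeatedly outperforms the strongest shadow feature, which is why variables with no real relationship to the outcome tend to be discarded.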

Robustness assessment of models
We conducted a comparative analysis encompassing four distinct model categories: (1) Tree-based models: this category includes decision tree models, random forest models (RF), and XGBoost models. (2) Single hidden-layer neural network models (MLP). The multilayer perceptron consists of multiple layers of neurons, where each layer receives its inputs from the preceding layer and passes its outputs to the subsequent layer. These layers include the input layer, hidden layer, and output layer. In this study, the MLP employed a single hidden layer comprising 15 hidden units. (3) Elastic net models (ENet). (4) Support vector machines (SVM). For each of these model categories, we performed an extensive hyperparameter grid search through 5-fold cross-validation on the training dataset (refer to Figure 3) (9). Subsequently, we evaluated model performance on both the training and testing datasets using metrics such as mean absolute error (MAE), root mean square error (RMSE), accuracy, and other relevant indicators.
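The tuning procedure described above, a grid search scored by 5-fold cross-validation, can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the study's code: the helper names, the toy threshold "model", and the `cutoff` hyperparameter are all hypothetical, and higher scores are assumed to be better.

```python
import random
from itertools import product


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def grid_search_cv(fit, score, X, y, grid, k=5, seed=0):
    """Return (best_params, best_mean_score) over every combination in the grid."""
    folds = k_fold_indices(len(X), k, seed)
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(
                score(model, [X[i] for i in held_out], [y[i] for i in held_out])
            )
        mean_score = sum(fold_scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def fit_threshold(X_train, y_train, cutoff):
    """Toy 'model': predicts 1 when the single input reaches the cutoff."""
    return lambda x: int(x >= cutoff)


def accuracy(model, X_test, y_test):
    """Fraction of held-out samples the model labels correctly."""
    return sum(model(x) == label for x, label in zip(X_test, y_test)) / len(y_test)


X = list(range(20))
y = [int(x >= 10) for x in X]
best_params, best_score = grid_search_cv(fit_threshold, accuracy, X, y, {"cutoff": [5, 10, 15]})
```

Each hyperparameter combination is scored only on data held out from fitting, so the selected setting reflects out-of-fold performance rather than fit to the training folds.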

Model interpretability
We employ the SHapley Additive exPlanations (SHAP) framework as our method for model interpretability. We use the R programming language and both the "fastshap" and "shapviz" packages (10, 11). These tools allow us to construct beeswarm plots and waterfall plots, respectively. The beeswarm plots show the distribution of the SHAP values for each feature across all the data points, and the waterfall plots are individualized explanations of a single prediction, showing the contribution of each feature to the final prediction (see Figures 4, 5).
The Shapley value represents the average marginal contribution of a variable across all possible coalitions. For each individual, the SHAP value associated with each variable reflects its contribution to the individual's risk of musculoskeletal disorders in the neck and shoulder. An individual's predicted susceptibility to neck and shoulder musculoskeletal disorders is obtained by summing the contributions of these variables relative to the baseline value (which corresponds to the average predicted value across the dataset).
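The definition above can be made concrete with an exact Shapley computation on a small toy model. The sketch below enumerates every coalition, which is feasible only for a handful of features; it does not reproduce the approximation used by the "fastshap" package, and the toy prediction function and background rows are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, background):
    """Exact Shapley values of `predict` at `instance` against a background set.

    A coalition's value is the mean prediction when its features take the
    instance's values and all other features take background values.
    """
    n = len(instance)

    def coalition_value(subset):
        total = 0.0
        for row in background:
            mixed = [instance[j] if j in subset else row[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (coalition_value(members | {i}) - coalition_value(members))
        values.append(phi)
    baseline = coalition_value(set())  # average prediction over the background
    return baseline, values


def toy_risk(x):
    """Hypothetical risk score with one additive term and one interaction."""
    return 0.1 + 0.3 * x[0] + 0.2 * x[1] * x[2]


background = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]]
instance = [1, 1, 1]
baseline, contributions = shapley_values(toy_risk, instance, background)
```

The baseline plus the sum of the contributions equals the model's prediction for the instance, which is the additivity property that a waterfall plot displays.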

Partial dependency computation
The computation and graphical representation of partial dependency values for each variable are showcased in Figures 6, 7, offering illustrative examples.
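For illustration, a partial dependence value can be computed by fixing one variable at a chosen value for every observation and averaging the model's predictions. The sketch below is a minimal version under stated assumptions: the toy prediction function and data rows are hypothetical and are not taken from the study.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Mean prediction when one feature is forced to each grid value in turn."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # override only the feature of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_risk(x):
    """Hypothetical risk score that is linear in two binary items."""
    return 0.2 + 0.3 * x[0] + 0.1 * x[1]


rows = [[0, 0], [1, 1], [0, 1], [1, 0]]
curve = partial_dependence(toy_risk, rows, feature_index=0, grid=[0, 1])
```

Plotting the resulting pairs of feature value and mean prediction gives the kind of curve a partial dependence figure shows for a single variable.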

Demographic data
The surveyed medical personnel consisted of 403 females and 214 males (see Table 2). Among them, 419 were married, 173 were unmarried, and 25 had unknown marital status. Regarding age distribution, 244 medical personnel were between 25 and 34 years old, 194 were between 35 and 44 years old, and 102 were between 45 and 54 years old. In terms of educational background, 42 respondents had education levels below a university degree, 527 had completed college or undergraduate studies, and 48 had completed postgraduate studies or above. Regarding work experience, 215 medical personnel had been in the profession for 15 years or more. Self-assessment of health status revealed that 379 individuals rated their health as average, while 216 rated it as good. As for monthly income, 201 medical personnel earned between 3,000 and 4,999 yuan, and 190 earned between 5,000 and 6,999 yuan. In terms of the size of their employing institutions, 376 medical personnel worked in units with 300-999 employees. Night shifts were part of the work schedule for 280 medical personnel. Additionally, 214 individuals had a weekly working time of 40 h or less, and 402 had no more than two types of chronic diseases.

Model performance comparison
The calibration curve of a model illustrates how well its predicted probabilities are calibrated on both the training and testing datasets. An ideally calibrated model would exhibit a curve closely aligned with the diagonal line running from the lower-left corner to the upper-right corner. The closer the calibration curve lies to this diagonal, the more accurate the model's probability predictions. Calibration varies among the risk prediction models for neck musculoskeletal disorders. The random forest model shows a relatively large deviation from the ideal diagonal line on the training set, suggesting potential overfitting to the training data. The support vector machine, on the other hand, exhibits a curve on the training set that is closer to the ideal diagonal line, indicating more accurate probability predictions. XGBoost demonstrates good calibration on the training data but appears to overestimate probabilities on the testing data. The calibration curve on the testing set for the elastic net model suggests a degree of miscalibration in predicting neck disorders. Although the MLP model exhibits strong calibration on the training data, its calibration on the testing data is comparatively poor (see Figure 8).
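A calibration curve of the kind described above can be obtained by grouping predictions into probability bins and comparing the mean predicted probability with the observed event rate in each bin. The sketch below assumes equal-width bins; the binning scheme used in the study is not stated, and the function name and toy data are hypothetical.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # a probability of 1.0 goes in the last bin
        bins[index].append((label, prob))
    curve = []
    for members in bins:
        if members:
            mean_prob = sum(prob for _, prob in members) / len(members)
            event_rate = sum(label for label, _ in members) / len(members)
            curve.append((mean_prob, event_rate, len(members)))
    return curve


# Hypothetical outcomes (1 = disorder reported) and predicted probabilities.
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.10, 0.15, 0.90, 0.95, 0.50, 0.55]
curve = calibration_curve(y_true, y_prob, n_bins=2)
```

Points where the observed event rate is lower than the mean predicted probability indicate overestimated risk, and points where it is higher indicate underestimated risk.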
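Calibration also varies among the risk prediction models for shoulder musculoskeletal disorders. The RF model exhibits a certain degree of deviation from the ideal diagonal line in both the training and testing calibration curves. This suggests some inconsistency between the model's predicted probabilities and the actual occurrence frequencies. The SVM model displays a calibration curve close to the ideal state on the training set, indicating relatively accurate probability predictions during the training phase. However, the calibration curve on the testing set deviates slightly, indicating that the model's predicted probabilities may be too high or too low when dealing with new data. The XGBoost model demonstrates good probability calibration on the training set, with a calibration curve that closely aligns with the ideal diagonal line. On the testing set, although there is a slight deviation in the curve, the overall performance remains relatively robust. The ENet model exhibits low predicted probabilities on both datasets, as evidenced in the histograms, where a significant portion of predicted probabilities clusters in the lower probability value range. Regarding the MLP model, the calibration curve on the training set indicates strong probability calibration. However, on the testing set, the curve deviates slightly from the ideal diagonal line, suggesting the possibility of mild overfitting. The histograms reveal a more even distribution of predicted probabilities on the training set but a somewhat more concentrated distribution on the testing set (see Figure 9).
Among the risk prediction models for neck musculoskeletal disorders, the SVM model achieves the lowest average MAE of 0.9165, indicating the smallest average prediction error. Following closely are the MLP and RF models, with average MAE values of 0.9850 and 0.9855, respectively. The XGBoost model has a slightly higher average MAE of 0.9950. The ENet model exhibits the highest average MAE at 0.9990.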
Similarly, the SVM model attains the lowest average RMSE of 1.0385, signifying its superior performance when considering penalties for larger errors. The MLP model follows with an average RMSE of 1.0940, ranking second. The ENet, XGBoost, and RF models display similar RMSE values of 1.1010, 1.1035, and 1.1045, respectively.
Among these models, the SVM model excels in both MAE and RMSE, indicating its relatively high predictive accuracy, especially in handling larger prediction errors. The MLP model performs well in RMSE but slightly lags behind the SVM in MAE. The ENet, XGBoost, and RF models exhibit comparable performance in both metrics but fall slightly short of SVM and MLP.
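The difference between the two error metrics can be seen in a short sketch. MAE averages the absolute errors, while RMSE squares them before averaging, so a single large error raises RMSE more than MAE. The function names and example values below are hypothetical and unrelated to the study's data.

```python
from math import sqrt


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes large errors more."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Three exact predictions and one prediction that is off by 4.
observed = [1, 2, 3, 4]
predicted = [1, 2, 3, 8]
mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
```

RMSE is never smaller than MAE on the same predictions, and the gap between them widens as the errors become more uneven.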
In the risk prediction models for shoulder musculoskeletal disorders, the MLP (multilayer perceptron) model shows the best performance on both the training set (MAE = 0.946) and the testing set (MAE = 0.954), indicating its predictions are closest to the actual values on average. The XGBoost model follows closely with MAE = 0.974 on the training set and MAE = 0.982 on the testing set, suggesting slightly less accurate predictions than MLP but still outperforming the other models. The SVM and ENet models have identical MAE on the training set (MAE = 1.001) and very similar performance on the testing set (SVM MAE = 1.009, ENet MAE = 1.007), which is moderate compared with MLP and XGBoost. The RF (random forest) model exhibits the highest MAE, particularly on the testing set (MAE = 1.111), which implies less accurate predictions on average compared with the other models.
The MLP model stands out as the most consistent and accurate model for predicting shoulder musculoskeletal disorders according to both MAE and RMSE metrics. XGBoost also performs well and could be considered a good alternative, although its computational cost relative to a neural network depends on the implementation and dataset size. The SVM and ENet models show moderate performance, while the RF model might require further parameter tuning or feature engineering to improve its prediction accuracy (see Figure 10).
When evaluating the machine learning models for predicting neck musculoskeletal disorders, we observed that the conventional logistic regression model showed only average performance. The random forest model exhibited relatively high overall performance on the training set (accuracy = 0.703, sensitivity = 0.749, specificity = 0.667, AUC = 0.772). However, on the testing set, the SVM model performed best, with an accuracy of 0.574 and an AUC of 0.623. This suggests that while the random forest model demonstrates strong learning capabilities during the training phase, the SVM model exhibits more stable performance when generalizing to unseen data (see Table 3).
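The classification metrics reported here follow from the confusion matrix and from the ranking of predicted probabilities. The sketch below is a minimal illustration: the 0.5 decision threshold is an assumption (the study's threshold is not stated), and the function names and toy data are hypothetical.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a given probability threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases that are correctly cleared
    }


def auc(y_true, y_prob):
    """Probability that a random case is ranked above a random non-case (ties count 0.5)."""
    positives = [p for label, p in zip(y_true, y_prob) if label == 1]
    negatives = [p for label, p in zip(y_true, y_prob) if label == 0]
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    return wins / (len(positives) * len(negatives))


y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
metrics = classification_metrics(y_true, y_prob)
area = auc(y_true, y_prob)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric can favor different models.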
For the prediction of shoulder musculoskeletal disorders, the conventional logistic regression model performs relatively average. The SVM model demonstrates the best performance on the training set (accuracy = 0.781, sensitivity = 0.802, specificity = 0.768, AUC = 0.866). On the testing set, the MLP model achieves the highest accuracy (0.690), while the XGBoost model has the highest AUC value (0.734). This suggests that the SVM model exhibits strong fitting capabilities to the data during the training phase, but on the testing set, the MLP and XGBoost models provide better generalization. In particular, the MLP model exhibits higher specificity (0.713) on the testing set, indicating its good performance in reducing false positives (see Table 4).

FIGURE 3
The optimal hyperparameter cross-validation results for machine learning models. The subplots in (A), from left to right, and from the first row to the second row, represent the optimal hyperparameter cross-validation results for neck musculoskeletal disorder prediction models for RF, SVM, ENet, MLP, and XGBoost, respectively. The subplots in (B), from left to right, and from the first row to the second row, represent the optimal hyperparameter cross-validation results for shoulder musculoskeletal disorder prediction models for RF, SVM, ENet, MLP, and XGBoost, respectively. The horizontal axis is sensitivity, and the vertical axis is 1-specificity.

FIGURE 4
The beeswarm plot and waterfall plot for neck musculoskeletal disorders. In (A), from the first row to the second row, and from left to right, the subplots represent the neck musculoskeletal disorders beeswarm plots for RF, SVM, XGBoost, ENet, and MLP, respectively. In (B), from the first row to the second row, and from left to right, the subplots represent the neck musculoskeletal disorders waterfall plots for RF, SVM, XGBoost, ENet, and MLP, respectively.

FIGURE 5
The beeswarm plot and waterfall plot for shoulder musculoskeletal disorders. In (A), from the first row to the second row, and from left to right, the subplots represent the shoulder musculoskeletal disorders beeswarm plots for RF, SVM, XGBoost, ENet, and MLP, respectively. In (B), from the first row to the second row, and from left to right, the subplots represent the shoulder musculoskeletal disorders waterfall plots for RF, SVM, XGBoost, ENet, and MLP, respectively.

Interpretability of machine learning models for the risk of musculoskeletal disorders
To quantitatively delineate the contribution of each variable in predicting the risk of musculoskeletal disorders of the neck, our investigation primarily focuses on the application of the Shapley Additive Explanations (SHAP) framework within the random forest (RF) and support vector machine (SVM) models. The RF model identifies the top six determinants of healthcare professionals' susceptibility to musculoskeletal disorders: prolonged forward neck posture, wrist flexion or maintenance of this position for extended periods, physical exhaustion post-work, prolonged neck twisting posture, static posture maintenance, and prolonged sedentary work. The SVM model reveals a similar hierarchy of influential factors, albeit with slight variations in their order. The results of conventional logistic regression (LR) are shown in Table 5, but since the performance of LR is inferior to that of RF and SVM, it is not discussed in detail. Notably, the sustained forward tilt of the wrist significantly augments the risk of neck-related musculoskeletal disorders. Conversely, prolonged sitting and maintaining a uniform posture while working exhibit a negative correlation with the risk of developing these disorders (refer to Figure 4).
To quantitatively exhibit the contribution of each variable in the prediction of shoulder musculoskeletal disorder risks, we primarily examine the outcomes of the Shapley Additive Explanations (SHAP) framework on the multilayer perceptron (MLP) and support vector machine (SVM) models. The MLP model identifies the six principal factors affecting the risk among healthcare professionals: prolonged forward neck posture, prolonged sedentary work, work-related stress levels, number of chronic diseases, physical exhaustion [...]. The results of conventional logistic regression (LR) are shown in Table 6, but since the performance of LR is inferior to that of MLP and SVM, it is not discussed in detail. Notably, low levels of work stress and not sitting for prolonged durations are associated with a lower risk of shoulder musculoskeletal disorders. In contrast, maintaining a prolonged forward neck posture significantly increases the risk of shoulder musculoskeletal disorders (refer to Figure 5).
Healthcare professionals who maintain a prolonged forward neck posture face a higher risk of developing neck musculoskeletal disorders. Similarly, those with extended periods of wrist flexion are more likely to suffer from these disorders. Medical staff experiencing any degree of tiredness after work, from slightly tired to extremely exhausted, are more susceptible to neck musculoskeletal disorders. Additionally, a long-term neck twisting posture and prolonged periods of sitting while working significantly increase the likelihood of these conditions (refer to Figure 6).
Results from the SVM and MLP models indicate that healthcare professionals who frequently maintain a forward neck posture are at a greater risk of shoulder musculoskeletal disorders. Similarly, prolonged sitting while working elevates the risk of these disorders. Moderate to high levels of work-related stress are more likely to lead to shoulder musculoskeletal disorders in medical staff. Those with one or more types of chronic diseases face a heightened risk of developing these conditions. Experiencing tiredness or extreme fatigue after work increases the likelihood of these disorders, as does a history of absenteeism due to illness. Moreover, maintaining a prolonged neck twisting posture, sustaining a significant bending posture for extended periods, and long-term wrist flexion are all associated with an increased risk of shoulder musculoskeletal disorders (see Figure 7).

Discussion
Different models exhibit varying performances in assessing the risk of shoulder and neck musculoskeletal disorders, each with unique strengths and limitations. For instance, while the random forest excels on the training dataset for predicting neck musculoskeletal disorder risk, the SVM demonstrates superior generalization on the test dataset. These findings emphasize the importance of considering performance metrics when selecting models for specific medical prediction tasks, especially in clinical applications where a model's generalizability and its ability to reduce misdiagnosis (through high specificity) are crucial. Future research could explore these models' performance on larger and more diverse datasets and refine their parameter settings, offering deeper insights for effective clinical prediction of musculoskeletal diseases.
Many studies using machine learning models lack interpretability (12-14), making it challenging to verify their reliability. Interpretability supports the acceptability of evidence and facilitates data-driven, personalized healthcare management. To achieve this, we have developed interpretable models for predicting the risk of shoulder and [...] (21), which includes five major categories and 48 items. However, it remains time-consuming for occupational screening. Our study employs the Boruta algorithm for feature selection, reducing neck musculoskeletal disorder screening to just 12 key items and shoulder disorder screening to 17, enabling a simplified screening process to identify individuals at higher risk of musculoskeletal disorders. By inputting demographic data into an electronic system, the musculoskeletal disorder prediction model can assess the risk of these conditions in healthcare professionals, thereby significantly reducing the workload for screening.

FIGURE 8
The calibration curves for machine learning models predicting the risk of neck musculoskeletal disorders. The first column shows the calibration curves for the training data, and the second column shows the calibration curves for the testing data. From the first row to the fifth row, the calibration curves for RF, SVM, XGBoost, ENet, and MLP are displayed for both the training and testing data.

FIGURE 9
The calibration curves for machine learning models predicting the risk of shoulder musculoskeletal disorders. The first column shows the calibration curves for the training data, and the second column shows the calibration curves for the testing data. From the first row to the fifth row, the calibration curves for RF, SVM, XGBoost, ENet, and MLP are displayed for both the training and testing data.

It is noteworthy that the forward posture of the neck in healthcare professionals significantly contributes to the risk of musculoskeletal disorders in both the neck and shoulder regions. Providing ergonomic chairs is recommended. Zhang et al. found that the factors influencing sonographers' musculoskeletal disorders include work duration, consistent with the results of this study, where work duration was the main influencing factor for shoulder musculoskeletal disorders among healthcare professionals (12). We recommend providing targeted ergonomics-oriented occupational health education for medical staff, providing ergonomic chairs, encouraging correct working postures, and emphasizing the importance of rest after work to reduce the incidence of occupational musculoskeletal disorders. Personalized musculoskeletal disorder risk management advice should be provided to healthcare professionals across different departments, considering both occupational factors and individual health profiles. In addition to professional factors, this study also discovered a correlation between the number of chronic diseases in medical personnel and the risk of shoulder musculoskeletal disorders, suggesting that future research should delve deeper into the clinical mechanisms linking work-related musculoskeletal disorders with chronic diseases.
This study also has certain limitations. The absence of physiological tests makes it difficult to eliminate factors causing musculoskeletal disorders unrelated to work. Another limitation is the lack of comparison of musculoskeletal disorder factors among medical staff from different departments. The risk prediction models are derived from cross-sectional data, where exposure and outcome are ascertained at the same time point, inherently limiting the predictions. Additionally, the sample size of this study is relatively small. Future studies should establish large cohorts of healthcare workers with WMSDs to better explore the causal relationships between variables. Furthermore, a comparative analysis of musculoskeletal disorder factors among medical staff from different departments should be conducted.

Conclusion
Five machine learning models were utilized to construct predictive models for the risk of neck and shoulder musculoskeletal disorders among healthcare professionals. These models are economically feasible and convenient for preliminary screening of work-related musculoskeletal disorders in healthcare workers. Additionally, this study offers a comprehensive interpretable machine learning framework, enabling a quantitative analysis of the impact of occupational factors on the risk of work-related musculoskeletal disorders. A web calculator can be applied to the early detection and prevention of neck and shoulder WMSDs in healthcare workers.

FIGURE 1
Heatmap illustrating the correlations between different variables. (A,B) respectively present the variable correlations for musculoskeletal disorders in the neck and shoulder regions.
Luo et al. 10.3389/fpubh.2024.1414209 Frontiers in Public Health frontiersin.org
curve close to the ideal state on the training set, indicating relatively accurate probability predictions during the training phase. However, the calibration curve on the testing set deviates slightly, indicating that the model's predicted probabilities may be too high or too low when dealing with new data. The XGBoost model demonstrates good probability calibration on the training set, with a calibration curve that closely aligns with the ideal diagonal line. On the testing set, although there is a slight deviation in the curve, the overall performance remains relatively robust. The ENet model exhibits low predicted probabilities on both datasets, as evidenced in the histograms, where a significant portion of predicted probabilities clusters in the lower probability value range. Regarding the MLP model, the calibration curve on the training set indicates strong probability calibration. However, on the testing set, the curve deviates slightly from the ideal diagonal line, suggesting the possibility of mild overfitting. The histograms reveal a more even distribution of predicted probabilities on the training set but a

FIGURE 2
Data variable selection based on the Boruta algorithm. (A,B) respectively depict the variable selection outcomes for the risk prediction dataset of neck musculoskeletal disorders and shoulder musculoskeletal disorders.

FIGURE 6
Bias plot of important factors for neck musculoskeletal disorders.

FIGURE 7
Bias plot of important factors for shoulder musculoskeletal disorders.

FIGURE 10
The MAE and RMSE values for various machine learning models. (A) depicts the MAE and RMSE values for neck musculoskeletal disorders, while (B) illustrates the MAE and RMSE values for shoulder musculoskeletal disorders.

TABLE 1
Variables of the musculoskeletal disorder risk prediction model.
√ Simultaneous_bending_turning Do you frequently maintain a posture that involves both bending and turning for extended periods during work? √

FIGURE 3

TABLE 2
Basic demographic characteristics of survey participants.

TABLE 3
Comparison of model performance for neck musculoskeletal disorder prediction.

TABLE 4
Comparison of model performance for shoulder musculoskeletal disorder prediction.

TABLE 5
Neck musculoskeletal disorder binary logistic regression results.

TABLE 6
Shoulder musculoskeletal disorder binary logistic regression results.
should delve deeper into the clinical mechanisms linking work-related musculoskeletal disorders with chronic diseases.