A pilot project informing the design of a web-based dynamic nomogram in order to predict survival one year after hip fracture surgery

Background: Hip fracture is associated with high mortality. Identification of individual risk informs anesthetic and surgical decision making and can reduce the risk of death. However, interpretation of data and application of research findings can be difficult, and there is a need to simplify risk indices for clinicians and lay-people alike.
Objective: Our primary objective was to develop a web-based nomogram for prediction of survival 365 days after hip fracture surgery.
Methods: We collected data from 329 patients up to 365 days after hip fracture surgery and built four models using packages in RStudio. A global Cox proportional hazards model was developed from all covariates. Covariates included sex, age, BMI, white cell count, lactate, creatinine, hemoglobin, C-reactive protein, ASA status, socio-economic status, duration of surgery, total time in the operating room, side of surgery and procedure urgency. We also developed a Cox proportional hazards model (CPH), a logistic regression model (LRM), and a generalized linear model (GLM) for binomial response data using iterative data reduction and elimination. We wrote an app in Shiny to present the models in a user-friendly way. The app consists of a drop-down box for model selection, horizontal sliders for data entry, model summaries, and prediction and survival plots. A slider selects patient follow-up over 365 days.
Results: Twenty-four (7.3%) patients died within 30 days, 65 (19.8%) within 120 days and 94 (28.6%) within 365 days of surgery. Independent predictors of mortality common to all models were admission age, BMI, creatinine and lactate, and their combination. Age and BMI inversely correlated with mortality. Presentation with a creatinine level of 90 µmol.L-1 increased the odds of death 365 days after surgery, OR 2.9 (1.4 to 6.0), compared to an admission level of 60 µmol.L-1. Presentation with a plasma lactate level of 2 mmol.L-1 increased the odds of death 365 days after surgery, OR 2.2 (1.1 to 4.5), compared to a plasma lactate level of 1 mmol.L-1. Patients presenting to hospital with a BMI of 30 kg.m-2 were less likely to die within 365 days after surgery, OR 0.41 (0.17 to 0.99), compared to patients with a BMI of 20 kg.m-2. We presented four models in Shiny. Data entry created Kaplan-Meier graphs and outcome measures (95% CI).
Conclusions: We developed easy-to-read and interpretable web-based nomograms for prediction of survival after hip fracture surgery.
Clinical Trial: Nil
(JMIR Preprints 13/10/2021:34096) DOI: https://doi.org/10.2196/preprints.34096
https://preprints.jmir.org/preprint/34096 [unpublished, non-peer-reviewed preprint] JMIR Preprints, McLeod et al


Anesthetic guidelines and protocols [8] increasingly drive standardization of practice, but invariably lead to a wide distribution of outcomes. Identification of individual risk can inform anesthetic and surgical decision making, and potentially improve outcomes.
Several hip fracture-specific [2,3,4,5,6] and generic [9] surgical risk indices are available for prediction of survival after hip fracture, but most are limited to prediction of survival 30 days after surgery. They discriminate well but lack adequate calibration [10]. The Nottingham Hip Fracture Score has been validated nationally [2] and internationally [11] and is commonly used, although its 365-day score is restricted to binary prediction of low-risk and high-risk patients. Other tools predicting 30-day mortality include the Hospital Universitario La Paz-Hip Fracture (HULP-HF) model [6], and the model developed by the Royal College of Surgeons of England's Clinical Effectiveness Unit (CEU-17) [12].
Moreover, the more complex a statistical model becomes through non-linearity and interactions [10], the more difficult it is to comprehend and apply.
In clinical practice, biochemical markers such as lactate [13,14] and CRP [15] give valuable preoperative information on hemodynamic and inflammatory status that informs anesthetic and surgical decision making but are not included in the aforementioned scoring systems.
Clinicians and lay-people alike find risk predictive indices difficult to interpret and apply to individual patients [16]. Apps built with RStudio [17] and Shiny (RStudio Version 1.3.1093) [18] can translate statistical modelling into easy-to-understand, web-based interactive nomograms that readily demonstrate differences between minimal-risk and high-risk patients [16]. Web-based nomograms have recently been developed for prediction of cancer [19], metastases [20], cerebral edema [21], transfusion risk [22], and neonatal brain damage [23].
There is therefore a need to develop an easy-to-interpret web-based nomogram from clinical hip fracture data.
Demonstrating utility from local data would provide the platform for prospective development of a large multicenter database. Such a database would inform a statistical model with high calibration and discrimination that could be used easily at the bedside to inform staff and lay-people of outcome after hip fracture surgery.
Therefore, our primary objective was to develop a web-based nomogram from clinical data collected over a 12-month period from patients undergoing hip fracture surgery.

Methods
Our study consisted of data collection, statistical modelling, and app development.

Data collection
We retrospectively collected preoperative and operative data from all patients presenting for hip fracture surgery in Ninewells Hospital, Dundee, over an 8-month period. Patients were followed up for one year by reviewing case notes as part of a 4th-year medical student project. Caldicott Guardian approval was obtained from the University of Dundee on 16th October 2016.
Data included patient characteristics, comorbidities, and health status. Patient characteristics recorded on admission included: age; sex; body mass index (BMI); side (left or right); type of fracture (intracapsular or extracapsular); pre-fracture residence; and a rank social deprivation score based on 6976 data zones from the Scottish Index of Multiple Deprivation database (SIMD16). We used the SIMD16 vigintile database that ranks deprivation from 1 (most deprived residential area) to 20 (least deprived residential area) [24]. Blood tests (hemoglobin, white cell count, creatinine, lactate and CRP) were taken on hospital admission. Regarding surgery, we noted the American Society of Anesthesiologists (ASA) status of the patient; type of anesthesia (general or spinal); surgical implant; and time of operation. Operations performed between 0900h and 1700h were classified as "daytime"; between 1700h and 2200h as "evening"; and from 2200h to 0900h as "night-time" procedures. After surgery we noted the need for transfusion; acute kidney injury; cardiovascular complications such as pulmonary embolus or myocardial infarction; and infection from any source (wound, urinary or respiratory). We recorded the date of hospital discharge and destination. Both the residence of the patient pre-fracture and on ward discharge were classified into home or sheltered housing; care home; acute hospital, rehabilitation setting or long-term hospital care.
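The time-of-day bands above share their boundary values (1700h and 2200h each appear in two bands), so any implementation has to choose which band owns the boundary. The following Python sketch is illustrative only: the study itself used R, the function name is hypothetical, and the half-open intervals (a boundary time belongs to the later band) are an assumption not stated in the text.

```python
from datetime import time


def classify_operation_time(start: time) -> str:
    """Classify an operation time into the three bands described above.

    Assumption: intervals are half-open, so 17:00 counts as "evening"
    and 22:00 counts as "night-time".
    """
    if time(9, 0) <= start < time(17, 0):
        return "daytime"
    if time(17, 0) <= start < time(22, 0):
        return "evening"
    # Everything else falls between 22:00 and 09:00 (wrapping past midnight).
    return "night-time"


if __name__ == "__main__":
    for t in (time(9, 0), time(16, 59), time(17, 0), time(23, 30), time(3, 15)):
        print(t.strftime("%H%Mh"), classify_operation_time(t))
```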
Our primary outcome was death by any cause within 365 days of hip fracture surgery.
Our modelling strategy was based on that recommended by Harrell [10]. We selected variables based on our clinical experience and evidence from published studies. We collected as much pertinent data as possible, with wide distributions for predictor values. We hypothesised that the relationships between continuous variables and outcome were nonlinear. Missing covariate values were imputed with the median value. We limited the number of parameters in the model to the number of events (outcomes) divided by 15, that is, at least 15 events per variable (EPV). We prespecified the complexity of the model: continuous variables were initially modelled using restricted cubic splines with 3 knots, in order to detect any nonlinear relationships between variables and outcomes, and categorical data were allotted one degree of freedom.
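Two of the steps in this paragraph are simple arithmetic and can be sketched directly. The Python below is a minimal illustration, not the study's R code: the function names and the lactate values are hypothetical, and rounding the parameter budget down to a whole number is an assumption. The figure of 94 events is the number of deaths within 365 days reported in this paper.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def parameter_budget(n_events, events_per_parameter=15):
    """Number of parameters that can be spent under the events/15 rule.

    Assumption: the budget is rounded down to a whole number.
    """
    return n_events // events_per_parameter


if __name__ == "__main__":
    # Hypothetical admission lactate values (mmol/L) with two missing entries.
    lactate = [1.0, None, 3.0, 2.0, None]
    print(impute_median(lactate))
    # 94 deaths within 365 days were reported in this cohort.
    print(parameter_budget(94))
```

Under these assumptions, 94 events allow about six parameters, which is why the global model had to be reduced before validation.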
We first created a global model using all variables and noted, for each predictor, the partial χ² statistic testing its association with outcome adjusted for all other predictors, and the number of degrees of freedom (df) used. We reduced the model by calculating the number of degrees of freedom that could be spent and decided how they should be spent. We ranked the apparent importance of predictors of death by plotting Akaike's information criterion (AIC), defined as χ² − 2 df.
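The ranking step can be illustrated with a short Python sketch (the study used R). The function name and the χ² and df values below are hypothetical and chosen only to show the calculation; they are not results from this study.

```python
def rank_by_aic(partial_tests):
    """Rank predictors by partial chi-square minus twice the degrees of freedom.

    partial_tests maps a predictor name to a (chi_square, df) pair.
    Returns (name, score) pairs, highest score first.
    """
    scores = {name: chi2 - 2 * df for name, (chi2, df) in partial_tests.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical partial chi-square statistics and degrees of freedom.
    example = {
        "predictor_a": (10.0, 2),
        "predictor_b": (3.0, 1),
        "predictor_c": (9.0, 1),
    }
    for name, score in rank_by_aic(example):
        print(f"{name}: {score:.1f}")
```

A predictor modelled with more degrees of freedom is penalised more heavily, so a spline term must explain correspondingly more variation to rank above a linear one.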
We made an initial estimate of the shrinkage (γ) needed using the formula γ = (χ² − p)/χ², where χ² is the model likelihood ratio statistic and p is the number of model parameters. We also interpreted the model graphically and decided which parameters should be retained for bootstrap validation of calibration and discrimination. Continuous variables that showed a linear relationship with outcome were restricted to one degree of freedom.
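As a worked illustration of the shrinkage formula, the Python sketch below evaluates γ for hypothetical inputs. The function name and the example values (a model χ² of 60 with 6 parameters) are assumptions for illustration and are not results from this study.

```python
def heuristic_shrinkage(model_chi2, n_params):
    """Shrinkage estimate: (chi-square - p) / chi-square.

    Values close to 1 suggest little overfitting; lower values suggest
    that predictions from the fitted model are likely to be too extreme.
    """
    if model_chi2 <= 0:
        raise ValueError("model chi-square must be positive")
    return (model_chi2 - n_params) / model_chi2


if __name__ == "__main__":
    # Hypothetical model: likelihood ratio chi-square of 60 on 6 parameters.
    print(round(heuristic_shrinkage(60.0, 6), 3))
```

With these inputs the estimate is 0.9; doubling the number of parameters for the same χ² would lower it to 0.8, which illustrates why spending degrees of freedom sparingly matters in a small cohort.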
Overfitting and the effects of shrinkage were assessed using the corrected calibration slope, obtained from bootstrap bias-corrected (overfitting-corrected) estimates of predicted vs. observed values. In order to check the proportional hazards assumption, we examined scaled Schoenfeld residuals.
Model accuracy, that is, the ability to discriminate between low-risk and high-risk patients, was assessed using measures of rank discrimination (the bias-corrected C-index and Somers' Dxy) and of predicted probability (the Brier score). The C-index represents the probability of concordance, c, between predicted and observed survival, and is equivalent to the area under the receiver operating characteristic (ROC) curve (AUC). Concordance is defined as the proportion of all pairs of subjects whose survival times can be ordered in which the subject with the higher predicted survival is the one who survived longer; c = 0.5 for random predictions and c = 1 for a perfectly discriminating model. Dxy is the difference between the concordance and discordance probabilities and is related to the C-index by the equation Dxy = 2(c − 0.5). The lower the Brier score (nearer to 0), the better the predicted probability. Internal validation of calibration and discrimination used the bootstrap.
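The three measures defined above can be sketched in a few lines of Python (the study itself used R). This is a simplified illustration under stated assumptions: function names and example data are hypothetical, pairs with tied observed times are skipped rather than handled case by case, tied predictions count as half-concordant, and no bootstrap bias correction is applied.

```python
from itertools import combinations


def concordance_index(times, events, predicted_survival):
    """Simplified C-index for right-censored survival data.

    A pair is usable only if the shorter observed time ended in an event;
    otherwise the two survival times cannot be ordered.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied observed times are skipped
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not orderable
        usable += 1
        if predicted_survival[a] < predicted_survival[b]:
            concordant += 1.0
        elif predicted_survival[a] == predicted_survival[b]:
            concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def somers_dxy(c_index):
    """Dxy = 2(c - 0.5)."""
    return 2 * (c_index - 0.5)


def brier_score(predicted_probability, outcome):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    pairs = list(zip(predicted_probability, outcome))
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical follow-up (days), death indicator, and predicted survival.
    days = [30, 120, 200, 365, 365]
    died = [1, 1, 0, 0, 0]
    predicted = [0.6, 0.4, 0.7, 0.9, 0.8]
    c = concordance_index(days, died, predicted)
    print(round(c, 3), round(somers_dxy(c), 3))
```

In the example, seven pairs are usable and one is discordant (the patient who died at 30 days had a higher predicted survival than the patient who died at 120 days), so c is 6/7.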

App development
We developed an app using Shiny, a package from RStudio that builds interactive web applications with R. We created three files: ui.R to define the user interface; server.R to interrogate data from the UI and define the app logic; and functions.R to combine both and create the Shiny application. Prediction of outcomes was displayed using plotly on an interactive graph with probability on the x-axis. The mean was displayed as a colored square with horizontal lines representing the 95% confidence interval (CI) for outcome. Survival models showed a Kaplan-Meier plot of estimated survival probability over time.
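For readers unfamiliar with the survival plot, the following Python sketch shows how a basic Kaplan-Meier (product-limit) estimate is computed from follow-up times and event indicators. It is not the app's code, which was written in R and Shiny; the function name and the example data are hypothetical, and confidence intervals are omitted.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns (time, survival probability) pairs at each distinct death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in days; 0 marks a censored observation.
    days = [1, 2, 3, 4]
    died = [1, 0, 1, 0]
    for t, s in kaplan_meier(days, died):
        print(t, s)
```

Censored patients contribute to the risk set up to their last follow-up, which is why the step at day 3 in the example is larger than the step at day 1.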

Results

Data collection
We recorded data from 329 patients, of whom 224 (68%) were female and 105 (32%) were male.
Four percent of biochemical data were missing and replaced with the median value. Over two thirds of patients (n = 224 (68%)) were classified as ASA 3 or ASA

Discussion
We provide proof of concept of a simple dynamic nomogram created in R and Shiny that shows individual survival after hip fracture surgery. The survival model informing the nomogram showed good discrimination and calibration.
Nomograms are useful tools for clinicians, and until now have been used for prognostication mainly in cancer patients. Advances in statistical modelling now allow the non-linear relationships between continuous variables and outcomes to be explored. Moreover, software such as Shiny [18] linked to R allows data scientists to express mathematical relationships in an easy to use, informative way.
Sliders on the left-hand side of the nomogram enable clinicians to enter individual data from patients and provide a survival curve from Cox proportional hazards models. Such an approach offers "precision" medicine tailored to the needs of the individual patient, and can inform preoperative planning and mode of anesthesia. Currently hip fracture care is driven by protocols, based on best evidence from randomized controlled trials. However, maximum benefit is gained by only a fraction of patients: some are overtreated while others are undertreated [32].
Our study presents data in a way that is new to anesthesia. However, we have decided not to offer the app for general use for two reasons. First, our global model suggests that the discriminatory ability of our model could be improved with recruitment of more patients. Second, apps are considered class 1 devices by the Medicines and Healthcare products Regulatory Agency (MHRA), London, UK. Apps must be registered with the MHRA and be both accurate and technically sound. Our intention is to use the data presented in this paper as pilot data, and prospectively create a database that can be interrogated in the same way but shows improved discrimination by incorporating more variables into a model and reflecting the contribution of each to overall outcome.
The strengths of our study are that we gathered a wide range of patient data and used current modelling and validation techniques. In order to create our global model, we captured data from dimensions of risk recommended by Iezzoni [33] that were most likely to explain the variation in mortality such as patient characteristics, recent health status, quality of life, and markers of acute clinical stability.
Rather than just focus on 30-day mortality, we followed our patients for a year in order to obtain a detailed, temporal overview of outcomes after hip fracture surgery. In contrast, most models focus on measurement of 30-day mortality [2,3,5,12], and tend to reflect events during hospital stay.
The Nottingham Hip Fracture Score has been validated as a predictor of 1-year mortality [7] but divides patients according to a binary "low risk/high risk" classification based on a cut-off score.
Lactate and creatinine emerged as important predictors of mortality in all models. Non-linear modelling of continuous lactate and creatinine data showed early steep rises in mortality followed by a levelling of risk. For example, patients presenting with a plasma creatinine of 90 µmol.L-1 rather than 60 µmol.L-1 or plasma lactate of 2 mmol.L-1 rather than 1 mmol.L-1 were four times more likely to die one year after surgery. These findings suggest that the immediate hemodynamic and inflammatory effects of trauma have a profound effect that influences mortality up to 12 months after surgery. Our findings are consistent with previous studies demonstrating associations between serum lactate and mortality following hip fracture [13,14]. Associations between mortality and elevated CRP have also been demonstrated previously, though not consistently [15].
A systematic review by Rocos et al [34] stated that almost 1 in 5 patients had elevated lactate on hospital admission. However, we are not aware of any randomized controlled trials of fluid resuscitation in patients presenting with hip fracture. An association between prolonged lactate clearance and mortality may occur in the surgical ICU population [35], but this cannot be extrapolated to the management of elderly patients with fragility fractures.
The models revealed an inverse association between BMI and mortality. Our findings agree with those obtained from a mass population study comparing all-cause mortality and BMI [36] and suggest that frailty and low muscle mass have a significant long-term negative impact on survival after hip fracture surgery. Unlike other studies, we failed to show a significant effect of anemia. This probably reflects changes in patient blood management strategies since initial studies into this association were published.
We used modelling techniques available in R. Non-linearity of lactate, creatinine and BMI justified our application of restricted cubic splines to continuous data. Although this allocated three degrees of freedom to continuous variables, this technique improved the accuracy of the model. We used bootstrapping to validate our model. The advantage of bootstrapping is that the entire dataset can be used, unlike data-splitting, which reduces the sample size for both model development and testing.
Variable selection or stopping rules were not used: these methods provide regression coefficients that are too large and confidence intervals that are too narrow.
Our study had some limitations. Mortality was in line with national data, but the number of events was insufficient to generate a model that incorporated all potential confounders. We used an internal validation method for our model, but external validation in other centers would provide greater rigor.