Performance in mortality prediction of SAPS 3 and MPM-III scores among adult patients admitted to the ICU of a private tertiary referral hospital in Tanzania: a retrospective cohort study

Background. Illness predictive scoring systems are important adjuncts to patient management in the intensive care unit (ICU). They assist in predicting patient outcomes, improve clinical decision making and provide insight into the effectiveness of care while optimizing the use of hospital resources. We evaluated the mortality predictive performance of the Simplified Acute Physiology Score (SAPS 3) and the Mortality Probability Model (MPM0-III), compared their performance in predicting outcome, and identified disease patterns and factors associated with increased mortality.
Methods. This was a retrospective cohort study of adult patients admitted to the ICU of the Aga Khan Hospital, Dar es Salaam, Tanzania between August 2018 and April 2020. Demographics, clinical characteristics, outcomes, source of admission, primary admission category, length of stay and the support provided, together with the worst physiological data within the first hour of ICU admission, were extracted. SAPS 3 and MPM0-III scores were calculated using an online web-based calculator. The performance of each model was assessed by discrimination and calibration. Discrimination between survivors and non-survivors was assessed by the area under the receiver operating characteristic (ROC) curve, and calibration was estimated using the Hosmer-Lemeshow goodness-of-fit test.
Results. A total of 331 patients were enrolled in the study, with a median age of 58 years (IQR 43-71). Most were male (n = 208, 62.8%), of African origin (n = 178, 53.8%) and admitted from the emergency department (n = 306, 92.4%). In-hospital mortality of critically ill patients was 16.1%. Discrimination was very good for both models: the area under the ROC curve for SAPS 3 and MPM0-III was 0.89 (95% CI [0.844–0.935]) and 0.90 (95% CI [0.864–0.944]), respectively. The Hosmer-Lemeshow goodness-of-fit test showed good calibration for both SAPS 3 and MPM0-III, with chi-square values of 4.61 and 5.08, respectively, and P-values > 0.05.
Conclusion. Both SAPS 3 and MPM0-III performed well in predicting mortality and outcome in our cohort of patients admitted to the intensive care unit of a private tertiary hospital. The in-hospital mortality of critically ill patients was lower than that reported by studies done in other intensive care units in tertiary referral hospitals within Tanzania.

Subjects Emergency and Critical Care, Internal Medicine
Keywords Performance, Mortality, Tanzania, ICU, MPM-III, SAPS 3

BACKGROUND
The burden of critical illness and ICU mortality is greatest in countries with low gross national income (Vincent et al., 2014). Reported ICU mortality varies widely from one setting to another, with higher rates reported in low- and middle-income countries (LMICs) (Vincent et al., 2014; Ilori & Kalu, 2012; Smith, Ayele & McDonald, 2013). As of 1st July 2020, the World Bank upgraded Tanzania's economic status from a low-income to a lower-middle-income country owing to its strong economic performance over the past decade. However, the availability of intensive care units in Tanzania is very limited; none of the seven district hospitals surveyed in 2009 had an ICU, and the four national referral hospitals had a total of only 38 ICU beds serving a population of 57 million (Baker et al., 2013). This is in contrast to high-income countries (HICs), which generally have between five and 30 ICU beds per 100,000 people (Dondorp, Iyer & Schultz, 2016). Improving the availability and quality of care for critical illness in LMICs is necessary to reduce this burden and will become even more important in the coming years as the population ages and the prevalence of comorbidities increases (Adhikari et al., 2010).
Despite the use of high-cost and sophisticated devices, ICU mortality rates remain high. The burden of disease, compounded by a severe lack of resources, specialists and data, makes prediction of ICU outcomes in terms of morbidity and mortality a crucial component of care across the continent. In HICs, mortality prediction models are used not only to predict outcome but also as tools for quality enhancement and analytical decision making. These mortality predictive models were developed more than 25 years ago using patient characteristics. They help quantify the severity of illness, estimate the gravity of the disease, predict outcome and assist in resource allocation (Keegan, Gajic & Afessa, 2011; Zimmerman et al., 1995).
The three major predictive scoring systems used to predict mortality in general ICU patients are the Acute Physiology and Chronic Health Evaluation (APACHE) scoring system, the Simplified Acute Physiology Score (SAPS) and the Mortality Probability Model (MPM0) (Juneja et al., 2012). APACHE-IV, SAPS 3 and MPM0-III are the latest versions of these scoring systems (Zimmerman et al., 2006; Metnitz et al., 2005; Higgins et al., 2007). When selecting a predictive scoring system for use in a given ICU, it is essential to use a model that is well proven, established and validated in context. APACHE-IV has long been considered more precise for predicting mortality but is perceived as burdensome and more costly, especially in resource-limited settings (Kuzniewicz et al., 2008). MPM0-III is considered superior in resource-limited settings since it has the lowest extraction burden among the three models and is available without cost on various medical information sites. MPM0-III has been well studied in East Africa. Nevertheless, reports of its performance in predicting outcomes among critically ill patients admitted to ICUs in Kenya (Lukoko et al., 2020) and Rwanda (Riviello et al., 2016) are contradictory. External validation of SAPS 3 among patients admitted to ICUs in Austria, Brazil and Italy found that SAPS 3 had a good ability to predict outcomes but performed poorly across all probabilities of death when compared with APACHE-IV and MPM0-III (Poole et al., 2009; Nassar Jr et al., 2012; Metnitz et al., 2009). SAPS 3 may have greater potential for international use since the score was derived from data from more than one country (Metnitz et al., 2005). No study to date has assessed the performance of SAPS 3 in LMICs, especially in Sub-Saharan Africa.
These predictive scoring systems have been compared in different studies with variable results. The existence of a large number of scoring systems with contrasting performance suggests that the best-fitting model is ICU specific. Each ICU therefore needs to determine which scoring system performs best in its own setting; hence there was a need to carry out a comparative study in our cohort of patients to identify the best-performing model. No studies done in Tanzania have compared the performance of these scoring models. The study had two main objectives: (1) to compare the performance of MPM0-III and SAPS 3 in order to identify which model best fits the ICU of the Aga Khan Hospital, Dar es Salaam; (2) to identify disease patterns and risk factors associated with higher mortality rates among critically ill patients.

METHODS
This was a single-centre retrospective cohort study conducted in the ICU of the Aga Khan Hospital, Dar es Salaam, Tanzania. The Aga Khan Hospital is the largest private hospital and the only Joint Commission International (JCI) accredited hospital in Tanzania. Its ICU is a 15-bed unit that provides level III services to all kinds of critically ill patients and is capable of providing mechanical ventilation, inotropic support and renal replacement therapy. The unit is divided into 3 sections: 7 adult general ICU beds (including 2 isolation rooms), 4 beds for cardiac patients and 4 for paediatric patients. The ICU follows an open model and is run by a multidisciplinary team comprising a physician from the primary specialty of care, a physiotherapist and a dietician, led by a full-time critical care specialist. The nurse-to-patient ratio ranges between 1:1 and 1:2. Admissions to the ICU are limited to patients meeting strict admission criteria set by the hospital. All adult patients aged 18 years and above admitted to the ICU were eligible for the study. Patients admitted for observation, those with incomplete data, those whose stay in the unit lasted less than an hour and those diagnosed with COVID-19 were excluded. A total of 747 adult patients were admitted to the ICU from August 2018 to April 2020. A sample size of 331 patients with a specific outcome (death or discharge home) was determined to be sufficient to give the study 80% power and 95% confidence to detect a 10% difference in performance between SAPS 3 and MPM0-III. The ICU admission register was used to identify admitted patients, and patient files were retrieved from the medical records. The medical file numbers were entered into a computer, and computer-generated random sampling was performed until the desired sample size was achieved.
Patient demographics and clinical data were extracted using patient records and were entered into a spreadsheet on Microsoft Office Excel 2010 (Redmond, WA, USA). Data was extracted by experienced junior doctors with working experience in the ICU and was independently verified by the primary author for accuracy and completeness. The reasons for admission were grouped into 11 categories: surgery, gastroenterology, neurology, endocrinology, respiratory, cardiovascular, nephrology, sepsis, oncology, hematology, obstetrics and gynecology. When multiple diagnoses were present, the leading one with the worst prognosis was selected as the main reason for admission.
Descriptive analysis of demographic characteristics was performed and presented as percentages, while the categorical and continuous outcome variables were analysed and presented as means and medians with interquartile ranges, respectively. Categorical and continuous variables were compared between survivors and non-survivors using Pearson's chi-square test and the Mann-Whitney U test, respectively. SAPS 3 and MPM0-III were calculated using an online scoring calculator available at http://www.uptodate.com/. Accurate discrimination and calibration are key features that all predictive scoring models should exhibit. Discrimination of each model was assessed by the area under the receiver operating characteristic (ROC) curve. An area of 0.7-0.8 is regarded as fair, 0.8-0.9 as good and >0.9 as excellent. A non-parametric Wilcoxon statistic was used to compare the areas under the ROC curves (Steyerberg et al., 2010). The Hosmer-Lemeshow goodness-of-fit test, whose statistic follows a chi-square distribution, was used to evaluate model fit and calibration, with a p-value of >0.05 signifying no evidence of poor fit (Steyerberg et al., 2010). For all other statistical tests, a p-value of <0.05 was considered statistically significant. Variables with a p-value <0.05, and those considered clinically significant in explaining mortality in the ICU, were included in the multivariable model. Determinants of mortality among critically ill patients were identified using binary logistic regression; odds ratios with corresponding 95% confidence intervals (CI) and p-values were reported. All statistical analyses were done using STATA version 15. The study protocol was approved by the Ethical Research Committee (ERC) of the Aga Khan University (AKU/2020/051/fb), and the requirement for individual consent from each study participant was waived since the study did not affect the rights and welfare of the patients.
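The link between the ROC area and the Wilcoxon (Mann-Whitney) statistic mentioned above can be illustrated with a short sketch. It is not the STATA procedure used in the study: the function names and the toy risk values are assumptions made for illustration only, and the qualitative bands are the ones quoted in the text (the handling of the shared boundaries at 0.8 and 0.9 is a choice made here).

```python
def auc_mann_whitney(scores, died):
    """Area under the ROC curve computed as the normalised Mann-Whitney U:
    the probability that a randomly chosen non-survivor has a higher
    predicted risk than a randomly chosen survivor (ties count one half)."""
    non_survivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for risk_dead in non_survivors:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                wins += 1.0
            elif risk_dead == risk_alive:
                wins += 0.5
    return wins / (len(non_survivors) * len(survivors))


def interpret_auc(auc):
    """Bands quoted in the text: 0.7-0.8 fair, 0.8-0.9 good, >0.9 excellent.
    Assumption: a value exactly on a shared boundary goes to the band where
    it is the lower bound, except 0.9, which the text excludes from excellent."""
    if auc > 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "below the bands quoted in the text"


# Invented toy data, NOT study data: predicted risks and outcomes (1 = died).
toy_risk = [0.05, 0.10, 0.20, 0.60, 0.30, 0.80]
toy_died = [0, 0, 0, 0, 1, 1]
toy_auc = auc_mann_whitney(toy_risk, toy_died)
print(toy_auc, interpret_auc(toy_auc))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be preferable for much larger datasets.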

RESULTS
Both scoring systems were well calibrated. The Hosmer-Lemeshow goodness-of-fit statistic and p-value for each scoring system are shown in Table 4. The area under the ROC curve, calculated to evaluate the predictive value of the scoring systems, is shown for SAPS 3 and MPM0-III in Fig. 1. The area under the ROC curve for SAPS 3 showed a statistically
The overall estimated median (IQR) predicted mortality among the 331 ICU patients was 6% (2%-20%) with the SAPS 3 model and 11.5% (3.8%-27.9%) with the MPM0-III model. The stratified analysis by survivors and non-survivors is shown in Fig. 2. The median predicted mortality risk was lower for survivors than for non-survivors. In the SAPS 3 model, the estimated median for survivors was 5% (IQR: 1%-11%), while for non-survivors it was 50% (IQR: 34%-69%). Based on the MPM0-III model, the median predicted mortality was 9.1% (3.1%−1.7%) and 68.5% (IQR: 42.7%-84.0%) for survivors and non-survivors, respectively.
Multiple clinical factors were associated with increased adjusted odds of mortality. These included length of ICU stay (adjusted odds ratio [aOR], 1.462; P-value = 0.001) and transfer from the ward (aOR, 5.341; P-value<0.022). However, a longer hospital stay was protective: the odds of mortality decreased as the length of hospitalization increased (aOR, 0.717; P-value = 0.002) (Table 6).

DISCUSSION
To our knowledge, this is the first study to report on the performance of predictive scoring models in Tanzania, and in particular in a private setting. Accurate discrimination and calibration are two key characteristics that all predictive scoring systems should exhibit. Both SAPS 3 and MPM0-III performed well in our cohort. According to our results, a SAPS 3 score higher than 54 can predict mortality with a sensitivity of 72% and a specificity of 90%. An MPM0-III score greater than 4 can predict mortality with a sensitivity of 74% and a specificity of 87%.
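How sensitivity and specificity follow from a score cutoff can be sketched as below. The cutoff of 54 is the SAPS 3 threshold quoted above; the function name and the nine toy scores are hypothetical and are not study data, so the resulting figures will not match the 72% and 90% reported for the cohort.

```python
def sensitivity_specificity(scores, died, cutoff):
    """Classify a patient as 'predicted to die' when score > cutoff, then
    return (sensitivity, specificity) against the observed outcomes."""
    tp = fn = tn = fp = 0
    for score, outcome in zip(scores, died):
        predicted_death = score > cutoff
        if outcome and predicted_death:
            tp += 1          # non-survivor correctly flagged
        elif outcome:
            fn += 1          # non-survivor missed
        elif predicted_death:
            fp += 1          # survivor wrongly flagged
        else:
            tn += 1          # survivor correctly not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one survivor and one non-survivor")
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy SAPS 3 scores, NOT study data (1 = died in hospital).
toy_scores = [40, 45, 50, 56, 48, 60, 52, 58, 70]
toy_died = [0, 0, 0, 0, 0, 1, 1, 1, 1]
sens, spec = sensitivity_specificity(toy_scores, toy_died, cutoff=54)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # sensitivity=0.75, specificity=0.80
```

Raising the cutoff trades sensitivity for specificity, which is why a single threshold such as 54 is usually reported together with both figures.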
Discrimination describes the accuracy of a given prediction. In our cohort, the discriminatory capability of both SAPS 3 (20 variables) and MPM0-III (16 variables) was good. There was no statistically significant difference when the two models were compared, suggesting that the model with more variables was not associated with better discriminatory performance. MPM0-III has been externally validated in various ICUs in North America (Higgins et al., 2007; Kuzniewicz et al., 2008; Higgins et al., 2009) and has been shown to have good discrimination, similar to our study finding. However, studies done at the Aga Khan University Hospital, Nairobi, Kenya (Lukoko et al., 2020) and at two public ICUs in Rwanda (Riviello et al., 2016) showed MPM0-III to have fair discrimination in their cohorts. This observed difference in discrimination may be due to differences in case mix between the study settings. Similarly, SAPS 3 has been externally validated in various ICUs in Italy (Poole et al., 2009), Brazil (Nassar Jr et al., 2012) and Austria (Metnitz et al., 2009) and was found to have good discriminatory capability in those cohorts. Despite SAPS 3 having greater prospects for international generalizability, there have been no published studies evaluating its performance in Sub-Saharan African ICUs. This is the first study to report its potential for application in LMICs.
Calibration describes how the instrument performs over a wide range of predicted mortalities. It is sensitive to alterations in case mix, patient care and interventions. Although calibration tends to deteriorate over time, leading to overestimation of mortality (Nassar Jr et al., 2012), both SAPS 3 and MPM0-III were well calibrated among the critically ill patients admitted at our study setting. Our findings were contrary to the SAPS 3 validation studies mentioned earlier, which reported poor calibration and overestimation of mortality (Poole et al., 2009; Nassar Jr et al., 2012; Metnitz et al., 2009). However, external validation studies have reported MPM0-III to have good calibration (Higgins et al., 2007; Kuzniewicz et al., 2008; Higgins et al., 2009). The studies mentioned earlier that were conducted in Sub-Saharan Africa have produced contrasting results. MPM0-III was well calibrated among the critically ill patients admitted to the ICU of the Aga Khan University Hospital, Nairobi (Lukoko et al., 2020) but showed poor calibration among all adult patients admitted to Rwanda's two public ICUs (Riviello et al., 2016). These findings highlight the similar treatment protocols and interventions between two sister hospitals located in different geographical regions.
In this retrospective study we also aimed to identify patient demographics, disease patterns, clinical outcomes and factors associated with a higher risk of mortality in patients admitted to the ICU of the Aga Khan Hospital, Dar es Salaam. In this retrospective observational cohort, the in-hospital mortality of critically ill patients was 16.1%, which is far lower than the 41.4% reported among other tertiary referral hospitals in Tanzania (Sawe et al., 2014) but slightly exceeds rates reported in western Europe and North America (Vincent et al., 2014). This disparity is not surprising since the intensive care unit in our setting has access to more resources than similar units in the country and is comparable in various ways to facilities in HICs. The ICU cohort studied in the four tertiary referral hospitals in Tanzania was younger (median age 34 years, IQR 21-53) than our study population (median age 58 years, IQR 43-71). This variation could be due to the exclusion of patients aged less than 18 years in our study. However, both cohorts had a male predominance, of 57.5% and 62.8% respectively (Sawe et al., 2014). The bulk of admissions in our cohort were patients with neurological disease, sepsis, and respiratory and cardiovascular conditions. Mortality was highest among those admitted due to sepsis. Our results are in line with a large intercontinental database that emphasized the association of sepsis with high mortality rates in all countries (Vincent et al., 2014). The median length of ICU stay is similar to reports from tertiary hospitals in Sub-Saharan Africa (Sawe et al., 2014; Kwizera, Dunser & Nakibuuka, 2012).
Prolonged length of stay (LoS) in the ICU and transfer from the general ward to the ICU were associated with higher adjusted odds of mortality among critically ill patients. Prolonged LoS in the ICU may be attributed to the development of multi-systemic complications necessitating continued organ support. There are no laws or guidelines in Tanzania regarding withdrawal of support; hence we hypothesize that a significant fraction of patients with a prolonged course of illness and an expected poor outcome are admitted for extended periods before dying. Our findings are comparable to several studies done in well-equipped ICUs, which concluded that multiple diseases and organ dysfunction were key factors prolonging LoS in the ICU (Toptas et al., 2018; Moitra et al., 2016). Contrasting results have also been published, showing that LoS in the ICU is not an independent risk factor for in-hospital mortality (Williams et al., 2010). Patients transferred from the general ward to the ICU also had higher adjusted odds of mortality; this is not surprising since such transfers reflect a deteriorating physiological and clinical condition. A few studies have demonstrated that early transfer to the ICU for treatment has a substantial impact on in-hospital mortality and LoS (Churpek et al., 2016; Young et al., 2003).
We identified several limitations in our study. Firstly, this was a single-centre study and as such the findings may not be valid across all patient populations in Tanzania. Secondly, the retrospective design prevented us from following up outcomes after ICU discharge and does not provide the same level of evidence as a prospective study design.
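The calibration check discussed above can be sketched in a few lines. This is an illustrative reimplementation, not the STATA routine used in the study: the function names are hypothetical, the grouping into 10 risk-ordered groups (giving the conventional g - 2 = 8 degrees of freedom) is an assumption because the number of groups is not stated in the text, and the chi-square tail probability uses a closed form that is valid only for even degrees of freedom.

```python
import math


def hosmer_lemeshow_statistic(probs, died, groups=10):
    """Hosmer-Lemeshow chi-square: sort patients by predicted risk, split
    them into `groups` bins of near-equal size and compare observed with
    expected deaths in each bin."""
    pairs = sorted(zip(probs, died))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # sum of predicted risks
        observed = sum(d for _, d in chunk)   # number of deaths
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; exact closed form
    that holds only when df is a positive even integer."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Chi-square values reported in the abstract; df=8 is an ASSUMPTION (10 groups).
for model, chi2 in (("SAPS 3", 4.61), ("MPM0-III", 5.08)):
    print(model, round(chi2_sf_even_df(chi2, df=8), 3))
```

Under the assumed 8 degrees of freedom, both reported statistics give p-values well above 0.05, which is consistent with the conclusion that there was no evidence of poor fit.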
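The adjusted odds ratios discussed above come from a binary logistic regression, where each coefficient sits on the log-odds scale. The sketch below shows the usual conversion to an odds ratio with a Wald 95% confidence interval. The coefficient is back-calculated from the reported aOR of 1.462 for ICU length of stay, while the standard error is a made-up placeholder because the text does not report one; the function name is likewise hypothetical.

```python
import math


def odds_ratio_with_ci(coef, std_err, z=1.96):
    """Convert a logistic-regression coefficient (log-odds scale) into an
    odds ratio with a Wald confidence interval; z=1.96 gives about 95%."""
    if std_err < 0:
        raise ValueError("standard error cannot be negative")
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )


coef_icu_los = math.log(1.462)  # back-calculated from the reported aOR
hypothetical_se = 0.12          # placeholder, NOT a value from the study
or_, lower, upper = odds_ratio_with_ci(coef_icu_los, hypothetical_se)
print(f"aOR={or_:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at roughly the 5% level, which is how an odds ratio above 1 (higher odds of death) or below 1 (protective) is usually judged.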

CONCLUSION
In summary, this is the first and largest study to report on the performance of predictive scoring models in Tanzania. Our study concluded that both SAPS 3 and MPM0-III performed well in