Utility of pneumonia severity assessment tools for mortality prediction in healthcare-associated pneumonia: a systematic review and meta-analysis

Accurate prognostic tools for mortality in patients with healthcare-associated pneumonia (HCAP) are needed to provide appropriate medical care, but the efficacy for mortality prediction of tools such as PSI, A-DROP, I-ROAD, and CURB-65, which are widely used for predicting mortality in community-acquired and hospital-acquired pneumonia, remains controversial. In this study, we conducted a systematic review and meta-analysis using PubMed, Cochrane Library (trials), and the Ichushi web database (accessed on August 22, 2022). We identified articles evaluating PSI, A-DROP, I-ROAD, or CURB-65 together with the mortality outcome in patients with HCAP, and calculated the pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and summary area under the curve (AUC) for mortality prediction. Additionally, differences in prognostic performance among these four assessment tools were evaluated using overall AUCs pooled from the AUC values reported in the included studies. Ultimately, 21 articles were included, and their quality was assessed using QUADAS-2. Using a cut-off value of moderate in patients with HCAP, the ranges of pooled sensitivity, specificity, PLR, NLR, and DOR were 0.91–0.97, 0.15–0.44, 1.14–1.66, 0.18–0.33, and 3.86–9.32, respectively. Using a cut-off value of severe in those patients, the ranges of pooled sensitivity, specificity, PLR, NLR, and DOR were 0.63–0.70, 0.54–0.66, 1.50–2.03, 0.47–0.58, and 2.66–4.32, respectively. The overall AUCs were 0.70 (0.68–0.72), 0.70 (0.63–0.76), 0.68 (0.64–0.73), and 0.67 (0.63–0.71) for PSI, A-DROP, I-ROAD, and CURB-65, respectively (p = 0.66). In conclusion, these severity assessment tools do not have sufficient ability to predict mortality in HCAP patients. Furthermore, there are no significant differences in predictive performance among these four severity assessment tools.


Inclusion and exclusion criteria
The inclusion criteria for eligible studies were as follows: prospective or retrospective studies targeting hospitalized patients with HCAP, nursing home-acquired pneumonia (NHAP) and/or NHCAP according to the 2005 ATS/IDSA guidelines 1 and/or the 2017 Japanese Respiratory Society (JRS) guidelines 25; evaluating severity scores of PSI 26, A-DROP 27, I-ROAD 12, or CURB-65 28; reporting mortality outcomes and raw data for the number of patients and deaths for each severity grade; and written in English or Japanese as original research articles. Exclusion criteria were as follows: studies involving children; case reports, conference reports, and reviews; studies including patients who did not receive inpatient treatment in hospital, because of possible significant bias in treatment content; studies with overlapping periods at the same medical institution; and studies lacking detailed data, namely true-positive, false-positive, true-negative, and false-negative values at any severity grade for mortality.
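The true-positive, false-positive, false-negative, and true-negative values required above follow directly from the per-grade numbers of patients and deaths once a cut-off grade is chosen. The following is a minimal Python sketch of that bookkeeping, not the authors' procedure; the function name `two_by_two`, the `TwoByTwo` type, and the example counts are hypothetical and used for illustration only.

```python
from typing import NamedTuple, Sequence


class TwoByTwo(NamedTuple):
    tp: int  # deaths at or above the cut-off grade
    fp: int  # survivors at or above the cut-off grade
    fn: int  # deaths below the cut-off grade
    tn: int  # survivors below the cut-off grade


def two_by_two(patients: Sequence[int], deaths: Sequence[int], cutoff: int) -> TwoByTwo:
    """Build a 2x2 table for mortality from per-grade counts.

    patients[i] and deaths[i] are the counts for severity grade i, listed in
    ascending order of severity. Grades with index >= cutoff are test-positive.
    """
    if len(patients) != len(deaths):
        raise ValueError("patients and deaths must have the same length")
    if any(d < 0 or d > n for n, d in zip(patients, deaths)):
        raise ValueError("deaths must lie between 0 and the number of patients")
    tp = sum(deaths[cutoff:])
    fp = sum(patients[cutoff:]) - tp
    fn = sum(deaths[:cutoff])
    tn = sum(patients[:cutoff]) - fn
    return TwoByTwo(tp, fp, fn, tn)


# Hypothetical counts for a five-class score (not data from any included study).
example_patients = [10, 40, 80, 120, 50]
example_deaths = [0, 1, 5, 20, 24]
print(two_by_two(example_patients, example_deaths, cutoff=3))
```

With a five-class score listed from class I to class V, `cutoff=3` corresponds to treating the fourth and fifth classes as test-positive.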

Data extraction and quality assessments
Two reviewers (SN and MK) independently assessed all the articles. After searching PubMed, Cochrane Library (trials), and the Ichushi web database using the keywords, non-relevant studies were excluded based on titles and abstracts, and the full texts of potentially appropriate articles were further reviewed. The following information was collected from the included studies: geographic location, design, sample size, mean age of participants, type of severity score, common outcome, and mortality rate. The QUADAS-2, which includes four risk-of-bias domains and three applicability domains 29, was used to evaluate the risk of bias. Two investigators (SN and MK) evaluated the risk of bias using the QUADAS-2, and any disagreements were resolved through discussion with a third reviewer (NN).

Severity grade of PSI, A-DROP, I-ROAD, and CURB-65
The detailed calculation parameters of these four assessment tools are shown in Table 1. PSI 26 is classified into five classes according to the total score of the prognostic factors, and the severity grade was categorized as ≥ IV (moderate) and V (severe) for total scores of 91 or more and 131 or more points, respectively. A-DROP 27 and CURB-65 28 are 6-point scoring systems; "one or more points" and "three or more points" for A-DROP, and "two or more points" and "three or more points" for CURB-65, were categorized as ≥ II (moderate) and ≥ III (severe), respectively. I-ROAD 12 is classified into three grades, and it was categorized as severe when three or more prognostic factors of "Predictors of life expectancy" applied. When less than

Outcomes
The primary outcome in this study was short-term mortality (28-day, 30-day or in-hospital mortality).

Statistical analysis
Paired forest plots and the pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) were calculated using the "midas" and "metandi" commands in STATA 14 software (StataCorp LP, College Station, TX, USA), as previously reported 22. In addition, the overall area under the curve (AUC) of each severity assessment tool was calculated and compared using Review Manager ver. 5.4 software. In eight studies 14,18,30–35 where the AUC was not described in the paper, the AUC was calculated based on receiver operating characteristic (ROC) curves obtained from the raw data on the number of patients and fatalities for each severity grade using STATA 14 software. Statistical significance was set at a p-value of < 0.05. I² statistics were used to evaluate the heterogeneity of the reported studies, as follows: 0–25%, low; 25–50%, moderate; 50–75%, high; 75–100%, very high.
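The per-study quantities named in this section can be illustrated with a short Python sketch. It is not the STATA or Review Manager procedure used in this study: it only shows the standard definitions of sensitivity, specificity, PLR, NLR, and DOR from a single 2x2 table, an empirical (trapezoidal) AUC computed from per-grade counts, and the I² categories listed above. All function names and example counts are hypothetical; no continuity correction is applied, and assigning boundary I² values (25, 50, 75) to the lower category is an assumption, because the stated ranges overlap at their edges.

```python
from typing import Dict, Sequence


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard accuracy measures from one 2x2 table (all cells must be > 0)."""
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive; no continuity correction is applied")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,  # equals (tp * tn) / (fp * fn)
    }


def auc_from_grades(patients: Sequence[int], deaths: Sequence[int]) -> float:
    """Empirical AUC from counts per severity grade (ascending severity).

    Each grade boundary is treated as a cut-off, and the resulting ROC points
    are joined by straight lines (trapezoidal rule).
    """
    survivors = [n - d for n, d in zip(patients, deaths)]
    total_deaths, total_survivors = sum(deaths), sum(survivors)
    if total_deaths == 0 or total_survivors == 0:
        raise ValueError("need at least one death and one survivor")
    auc = tpr = fpr = 0.0
    # Walk from the strictest cut-off (highest grade only) to the most lenient.
    for d, s in zip(reversed(deaths), reversed(survivors)):
        new_tpr = tpr + d / total_deaths
        new_fpr = fpr + s / total_survivors
        auc += (new_fpr - fpr) * (tpr + new_tpr) / 2
        tpr, fpr = new_tpr, new_fpr
    return auc


def i2_category(i2_percent: float) -> str:
    """Map an I² value (in percent) to the heterogeneity label used in this section."""
    if not 0 <= i2_percent <= 100:
        raise ValueError("I² must be between 0 and 100")
    if i2_percent <= 25:
        return "low"
    if i2_percent <= 50:
        return "moderate"
    if i2_percent <= 75:
        return "high"
    return "very high"


# Hypothetical single-study counts, for illustration only.
print(diagnostic_metrics(tp=44, fp=126, fn=6, tn=124))
print(auc_from_grades([10, 40, 80, 120, 50], [0, 1, 5, 20, 24]))
print(i2_category(62.0))
```

Pooling across studies, as done with "midas" and "metandi", requires a bivariate or hierarchical model and is beyond this sketch.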

Database search and risk of bias assessment
A total of 2881 articles (PubMed 2276, Cochrane Library 134, and Ichushi web 471) were identified in the initial search, and 41 articles were potentially eligible after the first screening of titles and abstracts. Next, the full texts were reviewed, and 20 articles were excluded. Ultimately, 21 observational studies were selected for this study (Fig. 1). The summary of the risk of bias in the included studies, assessed using the QUADAS-2, is shown in Fig. 2. For patient selection, three studies were judged to have a high risk of bias and two to have high concern regarding applicability, because of possible inappropriate exclusions or a mismatched definition. In the patient flow and timing assessment, three studies were assessed as having a high risk of bias because of inappropriate omissions or uncertainty about the timing of the reference standard evaluation.

Discussion
The present study evaluated the significance of PSI, A-DROP, I-ROAD, and CURB-65 for predicting mortality in HCAP patients. Our results indicate that these severity assessment tools cannot accurately predict mortality in patients with HCAP. In addition, there were no significant differences between these severity assessment tools. It has been shown that PSI, A-DROP and CURB-65 in CAP and I-ROAD in HAP have high AUCs, nearly 0.8, for predicting mortality 8–12. In this meta-analysis, the overall AUCs of these severity assessment tools for predicting mortality were 0.67–0.70, although two reports showed high AUC values of over 0.8 for A-DROP 38 and CURB-65 41. AUC is often used to measure accuracy in studies of severity assessment, and the discriminatory value based on AUC is rated as "poor" for 0.60–0.69, "moderate" for 0.70–0.79, "good" for 0.80–0.89, and "excellent" for 0.90–1.00 44, although the criteria differ between studies 45. In our study, PSI and A-DROP had "moderate" discriminative ability, while I-ROAD and CURB-65 showed "poor" discriminative ability according to these criteria. Overall, our results showed no significant capability for predicting mortality among the four assessment tools. Generally, patients with HCAP are highly heterogeneous, and their mortality is affected by various factors, including general condition, laboratory data on admission to the hospital, comorbidities, antibiotic-resistant bacterial infections, and social background. In addition, it may also be influenced by the rates of intensive care unit (ICU) admission and/or do not attempt resuscitation (DNAR) orders; for example, the rates of ICU admission and DNAR were 0.9–26.4% 6,13,14,18,30,31,37,39 and 24.0–55.6% 7,18,37, respectively, although these numbers were not reported in all 21 studies. Thus, these severity assessment tools did not show sufficient predictive capability for mortality in HCAP patients. In addition, these results remained unchanged even when limited to NHCAP patients, although only the comparison of the severe grade between A-DROP and I-ROAD was performed, because few studies evaluated PSI or CURB-65 (Supplementary Table S1 and Fig. S1).
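The AUC grading quoted in the paragraph above can be written as a small lookup. The Python sketch below is illustrative only: the function name `auc_category` is hypothetical, the thresholds are those cited in the text 44, and the label returned for values below 0.60 is an assumption because the quoted criteria do not name that range. The four input values are the overall AUCs reported in this meta-analysis.

```python
def auc_category(auc: float) -> str:
    """Label an AUC using the discriminatory categories quoted in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must be between 0 and 1")
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "moderate"
    if auc >= 0.60:
        return "poor"
    # Assumption: the quoted criteria do not label values below 0.60.
    return "below quoted range"


# Overall AUCs reported in this meta-analysis.
overall_auc = {"PSI": 0.70, "A-DROP": 0.70, "I-ROAD": 0.68, "CURB-65": 0.67}
for tool, auc in overall_auc.items():
    print(f"{tool}: {auc:.2f} ({auc_category(auc)})")
```

Because PSI and A-DROP sit exactly on the 0.70 boundary, the "moderate" label is sensitive to rounding, which is consistent with the absence of a significant difference among the four tools.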
This meta-analysis found no significant differences in overall AUCs between PSI and the remaining tools. PSI includes some comorbidities and physical and laboratory parameters as evaluation items and might be the best score for predicting mortality in patient groups with comorbidities, such as those with HCAP 46. In addition, the item "pH", included in PSI and the SCAP score, is known as an indicator of metabolic acidosis in sepsis 31. On the other hand, the item "age", included in all of the severity assessment tools evaluated in this study, carries a relatively large weight in the PSI score but was not a significant risk factor for in-hospital mortality in NHCAP 5. Further investigation is needed, but age and comorbidities may be overvalued in predicting pneumonia severity in elderly patients such as those with HCAP 39. Furthermore, general condition, such as "bedridden state" and "low serum albumin", as well as inflammatory biomarkers, such as "CRP level" and "neutrophil-to-lymphocyte ratio", have been shown to influence mortality prediction in elderly patients with pneumonia 47,48. These findings may explain the low AUC value despite the large number of items, as the prognosis might be more strongly influenced by the usual general condition than by the presence of comorbid diseases in these patients 43. Similar to our results with low NLR, Chalmers et al. reported that PSI might be superior for identifying low-risk patients, with low NLR (0.2 for ≥ IV and 0.5 for ≥ V) in patients with CAP 49, although the AUC value in our results was low compared with that in CAP (0.82 for ≥ IV, 0.81 for ≥ V). Therefore, PSI may be useful for identifying low-risk patients with HCAP, as in CAP, although an NLR below 0.1 is generally considered useful for diagnosis 50.
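To see why the size of the NLR matters when identifying low-risk patients, a likelihood ratio can be converted into a post-test probability through the odds form of Bayes' theorem (post-test odds = pre-test odds × likelihood ratio). The Python sketch below is a generic illustration, not an analysis from this study: the function name `post_test_probability` and the 15% pre-test mortality are hypothetical, while the NLR values 0.18 and 0.33 are the ends of the pooled range reported for the moderate cut-off and 0.1 is the conventional threshold mentioned above.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability and a likelihood ratio to a post-test probability."""
    if not 0.0 <= pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be in [0, 1)")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical pre-test mortality of 15%, chosen only for illustration.
pre_test = 0.15
for nlr in (0.33, 0.18, 0.10):
    risk = post_test_probability(pre_test, nlr)
    print(f"NLR {nlr:.2f}: mortality after a negative result = {risk:.3f}")
```

Under these assumed inputs, a smaller NLR lowers the residual mortality risk after a negative (below cut-off) result, which is the sense in which a tool with low NLR helps to identify low-risk patients.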
A-DROP and CURB-65 are easy to use in daily clinical practice. However, these tools may not be ideal in patients with multiple comorbidities because they may underestimate severity in elderly patients with comorbidities 51. In addition, most HCAP patients are over 65 years old, and the age item of A-DROP and CURB-65 might not be significant, although the utility of CURB, without the item "age", was insignificant in patients with HCAP 36,37. On the other hand, the results of this study showed that A-DROP and CURB-65 had predictive capabilities almost similar to those of PSI in the evaluation using overall AUC. PSI is relatively complex and is often avoided in complicated environments such as an emergency room. Our results indicate that the predictive abilities of these tools were not sufficient to predict mortality, but A-DROP and CURB-65 can be alternatives to PSI in clinical practice for HCAP owing to their ease of evaluation.
Our previous study could not evaluate the utility of I-ROAD for predicting mortality in HCAP 22 because there was only one report 19 (accessed July 16, 2015). However, this study analyzed reports on I-ROAD published after 2015 (all Japanese studies). I-ROAD includes immunodeficiency and radiological findings, which is a major difference from the other severity assessment tools, such as PSI, A-DROP and CURB-65. Indeed, it was reported that the prognostic ability of PSI and CURB-65 for mortality prediction in HCAP patients changed irrespective of immunosuppression 36, consistent with our previous study 22. In addition, radiological characteristics such as bilateral pneumonia were reported as independent risk factors for mortality in NHAP 52. Although I-ROAD is not widely used outside Japan, it might be a viable choice in patients with HCAP, since there was no significant difference between I-ROAD and the other severity assessment tools, despite their low prognostic capability.
In addition to the major evaluation methods listed above, there are various severity assessment tools such as the IDSA/ATS severity criteria 13,31, M-ATS 30,31, NHAP index 16, NHAP model score 14, qSOFA 7,43, R-ATS rules 30, SCAP 31,36, SMART-COP 16,31, SOAR 14,31,37 and SOFA 7, but none of them showed adequate prognostic capability. In Japan, sepsis evaluation using qSOFA and SOFA was recommended as the initial evaluation in the 2017 JRS guidelines for managing pneumonia in adults, in addition to severity assessment by PSI, A-DROP, or CURB-65 25 7. On the other hand, it was reported that evaluation based on clinical conditions such as malnutrition, acute mental status deterioration, health conditions requiring home care, recent hospitalization, and low BMI should be used for severity assessment 52,53. We also showed the usefulness of combining hypoalbuminemia with the PSI or qSOFA, which increased the AUC for mortality from approximately 0.7 to 0.75 compared to PSI or qSOFA alone in NHCAP patients 43. In addition, the efficacy of various serum biomarkers such as the neutrophil-to-lymphocyte ratio, pro-adrenomedullin, prohormone forms of atrial natriuretic peptide, and heparin-binding protein for mortality prediction has been demonstrated in pneumonia patients 54–56. Thus, combining new items may need to be considered for predicting mortality in HCAP patients.
There were some limitations in this systematic review and meta-analysis. First, the included reports had large heterogeneity, a common drawback in meta-analyses 57. In other words, the included studies differed in country, study design, category of pneumonia, study population, outcome, and the rates of ICU admission and DNAR orders, which we could not assess due to limited accessible data and a relatively small sample size. However, the heterogeneity in the HCAP population makes our findings significant. In addition, we evaluated only short-term mortality, but evaluating long-term mortality would be desirable in patient groups where a prolonged hospital stay is likely, such as HCAP cases. Second, the cut-off values of each assessment tool used for the AUC calculation may vary slightly between reports, but the cut-off values for the severity grades are generally defined in these four assessment tools, and we therefore believe that the influence on the overall AUCs is insignificant. Third, we could not evaluate the efficacy of A-DROP for scores of "moderate" or higher because many studies included in this analysis had a sensitivity of almost 100% and a specificity of 5% or less. However, these results may indicate that the criteria for the moderate grade in A-DROP are not meaningful for mortality prediction because most subjects, including those in the HCAP category, are adults aged 65 and above. Finally, this systematic review might have some selection bias because of the limited databases and languages included in the search strategy.
In conclusion, PSI, A-DROP, I-ROAD, and CURB-65 were insufficient for predicting mortality in HCAP patients. We have described useful prognostic factors for mortality in HCAP patients, in the hope of establishing a more useful severity assessment tool with highly accurate predictive ability while taking the existing tools into account.

Figure 1. Flow chart of the study selection.
Figure because the data necessary to create it in both a cut-off value of ≥ I and ≥ III was insufficient.

Table 1. Calculation parameters of the PSI, A-DROP, I-ROAD, and CURB-65.

Table 3. Pooled characteristics of severity scores for predicting mortality. PLR positive likelihood ratio, NLR negative likelihood ratio, DOR diagnostic odds ratio.