Laboratory biomarkers associated with COVID-19 mortality among inpatients in a Peruvian referral hospital

Aim To evaluate the biochemical and hematological markers associated with the risk of death due to COVID-19 in a clinical cohort with a severe clinical profile. Methods A retrospective study was conducted among 215 anonymized inpatient records from the Hospital Nacional Almanzor Aguinaga Asenjo, Peru, between April and June 2020. The association between biomarkers and death due to COVID-19 was assessed using Cox regression, with multivariable modeling of 1) biochemical and 2) hematological markers. Kaplan-Meier analyses and time-dependent receiver operating characteristic curves were calculated for each associated biomarker (p < 0.05). Results Data analysis of 215 inpatient records revealed an overall mortality rate of 51.30% (95% CI 44.70–58.50), a mean age of 63.90 ± 14.10 years, and a median oxygen saturation of 88% (interquartile range 82–92%). The best-fitted biochemical model included higher levels of C-reactive protein (CRP), D-dimer, fibrinogen, urea, and lactate dehydrogenase. Similarly, the best-fitted hematological model included a higher absolute neutrophil count, a longer prothrombin time, and a lower absolute platelet count. The best area under the curve values in the two models were those of CRP and D-dimer (>0.74) and the absolute neutrophil count (0.63). Conclusions Some specific biochemical markers outperformed hematological markers. Evaluated hematological counts analyzed in multivariable models proved to be better markers and could be useful to discriminate COVID-19 patients at high risk of death.


Background
The COVID-19 emergency has resulted in multi-level shortages of resources, leading to the collapse of health systems worldwide, notably in Latin America [1][2][3]. Peru has the highest standardized mortality rate globally, with 6552 deaths per 1 million inhabitants [4]. The lack of access to hospital care opportunities, the widespread use of therapies lacking scientific evidence, and self-medication contributed to poor outcomes in many patients [5,6].
SARS-CoV-2 infection is mediated by the interaction between the receptor-binding domain of the viral Spike protein and the human angiotensin-converting enzyme 2 (ACE-2) expressed in pneumocytes [7]. Infection induces the overproduction of proinflammatory cytokines, leading to a cytokine storm that can result in acute respiratory distress syndrome and multiple organ dysfunction [8,9]. The cascade of immunological and metabolic reactions produces key biomarkers that are over- and under-expressed during the evolution of COVID-19 within the host.
For clinical decision-making, it is recommended to use mainly biochemical markers such as elevated levels of D-dimer, CRP, LDH, troponin, ferritin, and CK, as well as a decrease in absolute lymphocyte counts and an increase of NLR, as indicators of COVID-19 severity [11][12][13][14][15][16]. However, further studies are needed to determine cut-off points in populations with severe clinical profiles and to confirm their real diagnostic power in larger populations. Despite the usefulness of these biomarkers, their determination is not available in health centers with low complexity levels. Therefore, it is fundamental to use accessible laboratory biomarkers to classify patients early according to their prognosis.
In addition to biochemical markers, the complete blood count (CBC) provides valuable information related to inflammatory, infectious, and coagulation factors [8,17]. Hematological markers are routinely assessed upon admission and during the follow-up of all inpatients (moderate or severe) due to their suitability. Despite massive vaccination decelerating the spread of COVID-19, there are still primary care centers attending COVID-19 cases operating under precarious conditions. Therefore, it is essential to use accessible resources to support clinical decision-making and for the timely allocation of resources to patients at risk. Understanding the role of biomarkers as a clinical decision support tool will allow the early classification of COVID-19 cases requiring specific care according to their prognostic risk. The present study aimed to evaluate the prognostic value of biochemical and hematological markers in patients with severe clinical symptoms associated with mortality due to COVID-19.

Study design
A retrospective cohort study was conducted at Hospital Nacional Almanzor Aguinaga Asenjo (HNAAA), which is part of the Seguro Social de Salud (ESSALUD), in the northern Peruvian city of Lambayeque. HNAAA has 404 hospitalization beds and 38 intensive care unit (ICU) beds to normally serve 660,506 insured persons. However, during the peaks of the COVID-19 pandemic, this hospital attended to any patient in need of hospital care in northern Peru, encompassing 1,197,260 inhabitants.

Population
Data from anonymized records of patients admitted to the hospital between April and June 2020 were evaluated. The study population consisted of inpatients admitted during the first wave of COVID-19 who manifested respiratory distress, oxygen saturation (SpO2) <93%, and a ratio of partial pressure of arterial oxygen to oxygen concentration ≤300 mmHg [18]. Clinically compatible COVID-19 cases were confirmed using a rapid anti-SARS-CoV-2 antibody serological test (RT-PCR tests were not available) following national criteria [19]. All patients were followed throughout the entire course of their progression until hospital discharge (death or survival), and none were referred to another healthcare facility.

Data
Demographic and clinical data, as well as patient disposition (discharge or death), were extracted from medical records and directly digitized into a database. The laboratory parameters included in the study corresponded to the first results obtained during the admission period in hospitalized patients presenting with severe or critical COVID-19.

Variables
Death due to COVID-19 was the primary outcome, and the date of death was recorded by the attending physician. Sex and comorbidities, as well as the presence of granulocyte precursors and atypical lymphocytes in peripheral blood, were considered as categorical covariates. Age, disease duration, length of hospitalization, SpO2, complete blood count (hemoglobin, hematocrit, platelets, plateletcrit, leukocytes, lymphocytes, neutrophils, mean platelet volume, platelet distribution width, platelet large cell ratio), prothrombin time, activated partial thromboplastin time, International Normalized Ratio (INR), and biochemical markers (D-dimer, ferritin, LDH, urea, CRP, creatinine, ALT, AST, glucose) were analyzed as continuous variables. Reference ranges are found in Additional file 1. All laboratory biomarkers were evaluated using the first result at hospitalization.

Statistical analysis
Continuous variables were summarized as mean and standard deviation (SD) or median and interquartile range (IQR). Categorical covariates were summarized as absolute and relative frequencies. Bivariate analysis was performed using the Chi-square, Fisher exact, Student's t or median test, depending on the data. We addressed missing data with multiple imputation (Additional file 2). We also plotted linearity between all biomarkers and the outcome (Additional file 3).
Patient survival was also estimated by comparing the tertiles of biomarkers using the Kaplan-Meier method. The log-rank test was used to determine global differences between survival curves without crossing patterns, and the Fleming-Harrington test was used for crossed curves, as it evaluates late differences (p < 0.001) [20].
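The Kaplan-Meier (product-limit) estimate used for these curves can be illustrated with a minimal Python sketch. The follow-up times and outcomes below are hypothetical, the function name `kaplan_meier` is introduced only for illustration, and patients discharged alive are assumed to be censored at discharge; the study itself compared curves across biomarker tertiles in Stata, which this sketch does not reproduce.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct death time (product-limit estimator)."""
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove deaths and censored patients after time t
    return curve

# Hypothetical follow-up in days; 1 = death, 0 = censored (assumed: discharged alive)
days = [3, 5, 5, 8, 12, 12, 20, 30]
died = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(days, died):
    print(f"day {t:>2}: S(t) = {s:.3f}")
```

Applying the same function separately to each tertile of a biomarker would give the per-group curves that the log-rank or Fleming-Harrington tests then compare.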
Non-linear biomarkers were transformed into restricted cubic splines (RCS) with four knots, and significant RCS variables, determined by the Wald test (p < 0.05), were then included for multivariable modelling. Multivariable models were generated using Cox regression and the nested model method with backward selection. A model composed of biochemical markers was built using all the variables from the database. Likewise, three models (A, B, and C) composed only of hematological markers (excluding biochemical markers) were built to evaluate indicators available in healthcare centers with low complexity levels. We modeled: A) all hematological variables, B) variables excluding relative hematological values, and C) complete blood count values. The most parsimonious models were selected using the Wald test (p < 0.05) for a pooled analysis of imputed datasets.
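To make the spline transformation concrete, the following Python sketch builds restricted cubic spline terms for a single value using the common truncated-power formulation, which is constrained to be linear beyond the outer knots. The function name `rcs_basis` and the SpO2 knot positions are hypothetical choices for illustration; the knot locations actually used in the study are not stated in the text.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline terms for one value x.

    Returns [x, s_1, ..., s_{k-2}] for k knots, using the truncated-power
    form that is linear below the first and above the last knot.
    """
    k = len(knots)
    if k < 3 or list(knots) != sorted(set(knots)):
        raise ValueError("need at least 3 strictly increasing knots")
    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2  # keeps spline terms on the scale of x

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0
        return u ** 3 if u > 0 else 0.0

    terms = [float(x)]
    for t_j in knots[:-2]:
        s = (cube_plus(x - t_j)
             - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
             + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev))
        terms.append(s / scale)
    return terms

# Hypothetical knots for SpO2 (%); four knots give one linear and two non-linear terms
spo2_knots = [75, 85, 90, 95]
print(rcs_basis(88, spo2_knots))
```

With four knots, each biomarker contributes three columns to the Cox model, and a joint Wald test on the two non-linear columns is one way to assess departure from linearity.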
Finally, the individual performance of explanatory biomarkers was evaluated by a time-dependent receiver operating characteristic (ROC) curve analysis (30 days), estimating the area under the curve (AUC) for each biomarker, and the sensitivity and specificity for an optimal cut-off point according to the Youden index, as implemented in the package sttroccurve [21]. Positive and negative predictive values were also estimated. All analyses were conducted using Stata software v.17 (StataCorp, College Station, TX).
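The cut-off selection can be sketched in Python as follows. This is a simplified illustration that treats death as a plain binary outcome, whereas the study used a time-dependent ROC analysis at 30 days in Stata. The biomarker values, the function names, and the convention that higher values indicate higher risk are assumptions made for the example.

```python
def confusion_at(values, died, cutoff):
    """Counts when 'value >= cutoff' is read as a positive (high-risk) result."""
    tp = sum(1 for v, d in zip(values, died) if v >= cutoff and d)
    fp = sum(1 for v, d in zip(values, died) if v >= cutoff and not d)
    fn = sum(1 for v, d in zip(values, died) if v < cutoff and d)
    tn = sum(1 for v, d in zip(values, died) if v < cutoff and not d)
    return tp, fp, fn, tn

def youden_cutoff(values, died):
    """Return the cut-off maximizing J = sensitivity + specificity - 1.

    Assumes the data contain at least one death and one survivor.
    """
    best = None
    for cutoff in sorted(set(values)):
        tp, fp, fn, tn = confusion_at(values, died, cutoff)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if best is None or j > best["J"]:
            best = {
                "cutoff": cutoff, "J": j,
                "sensitivity": sens, "specificity": spec,
                "PPV": tp / (tp + fp) if tp + fp else float("nan"),
                "NPV": tn / (tn + fn) if tn + fn else float("nan"),
            }
    return best

# Hypothetical biomarker values and outcomes (1 = death, 0 = survival)
marker = [2, 4, 5, 7, 9, 11, 13, 15, 18, 22]
died = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
print(youden_cutoff(marker, died))
```

Predictive values depend on the mortality proportion of the sample, so PPV and NPV obtained in a cohort with roughly 50% mortality would not transfer directly to settings with lower mortality.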

Ethics
The study protocol received approval from the Ethics and Research Committee of the Red Prestacional Lambayeque - Hospital Nacional Almanzor Aguinaga Asenjo, determination N° 026-CIEI-RPLAMB. Since the analysis was conducted with anonymized data, the Ethics Committee provided an exemption from the informed consent procedure.

Table 1
Characteristics and biochemical markers of the study population.

Laboratory findings
Overall, the study population presented a severe clinical picture represented by leukocytosis, neutrophilia and severe lymphopenia. Coagulation markers such as platelet count, plateletcrit and PT were within normal ranges. However, the overall median fibrinogen levels exceeded reference ranges (>500 mg/dl). Among biomarkers associated with COVID-19 severity, all the patients presented elevated CRP levels (>0.5 mg/dl), indicating the presence of an advanced inflammatory process at the time of admission (Tables 1 and 2).

Biomarkers and mortality by COVID-19
Leukocytosis and neutrophilia at admission were significantly more pronounced among fatal cases (p < 0.001). Additionally, the NLR was higher among fatal cases, with a median of 15.43 (p = 0.01). Overall, patients with a poor prognosis had lower platelet counts, lower plateletcrit, and a longer PT than those who survived (p < 0.05) (Table 1). Coagulation alterations were more pronounced among the fatal cases, whose median was three times that of survivors (both above the normal value of < 0.50 μg/ml). In addition, CRP, ferritin, urea, and LDH values were severely elevated in non-survivors. In the bivariate analysis of survival vs. death, D-dimer, CRP, ferritin, urea, and LDH showed statistical significance (p < 0.001) (Table 1), in contrast to granulocyte precursors and atypical lymphocytes in peripheral blood (Table 2).

Kaplan-Meier survival analysis
Differences in the mortality rate were observed along survival functions for SpO2 and tertile biomarkers. As expected, the survival of patients from the lowest SpO2 tertile (≤55%) was lower than the high (≥91%) and medium (56-90%) tertiles. Regarding the explanatory biomarkers, survival was much higher in patients with lower D-dimer and fibrinogen, and higher CRP. The survival curves for the other markers did not show sustained trends at each tertile over time (Fig. 1).

Non-adjusted and adjusted Cox regression models
For each additional year in age at hospital admission, the risk of mortality increased by 2% (p = 0.05). Moreover, a high SpO2 was associated with greater protection against death (3% less risk for each additional unit in the percentage of SpO2, p < 0.001). Hematological markers, including the absolute leukocyte count, relative and absolute lymphocyte count, relative and absolute neutrophil count (8% greater risk for each 10³ cell increase), platelets, plateletcrit, and PT, were significantly associated with mortality (p < 0.05). Likewise, biochemical markers such as D-dimer (14% higher risk for each increase in one unit), fibrinogen, LDH, CRP, urea, ferritin, and creatinine (p < 0.005) were also associated with mortality (Table 3).
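Per-unit hazard ratios compound multiplicatively, which is easy to overlook when reading percentages. The short Python sketch below shows how a per-unit HR translates to a larger change in the covariate. It reads the reported "2% per year" and "8% per 10³ cells" as HRs of 1.02 and 1.08, and it assumes a log-linear effect, which would not hold for the variables later modelled with restricted cubic splines. The function name is introduced only for illustration.

```python
def hazard_ratio_for_change(hr_per_unit, delta):
    """HR for a change of `delta` units, assuming a log-linear (non-spline) effect."""
    return hr_per_unit ** delta

# Age: HR 1.02 per year, so a patient 10 years older has HR 1.02**10
print(round(hazard_ratio_for_change(1.02, 10), 2))  # 1.22

# Absolute neutrophil count: HR 1.08 per 10^3 cells, evaluated for 5 x 10^3 more cells
print(round(hazard_ratio_for_change(1.08, 5), 2))   # 1.47
```

Under this reading, a 10-year age difference corresponds to about a 22% higher hazard, not 20%, because the increments multiply and do not add.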
Several adjusted regression models were built with the incorporation of variables following technical criteria. One model composed of the most explanatory biochemical markers for mortality by COVID-19 included low SpO2 and high levels of D-dimer, fibrinogen, CRP, urea, LDH and platelet count (Table 4a). In addition, models were built using only hematological markers, resulting in three explanatory models composed of hematological markers associated with risk for death, accompanied by low SpO2. Explanatory covariates of mortality by COVID-19 included: A) and B) a high number of neutrophils, low platelet count and longer PT (both models converged on the same selection); and C) a high number of neutrophils and low platelet count (Table 4b). Moreover, the dynamic relationships for the RCS of SpO2, D-dimer, CRP and LDH are shown in Fig. 2.

Time-dependent ROC
At day 30, the individual explanatory variable with the largest AUC was CRP (0.79), with similar sensitivity and specificity. Other biomarkers displaying an AUC above 0.70 were D-dimer and LDH. On the other hand, the hematological markers shaping models A, B and C generally had low individual performances (AUC ~0.60). Among them, the absolute neutrophil count best explained COVID-19 mortality at day 30. Complete information regarding the biomarkers' prognostic capacity is detailed in Table 5. The different ROC curves are shown in Fig. 3.

Discussion
The current study aimed to identify blood markers associated with COVID-19 mortality during the first wave of the pandemic in Peru in 2020. This is also one of the few studies to evaluate a cohort severely compromised by COVID-19 at hospital admission [6,22], in the absence of widespread vaccination and during the circulation of more virulent SARS-CoV-2 strains. Due to limited resources, Peru and other low- and middle-income countries struggled to provide adequate healthcare to COVID-19 patients. In these settings, the identification of critical patients based on available platforms, such as the use of hematological and biochemical markers, is essential to determine various levels of hospital care.
The overall hospital mortality proportion was high compared to other countries [23][24][25][26], but consistent with Peruvian studies: 46-49.59% in Lima [22,27] and 9% lower in the North Coast [6]. The elevated mortality proportion (51.60%) was due to the downgrading of hospital attention, which forced affected people to use in-house oxygen therapies and non-evidence-based drugs [5,6]. Delay in hospital management and increased disease duration influenced the high mortality rate compared to other contexts [28].
Oxygen saturation is the most important predictor of poor outcomes due to its relation to alveolar deterioration, allowing the identification of cases with the greatest need for ventilatory support. The association of SpO2 <85% with a higher risk of death by COVID-19 was previously reported [12,22]. Dysregulation of the immune response mediated by IL-6 induces an elevation of CRP [8] and ferritin levels >800 μg/L [29], observed in patients with severe or fatal COVID-19. CRP, a fast-increasing acute inflammatory indicator, exhibited in our study the highest levels identified in COVID-19 patients to date [12,24]. In our study, fibrinogen predicted the risk of death in the study population. Similarly, high D-dimer levels produced during fibrin degradation suggest thrombosis and thrombolysis [8] and are associated with lower survival [12]. In our study, the results suggested that even patients who survived could be at risk of thrombosis on admission; however, other features should be considered. LDH suggests vital organ injury, in which overregulation of the glycolytic pathway leads to a lack of oxygenation and multi-organ failure in severe COVID-19 patients [10]. However, in our cohort, LDH values remained within the reference ranges. On the other hand, elevated urea indicates dysfunction of kidney metabolism and is related to COVID-19 severity [30], alone or along with albumin as a ratio [30,31]. In the study population, urea values were severely increased on admission among survivors and patients who died. Although patients seeking clinical care exhibited low SpO2 at hospital admission, many had access to portable oxygen during home care, which could have prevented organ injury before hospitalization.
Neutrophilia and lymphopenia are caused by the dysfunction of the innate and adaptive immune response in severe COVID-19 patients, respectively [10,32]. In our study, survivors and fatal cases presented neutrophilia and lymphopenia, which was inconsistent with previous studies [33][34][35]. Our study population also presented very high NLR values, which have also been described before [11,36]. The lack of association of these values with the risk of death could possibly be because the entire population presented critical profiles on admission. Thrombocytopenia after inflammatory overregulation is controversially related to peripheral platelet consumption and thrombotic events among COVID-19 patients with poor outcomes [17,25,33,35,37]. A significant association has been reported between prolonged PT, a coagulopathy indicator, and an unfavorable prognosis, even when the PT values are within the reference range [35,38].
The prognostic potential of SpO2 was previously estimated as an AUC of 0.70 [12], which is compatible with our results (AUC 0.69). The reliability of CRP values is widely inconsistent, with good performances (AUC 0.86-0.92) documented in some studies [8,12] and weak performances (AUC 0.69) in others [33]. In our study, the AUC of CRP showed fair accuracy (0.79).
Many reports have proposed the D-dimer cut-off to be somewhere between 0.67 and 2.03 μg/ml, with an AUC between 0.81 and 0.88 [8,12,39], showing better performance than in our study. Similarly, the AUC for LDH, ferritin, and NLR were estimated to be between 0.62 and 0.64, with only the absolute neutrophil count presenting an AUC of 0.53 [33]. Likewise, the AUC for urea was estimated at 0.69-0.78 [30,31]. Another study calculated the AUC for NLR at 0.77 [17], showing that this estimate can significantly differ from study to study. The platelet count in one study showed an AUC of 0.81 [17], presenting much better performance than in our study. Therefore, CRP, urea, LDH and D-dimer, the best-performing biomarkers, only achieved fair accuracy (none exceeded the accepted 80%). In relation to hematological markers, the individual diagnostic ability was not found to be good, with only the absolute neutrophil count showing a moderate performance (AUC 0.63), albeit better than previous reports [33].
Models composed of combined covariates associated with the risk of mortality by COVID-19 aim to generate tools capable of predicting poor COVID-19 outcomes. These models include different combinations of IL-6, CRP, D-dimer, absolute neutrophil count, and others, showing a generally good sensitivity and specificity (>90%) [8,35,40,41]. However, confirming the efficiency of these models requires an adequate modeling process and a comprehensive external validation [42]. The validation process reduces the overfitting of AUC estimation in training samples, and thus, corrects the performance values. Other studies sought to generate models using only values of the CBC, the analysis of which is much more accessible, finding neutrophils, lymphocytes, and platelets to be explanatory variables for progression to unfavorable outcomes [37,43]. AUC values are not reported for the multivariable models constructed in the present study, as it was not possible to carry out the external validation that would allow calculating the real AUC. Furthermore, internal validation of these models was beyond the aims of this study.

Limitations
The use of retrospective data remains a significant limitation that could impact data quality and introduce information bias. However, given the study context, a retrospective design was the only feasible option. Despite the challenges it entails, we anticipate that any registration errors are distributed uniformly. The inclusion of patients from a single site may induce selection bias; therefore, further research is needed to confirm these findings in diverse cohorts. Additionally, we identified the potential existence of Neyman bias, where the study population, more severely affected by COVID-19, might have been at higher risk of death. This inadvertently led to the exclusion of related biomarkers from the analysis. Undetermined markers, such as IL-6 (which presented the best AUC identified, 0.931) [12], albumin, troponin, and others, were not measured. The inclusion of these markers is uncommon in routine clinical practice (troponin, for example, had 86% missing data in our study), and their incorporation does not align with the objective of identifying biomarkers available in resource-limited hospital settings. Furthermore, we were unable to extract the medical indication for admission or determine actual ICU admissions. Local studies indicate that around 64% of patients may require ICU care, but only 3.30-10.20% are admitted [6,22,27]. In terms of statistical analysis, we employed a traditional variable selection strategy for multivariable modeling, which may overestimate associations. However, we opted for a backward method, considered preferable over the forward method [44], as our dataset included numerous candidate variables that could potentially serve as prognostic factors. Nonetheless, our data will be made available for independent groups to conduct alternative statistical analyses, contributing to a more comprehensive understanding of the role of biomarkers in poor COVID-19 outcomes within the scientific community.
Laboratory biomarkers are linked to the risk of death from COVID-19, and these parameters hold significant prognostic value. However, this association is primarily observed with biochemical markers, which are not routinely included in laboratory monitoring for COVID-19 patients in hospitals of low- and middle-income countries. Hematological markers, being more cost-effective, are monitored uniformly in all inpatients. Every patient, regardless of their clinical condition, has equal priority for the determination of these parameters. While the AUC of each hematological marker associated with the risk of COVID-19 mortality did not exhibit high accuracy, combined covariates demonstrated a good fit. Moreover, the models constructed could be validated in other cohorts. Our study highlights that hematological markers can help explain mortality from COVID-19 and are valuable in aiding clinical decision-making to prioritize critical care for COVID-19 patients. These tools, alongside other public health initiatives, will bolster a rapid response and facilitate preparedness strategies for the ongoing COVID-19 emergency and future pandemics.

Conclusion
Biomarkers associated with the risk of mortality were identified in the severe COVID-19 cohort. Multivariable models revealed that, in addition to oxygen saturation, hematological markers assessed at hospital admission (such as the absolute neutrophil count, platelet count, and PT) exhibited a good fit and held significant prognostic value in discriminating patients at risk of death from COVID-19. When considering individual prognostic performance, biochemical markers outperformed hematological parameters.

Fig. 1. Kaplan-Meier survival analysis for explanatory biomarkers categorized in tertiles. The log-rank test was used to determine global differences between survival curves without crossing patterns (SpO2, D-dimer, LDH, CRP; p < 0.001), and the Fleming-Harrington test, which evaluates late differences, was used for the crossed curves (fibrinogen, urea, neutrophil count, platelet count and prothrombin time; p < 0.001).

Fig. 2. Non-linear associations between biomarkers and death by COVID-19. Restricted cubic spline models fitted for Cox proportional hazards models with four knots for three significant non-linear associations from the principal analysis. The cut-offs estimated in the ROC analysis were used as reference values. Plots on the left were generated with complete-case data, and plots on the right with imputed datasets. Only SpO2 and D-dimer show a reliable continuous relationship with HR estimations. Lower SpO2 % represents a higher risk, which is dynamic depending on the sample size (values < 75% lack precision). Conversely, higher D-dimer values are related dynamically to death by COVID-19 (values >9 μg/ml risk lacking precision).

Fig. 3. ROC curves of explanatory biomarkers at day 30. The individual performance of explanatory biomarkers was evaluated by a time-dependent receiver operating characteristic curve. None individually exceeded an AUC of 0.79.

Table 2
Hematological counts and presence of granulocyte precursors and atypical lymphocytes in peripheral blood.

Table 3
Non-adjusted analysis of biomarkers associated with mortality by COVID-19.

HR, hazard ratio; CI, confidence interval. The non-adjusted analysis was conducted using Cox regression, which included imputed datasets. Values in bold indicate significant associations (p < 0.05).
a Data with missing values.
b Variables scaled/10 to better interpret the HR.

Table 4a
Adjusted analysis of biochemical markers associated with COVID-19 mortality.
a Variables scaled/10 to better interpret the HR.
b Dynamic aHR estimations are illustrated in Fig. 2 for variables transformed into restricted cubic splines due to their lack of linearity.

Table 4b
Adjusted analysis of hematological markers associated with COVID-19 mortality. The multivariable (adjusted) models composed of hematological markers were generated using Cox regression and the nested model method with backward selection. Variables were specified according to the following criteria: Model A, modeling included all hematological variables; Model B, global modeling excluding variables expressing relative values; Model C, modeling including only complete blood count. The most parsimonious models were selected using the Wald test (p < 0.05) for a pooled analysis of imputed datasets. Values in bold indicate significant associations (p < 0.05).
a Variables scaled/10 to better interpret the HR.

Table 5
Sensitivity and specificity of the explanatory biomarkers.