Risk factors for COVID-19 mortality among telehealth patients in Bangladesh: A prospective cohort study

Background and objective: Estimating the contribution of risk factors of mortality due to COVID-19 is particularly important in settings with low vaccination coverage and limited public health and clinical resources. Very few studies of risk factors of COVID-19 mortality used high-quality data at an individual level from low- and middle-income countries (LMICs). We examined the contribution of demographic, socioeconomic and clinical risk factors of COVID-19 mortality in Bangladesh, a lower middle-income country in South Asia.
Methods: We used data from 290,488 lab-confirmed COVID-19 patients who participated in a telehealth service in Bangladesh between May 2020 and June 2021, linked with COVID-19 death data from a national database to study the risk factors associated with mortality. Multivariable logistic regression models were used to estimate the association between risk factors and mortality. We used classification and regression trees to identify the risk factors that are the most important for clinical decision-making.
Findings: This study is one of the largest prospective cohort studies of COVID-19 mortality in a LMIC, covering 36% of all lab-confirmed COVID-19 cases in the country during the study period. We found that being male, being very young or elderly, having low socioeconomic status, having chronic kidney or liver disease, and being infected during the later period of the pandemic were significantly associated with a higher risk of mortality from COVID-19. Males had 1.15 times higher odds (95% Confidence Interval, CI: 1.09, 1.22) of death compared to females. Compared to the reference age group (20–24-year-olds), the odds ratio of mortality increased monotonically with age, ranging from 1.35 (95% CI: 1.05, 1.73) for ages 30–34 to 21.6 (95% CI: 17.08, 27.38) for ages 75–79. For children 0–4 years old, the odds of mortality were 3.93 (95% CI: 2.74, 5.64) times higher than for 20–24-year-olds. Other significant predictors were severe symptoms of COVID-19 such as breathing difficulty, fever, and diarrhea. Patients who were assessed by a physician as having a severe episode of COVID-19 based on the telehealth interview had 12.43 (95% CI: 11.04, 13.99) times higher odds of mortality compared to those assessed to have a mild episode. The finding that the telehealth doctors' assessment of disease severity was highly predictive of subsequent COVID-19 mortality underscores the feasibility and value of the telehealth services.
Conclusions: Our findings confirm the universality of certain COVID-19 risk factors (such as gender and age) while highlighting other risk factors that appear to be more (or less) relevant in the context of Bangladesh. These findings on the demographic, socioeconomic, and clinical risk factors for COVID-19 mortality can help guide public health and clinical decision-making. Harnessing the benefits of the telehealth system and optimizing care for those most at risk of mortality, particularly in the context of a LMIC, are the key takeaways from this study.


Introduction
The COVID-19 pandemic has claimed over 6.45 million lives [1] (as of August 2022) worldwide since the first officially reported death in January 2020. Mortality rates and case-fatality ratios for COVID-19 have varied widely across countries and across different waves of the pandemic [2]. While several risk factors for severe disease due to COVID-19, most notably age, have been well-established [3], understanding who is most at risk of hospitalization and death remains an important public health priority. Understanding these risk factors is especially important in the context of low- and middle-income countries (LMICs), where resources are scarce and therefore need to be prioritized to care for those who are at highest risk of adverse outcomes. The relative importance of demographic, socioeconomic and clinical risk factors for COVID-19 severity and mortality in Bangladesh has not been adequately studied. Bangladesh, a lower middle-income country in South Asia with a dense population, has observed 12,136 total cases per million population and 175 deaths per million population during the pandemic [4].
The majority of studies examining risk factors for severe disease and death due to COVID-19 have relied on population-based inferences, and have shown associations with age [5,6], gender [7], and socioeconomic status [8]. Data linkage between death records and other sources of individual-level data, such as surveys, is available in only a handful of countries [9][10][11][12][13][14], and rarely in a LMIC. The majority of studies from LMIC are from sub-Saharan Africa [15], while studies from South Asia are mostly cross-sectional, with small patient populations and limited data on potential predictors [16][17][18]. To date, two studies using small datasets with limited information on socioeconomic factors, symptom severity and preexisting conditions have examined the risk factors of severe illness and mortality due to COVID-19 in Bangladesh [19].
Here, we use data from 290,488 lab-confirmed COVID-19 patients who participated in a telehealth service in Bangladesh, linked with COVID-19 death data from a national database, to study the risk factors associated with mortality. All patients who participated in the telehealth service between May 17, 2020 and June 14, 2021 were eligible to be included in the study. During this period, there were approximately 807,704 COVID-19 cases and 12,844 COVID-19 attributable deaths recorded in Bangladesh [1]; our study, thus, covers about 36% of all recorded cases in the country during the study period. Mass vaccination was rolled out in Bangladesh on February 21st, 2021 and only a small proportion (<4%) of the Bangladeshi population was vaccinated during our study period ending in June 2021 [20].
The contribution of our study is two-fold. First, understanding the demographic, socioeconomic, and clinical risk factors for COVID-19 mortality can help guide clinical decision-making. These results are particularly relevant in resource-constrained settings where hospital beds and ICU capacity may be limited, or during infection surges when there is an impending shortage in healthcare services. Knowledge of the risk factors can also guide policymakers in targeting non-pharmaceutical interventions toward populations that are most at-risk. Second, our analysis provides support for the usefulness of telehealth services in Bangladesh. The physician's evaluation of patients, based on the interviews through the telehealth service, was highly predictive of eventual mortality; this finding highlights the feasibility of using telehealth services for triaging patients who are the most at-risk.

Study design and data sources
We conducted a prospective cohort study using routinely collected electronic data from PCR positive COVID-19 patients who received telehealth services provided by the Government of Bangladesh between May 17, 2020 and June 14, 2021. Under this program, PCR positive COVID-19 patients received a call from a health information officer (HIO) who confirmed their COVID status and conducted an initial assessment of their health condition. The HIOs then transferred the patients to a physician, who further assessed patients' health status and advised them regarding the next course of their treatment. Based on the physician's assessment, follow-up calls were scheduled. Every patient who agreed to receive the telehealth services received at least one phone call and a follow-up call after 3, 5, or 10 days depending on whether they were assessed as having severe, moderate, or mild symptoms respectively. An additional follow-up call was scheduled for 7, 10, and 14 days after the initial call based on the assessed severity of symptoms.
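The follow-up schedule described above can be written down as a small lookup. This is only an illustrative sketch: the names `FOLLOW_UP_DAYS` and `follow_up_dates` are hypothetical, and pairing the additional calls at 7, 10, and 14 days with severe, moderate, and mild cases (in that order) is an assumption based on how the first follow-up is described.

```python
from datetime import date, timedelta

# Days after the initial call for (first follow-up, additional follow-up).
# The first values (3, 5, 10) come from the text; matching 7, 10, 14 to the
# same severity order is an assumption.
FOLLOW_UP_DAYS = {
    "severe": (3, 7),
    "moderate": (5, 10),
    "mild": (10, 14),
}


def follow_up_dates(initial_call, severity):
    """Return the scheduled follow-up dates for an assessed severity level."""
    return [initial_call + timedelta(days=d) for d in FOLLOW_UP_DAYS[severity]]


# Example: a patient first called on the first day of the study period.
print(follow_up_dates(date(2020, 5, 17), "severe"))
```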
During the study period, 334,626 patients received telehealth services. Of these, we excluded patients whose age or sex was not correctly recorded (7,823 patients), patients with one or more questions with seemingly incorrectly entered data (103 patients), and patients without mortality information (36,212 patients). Our final study population included 290,488 patients.
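The cohort arithmetic can be checked directly. The short sketch below uses only counts stated in the text (the national case count is the figure quoted in the Introduction); the variable names are illustrative.

```python
# Counts reported in the text.
received_telehealth = 334_626
exclusions = {
    "age or sex not correctly recorded": 7_823,
    "seemingly incorrect data entry": 103,
    "no mortality information": 36_212,
}

final_population = received_telehealth - sum(exclusions.values())
print(final_population)  # 290488

# Share of nationally recorded cases during the study period.
national_cases = 807_704
coverage = final_population / national_cases
print(f"{coverage:.1%}")
```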
Ethical review for this study was sought from the New York State Psychiatric Institute Institutional Review Board (NYSPI IRB protocol #8173). The board determined that the secondary data analysis of routine patient data was exempt from ethical review and approval as it did not meet the definition of human subject research according to federal guidelines. We also obtained ethical approval from the institutional review board of the International Center for Diarrhoeal Disease Research in Bangladesh (icddr,b IRB protocol PR-23030).

Outcomes and predictor variables
The outcome of our analysis was a binary indicator of whether a death was recorded for a patient in the study population. The telehealth database recorded deaths when family members reported the death during follow-up calls. We obtained additional death information from the government's COVID-19 death database compiled by the Directorate General of Health Services. The telehealth and mortality databases were merged by the authorities. To combine the databases, a unique identifier was created in the death database by finding unique combinations of patients' mobile number, age, gender, and district of residence. The same parameters were matched in the telehealth database and the unique identifier was confirmed. This unique ID was then used to merge the two databases for further analysis. The de-identified data were then shared with us for the analysis of risk factors presented in this paper.
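The linkage step can be sketched as a join on a composite key. This is a simplified illustration, not the procedure the authorities actually ran: the field names, the helper names `make_key` and `link_deaths`, and the choice to leave ambiguous (non-unique) keys unlinked are all assumptions.

```python
from collections import Counter

# Hypothetical field names for the four matching variables named in the text.
KEY_FIELDS = ("mobile", "age", "gender", "district")


def make_key(record):
    """Build the composite key used to match records across databases."""
    return tuple(record[field] for field in KEY_FIELDS)


def link_deaths(telehealth, deaths):
    """Flag telehealth records whose key matches exactly one death record.

    Keys that occur more than once in either database are treated as
    ambiguous and left unlinked (an assumption made for this sketch).
    """
    death_counts = Counter(make_key(d) for d in deaths)
    tele_counts = Counter(make_key(t) for t in telehealth)
    linked = []
    for record in telehealth:
        key = make_key(record)
        matched = death_counts.get(key, 0) == 1 and tele_counts[key] == 1
        linked.append({**record, "died": matched})
    return linked


# Toy records, invented purely for illustration.
telehealth = [
    {"mobile": "A", "age": 61, "gender": "M", "district": "Dhaka"},
    {"mobile": "B", "age": 34, "gender": "F", "district": "Sylhet"},
]
deaths = [{"mobile": "A", "age": 61, "gender": "M", "district": "Dhaka"}]
print([r["died"] for r in link_deaths(telehealth, deaths)])
```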
We analyzed potential risk factors of COVID-19 mortality based on the questions asked by the physicians during clinical assessment. The telehealth questionnaire captured data on patient demographics, living conditions, pandemic period, pre-existing health conditions, presenting symptoms, patients' self-assessment of health status, physicians' assessment of patients' mental status, and physicians' assessment of patients' overall health condition.
Patients' age was recorded as a continuous variable, which we categorized into 5-year groups, with ages 20–24 treated as the reference category. Using a categorical age variable allowed us to examine the non-linear relationship of age with mortality. Five-year age groups were chosen because similar age categories were used in analyses of age effects on COVID-19 mortality in previous studies [21]. We used proxies to determine a patient's socioeconomic status (SES) as we did not have direct measures of income, occupation, education, etc. We categorized a patient as having low SES if they reported that they were unable to isolate at home (indicating crowded living conditions) or did not have a separate bedroom and bathroom for their use in the house.
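A minimal sketch of these two derived variables follows. The function names are hypothetical, no open-ended top age band is modeled, and reading the SES proxy as "lacking either a separate bedroom or a separate bathroom" is an assumption about an ambiguous phrase.

```python
def age_group(age):
    """Map age in completed years to a 5-year band label, e.g. 23 -> '20-24'."""
    lower = (int(age) // 5) * 5
    return f"{lower}-{lower + 4}"


def low_ses(can_isolate_at_home, has_separate_bedroom, has_separate_bathroom):
    """Proxy for low SES: unable to isolate at home, or lacking a separate
    bedroom or bathroom (the 'either one missing' reading is an assumption)."""
    return (not can_isolate_at_home) or not (
        has_separate_bedroom and has_separate_bathroom
    )


REFERENCE_GROUP = "20-24"  # reference category used for the odds ratios

print(age_group(23) == REFERENCE_GROUP)
print(low_ses(True, True, False))
```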
Physicians ascertained the presence of comorbidities based on patients' self-reporting and then verified these self-reports by asking follow-up questions about specific medications taken for the condition(s) reported. Comorbidities were categorized as diabetes, hypertension, chronic kidney disease, chronic liver disease, chronic respiratory disease (asthma, COPD), and others (thyroid conditions, Alzheimer's disease, cancers, and heart disease). Patients' reports of symptoms including fever, cough, chest pain, loss of taste and smell, headache, weakness, diarrhea, and vomiting were recorded as none, mild, moderate, and severe. Physicians recorded their overall assessment of the patient's physical condition as mild, moderate, and severe, and patients were advised on treatment and follow-up plans accordingly. Measures of mental health include physicians' records of patients' mental condition: normal, stressed, or panicked, and also patients' report of having adequate sleep. Patients also reported their own assessment of the improvement of their health status.

Statistical analyses
To examine the association between potential risk factors and mortality, we used multiple logistic regression models, with death as the binary outcome. The odds ratios of mortality estimated from the logistic regressions approximate risk ratios given the low incidence of the outcome. We preferred logistic regression over binomial regression because of better model performance. The odds ratios (OR) presented in the manuscript are adjusted odds ratios obtained from three separate multivariable models. We used separate multivariable models for three sets of predictor variables based on their temporal location in the disease pathway (S1 Fig). This allows us to avoid introducing bias in our analyses by adjusting for variables on the hypothesized causal pathway between a risk factor and mortality. The three models with different sets of predictors are: (1) model 1, which included the patient's age, sex, location, period of the pandemic, socioeconomic status, comorbidities, and symptoms; (2) model 2, which included the patient's age, sex, mental health status, rating of health improvement, and amount of sleep; and (3) model 3, which included the patient's age, sex, and the physician's rating of the patient's health condition. Missing values for covariates were replaced with a "missing" indicator and included in the multivariable models.
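Two pieces of the reasoning above can be illustrated numerically: why an odds ratio is close to a risk ratio when the outcome is rare, and how an odds ratio and its 95% confidence interval follow from a logistic regression coefficient. The sketch below is generic; the probabilities are made-up inputs, and the standard error is back-calculated from the interval reported for male sex rather than taken from the paper.

```python
import math


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing two outcome probabilities."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))


def risk_ratio(p_exposed, p_unexposed):
    """Risk ratio comparing two outcome probabilities."""
    return p_exposed / p_unexposed


def or_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Rare outcome (risks of 2% vs 3%, hypothetical): OR stays close to RR.
print(odds_ratio(0.03, 0.02), risk_ratio(0.03, 0.02))

# Common outcome (30% vs 45%, hypothetical): OR overstates the RR.
print(odds_ratio(0.45, 0.30), risk_ratio(0.45, 0.30))

# Coefficient for male sex implied by OR = 1.15; SE of 0.0287 is assumed.
print(or_with_ci(math.log(1.15), 0.0287))
```

With mortality around 2.4% in this cohort, the rare-outcome case is the relevant one.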
To evaluate which risk factors are most important in the context of clinical decision-making, we used classification and regression tree (CART) models to generate decision trees. CART is a statistical technique based on recursive partitioning analysis and is well suited for the generation of clinical decision rules [23][24][25]. Unlike multiple logistic regression, it can handle numerical data that are highly skewed or multimodal and categorical predictors with either an ordinal or nominal structure. CART involves segregating different values of classification variables through a decision tree composed of progressive binary splits. Every value of each predictor is considered as a potential split, and the optimal split is selected based on the reduction in the residual sum of squares due to a binary split of the data at that tree node. Each parent node produces two child nodes, which can become parent nodes producing additional child nodes. This process continues with tree building and pruning until the tree fits the data without overfitting. We used the rpart function from the R package rpart [26], with the method argument set to "class". A node needed to contain at least 20 observations for a split to be attempted. Any split that did not decrease the overall lack of fit by a factor of 0.001 was not considered. Ten cross-validations were run.
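The split search at a single node can be sketched in a few lines. This is a toy illustration, not the rpart implementation: it assumes the Gini impurity criterion that is conventional for classification trees (the residual sum of squares plays the analogous role for regression trees), it handles one numeric predictor only, and the function names and data are invented. In rpart, the settings described above correspond to the `minsplit`, `cp`, and `xval` arguments of `rpart.control`.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels, min_split=20):
    """Try every midpoint between distinct sorted values of one predictor.

    Returns (threshold, impurity_decrease); threshold is None when the node
    is too small to split or no split reduces impurity.
    """
    n = len(labels)
    if n < min_split:
        return None, 0.0
    parent = gini(labels)
    best = (None, 0.0)
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if parent - weighted > best[1]:
            best = (threshold, parent - weighted)
    return best


# Toy data: ages and a binary death indicator, invented for illustration.
ages = [30, 40, 50, 60, 70, 80]
died = [0, 0, 0, 1, 1, 1]
print(best_split(ages, died, min_split=2))
```

A full tree applies this search recursively to each child node, over all predictors, and then prunes back using cross-validated error.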
In our data, far more patients survived than died. With such class imbalance, a classifier tends to focus on the prevalent class and ignore the rare events, and the scarcity of rare-class data leads to poor estimates of the model's accuracy. To aid estimation and accuracy evaluation of a binary classifier in the presence of a rare class, we generated artificially balanced samples using a smoothed bootstrap approach implemented in the R package ROSE [27,28]. All analyses were performed using R version 4.0.
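The idea behind a smoothed bootstrap for class balancing can be sketched as follows. This is a deliberately simplified stand-in for what the ROSE package does: it uses a single numeric predictor, a Gaussian kernel with a fixed bandwidth (an assumption; ROSE chooses its smoothing automatically), and a hypothetical function name.

```python
import random


def smoothed_bootstrap(rows, n_out, bandwidth=1.0, seed=0):
    """Generate a roughly balanced synthetic sample.

    rows: list of (x, y) pairs with numeric x and binary y.
    Each synthetic row picks a class with probability 1/2, picks an observed
    row of that class at random, and draws x from a Gaussian centred on it.
    """
    rng = random.Random(seed)
    by_class = {
        0: [x for x, y in rows if y == 0],
        1: [x for x, y in rows if y == 1],
    }
    synthetic = []
    for _ in range(n_out):
        label = rng.choice((0, 1))
        centre = rng.choice(by_class[label])
        synthetic.append((rng.gauss(centre, bandwidth), label))
    return synthetic


# Toy imbalanced data: 98 survivors and 2 deaths, invented for illustration.
rows = [(40.0, 0)] * 98 + [(75.0, 1)] * 2
balanced = smoothed_bootstrap(rows, n_out=2000)
share_deaths = sum(y for _, y in balanced) / len(balanced)
print(round(share_deaths, 2))
```

Because the tree is fitted to resampled data, the node counts reported with the tree refer back to the original data.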

Results
Our final study population included 290,488 patients, representing around 36% of total COVID-19 cases in the country during that period. A total of 6,951 deaths were recorded among the patients included in our analyses (2.4% of the study population). The characteristics of the study population are shown in Table 1. The majority of the patients were men (68%) and resided in Dhaka (54%). Of the 191,775 cases assessed by physicians, most were mild; only 16% were moderate and 1.3% were severe cases.
We observed variation in mortality risk by socioeconomic status, region of the country, and the period of the pandemic. People from low SES backgrounds (as defined by our proxy measure) had 1.76 (95% CI: 1.56, 1.98) times higher odds of mortality compared to people from higher SES backgrounds. Compared to Dhaka, patients from 3 divisions had higher odds of mortality, with the highest odds among the patients from Sylhet (OR: 1.62, 95% CI: 1.40, 1.87). Compared to the first period of the pandemic in our study, the odds ratio of mortality was elevated in the last period (May 14, 2021, to June 14, 2021), although the proportion of cases from period 3 was low (6.7%) due to the cutoff date for the available data.
Compared to people without any comorbidities, the odds of mortality were 2.34 times (95% CI: 2.04, 2.68) higher in patients with chronic kidney disease, and 2.08 (95% CI: 1.57, 2.74) times higher in patients with chronic liver disease. Patients with hypertension (high blood pressure) also had a slightly elevated risk of mortality (OR 1.13, 95% CI: 1.04, 1.23). However, the prevalence of chronic kidney and liver disease was low compared to the prevalence of hypertension, indicating a higher attributable risk of mortality due to hypertension (Fig 2). Diabetes did not have a significant association with mortality after adjustment for other comorbidities and presenting symptoms. Mortality risk was also not elevated among patients with chronic respiratory illness (asthma and COPD). Among the presenting symptoms, breathing difficulty, fever, diarrhea, and body ache were significantly associated with mortality (Table 2, model 1). The risk of mortality increased with  Table 2.
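The point about prevalence and attributable risk can be illustrated with Levin's formula for the population attributable fraction. This is one common approach and not necessarily how Fig 2 was produced. The prevalences below are hypothetical placeholders, not values from the study; the odds ratios are those reported above and are used as approximations of risk ratios.

```python
def attributable_fraction(prevalence, relative_risk):
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1))."""
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)


# Hypothetical prevalences (assumed for illustration only).
hypertension = attributable_fraction(prevalence=0.15, relative_risk=1.13)
kidney_disease = attributable_fraction(prevalence=0.01, relative_risk=2.34)

# A common condition with a modest odds ratio can account for more deaths
# than a rare condition with a large one.
print(round(hypertension, 4), round(kidney_disease, 4))
```

Using adjusted odds ratios in this formula is itself an approximation, so the output should be read as a qualitative illustration only.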
We also estimated the association between COVID-19 mortality and the patient's mental health status and assessment of their health (Table 3, model 2) and the physician's assessment of the patient's health (Table 4, model 3) while controlling for age and gender. Patients who reported that their health was improving and reported having enough sleep had lower odds of mortality; patients whose mental health state was considered normal by physicians also had lower odds of mortality (Table 3). The physician's assessment of a patient's health status was highly predictive of mortality; patients who were assessed by a physician as having a severe episode of COVID-19 based on the telehealth interview had 12.43 (95% CI: 11.04, 13.99) times higher odds of mortality compared to those assessed to have a mild episode (Table 4).
Finally, Fig 3 shows the classification tree for death among COVID-19 patients using age, gender, pre-existing conditions, and COVID-19 symptoms. In our main model (excluding physician's assessment of the patient's condition), age and the presence of breathing difficulty played an important role, appearing in numerous nodes across the tree. For example, older patients over the age of 55 with breathing difficulties showed high risk of death (13% mortality rate within the group), while younger people below the age of 55 and without breathing difficulties and without pre-existing conditions showed very low risk (1% mortality rate). As expected, based on the results of model 3 above, the physician's health assessment appears important when included in the model (S2 Fig). For example, older patients over the age of 55   but without any breathing difficulties had a high risk of mortality if the physician's assessment was severe (19% mortality rate). Overall, these results highlight risk factors that are most pertinent in the context of clinical decision making and triaging resources towards patients most at risk of mortality.

Discussion
We conducted a prospective cohort study to examine the risk factors of COVID-19 mortality among 290,488 PCR-confirmed patients from Bangladesh. Results of our analyses show that the risk of dying from COVID-19 among infected patients in Bangladesh was associated with several demographic, socioeconomic, and clinical risk factors. Males, the very young and the elderly, patients from low SES backgrounds, patients with chronic kidney and liver disease, and patients with symptoms indicating severe illness such as breathing difficulty, fever, and diarrhea had a higher risk of mortality compared to other COVID-19 patients. Additionally, we show that a physician's assessment of a patient's health status, based on the telehealth interview, was highly predictive of eventual mortality due to COVID-19. Based on the association of risk factors with mortality in the Bangladeshi population, we developed a decision tree to support clinical decision making and referral of COVID-19 patients.
Our findings confirm the universality of certain COVID-19 risk factors, such as gender and age [29], while highlighting other risk factors that appear to be more (or less) relevant in the Bangladeshi context. In line with previous studies, we find a higher risk of mortality among males [30,31]. Our finding of a monotonic increase of mortality risk with increasing age is also consistent with prior reports [32][33][34]. Interestingly, the odds of mortality were relatively high below age 65 (the most common cutoff for determining "high-risk" status for vaccine prioritization, etc.), suggesting that in the LMIC context, the mortality risk may start to increase sharply at an earlier age than observed in high-income settings. We also observed elevated odds of mortality among very young children compared to young adults, which may be due to inflammatory multisystem conditions [35,36]. The higher mortality among very young children, however, could also be explained by limited testing in this age group, resulting in the detection of only those cases with severe symptoms.
We observed a significantly elevated risk of mortality associated with chronic renal disease, chronic liver disease, and hypertension, but not diabetes and chronic respiratory illness. Meta-analyses of prior studies have reported a similar magnitude of risk elevation for hypertension and a larger magnitude of elevation for chronic renal disease and liver disease [37]. Consistent with prior studies, we also found risk elevation among patients with cardiovascular disease and cancer, which we grouped into "other" preexisting conditions [38].
Fig 3. Classification and regression tree for mortality among COVID-19 patients using age, gender, comorbidities, and presenting symptoms as predictors. N represents the number of patients in the original data (as opposed to the resampled data used to create the tree). Blue indicates the predicted risk of mortality less than or equal to 2%; red indicates the predicted risk of mortality greater than 2% in the data.
Interestingly, we did not find a significant association of COVID-19 mortality with diabetes or chronic respiratory illnesses. The findings on the association between these comorbidities and COVID-19 mortality have been inconclusive in prior literature, with some studies showing a positive association [39]. Both diabetes and hypertension are correlated with old age; therefore, residual confounding in previous studies reporting a significant association is plausible. On the other hand, the null results observed in our data could be due to non-differential misclassification. Given the limited access to health care for chronic disease detection in Bangladesh, many COVID-19 patients included in the study may not have been aware of their diabetes and hypertension status [40].
Our study is one of a handful of studies from LMICs to report significantly higher odds of mortality among patients from low SES backgrounds. Previous work has shown that a higher risk of COVID-19 infection and mortality is associated with a higher social vulnerability index [41], income inequality [42], low education [43], immigrant status [44], and Black and Hispanic ethnicity [45], but the majority of these studies have been conducted in high-income settings. The higher mortality among people from low SES backgrounds may be caused by a combination of factors including poor nutritional status and inadequate management of comorbid conditions, delayed presentation at healthcare facilities, and lack of access to quality care [46,47]. While we do not have a direct measure of the quality of care, our observation of higher mortality in areas outside Dhaka could be indicative of a lack of access to high quality clinical care in those areas. In addition, lower mortality in Dhaka could also be explained, in part, by greater access to testing facilities leading to the detection of a greater number of mild and asymptomatic cases. The proxy for SES in our study (not having a separate bedroom or a bathroom) is a crude measure, and these findings need to be evaluated in future work with more proximal measures of SES such as income, education, and occupational categories.
We find higher odds of mortality among patients who were infected during period 4 (after 5/15/2021). This coincides with the circulation of the Delta variant in Bangladesh. Extending this analysis to cover the duration of the Delta wave, in order to assess a possible heightened mortality risk due to the Delta variant, will be a key direction for future work.
Finally, we present a decision tree highlighting the risk factors that we identified (age older than 55 years, presence of breathing difficulty, and male gender) as most pertinent in the context of clinical decision-making and triaging resources towards patients most at risk of severe COVID-19 and COVID-19 mortality in Bangladesh. Decision trees consolidate our knowledge of the factors that lead to and determine the severity of disease and translate it into clinically actionable items [48]. The decision tree we present fills the gap for a handy decision-making algorithm that can be used by first-level health workers for the initial assessments of COVID-19 patients during telehealth visits. The decision tree presented in our paper can also support clinicians in different parts of Bangladesh to make quantitatively prudent decisions to refer critical COVID-19 patients to health centers equipped to manage severe cases.
Our study has several limitations. First, although all COVID-19 patients in the country were eligible for the telehealth program, there are likely selection biases in the sample as testing was not universal and not all patients may have had access to a telephone, answered the telehealth calls, or agreed to participate in the service. However, mobile phone ownership is high in Bangladesh (178.61 million subscribers as of 2021). According to recent reports, 56% of the population has a personal mobile phone and almost every household has at least one mobile phone [49,50]. Therefore the patients eligible for this study are not likely to be substantially different from the general population. Second, we lacked data on smoking status, obesity, immunosuppression, and clinical and laboratory markers of a patient's health condition-all of which may be associated with COVID-19 severity and mortality. Third, telehealth physicians rated the severity of the patient's condition based on patient's self-reporting of symptoms and preexisting conditions. Patients' self-reports and telehealth physicians' assessments were not validated against in-person clinical assessment and therefore the accuracy of the assessment cannot be established. Finally, there is also a significant amount of missing data for the outcome (S1 Table) and several of the risk factors analyzed in this study (Table 1). It is likely that the missingness of risk factor data was associated with the severity of the disease, given severely ill and hospitalized patients were less likely to complete phone assessments and respond to follow-up calls. Therefore, the proportion of missing risk factor data among severely ill patients is likely to bias our estimates toward the null. We partially addressed the issue of missing outcome data by merging the death database records with the telehealth records to capture a much larger proportion of all confirmed COVID-19 deaths in the country over that period. 
Our results are generalizable to unvaccinated patients in Bangladesh and other low resource settings, as most of the Bangladeshi population was unvaccinated during the period of telehealth services. Although current vaccination coverage is high in Bangladesh, coverage among the vulnerable poor and elderly in rural areas is still low and the provision of boosters remains inadequate [51,52].
This study is one of the largest prospective cohort studies of COVID-19 mortality in Bangladesh, covering 36% of total COVID-19 cases in the country during the study period. These results can help guide public health and clinical decision-making during future waves of the COVID-19 pandemic in Bangladesh and other low resource settings. They are particularly relevant for settings where risk factors for mortality may differ from those commonly cited in the literature from high-income countries, and the need for targeted interventions is more acute due to resource constraints. The results of the classification tree may be helpful for rapid clinical decision-making and provide a useful model for classifying high and low-risk patients at initial screening by first level health care providers. Finally, our results show that a physician's assessment of a patient's health status during the telehealth interview was highly predictive of mortality, demonstrating the potential value of the telehealth service. Harnessing the benefits of the telehealth system and optimizing care for those most at risk of mortality are key directions for future research.
S2 Fig. Classification and regression tree for mortality among COVID-19 patients using age, gender, comorbidities, presenting symptoms, and physician's assessment of a patient's health status as predictors. N represents the number of patients in the original data (as opposed to the resampled data used to create the tree). Blue indicates the predicted risk of mortality less than or equal to 2%; red indicates the predicted risk of mortality greater than 2% in the data. (TIFF)
S1 Table. Characteristics of the study population by availability of death outcome data. (DOCX)