Case-control study on post-COVID-19 conditions reveals severe acute infection and chronic pulmonary disease as potential risk factors

Summary Post-COVID-19 conditions (long COVID) have affected many individuals, yet their risk factors are poorly understood. This retrospective analysis of 88,943 COVID-19 patients at a multi-state US health system compares phenotypes, laboratory tests, medication orders, and outcomes for 1,086 long COVID patients and their matched controls. We found that a history of chronic pulmonary disease (CPD) (odds ratio [OR]: 1.9, 95% CI: [1.5, 2.6]), migraine (OR: 2.2, [1.6, 3.1]), and fibromyalgia (OR: 2.3, [1.3, 3.8]) was more common among long COVID patients. During the acute infection phase, long COVID patients exhibited high triglycerides, low HDL cholesterol, and a high neutrophil-lymphocyte ratio, and were more likely to be hospitalized (5% vs. 1%). Our findings suggest severity of acute infection and history of CPD, migraine, chronic fatigue syndrome (CFS), or fibromyalgia as risk factors for long COVID. These results suggest that suppressing acute disease severity proactively, especially in patients at high risk, can reduce the incidence of long COVID.


Included material:
Methods S1: Additional details on previously developed methods used, related to STAR methods.
Figure S1: Reporting of any tracked phenotype as a function of time, related to Figure 1.
Figure S2: Subtypes of chronic pulmonary disorder in the long COVID and control patients, related to Table 1.
Figure S3: Lab test enrichments for matched long COVID and control groups, related to Figure 1c.
Figure S4: Lab test enrichments for matched long COVID and control patients, related to Figure 1c.
Figure S5: Comparison of medications administered or ordered for matched long COVID and control patients, related to main text results on medication use.
Figure S6: Comparison of medications administered or ordered for matched long COVID and control patients for the baseline phase, related to main text results on medication use.
Table S1: Clinical characteristics, comorbidities, and clinical outcomes of long COVID and pre-matching control population, related to STAR methods.
Table S2: List of long COVID phenotypes identified by CDC and nferX Signals, related to STAR methods and main text Figure 1.
Table S3: List of data sources for nferX Diseases collection, related to STAR methods.
Table S4: Clinical characteristics of long COVID and matched control patients, related to STAR methods.
Table S6: Comparison of medications administered or ordered during the acute, post-COVID-19, and baseline phases, related to main text results on medication use.

Methods S1: Additional details on previously developed methods used, related to STAR methods.

nferX Signals Platform
The nferX Signals application (https://research.nferx.com/dv/202011/signals/) was used to determine candidate long COVID phenotypes from publicly available literature sources. This application enables the user to search for biomedical associations in free text across more than 100 million documents from over 80K sources, including but not limited to PubMed articles, clinicaltrials.gov, patent applications, SEC filings, blogs, conferences, and news articles.

nferX Local Score
The nferX Local Score is the metric that the nferX Signals Platform uses to assess the association between two biomedical concepts in the literature. The local score measures how frequently two tokens are found within each other's local context in a particular corpus, normalized by the occurrences of those tokens in that corpus. We define the local context of a particular token as the five tokens immediately preceding and following every occurrence of that token. We additionally define the adjacency $\mathrm{adj}_{AB}$ between tokens A and B as the number of times token A is found in token B's local context, or vice versa. We calculate the pointwise mutual information $\mathrm{pmi}_{AB}$ between tokens A and B as the following:

$$\mathrm{pmi}_{AB} = \log\left(\frac{\mathrm{adj}_{AB} \cdot N_T}{N_A \cdot N_B}\right)$$

where $N_A$ is the occurrences of token A, $N_B$ is the occurrences of token B, and $N_T$ is the summed occurrences of all tokens in the corpus of interest. We then calculate the local score $S_{AB}$ between tokens A and B as the following:

$$S_{AB} = \ln(\mathrm{adj}_{AB} + 1) \cdot \frac{1}{1 + e^{-(\mathrm{pmi}_{AB} - 1.5)}}$$

For this study, we used the nferX Signals application to compute local scores between "long COVID" and ~80K potential disease phenotypes from the nferX "Diseases" collection. The disease phenotypes with the highest local scores (and therefore the strongest literature associations to "long COVID") are shown in Table S2. The sources for the nferX "Diseases" collection are described in Table S3.
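
As a rough illustration of the definitions above, the following Python sketch computes a local score on a toy token list. It is not the nferX implementation. It assumes the local score has the form ln(adj + 1) multiplied by a logistic function of (PMI - 1.5). The function names, the pair-counting reading of adjacency, the base-10 logarithm in the PMI, and the convention of returning 0 when two tokens never co-occur are all assumptions made for illustration.

```python
import math
from collections import Counter

WINDOW = 5  # tokens on each side, per the local-context definition


def adjacency(tokens, a, b, window=WINDOW):
    """Count (a, b) pairs lying within `window` tokens of each other.

    Assumption: "A found in B's local context" is read as pair counting,
    which gives the same value when a and b are swapped.
    """
    count = 0
    for i, tok in enumerate(tokens):
        if tok == b:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            count += sum(1 for j in range(lo, hi) if j != i and tokens[j] == a)
    return count


def local_score(tokens, a, b, window=WINDOW):
    """Local score: ln(adj + 1) scaled by a logistic function of PMI - 1.5."""
    counts = Counter(tokens)
    adj = adjacency(tokens, a, b, window)
    if adj == 0:
        return 0.0  # assumption: no co-occurrence means no association
    n_total = len(tokens)  # summed occurrences of all tokens
    # Assumption: base-10 logarithm; the text does not state the base.
    pmi = math.log10(adj * n_total / (counts[a] * counts[b]))
    return math.log(adj + 1) / (1 + math.exp(-(pmi - 1.5)))


tokens = "long covid causes fatigue and long covid causes brain fog".split()
print(adjacency(tokens, "covid", "fatigue"))
print(round(local_score(tokens, "covid", "fatigue"), 4))
```

In this toy corpus "covid" appears twice within five tokens of "fatigue", so the adjacency is 2. Because the logistic term is bounded by 1, the ln(adj + 1) factor limits how high a score a rarely co-occurring pair can reach.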

Phenotype Classification Using BERT
A Bidirectional Encoder Representations from Transformers (BERT)-based classification model was used to classify the sentiment for phenotypes, defined as symptoms and health conditions, mentioned in EHR clinical notes. Given a sentence that includes any phenotype, this model outputs one of the following labels: "Yes" (confirmed diagnosis), "Maybe" (possible diagnosis), "No" (ruled out the diagnosis), or "Other" (none of the above). A dataset of 18,490 manually annotated sentences extracted from EHR clinical notes containing over 250 different phenotypes was used to train the model. The classification model achieves an out-of-sample accuracy of 93.6% and precision and recall values above 95% (ref. 10).

In this study, consistent with previously published studies, we used the following criteria for counting an individual as positive for a phenotype. For each prediction, the confidence score is a number between 0 and 1 that reflects the certainty of the model that the prediction is correct, with 0 being the least certain and 1 being the most certain. For the baseline phase, an individual was counted as positive for the phenotype if they had at least one mention of the phenotype with a "Yes" label and a confidence score greater than 0.8 (a "positive sentiment"). We selected a threshold of 0.8 for the confidence score, consistent with prior studies, and validated it using manual review of a subset of model predictions. For the acute and post-COVID-19 phases, an individual was counted as positive for a phenotype only if they had a positive sentiment for the phenotype during that phase (i.e., "Yes" label and confidence score > 0.8) without any positive sentiment in the baseline phase. We term such phenotypes "new onset". We also quantified the overall prevalence of positive sentiments for any of the 64 phenotypes during 7-day intervals from 42 days before the positive PCR test to 42 days after the positive PCR test (Figure S1).

Table S1: Clinical characteristics, comorbidities, and clinical outcomes of long COVID and pre-matching control population, related to STAR methods.
For each categorical variable, the percentage of patients in each cohort is shown along with the odds ratio and corresponding 95% confidence interval. Odds ratios that are statistically significant (p-value < 0.05) are indicated with *, and those that are highly significant (p-value < 0.001) are indicated with ***.
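
To make the odds ratio columns and significance markers concrete, here is a minimal Python sketch for a single 2x2 table. The source does not state how the confidence intervals or p-values were computed, so the Wald (log odds ratio) interval and the normal-approximation p-value used here are assumptions, as are the function names and the example counts.

```python
import math


def odds_ratio_wald(a, b, c, d, z_crit=1.959964):
    """Odds ratio, Wald 95% CI, and two-sided p-value for a 2x2 table.

    a, b: cases with / without the characteristic
    c, d: controls with / without the characteristic
    All four counts must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return odds_ratio, lower, upper, p_value


def stars(p_value):
    """Significance marker following the * / *** convention in the tables."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical counts, not taken from the study.
or_, lo, hi, p = odds_ratio_wald(20, 80, 10, 90)
print(f"OR {or_:.2f} [{lo:.2f}, {hi:.2f}] {stars(p)}")
```

With these hypothetical counts the interval spans 1, so no marker is printed. An exact test could be substituted for the normal approximation when cell counts are small.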

Figure S1: Reporting of any tracked phenotype as a function of time, related to Figure 1. Data shown for the long COVID cohort (red) and their 1:1 matched controls (blue). The vertical dashed line indicates the date of the patient's positive SARS-CoV-2 PCR test.
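
The quantity plotted in Figure S1 can be sketched as follows, using the positive-sentiment rule ("Yes" label with confidence above 0.8) and the 7-day intervals spanning 42 days either side of the PCR test described in Methods S1. The record layout, function names, and the use of half-open intervals are assumptions made for illustration, not the authors' code.

```python
from collections import defaultdict

CONF_THRESHOLD = 0.8
WINDOW_DAYS = 42  # days before and after the positive PCR test
BIN_DAYS = 7


def is_positive(label, confidence):
    """A 'positive sentiment': 'Yes' label with confidence above 0.8."""
    return label == "Yes" and confidence > CONF_THRESHOLD


def prevalence_by_interval(mentions, n_patients):
    """Fraction of patients with a positive sentiment in each 7-day interval.

    mentions: iterable of (patient_id, day_offset, label, confidence), where
    day_offset is days relative to the positive PCR test (hypothetical layout).
    Assumption: intervals are half-open, [start, start + 7).
    """
    patients_per_bin = defaultdict(set)
    for patient_id, day, label, confidence in mentions:
        if -WINDOW_DAYS <= day < WINDOW_DAYS and is_positive(label, confidence):
            start = ((day + WINDOW_DAYS) // BIN_DAYS) * BIN_DAYS - WINDOW_DAYS
            patients_per_bin[start].add(patient_id)
    return {
        start: len(patients_per_bin[start]) / n_patients
        for start in range(-WINDOW_DAYS, WINDOW_DAYS, BIN_DAYS)
    }


# Hypothetical mentions for a four-patient cohort.
mentions = [
    ("p1", -3, "Yes", 0.95),
    ("p1", -1, "Yes", 0.90),   # same patient and interval: counted once
    ("p2", 0, "Yes", 0.85),
    ("p2", 3, "Maybe", 0.99),  # not a "Yes" label
    ("p3", 10, "Yes", 0.80),   # confidence not above the threshold
    ("p3", 40, "Yes", 0.81),
]
prevalence = prevalence_by_interval(mentions, n_patients=4)
print(prevalence)
```

Collecting patient identifiers in a set per interval means a patient with several qualifying mentions in the same week contributes only once to that week's prevalence.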

Figure S2: Subtypes of chronic pulmonary disorder in the long COVID and control patients, related to Table 1.

Figure S3: Lab test enrichments for matched long COVID and control groups, related to Figure 1c. For each lab test, mean test values for the long COVID cohort were compared to those of the control group (see Methods). The error bars represent 95% confidence intervals, calculated by bootstrap resampling (1000 samples). The normal ranges for these lab tests (refs. 28-31) are shaded in gray. Fifteen lab tests shown here are significantly different (Mann-Whitney U test, p-value < 0.05) between the long COVID and the control patients in the acute COVID-19 phase and also significantly different between the long COVID cohorts in the baseline and acute COVID-19 phases. (a) Lab tests indicating infection and tissue damage; (b) lab tests indicating risk of organ failure.
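
The error bars described in this caption can be illustrated with a short bootstrap sketch. The caption specifies only that 1000 resamples were used, so the percentile method, the fixed seed, the function name, and the example values below are assumptions.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean.

    Assumption: the percentile method; the source states only that
    1000 bootstrap samples were drawn.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[round(alpha / 2 * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical lab values, not taken from the study.
lab_values = [float(v) for v in range(1, 51)]
low, high = bootstrap_mean_ci(lab_values)
print(f"mean {statistics.fmean(lab_values):.1f}, 95% CI [{low:.1f}, {high:.1f}]")
```

Resampling makes no distributional assumption about the lab values, which is convenient for skewed measurements.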

Figure S4: Lab test enrichments for matched long COVID and control patients, related to Figure 1c. For each lab test, the distribution of mean individual test values for the long COVID cohort was compared to that of the control patients (see Methods). The error bars represent 95% confidence intervals, calculated by bootstrap resampling (1000 samples). The normal ranges for these lab tests (refs. 28-31) are shaded in gray. Shown here are four of the 15 lab tests significantly enriched in the long COVID cohort (see Methods), namely those with mean test values outside the normal range.

Figure S5: Comparison of medications administered or ordered for matched long COVID and control patients, related to main text results on medication use. Medications administered or ordered for the matched long COVID and control patients during (A) the acute COVID-19 phase and (B) the post-COVID-19 phase.

Figure S6: Comparison of medications administered or ordered for matched long COVID and control patients for the baseline phase, related to main text results on medication use.

Table S2: List of long COVID phenotypes identified by CDC and nferX Signals, related to STAR methods and main text Figure 1.
In the first two columns, the phenotype names are shown along with the data source (e.g., CDC, Signals, or both). In the third column, the nferX Local Score is shown, which is a measure of the strength of the association between that phenotype and long COVID in the biomedical literature. Phenotypes with the highest local score values are most strongly associated with long COVID in the literature.

Table S4: Clinical characteristics of long COVID and matched control patients, related to STAR methods.
For each categorical variable, the percentage of patients in each cohort is shown. During the matching procedure, each of the categorical variables was matched exactly, so the distributions are exactly the same for the two cohorts. The numeric variables (age and number of encounters) were bucket matched, so there may be slight differences in these covariates between the two cohorts.
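
The difference between exact and bucket matching can be sketched in a few lines of Python. This is an illustration only: the categorical covariates, the bucket widths, the record layout, and the greedy 1:1 assignment without replacement are hypothetical, since the source does not specify them here.

```python
from collections import defaultdict

# Hypothetical settings; the actual covariates and bucket widths are not
# given in this supplement.
CATEGORICAL = ("sex", "race")
AGE_BUCKET = 5         # years
ENCOUNTER_BUCKET = 10  # number of encounters


def match_key(patient):
    """Exact values for categorical fields, bucket indices for numeric ones."""
    return (
        tuple(patient[name] for name in CATEGORICAL),
        patient["age"] // AGE_BUCKET,
        patient["encounters"] // ENCOUNTER_BUCKET,
    )


def match_one_to_one(cases, controls):
    """Greedy 1:1 matching without replacement on the combined key."""
    pool = defaultdict(list)
    for control in controls:
        pool[match_key(control)].append(control)
    pairs = []
    for case in cases:
        candidates = pool.get(match_key(case))
        if candidates:  # cases with no remaining candidate stay unmatched
            pairs.append((case, candidates.pop()))
    return pairs


cases = [
    {"id": "c1", "sex": "F", "race": "White", "age": 43, "encounters": 12},
    {"id": "c2", "sex": "F", "race": "White", "age": 44, "encounters": 15},
]
controls = [
    {"id": "k1", "sex": "M", "race": "White", "age": 43, "encounters": 12},
    {"id": "k2", "sex": "F", "race": "White", "age": 41, "encounters": 18},
]
pairs = match_one_to_one(cases, controls)
print([(case["id"], control["id"]) for case, control in pairs])
```

Here c1 (age 43, 12 encounters) pairs with k2 (age 41, 18 encounters): the categorical fields agree exactly, while the numeric fields only share a bucket. This is why matched cohorts can show identical categorical distributions but slightly different numeric ones.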

Table S6: Comparison of medications administered or ordered during the acute, post-COVID-19, and baseline phases, related to main text results on medication use.
For each drug, the number of patients in each cohort is shown along with the odds ratio and corresponding 95% confidence interval. Odds ratios that are statistically significant (p-value < 0.05) are indicated with *, and those that are highly significant (p-value < 0.001) are indicated with ***.