Investigating phenotypes of pulmonary COVID-19 recovery: A longitudinal observational prospective multicenter trial

Background: The optimal procedures to prevent, identify, monitor, and treat long-term pulmonary sequelae of COVID-19 are elusive. Here, we characterized the kinetics of respiratory and symptom recovery following COVID-19. Methods: We conducted a longitudinal, multicenter observational study in ambulatory and hospitalized COVID-19 patients recruited in early 2020 (n = 145). Pulmonary computed tomography (CT) and lung function (LF) readouts, symptom prevalence, and clinical and laboratory parameters were collected during acute COVID-19 and at the 60-, 100-, and 180-day follow-up visits. Recovery kinetics and risk factors were investigated by logistic regression. Classification of clinical features and participants was accomplished by unsupervised and semi-supervised multiparameter clustering and machine learning. Results: At the 6-month follow-up, 49% of participants reported persistent symptoms. The frequency of structural lung CT abnormalities ranged from 18% in the mild outpatient cases to 76% in the intensive care unit (ICU) convalescents. Prevalence of impaired LF ranged from 14% in the mild outpatient cases to 50% in the ICU survivors. Incomplete radiological lung recovery was associated with increased anti-S1/S2 antibody titer, IL-6, and CRP levels at the early follow-up. We demonstrated that the risk of perturbed pulmonary recovery could be robustly estimated at early follow-up by clustering and machine learning classifiers employing solely non-CT and non-LF parameters. Conclusions: The severity of acute COVID-19 and protracted systemic inflammation are strongly linked to persistent structural and functional lung abnormality. Automated screening of multiparameter health record data may assist in the prediction of incomplete pulmonary recovery and optimize COVID-19 follow-up management. Funding: The State of Tyrol (GZ 71934), Boehringer Ingelheim/Investigator initiated study (IIS 1199-0424). Clinical trial number: ClinicalTrials.gov: NCT04416100

We herein prospectively analyzed the prevalence of nonresolving structural and functional lung abnormalities and persistent COVID-19-related symptoms 6 months after diagnosis. Using univariate risk modeling as well as multiparameter clustering and machine learning (ML), we investigated sets of risk factors and tested the operability of ML classifiers at predicting protracted lung and symptom recovery. The classification and prediction procedures were implemented in an open-source risk assessment tool (https://im2-ibk.shinyapps.io/CovILD/).

Study design
The CovILD ('Development of interstitial lung disease in COVID-19') multicenter, longitudinal observational study was initiated in April 2020. Adult residents of Tyrol, Austria, with symptomatic, PCR-confirmed SARS-CoV-2 infection (WHO, 2021) were enrolled by the Department of Internal Medicine II at the Medical University of Innsbruck (primary follow-up center), St. Vinzenz Hospital in Zams, and the acute rehabilitation facility in Münster (Table 1). The participants were diagnosed with COVID-19 between 3 March and 29 June 2020. In the course of the study, including the 2020 SARS-CoV-2 outbreak and follow-up visits, the regional health system was able to guarantee an unrestricted, optimal standard of diagnostics and care for all participants. Corticosteroids were not standard of care during the recruitment period of the study and thus were not administered as therapy for acute COVID-19. Some participants with nonresolving pneumonia received systemic steroids beginning from week 4 post diagnosis at the discretion of the physician (Table 2). The analysis endpoints were the presence of any, mild (severity score ≤ 5), and moderate-to-severe (severity score > 5) lung computed tomography (CT) abnormalities, impaired lung function (LF), and persistent COVID-19 symptoms at the 180-day follow-up visit (Table 3).
In total, 190 COVID-19 patients were screened for participation. Of these, 18 refused to give informed consent and 27 declared difficulties in attending the study follow-ups. Data of n = 145 participants were eligible for analysis (Figure 1). All participants gave written informed consent. The study was approved by the Institutional Review Board at the Medical University of Innsbruck (approval number: 1103/2020) and registered at ClinicalTrials.gov (NCT04416100).

Procedures
We retrospectively assessed patient characteristics during acute COVID-19 and performed follow-up investigations at 60 days (63 ± 23 days [mean ± SD]; visit 1), 100 days (103 ± 21 days; visit 2), and 180 days (190 ± 15 days; visit 3) after diagnosis of COVID-19. Each visit included symptom and physical performance assessment with a standardized questionnaire, LF testing, standard laboratory testing, and a CT scan of the chest. The variables available for analysis with their stratification schemes are listed in Appendix 1-table 1.

Variable overlap, kinetics, and risk modeling
Overlap between the 180-day follow-up outcome features was assessed by analysis of quasi-proportional Venn plots (package nVennR) (Pérez-Silva et al., 2018) and calculation of the Cohen's κ statistic (package vcd) (Fleiss et al., 1969). Kinetics of binary outcome variables in participant subsets with the complete longitudinal data record were modeled with mixed-effect logistic regression (random effect: individual, fixed effect: time, packages lme4 [Bates et al., 2015] and lmerTest [Kuznetsova et al., 2017]). Analyses in the severity groups were done with separate models. Significance was assessed by the likelihood ratio test (LRT) against the random-term-only model. Univariate risk modeling was performed with fixed-effect logistic regression (Appendix 1-table 2). Odds ratio (OR) significance was determined by Wald Z test. In-house-developed linear modeling wrappers around base R tools are available at https://github.com/PiotrTymoszuk/lmqc.
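The two statistics named above can be illustrated with a minimal sketch. It is written in Python rather than the R packages used in the study, the function names are hypothetical, and the counts are invented for illustration only. It relies on the fact that, for a single binary explanatory variable, the OR from a univariate logistic model equals the cross-product ratio of the 2 × 2 table and the Wald standard error of log(OR) is the square root of the summed reciprocal cell counts.

```python
import math


def cohen_kappa(both, only_a, only_b, neither):
    """Cohen's kappa for two binary features from 2 x 2 agreement counts."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n            # observed agreement
    p_a = (both + only_a) / n                  # prevalence of feature A
    p_b = (both + only_b) / n                  # prevalence of feature B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    return (observed - expected) / (1 - expected)


def odds_ratio_wald(exp_event, exp_no_event, unexp_event, unexp_no_event):
    """OR, 95% CI, Wald Z, and two-sided p-value from a 2 x 2 table.

    All four cells must be non-zero.
    """
    log_or = math.log((exp_event * unexp_no_event) / (exp_no_event * unexp_event))
    se = math.sqrt(1 / exp_event + 1 / exp_no_event
                   + 1 / unexp_event + 1 / unexp_no_event)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return math.exp(log_or), ci, z, p_value


# Invented counts, not study data
kappa = cohen_kappa(both=20, only_a=5, only_b=10, neither=15)
odds_ratio, ci, z, p_value = odds_ratio_wald(10, 5, 5, 10)
print(f"kappa = {kappa:.2f}")
print(f"OR = {odds_ratio:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, p = {p_value:.3f}")
```

The sketch omits the multiple-testing correction and the mixed-effect models; those would need a statistics library.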

Cluster analysis
Clustering of non-CT and non-LF binary clinical features (Appendix 1-table 1) was accomplished with the PAM algorithm (partitioning around medoids, package cluster) (Amato et al., 2019) and simple matching distance (SMD, package nomclust) (Boriah et al., 2008). Association analysis for the participants was performed with a combined procedure involving clustering of the observations by the self-organizing map algorithm (SOM, 4 × 4 hexagonal grid, SMD distance, kohonen package), followed by clustering of the SOM nodes by the Ward.D2 hierarchical clustering algorithm (Euclidean distance, hclust() function, package stats) (Vesanto and Alhoniemi, 2000; Kohonen, 1995; Wehrens and Kruisselbrink, 2018). Clustering analyses were performed in the participant subset with the complete set of clustering variables. The selection of the optimal clustering algorithm was motivated by the highest ratio of between-cluster to total variance and the best stability measured by mean classification error in 20-fold cross-validation (CV) (Lange et al., 2004). The optimal cluster number was determined by the bend of the within-cluster sum-of-squares curve (function fviz_nbclust(), package factoextra) and by the stability in 20-fold CV (Figure 6-figure supplement 1C and D, Figure 7-figure supplement 1D and F; Lange et al., 2004; Wang, 2010), as well as by a visual inspection of the SOM node clustering dendrograms (Figure 7-figure supplement 1E). Assignment of 180-day follow-up outcome features to the clusters of clinical parameters was accomplished with a k-nearest neighbor (kNN) label propagation algorithm (Appendix 1-table 3; Sahanic et al., 2021; Leng et al., 2013). Cluster assignment visualization in a four-dimensional principal component analysis score plot was done with the PCAproj() tool (package pcaPP) (Croux et al., 2007).
To determine the importance of particular clustering variables, the variance (between-cluster to total variance ratio) between the initial cluster structure and the structure with random resampling of the variable was compared, as initially proposed for the random forests ML classifier (Breiman, 2001). Frequencies of the outcome events in the participant clusters were compared with the χ² test. In-house-developed association analysis wrappers are available at https://github.com/PiotrTymoszuk/clustering-tools-2.
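The permutation idea behind the variable importance measure can be sketched as follows. This is a simplified Python illustration, not the authors' R implementation. It assumes that the cluster labels are kept fixed while one variable is shuffled, and that variance is measured as Euclidean sums of squares on 0/1 data; the study may instead have re-clustered the resampled data. Function names and the toy data are hypothetical.

```python
import random


def explained_variance(rows, labels):
    """Between-cluster to total sum-of-squares ratio for numeric (0/1) rows."""
    n_features = len(rows[0])
    grand = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    total = sum((r[j] - grand[j]) ** 2 for r in rows for j in range(n_features))
    within = 0.0
    for cluster in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == cluster]
        center = [sum(r[j] for r in members) / len(members)
                  for j in range(n_features)]
        within += sum((r[j] - center[j]) ** 2
                      for r in members for j in range(n_features))
    return (total - within) / total


def permutation_importance(rows, labels, feature, repeats=100, seed=1):
    """Mean drop in explained variance after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = explained_variance(rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)  # break the link between the feature and the clusters
        shuffled = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        drops.append(baseline - explained_variance(shuffled, labels))
    return sum(drops) / repeats


# Toy data (invented): feature 0 separates the clusters, feature 2 is constant
rows = [[1, 1, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1],
        [0, 0, 1], [0, 0, 1], [0, 1, 1], [0, 0, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
for j in range(3):
    print(j, round(permutation_importance(rows, labels, j), 3))
```

A variable whose shuffling barely changes the variance ratio contributes little to the cluster structure, and a large drop marks an influential variable.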

Pulmonary recovery assessment app
Participant clustering and ML classifiers trained in the CovILD cohort were implemented in an open-source online pulmonary assessment R shiny app (https://im2-ibk.shinyapps.io/CovILD/; code: https://github.com/PiotrTymoszuk/COVILD-recovery-assessment-app). Prediction of the cluster assignment based on the user-provided patient data is done by the kNN label propagation algorithm (Leng et al., 2013).
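How a new record could be assigned to an existing cluster by nearest neighbors under the simple matching distance is sketched below. This is an illustrative Python example, not the code of the app. The function names, the choice of k = 3, the tie handling, and the toy vectors are assumptions made for the example.

```python
from collections import Counter


def simple_matching_distance(a, b):
    """Share of positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_assign(new_vector, labeled_vectors, labels, k=3):
    """Majority cluster label among the k nearest labeled vectors (SMD)."""
    ranked = sorted(
        range(len(labeled_vectors)),
        key=lambda i: simple_matching_distance(new_vector, labeled_vectors[i]),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    # On a tied vote, the label met first among the nearest neighbors wins
    return votes.most_common(1)[0][0]


# Toy binary records with known cluster labels (invented, not study data)
labeled = [[1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 0, 0],
           [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
labels = ["LR", "LR", "LR", "HR", "HR", "HR"]

print(knn_assign([1, 1, 0, 0], labeled, labels))
print(knn_assign([0, 0, 1, 1], labeled, labels))
```

In practice, the number of neighbors would be tuned, for example by cross-validation.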

Patient characteristics
The CovILD study participants (n = 145) were predominantly male (57.8%), with ages ranging between 19 and 87 years. 77.2% of participants displayed preexisting comorbidity, predominantly cardiovascular and metabolic disease. The cohort included mild (outpatient care, 24.8%), moderate (hospitalization without oxygen supply, 25.5%), severe (hospitalization with oxygen supply, 27.6%), and critical (intensive care unit [ICU] treatment, 22.1%) cases of acute COVID-19 (Table 1). The majority of hospitalized participants received anti-infectives during acute COVID-19; anticoagulative and/or antiplatelet treatment was introduced primarily in the ventilated patients. Systemic steroid administration was initiated at the discretion of the physician beginning from week 4 after diagnosis (Table 2).

Clinical recovery after COVID-19
Most patients, irrespective of the acute COVID-19 severity, showed a significant resolution of disease symptoms over time (Figure 1, Figure 2A). Persistent complaints at the 6-month follow-up were reported by 49% of the study subjects (Table 3), with self-reported impaired physical performance (34.7%), sleep disorders (27.1%), and exertional dyspnea (22.8%) as leading manifestations. The frequency of all investigated symptoms declined significantly, even though the pace of their resolution was remarkably slower in the late (100- and 180-day follow-ups) than in the early recovery phase (acute COVID-19 until the 60-day follow-up) (Figure 2B). Impaired LF was observed in 33.6% of the participants at the 6-month follow-up (Table 3). Except for the critical COVID-19 survivors (60 days: 66.7%; 180 days post-COVID-19: 50%), no significant reduction in the frequency of LF impairment over time was observed (Figure 3). At the 6-month follow-up, structural lung abnormalities were found in 48.5% of patients and moderate-to-severe radiological lung alterations (CT severity score > 5) were present in 19.4% of participants (Table 3). The majority of the participants with impaired LF displayed radiological lung findings. However, a substantial fraction of CT abnormalities, especially mild ones, were accompanied neither by persistent symptoms nor by LF deficits. The frequency, scoring, and recovery of CT lung findings were related to the severity of acute infection. Pulmonary lesions scored > 5 CT severity points at the 180-day follow-up were most frequent in the individuals with severe and critical acute COVID-19 (Figure 3-figure supplement 3). Notably, the hospitalized group with oxygen therapy demonstrated the fastest recovery kinetics. As for the symptom resolution, LF and CT lung recovery decelerated in the late phase of COVID-19 convalescence (Figure 3).

Risk factors of protracted recovery
To identify risk factors of delayed recovery at the 6-month follow-up, we screened a set of 52 binary clinical parameters (Appendix 1-table 2). By this means, no significant correlates for long-term symptom persistence were identified. Risk factors and readouts of severe and critical COVID-19 including multimorbidity, malignancy, male sex, prolonged hospitalization, ICU stay, and immunosuppressive therapy were significantly associated with persistent CT (Figure 4) and LF abnormalities (Figure 5). Persistently elevated inflammatory markers, IL-6 (>7 ng/L) and CRP (>0.5 mg/L), were strong unfavorable risk factors for incomplete radiological and functional pulmonary recovery. Additionally, the biochemical readout of microvascular inflammation, D-dimer (>500 pg/mL), was significantly linked to LF deficits. Low serum anti-S1/S2 IgG titers at the 60-day follow-up and ambulatory acute COVID-19 correlated with an improved pulmonary recovery (Figures 4 and 5).

Clusters of clinical features linked to persistent symptoms and lung abnormalities
Employing the unsupervised PAM algorithm (Amato et al., 2019), three clusters of co-occurring non-CT and non-LF clinical features of acute COVID-19 and early convalescence were identified (Appendix 1-[…]). The 6-month follow-up outcome variables were incorporated in the cluster structure using kNN prediction (Leng et al., 2013). Long-term symptom persistence was associated with acute and long-lasting COVID-19 symptoms in cluster 3, whereas pulmonary outcome parameters were grouped with cluster 2 features (Figure 6A, Figure 6-figure supplement 2, Appendix 1-[…]). […] were found to be the closest cluster neighbors of mild CT abnormalities (severity score ≤ 5). Moderate-to-severe structural alterations (severity score > 5) and LF deficits were, in turn, tightly linked to markers of protracted systemic inflammation (IL-6, CRP, anemia of inflammation) (Sonnweber et al., 2020; Figure 6B).

Figure 3. Kinetics of pulmonary recovery. Recovery from any lung computed tomography (CT) abnormalities, moderate-to-severe lung CT abnormalities (severity score > 5), and recovery from functional lung impairment were investigated in the participants stratified by acute COVID-19 severity by mixed-effect logistic modeling (random effect: individual; fixed effect: time). Significance was determined by the likelihood ratio test corrected for multiple testing with the Benjamini-Hochberg method. Frequencies of the given abnormality at the indicated time points are presented, and p-values and the numbers of complete observations are indicated in the plots.

Figure 4. Risk factors of persistent radiological lung abnormalities. Association of 52 binary explanatory variables (Appendix 1-table 1) with the presence of any lung computed tomography (CT) abnormalities (A) or moderate-to-severe lung CT abnormalities (severity score > 5) (B) at the 180-day follow-up visit was investigated with a series of univariate logistic models (Appendix 1-table 2). Odds ratio (OR) significance was determined by Wald Z test and corrected for multiple testing with the Benjamini-Hochberg method. ORs with 95% confidence intervals for significant favorable and unfavorable factors are presented in forest plots. Model baseline (ref) and numbers of complete observations are presented in the plot axis text. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; ICU: intensive care unit.

Figure 5 (caption, partial). […] Odds ratio (OR) significance was determined by Wald Z test and corrected for multiple testing with the Benjamini-Hochberg method. ORs with 95% confidence intervals for the significant favorable and unfavorable factors are presented in a forest plot. Model baseline (ref) and numbers of complete observations are presented in the plot axis text. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; CKD: chronic kidney disease.

Figure 6. Association of incomplete symptom, lung function, and radiological lung recovery with demographic and clinical parameters of acute COVID-19 and early recovery. Clustering of 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the early 60-day follow-up visit (Appendix 1-table 1) was investigated by partitioning around medoids (PAM) algorithm with simple matching distance (SMD) dissimilarity measure (Figure 6-figure supplement 1, Appendix 1-table 3). The cluster assignment for the outcome variables at the 180-day follow-up visit (persistent symptoms, functional lung impairment, mild lung CT abnormalities [severity score ≤ 5] and moderate-to-severe lung CT abnormalities [severity score > 5]) was predicted by k-nearest neighbor (kNN) label propagation procedure. Numbers of complete observations and numbers of features in the clusters are indicated in (A). (A) Cluster assignment of the outcome variables (diamonds) presented in the plot of principal component (PC) scores. The first two major PCs are displayed. The explanatory variables are visualized as points. Percentages of the data set variance associated with the PC are presented in the plot axes. (B) Five nearest neighbors (lowest SMD) of the outcome variables presented in radial plots. Font size, point radius, and color code for SMD values. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; GITD: gastrointestinal disease; CKD: chronic kidney disease; ICU: intensive care unit; COPD: chronic obstructive pulmonary disease.

Risk stratification for perturbed pulmonary recovery by unsupervised clustering
Next, we tested whether subsets of patients at risk of an incomplete 6-month recovery may be defined by a similar clustering procedure employing exclusively non-CT and non-LF clinical variables (Appendix 1-table 1). Applying a combined SOM and hierarchical clustering approach, three clusters of the study participants were identified (Figure 7, Figure 7-figure supplement 1; Vesanto and Alhoniemi, 2000; Kohonen, 1995). Prolonged hospitalization, anti-infective therapy, overweight or obesity, pain during acute COVID-19, and low anti-S1/S2 titers at the 60-day follow-up were found to be the most influential clustering features (Figure 7-figure supplement 2; Breiman, 2001). The patient subsets identified by the SOM approach differed significantly in frequency of radiological lung abnormalities and substantially, yet not significantly, in the frequency of LF impairment at the 180-day follow-up. In particular, most of the individuals assigned to the largest, low-risk (LR) subset were CT and LF abnormality-free. The frequency and severity of radiological pulmonary findings were elevated in the smallest intermediate-risk subset (IR) and peaked in the high-risk (HR) group (Figure 8A). Despite a comparable frequency of long-term symptoms between the LR, IR, and HR subsets (Figure 8A), the HR collective showed the lowest prevalence of dyspnea, cough, night sweating, pain, gastrointestinal manifestations, and complete absence of hyposmia at the 180-day follow-up (Figure 8B). Although the LR subset primarily comprised mild COVID-19 cases and the HR subset ICU survivors, the cluster assignment (IR vs. LR, HR vs. LR) remained an independent correlate of persistent CT and LF abnormalities after adjustment for the acute COVID-19 severity.

Figure 7. Clustering of the study participants by non-lung function and non-computed tomography (non-CT) clinical features. Study participants (n = 133 with the complete variable set) were clustered with respect to 52 non-CT and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1-table 1) using a combined self-organizing map (SOM: simple matching distance) and hierarchical clustering (Ward.D2 method, Euclidean distance) procedure […]

All tested ML algorithms and ensembles demonstrated good accuracy (area under the curve [AUC] > 0.78) and sensitivity (>0.84) at predicting any lung CT abnormalities at the 6-month follow-up in the study cohort serving as a training data set. Their efficiency in CV was moderate (AUC: 0.69-0.81; sensitivity: 0.69-0.78) (Figure 9, Figure 9-figure supplement 3, Appendix 1-table 5). In turn, moderate-to-severe structural lung findings were recognized with markedly lower sensitivity both in the training data set (>0.43) and the CV (0.39-0.48). Even though impaired LF and persistent symptoms were common at the 6-month follow-up in the training data set (Figures 2 and 3), nearly half of the cases were not identified by any of the tested ML algorithms and their ensembles in the CV setting (Figure 9, Figure 9-figure supplement 3, Appendix 1-table 5). The sensitivity of the ensembles and single classifiers at predicting CT and LF abnormalities was substantially better in severe and critical COVID-19 survivors than in ambulatory and moderate cases (Figure 10, Appendix 1-table 6).
The most important explanatory variables for pulmonary abnormalities by three unrelated classifiers (C5.0, RF, and glmNet) included preexisting malignancy, multimorbidity, markers of systemic inflammation (IL-6 and CRP), and anti-S1/S2 antibody levels at the 60-day follow-up.
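The performance measures reported for the ML classifiers (sensitivity, specificity, and AUC) can be made concrete with a minimal Python sketch. The function names, the 0.5 decision threshold, and the toy outcome vector and classifier scores are invented for illustration and are not taken from the study; the AUC is computed by its rank interpretation, that is, as the probability that a randomly chosen case scores higher than a randomly chosen non-case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary truth and binary predictions."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = abnormality present at follow-up
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]  # hypothetical classifier output
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, "
      f"AUC = {auc(y_true, scores):.4f}")
```

A low sensitivity at an acceptable AUC, as reported for the moderate-to-severe CT abnormalities, means that many true cases fall below the decision threshold even though cases tend to be ranked above non-cases.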

Discussion
Herein, we prospectively evaluated trajectories of COVID-19 recovery in an observational cohort enrolled in the Austrian CovILD study. Despite the resolution of symptoms and pulmonary abnormalities at the 6-month follow-up in a large fraction of the study participants, the recovery pace was substantially slower in the late convalescence when compared with the first three months after diagnosis (Huang et al., 2021a). Persistent symptoms and CT findings were detected in more than 40% and reduced LF in approximately one-third of the cohort, which is in line with recovery kinetics and signs of lung lesion chronicity reported by others (Caruso et al., 2021; Huang et al., 2021b; Huang et al., 2021a; Faverio et al., 2021; Hellemons et al., 2021; Zhou et al., 2021). By comparison, similar protracted pulmonary recovery was reported for SARS (Hui et al., 2005; Ng et al., 2004; Ngai et al., 2010; Lam et al., 2009) and non-COVID-19 acute respiratory distress syndrome (Wilcox et al., 2013; Masclans et al., 2011). Of note, treatment approaches for hospitalized patients in our cohort and similar cohorts recruited at the pandemic onset in early 2020 (Caruso et al., 2021; Huang et al., 2021b; Huang et al., 2021a; Faverio et al., 2021; Hellemons et al., 2021) differ significantly from the current standard of care for acute COVID-19, which includes early systemic steroid use and antiviral and various immunomodulatory medications. How improved standardized therapy and anti-SARS-CoV-2 vaccination affect the clinical and pulmonary recovery needs to be investigated.
In roughly half of our study participants with abnormal lung CT findings, and especially in those with low-grade structural abnormalities, no overt LF impairment at follow-up was discerned. Still, even subclinical lung alterations may bear the potential for clinically relevant progression of interstitial lung disease (Suliman et al., 2015;Hatabu et al., 2020) requiring systematic CT and LF monitoring. Conversely, symptom persistence was weakly associated with incomplete functional or structural pulmonary recovery.
Since PASC are found in as many as 10% of COVID-19 patients (Venkatesan, 2021; Sudre et al., 2021b), robust, resource-saving tools assessing the individual risk of pulmonary complications are urgently needed (Shah et al., 2021; Raghu and Wilson, 2020). Covariates and characteristics of severe acute COVID-19 such as male sex, age, preexisting comorbidities, hospitalization, ventilation, and ICU stay were proposed as the risk factors of persistent pulmonary impairment (Caruso et al., 2021; Huang et al., 2021a; Faverio et al., 2021; Raghu and Wilson, 2020). However, their applicability in predicting complications of pulmonary recovery from mild or moderate COVID-19 is limited. Our results of univariate modeling, clustering, and ML prediction point towards a distinct long-term pulmonary risk phenotype that manifests during acute COVID-19 and early recovery and whose central components are protracted systemic (IL-6, CRP, anemia of inflammation) and microvascular inflammation (D-dimer), and strong humoral response (anti-S1/S2 IgG), in addition to demographic risk factors and comorbidities (Sonnweber et al., 2020). Hence, consecutive monitoring of systemic inflammatory parameters, analogous to concepts of interstitial lung disease in autoimmune disorders (Khanna et al., 2020), and of anti-S1/S2 antibody levels may improve identification of the individuals at risk of chronic pulmonary damage irrespective of the acute COVID-19 severity.
Clustering and ML have been employed for deep phenotyping and predicting acute and post-acute COVID-19 outcomes in multivariable data sets (Sudre et al., 2021a; Estiri et al., 2021; Demichev et al., 2021; Benito-León et al., 2021). We demonstrate that subsets of COVID-19 patients that significantly differ in the risk for long-term CT abnormalities may be defined by an easily accessible clinical parameter set available at the early post-COVID-19 assessment. This approach did not involve any CT or LF variables. Furthermore, the cluster classification correlated with the risk of long-term pulmonary abnormalities independently of the acute COVID-19 severity. Thus, these characteristics provide a useful tool for broad screening of convalescent populations, including individuals who experienced mild or moderate COVID-19.

Figure 7 (caption, continued). […] axes. (B) Presence of the most influential clustering features (Figure 7-figure supplement 2) in the participant clusters presented as a heat map. Cluster #1, #2, and #3 refer to the feature clusters defined in Figure 6. Q1, Q2, Q3, Q4: first, second, third, and fourth quartile of anti-S1/S2 IgG titer; GITD: gastrointestinal disease; CKD: chronic kidney disease; CVD: cardiovascular disease; GI: gastrointestinal; PD: pulmonary disease.
We show that technically unrelated ML classifiers and their ensemble trained without CT and LF explanatory variables can predict lung CT findings independently of their grading at the 6-month follow-up with good specificity and sensitivity in the training collective and CV. By contrast, the more specific prediction of moderate-to-severe lung CT abnormalities or risk estimation for LF deficits demonstrated a limited sensitivity. For the moderate-to-severe CT abnormalities, this can be primarily traced back to their low frequency resulting in a suboptimal classifier training, especially in CV. A substantial fraction of the participants (20.7%, n = 30) suffered from a preexisting respiratory condition (pulmonary disease, asthma, or COPD) likely paralleled by LF reduction, which possibly confounded the prediction of the post-COVID-19 LF deficits both by clustering and ML. Accumulating evidence suggests that post-acute COVID-19 symptoms are highly heterogeneous conditions with multiorgan, neurocognitive, and psychological manifestations (Evans et al., 2021; Davis et al., 2021), which may differ in risk factor constellations. This could explain why univariate modeling, clustering, and ML failed to estimate persistent symptom risk in our small study cohort. In general, the ML prediction quality may greatly benefit from a larger training data set and inclusion of additional explanatory variables such as cellular readouts of inflammation, in-depth medication, and broader acute symptom data. Nevertheless, the herein described cluster and ML classifiers represent resource-effective tools that may assist in the screening of medical record data and identification of COVID-19 patients requiring systematic CT and LF monitoring.
To facilitate the identification of patients at risk for protracted respiratory recovery and enable validation in an external collective, we implemented the clustering and prediction procedures in an open-source risk assessment application (https://im2-ibk.shinyapps.io/CovILD/).
Our study bears limitations primarily concerning the low sample size and the cross-sectional character of the trial. Because of the impaired availability of the patients and the prolonged inpatient rehabilitation, the 60- and 100-day follow-up visits in part showed a temporal overlap that may have impacted the accuracy of the longitudinal data. Missingness of the consecutive outcome variable record and the participant dropout, particularly of mild and moderate COVID-19 cases, may have also confounded the participant clustering results and ML risk estimation for CT abnormalities and LF impairment since prolonged hospitalization was found to be a crucial cluster-defining and influential explanatory feature. Additionally, even though the reproducibility of the risk assessment algorithms was partially addressed by CV, cluster and ML classifiers call for verification in a larger, independent multicenter collective of COVID-19 convalescents.
In summary, in our CovILD study cohort we found a high frequency of CT and LF abnormalities and persistent symptoms at the 6-month follow-up, and flattened recovery kinetics after 3 months post-COVID-19. Systematic risk modeling revealed a set of clinical variables linked to protracted pulmonary recovery apart from the severity of acute infection such as inflammatory markers, anti-S1/S2 IgG levels, multimorbidity, and male sex. We demonstrate that clustering and ML classifiers may help to identify individuals at risk of persistent lung lesions and to relocate medical resources to prevent long-term disability.

Figure 9. Prediction of persistent radiological lung abnormalities, functional lung impairment, and symptoms by machine learning algorithms. Single machine learning classifiers (C5.0; RF: random forests; SVM-R: support vector machines with radial kernel; NNet: neural network; glmNet: elastic net) and their ensemble (Ens) were trained in the cohort data set with 52 non-computed tomography (non-CT) and non-lung function binary explanatory variables recorded for acute COVID-19 or at the 60-day follow-up visit (Appendix 1-table 1) for predicting outcome variables at the 180-day follow-up visit (any lung CT abnormalities, moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) (Appendix 1-table 4). The prediction accuracy was verified by repeated 20-fold cross-validation (five repeats). Receiver-operating characteristics (ROCs) of the algorithms in the cross-validation are presented: area under the curve (AUC), sensitivity (Sens), and specificity (Spec) (Appendix 1-table 5). The numbers of complete observations and outcome events are indicated under the plots.
Figure 10. Performance of the machine learning ensemble classifier in mild-to-moderate and severe-to-critical COVID-19 convalescents. The machine learning classifier ensemble (Ens) was developed as presented in Figure 9. Its performance at predicting outcome variables at the 180-day follow-up visit (any computed tomography [CT] lung abnormalities, moderate-to-severe lung CT abnormalities [severity score > 5], functional lung impairment, and persistent symptoms) in the entire cohort, mild-to-moderate (outpatient or hospitalized without oxygen), and severe-to-critical COVID-19 convalescents (oxygen therapy or ICU) in repeated 20-fold cross-validation (five repeats) was assessed by receiver-operating characteristic (ROC) (Appendix 1-