Assessing risk factors for SARS-CoV-2 infection in patients presenting with symptoms in Shanghai, China: a multicentre, observational cohort study

Background The outbreak of COVID-19 has led to international concern. We aimed to establish an effective screening strategy in Shanghai, China, to aid early identification of patients with COVID-19. Methods We did a multicentre, observational cohort study in fever clinics of 25 hospitals in 16 districts of Shanghai. All patients visiting the clinics within the study period were included. A strategy for COVID-19 screening was presented and then suspected cases were monitored and analysed until they were confirmed as cases or excluded. Logistic regression was used to determine the risk factors of COVID-19. Findings We enrolled patients visiting fever clinics from Jan 17 to Feb 16, 2020. Among 53 617 patients visiting fever clinics, 1004 (1·9%) were considered as suspected cases, with 188 (0·4% of all patients, 18·7% of suspected cases) eventually diagnosed as confirmed cases. 154 patients with missing data were excluded from the analysis. Exposure history (odds ratio [OR] 4·16, 95% CI 2·74–6·33; p<0·0001), fatigue (OR 1·56, 1·01–2·41; p=0·043), white blood cell count less than 4 × 10⁹ per L (OR 2·44, 1·28–4·64; p=0·0066), lymphocyte count less than 0·8 × 10⁹ per L (OR 1·82, 1·00–3·31; p=0·049), ground glass opacity (OR 1·95, 1·32–2·89; p=0·0009), and having both lungs affected (OR 1·54, 1·04–2·28; p=0·032) were independent risk factors for confirmed COVID-19. Interpretation The screening strategy was effective for confirming or excluding COVID-19 during the spread of this contagious disease. Relevant independent risk factors identified in this study might be helpful for early recognition of the disease. Funding National Natural Science Foundation of China.


Introduction
COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an international concern and the disease has spread globally. [1][2][3][4] As of late April, 2020, the outbreak in China appears to be easing day by day, but it is worsening in other countries, with growing numbers of cases reported daily. 5 With increasing evidence of human-to-human transmission, the virus is considered to be highly contagious. [6][7][8] Up until May 5, 2020, more than 3·5 million cases had been confirmed globally. The present diagnostic criteria for confirmed COVID-19 are dependent on the RT-PCR assay. 9,10 A short supply of assay kits and inevitable low coverage of tests are delaying diagnosis of numerous suspected cases. An improvement in early diagnosis of the disease is urgently needed. 11,12 Inevitable international travel has increased spread of the pandemic globally. 12 Fever clinics were set up during the SARS outbreaks in 2003. 13 Fever clinics are an epidemic control system with strict isolation facilities used in China. 14 Although Shanghai has a large number of permanent residents and a substantial floating population of non-permanent residents, it only reported 516 confirmed patients with COVID-19 up until March 31, 2020, owing to the medical procedures done in fever clinics. These numbers are considered to show success in controlling the disease.
The main aim of this study was to ascertain the effectiveness of the screening strategy and provide meaningful insight for early diagnosis of COVID-19. The proportion of confirmed and suspected cases of COVID-19 among patients in the fever clinics of assigned hospitals was calculated. The features of confirmed COVID-19 cases and excluded cases in the suspected cases group generated from the screening strategy were compared. Factors obtained from the analysis for early recognition of the contagious disease were identified.

Study design and participants
For this multicentre, retrospective, observational cohort study, all patients visiting the fever clinics of 25 designated hospitals were analysed. Patients with fever (body temperature >37·5°C), or patients with pulmonary symptoms and epidemiological exposure history (appendix p 4) were requested to visit the fever clinics. All patients visiting the fever clinics during the study period were included. To make the study representative for Shanghai, at least one hospital in each district of all the 16 districts in Shanghai was selected. The observational period was set from Jan 17, 2020, when the first confirmed case was reported in Shanghai, to Feb 16, 2020, with a month-long study period.
Patients with suspected and confirmed COVID-19 were diagnosed according to the updated versions of the guidelines for the diagnosis and treatment of the disease issued by the National Health Commission of China. 15 An excluded case refers to an individual with suspected COVID-19 who was eventually excluded on the basis of negative RT-PCR tests. Details of the definitions of fever clinic patients for both suspected and confirmed patients are shown in the appendix (pp 4-5). The study was approved by the Research Ethics Committee of Shanghai Pulmonary Hospital and participating hospitals. The need for written informed consent was waived by the Ethics Commission of the designated hospitals for this emerging public health issue.

Procedures
The fever clinics of the assigned hospitals in each district were responsible for the first line of defence against COVID-19 as the first place to receive the patients. The physician in the fever clinic gave an initial differential diagnosis of the patient based on the medical history (eg, main complaints), epidemiological exposure history, comorbidities, clinical features (eg, symptoms and signs), and essential examinations (eg, peripheral blood routine tests, blood influenza tests, and chest x-rays), as well as optional tests that the physician deemed necessary (eg, blood biochemical indexes, chest CT scans). After consideration by the physician, if a patient could not be ruled out for possible COVID-19, three experts in the hospital (or experts in the same district) were invited to reconsider the diagnosis of the patient. If two or more experts considered that a diagnosis of COVID-19 could not be ruled out, the diagnosis of a suspected case was made. Initially, a suspected patient's diagnosis was made based on the guidelines. 15 Briefly, the patient should have exposure history and any two of the following clinical features, or all three clinical features if there is no clear exposure history: (1) fever with or without pulmonary symptoms; (2) normal or reduced peripheral white blood cell count, or reduced lymphocyte count; and (3) chest imaging features of pneumonia. All initially suspected cases then received SARS-CoV-2 nucleic acid testing. A throat swab or respiratory or blood samples were collected and sent for RT-PCR to confirm infection with SARS-CoV-2.
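The guideline rule and the expert vote described above can be written as two small decision functions. This is a minimal sketch, not the software used in the study: the names `ClinicalFeatures`, `meets_guideline_criteria`, and `panel_cannot_rule_out` are hypothetical, and each clinical feature is reduced to a yes/no flag.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ClinicalFeatures:
    """The three guideline features listed in the text (field names are illustrative)."""
    fever: bool                      # (1) fever, with or without pulmonary symptoms
    low_or_normal_blood_count: bool  # (2) normal or reduced WBC count, or reduced lymphocyte count
    pneumonia_on_imaging: bool       # (3) chest imaging features of pneumonia

    def count(self) -> int:
        """Number of features present."""
        return sum([self.fever, self.low_or_normal_blood_count, self.pneumonia_on_imaging])


def meets_guideline_criteria(exposure_history: bool, features: ClinicalFeatures) -> bool:
    """Exposure plus any two features, or all three features without clear exposure."""
    required = 2 if exposure_history else 3
    return features.count() >= required


def panel_cannot_rule_out(expert_votes: Sequence[bool]) -> bool:
    """True when two or more invited experts could not rule out COVID-19."""
    return sum(expert_votes) >= 2
```

For example, a patient with exposure history, fever, and pneumonia on imaging satisfies `meets_guideline_criteria`, whereas the same two features without exposure history do not.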

Research in context
Evidence before this study The ongoing outbreak of COVID-19 has led to increasing international concern and global spread. We searched PubMed for articles published between database inception and April 1, 2020, using the terms "COVID-19", "SARS-CoV-2", and "2019-nCoV", with no language restrictions. We found 2076 relevant online publications. The studies reported the characteristics and outcomes of patients with confirmed COVID-19, information that helps us to understand the disease. However, the early recognition of COVID-19 in suspected cases could be the most important step for epidemic control in cities globally. Furthermore, the epidemiology and the early clinical findings of patients in Shanghai, a city with numerous imported cases, might be different from those reported previously and might be more instructive to other countries. Thus, a multicentre cohort study was done in which all patients visiting fever clinics of 25 assigned hospitals distributed in the 16 districts of Shanghai over a 1-month period were monitored. The clinical data were analysed and reported through this feasible screening strategy for patients with COVID-19.

Added value of this study
We present and evaluate a screening strategy for suspected COVID-19. Among 53 617 patients enrolled in this study, only 1·9% of patients were initially considered as suspected cases. Subsequently, 0·4% of all patients and 18·7% of suspected cases were diagnosed as confirmed cases. Although a very small portion of the screened population were eventually diagnosed with confirmed COVID-19, the strategy contributed to the control of the epidemic in a large and crowded city. Moreover, the comparisons of early clinical findings between confirmed COVID-19 and the excluded cases provided us with the key factors of early identification of COVID-19 from the suspected cases, to which other studies have not referred.

Implications of all the available evidence
As of May 5, 2020, the transmission of COVID-19 is far from under control globally. We have defined the clinical features of the disease; what we need to do next is pay more attention to early identification of the disease to control its spread. The screening strategy for COVID-19 in Shanghai has been effective in preventing the spread and helpful in the early identification of the disease. The independent risk factors found in this study could have benefits worldwide. More practices or policies based on these results should be made in the process of epidemic control.

Patients with at least one positive RT-PCR test for SARS-CoV-2 were defined as confirmed cases and were uniformly hospitalised in the Shanghai Public Health Clinical Center. The suspected patients whose first nucleic acid test was negative had one repeat RT-PCR test routinely. If a patient had two negative RT-PCR tests but was still a clinically suspected case, RT-PCR was repeated up to 3-4 times until their final diagnoses were confirmed (figure 1). During the period of isolation, meticulous medical histories, physical examinations, and medical examinations of all patients were undertaken. A standardised data collection spreadsheet was designed to obtain patient data from electronic medical records. Two attending physicians in every centre independently reviewed the data collection forms to check the data validity. Epidemiological data, demographics (age, sex, body-mass index, occupation), symptom onset (eg, respiratory, gastrointestinal, neurological), comorbidities, laboratory results (eg, blood cell count, hepatic function, arterial blood gas), chest radiological findings (eg, ground-glass opacity, affected lobes), and the virus detection results were obtained for analysis.
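The repeat-testing rule for suspected cases (one positive RT-PCR confirms; a first negative is routinely repeated; further repeats follow only while clinical suspicion remains) can be sketched as a small classifier. This is an illustration under stated assumptions rather than the study's protocol as implemented: the function name, the return labels, and the cap of four tests (the text says "3-4") are choices made here.

```python
from typing import Sequence

MAX_TESTS = 4  # assumed upper bound; the text gives "up to 3-4" tests


def classify_after_testing(results: Sequence[bool], still_clinically_suspected: bool) -> str:
    """Classify a suspected case from its RT-PCR results so far (True = positive).

    Returns "confirmed", "excluded", or "retest".
    """
    if any(results):
        return "confirmed"  # at least one positive test confirms the case
    if len(results) < 2:
        return "retest"     # a first negative is routinely followed by one repeat
    if still_clinically_suspected and len(results) < MAX_TESTS:
        return "retest"     # two or more negatives, but the clinical picture still fits
    # Assumption: once the cap is reached with only negatives, the case is treated as excluded.
    return "excluded"
```

The sketch does not cover exceptions such as exclusion after a single negative test in patients who improved rapidly.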

Outcomes
The proportion of patients who had confirmed or suspected COVID-19 in the fever clinics in 25 hospitals of all 16 districts of Shanghai was evaluated. The efficiency of the screening strategy for COVID-19 disease was reported. Comparisons were made between excluded patients and confirmed cases, and the early prognostic indicators associated with the diagnosis of COVID-19 were determined to facilitate the quick conversion of the diagnosis of patients with suspected disease to confirmed cases. Factors related to the early recognition of confirmed cases from the suspected cases were determined.

Statistical analysis
In the univariate analysis, the Kolmogorov-Smirnov test was used to analyse the distribution of quantitative variables. The t test was used to analyse quantitative variables that were normally distributed and homoscedastic, and the Mann-Whitney U test was used to analyse quantitative variables that were non-normally distributed or not homoscedastic. Qualitative variables such as sex, symptoms, comorbidities, and radiological features between the two groups were compared using the χ² test. Quantitative data were presented as median (IQR) if non-normally distributed or mean (SD) if normally distributed, and qualitative data were presented as numbers. In the multivariate analysis, logistic regression was used to determine the risk factors for COVID-19. Variables with significance in the univariate analysis were preliminarily selected. Variables with clinical significance based on clinical experience and previous studies, such as lymphocyte count, were also selected. 16,17 When the two groups were compared, p<0·1 was the threshold for variables to be included in the secondary analysis. For the determination of risk factors, p<0·05 was the threshold for identification. The Spearman coefficient analysis was used to assess the correlation of these variables. In the case of significant collinearity between two variables (Spearman correlation coefficient >0·6), the variable with greater difference in two groups was selected. Finally, variables such as chill, fatigue, headache, poor appetite, myalgia, epidemiological risk, white blood cell counts less than 4 × 10⁹ per L, lymphocyte counts less than 0·8 × 10⁹ per L, and radiological type and site were simultaneously entered into the multivariate regression model. The odds ratio (OR) and 95% CI were also calculated for the independent variables. For all analyses, p<0·05 was considered significant.
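For reference, the relationship between a fitted logistic regression model and the odds ratios it yields can be written out. This is the standard textbook form, not a detail taken from the study: the symbols $\beta_j$, $x_j$, and $\mathrm{SE}$ are notation introduced here, and a Wald-type interval is assumed because the text does not state how the 95% CIs were computed.

```latex
\begin{align}
\log\frac{P(\text{confirmed}\mid x)}{1 - P(\text{confirmed}\mid x)} &= \beta_0 + \sum_{j} \beta_j x_j, \\
\mathrm{OR}_j &= e^{\beta_j}, \\
95\%\ \mathrm{CI}_j &= \left[\, e^{\beta_j - 1.96\,\mathrm{SE}(\beta_j)},\; e^{\beta_j + 1.96\,\mathrm{SE}(\beta_j)} \,\right].
\end{align}
```

Under this form, an OR above 1 for a binary variable $x_j$ means that the odds of a confirmed diagnosis are higher when the variable is present, with the other variables in the model held fixed.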
We did a sensitivity analysis to investigate the robustness of the findings in line with missing data. We used the mean value imputation method to estimate missing data in two groups and then repeated all the univariate and multivariate analyses. The results were then compared in the two different samples. If the sensitivity analysis is consistent with the results of the primary analysis, it shows that the missing data has little effect on the overall research conclusion to some extent, and the results are relatively robust.
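Mean value imputation of the kind used in the sensitivity analysis can be sketched in a few lines. The sketch assumes that missing values are filled with the mean of the observed values within each group (confirmed or excluded), which is one reading of "in two groups"; the function name and the example values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def impute_group_means(values: List[Optional[float]], groups: List[str]) -> List[float]:
    """Replace each missing value (None) with the mean of the observed values in its group."""
    observed: Dict[str, List[float]] = {}
    for value, group in zip(values, groups):
        if value is not None:
            observed.setdefault(group, []).append(value)
    group_means = {group: mean(vals) for group, vals in observed.items()}
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


# Illustrative white blood cell counts (x 10^9 per L); these numbers are made up.
wbc = [4.0, None, 6.0, 9.0, None, 11.0]
group = ["confirmed", "confirmed", "confirmed", "excluded", "excluded", "excluded"]
imputed = impute_group_means(wbc, group)
```

The univariate and multivariate analyses would then be rerun on the imputed data and compared with the complete-case results. A group with no observed values at all would need separate handling, which the sketch omits.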

Role of the funding source
The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The authors J-FX, J-MQ, and Z-JJ had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Results
53 617 patients visited the fever clinics of 25 assigned hospitals in Shanghai from Jan 17 to Feb 16, 2020. Among 53 617 cases, 1004 (1·9%) were diagnosed as suspected patients and 188 (0·4%) were diagnosed with confirmed COVID-19. 850 patients were included in the analysis and 154 were missing data. Three patients with COVID-19 and 23 excluded patients had missing data for white blood cell count, 11 patients with COVID-19 and 132 excluded patients had missing data for neutrophil count, and eight patients with COVID-19 and 83 excluded patients had missing data for lymphocyte count. The visit timelines of 1004 suspected patients are shown in figure 2A. A waveform with a main peak from Jan 30 to Feb 8, 2020, was observed, 1 week after the Wuhan lockdown and the restriction of movement of people in other cities from Jan 23, 2020. Most cases were distributed before Feb 8, 2020, and after that, the number of both patients with confirmed COVID-19 and excluded patients gradually decreased. Except for the 188 confirmed cases, the final diagnoses of the excluded cases included bacterial pneumonia (n=622), common cold (n=59), influenza (n=50), other diseases (n=44), and unconfirmed cases requiring further follow-up (n=41; figure 2B).
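The headline proportions and the breakdown of excluded cases can be checked with simple arithmetic. The snippet below is only a consistency check on the counts reported above; the variable names are illustrative.

```python
fever_clinic_visits = 53617
suspected = 1004
confirmed = 188

# Final diagnoses of suspected cases that were not confirmed, as listed in the text.
final_diagnoses_of_excluded = {
    "bacterial pneumonia": 622,
    "common cold": 59,
    "influenza": 50,
    "other diseases": 44,
    "unconfirmed, further follow-up": 41,
}
excluded = sum(final_diagnoses_of_excluded.values())


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


suspected_rate = percent(suspected, fever_clinic_visits)   # suspected among all visits
confirmed_rate = percent(confirmed, fever_clinic_visits)   # confirmed among all visits
confirmed_among_suspected = percent(confirmed, suspected)  # confirmed among suspected
```

The listed diagnoses sum to 816 excluded cases, which together with the 188 confirmed cases account for all 1004 suspected patients.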
Data for 1004 patients with suspected COVID-19 were included in the further analysis of the differences between the 188 confirmed patients and the 816 excluded patients (table 1). [Figure: daily timeline axis, Jan 17 to Jan 31] [...] figure 3D). Levels of lactate dehydrogenase, erythrocyte sedimentation rate (ESR), creatine, and procalcitonin differed between the two groups (table 2).
Of the 188 patients with confirmed COVID-19, an initial RT-PCR test revealed a positive result in 135 (72%) patients. Of the remaining patients, the first positive test result was obtained on a second test for 44 (23% of the total), a third test for seven (4%), and only after a fourth test for two (1%) patients. In the 816 excluded patients, 795 (97%) cases were tested twice before exclusion, and ten (1%) cases had one RT-PCR test before exclusion due to rapid clinical improvement during follow-up.
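The cumulative yield of repeated RT-PCR testing among confirmed cases follows directly from these counts. The snippet below is a small illustration of that arithmetic, with variable names chosen here.

```python
from itertools import accumulate

confirmed_total = 188
# Confirmed patients whose first positive result came on test 1, 2, 3, and 4.
first_positive_by_round = [135, 44, 7, 2]

# Running total of confirmed patients detected after each round of testing.
cumulative_detected = list(accumulate(first_positive_by_round))
cumulative_percent = [round(100 * n / confirmed_total, 1) for n in cumulative_detected]

# Confirmed patients who were negative on their first test.
missed_by_first_test = confirmed_total - first_positive_by_round[0]
```

In these data, 53 of the 188 confirmed patients were negative on their first test, which is why a single negative RT-PCR result was not treated as sufficient for exclusion.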

Discussion
To our knowledge, this study is the first and only report of a screening strategy for COVID-19 in a city. To enable the results of this study to be more generalisable to other cities and countries, at least one hospital in each district of the 16 districts in Shanghai was selected to make the study representative for the whole city, and cases identified over a 1-month period were included. With use of the reported screening strategy for suspected cases via fever clinics, the epidemic of COVID-19 in Shanghai was considered to be under control as of Feb 19, 2020, with no new native cases even with its population of more than 25 million people and a floating population of around 9·7 million. The first imported case in Shanghai appeared on March 5. The role of fever clinics and the experience of managing suspected patients in Shanghai might be of benefit to other cities. Even in countries without facilities such as fever clinics, the screening strategy can be replicated in hospitals. It is recommended that fever clinics, or other departments with similar responsibilities, should be established and evenly distributed throughout the country or city if conditions permit, which can help to avoid the transmission of the virus during long-distance travel.
To our knowledge, this is also the first report to show early discerning differences between confirmed cases and excluded cases. The data previously reported were mostly from Wuhan, the hardest hit area in China. Large differences existed in the epidemiological status and clinical features of patients with COVID-19 between Wuhan and the rest of China; the appendix (pp 6-11) shows the differences in clinical features between this study and previously published data. [18][19][20][21][22] One interesting difference was the significantly higher proportion of patients with both lungs infected in Wuhan compared with Shanghai, showing that the lung lesions involved a wider area for [...] is very similar to that of Wuhan, but the epidemiological characteristics of other cities worldwide might be more similar to those of Shanghai, thus the experiences reported here might benefit other countries and cities. The distribution of patients showed that the number of confirmed cases in Shanghai reached a peak on Jan 30, 2020. The date of this peak is believed to be a result of the national policy of lockdown of Wuhan, which was instigated on Jan 23, 2020, and can be explained by the incubation period of COVID-19. These data suggest that the results of lockdown are evident after 1-2 weeks.
In this study, the ratio of suspected to confirmed cases in Shanghai (188 of 1004) was higher than that of China as a whole (44 672 of 72 314), which was mainly the result of the high number of confirmed cases in Wuhan. 23 This finding indicated that more suspected cases were under surveillance in Shanghai, which might have resulted in the relatively successful epidemic prevention and control in the city.
The main monitoring point for the disease is body temperature. 84% of patients with COVID-19 presented with a fever in this study, which was lower than that in Wuhan at 98%, 18,20 indicating that clinical features might differ between patients with COVID-19 in Shanghai and those in Wuhan, and diagnoses might be missed when the surveillance case identification mainly focuses on fever detection. The mean highest temperature of patients with COVID-19 was lower than that of excluded patients in this study. 16% of patients with COVID-19 did not have a fever, and the analysis of multivariate regression also showed that a fever was not an indicator for COVID-19 diagnosis, suggesting that we should pay attention to symptoms other than fever. Extrapulmonary symptoms such as fatigue, headache, poor appetite, and myalgia were more common in the confirmed cases. Symptoms of COVID-19 vary at different stages of disease and systemic symptoms might be more common in the early stages of the disease. 21,24 A study has suggested that the receptor gene of SARS-CoV-2 is angiotensin converting enzyme 2, 25 which might account for the higher proportion of patients with COVID-19 who had hypertension than that of the excluded cases in this study. Radiologically, COVID-19 presented with more abnormalities such as ground-glass opacity (GGO) or GGO with patchy shadow in this study, which is consistent with other reports. 26,27 More importantly, the numbers of lobes affected and bilateral lung involvement in CT scans were associated with the diagnosis of COVID-19. Nevertheless, we still need to be cautious as these features observed in chest CTs are not exclusively seen in COVID-19, but also in other viral pneumonias, such as influenza.
Patients with COVID-19 presented with lower counts of white blood cells, neutrophils, and lymphocytes than excluded cases (the most common diagnosis in excluded cases was bacterial pneumonia). 16,28 These lower blood counts might be an important warning sign in the early identification of COVID-19 in fever clinics. Besides, the ESR in confirmed patients was significantly higher than that in excluded patients, while C-reactive protein was not, suggesting that ESR might be more sensitive than C-reactive protein in assisting in the diagnosis of COVID-19.
Results of multivariate regression showed that exposure history is the most important predictor of diagnosis for COVID-19: suspected patients with a history of exposure have 4·16 times the odds of being diagnosed with COVID-19 compared with those without. Therefore, inquiring about exposure history is the most important step in the screening process. 22,29 The presence of fatigue, white blood cell counts less than 4 × 10⁹ per L, lymphocyte counts less than 0·8 × 10⁹ per L, and characteristics of chest CT scans were shown to have predictive value in the diagnosis of COVID-19, which will help physicians in early identification and surveillance of patients.
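As a rough consistency check, the reported odds ratios and 95% CIs can be converted back into approximate p values. The sketch below assumes Wald-type intervals on the log-odds scale, which the text does not confirm, so the recovered values are only expected to be close to the published ones; the function name is hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

# (odds ratio, lower 95% limit, upper 95% limit) as reported for the multivariate model.
reported = {
    "Exposure history": (4.16, 2.74, 6.33),
    "Fatigue": (1.56, 1.01, 2.41),
    "White blood cell count <4 x 10^9 per L": (2.44, 1.28, 4.64),
    "Lymphocyte count <0.8 x 10^9 per L": (1.82, 1.00, 3.31),
    "Ground glass opacity": (1.95, 1.32, 2.89),
    "Both lungs affected": (1.54, 1.04, 2.28),
}


def wald_p_value(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided p value from an odds ratio and its 95% CI."""
    beta = math.log(odds_ratio)                                   # log odds ratio
    std_error = (math.log(upper) - math.log(lower)) / (2 * Z_95)  # CI width on the log scale
    z = beta / std_error
    return math.erfc(abs(z) / math.sqrt(2))                       # two-sided normal tail


approx_p = {name: wald_p_value(*values) for name, values in reported.items()}
```

Because the published ORs and limits are rounded to two decimal places, small discrepancies from the reported p values are expected, particularly for intervals whose lower limit is close to 1.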
Similar to other studies on emerging novel virus infections, our study has several common limitations. Firstly, to increase the sensitivity of early detection and diagnosis, the epidemiological history we collected in this study was broadened from Wuhan to other regions with COVID-19 cases reported, with a resultant substantial increase in the number of suspected cases. Predictably, the increased study population might increase the number of different diseases, making the influence on the overall population characteristics uncertain. To eliminate or reduce this risk, we further confirmed the diagnosis in most, but not all, patients. Secondly, missing data were unavoidable as we did a retrospective study. Considering that the missing data were caused by non-human factors and many were from the same individuals, patients with missing data were not included in the subsequent analysis (ie, we restricted the analysis to individuals without missing data). The statistical power might be reduced as the sample size decreased. There might also be an effect on the findings and increased bias when using this method to handle missing data, although the relatively large sample in this study could compensate for this. Additionally, the sensitivity analysis was consistent with the primary findings, suggesting that the missing data had little effect on the overall research conclusion, and the results are relatively robust (appendix p 12). Future studies including a larger cohort with standardised data collection are necessary to further validate these results.

Variable                                 Odds ratio (95% CI)    p value
Exposure history                         4·16 (2·74-6·33)       <0·0001
Fatigue                                  1·56 (1·01-2·41)       0·043
White blood cell count <4 × 10⁹ per L    2·44 (1·28-4·64)       0·0066
Lymphocyte count <0·8 × 10⁹ per L        1·82 (1·00-3·31)       0·049
Ground glass opacity in chest imaging    1·95 (1·32-2·89)       0·0009
Both lungs affected                      1·54 (1·04-2·28)       0·032
Table 3: Multivariate analysis of the independent risk factors associated with diagnosis
In conclusion, the novel screening strategy for COVID-19 in Shanghai is effective in contributing to quarantining the infection source and preventing the spread of this contagious disease. The findings from the comparisons of suspected patients provide us with meaningful insights for early differential diagnosis of COVID-19. The screening strategy and the clinical findings identified in this study could meet the urgent need for the prevention and early identification of the disease.

Declaration of interests
We declare no competing interests.

Data sharing
The investigators will share de-identified individual participant data and the study protocol following completion of a data use agreement. Data are available after the Article publication from jfxucn@gmail.com.