Effects of meteorological factors and air pollutants on the incidence of COVID-19 in South Korea

Air pollution and meteorological factors can exacerbate susceptibility to respiratory viral infections. To establish appropriate prevention and intervention strategies, it is important to determine whether these factors affect the transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Therefore, this study examined the effects of sunshine, temperature, wind, and air pollutants including sulfur dioxide (SO2), carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), particulate matter ≤2.5 μm (PM2.5), and particulate matter ≤10 μm (PM10) on the age-standardized incidence ratio of coronavirus disease (COVID-19) in South Korea between January 2020 and April 2020. Propensity score weighting was used to randomly select observations into groups according to whether the case was cluster-related, to reduce selection bias. Multivariable logistic regression analyses were used to identify factors associated with COVID-19 incidence. Age 60 years or over (odds ratio [OR], 1.29; 95% CI, 1.24–1.35), exposure to ambient air pollutants, especially SO2 (OR, 5.19; 95% CI, 1.13–23.9) and CO (OR, 1.17; 95% CI, 1.07–1.27), and non-cluster infection (OR, 1.28; 95% CI, 1.24–1.32) were associated with SARS-CoV-2 infection. To manage and control COVID-19 effectively, further studies are warranted to confirm these findings and to develop appropriate guidelines to minimize SARS-CoV-2 transmission.

In line with prior findings, it has been suggested that meteorological variables and air pollutants may play a critical role in SARS-CoV-2 transmission (Han et al., 2022; Zoran et al., 2020). In addition to direct transmission from person to person, they may favor indirect diffusion of SARS-CoV-2 (Frontera et al., 2020). Meteorological factors and air pollution have been independently identified as modifiable factors contributing to differential SARS-CoV-2 spread (Domingo and Rovira, 2020; Marquès and Domingo, 2022). While age and type of infection (i.e., cluster or non-cluster) are suggested to influence the link between environmental factors and SARS-CoV-2 infection (Domingo and Rovira, 2020; Choi et al., 2021), few studies have fully taken these aspects into account. Therefore, this study aimed to investigate the overall effect of environmental factors, from meteorological factors to air pollutants, on SARS-CoV-2 infection using COVID-19 data in Korea after controlling for age- and infection-type-related variables.

Methods
The guidelines for Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) were followed in this study (Supplementary Table 1).

Study population
From January 20, 2020 to April 29, 2020, data on 3234 COVID-19 cases were collected from individuals with a mean age of 47.6 years, 56.3% of whom were female. COVID-19 cases were defined as individuals with laboratory-confirmed SARS-CoV-2 infection who subsequently completed the preliminary epidemiological surveillance conducted by each of the 174 basic local government units, the 16 metropolitan local governments (Seoul, Incheon, Sejong, Daegu, Gwangju, Ulsan, Busan, Gyeonggi-do, Gangwon-do, Chungcheongbuk-do, Chungcheongnam-do, Gyeongsangbuk-do, Gyeongsangnam-do, Jeollabuk-do, Jeollanam-do, and Jeju), and the Korea Centers for Disease Control and Prevention (KCDC). Epidemiological surveillance data were gathered by epidemic intelligence service officers from each local government and the KCDC, using a unique monitoring system that included GPS (cell phone location), card transaction records, closed-circuit television (CCTV), and medical facility usage history. The Sejong University Institutional Review Board approved this study (SJU-HR-E-2020-003) and waived the requirement for written informed consent due to the urgency of data collection. Data on age, sex, area of residence, and route of infection were collected for each participant.

Meteorological and air pollutant data collection
Data on general meteorological factors and air pollutants were retrieved from two main sources. Ambient meteorological factors, including sunshine, temperature, and wind, were obtained from the Open Meteorological Data Portal, operated by the Korea Meteorological Administration (KMA; https://data.kma.go.kr). The KMA collects and provides detailed information on meteorological factors from 510 observation stations throughout the country (Korea Meteorological Administration, 2021). Additionally, the concentrations of six distinct ambient air pollutants (SO2, CO, O3, NO2, PM2.5, and PM10) were obtained from the AIRKOREA data repository (https://www.airkorea.or.kr) provided by the National Ambient Air Quality Monitoring Information System in South Korea (Koo et al., 2020). AIRKOREA is a comprehensive air quality reporting system that calculates surface air pollutants by organizing data on air pollutant concentrations from over 500 monitoring stations nationwide (Korean Environment Corporation, 2021). All datasets were collected from monitoring sites at hourly intervals and were aggregated to daily intervals for analysis. Daily mean values were calculated for meteorological factors and daily maximum values for air pollutants. Missing values were imputed using the values of the nearest monitoring station within 10 km (Table 1).
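The hourly-to-daily aggregation described above can be illustrated with a minimal sketch. It assumes readings arrive as (timestamp, value) pairs and that missing hourly readings are simply skipped; the function name `daily_summary` and the sample values are hypothetical and are not taken from the KMA or AIRKOREA data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_summary(hourly):
    """Group (timestamp, value) pairs by calendar day.

    Returns {date: (daily_mean, daily_max)}. Missing readings (None)
    are skipped; this handling is an assumption of the sketch.
    """
    by_day = defaultdict(list)
    for timestamp, value in hourly:
        if value is not None:
            by_day[timestamp.date()].append(value)
    return {day: (mean(vals), max(vals)) for day, vals in sorted(by_day.items())}


# Hypothetical hourly CO readings (ppm) for illustration only.
readings = [
    (datetime(2020, 3, 1, 0), 0.4),
    (datetime(2020, 3, 1, 1), 0.6),
    (datetime(2020, 3, 1, 2), None),  # missing reading
    (datetime(2020, 3, 2, 0), 0.5),
]
summary = daily_summary(readings)
```

In this sketch the daily mean would be used for a meteorological factor and the daily maximum for an air pollutant, mirroring the distinction made in the text.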

Measurement of the COVID-19 standardized incidence ratio
Older individuals and children are among the population groups at the greatest risk from environmental pollution (Meo et al., 2020; Wong et al., 2010). However, only a few prior studies have considered age-related factors, which also play a significant role in infection and viral spread (Domingo and Rovira, 2020; Sooryanarain and Elankumaran, 2015). Therefore, we calculated and used the age-standardized incidence ratio (SIR) as the outcome in this study. The SIR is calculated by multiplying the ratio of observed to expected events by 100. For example, an SIR of 100 indicates that the number of COVID-19 cases observed in the study sample equals the number expected in the comparator or "normal" population (Becher and Winkler, 2017; Boyle and Parkin, 1991).
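The SIR calculation can be sketched as follows. The sketch assumes conventional indirect standardization, in which the expected count is the sum over age strata of the stratum population multiplied by a reference incidence rate. The age strata, populations, and rates below are hypothetical and serve only to illustrate the arithmetic.

```python
def standardized_incidence_ratio(observed, stratum_population, reference_rates):
    """Return SIR = observed / expected * 100.

    expected = sum over age strata of (population * reference rate).
    Indirect standardization is assumed here; the strata are hypothetical.
    """
    expected = sum(
        stratum_population[age] * reference_rates[age] for age in stratum_population
    )
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed / expected * 100


# Hypothetical district: population by age group and reference rates per person.
population = {"0-19": 20000, "20-59": 60000, "60+": 20000}
rates = {"0-19": 0.0001, "20-59": 0.0002, "60+": 0.0003}

# Expected cases: 2 + 12 + 6 = 20; with 25 observed cases the SIR is about 125.
sir = standardized_incidence_ratio(25, population, rates)
```

An SIR above 100, as in this example, means more cases were observed than the reference rates would predict for that age structure.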

Propensity score analysis
Infection clusters may play an essential role in changing SARS-CoV-2 transmission patterns, according to a systematic review of 65 studies on 108 COVID-19 clusters from 13 countries (Choi et al., 2021). Therefore, we anticipated that there could be baseline differences in factors associated with cluster and non-cluster infections.
A cluster infection was defined as a collection of COVID-19 cases with a recognized chain of transmission/infection, particularly one that occurred in the same location within a short period of time. Patients with COVID-19 who were not associated with any other patients with COVID-19 in time or place were classified as non-cluster cases (Lee et al., 2014). A positive result of a real-time reverse transcription polymerase chain reaction test of a nasal or pharyngeal swab was considered laboratory proof of SARS-CoV-2 infection, according to the WHO criteria.
We accounted for differences in baseline variables between cluster and non-cluster infection using a logistic regression model with propensity score weighting (stabilized weights). To reduce selection bias in observational research, stabilized weights have been used in propensity score (PS) analyses to model time-varying treatment status (Xu et al., 2010). PS analysis requires the creation of pairs of cluster and non-cluster groups with similar PS values. A logistic regression model is used to calculate the predicted probability of the dependent variable, which is retained as the PS for each observation in the dataset. This PS, which ranges from 0 to 1, summarizes the link between multiple attributes and the dependent variable as a single feature (Shin et al., 2020). Age group; daily mean values of sunshine, temperature, temperature difference, and wind; and daily maximum values of SO2, CO, O3, NO2, PM2.5, and PM10 were used as independent variables, and infection type (cluster or non-cluster) was used as the dependent variable in the PS model.
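A minimal sketch of stabilized weights follows, assuming the common definition in which each observation is weighted by the marginal probability of its own group divided by the model-based probability of that group. The exact formula used in this study is not given in the text, and the propensity scores below are hypothetical stand-ins for the output of the PS model.

```python
def stabilized_weights(is_cluster, propensity):
    """Stabilized inverse-probability weights (assumed common definition).

    Cluster cases:     P(cluster) / PS
    Non-cluster cases: (1 - P(cluster)) / (1 - PS)
    """
    p_cluster = sum(is_cluster) / len(is_cluster)  # marginal probability of cluster
    weights = []
    for cluster, ps in zip(is_cluster, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if cluster:
            weights.append(p_cluster / ps)
        else:
            weights.append((1.0 - p_cluster) / (1.0 - ps))
    return weights


# Hypothetical cases: 1 = cluster infection, 0 = non-cluster infection.
is_cluster = [1, 0, 0, 1, 0]
propensity = [0.5, 0.2, 0.4, 0.25, 0.5]  # hypothetical PS values
weights = stabilized_weights(is_cluster, propensity)
```

Compared with unstabilized weights (1/PS and 1/(1 - PS)), the stabilized form keeps the weighted pseudo-population close to the original sample size, which tends to limit the influence of extreme scores.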

Statistical analysis
After PS weighting, we performed chi-squared tests and t-tests to confirm that there was no significant selection bias at baseline between the cluster and non-cluster groups. To identify the variables that affected the SIR of COVID-19, we first examined the distribution patterns using PS-weighted data. The SIR values were log-transformed to satisfy the normality assumption. The log-transformed SIR was then converted into a dummy variable based on its mean value (1, SIR at or above the mean; 0, otherwise). Multivariable logistic regression analysis was then performed to identify the factors associated with COVID-19 incidence, with the dichotomized log-transformed SIR for COVID-19 as the dependent variable. All variables used in the univariable analysis were included in the multivariable model, except O3, because its effect on SIR was not significant in the univariable analysis. Multicollinearity was assessed using the variance inflation factors (VIFs) of the independent variables. Statistical analysis was performed using R version 4.1.1 (The R Foundation for Statistical Computing, Vienna, Austria) with the 'survey' package. All tests were two-tailed, and p-values <0.05 were considered significant.
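The construction of the binary outcome can be sketched as follows. The sketch assumes a natural logarithm and an unweighted mean as the cut-off; the analysis described here used PS-weighted data, which this illustration omits. The SIR values are hypothetical.

```python
import math
from statistics import mean


def dichotomize_log_sir(sir_values):
    """Log-transform SIR values and code each as 1 (at or above the mean) or 0.

    Natural log and an unweighted mean are assumptions of this sketch.
    Returns (labels, threshold) with the threshold on the log scale.
    """
    log_sir = [math.log(value) for value in sir_values]
    threshold = mean(log_sir)
    labels = [1 if value >= threshold else 0 for value in log_sir]
    return labels, threshold


# Hypothetical SIR values for four districts.
sir_values = [50.0, 100.0, 200.0, 400.0]
labels, threshold = dichotomize_log_sir(sir_values)
```

Because the cut-off is the mean of the logged values, it corresponds to the geometric mean of the SIR on the original scale, so the two lower values here are coded 0 and the two higher values 1.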

Results
The comparison of baseline characteristics between the cluster and non-cluster groups before and after PS weighting is presented in Table 2, and the distributions of PS before and after PS weighting are shown in Fig. 1. Before PS weighting, the mean propensity scores were 0.29 in the cluster infection group and 0.23 in the non-cluster infection group, and the values differed significantly between the two groups (p < 0.001), indicating that variables were not similarly distributed between groups. However, after PS weighting, both groups had similar PS distributions, with no statistically significant differences between the cluster and non-cluster infection groups at baseline and with the means and standard deviations well adjusted (0.25 ± 0.10 and 0.25 ± 0.12, respectively; p = 0.515). After PS weighting, 2041 individuals in the pseudopopulation had an SIR below the mean SIR (low SIR) and 1182 had an SIR at or above the mean SIR (high SIR). Compared with the low SIR group, the high SIR group was more likely to be older, to be exposed to fewer sunshine hours and lower mean temperatures, and to have the non-cluster infection type. Additionally, the concentrations of air pollutants, except O3, all differed significantly between the two groups (Supplementary Table 2).
95% CI, 1.01-1.02) affected the risk of SARS-CoV-2 transmission. NO2, PM2.5, and PM10 were significantly associated with SARS-CoV-2 infection, but the associated change in risk was less than 0.3% (Table 3). The VIFs of the independent variables ranged from 1.089 to 2.879, indicating no significant multicollinearity.
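For readers less familiar with how odds ratios and their intervals arise from a logistic model, a minimal sketch follows. It assumes the conventional Wald construction, in which a coefficient and its standard error are exponentiated; the interval method used by the 'survey' package in this study may differ, and the coefficient and standard error below are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient to an odds ratio.

    Returns (odds_ratio, lower, upper) using a Wald-type 95% interval:
    exp(beta - z * se) to exp(beta + z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, lower, upper = odds_ratio_with_ci(beta=0.25, se=0.02)
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from 0 at the 5% level, which is how the associations in Table 3 would typically be read.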

Discussion
We determined the effect of exposure to meteorological factors and air pollution on SARS-CoV-2 transmission. Older age, higher concentrations of CO and SO2, and non-cluster infection cases were associated with a substantially increased risk of SARS-CoV-2 infection. In addition, while sunshine, mean temperature, temperature difference, NO2, PM2.5, and PM10 were also significantly associated with the incidence of SARS-CoV-2 infection, they did not appear to have a clinically meaningful effect because of the small effect sizes.
A few studies have found that air pollution is associated with an increased incidence of SARS-CoV-2 infection. A meta-analysis based on 35 studies found that exposure to SO2 increased the incidence of SARS-CoV-2 infection by nearly 7% (Zang et al., 2022). In the US, PM2.5 and CO were found to be positively associated with the daily number of COVID-19 cases in the San Francisco area (Meo et al., 2020). In China, levels of air pollutants, including PM2.5, PM10, CO, and SO2, have been shown to be associated with an increased risk of influenza-like illness, including COVID-19 (Zhu et al., 2020).
Particles of air pollutants can affect SARS-CoV-2 transmission by facilitating airborne transport of SARS-CoV-2 (Martelletti and Martelletti, 2020; Meo et al., 2020), thereby triggering the inhalation of SARS-CoV-2 on small airborne particles (Han et al., 2022). Additionally, enhanced expression of the viral receptor angiotensin-converting enzyme 2 increases the susceptibility to SARS-CoV-2 infection through an elevated number of viral entry sites or receptors on the host cells (Roy, 2021). Moreover, exposure to air pollution induces oxidative stress and the generation of free radicals, which encourage a pro-inflammatory state, leading to immune dysregulation in viral infections (Ciencewicki and Jaspers, 2007), including SARS-CoV-2. This enables the virus to survive longer and become more aggressive in an immune system already weakened by air pollutants (Han et al., 2022; Martelletti and Martelletti, 2020; Meo et al., 2020).
In particular, CO is a highly toxic gas that can cause lung damage (Meo et al., 2020). Its toxicity in the human body stems from its capacity to bind hemoglobin more strongly than oxygen does (Zhao et al., 2019). High CO levels induce reactive oxygen species and myoglobin dysfunction, resulting in hypoxic tissue damage or even death from asphyxia (Zhao et al., 2019). This may explain why individuals exposed to higher CO levels had a higher incidence of COVID-19 in our study. Additionally, SO2 exposure can make people more susceptible to viral infections of the respiratory tract (Cole et al., 2020). This is due to persistent inflammatory activity in the respiratory system via interleukin-8, interleukin-17, and tumor necrosis factor-α, both in vitro and in vivo (Conticini et al., 2020). This mechanism may explain why individuals exposed to higher SO2 levels had a higher incidence of COVID-19 in our study.
Fig. 1. Box plots of the differences in variables between the cluster infection and non-cluster infection groups before and after propensity score weighting. (a) Cluster cases before propensity score weighting; (b) Non-cluster cases before propensity score weighting; (c) Cluster cases after propensity score weighting; (d) Non-cluster cases after propensity score weighting. PSW, propensity score weighting. Box-and-whisker plot: first quartile (Q1 or 25th percentile), the bottom line in the box; median (Q2 or 50th percentile), the middle line in the box; third quartile (Q3 or 75th percentile), the top line in the box; the whiskers extend to the minimum and maximum observations. Before PSW, the mean propensity scores differed significantly between the cluster group (0.29 ± 0.11) and the non-cluster group (0.23 ± 0.11) (p < 0.001). After PSW, the mean propensity scores did not differ significantly between the cluster group (0.25 ± 0.10) and the non-cluster group (0.25 ± 0.12) (p = 0.515).
Fine particles, such as PM2.5, are small enough that they are more likely to pass through the nose and throat and into the lungs (Frontera et al., 2020). Individuals consistently exposed to particulate matter are more prone to develop progressive chronic inflammation of the airways, which causes increased mucus production and decreased ciliary activity. This can lead to severe respiratory illnesses following viral infections (Frontera et al., 2020). In this study, PM2.5 and PM10 were statistically significant factors, but we assumed that they were not of clinical importance because the magnitude of their effect was small.
Many studies have identified older age as one of the primary factors contributing to susceptibility to SARS-CoV-2 infection. Aging is accompanied by immune senescence, represented by T cell immune deficiency, which is associated with several changes in the innate and adaptive immune systems, leading to increased susceptibility to infections such as SARS-CoV-2 (Hojyo et al., 2020; Xiao et al., 2021) in older adults (Qin et al., 2016). In addition, many older adults have chronic diseases, such as high blood pressure, diabetes, and heart disease, all of which weaken the functioning of the immune system. These immunological vulnerabilities and higher susceptibility to climate variations with increasing age may partially explain the study results.
In general, cluster infections are likely to predominate during periods in which SARS-CoV-2 infections are surging. However, our results showed a negative association between cluster infections and overall SARS-CoV-2 incidence. This is most likely because non-cluster infections also include cases in which the infection route could not be identified because surveillance and tracing capabilities were exceeded during periods of increased incidence.
This study has some limitations. First, we could not consider seasonal variation or trends in SARS-CoV-2 transmission because data were available only for a three-month period. Second, we were unable to combine clinical data with epidemiological surveys expeditiously. Therefore, we did not consider the time required to develop COVID-19-related symptoms. Third, we could not fully adjust for SARS-CoV-2 transmission-related variables, such as demographic information (e.g., sex), socioeconomic factors (e.g., household income, mask-wearing and hand-washing behaviors), and social distancing policies. These, together with meteorological and air pollution conditions, may have an impact on the daily incidence of COVID-19. Fourth, there was no control group and only confirmed cases were included, making external comparisons difficult. Finally, because epidemiological surveillance was not possible in certain regions, data from these areas were excluded.

Conclusion
In conclusion, the findings of this study indicate that exposure to ambient air pollutants, especially SO2 and CO, significantly contributed to SARS-CoV-2 transmission in South Korea during the initial months of the pandemic. In addition, older age and non-cluster infection were associated with a greater risk of SARS-CoV-2 infection. Further large-scale studies that consider socioeconomic factors besides age and type of infection and that allow external validation are warranted to confirm the findings of this study.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table 3
Factors associated with a high standardized incidence ratio of COVID-19 in the multivariable logistic regression analysis using propensity score weighted data. The odds ratios were calculated using multivariable logistic regression analysis with a high SIR of COVID-19 as the dependent variable, adjusting for age, sunshine, temperature, temperature difference, wind, SO2, CO, NO2, PM2.5, PM10, and type of infection. Age, age ≥60 years (reference: age <60 years); sunshine, daily mean sunshine (hr); temperature, daily mean temperature (°C); temperature difference, daily mean temperature difference (°C); wind, daily mean wind speed (m/s); SO2, daily maximum SO2 (ppm); CO, daily maximum CO (ppm); NO2, daily maximum NO2 (ppm); PM2.5, daily maximum PM2.5 (μg/m3); PM10, daily maximum PM10 (μg/m3); type of infection, cluster infection (reference: non-cluster infection).