Cooperative simultaneous inversion of satellite-based real-time PM 2.5 and ozone levels using an improved deep learning model with attention mechanism ☆

Ground-level fine particulate matter (PM 2.5 ) and ozone (O 3 ) are air pollutants that can pose severe health risks. Surface PM 2.5 and O 3 concentrations can be monitored from satellites, but most retrieval methods retrieve PM 2.5 or O 3 separately and disregard the shared information between the two air pollutants, for example due to common emission sources. Using surface observations across China spanning 2014 – 2021, we found a strong relationship between PM 2.5 and O 3 with distinct spatiotemporal characteristics. Thus, in this study


Introduction
Air pollution episodes with high fine particulate matter (PM 2.5 ) and ozone (O 3 ) concentrations have occurred frequently worldwide in recent decades (Zang et al., 2021a,b; Luo et al., 2022). These pollutants adversely affect human health and can cause various diseases (Smith et al., 2009; Patella et al., 2018; Yang et al., 2022a). In China in particular, the government has prioritized the synergistic control of PM 2.5 and O 3 (Xiao et al., 2022). Given these concerns, real-time PM 2.5 and O 3 monitoring is critical for providing immediate alerts to residents and for helping governments take timely action (Kerekes et al., 2020; Geng et al., 2021; Ojha et al., 2022).
Although ground-based measurement sites can provide real-time surface PM 2.5 and O 3 information, timely updates of PM 2.5 and O 3 levels are severely limited for locations with no monitoring station (Chae et al., 2021). To address this limitation, satellite-based technology has been widely applied to estimate PM 2.5 and O 3 with full spatial coverage (Li and Cheng, 2021; Tritscher et al., 2021; Bai et al., 2022; Zhu et al., 2022). Because the levels of air pollutants vary noticeably in space and time and are associated with a complex interplay of meteorological factors, deep learning-based methods have been widely utilized for PM 2.5 and O 3 retrievals ([...] et al., 2021; Zang et al., 2021a,b; Pruthi and Liu, 2022). Although deep learning-based PM 2.5 and O 3 retrievals are readily available, historical mapping and real-time monitoring differ clearly in the data available for training. In historical mapping based on deep learning, the data are typically randomly sampled into three subsets (training, validation, and test data) (Yan et al., 2020), so the network is trained on samples from both before and after the predicted events. For real-time monitoring applications, however, the training data include only past information. Consequently, the accuracy of PM 2.5 and O 3 estimates in real-time monitoring is significantly lower than that in historical mapping (Geng et al., 2021), and this limitation needs to be addressed urgently.

☆ This paper has been recommended for acceptance by Pavlos Kassomenos.
Currently, most deep learning models are designed to estimate either PM 2.5 or O 3 individually (Li and Cheng, 2021; Yan et al., 2021; Zang et al., 2021a,b; Bai et al., 2022; Luo et al., 2022). However, PM 2.5 and O 3 share common precursors, such as volatile organic compounds (see Fig. 1a), and both pollutants can be generated through secondary reaction processes; they should thus be linked to each other (Chen et al., 2019). This suggests that the monitoring accuracy for PM 2.5 and O 3 can be enhanced by exploiting their correlation in a joint retrieval.

Additionally, the number of training samples has a noticeable effect on model performance (D.R. [...]). For PM 2.5 and O 3 modeling, the training data are commonly collected from surface monitoring sites. However, owing to calibration, maintenance, or data transmission issues of monitoring instruments, PM 2.5 or O 3 data can be missing for certain periods of time (Fig. 1b), and this problem can be severe at specific sites (Samal et al., 2021). Gaps in training data can lead to high uncertainty in deep learning-based models (Shen et al., 2018).

PM 2.5 and O 3 have distinct temporal characteristics (Deng et al., 2022), and present PM 2.5 and O 3 concentrations can partly depend on past air pollutant levels. However, predicting PM 2.5 and O 3 concentrations using temporal feature information in deep learning models remains challenging. Few studies consider the historical temporal characteristics for PM 2.5 or O 3 modeling (Li et al., 2017; Pak et al., 2020). To address this issue, long short-term memory (LSTM) networks have been incorporated into many deep learning models to capture the long-term dependency of PM 2.5 or O 3 within a particular time range (Wen et al., 2019; Pak et al., 2020; Wu et al., 2020). Wang et al. (2021) showed that employing LSTM to describe time-dependent effects yields more accurate PM 2.5 estimates than a general random forest model. However, in PM 2.5 and O 3 time series, the effect of different past days can vary. This necessitates capturing the most important past time points and giving them higher weights, which cannot be performed by LSTM (Abbasimehr and Paki, 2022).
To fill the research gap and overcome the aforementioned limitations, we propose a new deep learning model that simultaneously performs O 3 and PM 2.5 real-time monitoring, the Simultaneous Ozone and PM 2.5 inversion deep neural Network (SOPiNet). In contrast to general deep learning models for single PM 2.5 or O 3 estimation, SOPiNet jointly learns PM 2.5 and O 3 information and retrieves them simultaneously. We designed a two-task deep neural network framework for SOPiNet with a novel loss function, which allows the model to effectively use more training data. In addition, multi-head attention was introduced to make the network learn temporal relationships across different past days. We tested and evaluated SOPiNet for real-time PM 2.5 and O 3 monitoring in China in 2022. The evaluation demonstrates that joint learning by SOPiNet for simultaneous PM 2.5 and O 3 monitoring leads to improved general performance compared to single PM 2.5 or O 3 retrievals.

Ground-based monitoring data
Since 2013, China has established a large number of ground-based air quality monitoring stations that provide hourly air pollutant information (Fig. S1a). We collected PM 2.5 and O 3 in situ observations from 2019 to 2022 at 11:00 a.m. local time, synchronized with overpasses of the MODIS Terra satellite. Abnormal values were removed at each site using the method reported by Zhong et al. (2022), which treats values more than three standard deviations away from the 1-month moving average as outliers.
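The three-standard-deviation screening described above can be sketched as follows. This is a minimal illustration, not the procedure of Zhong et al. (2022): the centered 30-sample window (one sample per day), the use of the window's own mean and standard deviation, and the function name `remove_outliers` are all assumptions made for the example.

```python
from statistics import mean, pstdev

def remove_outliers(values, window=30, n_sigma=3.0):
    """Return a copy of `values` with flagged outliers replaced by None.

    A value is flagged when it lies more than `n_sigma` standard deviations
    from the mean of the centered window around it. The window length and
    its centering are assumptions of this sketch.
    """
    half = window // 2
    cleaned = []
    for i, v in enumerate(values):
        if v is None:                      # already missing, keep as missing
            cleaned.append(None)
            continue
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        neighbours = [x for x in values[lo:hi] if x is not None]
        mu, sigma = mean(neighbours), pstdev(neighbours)
        if sigma > 0 and abs(v - mu) > n_sigma * sigma:
            cleaned.append(None)           # treat as abnormal value
        else:
            cleaned.append(v)
    return cleaned
```

For example, a single spike of 500 μg/m 3 in a series that otherwise alternates between 30 and 32 μg/m 3 would be replaced by `None`, while the remaining values are kept.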
Ground-based meteorological data were collected from the National Climatic Data Center (NCDC). As a publicly available dataset, the NCDC provides access to over four hundred ground-based monitoring stations in China (Fig. S1b). In addition, boundary layer heights were calculated from the Integrated Global Radiosonde Archive (Fig. S1c) using the Richardson method. Compared with reanalysis data, which are provided with a time delay (e.g., around 5 days for ERA5 and 1 month for MERRA2), the station data offer more timely meteorological information for real-time applications. All collected air temperature, relative humidity, wind speed, boundary layer height, and visibility data were interpolated to a 5-km grid using the Empirical Bayesian Kriging method (Krivoruchko and Gribov, 2019).

Fig. 1. (a) Model 1 and Model 2 retrieve the two air pollutant concentrations independently, while a potential Model 3 could jointly retrieve PM 2.5 and O 3 . (b) Example of missing data from a hypothetical ground-based station due to sensor shutdown, system crashes, and other possible issues, which is a common problem for ground-based air quality measurements. Abbreviations: VOCs = volatile organic compounds.

MODIS data
In this study, the MODIS MOD02SSH data at a spatial resolution of 5 km were obtained from the Atmosphere Archive and Distribution System website (https://ladsweb.modaps.eosdis.nasa.gov). The MOD02SSH data product contains 36-band calibrated and geolocated at-aperture radiances, generated from MODIS Level-1A scans of raw radiance. As certain bands exhibit severe deficiencies, we used information from only bands 1 to 12 and bands 17 to 36 as input for the model. Furthermore, we used MOD09, MOD13, MOD11, and MOD12 to provide surface reflectance, vegetation cover, land surface temperature, and land cover data for the retrievals (Table S1). All these datasets were resampled to the same 5-km grid as the gridded station-based observational data.

Global forecast system data
The NASA Goddard Earth Observing System (GEOS) Composition Forecast (GEOS-CF) provides atmospheric composition data to the public in near-real time (https://gmao.gsfc.nasa.gov/). GEOS-CF expands the GEOS weather and aerosol modeling system with the GEOS-Chem chemistry module to provide hourly atmospheric composition data, including O 3 and PM 2.5 (Keller et al., 2021). In this study, we used surface-level GEOS-CF data at 11:00 (China Standard Time, CST), resampled to 5 km, for model training and real-time monitoring.

Correlation analysis between PM 2.5 and O 3
To explore the correlation between PM 2.5 and O 3 , we assessed the frequency of extreme events during 2014-2021 using daily averages of in-situ measurements. We used the likelihood multiplication factor (LMF) to derive the co-occurrence factor of extreme events. The LMF is defined as the ratio of the joint probability of two extreme events to the probability expected if they were independent, and has been widely used to assess relationships in compound events (Zscheischler and Seneviratne, 2017). Here we defined four types of extreme events: (I) high O 3 and high PM 2.5 , with O 3 and PM 2.5 both at or above their respective 80th percentile for all days; (II) high O 3 and low PM 2.5 , with O 3 at or above the 80th percentile and PM 2.5 at or below the 20th percentile; (III) low O 3 and high PM 2.5 , with O 3 at or below the 20th percentile and PM 2.5 at or above the 80th percentile; (IV) low O 3 and low PM 2.5 , with O 3 and PM 2.5 both at or below their respective 20th percentile. The LMF formula is expressed as follows:

$$\mathrm{LMF}(x, y) = \frac{P\left(E^{\mathrm{PM_{2.5}}}_{x} \cap E^{\mathrm{O_3}}_{y}\right)}{P\left(E^{\mathrm{PM_{2.5}}}_{x}\right)\, P\left(E^{\mathrm{O_3}}_{y}\right)}$$

where x is the percentile in the PM 2.5 data (20th or 80th percentile) and y is the percentile in the O 3 data (20th or 80th percentile); $E^{\mathrm{PM_{2.5}}}_{x}$ denotes the event that PM 2.5 is extreme with respect to its x-th percentile (at or below the 20th, or at or above the 80th), and $E^{\mathrm{O_3}}_{y}$ is defined analogously for O 3 . Notably, an LMF equal to or below one represents no increase in the co-occurrence probability (i.e., the two extreme events are likely to be independent), whereas a larger LMF indicates an increased likelihood of compound events.
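The LMF definition above can be illustrated with a short computation on two daily series. This is a sketch under stated assumptions: the linear-interpolation percentile, the function names `percentile` and `lmf`, and the boolean flags selecting the tail are hypothetical choices for the example, not taken from the study's code.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def lmf(pm25, o3, pm_high, o3_high):
    """Likelihood multiplication factor for one compound-event type.

    pm_high / o3_high select the tail: True means at or above the 80th
    percentile, False means at or below the 20th percentile.
    """
    pm_thr = percentile(sorted(pm25), 80 if pm_high else 20)
    o3_thr = percentile(sorted(o3), 80 if o3_high else 20)
    pm_ext = [(v >= pm_thr) if pm_high else (v <= pm_thr) for v in pm25]
    o3_ext = [(v >= o3_thr) if o3_high else (v <= o3_thr) for v in o3]
    n = len(pm25)
    p_joint = sum(a and b for a, b in zip(pm_ext, o3_ext)) / n  # joint probability
    p_pm = sum(pm_ext) / n                                       # marginal, PM2.5
    p_o3 = sum(o3_ext) / n                                       # marginal, O3
    return p_joint / (p_pm * p_o3)
```

With 20% of days in each tail, independence gives a joint probability of 0.04, so the LMF can range from 0 (the extremes never coincide) to 5 (they always coincide).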

Simultaneous Ozone and PM 2.5 inversion deep neural network (SOPiNet)
We developed SOPiNet to address the following deficiencies associated with single modeling retrievals: (1) PM 2.5 and O 3 have similar emission sources and significant commonalities, and single task modeling is limited in that it cannot use shared information to improve the estimation accuracy; (2) when ground-based observations for PM 2.5 or O 3 are missing, there are gaps in the training data for single modeling; and (3) variations in the air conditions of different days in the past have different impacts on current PM 2.5 and O 3 , and capturing this feature at different time points is challenging. Fig. 2a shows the framework of SOPiNet, which consists of the following three key parts: (1) a deep neural network (DNN) to process satellite and other ancillary data; (2) multi-head attention to learn temporal relationships across different past days; and (3) joint training applied by integrating DNN and attention-based features for shared representation learning. The codes of SOPiNet and its user guide are freely available online at https://github.com/RegiusQuant/ESIDLM. All input variables for SOPiNet are shown in Table S2. The collected data from 2020 to 2021 were used as training data to train the model, the data in 2019 were used as validation data for model hyperparameter tuning, and the data in 2022 were used as test data to evaluate the model performance.
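The time-based split described above (2020-2021 for training, 2019 for validation, 2022 for testing) can be sketched as follows. The record layout with a `"year"` key and the function name `split_by_year` are assumptions made for illustration; they are not taken from the SOPiNet repository.

```python
def split_by_year(records):
    """Partition samples by calendar year instead of random sampling.

    Each record is assumed to be a dict with a "year" key (hypothetical
    layout). Years outside 2019-2022 are ignored.
    """
    splits = {"train": [], "val": [], "test": []}
    for rec in records:
        year = rec["year"]
        if year in (2020, 2021):
            splits["train"].append(rec)   # used to fit the network
        elif year == 2019:
            splits["val"].append(rec)     # used for hyperparameter tuning
        elif year == 2022:
            splits["test"].append(rec)    # held out for evaluation
    return splits
```

Unlike a random split, no sample from the test year can enter the training set, which matches the real-time setting in which only past information is available.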

DNN framework
In SOPiNet, we introduce a DNN-based framework called Entity-DenseNet (Yan et al., 2020). The input data are separated into two groups: categorical and numerical variables. The categorical variables are first processed by an embedding layer and then merged with numerical variables as inputs to hidden layers. We constructed three hidden layers in SOPiNet, each comprising one batch normalization layer, one fully connected layer, one dropout layer, and one rectified linear unit layer. The details of the feed-forward operation in this DNN framework are reported by Yan et al. (2020).

Multi-head attention
SOPiNet learns relationships across different past days via the multi-head attention mechanism. An attention function takes as input the queries (Q), keys (K), and corresponding values (V) (Vaswani et al., 2017):

$$Q, K, V \in \mathbb{R}^{N \times d_{model}}$$

where Q, K, and V are real-valued matrices with N rows and d model columns. They are then projected into head-specific representation subspaces:

$$Q_i = Q W_i^{Q}, \quad K_i = K W_i^{K}, \quad V_i = V W_i^{V}$$

where i denotes the head number, and $W_i^{Q}$, $W_i^{K}$, and $W_i^{V}$ are head-specific weights for Q, K, and V. A common choice for computing the attention in head i is the scaled dot-product attention, which can be expressed as follows:

$$\mathrm{head}_i = \mathrm{softmax}\left(\frac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$$

The final multi-head attention output is obtained by a linear combination ($W^{O}$) of all I heads:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_I)\, W^{O}$$

Fig. 2c shows how the multi-head attention works in this study, using information from the past 3 days as an example (in practice we used the past 20 days). The input X (3 × 4) has 3 rows (corresponding to the number of days) and 4 columns (one for each feature: PM 2.5 , O 3 , temperature, and relative humidity). A fully connected layer then applies a linear transformation to obtain X 2 (3 × 8); that is, the original 4 features are transformed into an 8-dimensional feature space (d model = 8). If the number of heads is I = 2, then d k = d v = d model /I = 8/2 = 4; therefore, Q i , K i , and V i are 3 × 4 matrices. The outputs from heads 1 and 2 are 3 × 4 matrices, and their concatenation yields a 3 × 8 matrix. Through the linear transformation W O , which has 8 × 8 dimensions (I × d v = 2 × 4 = 8, d model = 8), the final output from the multi-head attention is a 3 × 8 matrix. The input and output matrices of the multi-head attention therefore have the same dimensions. In this way, multi-head attention allows the model to jointly assess information from the past days and learn their temporal relationships.
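The shape bookkeeping in the 3-day example can be checked with a small pure-Python sketch of multi-head self-attention. It is illustrative only: the weights are random stand-ins for trained parameters, bias terms are omitted, Q = K = V = X 2 is assumed (self-attention), and the helper names are hypothetical rather than taken from the SOPiNet code.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(x2, n_heads, rng):
    """Self-attention over the rows (days) of x2 with n_heads heads."""
    d_model = len(x2[0])
    d_k = d_model // n_heads
    heads = []
    for _ in range(n_heads):
        w_q, w_k, w_v = (rand_matrix(d_model, d_k, rng) for _ in range(3))
        q, k, v = matmul(x2, w_q), matmul(x2, w_k), matmul(x2, w_v)
        k_t = [list(col) for col in zip(*k)]
        scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
        heads.append(matmul(softmax_rows(scores), v))   # days x d_k
    # Concatenate the heads column-wise, then mix them with W_O.
    concat = [sum((h[r] for h in heads), []) for r in range(len(x2))]
    w_o = rand_matrix(n_heads * d_k, d_model, rng)
    return matmul(concat, w_o)

rng = random.Random(0)
x = rand_matrix(3, 4, rng)      # 3 past days x 4 features
w_in = rand_matrix(4, 8, rng)   # fully connected layer, d_model = 8
x2 = matmul(x, w_in)            # 3 x 8
out = multi_head_attention(x2, n_heads=2, rng=rng)
print(len(out), len(out[0]))    # output keeps the 3 x 8 shape of x2
```

Each row of the softmax output sums to one, so every day's representation is a weighted average over all past days; these weights are what allow the model to emphasize some days over others.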
In this study, we used the Autoregressive Integrated Moving Average (ARIMA) model to determine the number of past days to use as input to SOPiNet (details can be found in the Supplementary information, "Optimization of the number of past days' information for PM 2.5 and O 3 ") (Sakamoto et al., 1986; Tran and Reed, 2004; Aasim et al., 2019). Based on the ARIMA results, we chose to use 20 days of information as inputs for the SOPiNet real-time O 3 and PM 2.5 retrievals.

Loss function
For effective knowledge sharing across PM 2.5 and O 3 retrievals, a new joint loss function was developed for network optimization: where N is the number of samples, Y PM2.5 truei and Y O3 truei are ground-based measured values taken as the truth at sample i, Y PM2.5 ei and Y O3 ei are model estimated values, and M PM2.5 maski and M O3 maski are mask values for ground-based measured true values. If the ground-based measured value is missing, M maski = 0; otherwise M maski = 1.
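The role of the mask values can be illustrated with a short sketch. The exact form of the loss is not reproduced here, so the code below assumes one plausible form that is consistent with the symbol definitions above: masked squared errors for both pollutants, summed and averaged over the N samples. The squared-error choice, the equal task weighting, and the function name `joint_masked_loss` are assumptions.

```python
def joint_masked_loss(pm_true, pm_est, o3_true, o3_est):
    """Masked two-task loss (assumed squared-error form).

    Missing ground-truth values are passed as None and receive mask 0,
    so they contribute nothing; available values receive mask 1.
    """
    n = len(pm_true)
    total = 0.0
    for pt, pe, ot, oe in zip(pm_true, pm_est, o3_true, o3_est):
        m_pm = 0.0 if pt is None else 1.0
        m_o3 = 0.0 if ot is None else 1.0
        # A placeholder of 0.0 stands in for a missing truth; the mask
        # removes its contribution from the sum.
        total += m_pm * ((0.0 if pt is None else pt) - pe) ** 2
        total += m_o3 * ((0.0 if ot is None else ot) - oe) ** 2
    return total / n
```

Under this form, a sample with only an O 3 measurement still produces a gradient through the shared layers, which is how the joint model can use records that a single PM 2.5 model would have to discard.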

Spatial heterogeneity and gap-filling for cloud impact
Because climate and administrative policies may differ significantly among provinces in China, PM 2.5 and O 3 have distinct spatial characteristics. This study considered season, month, land use type (Table S3), and province (Table S4, Fig. S2) as categorical variables for the model to address the spatial heterogeneity issue. In addition, we used the Cartesian product to model the spatial-temporal feature interactions for PM 2.5 and O 3 in different locations and months:

$$S \times T = \{(s, t) \mid s \in S,\ t \in T\}$$

where s is one of the elements of the province set S and t is one of the elements of the month set T. We input the pairwise feature S × T to SOPiNet to jointly learn the interactions between the spatial-temporal information and O 3 and PM 2.5. [...] 2022). To address this limitation, we first classified the satellite pixels into two types: cloud and non-cloud (Luo et al., 2008). SOPiNet then estimated full-coverage PM 2.5 and O 3 concentrations by mining the relationship between meteorological data, GEOS-CF forecasts, and ground-based PM 2.5 and O 3 under cloud and non-cloud conditions. Geng et al. (2021) showed that this method is robust and counteracts the limitations of cloudy pixels.
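The pairwise province-month feature can be sketched as a lookup table built from the Cartesian product. The three province names are an illustrative subset, and mapping each pair to an integer index (for example, to feed an embedding layer) is an assumption of this sketch rather than a documented detail of SOPiNet.

```python
from itertools import product

def build_cross_vocab(provinces, months):
    """Map every (province, month) pair in S x T to a unique integer index."""
    return {pair: idx for idx, pair in enumerate(product(provinces, months))}

provinces = ["Beijing", "Hebei", "Guangdong"]   # illustrative subset of S
months = list(range(1, 13))                     # T = {1, ..., 12}
vocab = build_cross_vocab(provinces, months)

# A sample observed in Hebei in January gets its own category,
# distinct from Hebei in July or Guangdong in January.
sample_index = vocab[("Hebei", 1)]
```

Treating each pair as its own category lets the model learn, for instance, a winter effect that differs between northern and southern provinces, which separate province and month variables alone could only express additively.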

Significant links between PM 2.5 and O 3
We first assessed the association between extreme PM 2.5 and O 3 events using ground-based observations from 2014 to 2021 in China. We used the LMF to measure the increase in the co-occurrence probability of compound extreme PM 2.5 and extreme O 3 events (above the 80th percentile as extreme high and below the 20th percentile as extreme low) relative to the frequency expected if these extremes were independent. An LMF of 1 or below indicates no increase in the co-occurrence probability of compound extreme PM 2.5 and O 3 events. Fig. 3a-d shows the mean LMF for the four types of concurrent extreme O 3 and PM 2.5 events in the four seasons during 2014-2021. The results indicate that the association between PM 2.5 and O 3 has strong seasonal and spatial characteristics. As shown in Fig. 3a and b for the winter season, the mean LMF was notably higher in northern China than in southern China, which indicates the presence of both compound extreme low PM 2.5 -high O 3 and extreme high PM 2.5 -low O 3 events in northern China in winter. Many studies have reported a significant negative correlation between PM 2.5 and O 3 in winter in northern China due to temperature and emissions related to household heating (Li et al., 2019a,b; Duan et al., 2020). Li et al. (2019a,b) showed that high PM 2.5 concentrations in winter scavenge the hydroperoxy (HO 2 ) and NO x radicals needed to produce O 3 , leading to a decrease in O 3 . On the other hand, winter conditions can also increase the co-occurrence of high O 3 and low PM 2.5 in northern China. This dependence has also been observed in 12 western US cities; O 3 increased with PM 2.5 at low peaks (approximately 30-50 μg/m 3 ) and declined at high PM 2.5 concentration levels (Buysse et al., 2019).
In spring and autumn, the LMF for extreme high PM 2.5 -low O 3 in the Beijing-Tianjin-Hebei region is generally above 2.0, whereas for extreme low PM 2.5 -high O 3 the LMF is always below 1 (Fig. 3e); that is, high PM 2.5 -low O 3 events occur at least twice as often as expected under independence. In spring, this region often suffers from dust storms, which directly affect the radiative forcing and thus the secondary production of O 3 (Forkel et al., 2012; Huang et al., 2014; Kok et al., 2021). In summer, Fig. 3c and d shows that the co-occurrence of extreme high PM 2.5 -high O 3 is especially high. In this season, the enhancement of solar radiation promotes photochemical reactions that lead to O 3 generation. Then, high levels of atmospheric oxidants (O X = NO 2 + O 3 ) can lead to a low oxidation state, oxidizing organic aerosols and resulting in joint extreme high O 3 and PM 2.5 events (Duan et al., 2020). Previous studies have found that O 3 and PM 2.5 are more likely to exhibit a strong positive association in summer, especially in coastal regions such as the Pearl River Delta and Yangtze River Delta, which is consistent with our findings, as shown in Fig. 3c-e. Our analysis reveals a strong relationship between PM 2.5 and O 3 , and the compound extremes of PM 2.5 and O 3 events have clear spatial and temporal patterns in China. Therefore, jointly learning PM 2.5 and O 3 information through deep learning could potentially take advantage of this relationship to improve the retrievals of PM 2.5 and O 3 in different regions and seasons.

Model evaluation and comparison
To quantify the added value of the joint learning in SOPiNet, we predicted real-time values of PM 2.5 and O 3 in 2022 using SOPiNet and compared the results with a single modeling variant of the network that retrieved PM 2.5 and O 3 independently. Fig. 4a-d shows the predicted values evaluated against observations in a time-based validation (2020-2021 as training data, 2019 as validation data, and 2022 as test data). For PM 2.5 , the results from SOPiNet are generally consistent with ground-based observations, with a coefficient of determination (R 2 ) of 72% and a root mean square error (RMSE) of 16.45 μg/m 3 , thus explaining an additional 6 percentage points of variance compared to the single modeling results (R 2 = 66%).
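The two evaluation metrics quoted above can be computed as in the following sketch. It assumes the common definition of R 2 as one minus the ratio of residual to total sum of squares; the study may instead have used the squared correlation coefficient, so this is an illustration rather than the authors' exact procedure.

```python
import math

def rmse(obs, est):
    """Root mean square error between observations and estimates."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))

def r_squared(obs, est):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

Multiplying `r_squared` by 100 gives the percentage form used in the text, so an increase from 66% to 72% corresponds to a gain of 6 percentage points in explained variance.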
In addition, we divided the concentrations into three intervals: 0-50, 50-100, and 100-150 μg/m 3 . The result shows that SOPiNet has significantly higher R 2 than single modeling for PM 2.5 ranging from 50 to 100 μg/m 3 (SOPiNet R 2 = 72%, single modeling R 2 = 63%) and 100-150 μg/m 3 (SOPiNet R 2 = 62%, single modeling R 2 = 55%), which indicates that the joint learning especially improved the result for high [...] From Fig. 4a-d, it can be seen that SOPiNet has more reliable estimates for high levels of PM 2.5 and O 3 compared with the single modeling results. In particular, SOPiNet reduces the underestimation issue in the 50-150 μg/m 3 range for both the PM 2.5 and O 3 retrievals (Fig. 4e and f). Table S5 lists the performances of models from previous studies which used the time-based method for validation (R.Y. Liu et al., 2020; Wei et al., 2020; Yan et al., 2020; Chen et al., 2021; Geng et al., 2021; Huang et al., 2021; Yan et al., 2021; Dong et al., 2022; Luo et al., 2022; Wang et al., 2022). From the comparison with previous studies, SOPiNet exhibits certain improvements with respect to R 2 and RMSE.
To evaluate the performance of SOPiNet in areas where there are no ground-based stations to provide training data, we randomly excluded 200 observational sites (Fig. 4g) from the training set. The validation results in Fig. 4h and i show that SOPiNet still performs well in areas with no stations, with an R 2 of 67% for PM 2.5 and 76% for O 3 .
In addition, compared with single PM 2.5 and O 3 retrievals, SOPiNet significantly reduced the training and inference time (see Table S6). SOPiNet decreased the training time by 35.5% and inference time by 32.0% on a computer with a 3960X 24-Core CPU and an NVIDIA GeForce RTX 3090 GPU.
One reason for the improved accuracy of SOPiNet is its ability to utilize more training samples than single modeling. As seen in Fig. 5a, many sites had missing data for PM 2.5 when O 3 measurements were available, and vice versa. Missing data rates exceeding 5% (PM 2.5 or O 3 ) were observed at 36.5% of the sites. In particular, as shown in Fig. S4, certain sites have missing data rates that exceed 20%, with a maximum of 56.3%. Single modeling works only when there is no missing data in each estimation task (Fig. 5b), so many incomplete records are not used for model training. In contrast, SOPiNet can be trained even when either PM 2.5 or O 3 data are missing, which provides additional samples for the training data (used to train the model) and the validation data (used for model hyperparameter tuning) compared to single modeling. Fig. 5c presents a comparison of the model performance under different missing data rates. The advantage of SOPiNet over single modeling clearly increases as the missing data rate increases.
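The gain in usable samples can be made concrete with a small count. The sketch below assumes that each single model can train only on records where its own target is present, while the joint model can train on any record with at least one target; the function name `count_usable` and the use of `None` for missing values are illustrative choices.

```python
def count_usable(pm25, o3):
    """Count training records available to single and joint modeling.

    Missing measurements are represented as None (assumption). Records
    where both targets are missing cannot be used by either approach.
    """
    n_pm = sum(v is not None for v in pm25)
    n_o3 = sum(v is not None for v in o3)
    n_joint = sum(a is not None or b is not None for a, b in zip(pm25, o3))
    return {"single_pm25": n_pm, "single_o3": n_o3, "joint": n_joint}

counts = count_usable(
    pm25=[35.0, None, 80.0, None],
    o3=[None, 60.0, 45.0, None],
)
print(counts)
```

In this toy case each single model sees two records, whereas the joint model sees three, and the gap widens as the missing data rate grows.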
When the missing data rate exceeds 20%, SOPiNet increases R 2 by 11.1% and 6.4% (relative change) for real-time PM 2.5 and O 3 estimation, and decreases RMSE by 12.4% and 6.9% (Fig. S5), respectively.

Fig. 6 shows daily real-time monitoring results obtained by SOPiNet for three conditions: heavy, moderate, and few clouds on January 26, May 7, and March 7, 2022, respectively. On January 26, 2022, the cloud-free satellite pixels covered only 20-30% of the grid cells in China. Under this condition, many hotspots of PM 2.5 and O 3 were missing owing to the effect of clouds. SOPiNet first classifies the pixels as cloud or non-cloud and then, using supporting information from GEOS-CF data (PM 2.5 and O 3 ) and meteorological data, conducts joint learning to explore the implicit relationship between these variables and the actual PM 2.5 and O 3 states. Consequently, the weight coefficients associated with the cloud and non-cloud conditions can be learned during SOPiNet training, which allows the model to be trained effectively on cloud-affected areas and to use the other variables to fill in the gaps. For January 26, 2022, SOPiNet accurately captures the heavily PM 2.5 -polluted (>100 μg/m 3 ) and lightly O 3 -polluted (<40 μg/m 3 ) condition in the Beijing-Tianjin-Hebei region, which is not detected without gap-filling for cloud impact. These improvements were also observed under moderate and few cloud conditions, during which SOPiNet can comprehensively capture daily variations in PM 2.5 and O 3 .

[...] and GEOS-CF forecasts. On January 5, 2022, high PM 2.5 pollution was clearly observed in the Yangtze River Delta and northeast China. Although both SOPiNet and GEOS-CF captured these hotspots, GEOS-CF severely overestimated the PM 2.5 concentration in Sichuan; the PM 2.5 concentrations from ground-based measurements and SOPiNet retrievals were approximately 50 μg/m 3 , while the GEOS-CF PM 2.5 forecast exceeded 90 μg/m 3 . On April 12, 2022, GEOS-CF underestimated (overestimated) O 3 in the Northeast (Southwest) region of China, while SOPiNet accurately captured these events.

Conclusion and discussion
This study proposes a novel satellite-based inversion deep learning model, SOPiNet, for real-time and simultaneous PM 2.5 and O 3 monitoring. First, the relationship between O 3 and PM 2.5 was illustrated. Both compound extreme low PM 2.5 -high O 3 and extreme high PM 2.5 -low O 3 events were found in northern China in winter.
Additionally, the co-occurrences of both high O 3 -high PM 2.5 and low O 3 -low PM 2.5 were especially high in western China during summer. To better model this highly relevant relationship and improve real-time O 3 and PM 2.5 monitoring, SOPiNet was designed within a two-task deep neural network framework, which simultaneously learns PM 2.5 and O 3 retrieval tasks and shares the most relevant features of both.
Moreover, we found that historical air conditions from the past days to weeks contain information relevant for the estimation of current PM 2.5 and O 3 (Fig. S7). We determined that including information about the air quality from the past 20 days was optimal for current real-time PM 2.5 and O 3 estimation in SOPiNet. The important features from the past days were captured in SOPiNet using the multi-head attention mechanism. As shown in Fig. 4, the validation results indicate that SOPiNet is suitable for the simultaneous inversion of PM 2.5 and O 3 and achieves improved overall performance compared to the single-species inversion model. One reason for this improvement is that SOPiNet utilizes more samples in the training dataset when data are missing for one species but available for the other; consequently, the joint retrieval is particularly beneficial in cases where missing data rates are high.
A limitation of SOPiNet that should be addressed in future studies is that the model cannot be trained when PM 2.5 and O 3 exhibit missing data simultaneously. Many existing research studies indicate that missing training samples have significant impacts on model performance (Shen et al., 2018;Samal et al., 2021;Yang et al., 2022b;Li et al., 2022b). Therefore, comprehensive utilization of the collected data is challenging, and further investigation is needed to handle missing data. On the other hand, SOPiNet employs multi-head attention to capture patterns existing in time series, but the mechanism for visualizing this pattern has not been determined. Moreover, the same meteorological factors in different weather conditions could have different impacts on the PM 2.5 and O 3 retrieval. Although the potential dependence of the shared meteorological factors on different impacts can be captured by SOPiNet with shared representation learning, it is currently challenging to physically interpret the results. Therefore, further work is needed to open the deep learning black box to understand the processes that improve the prediction of air pollution.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request. The codes of SOPiNet and its user guide are freely available online at https://github.com/RegiusQuant/ESIDLM.