A novel fusion framework embedded with zero-shot super-resolution and multivariate autoregression for precipitable water vapor across the continental Europe

Precipitable water vapor (PWV), as the most abundant greenhouse gas, significantly impacts the evapotranspiration process and thus the global climate. However, the applicability of mainstream satellite PWV products is limited by the tradeoff between spatial and temporal resolutions


Introduction
Precipitable water vapor (PWV) is a critical atmospheric indicator that has rapid feedback on global warming.As the medium of land surface-atmosphere energy exchange, PWV has been widely used in many meteorological and hydrological applications, e.g., drought monitoring (Jiang et al., 2017), rainfall forecast (Yao et al., 2017), lightning activity analysis (Price, 2000), watershed management (Ricciotti and Cordeira, 2022), and extreme climate simulation (Goergen and Kollet, 2021).The spatio-temporal variations of PWV at local and global scales are crucial for researchers to deeply understand the earth's climate system.
Over the past decades, PWV was either directly observed at ground stations or derived from satellite sensory (King et al., 2003;Tregoning et al., 1998).Radiosondes and the global navigation satellite system (GNSS) are widely used as ground-based observations to measure water vapor.Radiosondes can record water vapor at various pressures, but they contain large biases due to changes in hygrometers and observing practices (e.g., weather conditions) (Basili et al., 2001).Meanwhile, the GNSS can retrieve the total vertical integrated water vapor over the receiver with high precision and high-temporal resolution at the station scale, so it is usually taken as a ground reference for validation purposes of other PWV products (Niell et al., 2001).Moreover, the satellites provide an effective approach for large-scale observation of PWV.Specifically, the satellite-based PWV can be retrieved by using visible (Chan et al., 2020;Vaquero-Martínez et al., 2018), near-infrared (Abbasi et al., 2020;Gao and Kaufman, 2003), thermal infrared (Ren et al., 2015;Wang et al., 2015), microwave (Boukabara and Garrett, 2018;Dong and Zou, 2017), and hyperspectral (Acito et al., 2020;Barducci et al., 2004) sensors.Among various satellite sensors, the Moderate-resolution Imaging Spectroradiometer (MODIS) can deliver a high-resolution (1 km) PWV product (i.e., MOD05) with the near-infrared method.MOD05 can provide promising clear-sky PWV estimation but its high-resolution application is often limited by discontinuous observations under cloudy conditions.
As complementary information of ground-based or satellite-based PWV observations, the climate reanalysis data has been widely used at global scales (e.g., the MERRA-2, JRA-55, CFSR, and ERA5).They integrate various historical observations into continuous large-scale estimations using advanced modeling and assimilation techniques (Bosilovich et al., 2015;Hersbach et al., 2020).Among several reanalysis products, the ERA5 product has gained popularity due to its notable advancements in tropospheric representation and tropical cyclone simulation.It can provide hourly 0.25 • PWV at global scales which supports many scientific studies and engineering projects.As a valuable resource for long-term climate analysis, the ERA5 not only provides valuable insights into the hydrological cycle but also helps understand the energy balance of the earth system (Yu et al., 2021).If the spatial resolution of ERA5 could be further improved, many modeling systems and predictions would be benefited.
It has been challenging to acquire comprehensive water vapor information under complex climate conditions, hence it is necessary to investigate the complementary observations of different data sources.To enhance the spatio-temporal continuity of PWV, several types of methods have been developed in the past two decades.They include spatio-temporal interpolation (Sun et al., 2021;Wong et al., 2015), dataassimilation (Dee et al., 2011;Seko et al., 2011), and data fusion methods (Li and Long, 2020).The interpolation methods explore the spatio-temporal features of discrete point data to predict PWV with high continuity, but they are sensitive to noise in extreme weather conditions.The data-assimilation methods incorporate observations from various sources with assimilation models to generate high-resolution PWV.Nevertheless, the assimilation accuracies are prone to the inconsistency of input station-based and satellite-based data.The data fusion methods develop the connections between various PWV products under clear-sky conditions to reconstruct fine cloudy-sky PWV.Since the spatiotemporal fusion technique was initially developed to downscale surface reflectance, some methods (e.g., the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM) (Zhu et al., 2010) and the flexible spatio-temporal data fusion (FSDAF) (Zhu et al., 2016)) have been introduced into the reconstruction of climate variables (e.g., land surface temperatures, evapotranspiration, and water vapor).Li and Long (2020) fused MOD05 and ERA5 PWV with the ESTARFM method and achieved promising results in high-mountain regions.Nevertheless, if only considering several quantities with limited kernel sizes locally (e.g., spatio-temporal distances, spectral similarity, and residual correction), the accuracy remains uncertain due to the large scale difference between the coarse and fine resolution PWV images (about 25 times).
In recent years, deep learning image super-resolution has attracted a great deal of attention from the remote sensing community because of its exceptional capacity for image information mining.Since the superresolution convolutional neural network (SRCNN) was proposed, several supervised methods have been developed to enhance the resolution of different images (Dong et al., 2015).These SotA (State of the Art) image super-resolution methods achieved accurate results for biogeophysical pictures or reflectance images based on the predefined training data for the networks.However, these methods of training were based on externally supplied examples that might not be effective for PWV super-resolution, because they ignored the systematic bias of different datasets (e.g., differences in sensor noise and imaging conditions), along with the strong spatio-temporal non-stationarity of PWV.To improve the accuracy of non-ideal image prediction, some unsupervised methods are proposed to train image-specific CNNs on examples extracted solely from the input image itself (e.g., "Zero-Shot" superresolution (ZSSR) (Shocher et al., 2018)).These methods are parameterized based on internal statistical information, showing better performance than those based on external information, especially for nonideal bio-geophysical images.
Across the European continent which has significant PWV seasonal and interannual anomalies (Yuan et al., 2021b), the seamless PWV with high spatiotemporal resolution is often desired by many individuals because discontinuous observations hinder the development of longterm monitoring and near real-time warning systems (Khordakova et al., 2022) for flood, drought, forest fire, etc.To improve the accuracy and spatio-temporal continuity of PWV, this study proposed a novel PWV spatio-temporal fusion framework based on ZSSR and the multivariate autoregression models (ZSSR-ARF).It is performed to generate seamless PWV maps of high spatio-temporal resolution (0.01 • , daily) by fusing MOD05 and ERA5 PWV across the European continent from to 2021.The accuracy of different PWV products and their fusion results were evaluated with GPS PWV from 230 stations.The spatial variations and temporal dynamics of PWV are analyzed at the grid and continental scales.This study provides a perspective for the accurate estimation of PWV by combining image super-resolution and spatio-temporal fusion techniques.

Study area
The study aims at delivering highly accurate and complete coverages of the water vapor quantities based on satellite observations and reanalysis data across the continental Europe as shown in Fig. 1.In the figure, the distributions of the GPS observations used for validation in this study are shown as well.The study area lies mainly in the temperate climate zone which is susceptible to prevailing westerlies.The annual and seasonal variability of PWV is related to the local climate which is affected by land cover, sea-land distance, and topography (Saracoglu and Sanli, 2021).Hence, the Köppen-Geiger climate classification (Kottek et al., 2006) is used to define the climate zones, including arid, temperate, cold, and polar areas (Fig. 1a) for spatio-temporal analysis of PWV.
Compared to distributions of other greenhouse gases (e.g., carbon dioxide) over Europe, the distributions of PWV may exhibit significant variability as water vapor often has a residence time of approximately days within the troposphere when the evapotranspiration and the sinks get stronger (Gangoiti et al., 2011).In recent years, global warming has triggered extreme weather events in the European continent, such as extreme heatwaves, hurricanes, and forest fires, especially during the summer of 2022 (Armstrong McKay et al., 2022).According to the Clausius-Clapeyron formula (Trenberth et al., 2003), a 1 K change in temperature may result in a 7% increase in the water vapor storage capacity, hence there were remarkable increments in water vapor over Europe under extreme weather over the past decade.The interannual PWV variations in Europe in winter are mainly related to the North Atlantic Oscillation (NAO) and the East Atlantic (EA) patterns (Yuan et al., 2021b).In the past two decades, the PWV in Europe has been increasing at 0-0.4 kg m − 2 decade − 1 in the north and 0.4-1 kg m − decade − 1 in the south, given by the GPS-station analysis (Yuan et al., 2021a).The significantly varying climate and hydrological processes which are changing significantly in Europe have great impacts on the natural ecological environment and human social activities.As a result, a high spatio-temporal resolution of water vapor is always desired to track its variation and investigate its mechanism, which is in accordance with the aim of this study.

Satellite and reanalysis PWV
The PWV of MODIS (i.e., the MOD05 product observed from the Terra satellite) was used as one of the two major inputs for the proposed framework because it has a relatively high spatial resolution (1 km) and temporal recycle (daily) for large-scale PWV applications (Gao and Kaufman, 2003).To improve the continuity of MODIS PWV observations, the ERA5 reanalysis PWV datasets (0.25 • × 0.25 • ) were used as the second input for the fusion framework to reconstruct reliable water vapor information within the cloud-contaminated pixels.ERA5 provides more reliable hourly PWV estimates compared with its predecessor ERA-Interim and many other reanalysis (Albergel et al., 2018).For the European continent, the overpass time of the Terra satellite ranges from ~8:00 to ~12:00 a.m.UTC.To increase the fusion accuracy, the daily mosaic PWV map is retrieved by integrating hourly super-resolution enhanced ERA5 0.05 • PWV based on MODIS swath according to the overpass time of the Terra satellite.The PWV generated by the proposed framework is not daily synchronized at a global scale, but based on the local overpass time.To enhance the spatio-temporal consistency of the low-resolution (LR) and high-resolution (HR) image pairs, ERA5 and MODIS PWV images with minimum time difference are used to parameterize the fusion model, and they are resampled to 0.01 • using the nearest neighbor method before performing the fusion process.

GPS derived PWV
PWV measurements derived from a combined dataset of 230 GPS stations in continental Europe, comprising 219 stations of the second reprocessed product from EUREF Permanent Network (EPN-repro2) (Pacione et al., 2017) for the period 2001-2014 and 54 stations of the International Ground Stations (IGS) network (Thorne et al., 2017) for the period 2015-2021 with overlapping stations accounted for, are used for the validation of the fused results.The initial zenith total delay (ZTD) time series are homogeneously reprocessed by different licensed Analysis Centres (AC) with combined solutions and all estimated biases from GPS antenna replacements are eliminated.The outliers in ZTD time series with a standard deviation larger than 15 mm are excluded (Pacione et al., 2011) and the claimed accuracy is within 4 mm varying with the intercomparison schemes.With some minor technical exceptions, the processes for converting ZTD to PWV are identical for both EPN-repro2 and IGS datasets with an uncertainty of <2 mm.
The conversion of PWV from the GPS ZTD was carried out with auxiliary meteorological variables from ERA5 pressure level product.Details of the conversion can be found in the knowledge base of European Centre for Medium-Range Weather Forecasts (https://confluence. ecmwf.int/pages/viewpage.action?pageId=301961645) and hence only a brief introduction is given here.The GPS signals are attenuated during its tropospheric propagation, leading to the zenith hydrostatic delay (ZHD) and the zenith wet delay (ZWD) which are mainly caused by dry gas and water vapor, respectively.The ZHD can be estimated by using ERA5-derived pressure (Davis et al., 1985): where P, φ, and h are the pressure, latitude, and height of the station with units of hPa, rad, and m, respectively.The ZTD, the sum of ZHD and ZWD, can be estimated in GPS data processing.Then, the PWV can be estimated by using the following equation (Bevis et al., 1992): where R V is the gas constant, k ′ 2 and k 3 are the atmospheric refractivity constants.From Eq. (3), the weighted mean temperature (T m ) can be calculated with water vapor pressure (e) and temperature (T) profiles from ERA5 products: where e i , T i , and ΔH i are the average water vapor pressure (hPa), temperature (K), and atmosphere thickness at the ith layer of the atmosphere, respectively.

Ancillary data
In this work, we consume the cloud-mask quality assurance (QA) of the MOD05 product, different topographical measurements (i.e., GPSstation, MOD03, and ERA5 elevations), and the related environment variables (e.g., pressure and temperature of ERA5) as the ancillary data.The QA band stored the cloudiness information for each MOD05 pixel, and only confident clear pixels are treated as reliable inputs for the subsequent fusion.For the comparison of different PWV products (i.e., the GPS, satellite, and reanalysis-based PWV), the elevation information is used to integrate PWV with different elevation levels for comparison.Then, the environment variables are used to retrieve GPS-based PWV for validation.To improve the spatio-temporal analysis of the fused PWV, the mean annual air temperature and precipitation are extracted from the ERA5 products.

Methodology
The proposed framework, namely, ZSSR-ARF, is based on image super-resolution and data-fusion techniques, in which several deeplearning and multivariate autoregression processes are involved.As shown in Fig. 2, the proposed framework consists of three main steps: 1) Image enhancement with ZSSR; 2) Optimal input image pairs selection.
3) Spatio-temporal fusion based on multivariate autoregression.For Step 1, the ZSSR method is employed to enhance the resolutions of ERA5 PWV images with solely inputs of the images themselves.By performing so, the uncertainty for fusion between the ERA5 and MOD05 images can be reduced.For Step 2, the optimal image pairs at cloud-free dates are selected as input for the image prediction at cloudy dates.The selection process is implemented based on the confident clear ratio of pixels and correlation coefficients between images at basic and prediction dates.For Step 3, the proposed ARF approach is used to fuse the enhanced ERA5 and MOD05 to generate high-resolution seamless PWV over the study area.After that, the fused PWV is validated by using observations derived from 230 GPS stations.Prior to the framework, the preprocessing of input data, including quality assurance, is introduced.

Image enhancement with ZSSR
To reduce the scale differences (~25 times) between the ERA5 and MOD05 PWV, the resolution of ERA5 should be first enhanced with minimal external intervention.This could be achieved by adapting the ZSSR method (Shocher et al., 2018).The ZSSR is a deep-learning algorithm that trains self-supervised fully convolutional neural network (FCNN) with a series of images downsampled from the target image itself to raise its resolution.The basic concept is to utilize the internal recurrence of information of the same image to train the network, so external information is not required.In this work, we adapted the ZSSR approach to enhance the resolution of ERA5 PWV from 0.25 • to 0.05 • to vitalize the subsequent data fusion process.
As can be seen in Fig. 3, the original LR image (I) is first downsampled to different lower-resolution versions of itself (I = I k , k = 1, 2, 3, …n).Then, they are further downsampled by the desired scale-factor s to obtain the LR image derivatives (I ′ k ).The network can stochastically train over these image-specific LR-HR example pairs (I k and I ′ k ).Moreover, the geometric self-ensemble strategy is used to enhance sample diversity.These image pairs and their rotation as well as mirrorreflection versions are used to construct several LR-HR image pairs for the training.On the other hand, a relatively light FCNN (8 hidden layers; each has 64 channels) is employed to train the difference (e) between the LR and degraded HR images which can be expressed as follows: where I LR and I HR are the LR and HR images, respectively.w is the kernel size of convolution (*).
The FCNN is parameterized by using the L1 loss function with ADAM optimizer and ReLU activations, and then the feedforward network is applied to the test image for prediction.It should be noted that the super-resolution is performed gradually for each intermediate scale factor (s = 1, 2, 3, 4, 5) until the desired scale is acquired.The generated HR image and its LR version at each intermediate scale s are supplied to the gradually growing training datasets for prediction at scale s + 1.

Optimal input image pairs selection
The enhanced ERA5 and MOD05 PWV are further integrated into the spatio-temporal fusion for the prediction of images at cloudy dates.However, the performance of spatio-temporal fusion is greatly affected by the radiometric and geometric characteristics of the input LR-HR image pairs (Shen et al., 2022).To provide reliable inputs for the PWV fusion approach, we adopt a flexible strategy to select the optimal basic image pairs.Firstly, the ratio of confident clear pixels to total pixels (confident clear ratio) within the 0.25 • grid unit is obtained for each nearby date.The nearby dates are defined as the previous and following two months of the prediction date considering the seasonal trend of PWV under westerlies and cloud interference, and only the image pairs with a confident clear ratio more than the 95th percentile are retained (Li and Long, 2020).Secondly, the confident clear image pairs at nearby dates which are significantly correlated with the image at the prediction date are selected considering the PWV may vary greatly in a short period under extreme weather conditions.Specifically, the correlation coefficients of the 0.05 • ERA5 PWV between the prediction date and the pair dates are calculated to evaluate the consistency of the pair images (Xie et al., 2018).The paired images in the previous and next two months, which have the maximum correlation coefficients, are considered as the best consistent image pairs.Based on this criterion, the paired images are determined as the optimal base image pairs for the fusion.

Spatio-temporal fusion based on multivariate autoregression
By using the selected optimal LR-HR image pairs between two basic dates (t m and t n ), the proposed fusion method predicts the HR image at date t p based on a multivariate autoregression model.To match different images, all the LR images are reprojected to the coordinate of HR images and resampled to fine resolution by using the nearest neighbor algorithm.Moreover, a radiometric normalization is implemented by applying a linear regression between LR-HR image pairs within a moving window, and the value of the target pixel (i.e., the center pixel in the moving window) is estimated based on the neighborhood pixels.
For two dates t m and t n , the temporal changes within each LR pixel can be considered as the weighted combination of subpixel changes of different classes (Eq.5) according to the pixel unmixing theory (Zhu et al., 2016).The spectral classes, rather than real land cover classes of the subpixels in each LR pixel, are obtained by applying the unsupervised ISODATA to make the unmixing process more flexible.
where M c is the number of HR pixels of class c within the LR pixel C(x i , y i ).N is the total number of pixels within the LR pixel and l is the number of classes.From Eq. ( 5 ) .
Based on the prediction of temporal changes, there is a timedependent increment R T (x i , y i ) between two-source images caused by Fig. 3. Workflow of the ZSSR network.
differences in sensor bandwidth and imaging conditions which can be expressed as: The above temporal change prediction can effectively record class variability induced by climate change but cannot reveal the within-class variations contained in the LR image at date t n .These within-class variations can be characterized by the space-dependent increment which is retrieved from a spatial prediction process (Liu et al., 2019).Specifically, the HR images at dates t m (F SP (x i , y i , t m )) and t n (F SP (x i , y i , t n )), are predicted based on the advanced ZSSR model considering the spatial dependence of PWV, and these HR images will retain useful information on spectral class changes when these changes are significant enough to appear in LR pixels.The space-dependent increment, R S , is defined as the difference between the spatial predictions at dates t m and t n : The spatial and temporal predictions as well as their increments provide crucial information for the HR image prediction.Previous unmixing-based fusion approaches usually combine them weighted by the homogeneity and heterogeneity of landscapes (Liu et al., 2019;Zhu et al., 2016Zhu et al., , 2010)).Since PWV has stronger non-stationarity and autocorrelation than land covers or surface reflectance (Zveryaev and Rudeva, 2011), this study proposes a multivariate autoregression-based approach for effectively integrating spatio-temporal predictions and increments, with the capability to adaptively allocate weights according to different spatiotemporal patterns in PWV.To reduce the bias caused by the autocorrelation of PWV, the spatial terms of these subitems are introduced to the model (Eq.9) in which the spatial weight matrix W is filled using the inverse distance method (Deeter and Evans, 1997).
The predictions and increments retrieved from the LR-HR image pairs are used to estimate the parameters (i.e., α, β, δ, γ, φ, and η) based on the local weighted linear regression, with an aim to minimize the residual.The HR image at date t p is predicted by using the LR-HR image pairs at dates t m and t n based on the dual-stage image predictions.After that, the optimized parameters are utilized to blend multivariate images to obtain F m (x, y, t p ) and F n (x, y, t p ). Finally, the weighted combination of the two prediction results is considered concerning the temporal similarity of PWV (Zhu et al., 2010): where temporal weight T k (k = m and n) is calculated by: ( 1 where r and s are the indexes of columns and rows in the moving window, respectively.In this study, the similarity strategy is employed to reduce the uncertainties caused by image noise which mainly manifests in two aspects: (1) The temporal changes of subpixels are estimated class by class for the prediction of HR image.(2) Only neighbor pixels with spectral values less than twice the standard deviation from the center pixel, are regarded to have the least spectral difference and are included in the prediction process.Furthermore, the PWV mapping of the study area is derived from the fusion results within the coverage of the original MOD05 and ERA5, as adequate neighborhood samples are required in the fusion to ensure the integrity of spatial details and the significance of regression results, particularly for individual insular or coastal grid cells.

Performance assessment
GPS retrievals, satellite, and reanalysis-based PWV are obtained at different elevations.As a result, it is necessary to match multi-source PWV estimates by using elevation correction for validation.
where PWV 0 and PWV c are the origin and corrected water vapor, respectively.Δh is the elevation difference between the satellite or reanalysis grid cell and the GPS station.The constant p of 0.439 is an average estimate for general applications obtained by integrating the standard atmosphere model (Leckner, 1978).
Taking GPS retrievals as the reference for validation, four statistical metrics were used to assess the performance of different PWV products, including the correlation coefficient (r), overall bias (Bias), root-meansquare error (RMSE), and the Kling-Gupta Efficiency (KGE) (Kling et al., 2012): where x i is the evaluated variable (satellite and reanalysis-based PWV), y i is the referenced variable (GPS-based PWV).n is the number of samples.r is the correlation coefficient and β is the bias ratio of the average value of the evaluated variable to referenced variable.γ is the variability ratio of the standard deviation of the evaluated variable to referenced variable.As a comprehensive model performance criterion, KGE can be decomposed into the contribution of correlation, bias, and variability ratio between datasets with a considerable fitting indicated by a value approaching 1.

Accuracy assessment of multi-source PWV with GPS-station measurements
The estimated PWV by using the proposed ZSSR-ARF framework, along with the raw and intermediate inputs, are compared with the GPSderived observations as illustrated in Fig. 4 and Table S1 in Appendix A. To maintain spatial consistency, all grid images are interpolated to the station scale for comparison with GPS data.
It can be seen that the consistency of the eastern inland areas in the time domain is greater than that of the western coastal areas.Besides, the bias of the south is larger than that in the north.The reanalysis-based ERA5 product shows the lowest bias with GPS retrievals (RMSE = 0.70-2.68mm and Bias = − 0.72-1.90mm), compared with the satellitebased MOD05 product as well as its derivatives.Moreover, the correlation coefficients between ERA5 PWV and reference values at most stations reach 0.99, and the average KGE exceeds 0.96.The high accuracy and continuity of ERA5 show its potential to engage in data fusion.
The original MOD05 shows poor consistency with GPS estimates (r = 0.22-0.66and KGE = 0.17-0.64).Overall, they indicate a large difference in temporal variations, particularly in CREU, MALL, PULA, ROVE, UPAD, and VIGO stations where RMSE exceeds 9 mm (Table S1).The MOD05 has obvious underestimation compared with ERA5 in cloudy conditions which greatly limits its high-precision application.Since the cloudy pixels are removed, the accuracy of clear MOD05 PWV is improved with r ranging from 0.55 to 0.95 and KGE ranging from 0.21 to 0.89.Nevertheless, the bias of clear MOD05 PWV remains large (mean RMSE = 4.12 mm and mean Bias = 1.25 mm) for most of the stations.The fused PWV shows a greatly improved performance compared to the original and clear MOD05.The accurate and continuous fused PWV product has high correlations (r = 0.82-0.95) and low bias (RMSE = 2.21-4.01mm) with the GPS estimates.Additionally, the fused result greatly reduces the underestimate of the original MOD05 product (mean Bias = 0.76 mm) and is fairly consistent with GPS-based PWV (mean KGE = 0.83).Accordingly, the proposed framework holds great potential for generating PWV with high accuracy and continuity.
To demonstrate the superiority of the proposed framework explicitly, five typical stations including BELL, BOR1, MAD2, UNPG, and ZIMM are selected for scatter plot analysis, as shown in Fig. 5.These stations with diverse topography and climatic conditions have different characteristics in terms of PWV distribution, variation, and uncertainty, which can reflect the performance of the fusion framework under different situations.The MOD05 subplots have a large number of outliers at the bottom of Fig. 5 a 2 -e 2 , resulting in a large RMSE of over 5.48 mm.After removing cloudy pixels, clear MOD05 shows a significant improvement compared to MOD05, but it still has an obvious overestimation as the outliers are concentrated at the top of Fig. 5 a 3 -e 3 .The Fused subplots have a more concentrated scatter distribution, and they show a high accuracy at different stations (R 2 = 0.79-0.84).These results indicate that the fusion framework can effectively reduce the errors and uncertainties of the MODIS PWV products and produce more reliable PWV estimates.
We further compare the fitting effects of seamless fused PWV derived from ERA5, the ZSSR-ARF, and other methods (i.e., ESTARFM and FSDAF) versus GPS PWV at all stations (Fig. 6).The ESTARFM method is proposed to predict the fine cloudy-sky image based on pixel unmixing theory which has great performance in heterogeneous areas (Zhu et al., 2010), while the FSDAF approach improves the fusion accuracy in areas with land cover changes by adapting advanced residual distribution method (Zhu et al., 2016).It can be seen in Fig. 6 that the fitting of ZSSR-ARF has a higher R 2 (mean of 0.83) and lower RMSE (mean of 3.02 mm) than the other methods do.Moreover, ERA PWV has a significant consistency (R 2 of 0.95 and RMSE of 1.34 mm) with GPS data due to the superior data assimilation technology and integration of massive observation information.

Independent validation at the grid scales
To evaluate the spatial performance of multi-source PWV, ERA5, MOD05, and fused PWV within a station-centric grid cell are compared.Specifically, the fused results of ZSSR-ARF, ZSSR, and other classical methods (i.e., ESTARFM and FSDAF) are validated with the clear-sky MOD05 PWV (Figs. 7 and 8).The comparison dates with high confident clear ratios for typical stations are selected from July to September 2018 as examples, and these clear PWV images at comparison dates are Despite the improvement in spatial performance, the results of SR combination methods have local regional biases with clear MOD05 images (e.g., the artifacts at the BOR1 and ZIMM stations), which are caused by abrupt changes not captured in the ERA5 images.
In comparison to the SR combination methods, the ZSSR method as well as spatial prediction based solely on ERA5 images produce smoother PWV distribution indicating the superiority of integrating MOD05 images for the fusion.However, the distribution pattern in PWV of ZSSR differs from the results of other methods in some local areas which can be related to system bias between ERA5 and MOD05 images.Furthermore, the fusion results of ZSSR-ARF have more similar distributions and considerable accuracy (R 2 = 0.74-0.94,RMSE = 0.66-2.88mm) with the MOD05 PWV, indicating a great improvement (with R improved by over 24% and RMSE reduced by over 0.61 mm) compared with the ESTARFM and FSDAF methods, as shown in Fig. 8.

Comparison of distribution differences at continental scales
The fused PWV needs to be evaluated at continental scales for wider applications.For this purpose, the spatial distributions of different products including ERA5, MOD05, clear MOD05, and the fused PWV, are compared (Fig. 9).
For images in January and October, the ERA5 shows generally decreasing trends from south to north (Fig. 9 a 1 and a 4 ).The original MOD05 shows obvious cloud contaminations which results in very few available pixels in the clear MOD05 (Fig. 9 c 1 and c 4 ).For images in April and July, there are some aggregated outliers in local areas for the ERA5 PWV, e.g., the minimums occurred on the northwest coasts (Fig. a 2 ).and the maximums appeared on the south coasts (Fig. 9 a 3 ).ERA5 has good consistency with the clear MOD05 in the clear-sky areas, which shows its potential to perform the fusion.As can be seen from the original MOD05 (Fig. 9b), there are several abrupt changes caused by cloud covers between mountains and plains, and between coast and inland.Although the cloudy pixels are greatly reduced in the clear MOD05 PWV maps, they are not applicable for large-scale PWV monitoring because of spatial discontinuity.The proposed ZSSR-ARF fusion framework can generate continuous PWV estimates and capture more spatial details, particularly on the Alps and Apennine Mountains (Fig. 9d).However, the fused PWV may differ from the ERA5 products in local areas (e.g., the northwest in Fig. 9 a 1 and d   difference between the two data source.Since the ERA5 provides accurate background temporal information and the clear MOD05 provides detailed spatial information for the framework, the spatial performance of fused PWV relies more on the accuracy and reliability of clear MOD05 pixels. As shown in Fig. 10, we employed the differences between products and the Difference STandard Deviation (DSTD) to evaluate the performance of the MOD05-based and fused PWV against the interpolated ERA5 products (Yu et al., 2021).ERA5 data is used as a reference since it typically has high accuracy due to the assimilation of various observational data sources, such as satellite observations, ground-based measurements, and other in-situ data.Compared to ERA5 products, MOD05 PWV has a large underestimation with significant spatial fluctuations (mean difference = − 1.81 mm and DSTD = 7.74 mm).The clear MOD05 PWV is overestimated at the global scale (mean difference = 2.24 mm and DSTD = 3.77 mm) (Fig. 10 b 1 ), and the DSTD of the leeward side is greater than that of the windward side.In addition, the distribution of DSTD in clear MOD05 PWV is not as smooth as the fused results, which is related to spatial discontinuity observations under cloudy conditions.The fused results significantly decrease the overestimation with mean difference of 0.40 mm and mean DSTD of 3.22 mm, and water vapor variations in the western region exceed those in the eastern region.There is underestimation in the northwest and mountain areas because the westerlies lead to drastic climate changes and water vapor increases on the Windward slope of the mountains.Besides, the fused results achieve minimal differences and DSTD in the southern areas, indicating that the proposed framework has ideal performance in the Mediterranean climate.

Spatial variations and temporal trends of PWV
The spatial variations of the fused PWV should follow climatic regular patterns reducing the uncertainty in environmental analysis.In this study, the relative variations defined as the percentage of the PWV range to the average (Li and Long, 2020), are calculated for each 0.25 • grid cell based on the annual fused 0.01 • PWV.As can be seen in Fig. 11, the relative variations in most areas of the European continent are relatively low except for the southern coastal and central mountain areas.The areas with relatively large variations (up to 100%) are mainly located on the Mediterranean coast (Fig. 11a) because the PWV is inhomogeneous in the warm and humid marine climate (Fig. 11b).Moreover, there are larger spatial variations of PWV in the Alps than in the leeward-side plains under the influence of local mountain climate.The rising airflow leads to uneven distribution of water vapor, especially on the mountainsides at ~3000 m (cold and polar zones).There are moderate spatial variations (~50%) in the Carpathian Mountains, which indicates that topography is an important impact factor in inland areas.Overall, the proposed fusion framework is shown to accurately recover the spatial details in each grid unit.These spatial details cannot be retrieved through the original product, and it is difficult to obtain them through simple downscaling methods (e.g., interpolation or resampling).
The fusion results eliminate cloud pollution and image noise to the fullest capacity, which facilitates the long-term trend analysis of PWV.The Mann-Kendall method (Mann, 1945) is used to detect the trend of annual PWV from 2001 to 2021 across continental Europe (Fig. 11c), as this nonparametric test is insensitive to the distribution of input data and has been widely applied in climate analysis.Moreover, the Theil-Sen method (Sen, 1968) is employed to estimate the slope of the Mann-Kendall trend.
The PWV has an average growth trend of 0.057 mm/year across continental Europe and it increases greatly over 0.1 mm/year near the southern and western coasts.It should be noted that there are minor interannual changes that occurred in the Alps, Carpathian, and Pyrenees Mountains with higher rainfall and lower temperature (Fig. 11c and d).However, there is very large interannual growth (over 0.5 mm/decade) on their leeward-side plains with lower rainfall and higher temperature, which may be related to weakened westerlies under the effects of global warming (Dong et al., 2022).In the framework, the high temporal resolution of ERA5 products makes it possible to capture the temporal trends of PWV, while the high spatial resolution of MODIS images provides detailed spatial variations.By fusing these two datasets, we succeeded to generate high-resolution and seamless PWV images that offer a comprehensive understanding of PWV trends and distributions across continental Europe.

Performance of the proposed PWV fusion framework
In this study, the ERA5 and MOD05 PWV are fused to generate accurate and continuous PWV with high resolution (0.01 • , daily) based on the proposed ZSSR-ARF framework, thereby not only improving the spatial resolution of PWV images but also maintaining or enhancing their accuracy.This framework can accommodate data from different sources which may bear significant scale differences (e.g., 25 times) and cloud contamination severity.To deal with the spatio-temporal non- The image input for the fusion framework is enhanced through some preprocessing techniques.On the one hand, the framework uses the ZSSR super-resolution method to minimize the resolution disparity between the original and the target images, thereby reducing uncertainties caused by scale effects for the fusion.The ZSSR method can fully mine the internal image information and greatly reduce system errors compared with the outer supervision methods.In addition, the light CNN incorporated in the ZSSR architecture not only offers enhanced spatial details for the fusion but also addresses the computational challenges.On the other hand, the selection of input image pairs is properly addressed in the framework because it partially affects the accuracy of the fusion.Previous image fusion studies mainly used the nearest date strategy or the similarity criterion to select basic images.Generally, selecting the basic image pairs closest to the prediction date cannot ensure the best fusion performance since the PWV has strong non-stationarity, especially during extreme weather.A more effective approach is selecting the optimal image based on the similarity between image pairs of the predicted date and the basic dates for the PWV fusion considering both the phenological changes and radiation consistency (Xie et al., 2018).The strategy of selecting the basic clear image pairs that most highly correlate with the image on the prediction date provides accurate inputs for the fusion process.
This framework improves the pixel-unmixing-based fusion method by introducing a multivariate autoregression model.The pixel-unmixing fusion methods usually allocate temporal and spatial prediction residuals based on landscape heterogeneity, and particularly some models considering land cover changes have great performance in different surface conditions (Guo et al., 2020;Zhu et al., 2016).Differing from surface variables, the PWV is much more influenced by climate circulation than land covers, and it has strong spatial non-stationarity and autocorrelation (Zveryaev and Rudeva, 2011).Considering the nonstationarity of PWV, the proposed fusion method predicts the cloudysky images based on pixel decomposition strategy and estimates their increments in time and space domains.Furthermore, the multivariate autoregression model is used to allocate weights within the moving window adaptively which enhances the robustness of the fusion framework.To improve the accuracy of fused PWV, several relative control variables (e.g., precipitation and temperature) could be introduced to the fusion model which identifies the physical mechanism in the land-atmospheric interaction in future work.

Analysis and prospect of the fused PWV
The proposed framework primarily utilizes ERA5 data to reconstruct the water vapor values of cloudy pixels in the MODIS images, as ERA5 has high precision resulting from the assimilation of diverse observational data.The comparison results indicate the fused PWV has high consistency and acceptable deviation from GPS measurements (r = 0.82-0.95,RMSE = 2.21-4.01mm).Moreover, the fusion product provides a more accurate and continuous depiction of local-scale variations in PWV under small-scale climatic processes and terrain influences that can not be captured by the coarser resolution of ERA5 data alone.The superiority of the fused PWV at local scales can be further confirmed in the future by multiple in-situ measurements within the grid cell, rather than sparse observations at large scale, as the latter tends to average out the spatial variability of PWV.It also has a reasonable temporal difference (average of 0.40 mm) with the ERA5 product in different land covers and atmospheric conditions, which supports the reliability and effectiveness of the proposed fusion framework.It should be emphasized that the fusion framework was designed to leverage the strengths of both datasets, and the fusion mechanism ensures that the fused PWV values are a reasonable compromise between the ERA5 and MOD05 data sources.The inherent systematic bias between MODIS and GPS station data caused by atmospheric interference and sensor limitations can be reduced through station-based correction models (Zhu et al., 2021).The uncertainties involved in correcting satellite images based on stations include but are not limited to, the distribution of stations, differences in elevation, and the representativeness of the stations for the local climate variations.These factors may introduce additional errors in the correction process, so they are not included in this study.
The precision of fused PWV partly relies on the quality of the input data which are affected by meteorological conditions and sampling intervals.The MOD05 derived from near-infrared observations overestimates clear-sky PWV but substantially underestimates cloudy-sky PWV since it records the water vapor above the clouds rather than the total PWV.We account for this constraint in the fusion framework by using reliable ERA5 reanalysis data that provides better estimates of PWV under different sky conditions.Furthermore, it might be challenging to select complete and reliable input images of sparse sampling at large scales.Multisource time-series PWV images can be incorporated to reduce sampling intervals and increase temporal change information for the fusion.For example, the MYD05 and other reanalysis PWV can be used as HR-LR image pairs for input into the framework if the spatiotemporal consistency is maintained.As additional PWV datasets are included in the framework, the accurate and seamless fused products contribute to gaining a broader understanding of PWV distribution patterns and their implications for weather, hydrological, and climate studies worldwide.

Conclusion
A novel fusion framework (i.e., ZSSR-ARF) is presented, fusing MOD05 and ERA5 PWV across the European continent from 2001 to 2021.It generated seamless PWV with high resolution and accuracy, and the fusion results were evaluated with GPS data from 230 stations.Our major conclusions are summarized as follows.
(1) The PWV fused by the proposed framework shows excellent station-scale accuracy compared with GPS-station measurements (r = 0.82-0.95,RMSE = 2.21-4.01mm, mean Bias = 0.76 mm, and mean KGE = 0.83), which is superior to the original MOD05 PWV, with significant relieves on the overestimation issues for the clear MOD05 PWV.
(2) The fusion framework can effectively gap-fill the cloudy PWV image and accentuate the spatial variations of the clear image, which outperformed the other classical fusion approaches with R 2 improved by over 24% and RMSE reduced by over 0.61 mm.(3) The fused PWV has a comparable temporal consistency (mean difference of 0.40 mm and DSTD of 3.22 mm) with the accurate ERA5 products, and significant increasing trends (mean of 0.057 mm/year and over 0.1 mm/year near the southern and western coasts) are found across the European continent.
In conclusion, the proposed framework integrates image superresolution and spatio-temporal approaches, which greatly enhances the ability to estimate accurate PWV with high resolution based on satellite and reanalysis data.This study explored the spatial variations and temporal trends of seamless and time-series PWV across the European continent, which supports a variety of applications.Furthermore, the fused PWV provides a foundation for observing climate patterns and monitoring extreme weather events, particularly for the retrieval of related atmospheric variables and the study of the land-atmosphere cycle and energy balance.

Fig. 1 .
Fig. 1.Overview of the study area: (a) location and climate zones; (b) elevation and GPS-station distribution.
), the temporal change in class c ΔF(c) is obtained by using the local least-squares fitting based on the purest LR pixels of each class.The HR pixels at date t n , F TP ( x ij , y ij , t n ) are predicted based on the temporal change of HR pixels of each class and the values of HR pixels at the basic date F ( x ij , y ij , t m (α, β, δ, γ, φ,  η) = argmin ∑ [ αF TP + βF SP + δ ∑ WF TP + γ ∑ WF SP + φR S + ηR T -F m

Fig. 9 .
Fig. 9. Examples of PWV distribution for different seasons at continental scales.a-d represent the PWV of ERA5, MOD05, clear MOD05 and ZSSR-ARF products.Subscripts 1-4 indicate the results of selected dates in January, April, July, and October.

Fig. 10 .
Fig. 10.Spatial distributions of differences (Subscript 1) and DSTD (Subscript 2) between MOD05-based and fused PWV against ERA5 products in 2001-2021.a-c represent the comparison results of MOD05, clear MOD05, and fused PWV with interpolated ERA5 PWV.d indicates the distribution density of three products.

Fig. 11 .
Fig. 11.Spatial variations and temporal trend of annual fused PWV during 2001-2021.(a and c) The distributions of relative variations and the Theil-Sen slope.The colored areas indicate PWV significantly (P < 0.05) changes over time, and areas with no filling have no significant trends.(b and d) Averaged relative variations and temporal trends in PWV for different climatic backgrounds (mean annual air temperature and precipitation).