Characterisations of Europe’s integrated water vapour and assessments of atmospheric reanalyses using more than 2 decades of ground-based GPS

. The ground-based Global Positioning System (GPS) has been used extensively to retrieve integrated water vapour (IWV) and has been adopted as a unique tool for the assessments of atmospheric reanalyses. In this study, we investigated the multi-temporal-scale variabilities and trends of IWV over Europe by using IWV time series from 108 GPS stations for more than 2 decades (1994–2018). We then adopted the GPS IWV as a reference to assess six commonly used atmospheric reanalyses, namely the Climate Forecast System Reanalysis (CFSR); ERA5; ERA-Interim; the Japanese 55-year Reanalysis (JRA-55); the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2); and NCEP-DOE AMIP-II Reanalysis (NCEP-2). The GPS results show that the peaks of the diurnal harmonics are within 15:00–21:00 in local solar time at 90 % of the stations. The diurnal amplitudes are 0–1.2 kg m − 2 (0 %–8 % of the daily mean IWV), and they are found to be related to seasons and locations with different mechanisms, such as solar heating, land–sea breeze, and orographic circulation. However, mismatches in the diurnal cycle of ERA5 IWV between 09:00 and 10:00 UTC as well as between 21:00 and 22:00 UTC were found and evaluated for the ﬁrst time, and they can be attributed to the edge effect in each ERA5 assimilation cycle. The average ERA5 IWV shifts are − 0 . 08 and 0.19 kg m − 2 at the two epochs, and they were found to be more signiﬁcant in summer and in the Alps and in Eastern and central Europe in some cases. Nevertheless, ERA5 outperforms the other reanalyses in reproducing diurnal IWV anomalies at all the 1, 3-, and 6-hourly temporal resolutions. ERA5 is also superior to the others in modelling the annual cycle and linear trend of IWV. For instance, the IWV trend differences between ERA5 and GPS are quite small, with a mean value and a standard deviation of 0.01 % per decade and 0.97 % per decade, respectively. However, due to signiﬁcant discrepancies with respect to GPS, CFSR and NCEP-2 are not recommended for the analysis of IWV trends over southern Europe and the whole of Europe, respectively.


Introduction
Water vapour is the most important component of the Earth's atmosphere regarding the transport of energy by latent heat and radiative forcing.It is the most important gaseous source of infrared opacity in the atmosphere and thus the largest contributor to the natural greenhouse effect (Kiehl and Trenberth, 1997;Harries, 1997).It plays a key role in water and energy cycles (e.g.Trenberth and Fasullo, 2013), climate change (e.g.Schneider et al., 2010), and various weather and climate processes (e.g.Vonder Haar et al., 2012) and is crucial for the understanding of many extreme meteorological phenomena, such as atmospheric rivers (e.g.Zhu and Newell, 1994), hurricanes (e.g.Ejigu et al., 2021), floods (e.g.Turato et al., 2004), and droughts and monsoons (e.g.Jiang et al., 2017;Fadnavis et al., 2021).However, due to its large spatiotemporal variability, its high-accuracy quantification remains a challenge.
Characterising the multiple spatiotemporal-scale variabilities and long-term trends of water vapour is of great importance for Europe, which is like a peninsula of the Eurasian landmass mainly surrounded by the Arctic Ocean, Atlantic Ocean, and Mediterranean Sea to the north, west, and south, respectively.Owing to the moisture from the oceans carried by the prevailing westerlies, most of western Europe has an oceanic climate with mild, wet, and turbulent weather in winter.In contrast, southern Europe is characterised by a wellknown dry summer Mediterranean climate.Europe is generally vulnerable to the extreme events associated with abnormal water vapour transport and is very sensitive to climate change (Field et al., 2014;Lavers et al., 2016) and has been the fastest warming continent in recent decades (e.g.Copernicus, 2019).Since a 1 K rise in temperature leads to a 7 % increase in the water-vapour-holding capacity of the atmosphere as implied by the Clausius-Clapeyron equation (Trenberth et al., 2003), Europe's water vapour amount is noticeably increasing (Yuan et al., 2021).The increasing water vapour content fortifies the radiative forcing, leading to a higher temperature, which becomes the most powerful if additional water vapour enters the upper troposphere and lower stratosphere (Solomon et al., 2010).The warmer and moister climate impacts all weather events and aggravates the risks of extreme events (Trenberth, 2012).
Atmospheric water vapour content can be expressed by using integrated water vapour (IWV), which is defined as the total amount of water vapour present in a vertical atmospheric column from the Earth's surface to the top of the atmosphere in units of kilogram per square metre (Jones et al., 2020).The IWV is also known as the total column water vapour (TCWV) or precipitable water vapour (PWV).At present, numerous techniques have been developed to measure water vapour, such as balloon-borne radiosondes (e.g.Durre et al., 2018;Kunz et al., 2013;Müller et al., 2016), aircraft measurements (e.g.Tilmes et al., 2010;Kunz et al., 2014;Krämer et al., 2020), satellite observations (e.g.Grossi et al., 2015;Beirle et al., 2018), and ground-based methods (e.g.Kämpfer, 2013;Vogelmann and Trickl, 2008).A global navigation satellite system (GNSS), represented by the USA's Global Positioning System (GPS), has been exploited for water vapour retrieval by using its ground-based measurements (e.g.Bevis et al., 1992) or space-based radio occultation (e.g.Kursinski et al., 1995) since the 1990s.Although accurate vertical distribution of water vapour in the upper troposphere can be obtained by the GPS radio occultation (e.g.Randel and Wu, 2005;Randel et al., 2007), its accuracy is limited in the lower troposphere (e.g.Ao et al., 2003;Awange 2018) where the water vapour is most abundant.On the contrary, the ground-based GPS has proven to be an effective technique for IWV retrieval with advantages of high accuracy, high temporal resolution, and all-weathercondition availability (Jones et al., 2020).The ground-based GPS has been utilised for IWV measurement in numerous global (e.g.Wang and Zhang, 2009;Vey et al., 2010;Chen and Liu, 2016) as well as regional (e.g.Bernet et al., 2020;Ejigu et al., 2021;Huang et al., 2021) studies.In addition, there have been nearly 3 decades of continuous GPS measurements with increasingly densified networks in Europe, such as the European Reference Frame (EUREF) Permanent Global Navigation Satellite System (GNSS) Network (EPN; Bruyninx et al., 2012).Given these benefits, groundbased GPS offers a unique tool to investigate the multiple spatiotemporal-scale variabilities of IWV over Europe.The homogeneously reprocessed long-term time series of GPS IWV is also quite beneficial to climate change studies (e.g.Van Malderen et al., 2020).GPS has increasingly been adopted in many studies to investigate various temporal features of Europe's IWV, such as diurnal cycle (e.g.Diedrich et al., 2016;Steinke et al., 2019), annual cycle (e.g.Parracho et al., 2018;Van Malderen et al., 2022), and trends (e.g.Nilsson and Elgered, 2008;Ning et al., 2016;Wang et al., 2016).However, the multi-temporal-scale variabilities and trends of Europe's IWV have rarely been comprehensively studied using GPS.
Atmospheric reanalyses have been extensively adopted as the data source of IWV acquisition for the last several decades, owing to the benefits of regional or global coverage, consistent spatiotemporal resolution, and the availability of many other meteorological variables.However, their products may still be subject to large uncertainty.On the one hand, this is due to the fact that the reanalyses from different providers and different versions are inconsistent in the input data and assimilation schemes, besides using different physical schemes and representations.For example, the newly released fifth generation global reanalysis from ECMWF (ERA5; Hersbach et al., 2020) has assimilated more datasets and instruments, which were not ingested in its predecessor ERA-Interim (ERAI; Dee et al., 2011).The assimilation system of ERA5 is also more advanced.On the other hand, systematic and random errors in the input data are unavoidable.For instance, the Integrated Global Radiosonde Archive (IGR; Durre et al., 2018) provides a long record of relative humidity observations for the assimilation of reanalysis, but long-term radiosonde humidity measurements are very sensitive to changes in instrumentation and measuring practice (McCarthy et al., 2009;Dai et al., 2011).Therefore, assessments of the reanalyses' water vapour products are indispensable for the accurate understanding and interpretation of Europe's weather and climate processes.The performances of various reanalyses' IWV products in Europe have been assessed by many regional or global studies using groundbased GPS data, as the GPS observations are not operationally assimilated by reanalyses (Hagemann et al., 2003;Bock et al., 2005;Heise et al., 2009;Vey et al., 2010;Alshawaf et al., 2018;Parracho et al., 2018;Wang et al., 2020;Yuan et al., 2021).However, it is still quite hard to draw consistent conclusions on the performances of different reanalyses from these studies, as they are different in many aspects, such as the numbers and locations of the GPS stations, period of observations, data processing strategies, and performance metrics.A consistent assessment of various reanalyses, by using homogeneously reprocessed long-term time series of GPS IWV as reference in Europe, is still lacking.In addition, few studies have focused on the performance of the latest ERA5 in reproducing the temporal features of Europe's IWV so far.
In this paper, we focus on characterising the multitemporal-scale variabilities and trends of Europe's IWV by using more than 2 decades of GPS IWV, and then we assess the performances of six reanalyses' IWV products over Europe.The paper is organised as follows: in Sect.2, we describe the GPS and reanalyses datasets, IWV calculation methods, and consistency evaluation metrics.Section 3 assesses the consistency of daily time series and representativeness differences, while the diurnal variations and associated diurnal cycle, annual cycle, and long-term trends of Europe's IWV are investigated using GPS in Sects.4, 5, and 6, respectively.The performances of the reanalyses are also assessed accordingly, and the main findings are summarised in Sect.7.

GPS data
To characterise the IWV over Europe and to assess the reanalyses, 1-hourly GPS IWV retrievals obtained from 108 stations are used (Fig. 1).Most GPS stations are from the EPN network (Bruyninx et al., 2001).The GPS observations are from January 1994 to December 2018 with an average time length of 21 years and an average integration rate of 92 % (Table S1 in the Supplement).The integration rate is the ratio between the number of available daily IWV data points and the theoretical number of all possible observations in the time range.The IWV retrievals were estimated with GPS zenith total delay (ZTD) provided by the Nevada Geodetic Laboratory (NGL; Blewitt et al., 2018).The GPS  S1.
ZTD products from NGL were used because it covers a long time period of 25 years so that it allows for better evaluations of the diurnal and annual climatological averages and long-term trends.In comparison, another ZTD dataset from EPN-Repro2 (Pacione et al., 2017) only has about 19 years of data, ending in 2014.
The NGL processes the GPS data using the newly improved GipsyX v1.0 software in precise point positioning (PPP) mode (Bertiger et al., 2020).Reprocessed orbits and clocks from the Jet Propulsion Laboratory (JPL) Repro 3 were used together with the 2014 International GNSS Service (IGS) reference frame (IGS14; Rebischung and Schmid, 2016).The observations were weighted based on an elevation (e)-dependent function of sin e.The cutoff elevation angle was set to 7 • .The first-order effect of the ionosphere was removed by employing ionosphere-free combinations of the GPS observations.The second-order effect was corrected with the International Geomagnetic Reference Field 12 (IGRF-12;Thébault et al., 2015) and JPL's ionosphere maps.As for the modelling of tropospheric delay, the Vienna Mapping Function 1 (VMF1; Boehm et al., 2006) and its associated a priori zenith hydrostatic delay (ZHD) were adopted.

Reanalysis data
The IWV derived from six commonly used global atmospheric reanalyses, namely the newly released fifth generation global reanalysis (ERA5) and its predecessor ERA-Interim (ERAI) from ECMWF; the Japanese 55-year Reanalysis (JRA-55) from the Japan  1.It is worth noting that JRA-55 only provides humidity information at 27 pressure levels from 1000 to 100 hPa, though it has 37 levels in total.These six reanalyses are selected as their IWV products have covered the period of the ground-based GPS data available since 1994.Despite ERAI having been decommissioned and superseded by ERA5 in August 2019, we still include it for the purpose of evaluating the progress of its successor ERA5.We therefore restricted the time range of all data records to a common period from 1994 to 2018.

IWV retrievals
The zenith total delay (ZTD) estimates derived from GPS data processing can be converted into IWV, the total amount of water vapour present in a vertical atmospheric column from the Earth's surface to the top of the atmosphere in units of kilogram per square metre, using the following equation (Bevis et al., 1992): where R V denotes the gas constant for water vapour, k 2 and k 3 are atmospheric refractivity constants (Bevis et al., 1994), and T m denotes the weighted mean temperature: where e and T are water vapour pressure and temperature profiles of the geopotential heights of the GPS station (H s ) to the top level of reanalysis (H top ), respectively.The ZHD was modelled as follows (Saastamoinen, 1972;Davis et al., 1985): where p s denotes the pressure at the GPS station with a latitude of ϕ s and a height of H s .The p s and T m are computed by using the ERA5 pressure-level product (see Yuan et al., 2021, and references therein).ERA5 is selected as it can provide a 1-hourly product without temporal interpolation.IWV values at the GPS stations are also calculated from the six reanalyses.In this calculation, the pressure-level products of the reanalyses are used rather than their surfacelevel IWV products, though it requires a heavier workload.This is because the GPS station and its nearby reanalysis surface grids are usually related to different heights.Vertical IWV adjustment is therefore usually required for the intercomparison between the IWV estimates from GPS and reanalyses.Compared to the reanalyses' surface-level products, their pressure-level products allow for a better characterisation of the vertical distribution of water vapour and tends to minimise the errors of the vertical adjustment (Parracho et al., 2018).
The scheme of the calculation is illustrated in Fig. 2. We assume a GPS station with a geopotential height of H s is located between two adjacent reanalysis pressure levels (p j and p j +1 ).We first determine its eight surrounding reanalysis grid nodes at these pressure levels (N (j ) i and N (j +1) i , i = 1, 2, 3, and 4) and four auxiliary points (A i ).The auxiliary points are located at the GPS station height, whereas their horizontal locations are identical to the associated reanalysis nodes.We then calculate the related meteorological variables at each auxiliary point.Exponential and linear interpolations are employed for the vertical corrections of pressure and temperature, respectively.If the auxiliary point is lower than the lowest pressure level (e.g.1000 hPa), an extrapolation of these variables is conducted.For details of the vertical adjustment, readers are referred to Schüler (2001) and Wang et al. (2005).Next, we calculate the IWV at each auxiliary point by using a vertical integration: where q and g are specific humidity and gravitational acceleration profiles from H s to the reanalysis top level, respectively.
It is noteworthy that a geopotential height system is employed in the reanalyses.Accordingly, the geopotential heights (H gp ) of the GPS stations rather than their ellipsoidal (H el ) or orthometric heights (H or ) are used in the above calculations.The conversion of height systems is carried out as follows (Dirksen et al., 2014;Wang et al., 2016;World Meteorological Organization, 2018): R(ϕ s ) = 6.378137 × 10 6 1.006803 − 6.706 where N is the geoid heights in metre from the Earth Gravitational Model 2008 (EGM2008; Pavlis et al., 2012).
In the end, we estimate the IWV at the GPS station using a horizontal interpolation with an inverse distance weighting algorithm (Jade and Vijayan, 2008).
Representativeness differences arise in the comparison of the IWV derived by the ground-based GPS and reanalyses (Bock and Parracho, 2019).This is because local variations of IWV measured by the GPS might fail to be resolved by the reanalyses due to their coarse horizontal resolution.The representativeness differences can be evaluated statistically as proposed by Bock and Parracho (2019): where IWV i and IWV k are the IWV values of the horizontal auxiliary points A i and A k , respectively (Fig. 2).Bock and Parracho (2019) found that the measure δ max was correlated with the standard deviation of the IWV differences between GPS and ERAI (σ ) and thus concluded that the representativeness differences between the IWV from GPS and ERAI contributed to their discrepancies.Here, we extend their work by estimating and comparing the representativeness statistics of the six reanalyses with various spatial resolutions (Table 1).

Pre-processing
The 1-, 3-, and 6-hourly IWV time series are screened by using a robust outlier detection method as follows (Yuan et al., 2021).For each data point, the data within a 30 d window centred at this point are extracted.The 25th percentile (Q1), 75th percentile (Q3), and interquartile range (IQR) of the subsequent time series are then calculated.Finally, the data point is identified as an outlier if it is outside the range of (Q1 − 1.5 × IQR, Q3 + 1.5 × IQR).The IQR threshold and the 30 d sliding window are adopted, as they allow for a good robustness and a proper accommodation for the natural variability of IWV.On average, about 1 % of data were filtered out by the data screening.
In this study, the performances of reanalysis IWV products in daily time series, diurnal cycle and variations, monthly annual cycles, and long-term linear trends are evaluated.As the temporal resolutions of the datasets are different, temporal interpolation and aggregation are conducted.To obtain daily mean IWV values, the 1-, 3-, and 6-hourly IWV time series are aggregated if they have at least 12, 4, and 2 data points in a day, respectively.The daily mean IWV time series at each station is further aggregated into monthly mean IWV series if there are at least 15 data points available in a month.

Homogenisation
The GPS IWV time series can be inhomogeneous due to changes in GPS data processing strategies and station-related changes like hardware changes or changes in the electromagnetic environment (Van Malderen et al., 2020;Nguyen et al., 2021, and references therein).We employed a homogenisation approach as described in Appendix A with step-bystep details and examples at two stations (HERS and ERLA).
Here, we only provide a brief introduction to the approach.We first avoided inhomogeneities due to changes in GPS data processing strategy by using the homogeneously reprocessed GPS ZTD product.We then homogenised the GPS IWV time series by using the RHtestsV4 software (Wang and  ).This software is especially developed for the detection and adjustment of changepoints in climatic time series, and it has been used in the homogenisation of IWV time series in previous studies (Ning et al., 2016;Schröder et al., 2016;Van Malderen et al., 2020).We took the IWV time series from all six reanalyses as references for the GPS IWV homogenisation and used a strategy to avoid the impacts of possible changepoints in individual reanalyses.However, we did not homogenise the reanalyses IWV time series because they represent the native quality of the reanalyses that we would like to assess.In addition to matching the detected changepoints with inspecting GPS metadata information (GPS station log files and IGS mail archives (IGSMAIL)), we also allowed for possible undocumented changepoints in the GPS IWV time series.

Metrics for consistency evaluation
To evaluate the performances of the reanalyses in reproducing IWV, the Kling-Gupta efficiency (KGE) is employed as a metric.The KGE is a composite index introduced by Gupta et al. ( 2009) and modified by Kling et al. (2012).It takes bias, variability, and correlation into account in the equation below: with where µ R , σ R , and CV R are the mean value, standard deviation, and coefficient of variation of the reanalysis IWV time series, respectively.µ G , σ G , and CV G are the corresponding parameters for GPS.β and γ indicate the consistencies in the mean and variability, respectively.In the case of µ R and µ G being equal to zero, such as for the diurnal anomalies in Sect.4.3, γ = σ R σ G .r indicates the Pearson correlation coefficient between the GPS and reanalyses time series.With perfect consistencies in mean, variability, and correlation, the values of β, γ , and r are identical to 1, respectively.From Eq. ( 9), it follows that a larger KGE score indicates a better consistency.Ideally, the KGE metric reaches its maximum of 1.
In addition, to be comparable to some previous studies, we also used the root mean square estimation of the IWV differences between the reanalyses and GPS: where IWV R and IWV G are the IWV time series at a specific station from reanalysis and GPS, respectively, and n is number of data points in the IWV time series.

Representativeness differences
Figure 3 compares the standard deviations of daily IWV difference (σ ) and the representativeness statistic (δ max ) for the six reanalyses.The first point to note is that σ and δ max are strongly correlated for all the reanalyses, with r values from 0.76 for NCEP-2 to 0.90 for ERA5.The result is in line with Bock and Parracho (2019), who compared ERAI to a global GPS network.Figure 3 indicates that the representativeness differences contribute to the discrepancies between the reanalyses and GPS., respectively.In contrast, NCEP-2 has the largest σ and δ max , with values of 1.1-3.0 and 1.8-5.2kg m −2 , respectively.The difference could be due to the fact that the spatiotemporal resolution of ERA5 is much higher than of NCEP-2 as indicated in Table 1.This result indicates that ERA5, with improved spatiotemporal resolution and data assimilation, is capable of reducing the discrepancy and representativeness difference with respect to GPS.

Assessments using KGE
Figure 4 shows the geographical distributions of the ratio of mean β, the ratio of coefficient of variation γ , the correlation r, and the synthetic KGE metric for the six reanalyses by using the daily GPS IWV time series as references.Moreover, the statistics of these scores are displayed in Fig. 5 with boxand-whisker plots.From the first column of Figs. 4 and 5a we can see that β scores are slightly larger than 1 for all the reanalyses except JRA-55, indicating a general wet bias with respect to the GPS IWV.For example, MERRA-2 has the largest median β with a value of 1.04, indicating a general wet bias of 4 %.Only JRA-55 scores a median β smaller than 1 (0.99, Fig. 5a), indicating a slight dry bias of 1 %.The wet and dry biases in the reanalyses have partly been reported by several previous studies.For instance, Schröder et al. (2018) concluded that CFSR, ERAI, and MERRA-2 are too moist over Europe compared to the ensemble mean of various satellite and reanalysis IWV records, whereas JRA-55 has negligible bias there.The wet bias in ERAI over Europe was also noted by Parracho et al. (2018).Analysing the reasons of the biases is of potential interest; however, it would go beyond the scope of this paper.
Regarding the consistency in variability, most of the γ scores are less than 1, with the associated median values ranging between 0.96 and 0.98 (Fig. 5b).The results indicate a good reproduction of daily IWV variability by the   reanalyses, albeit with a slight underestimation.The correlations are also pretty good with median values larger than 0.97 (Fig. 5c).However, the scores in variability and correlation are lower for the coastal stations, as shown in the second and third columns of Fig. 4, respectively.In particular, γ and r of NCEP-2 are less than 0.96 for some coastal stations located in western and southern Europe.This could be explained by the representativeness differences that impact the consistency of the IWV from reanalyses and GPS in their variabilities.For a coastal GPS station with an on-site measurement of IWV, part of its associated four reanalysis grid nodes may be located over the sea whereas others over land.As a consequence, the representativeness difference for such a coastal site is more severe than inland stations surrounded with flat terrain due to land-sea thermal contrast (Drobinski et al., 2018).Figure 5d compares the overall consistency in daily IWV evaluated using KGE.It can be seen that ERA5 is characterised by the largest median KGE (0.97).The result reveals the superiority of ERA5 over the other reanalyses.Moreover, comparing β, γ , and r (Fig. 5a-c) shows that r tends to contribute the least to the overall inconsistency, indicating that improving the consistencies in both mean values and variabilities is more important for the daily IWV time series.

Assessments of diurnal variations
Despite the atmospheric water vapour being quite unstable, it is characterised by a diurnal cycle.The diurnal IWV cycle can be driven by different mechanisms, such as evapotranspiration and condensation related to temperature, solar heating, and underlying surface conditions, as well as advection of air at different spatial scales (Dai et al., 2002;Diedrich et al., 2016;Koji et al., 2022).
In this section, the diurnal variations of IWV are investigated in two regards: diurnal cycle and diurnal anomaly.The diurnal anomalies (IWV DA ) are calculated as follows (Diedrich et al., 2016;Steinke et al., 2019): where IWV is the 1-hourly IWV time series in 1 d, and IWV is the associated daily mean.Each station's diurnal cycle is computed by averaging its diurnal anomalies for each season separately and for the entire time span.Diurnal and semidiurnal harmonics are calculated with least squares estimation (LSE) based on the seasonal-averaged and all-time-averaged diurnal anomalies: where t is local solar time (LST) in hours, and A i and ϕ i are the amplitudes and phases of the diurnal (D 1 ) and semidiurnal (D 2 ) cycles, respectively.The phases are then adjusted to the peaks of the associated harmonics, with ϕ 1 ∈ [0, 24) and ϕ 2 ∈ [0, 12) LST in hours.

Diurnal GPS IWV cycle
Starting with the all-time-averaged amplitudes of the diurnal GPS IWV harmonic shown in Fig. 6e, two remarkable values can first be noted at stations NICO (Nicosia, Cyprus; 33.14 • N, 33.40 • E; 161.9 m) and ZECK (Zelenchukskaya, Russia; 43.79 • N, 41.57 • E; 1143.4 m), with values of 1.2 and 1.0 kg m −2 , respectively.Moreover, the diurnal harmonics at the Mediterranean Coast are generally stronger than at the other regions in Europe (0.5-0.8 kg m −2 versus 0-0.5 kg m −2 ).Obvious seasonal differences can also be seen in their diurnal harmonics, with significantly larger amplitudes in summer (June, July, and August; JJA) than in the other seasons due to the stronger solar heating effect with minimal cloud coverage in Mediterranean summers (Enriquez-Alonso et al., 2016).However, the semidiurnal harmonics are much weaker, and their seasonal variations are less significant (Fig. 6f-j).The all-time-averaged semidiurnal amplitudes are lower than 0.22 kg m −2 , except for the two stations NICO and ZECK with values of 0.4 and 0.3 kg m −2 , respectively.The ratios between the all-time-averaged semidiurnal and diurnal amplitudes are lower than 30 % at 88 % of the stations (95 out of 108).As for the phases, the diurnal and semidiurnal terms are generally consistent over seasons (Fig. 6k-t), and their all-time-averaged peaks are within 15:00-21:00 and 01:00-06:00 LST at 90 % of the stations (97 out of 108), respectively.In order to compare the characteristics of diurnal IWV amplitudes at different stations, we calculated each station's relative amplitude as the ratio between their respective (semi) diurnal amplitudes and mean IWV as displayed in Fig. 7a  and d.We classified the GPS stations into three types according to their geographical characteristics (Fig. 7a) and analysed their relationships with each station's altitude and distance to sea (SeaDist).Firstly, we divided the stations with a limit of 20 km on their SeaDist.We further separated the stations located at the Mediterranean Coast (MedCoast; SeaDist < 20 km; 32 from the other coastal (OtherCoast) stations because their characteristics are quite different as can be seen from Fig. 7a.Consequently, the 108 stations are classified into 62 Inland stations, 12 MedCoast stations, and 34 OtherCoast stations.
As can be seen from Fig. 7a-c, all OtherCoast stations are lower than 300 m, and their relative diurnal IWV amplitudes are the weakest, with a range from 0.3 % to 1.9 % and a median of 1.1 %.Within the altitude limit of 300 m, the Inland stations are characterised by moderately larger diurnal amplitudes (1.5 %-2.5 %).The results indicate the effect of landsea breeze circulation on mitigating the intensity of the diurnal IWV cycle at the Atlantic coasts of Europe with respect to inland Europe.However, the land-sea breeze effect can be less significant for the MedCoast stations because their diurnal amplitudes are significantly larger (1.1 %-4.2 %) than the OtherCoast stations'.The stronger diurnal IWV cycles at the MedCoast stations can be explained by the stronger solar heating effect at the Mediterranean Coast than at the other European coasts, especially during summer daytime under stable and clear-sky weather conditions.In addition, it can be seen from Fig. 7b and e that the relative diurnal and semidiurnal IWV amplitudes are well correlated with altitudes, with correlation coefficients of 0.66 and 0.67, respectively.This relationship indicates the effect of orographic circulation, which can enhance the diurnal range of temperature at higher altitudes (Diedrich et al., 2016).
In addition, we selected six stations with various altitudes, SeaDist, and climates to illustrate the diversity of diurnal IWV cycles in Europe as shown in Fig. 8.The climate zones of the GPS stations are classified according to the Köppen climate classification system (Beck et al., 2018), and the properties of the stations are listed in Table S2.
Station NICO is located in Cyprus with hot semi-arid climate (BSh).Despite being only 21.5 km away from the coastline, NICO has the largest diurnal IWV amplitude with a value of 1.2 kg m −2 , equivalent to a relative amplitude of 6.7 %.This is mainly due to the large diurnal temperature range in Cyprus, especially in summer (Price et al., 1999).
Both the MedCoast station VALE (Valencia, Spain; 39.48 • N, 0.34 • W; 27.0 m) and the OtherCoast station NEWL (Newlyn, UK; 50.10 • N, 5.54 • W; 11.0 m) are very close to the coastline, with SeaDist values of 1.2 and 0.5 km, respectively.However, their diurnal amplitudes are quite different (absolutely 0.8 kg m −2 versus 0.1 kg m −2 , relatively 3.9 % versus 0.7 %).As explained earlier for the difference between the two station types, their weather conditions are different.VALE is located at the Mediterranean Coast with cold semi-arid (BSk) climate.Its strong diurnal cycle, especially in summer with an amplitude of 1.4 kg m −2 (4.6 %), is attributed to intense solar heating under minimal-cloudcoverage weather conditions.In contrast, NEWL is located at the coast of the English Channel with temperate oceanic (Cfb) climate, and its smaller diurnal amplitude can be due to the weaker solar heating effect under unstable and cloudy weather conditions, in addition to the land-sea breeze effect on mitigating diurnal temperature range.
Station ZECK is in the Greater Caucasus with humid continental (Dfb) climate.Its diurnal amplitude is much stronger than PENC's (Penc, Hungary; 47.79 • N, 19.28 • E; 248.3 m) with the same climate (1.0 kg m −2 and 7.9 % versus 0.2 kg m −2 and 1.5 %).As ZECK is much higher than PENC, their difference in amplitude is consistent with the pattern for most Inland stations, which can be explained by the effect of orographic circulation (Diedrich et al., 2016).In addition, KIR0 (Kiruna, Sweden; 67.88 • N, 21.06 • E; 469.3 m) is typical for many stations in northern Europe.Although its diurnal cycle is quite weak with an amplitude of only 0.1 kg m −2 (1.7 %), it is well fitted by the sinusoidal harmonic curve fitting.

Mismatches in diurnal ERA5 IWV cycle
Only the diurnal IWV cycle from the 1-hourly ERA5 time series was evaluated with respect to GPS, as the temporal resolutions of the other reanalyses are too coarse to characterise the diurnal cycle.However, we found significant shifts in the diurnal IWV anomalies between 09:00 and 10:00 (10:00-09:00) UTC as well as between 21:00 and 22:00 (22:00-21:00) UTC at part of the stations, such as NEWL and PENC displayed in Fig. 8  documentation#ERA5:datadocumentation-Knownissues, last access: 8 January 2023), and the problem is attributed to the edge effect in each ERA5 assimilation cycle (from 10:00 to 21:00 UTC and from 22:00 to 09:00 UTC+1 d).However, according to our knowledge, the magnitude of the mismatch in the diurnal cycle of ERA5 IWV and its spatiotemporal characterisations have not been investigated yet.Therefore, we will quantify and analyse the mismatch in the ERA5 IWV over Europe in this section.Figure 9 compares the diurnal IWV cycle of ERA5 and GPS for each season at station PENC.From this figure, we can derive that there are no mismatches in the diurnal GPS IWV cycle, although the GPS IWV (slightly) relies on T m (hence humidity and temperature) from ERA5 for the ZTD to IWV conversion, as shown in Eq. ( 2).GPS IWV estimates are therefore regarded as reference data to evaluate the mismatches in the diurnal cycle of ERA5 IWV.At station PENC, the mismatches in ERA5 are seasonal dependent, which are strongest in summer (JJA) but weakest in winter (DJF).
Figure 10 compares the shifts at 10:00-09:00 UTC and 22:00-21:00 UTC from ERA5 and GPS at all the stations, respectively.The shifts in the GPS IWV cycle are regarded as reference, representing the natural IWV changes at the two epochs.As can be seen from Fig. 10a-e and k-o, the ERA5 IWV series generally drop from 09:00 to 10:00 UTC and then jump from 21:00 to 22:00 UTC.The ERA5 artificial shifts at the two epochs are most significant in summer, with average values of −0.23 and 0.35 kg m −2 .In contrast, the average natural shifts in summer estimated from GPS IWV are only 0.11 and −0.08 kg m −2 .Moreover, the all-time-averaged natural shifts in GPS IWV are only 0.05 and −0.05 kg m −2 .However, the artificial shifts in ERA5 IWV are −0.08 and 0.19.As can be seen from the geographic distributions of the shifts shown in Fig. 10e and o, the ERA5 shifts at 10:00-09:00 UTC are most significant at the Alps and in Eastern Europe, whereas the shifts at 22:00-21:00 UTC are more widespread in central Europe.The reasons for their geographical patterns are unknown and needs further investigation.Since the average diurnal amplitude of the reference GPS IWV is only 0.32 kg m −2 , the artificial shifts in ERA5 IWV cannot be ignored when analysing the diurnal IWV cycle in these regions.

Diurnal anomalies
Diurnal anomaly represents high-frequency variations of IWV associated with weather phenomena like heavy rainfall.Therefore, evaluations of the consistencies between the IWV diurnal anomalies from reanalyses with respect to GPS are conducive to a better understanding of their performances in extreme weather, especially for ERA5 with a significantly enhanced temporal resolution of 1 h.We first evaluated all the reanalyses with respect to GPS at 1 h temporal resolution.For the reanalyses with coarser resolutions (3 and 6 h), we interpolated their time series to 1 h by using cubic spline, which is slightly superior to linear interpolation.Statistics of the evaluation results are listed in Table 2.At the temporal resolution of 1 h, ERA5 scores the highest average γ , r, and KGE with values of 0.98, 0.89, and 0.88, respectively.The results indicate that the 1-hourly ERA5 diurnal anomalies have the best agreement with GPS in variability and correlation.However, NCEP-2 performs the worst, as it scores the lowest γ , r, and KGE with values of 0.75, 0.69, and 0.60, respectively.To be comparable to many other studies, we also evaluated the consistencies by using the more commonly used indicator -rms of IWV difference (rms ) as shown in Eq. ( 12).Results show that ERA5 and NCEP-2 score the lowest and largest average rms values of 0.97 and 1.57 kg m −2 , respectively.Therefore, we drew the same conclusion as from the KGE scores.The evaluations confirm the superiority of the 1-hourly ERA5 over the other reanalyses, most likely because of its enhanced spatiotemporal resolutions and other improvements in data assimilation.
For a fairer intercomparison, we also evaluated ERA5 and the other reanalyses at their respective native resolutions.Accordingly, we extracted the ERA5 IWV every 3 and 6 h.The comparison with MERRA-2 at its native resolution of 3 h shows that ERA5 achieves larger KGE (0.89 versus 0.86) and smaller rms (0.97 kg m −2 versus 1.15 kg m −2 ).Moreover, the average KGE value of the 6-hourly ERA5 IWV is 0.89, which is higher than those of CFSR, ERAI, JRA-55, and NCEP-2, with values of 0.86, 0.83, 0.79, and 0.60, respectively.The results indicate that ERA5 is still superior to the other reanalyses at the coarser temporal resolutions.

Assessments of annual cycle
We analysed the annual cycles of the homogenised monthly GPS IWV time series as shown in Fig. 11a with increasing latitude from the bottom to the top.Most of the annual IWV cycles reach their maxima in July and August, with peak values from 17.1 to 32.2 kg m −2 .In contrast, the minima of the annual cycles are typically in January and February, with values from 4.2 to 17.1 kg m −2 .The maximum and minimum values of the annual cycles are generally increasing with decreasing latitude (Fig. 10a) and decreasing altitude (not shown).
Figure 11b-g present the differences between the annual cycle of IWV estimated by the reanalyses with respect to GPS.It can be seen that ERA5 has the least differences.Indeed, the quantitative evaluation shows that ERA5 obtains the highest median KGE score having a value of 0.97 (Fig. 5h), indicating that it outperforms the other reanalyses.CFSR ranks last (median KGE = 0.94) due to its overestimation of mean values (median β = 1.04) and underestimation of variability (median γ = 0.96).Moreover, although the consistencies between the annual IWV signals from reanalyses and GPS are generally rather good, there could be significant discrepancies in specific seasons and regions.For example, JRA-55 has an average dry bias of 0.5 kg m −2 with respect to the GPS IWV from May to September at the stations in the south (32-48 • N; Fig. 11e), whereas NCEP has obvious wet biases (0.7-2.4 kg m −2 ) in the annual average of IWV with respect to the GPS results at low latitudes (32-40 • N; Fig. 11g).In addition, significant biases (1.2-5.9 kg m −2 ) from March to September can be seen at station BZRG in the evaluations of CFSR, ERAI, and MERRA-2.The discrepancies are most likely related to the remarkable representativeness differences between the reanalyses and GPS at BZRG (Fig. 3), which is located in the Alps.

Assessments of linear trends
We carried out a homogenisation of the GPS IWV time series by using the RHtestsV4 software (Wang and Feng, 2013) be-   fore the analysis.The software is dedicated to the homogenisation of climatic time series.In addition, we adopted a homogenisation strategy which allows for changepoints with and without support from metadata.We took all six reanalyses as references and attempted to avoid the impacts of possible changepoints in specific reanalyses.However, we did not homogenise the reanalyses IWV time series because they represent the native quality of the reanalyses that we would like to assess.The homogenisation approach is detailed in Appendix A.
We then estimated the linear IWV trends from all six reanalyses and the homogenised GPS IWV time series after the removal of the annual cycle.In order to obtain realistic uncertainties of the trend estimates, we analysed the time series by using Hector software version 1.7.2 (Bos et al., 2012).We tested four commonly used noise models, namely white noise (WN), first-order autoregressive (AR(1)), autoregressive moving average (ARMA(1,1)), and power-law noise (PL).We then selected the optimal model of each time series by using the Bayesian information criterion (BIC; Schwarz, 1978).Readers are referred to Yuan et al. (2021) for more details.The IWV trend estimates, associated uncertainties, and specific optimal noise models are listed in Table S3.
Figure 12a shows the GPS IWV trends after the homogenisation.The IWV trends generally increase from the Atlantic coasts towards the southeast.The average trends of the Oth-erCoast, Inland, and MedCoast stations are 0.2, 0.5, and 0.9 kg m −2 per decade, respectively.The corresponding relative IWV trends are 1.7 %, 3.1 %, and 5.0 % per decade (Fig. 12d), respectively.The largest trend is found at station TUBI (Gebze, Turkey; 40.8 • N, 29.5 • E) with a value of 1.6 kg m −2 per decade (8.7 % per decade).As can be seen from Fig. 12b, AR(1) is the optimal noise model at 58 % of the stations (63 out of 108), which are mostly located close to sea or in northern Europe.Most parts of central Europe are characterised by WN, whereas Belgium and central Germany are best modelled by ARMA(1,1).In addition, PL is superior to the other models at four stations.The optimal noise models for the monthly IWV time series of the six reanalyses show similar geographical patterns as the GPS IWV.The geographical patterns of the optimal noise models obtained in this study are different from Yuan et al. (2021), although the geographical patterns of the IWV trends are consistent to previous studies (Parracho et al., 2018;Nguyen et al., 2021;Yuan et al., 2021).It is most likely due to the difference that monthly IWV time series are used here, whereas daily IWV series were used in their work.
We evaluated the IWV trends derived from the six reanalyses by taking the trends from the homogenised GPS IWV time series as reference.We compared the difference in the relative IWV trend rather than the absolute trend because the relative trend is not affected by the IWV bias in individual reanalysis.As can be seen in Fig. 13, ERA5 shows the best agreement in IWV trends with respect to GPS, with a mean value and a standard deviation of 0.01 % per decade and 0.97 % per decade in their trend differences, respectively, indicating an improvement compared to its predecessor ERAI (−0.04 ± 1.15 % per decade).JRA-55 also only has an average trend difference of as low as 0.01 % per decade but with a slightly larger standard deviation (1.12 % per decade) than ERA5.Compared to the GPS IWV trends, MERRA-2 has an underestimation of 0.22 % per decade on average (Fig. 13e), whereas CFSR resulted in an overestimation of 0.22 % per decade (Fig. 13a).The standard deviation of the CFSR GPS IWV trend differences is as large as 1.73 % per decade, which is mainly caused by the significant differences from −5.1 % per decade to 4.9 % per decade in southern Europe (Fig. 13a).The results suggest that the CFSR IWV trends are less accurate in southern Europe and should be carefully validated before climate change analysis.In addition, Fig. 13f shows that the mean value and standard deviation of NCEP-2 GPS IWV trend differences are 1.61 % per decade and 1.86 % per decade, respectively, indicating a general overestimation of IWV trends from NCEP-2 with respect to GPS.In particular, the average NCEP-2 GPS IWV trend differences is as large as 2.9 % per decade at 31 stations located in southern Europe (32 GPS IWV.Therefore, we concluded that the NCEP-2 IWV trends are not qualified for climate change analysis in Europe.

Conclusions
In this study, integrated water vapour (IWV) time series for the period 1994-2018 were retrieved from continuous GPS observations at 108 ground-based GPS stations in Europe, with an average period of 21 years for those time series.The temporal features of Europe's IWV, such as its diurnal cycle and variation, annual cycle, and linear trend, were then investigated.Moreover, the performances of six frequently used global atmospheric reanalyses in Europe were assessed for the first time, namely CFSR, ERA5, ERA-Interim (ERAI), JRA-55, MERRA-2, and NCEP-2.The main findings are summarised below: i.The agreement between the daily GPS IWV time series and the six reanalyses is found to be the best for ERA5 and the worst for NCEP-2, with standard deviations of IWV differences of 0.5-1.6 and 1.1-3.0kg m −2 , respectively.The standard deviations of IWV differences are well correlated with representativeness statistics of the six reanalyses, indicating that the representativeness differences contribute to the discrepancies between the reanalyses and GPS.
ii.The diurnal amplitudes are 0-1.2kg m −2 , accounting for 0 %-8 % of the associated daily mean IWV.The semidiurnal amplitudes are weaker, with values lower than 30 % of the diurnal amplitudes at 88 % of the stations.The peaks of the diurnal and semidiurnal harmonics are within 15:00-21:00 and 01:00-06:00 LST at 90 % of the stations, respectively.The diurnal amplitudes are larger at the Mediterranean Coast (1.1 %-4.2 %), which is most likely because of a stronger solar heating effect.In comparison, the diurnal amplitudes at other coastal regions in Europe are lower (0.3 %-1.9 %), which can be due to a land-sea breeze and a weaker solar heating effect.In addition, the relative diurnal and semidiurnal IWV amplitudes are correlated with altitudes with correlation coefficients of 0.66 and 0.67, respectively, which can be related to orographic circulation.
iii.Mismatches in the diurnal cycle of the ERA5 IWV product were found and evaluated between 09:00 and 10:00 UTC as well as between 21:00 and 22:00 UTC.
The problem can be attributed to the edge effect in each ERA5 assimilation cycle, and it has been noticed in some other meteorological variables provided by ERA5.
The average artificial shifts in ERA5 IWV are −0.08 and 0.19 kg m −2 at the two epochs.In contrast, the natural shifts in GPS IWV are 0.05 and −0.05 kg m −2 .
The ERA5 shifts are dependent on seasons and locations.The ERA5 shifts are more significant in summer than in winter.Moreover, the ERA5 shifts from 09:00 to 10:00 UTC are most significant at the Alps and Eastern Europe, whereas the shifts from 21:00 to 22:00 UTC are more widespread in central Europe.As the average diurnal IWV amplitude obtained from GPS is only 0.32 kg m −2 , the artificial shifts in ERA5 IWV cannot be ignored in the diurnal IWV cycle analysis in these regions.
iv. Regarding diurnal IWV anomalies, ERA5 shows the best consistency with GPS at a temporal resolution of 1 h, with an average rms of the IWV difference (rms ) of 0.97 kg m −2 , whereas the average rms is 1.14-1.57kg m −2 for the other five reanalyses.ERA5 also outperforms the other reanalyses at their respective resolutions (3 and 6 h), indicating the benefits of its enhanced spatial resolutions and other improvements in data assimilation, in addition to its higher temporal resolution.
v. All the monthly IWV time series are modulated with apparent annual cycles, with minima in January and February (4-17 kg m −2 ) and maxima in July and August (17-32 kg m −2 ).The maxima and minima of the annual cycles show consistent geographical patterns, with larger values towards the Equator and at lower-altitude sites.ERA5 ranks first in modelling the annual cycles of IWV (median KGE = 0.97) with respect to GPS IWV; CFSR ranks last (median KGE = 0.94) due to its overestimation of mean values (median β = 1.04) and underestimation of variability (median γ = 0.96).
vi. Europe's IWV is increasing as observed from more than 2 decades of continuous GPS observations.The trends generally increase from the Atlantic Coast towards the southeast.The average trends of the Atlantic coasts, inland Europe, and Mediterranean coasts are 0.2, 0.5, and 0.9 kg m −2 per decade, which are equivalent to relative values of 1.7 % per decade, 3.1 % per decade, and 5.0 % per decade, respectively.The monthly IWV time series are best modelled by first-order autoregressive (AR( 1)) noise at most stations located close to the sea and northern Europe, whereas autoregressive moving average (ARMA(1,1)), white noise (WN), and ARMA(1,1) models are preferred at the rest of the stations.Powerlaw noise (PL) is found to be optimal at only four stations.As for the performances of the reanalyses in reproducing IWV trends, ERA5 achieves the best consistency with GPS IWV trends, with a mean value and a standard deviation of 0.01 % per decade and 0.97 % per decade in their trend differences, respectively.However, CFSR is not qualified for the analysis of IWV trends in southern Europe due to its significant discrepancies with respect to GPS.The NCEP-2 IWV trends over the whole of Europe are also not recommend for the same reason.
Overall, it can be concluded that the reanalyses successfully reproduce the spatiotemporal IWV variability over Europe, as assessed by using the GPS IWV dataset, with ERA5 slightly outperforming the other reanalyses at most temporal scales.It is noteworthy that the pressure and weighted mean temperature obtained from ERA5 were used in the conversion of GPS ZTD to IWV due to a lack of long-term accurate and complete in situ meteorological observations at the GPS stations.This could partly contribute to the superior agreement between the IWV from ERA5 and GPS.Future studies could validate the IWV products of reanalyses using groundbased GPS when independent, quality-assured, and complete meteorological observations at the GPS stations are available.IWV measurements from other techniques could also be helpful.

Appendix A: Homogenisation of GNSS IWV time series
Long-term IWV time series often suffer from inhomogeneities due to changes in instrumentation, data processing methods, and local environmental conditions (Van Malderen et al., 2020;Nguyen et al., 2021, and references therein).These inhomogeneities can manifest themselves as changes in the mean of the time series ("biases") at specific epochs, i.e. breaks or changepoints.If such changepoints are not properly corrected, they can significantly modify the estimations of long-term linear trends and multi-temporalscale variabilities (e.g.Ning et al., 2016;Van Malderen et al., 2020;Yuan et al., 2021).Therefore, the homogenisation of the IWV time series is essential for a sound understanding and proper interpretation of IWV variability under climate change.
In this work, we examined the homogeneity of the monthly GPS IWV time series by using the RHtestsV4 software (Wang and Feng, 2013).This software is developed especially for the detection and adjustment of changepoints in climatic time series, and it has been used in the homogenisation of IWV time series in previous studies (Ning et al., 2016;Schröder et al., 2016;Van Malderen et al., 2020).The software is based on a penalised maximal t test with the consideration of linear trend, annual cycle, and AR(1) noise in the time series (Wang et al., 2007;Wang 2008).
We took all six reanalyses as references for the homogenisation of the GNSS IWV time series, meaning that we inspected the monthly IWV difference time series between the GNSS and each of the reanalysis IWV.It is noteworthy that the reanalyses may also contain changepoints (e.g.Ning et al., 2016;Schröder et al., 2016).However, we did not homogenise the reanalyses IWV time series because they represent the native quality of the reanalyses that we would like to assess in this work.Also, by taking all six reanalyses as references, we are confident that we can minimise the impact of inhomogeneities in either reanalysis of the homogenisation process of the GNSS IWV time series.Practically, we used the following strategy to avoid the impacts of changepoints in specific reanalyses on the homogenisation of the GPS IWV time series: 1. We examined the GPS IWV and metadata (station log file and IGSMAIL) carefully.If there was an instrumentation change within the first (or the last) year, we removed the several months before (or after) the epoch of change.Moreover, we also inspected the station upcoordinate time series and excluded periods with quality problems, as it is well known that they are strongly correlated with the GPS tropospheric delay estimates (Tregoning and Herring, 2006).An example is given in Fig. A1 and will be described later in this Appendix.
2. We used the FindU.wRefcommand of the RHtestsV4 software to identify all possible changepoints in each GPS-reanalysis IWV monthly mean difference time series, which can be significant at a confidence level of 99 %, no matter if they are documented in metadata or not.
3. If a changepoint was within 3 months before or after a documented change in instrumentation, we adjusted its epoch according to the metadata and set it as Type 0.
The rest of the changepoints were set as Type 1.
4. If identical Type-1 changepoints were reported within 6 months in at least four GPS-reanalysis IWV differences, but not supported by metadata, we recognised them as a single Type-1 changepoint at the median of the epochs.
5. We estimated the amplitudes of the Type-0 and Type-1 changepoints and tested their amplitude significances at a confidence level of 99 % with the StepSize.wRefcommand of the RHtestsV4 software.
6.We calculated the amplitude of each changepoint as the average of all the significant amplitude estimates from the six GPS-reanalysis IWV differences.
7. We removed a changepoint if its amplitude was only significant in less than four GPS-reanalysis comparisons or if its amplitude was less than 3 times its standard deviation.Then, we repeated the steps of ( 5) and ( 6) until the rest of the changepoints were significant.
The changepoint identification is finished after one or two iterations for most stations.In the end, we identified 44 Type-0 and 9 Type-1 changepoints as listed in   A2.The Type-0 changepoint on 19 August 2010 is significant at a confidence level of 99 % (green downward-pointing triangle), and its value was calculated as the average of the values estimated in each GPS-reanalysis comparison as shown in panels (a)-(f).
As the changepoint detection was carried out on a monthly level, the specific dates of the Type-0 changepoint are fixed as documented.However, for the Type-1 changepoints without support from metadata, their time of occurrence is fixed to the 15th day of the associated month.We adjusted the GPS IWV time series by adding the amplitude of each changepoint to the GPS IWV data points before its time of occurrence.Note that five out of the nine Type-1 changepoints are significant in all six GPS-reanalyses comparisons.Although we tried to minimise the impacts of changepoints in specific reanalyses on the results here, we cannot completely rule out that identical changepoints appear in all six reanalyses by ingesting the same observational datasets through data assimilation.
Figures A1 and A2 show the homogenisation results at two stations.Station HERS (Herstmonceux, UK; 50.87 • N, 0.34 • E) is characterised by abnormal variations in its upcoordinate time series before the changes of the antenna and receiver on 18 February 1998 as shown in Fig. A1h, indicating low-quality observations related to the instrumentation.In addition, obvious abnormal variations can be seen in all the GPS-reanalysis comparisons before September of 2001.We checked IGSMAIL-3503 (https://lists.igs.org/pipermail/igsmail/2001/004876.html, last access: 8 January 2023), which reported a repair of antenna at station HERS until 3 September 2001.Therefore, we excluded the GPS IWV data before the date of repair.Then, we used the RHt-estsV4 software to identify the changepoints in the rest of the GPS IWV time series and found one on 19 August 2010, Table A1.The identified changepoints in GPS IWV time series.The Type 0 and Type 1 are changepoints with and without changes in instrumentation as documented in GPS station log files, respectively.The full names of the Type-0 changepoint events are shown in Table A2, and the Type-1 changepoints are labelled as Unknown.G-C, G-E, G-I, G-J, G-M, and G-N indicate the GPS IWV changepoints in kilogram per square metre estimated by comparing it to CFSR, ERA5, ERAI, JRA-55, MERRA-2, and NCEP-2, respectively.The NaN indicates that the changepoint in the specific GPS-reanalysis comparison is insignificant at a confidence level of 99 %.The mean and SD are the mean value and standard deviation of each changepoint.
Competing interests.The contact author has declared that none of the authors has any competing interests.
Disclaimer.Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Special issue statement.This article is part of the special issue "Analysis of atmospheric water vapour observations and their uncertainties for climate applications (ACP/AMT/ESSD/HESS interjournal SI)".It is not associated with a conference.
Acknowledgements.We are grateful to NGL for providing the GPS ZTD products and many institutions for sharing the continuous GPS observations.We thank ECMWF, JMA, NASA GAMO, and NCEP for providing the reanalyses products.We would like to thank the German Research Foundation and the Program for Hubei Provincial Science and Technology Innovation Talents of China for financial support.
Financial support.This research has been supported by the Deutsche Forschungsgemeinschaft (grant no.321886779).Weiping Jiang is funded by the Program for Hubei Provincial Science and Technology Innovation Talents of China (grant no.2022EJD010).
The article processing charges for this open-access publication were covered by the Karlsruhe Institute of Technology (KIT).
Review statement.This paper was edited by Rolf Müller and reviewed by two anonymous referees.

Figure 1 .
Figure 1.Geographical distribution of the 108 GPS stations (red dots).An enlarged version of this figure with station names is provided in Fig. S1.The coordinates of the stations and their time lengths are provided in TableS1.

Figure 2 .
Figure 2. Schematic plot of the vertical and horizontal interpolation of the reanalysis pressure-level products.The IWV at each auxiliary point (A i ; orange dots) is calculated with vertical interpolation or extrapolation of the adjacent reanalysis nodes (N (j ) i and N (j +1) i; green dots).The IWV at the GPS station (red dot) is then estimated with horizontal interpolation of the auxiliary points.

Figure 3 .
Figure 3. Scatterplots showing the relationships between the standard deviations of daily IWV difference (σ ) and the representativeness error statistic (δ max ) for the 108 GPS stations.

Figure 4 .
Figure 4. Plots of the KGE parameters for the daily IWV time series of the 108 GPS stations.β and γ indicate the consistencies in the mean and variability, respectively.r indicates the Pearson correlation coefficient between the GPS and reanalyses time series.With perfect consistencies in mean, variability, and correlation, β, γ , and r are identical to 1, respectively.When β, γ , and r are identical to 1, the KGE score will reach its maximum value of 1.

Figure 5 .
Figure 5. Box-whisker plots of the KGE parameters for the daily time series (a-d) and monthly annual cycle of IWV (e-h) from the reanalyses compared to GPS for the 108 stations.

Figure 6 .
Figure 6.Plots of the amplitudes (a, f) and phases (k, p) for the first (D 1 ) and second (D 2 ) harmonics of the diurnal GPS IWV cycle averaged in March, April, and May (MAM; spring) at each station, respectively.The other subplots are for JJA (summer); September, October, and November (SON; autumn); December, January, and February (DJF; winter); and annual, from left to right, respectively.The phases are in local solar time (LST) at the peak of associated harmonics.

Figure 7 .
Figure 7. (a) Relative amplitudes of the first harmonic (D 1 ) of the diurnal GPS IWV cycle.Panels (b) and (c) are the variations of the relative D 1 amplitudes with respect to station altitude and distance to sea, respectively.Panels (d)-(f) are the same as panels (a)-(c) but for the second harmonic (D 2 ).The stations are classified into three types, namely Inland, MedCoast, and OtherCoast.The type of Inland includes 62 stations with their distance to sea (SeaDist) no shorter than 20 km.The type of MedCoast contains 12 stations located at the coastal region of the Mediterranean (SeaDist < 20 km; 32 • N < lat < 46 • N, 45 • E < long < 5 • W).The rest of the 34 stations are classified as OtherCoast.

Figure 8 .
Figure 8.Diurnal IWV cycles at selected six stations obtained from 1-hourly GPS (green dots) and ERA5 (red squares).The stations are selected with the consideration of different altitudes, distance to sea (SeaDist), and climate zones classified according to the Köppen climate classification system (Beck et al., 2018).The data points are fitted with diurnal (D 1 ) and semidiurnal (D 2 ) harmonics (dashed blue curve for GPS and orange curve for ERA5).The amplitudes and phases of the D 1 and D 2 harmonics are also given.The phases are shown as the local solar time (LST) at the peak of associated harmonics.For instance, the D 1 amplitude and phase of GPS IWV at station NICO are 1.2 kg m −2 and 15.8 LST, respectively.By comparison, the values of its ERA5 IWV are 0.8 kg m −2 and 15.2 LST, respectively.The vertical dotted black lines at 09:00 and 12:00 UTC indicate the time of possible mismatches in the ERA5 IWV cycle.

Figure 9 .
Figure 9. Similar to Fig. 8 but for the seasonal-averaged and all-time-averaged diurnal IWV cycles at station PENC.The green and red numbers are the IWV shifts from ERA5 and GPS, respectively.The numbers on the left and right are the IWV shifts from 09:00 to 10:00 UTC and from 21:00 to 22:00 UTC, respectively.

Figure 11 .
Figure 11.(a) Monthly annual cycle of GPS IWV in kilogram per square metre for the 108 GPS stations with increasing latitude from the bottom to the top.(b-g) The differences between the annual cycles of the various reanalyses compared to the GPS.

Figure 12 .
Figure 12.(a) Map of the absolute trends of the homogenised monthly GPS IWV anomaly time series.The squares and circles indicate the trend estimates being significant and insignificant at 95 % confidence level, respectively.The trend uncertainties are estimated based on optimal noise models (b).The relative trends (d) are calculated as the absolute trends (a) divided by the associated average IWV (c).

Figure 13 .
Figure 13.Comparison of the relative IWV trend differences (percent per decade) for various reanalyses with respect to GPS.The numbers in each subplot indicate the mean value and standard deviation of the associated relative IWV trend differences.

Figure A1 .
Figure A1.Monthly GPS-reanalysis IWV difference (a-f), monthly GPS IWV (g), and daily up-coordinate time series at station HERS.The GPS data before 3 September 2001 were deleted due to a problem in station antenna (see.The IWV time series before and after the homogenisation are labelled as blue dots and red crosses in panels (a)-(g).The types of instrumentation changes are listed in TableA2.The Type-0 changepoint on 19 August 2010 is significant at a confidence level of 99 % (green downward-pointing triangle), and its value was calculated as the average of the values estimated in each GPS-reanalysis comparison as shown in panels (a)-(f).

Figure A2 .
Figure A2.The same as Fig. A1 but for station ERLA.A Type-0 changepoint is significant on 18 August 2010 (green triangle) due to changes in the antenna and radome at the station.A Type-1 changepoint in July of 2005 (green vertical line and upward-pointing triangle) is significant but without support from metadata, and hence its date was fixed to 15 July 2005.

Table 1 .
Six atmospheric reanalyses used in this study and their characteristics.

Table 2 .
Statistics of the consistencies in diurnal IWV anomalies from reanalyses compared to GPS.

Table A1 .
Ning et al. (2016)f 53 changepoints is consistent with a previous global GPS IWV homogenisation study carried out byNing et al. (2016), which identified 45 changepoints in total at 101 stations.