1 Introduction

Water vapour in one of the most important constituents of the atmosphere. Apart from its obviously connection to the hydrological cycle, water vapour is crucial to the regulation of temperature on earth, and it is the most important natural greenhouse, accounting for about 60% of this effect (https://gcos.wmo.int/en/essential-climate-variables/upper-vapour accessed in November 2022). In fact, water vapour is defined by the Global Climate Observing System (GCOS) as an Essential Climate Variable because it critically contributes to the characterization of the Earth’s climate system. The water vapour amount, as well as its temporal and spatial variation, therefore, are needed to understand and predict global climate change.

Water vapour can be expressed using different terms. One of these terms is the Integrated Water Vapour (IWV) which is defined as the total amount of water vapour present in a vertical atmospheric column, using kg·m−2 as its measurement unit. Another commonly used term is Precipitable Water Vapour (PWV) defined as the height to which water would stand if completely condensed and collected in a vessel with a cross section of 1 m2. In this paper, water vapour is expressed as PWV, using millimetres as its measurement unit. Water vapour proportion in the atmosphere is between 1 and 4%; of this percentage between 45 and 65% is concentrated from the surface to the 850 hPa level (approximately 1.5 kms) (J. Wang et al. 2007). One of the principal characteristics of atmospheric water vapour is its high variability on time and space that renders its study rather complex.

Because of its great importance, many different types of instruments, based on different techniques, had been used to retrieve the water vapour present in the atmosphere. Some of these techniques are Water Vapour Radiometers (Köpken 2001), Very Long Base Interferometry (Niell et al. 2001), LiDAR (Hicks-Jalali et al. 2020), sun photometers (Firsov et al. 2013), as well as different sensors on satellites (Gao & Kaufman 2003).

Furthermore, for decades, the traditional technique that had proved its capacity to retrieve the water vapour is the radiosonde (Ross and Elliott 2001). This technique allows to obtain the vertical profile of different atmospheric meteorological parameters from which Precipitable Water Vapour can be calculated. The long-term historic data and its high precision and good vertical resolution are some of its most valuable characteristics. However, most of the radiosonde stations retrieve just one or two profiles per day, so radiosonde is considered a low temporal resolution technique (Deblonde et al. 2005; Liu et al. 2005). Moreover, the number of stations is not high, with an inhomogeneous distribution characterised by a higher number of stations on the northern hemisphere and a very limited number of radiosondes sites on the oceans. Therefore, the spatial resolution is not high.

Global Navigation Satellite Systems (GNSS) observations, originally designed to provided positioning, have become a powerful tool to retrieve the water vapour. There are some advantages on the use of GNSS observations to retrieve the atmospheric water vapour. On one hand, the network of GNSS sites continues to grow so the availability of data is increasing. In addition, temporal resolution is high, being able to retrieve the PWV values with a ratio of 15 min or even less, so the technique can be useful to study short-term water vapour variations. However, the distribution of GNSS stations is not homogenous. Moreover, the lack of meteorological stations collocated with the GNSS site makes it difficult, as will be explained later, to retrieve water vapour from GNSS measurements.

There have been various studies that proved the capacity of GNSS to retrieve the water vapour on a global, continental, or local scale (Bock et al. 2007; Vaquero-Martínez et al. 2018; J. Wang and Zhang 2009). The water vapour retrieved from GNSS observations have been validated with other reference measurements such as radiosonde (Ohtani and Naito 2000), radiometers on satellites like MODIS (Z. Li et al. 2003) or numerical weather models (Heise et al. 2009), reaching a quality between 1 and 3 mm. A selection of some multi-comparisons can be found in Guerova et al., (2016) Table 3.

The retrieval of the water vapour from GNSS observations is based on the delay that the signals undergo on the last part of its way to the antenna. This is commonly known as troposphere delay because it mainly occurs in this part of the atmosphere. The total delay of the signal, once mapped onto the zenith, is known as Zenith Total Delay (ZTD). ZTD is the sum of the Zenith Hydrostatic Delay (ZHD) and Zenith non-hydrostatic or Zenith Wet Delay (ZWD). ZHD depends on the atmospheric pressure and temperature and ZWD depends on the partial water vapour pressure and temperature so, the calculation of the delay would imply in-situ measurements of these atmospheric parameters on which they depend. Unfortunately, not all the GNSS sites have collocated meteorological stations to measure the needed atmospheric parameters. However, using just a set of GNSS sites with collocated meteorological stations could reduce the valuable high spatial resolution of the technique. In these cases, an empirical model can be useful to obtain the atmospheric parameters. Moreover, the real time PWV retrieval, needed on the short-term rainfall forecasting, can be easily accomplished with an empirical model. However, it is necessary to verify the capacity of the empirical blind model, if it is the unique source of meteorological data in PWV retrieval, to contribute to the capturing of the variations of PWV. Thereafter, this study focuses on the characterization of temporal and spatial variations of water vapour using an empirical blind model in each step in which atmospheric parameters are required.

This paper is structured as follows. In Sect. 2, a description of the retrieval of Precipitable Water Vapour from GNSS observations is shown. In Sect. 3, data sets and methodology are described. In the results section, Sect. 4, firstly the validation process of the ZTD using the official EUREF values and the PWV with radiosonde data is provided. After which, an analysis of the temporal and spatial variations of Precipitable Water Vapour in the investigated area are explained. Finally, Sect. 5 shows the summary and conclusions of the study.

2 Precipitable water vapour retrieval from GNSS observations

The starting point of the retrieval of PWV is the estimation of the neutral or troposphere delay, that is commonly obtained from the processing of the GNSS observations as a time dependent parameter. The troposphere delay, TD, can be expressed as the difference between the ray path and the geometrical vacuum distance:

$${T}_{D}=\int (n-1)ds$$
(1)

where n is the index of atmospheric refraction, ds is the differential increment in distance with respect to the line of sight. Often, the refractivity, N, defined by N = 106(n-1), is used instead of the atmospheric refraction, so Eq. 1 can be formulated as:

$${T}_{D}={10}^{-6}\int Nds$$
(2)

The refractivity can be divided into two terms:

$$N={N}_{HYD}+{N}_{WET}$$
(3)

where NHYD is due to the hydrostatic part of the atmosphere and NWET is due to the non-hydrostatic or wet part of the atmosphere. Then, the Eq. 2 can be rewritten as:

$${T}_{D}={10}^{-6}\int {N}_{HYD}ds+{10}^{-6}\int {N}_{WET}ds$$
(4)

The refractivity depends on the atmospheric temperature, pressure and water vapour pressure and can be calculated empirically (Essen and Froome, 1951; Thayer 1974). However, usually, the troposphere delay showed on Eq. 4 are calculated by considering the dependence of the delay on the satellite zenith distance z (Dach et al. 2015) so the integrals on Eq. 4 can be expressed as:

$${T}_{D}={mf}_{HYD}\left(z\right)\cdot ZHD+{mf}_{WET}\left(z\right) \cdot ZWD$$
(5)

where mfHYD and mfWET are the hydrostatic and wet mapping functions respectively and ZHD and ZWD are the Zenith Hydrostatic Delay and the Zenith Wet Delay respectively. Zenith Hydrostatic Delay is caused by the atmospheric gases, assuming the hydrostatic equilibrium, and accounts for 90% of the total delay (Hofmann-Wellenhof et al. 2007). Its low variability facilitates its estimation with enough accuracy from models. One of the most widely used is the Saastamoinen model (Saastamoinen 1972):

$$ZHD=\frac{0.0022767\cdot {P}_{O}}{1-0.00266\cdot \left(cos2\varphi \right)-0.00028\cdot H}$$
(6)

where Po is the atmospheric pressure in hPa, and φ (in degrees) and H (in kilometres) are the latitude and height of the site, respectively. With an error less than 0.30 hPa, ZHD can be modelled with a quality of around 1 mm (Duan et al. 1996). In those stations without a collocated meteorological station, the pressure could be obtained from empirical models like Global Pressure and Temperature 2 wet model (GPT2w) (Böhm et al. 2015) or Global Pressure and Temperature 3 (GPT3) (Landskron and Böhm 2017) or from reanalysis products like ERA5 (Hersbach et al. 2020).

The Zenith Wet Delay is defined as the integral of the wet refractivity along the vertical profile above the station and depends on the partial pressure of water vapour PV and temperature T (Yao and Hu 2018). ZWD contributes the remaining 10% of the total delay. According to (Bevis et al. 1992), ZWD is given by:

$$ZWD={10}^{-6}\int \left({k}^{{\prime}2}\cdot\frac{{P}_{v}}{T}+{k}^{3}\cdot\frac{{P}_{v}}{{T}^{2}}\right)dz$$
(7)

where the values and uncertainties of constants are k2 = (22.1 ± 2.2) K·mb−1 and k3 = (3.7390 ± 0.0012) 105·K2·mb−1 (Bevis et al. 1994). One of the characteristics of ZWD is its high variability, both temporal and spatial, that complicates its calculation from models so, if profiles of Pv and T are not available, ZWD is often obtained from the subtraction of ZTD and ZHD:

$$ZWD=ZTD-ZHD$$
(8)

Once ZWD is obtained, the last step to the retrieval of PWV from GNSS observations is the application of the dimensionless factor ∏:

$$PWV=\Pi \cdot ZWD=\frac{{10}^{6}}{\rho \cdot {R}_{V}\left[\frac{{k}_{3}}{{T}_{m}}+{k}_{2}-m\cdot{k}_{1}\right]}\cdot ZWD$$
(9)

where ρ is the density of liquid water, \({R}_{V}\) is the specific gas constant for water vapour, m is the ratio of the molar masses of water vapour and dry air, and Tm is the water-vapour-weighted mean temperature. The values and uncertainties of the physical constants are k1 = (70.60 ± 0.05) K·mb−1, k2 = (70.40 ± 2.2) K·mb−1 and k3 = (3.7390 ± 0.0012) 105·K2·mb−1 (Bevis et al. 1994).The approximate value of factor ∏ is around 0.15, with a variation of 20% related to the location, height, weather season and time (Bevis et al. 1994). Notably, other authors consider a factor value of 0.16 (Foelsche and Kirchengast 2001). Taking the value of 0.16 as a reference then, approximately 6.3 mm of ZWD would correspond to 1 mm of PWV.

The water-vapour-weighted mean temperature, Tm, can be obtained by:

$$T_{m} = \frac{{\smallint \left( {{\raise0.7ex\hbox{${P_{v} }$} \!\mathord{\left/ {\vphantom {{P_{v} } T}}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{$T$}}} \right)dz}}{{\smallint \left( {{\raise0.7ex\hbox{${P_{v} }$} \!\mathord{\left/ {\vphantom {{P_{v} } {T^{2} }}}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{${T^{2} }$}}} \right)dz}}$$
(10)

where Pv is the partial pressure of water vapour, T is the absolute temperature and dz is the integral path. Relative error of conversion factor ∏ is closely related to the relative error of Tm so this magnitude is crucial to improve the quality of the conversion factor and, consequently, the PWV retrieval.

Water-vapour-weighted mean temperature can be obtained from three main methodologies. The most accurate one is using temperature and humidity profiles obtained from radiosonde or numerical weather models. This approach has the advantage of its high precision but the disadvantage of low temporal and spatial resolution (Wang et al. 2016). Moreover, it cannot be applied in real time. Another approach to obtain Tm is its estimation through its strong correlation with the surface temperature, TS. This correlation can be used to derive a linear model between the two temperatures. One of the most well-known relations is the Bevis model, that was established using 8718 profiles of atmospheric parameters from 2 years of radiosonde data along North America, achieving a root mean square (rms) on the Tm calculation of 4.7 k (Bevis et al. 1992):

$${T}_{m}=70.2+0.72\bullet {T}_{S}$$
(11)

Another well-known relation is the Mendes model (Mendes 1998). This linear model was found using 32,467 radiosonde profiles, achieving a rms of 3.07 k on Tm:

$${T}_{m}=50.4+0.789\bullet {T}_{S}$$
(12)

In Spain, Ortiz de Galisteo Marín, (2011) made a study using 8 radiosonde sites, analysing 37,179 profiles from 2000 to 2008, developing the relationship:

$${T}_{m}=61.92+0.75\bullet {T}_{S}$$
(13)

The rms of the water-vapour-weighted mean temperature obtained from this model was found in 3.19 K.

Apart from those relations, some have been developed, considering the spatial, seasonal, or even daily variation of the relation of the surface temperature and the water-vapour-weighted mean temperature. For example, Baldysz and Nykiel, (2019) established a set of models to Europe, using 49 radiosonde sites and covering a period of 24 years, with a rms of 2.8k on the estimation of Tm. However, no matter if the relationship is simpler or more complex, it is required to know the surface temperature to be able to apply this approach.

A third approach to determine the needed water-vapour-weighted mean temperature is the use of empirical blind models. These empirical models can even use different approaches to determine the Tm, but are characterized by simple input data, like site position and the day of year, and most of them can be used to determine Tm globally. Using this approach no meteorological data in situ is needed so it could be useful for those stations without a collocated meteorological station or in a real time retrieval. Some global empirical models have been developed. UNB3m model is an improvement of UNB3 model related to the water vapour pressure estimations. It permits Tm to be derived from a set of atmospheric parameters that are calculated from a linear interpolation in latitudinal direction using a set of prefixed tabulated values (Leandro et al. 2006). Global empirical Tm model X, (GTm_X) was established from global reanalysis data provided by the ECMWF, based on the semiannual and diurnal variations in Tm, and achieves a global spatial resolution of 1° × 1° (Chen and Yao 2015). Global Weight Mean Temperature D, GWMT_D considers the diurnal variation using the UTC hour as an input, as well as GTM_X model, and achieves a rms less than 5.0 k on Tm (He et al. 2016). Global Pressure and Temperature 2 wet model (GPT2w), as well as its successor, Global Pressure and Temperature 3 (GPT3) (Landskron and Böhm 2017) was derived using ten years mean monthly pressure level data from ERA-Interim and is provided in two resolution grids of 1° × 1° and 5° × 5° (Böhm et al. 2015). These two last blind models provide not only the Tm but also a set of atmospheric parameters, so can be useful in the whole process of the retrieval of PWV from GNSS observations. Therefore, the latest, GPT3 model, is the chosen model to provide the needed atmospheric parameters on this work.

The uncertainty of the PWV retrieved from the GNSS observations can be evaluated from the comparison with other techniques or using a theoretical analysis where the uncertainty of the PWV is calculated from the uncertainties of each variable from which PWV is derived. The uncertainty of PWV comes mainly from the ZTD uncertainty, contributing to more than 75% uncertainty of PWV (Ning et al. 2015). Apart from that, the uncertainty of the atmospheric pressure, used on ZHD calculation coupled with the variables that are involved in the calculation of conversion factor , especially the uncertainty of the water-vapour-weighted mean temperature, affects the final quality of the retrieved PWV. Some studies found, using a similar approach, a less than 2 mm theoretical error for the GNSS-retrieved PWV (van Malderen et al. 2014; J. Wang et al. 2007).

3 Data sets and methodology

3.1 Study area

The area of this study is the Atlantic coast of Spain and France. This area has an oceanic climate, typified by low annual temperature variation, with a mild winter and warm summer, but without high temperatures. Regarding the precipitation, it is consistent the whole year, especially on coastal places. One remarkable characteristic of this area is its high humidity. Covering the whole area, nine GNSS sites were used (Fig. 1). Five of the sites contribute to the EUREF network. The stations of Spain are managed by the National Geographical Institute of Spain (IGE) while the French GNSS stations are managed by the National Institute of Geographic and Forest Information of France (IGN). Table 1 summarizes the locations of the GNSS sites.

Fig. 1
figure 1

Location of the GNSS stations used in the study in the Atlantic coast of Spain and France

Table 1 Locations of the GNSS sites used in this study. Latitudes and longitudes are given in decimal degrees. The height is given with respect to the WGS84 ellipsoid

3.2 Procesing of GNSS observations

The processing of GNSS observations, to obtain the ZTD, was done with the Bernese software, version 5.2, which was developed by the Astronomical Institute of the University of Bern (Dach et al. 2015). This software was used on a double difference phase approach. The cluster of the relative processing (Fig. 2) was designed considering that, on a double difference processing approach, the quality of the result is sensitive to the network configuration used on the calculation and should include long baselines, with the aim of obtaining absolute ZTD estimates. Tregoning et al. (1998) showed that networks including baselines longer than 2000 kms provide more accurate PWV estimates from GNSS observations, increasing the recommended value of 500 kms found by the study of Rocken et al. (1993). The cluster consists of 17 stations. In this set, 13 GNSS stations contribute to the EUREF network, whereas 6 of them contribute also to the IGS network.

Fig. 2
figure 2

Distribution of the GNSS stations used in the GNSS processing

Troposphere modelling in the Bernese software is based on a priori model that is corrected by the site-specific troposphere parameters, that are represented in a piecewise linear, continuous way (Dach et al. 2015). Because of the fact that the a priori model absorbs virtually the entire hydrostatic component of the troposphere delay, the wet component is estimated by those site-specific parameters and its corresponding mapping wet function. Moreover, Bernese software allows to estimate the gradient parameters, as a time-dependent horizontal north and east parameters. Due to the use of low elevation data, estimation of these tropospheric gradient parameters is essential to decorrelate station heights and tropospheric corrections. Then, the total tropospheric delay is the sum of the a priori model, its correction by the site-specific troposphere parameters, and the horizontal gradients. The needed tasks to do the described troposphere modelling on Bernese processing can be summarized as follows. First step is the preparation of the pole and orbit information, generating the standard orbits from IGS final orbits. Also, the observation Rinex data from GNSS sites are converted, pre-processed, and synchronized. After the formation of the baselines, essential in the Double Difference approach, the phase baselines files are pre-processed to detect and correct cycle-slip. Then, a residual screening of the observation data is done. Next step is the resolution of phase ambiguities. According to the characteristics of the network, the Quasi-Iono Free (QIF) strategy was chosen for this study. Then, the ambiguity-fixed network solution is computed and coordinates as well as troposphere parameters are obtained. As the final step, the Back Substitution procedure is applied, and the final TRP and TRO files, with 15 min rate, are written. The main parameters used on these described tasks are listed on Table 2. The elevation cutoff was set to 3° according to the Guidelines of EUREF EPN (https://www.epncb.oma.be/_documentation/guidelines/ accessed in November 2022).

Table 2 Main processing parameters used in Bernese GNSS Software 5.2

3.3 Global pressure and temperature 3

Global Pressure and Temperature 3, GPT3, is an empirical model that can act as a full empirical troposphere model since it outputs all the information that may be required in troposphere modelling (Landskron 2017). As explained, the GPT3 model was developed using 10 years (2001–2010) of monthly mean pressure levels from ERA-Interim reanalysis and is provided in two resolution grids of 1° × 1° and 5° × 5°. Even though the GPT3 and its predecessor GPT2w are very similar, GPT3 achieves slightly better results using the 1° × 1° grid version (Landskron and Böhm 2017). As input parameters, GPT3 uses the modified Julian date and the coordinates of the site (latitude, longitude, and ellipsoidal height). Also, two different cases with or without time variation can be chosen on the input. If the time variation is chosen, the model will consider the characteristics of annual and semi-annual cycles on the atmospheric parameters using the equation:

$${y}_{i}={A}_{0i}+{A}_{1i} cos\left(\frac{DOY}{365.25} 2\pi \right)+{B}_{1i} sin\left(\frac{DOY}{365.25} 2\pi \right)+{A}_{2i }cos\left(\frac{DOY}{365.25} 4\pi \right)+{B}_{2i }sin\left(\frac{DOY}{365.25} 4\pi \right)$$
(14)

where yi is any of the atmospheric parameters or coefficients that can be obtained by the model and DOY is the day of year. A0i represents the mean value of yi, A1i and B1i are the annual amplitudes and A2i and B2i are the semi-annual amplitudes of yi, and all are obtained from the grid of the model. As a last step, a bilinear interpolation is done to calculate the coefficients and atmospheric parameters on the desired location.

In this study, the GPT3 model was used on the 1° × 1° grid version with time variation (VMF Data Server). The scripts and the corresponding needed files and grids can be downloaded from http://vmf.geo.tuwien.ac.at/codes/ (accessed in November 2022). From the set of GPT3 output quantities, the water-vapour-weighted mean temperature and pressure were selected. Then, atmospheric pressure on the sites was applied on the Saastamoinen model to estimate the ZHD (Eq. 6). Once ZTD and ZHD were calculated, ZWD was obtained from the subtraction of these two quantities (Eq. 8). Hereafter, the estimation of the water-vapour-weighted mean temperature from GPT3 model was used to calculate the factor . Using this factor, ZWD was finally converted into PWV (Eq. 9).

The behaviour of GPT3 model is influenced by geographical location, with worse performance in higher latitudes or with drastic climate changes (S. Li et al. 2022a, b, c) and has a limited horizontal resolution of 1° × 1°. Moreover, even the annual and semi-annual cycles are considered by the model, the diurnal variations are not, so GPT3 model is unable to capture the fluctuations of the atmospheric parameters along the day. However, GPT3 model, as a full troposphere model, provides each needed atmospheric parameter so can be used in a consistent way in the retrieval of PWV from GNSS observations without in situ meteorological measurements and in a real-time retrieval requirement. GPT3 model has been considered a highly efficient model and has been adopted in different applications on PWV retrieval (Tunalı, 2022; J. Li et al. 2022a, b, c).

3.4 Radiosonde data

Radiosonde technique consists of a balloon that carries sensors that can measure different parameters while ascending in the atmosphere. Radiosonde is then, a direct measure of the vertical profile of atmospheric parameters from which different atmospheric variables can be retrieved. Precipitable Water Vapour from radiosonde data can be obtained by:

$$PWV=\frac{{10}^{3}}{\rho }\underset{z=0}{\overset{z=\infty }{\int }}{\rho }_{W}dz$$
(15)

where ρ and ρW are the liquid water and vapour water densities respectively, and z is the height in meters (Castro-Almazán et al. 2016).

Radiosonde data can be downloaded from different data sets. In this work, radiosonde data was obtained from the Integrated Global Radiosonde Archive (IGRA) version 2 (Durre et al. 2018). Version 2 of the archive contained observations from almost 2800 radiosonde sites around the world and it has a growing volume of data and spatial coverage. IGRA archive has also a data set of derived quantities, including the Precipitable Water Vapour. One remarkable characteristic of this derived data set is that, for its calculation, pressure level is up to 500 hPa. This is justified by the poorer behaviour of the humidity sensors on small temperatures (Ross and Elliott 1996).

The IGRA dataset used in this work can be obtained from the NCDC FTP Web site (online at ftp://ftp.ncdc.noaa.gov/pub/data/igra accessed in November 2022).

The accuracy of radiosonde data is affected by the sensor design, calibration, data processing and contamination (Moradi et al. 2013). Moreover, the changes in humidity sensors on the radiosonde sites introduce a discontinuity in the radiosonde climatic records that can lead to an inaccurate climate trend estimate (Wang and Zhang 2008). Sapucci et al. (2005) focused on the quality of humidity measurement of the radiosonde on tropical areas and its impact over the IWV retrieved. The study concluded that different sensors on the radiosonde had a similar behaviour especially on the lower atmosphere, where a major part of the water vapour is contained, and that the IWV could be retrieved with a rms of only 1.3 kg·m−1. J. Wang and Zhang (2008) studied the uncertainty on the radiosonde data focused on the identification of its systematic errors using GNSS observations. They used 130 matched GNSS and Radiosonde sites, analysing 14 different models of radiosonde. Regarding to the measurement errors of the radiosonde, the cited study concluded that Vaisala RS92 model, the one used at the matched sites in this study, had a dry bias to values of PWV less than 40 mm. Miloshevich et al. (2009) investigated the accuracy of Vaisala RS92 radiosonde model, comparing the humidity measurements of this sensor to simultaneous measurements from three reference instruments, and estimated a precision of 5% of the PWV on the retrieval of the water vapour.

3.5 EUREF REPRO-2

The second EUREF Permanent Network (EPN) reprocessing campaign, EPN REPRO-2, might be used as a reference data set for meteorological and climate research (Pacione et al. 2017). On that campaign, GNSS data of about 280 stations were reprocessed by 5 Analysis Centres using 3 GNSS software packages: Bernese (Dach et al. 2015), GAMIT (Herring et al. 2010) and GIPSY-OASIS II (Webb 1997). The resulting values of the ZTD for the network were evaluated to independent data sets as radiosonde and ERA–Interim, showing an improvement in the overall standard deviation compared with the first reprocessing campaign (EPN REPRO-1). The period covered by the second reprocessing campaign, for the tropospheric products, is from GPS week 834 to GPS week 1824. In order to compare with this latest reprocessing and reference data of ZTD from EUREF, the processing on this study started in July 2013 covering 87 common weeks and with information of the four-weather season. The values of troposphere parameters from EPN REPRO-2 can be downloaded at ftp://igs.bkg.bund.de/EPNrepro2/products/ (accessed in November 2022).

3.6 Statistical parameters

Mean Bias Error (MBE), Standard Deviation (SD) and Root Mean Square (RMS), the statistical parameters used in this study, were calculated using the following formulas:

$$MBE=\frac{\sum_{i=1}^{N}\left(PW{V}_{GNSS}-PW{V}_{RS}\right)}{N}$$
(16)
$$SD=\sqrt{\frac{\sum_{i=1}^{N}{\left(PW{V}_{GNSS}-PW{V}_{RS}\right)}^{2}}{N-1}}$$
(17)
$$RMS=\sqrt{\frac{\sum_{i=1}^{N}{\left(PW{V}_{GNSS}-PW{V}_{RS}\right)}^{2}}{N}}$$
(18)

where PWVGNSS is the Precipitable Water Vapour obtained as explained on Sect. 2 and PWVRS corresponds to the value extracted from the IGRA radiosonde archive.

4 Results

4.1 Validation of ZTD estimates using EUREF REPRO-2

The common weeks between EPN REPRO-2 and this work, from GPS Week 1737 to GPS Week 1824, were analysed on the five GNSS sites of the network that contribute to the EUREF network. Statistical results of the comparison are summarized on Table 3.

Table 3 Overall statistical results of the comparison of ZTD estimated versus EPN Repro 2 from GPS Week 1737 to GPS Week 1824

Results showed a very high correlation factor, with values greater than 0.996 on all sites. Mean Bias Error are low and clearly under one millimetre. Standard deviation and root mean square have no remarkable differences between the sites. Considering all the stations, mean values for correlation factor, MBE, SD and RMS are 0.99, -0.05, 2.44 and 2.46 respectively. The achieved statistical parameters show that the ZTD calculated by the processing are consistent with the official products and are then validated to proceed to the next step.

4.2 Validation of PWV using radiosonde data

Radiosonde, as a traditional technique to retrieve the PWV, it is often used as a reference to validate the results of other techniques, like GNSS. The GNSS and radiosonde site, if not collocated, must be in a maximum range of horizontal and vertical separation to be able to compare their PWV values. In fact, minimizing the vertical difference between the sites minimizes the PWV differences between the two techniques (J. Wang and Zhang 2009). Ohtani and Naito (2000) suggests a maximal horizontal distance of 40 kms with a difference of height of 100 m. Gui et al. (2017) increases the horizontal distance to 50 kms, with the same limit on the height difference. According to those limitations, in this work, two radiosonde sites, managed by the State Meteorological Agency of Spain (AEMET) were used. The radiosonde stations are located near to the ACOR and CANT GNSS sites. Table 4 shows these matched stations and their differences on position and height.

Table 4 Relation of GNSS and RS matched stations

Apart from the spatial difference between the matched sites, the difference in temporal resolution of the PWV obtained from the two techniques must also be considered. While ZTD from the processing of the GNSS observations were obtained with a sampling rate of 15 min, temporal resolution of the two radiosonde sites involved in this study was 12 h, with 2 launches per day, at 0UTC and 12UTC. In addition, the radiosonde is often released before this time to reach the tropopause around the synoptic time. The release time of the sounding is provided on the radiosonde data obtained from the IGRA archive. A review of this release time shows that, in the case of ACOR station, more than 92% of the launches were done more than 15 min before the synoptic time. On the CANT site, around 85% of the radiosondes were released more than 15 min before the synoptic time. This is important because most of the water vapour retrieved by the radiosonde is measured in the first 15 min of its ascending way because it is contained in the lower part of the troposphere. Some studies, taking all of this into account, do not match the GNSS data with radiosonde data at the exact time of 0UTC and 12 UTC. For example, Ning et al. (2012) suggests a shift of 30 min on values from the GNSS data set. Ohtani and Naito (2000) with a temporal resolution of the PWV from the GNSS observations of two hours, used a mean value from their two values calculated before and after the synoptic time. In this study, the mean of the 15 min rate PWV retrieved from GNSS observations from the release time to the synoptic time is calculated and then compared to the PWV retrieved from the radiosonde.

The validation was done between GPS Week 1737 to 1824, with information of each one of the four climate seasons. Within this period, on the ACOR site, values of PWV from the IGRA archive varied from a minimum of 4.83 mm, in December of 2013, on the local season of winter, to a peak maximum of 43.19 mm in October of 2013. Apart from that peak, the maximum values for ACOR site are found in the local summer season. In the case of CANT, the range of PWV varies from a minimum of 3.59 mm to a maximum of 41.08 mm in winter and summer local seasons respectively. According to the values found in the IGRA archive for PWV at the stations, on ACOR, 70% of the PWV values are from 10 to 25 mm while on CANT, same percentage is obtained for values between 15 and 32 mm.

The statistical parameters are shown in Table 5. The two data sets agree well with a high correlation and MBE values are low and under 1.5 mm. The linear regression (Fig. 3) yields a high R2 value for both sites, with values of 0.86 and 0.88 to ACOR and CANT respectively. The slopes of the linear regression lines show a direct relation between the two datasets and the slope of the linear regression, less than 1, indicating the dependence on the amount of PWV since when PWV increases the regression line separates from the diagonal. Besides, considering the intercept values, when PWV are low, the GNSS technique tends to underestimate the value of PWV but, when PWV are high, the technique has the opposite behaviour. So, the difference between the two techniques is greater as the PWV increases. This performance is clearer seen on the CANT site. A similar behaviour can be found in other previous studies (Deblonde et al. 2005; Namaoui et al. 2017).

Table 5 Statistics summary of the comparison between PWV-retrieved from GNSS and Radiosonde
Fig. 3
figure 3

Scatter diagrams of PWV derived from GNSS and radiosonde at a ACOR and b CANT sites

Since the matched radiosonde sites had two launches, at 0 UTC and 12 UTC, the performance at day and night could be studied. Statistical comparisons calculated with Eqs. 10, 11, 12 are shown in Table 6. Correlation coefficients are again high, all above 0.93 and with no significant differences in this parameter between daytime and night-time measurements. Related to the MBE, SD and RMS statistical parameters, on the ACOR site the differences are also not significant showing on CANT site a higher difference on the three parameters associated with the day or night retrieval. This can also be seen in the linear regression over the data (Fig. 4). The analysis shows a very small difference on the ACOR site both on slopes and R2 parameter while on the CANT site differences are more significant. The different behaviour at daytime and night-time of the radiosonde Vaisala RS92-SGP, the one used on ACOR and CANT sites, have been observed to a greater or lesser extent in different studies and it is mainly related to the different performance of the humid sensor because of the solar heating during the day (Cady-Pereira et al. 2008; van Malderen et al. 2014).

Table 6 Statistical parameters of the comparison of Radiosonde and GNSS PWV regarding to the launching time
Fig. 4
figure 4

Scatter plots of the comparison of Radiosonde and GNSS PWV regarding to the launching time at a ACOR 0 UTC, b CANT 0 UTC, c ACOR 12 UTC and (d) CANT 12 UTC

Wang and Zhang (2008) explained that the differences between the two techniques, that can possibly make the expected theoretical error unachievable, come from three main sources: the distance, both horizontal and vertical, between the matched sites, the differences in the measurement technique and finally the errors in the data sets, both radiosonde and GNSS. Regarding the effect of not being collocated, the cited study claimed that there could arise an important difference in the water vapour retrieved from the two techniques, especially with strong precipitable water vapour gradients. About the measurement techniques, the PWV-retrieved from the two techniques comes from a different volume of air. In the case of the GNSS technique, the ZTD corresponds to the vertical mapped value, using mapping functions, from all the paths between the antenna and the satellites, with different azimuths and elevation angles. In the case of the radiosonde, the profile of atmospheric parameters, from which PWV is obtained, is measured in the radiosonde ascending path. This path could be non-vertical, suffering a displacement by the wind. Although the displacement that the radiosonde suffers on its complete path is difficult to quantify, the cited study explained that it could be as much as 20 kms in the horizontal direction in 10 kms of ascent. This displacement could affect more on sites situated near the coast because some part of the PWV profile retrieved by the radiosonde could be measured over the sea, increasing the differences between the GNSS and radiosonde technique (Vaquero-Martínez et al. 2019). Since the ACOR and CANT stations are located on the coast, the above-mentioned effect could have occurred. Related to the errors of the two techniques, even though radiosonde is considered as a reference, it has its own errors. The use of a third technique could further clarify if the differences could come from a mix of errors on the two techniques (Ning et al. 2015). However, the results achieved in this comparison between the two techniques and shown on Table 5 and Table 6, are consistent with other comparisons. For example, similar values of MBE can be found in other studies (Gui et al. 2017; Rezaei and Khazaei 2022). Related to the parameters SD and RMS, similar values can be found in previous studies (Gui et al. 2017; Ohtani and Naito 2000; Torres et al., 2010).

4.3 Characterization of PWV over the area of study

Once the processing was validated, the PWV of each GNSS site were calculated using the TRP and TRO files derived from our Bernese processing, with GPT3 as a model for the pressure and Tm. The calculated values of PWV covered four years, from July 2013 to the end of June 2017, between GPS weeks 1747 to 1955, with a sampling rate of 15 min. Table 7 shows the number of observations (N) and some statistical parameters of the PWV retrieved from GNSS for all the stations, using the daily mean. Table 7 also shows the mean value of PWV considering four seasons: June to August (JJA), September to November (SON), December to February (DJF) and March to May (MMA).

Table 7 Statistical parameters of the 4-year PWV-retrieved. The row ALL corresponds with the mean values in the network of study

Multiannual mean, using all the stations, is 19.70 mm. LROC, RENN and BRST stations have values below this mean. Considering each station, the multiannual maximum mean is found in SCOA, with a value of 21.30 mm while the multiannual minimum mean is found in RENN with 17.85 mm, the only station situated inland. As noticed in Table 7, PWV retrieved from the GNSS observations capture the seasonal PWV variation, with minimum values observed in the local winter season and maximum values in the local summer season. The multiannual daily mean in the local summer season is almost double the corresponding value in the winter season. The minimum mean value is found in winter in RENN of 12.88 and the maximum mean value is 29.61 in the summer season at SCOA site. The mean of autumn and spring seasons are between the values of the other extreme seasons, although the smallest differences are found between winter and spring season. As noticed in Table 7, PWV retrieved from the GNSS observations using the blind empirical model shows the seasonal PWV variation, with minimum values observed in the local winter season and maximum values in the local summer season. This behaviour can also be observed using the multiannual monthly mean of the PWV (Fig. 5). Clearly, in all the sites, the PWV rises from the minimum in February to its maximum in July. This evident seasonal variation is related with the rise in temperature associates with the changes on seasons in this area. SCOA, BIAZ and CANT, with higher values according to Table 7, have the higher changes between months, while on RENN or BRST the variation are small.

Fig. 5
figure 5

Distribution of multiannual monthly mean of PWV in each GNSS site

The temporal variation of PWV series were described from its mean and annual and semi-annual terms to the retrieved PWV in each GNSS site. The fit model was constructed as:

$$PWV={PWV}_{0}+{a}_{1}\mathrm{sin}\left(2\pi t\right)+{a}_{2}\mathrm{cos}\left(2\pi t\right)+{a}_{3}\mathrm{sin}\left(4\pi t\right)+{a}_{4}\mathrm{cos}\left(4\pi t\right)$$
(19)

where PWV0 is the estimated mean, a1 is the linear trend and t is the time in years. The annual amplitudes are the pair (a2, a3) while (a4, a5) corresponds to the semi-annual amplitudes. Parameters on Eq. 19 were estimated using the least square method and setting the first day of the series, 1st of July 2013, as the time epoch t = 0. Table 8 shows the results on each site.

Table 8 Estimated mean and annual and semi-annual terms of GNSS-retrieved PWV series

Annual amplitudes are clearly more important than the semi-annual in all the sites. Sites that are located closer, tend to have similar amplitudes. The annual amplitudes are stronger in SCOA, BIAZ and CANT, sites with the biggest variability in their PWV over the 4-year period. This annual cycle can also be observed in Fig. 6, which shows the daily mean for the whole four-year period on each site. Regarding each site, minimum values were found on BRST and RENN with values less than 1 mm, while the maximum, higher than 44 mm were found on CANT and SCOA, both in 2015. These two stations presented a high variability behaviour for the whole period. Maximum values of 43.4 and 43.7 were also seen in 2015 in VIGO and ACOR respectively. In general, minimums values are observed in most of the sites on the end of 2013 or the beginning of 2014. The distribution of mean daily values shows a bigger amplitude between maximum and minimum in 2015 and 2014 in VIGO, ACOR, CANT, SCOA, BIAZ and MZAN and a clear lower amplitude in 2016. However, this tendency is the opposite in LROC, RENN and BRST where the differences between maximum and minimum are bigger in 2016.

Fig. 6
figure 6

Time series of daily mean PWV on the GNSS sites from July 2013 to June 2017 at a VIGO, b ACOR, c CANT, d SCOA, e BIAZ, f MZAN, g LROC, h RENN and i BRST

Related to the spatial distribution of water vapour in the area of study, the PWV multiannual mean tends to decrease with increasing latitude, according to the distribution of the multiannual mean with the latitude (Fig. 7). In fact, the line on Fig. 7, that represents the best linear fit between the multiannual PWV mean and latitude, has a negative slope.

Fig. 7
figure 7

Distribution of multiannual mean of PWV with latitude

The diurnal cycle of PWV is also one of the most important of its variations (J. Wang and Zhang 2009). The high temporal resolution of the PWV retrieved from the GNSS observations, 15 min, facilitates the study of the short-term variations of the PWV on every site. The daily precipitable water vapour anomaly is calculated as the difference between the mean daily and each PWV value calculated from the processing. Figure 8 shows the multiannual diurnal anomalies on each site, calculated through the entire period, and considering All seasons, and each of the four seasons: DJF, MAM, JJA and SON.

Fig. 8
figure 8

Multiannual diurnal PWV anomalies considering All Seasons, and each DJF, MAM, JJA and SON season as indicated in the legend, at a VIGO, b ACOR, c CANT, d SCOA, e BIAZ, f MZAN, g LROC, h RENN and i BRST

Using the whole data set of the stations, the maximum multiannual amplitude considering all seasons is found in CANT with a value of 1.31 while LROC has the smallest: 0.54 mm. Diurnal variation is clearly smaller at night than during the daytime at most of the stations. Related to the extreme values, most of the sites have their minimum value between 5 and 7 UTC, while the maximum values are observed between 20 and 22 UTC except at the RENN site where the maximum is at 19 UTC but with a close value to the one observed at 20 UTC. Temperature mainly affects the diurnal cycle (Ortiz de Galisteo et al. 2011). This explain the minimum at night and the morning increase of PWV according to the rise of temperature that was also found in a study of the characterization of PWV in the Gulf of Cadiz, in Spain (Torres et al., 2010). Some stations exhibit double peaks on their multiannual diurnal anomalies. For example, MZAN has a minimum at 7 UTC, but another minimum is observed on this site at 16 UTC, to reach the maximum at 22 UTC. A similar behaviour is found in LROC, with two minimums, one at 6 UTC and another at 16 UTC to reach the maximum at 21 UTC, but with lower values. At the ACOR site, a change is observed on the diurnal anomaly around 18 UTC.

The diurnal anomalies also capture the seasonal variability (Fig. 8). Thus, as expected, the highest values of amplitude occur in summertime, with SCOA and its neighbour site BIAZ, with the highest amplitudes in this season. In fact, SCOA has the biggest value of amplitude with 2.04 mm in the summer season. The differences between summer and winter are more significant in VIGO, RENN, SCOA and ACOR where amplitude values are doubled in summer than in winter. This performance is consistent with the study in Europe from Wang and Zhang (2009) where the diurnal cycle detected at summertime was clearly higher than in winter. In the case of CANT, the site with the highest amplitude considering the entire 4-years period, shows a similar behaviour among the four seasons, although it maintains a greater variability in summer and spring than winter and autumn. In most sites, spring and autumn have intermediate amplitudes compared to those found for summer and winter; spring however tends to be more like winter. LROC, RENN and BRST, the sites with higher latitude, have less amplitude in every season than the rest of the sites.

There are many different causes for the diurnal variations of PWV. Dai et al. (2002) in their study over North America, explained some principal causes, for example, the surface evapotranspiration, the atmospheric large-scale vertical motion, the atmospheric low-level moisture convergence, the precipitation and, finally, the vertical mixing in the lowest atmosphere. Wind also plays an important role. On their study over a network in Japan, G. Li et al. (2008) concluded that, the phase of the diurnal cycle of stations situated on the coast, like the sites of this study, is clearly different from the one found on other areas because of the sea breeze that introduces humidity on the atmosphere of coastal sites and affects the PWV variation during the day. Related to the seasonal variation, Kannemadugu et al. (2022) in their study over East Africa concluded that the small variations of temperature during the winter season accounts for the small amplitude found on this season as in the occurrence in this study. In summary, the diurnal variations of the PWV on every site are caused by the complex mix of the aforementioned causes and other factors such as the orography that plays an important role. Therefore, these short-term variations, as well as the seasonal variability found with GNSS-retrieved PWV could be applied on further studies on climate over the area of study and can be useful to model the complex variability of the PWV. Biondi (2015) indicates that, the short-term variations captured by the GNSS technique, mixed with a correct distribution of the GNSS sites, help to characterize the rainy events. Consequently, GNSS-retrieved PWV could be used to improve real time short-term rainfall forecasting (Cao et al. 2015). In fact, the characterization of GNSS-retrieved PWV before a rainfall event has been used to develop rainfall forecasting models (H. Li et al., 2020, L. Li et al. 2022a, b, c). Besides that, Ortiz de Galisteo et al. (2011) claimed that the knowledge of the diurnal cycle of PWV could improve the atmospheric correction of the satellite images.

5 Summary and conclusions

This study presents the characterization of PWV on the Atlantic coast of Spain and France using GNSS observations without any meteorological in situ measurements, using the empirical model GPT3. The starting point of the PWV retrieval from the GNSS observations is the calculation of the ZTD that is estimated with other parameters in the double difference phase processing of the observations. As given its importance as the starting point, the ZTD obtained from the processing were validated with the official values from EUREF REPRO-2 obtaining a mean correlation factor, MBE, SD and RMS of 0.99, -0.05, 2.44 and 2.46 respectively. Then, the rest of the parameters needed for the calculation of the PWV, as the pressure and the weighted mean temperature were extracted from the global blind model GPT3, using its grid of 1° × 1° resolution. Thereafter, 15-min rate PWV values were calculated through a four-year period at the nine sites distributed on the area of study. The PWV calculated was validated using two matched Radiosonde stations. The statistical parameters found showed a good agreement between the two data sets with values of MBE less than 1.5 mm and mean SD and RMS of 2.66 and 2.95 mm respectively. Even if the values agree with previous studies, it will be desirable to include a third independent PWV retrieval technique in order to verify the discovered differences.

The characterization of PWV over the area of study using GPT3 model clearly captures the seasonal variability with the maximum value on local summer season and minimums on winter season in all sites. In fact, the multiannual mean in JJA season is almost double than of DJF season. Maximum mean values are retrieved on SCOA, CANT and BIAZ with 21.30, 20.83, 20.13 mm respectively, while the minimum corresponds to the only inland station: RENN with 17.85 mm. The annual cycle is clearly more important, in all the sites, than the semi-annual cycle. Related to the spatial distribution, PWV multiannual mean tends to decrease as the latitude increases. The multiannual diurnal anomalies also reflect the seasonal variability, with higher amplitudes in summer local season and smallest on winter local season. The highest amplitudes found in the CANT site that exhibit a great variability, followed by SCOA and BIAZ. Even though the sites have different behaviour on their diurnal variation, a similar pattern is detected with a minimum in the late-night hours to reach a peak in the afternoon.

The PWV characterization on the area using GNSS observations demonstrates the potential of the use of the GNSS observations as an excellent source of information of the spatial and temporal behaviour of the water vapour present in the atmosphere, even without meteorological in situ measurements. Future research on the long-term PWV variations will be desirable, enlarging the time series of GNSS-retrieved PWV over the area in order to be able to estimate the trends. Moreover, future investigation of the behaviour of the PWV before the severe rainfall events in the area will be done to develop a real time short-term rainfall forecasting model with the use of the empirical model GPT3.