Isoscape of precipitation amount-weighted annual mean tritium ( 3 H ) activity from 1976 to 2017 for the Adriatic-Pannonian region

Tritium (H) as a constituent of the water molecule is an important natural tracer in hydrological sciences. The 15 anthropogenic tritium introduced into the atmosphere became unintentionally an excellent tracer of processes on the time scale of up to a 100 years. A prerequisite for tritium applications is to know the distribution of tritium activity in precipitation. Here we present the spatially continuous gridded database (isoscapes) for amount-weighted annual mean tritium activity in precipitation for the period 1976 to 2017 on 1×1 km grids for the Adriatic-Pannonian Region (using 39 stations), with a special focus on post-2010 years which are not represented by existing global models. Three stations were used to check the model 20 performance independently confirming its capability to reproducing the spatiotemporal tritium variability in the region. This ‘Regional model’ is capable of providing reliable spatiotemporal input data for hydrogeological application at any place within Slovenia, Hungary and its surroundings. Results also show a decrease in the average spatial representativity of the stations regarding tritium activity in precipitation from ~600 km in 1970s when bomb-tritium was still prevailing in precipitation, to ~300 km in the 2010s. The post-2010 isoscapes can serve as benchmarks for background tritium activity for the region, helping 25 to determine local increases of technogenic tritium from these backgrounds. The gridded tritium isoscape is available in NetCDF-4 at doi: 10.1594/PANGAEA.896938 (Kern et al., 2019).

The quality of such curves is vital for the reliability of a hydrological model outputs when employed as input signal/data in hydrological modeling/calculations (Koeniger et al., 2008;Miljevića et al., 1992). Indeed, it has recently been found that the (in)accuracy of the used precipitation tritium time series is the key uncertainty factor for groundwater recharge estimations (Li et al., 2019).
The aim of this study was to create a spatially continuous gridded database for tritium (isoscape) in precipitation across the Adriatic-Pannonian Realm for the decades around the turn of the 21st century with a special focus on the post-2010 which is not covered by the existing global models.

2.
Materials and Methods

Used 3 H and precipitation data
An initial dataset was collected with 8053 monthly precipitation tritium activity values from 45 stations (GNIP ((IAEA, 2016)), ANIP (Kralik et al., 2003), (Krajcar Bronić et al., 2020;Palcsu et al., 2018;Vreča et al., 2006;Vreča et al., 2015;Vreča et al., 2014; current project) covering the period from Jan 1961 to Dec 2017. To maximize the spatiotemporal 85 density of the data set not only the Adriatic-Pannonian region, but the bordering areas were included in the analyses as well.
The availability of 3 H data varied in the investigated time period. Three time horizons were outlined with a relatively high abundance of data: early 1980s (number of annual data (na)≈15), early 2000s (na ≈14) and the early 2010s (na ≈21) (Fig. 1a).
Until 1973 tritium activity data was only available from Austria. Monitoring of isotopes in precipitation on a larger scale in the region began in the mid-1970s in Belgrade (RS), Zagreb (HR) and Budapest (HU) as well. Following the initiation of these 90 measurements becomes the network suitable -specifically from 1976 -for the spatiotemporal analysis of the large-scale variability of precipitation tritium activity in the region. Between 2003 and 2005, the number of stations dropped (<9, Fig. 1a) due to a halt in the data collection of the Austrian stations. This was the lowest number of active stations in the investigated period. For the purpose of further calculations, the geographical coordinates of the stations were converted from latitude and longitude (EPSG: 4326, WGS84 projection) to metric coordinate system (EPSG:3857, WGS 84 / Pseudo-Mercator projection), 95 since interpolation (variography see Sect. 2.3) has to be done on a metric scale. To be able to derive amount weighted annual tritium activity averages, (0.5° × 0.5°) monthly precipitation amounts were used from the GPCC database (Becker et al., 2013), derived as precipitation anomalies at stations interpolated and then superimposed on the GPCC Climatology V2011 (Meyer-Christoffer et al., 2011). averages later used in the interpolation . The largest distance between the neighboring active stations of the studied 3 H network in each year for 1976-2017 B). The spatial distribution of the monitoring sites C), where the height of the blue columns is proportional to 105 the number of monthly data available between 1976 and 2017 at a given station; max=479 data at Podersdorf Austria. The country codes follow the ISO-3166-1 ALPHA-2. The basemap was taken from Bing maps, HERE Technologies 2019; accessed on 27.09.2019.

Data preprocessing
A sequential univariate outlier detection procedure (Ben-Gal, 2005) was applied to the data to find possible outlying values, 110 which deviate to a high extent from the other observations (Barnett and Lewis, 1974;Hawkins, 1980). During the procedure, the time series of the stations were pairwise compared for each year. The approach is similar to the relative homogeneity test applied to meteorological data, in which e.g. a candidate station's time series is compared to its neighboring stations'; e.g. (Alexandersson, 1986;Lindau and Venema, 2019;Sugahara et al., 2012).
To avoid comparing a station with all the others from the network, including distant ones recording different environmental 115 conditions (e.g. Alpine region vs. Great Hungarian Plain), the comparison was done only within a given search radius. The network was screened for each station's distance to its nearest neighbor for each year. Then out of all the years, the most frequently occurring largest nearest neighbor (~320 km) was chosen (Fig. 1b) to serve as the search radius for the sequential univariate outlier detection. There were 15 years when a station did not have a pair to compare it with. In 1976In , 1993In -2000In and 2003 it was Belgrade-, while between 2013 and 2017 it was Debrecen due to their relatively isolated location from the 120 others in the network. These are the southeasternmost and northeasternmost stations (Fig. 1c).
Pairwise differences of 3 H data in monthly steps were calculated for each station with its neighbors within the ~320 km search radius. These pairwise differences were then averaged per month and the values belonging to the same calendar year were handled together. Due to the decrease in atmospheric concentration in tritium Rozanski et al., 1991), the difference values were not comparable between the years, so the outliers were identified annually. The monthly average 125 difference values were annually standardized.
It was found that the standardized mean differences were mostly within the ±1 interval (82 %; Fig. 2) suggesting the usually small difference between neighboring records. In rare occasions (n=6 occurrences; 0.09%) the difference value was outside the ±7 interval. These deviations were considered as threshold, determining the set of possibly erroneous data (outlier (Ben-Gal, 2005)), which were investigated one-by-one, if possible by consulting the data providers. For example, in Dec 1994 at 130 Zagreb, the standardized differences indicated a possible error (d= -9.33), which coincided with experimental research in the nearby facility in which technogenic tritium was used (Krajcar Bronić et al., 2020), thus the sample was excluded from the analysis (Fig. 2).   . The grey shaded background highlights the ±1 standardized difference interval. The standardized difference of -9 (in a red rectangle) corresponds to an outlier measured at Zagreb (Dec 1994); it is shown on the inset map along with the 3 H records from its neighbors within the search radius. For further details see text.

140
Annual amount-weighted means were only calculated if at least 85% of the fallen precipitation was analyzed for 3 H. If more than 15% of the fallen precipitation was not analyzed for 3 H, the year in question will be referred to as an "incomplete year". This required completeness is stricter criterion than the GNIP protocol (70%; (IAEA, 1992). These amount-weighted annual averages served as the input values for deriving the isoscapes with variography.
A robust hemispheric-scale pattern is a poleward increasing trend of precipitation 3 H (Rozanski et al., 1991). Regression 145 analysis between geographical latitude (using the metric coordinates in EPSG:3857) and amount weighted-annual precipitation 3 H activity concentration mostly yielded insignificant linear relationships or contradictory to what was expected (i.e. poleward decreasing values e.g. 1987). The limited latitudinal extent of the study area (°5) might explain the failure to detect the expected relationship. However, due to the lack of a clear spatial trend statistical trend removal was not conducted on the amountweighted annual mean 3 H activities instead they were used for regional isoscape modeling. 150

Derivation of precipitation amount-weighted annual mean tritium activity isoscapes
Semivariograms (Webster and Oliver, 2008) were used as the weighting function in kriging (Cressie, 1990) to explore the spatial variance of precipitation amount-weighted annual mean 3 H activity for the stations of the Adriatic-Pannonian region. The empirical semivariogram may be calculated using the Matheron algorithm (Matheron, 1965), where (ℎ) is the 155 semivariogram and ( ) and ( + ℎ) are the values of a parameter sampled at a planar distance |ℎ| from each other is the number of lag-h differences, i.e. n× (n-1)/2 and n corresponds to the number of sampling locations at a distance h.
The most important properties of the semivariogram are the nugget, quantifying the variance at the sampling location 160 (including information regarding the error of the sampling), the sill that is, the level at which the variogram stabilizes, which is the sum of the nugget (c0) and the reduced sill (c), and the range (a), which is the distance within which the samples have an influence on each other and beyond which they are uncorrelated (Chilès and Delfiner, 2012). If the semivariogram does not have a rising part and the points of the empirical semivariogram align parallel to the abscissa, a nugget-type variogram is obtained. In this case, the sampling frequency is insufficient to estimate the sampling range using variography (Hatvani et al., 165 2017).
For geostatistical modeling (e.g. kriging), theoretical semivariograms have to be used to approximate the empirical ones (Cressie, 1990). Gaussian semivariograms were obtained with a maximum lag distance of 400 km and 11 uniform bins (steps) in order to achieve the most balanced number of station pairs per bin in the analysis. The effective range (ae), which is the distance within which the samples have an influence on each other and beyond which they are uncorrelated (Chilès and 170 Delfiner, 2012) were determined and used to evaluate the spatial representativity of the network. The reported ranges in the study area are planar distances in km; conversion to geodetic distance in the region: dplanar×0.678 ≈ dgeodetic.
In a preliminary screening it was found that semivariograms had to have at least 3 station pairs in the first bin and more than 14 pairs in the first 3 bins to be applicable for interpolation; these were the minimum requirements for kriging. Semivariograms perfectly applicable for interpolation were obtained from years 1977, 1982, 2007, 2010, 2011 and 2012. The number of active 175 stations in these years varied between 13 and 24. These years where further on used as the reference years. The years with a reduced number of available stations (Fig. 1a) produced semivariograms not applicable for kriging (for technical explanation see Appendix 1), because the data were sporadically spread in space and/or none of the stations provided continuous measurements in time.
Both types of data gaps can be classified as missing at random (MAR) (Little and Rubin, 2002). Because most modern data-180 imputation-methods start by assuming the missing data is MAR, imputation tools could have been applied in years with insufficient data density for proper interpolation. However, in every case, no method can provide an 'automatic' solution to the problem of missing data, and any approach must be used with caution considering the context of the problem (Kenward and Carpenter, 2007); for instance, the accuracy of the imputed value will not be optimal and the spatial correlation and intravariable relationships will be corrupted (Barnett and Deutsch, 2015). Finally, 42 stations were considered for further evaluation out of which 39 stations were used for tritium isoscape derivation while three were excluded from interpolation and used to test the performance of the interpolated products (Table 1)

Tritium isoscapes (1976-2017)
According to the obtained regional gridded precipitation amount-weighted annual mean 3 H activity time series for the Adriatic-Pannonian region (referred to hereinafter as Regional model) the monitoring network provides a proper representativity of the study area (e.g. Fig. 3, in-set maps). 215 The most striking long-term temporal pattern (decrease in precipitation 3 H activity; Fig. 3) prevailing in the whole region seen from the isoscapes is also reflected in time series of distant locations (Fig. 4). Moreover, the distinctive interannual fluctuation of amount-weighted annual mean 3 H activity at Budapest and Ljubljana (Fig. 4) also indicate that the Regional model produced differing sub-regional variability over the modelled time. For instance, the maxima of the modelled precipitation 3 H activity occurs in 1979 and 1976, while a local minima from the early '90s in 1990 and 1991 at Budapest (Fig. 4a) and Ljubljana (Fig.  220 4b) are observed, respectively.
Although no significant relationship was documented between latitude and/or continentality, still increasing precipitation 3 H activity was observable inland with the lowest values documented along the Slovenian and northern Croatian coast in all years (see e.g. 2010; Fig. 3). This pattern can be related to the generally observed lower activity at maritime coastal stations due to the higher contribution of primary marine evaporation practically free from 3 H (Eastoe et al., 2012;Rozanski et al., 1991;225 Vreča et al., 2006) and higher contribution of recycled modern meteoric water over the continent. For instance, moisture originating from continental Europe and the Atlantic Ocean was found to be distinct regarding tritium concentrations (8.8 TU and ~0 TU, respectively) (Juhlke et al., 2019).
Results show a decrease in the spatial autocorrelation of tritium activity concentration of precipitation from the 1970s to the 2010s (Fig. 3): ~600 km in the 1970s, ~450 km in the 1980s, to ~300 km in the 2010s. This period  was 230 characterized by the removal of bomb-tritium from the atmosphere (Araguas-Araguas et al., 1996;Palcsu et al., 2018). The overwhelming activity of bomb-produced 3 H was several orders of magnitude higher than the natural background (Rozanski et al., 1991), and largely masked the smaller-scale natural variability. During last 2-3 decades the tritium activity in precipitation has declined globally and regionally, approaching the natural pre-bomb level and the bomb-tritium is barely present in modern precipitation. Since, in the Adriatic Pannonian region, the 3 H activity in precipitation approached natural 235 levels by the early-1990s (Krajcar Bronić et al., 2020;Palcsu et al., 2018;, it can be expected that the ~300 km range obtained for the 2010s reflects the range of similarity of natural 3 H variability in the study area (SE Europe and E Central Europe). Regarding spatial coverage, the northwestern part of the region was much more represented in all years, due to the expected denser station network along the Austrian border with Slovenia and Hungary ( Fig. 1c; Fig 3).   (upper panels: 1977, 1982, 2007; lower panels: 2010, 2011, 2012) in the Adriatic-Pannonian Region. The areas outside the union of the range ellipses of a given year are dimmed and the Adriatic Sea marked in white. Isoscape grid resolution: 1 × 1 km. Easting and northing in 10 5 km. The inset figures show the empirical-(empty black squares) and theoretical semivariograms (blue line) used for kriging along with the obtained effective ranges (ae planar distances in km) and 245 the fit (r 2 ) of the theoretical semivariograms. The dotted horizontal line indicates the average variance.

Verification of goodness of interpolation
Two of the longest records from both Slovenia (Ljubljana) and Hungary (Budapest) illustrate the performance of the estimations and their potential in mitigating lack of data. Budapest-and Ljubljana 3 H records -used both in the variograms of 250 the "anchor years" and in the interpolationwere compared to the interpolated product's time series of the nearest grid cell (Fig. 4). In the years when the measured values were used in interpolation, there is an expected perfect match between the measured and modelled values. It becomes clear that the estimated records are more than capable in filling the gaps of the measured time series, when there were no measurements (e.g. Ljubljana: 1985 and1996;Fig. 4b) or in the case of "incomplete years", when the ratio of fallen precipitation not analyzed for 3 H in a given year was >15% (e.g. Budapest: 1987 and1991, 255 Fig. 4a;Ljubljana: 1986, 1997-1998, 2000). In these particular years, when the measured 3 H values were not used for interpolation, the modelled values seem more capable of reproducing the actual 3 H variability using the neighboring stations' than from the fragmented 3 H data of the incomplete year.
The average differences between the Regional model and measured values were -0.03TU for Budapest, and 1.13TU for Ljubljana, excluding the years with not enough precipitation represented. In the meanwhile, the average difference in the so-260 called incomplete years was ~17 to 0 TU for Budapest and Ljubljana, with a general tendency of obtaining higher differences with a higher ratio of precipitation not represented by tritium measurements. It is noteworthy, that although the short-term intradecadal variability of atmospheric tritium is different at the two sites, their long-term decrease concurs even at a ~ 400 km distance, again indicating the goodness of the interpolation. The presented Regional model of tritium activity was compared with the spatially corresponding output of both currently 265 available global precipitation tritium isoscapes: the Modified global model of tritium in precipitation (MGMTP (Zhang et al., 2011)) and the Global inverse distance weighted model (GIDW) at Budapest (Fig. 4a) and Ljubljana (Fig. 4b). Between 1975 and 1980 the Regional model's and the MGMTP's estimates are very similar and resemble the actual weighted annual mean precipitation 3 H at Budapest. However, only at Ljubljana is the MGMTP capable of steadily reproducing the actual measurements until the late-1990. Afterwards, it indicates solely negative values, which are uninterpretable, just as most of 270 the MGMTP predicted values at Budapest after 1980. In the meanwhile, the Regional model gave much more accurate and reliable results (Fig. 4) as discussed above. Note here, that the weak estimation of the MGMTP can be attributed to the difficulties in reading the precipitation tritium activity values from the only available output (isoline map) of the model and the undocumented factors of the model in given years.
The GIDW (Jasechko and Taylor, 2015) model was capable of reproducing the measured precipitation tritium values much 275 more accurately at all locations than the MGMTP (Fig. 4). Nevertheless, the GIDW model produced a striking overestimation at the beginning of the modelled period, for example, in 1977, when the measured values at Budapest were overestimated by >20 TU (Fig. 4a). On the contrary, the GIDW model underestimated the actual values from 1981 to 1991, except for one year (Fig. 4a). It should be noted, that the Regional model gave an even better regional estimate, then either of the global models.  it was >0%. The empty circles indicate an "incomplete year" in which the given 3 H value was not used for interpolation.
As an additional out-of-sample verification, the measured precipitation tritium records at stations Zgornja Radovna (2010), Siófok (2013 and Malinska (Dec 2000 and2001) were compared to the Regional model's estimated 3 H time series of the grid closest to the stations. The average annual difference between the modelled and measured values was 3.8 TU 290 in 2001 at Malinska (Fig. 5a), 0.1 TU at Zgornja Radovna (Fig. 5b) and -1.7 TU at Siófok stations (Fig. 5c), while the st. dev. of the differences was 0.3 and 1.6 TU for Zgornja Radovna and Siófok respectively. The Regional model estimated annual amount-weighted 3 H activity at Zgornja Radovna very accurately, while the somewhat higher difference at Siófok could be explained by the closeness of the largest shallow freshwater lake in Central Europe, Lake Balaton (Hatvani et al., 2014). The mean residence time in the largest basin of the lake, Siófok Basin, was estimated to be between 2 and 6 yrs in the 1990s 295 (Istvánovics et al., 2002), which presumably in the same range in the 2010s as well. Keeping in mind the gradual decrease of 3 H in meteoric waters (in the region e.g. Fig. 4), the evaporation from this 'aged' reservoir can provide an isotopically detectable contribution to the atmospheric moisture measured at Siófok station, resulting in higher tritium activity values than the modelled ones (Fig. 5c).
The high difference (+3.8 TU) between the Regional model and the measured values at Malinska can be attributed to the high 300 portion of precipitation (20%) not having corresponding tritium measurements in either year. Moreover, at Malinska, the Regional model provided more reliable estimates than the MGMTP, which produced negative -thus meaningless -values in the period when direct measurements were available (Fig. 5a).

5.
Possibility of applications (outlook, conclusions) Continuous long-term records of tritium in precipitation are scarcely available worldwide, thus estimations or modelling are necessary to exploit its potential in hydrological researches. In order to decrease the uncertainty of tritium activity in the hydrological models, the application of regional 3 H models have to be increased, since these are more capable of producing 315 accurate estimations than global ones (Stewart and Morgenstern, 2016).
Instead of using remote station data or ad hoc composite curves, site specific time-series retrieved from the presented Regional precipitation amount-weighted annual mean 3 H isoscapes should be used. These isoscapes  can serve as a reference dataset for studies on infiltration dynamics, water transport through various compartments of the hydrological cycle, mixing processes, run-off modelling; e.g. to estimate mean residence time in surface waters and groundwater (Kanduč et al., 320 2014;Ozyurt et al., 2014;Szucs et al., 2015). As a specific type of hydrogeological application, the Regional model of 3 H time-series will serve as a benchmark in estimating the mean infiltration age of dripwater (Kluge et al., 2010)  The higher precipitation 3 H activity observed at a lakeshore station (Fig. 5c) reflects moisture recycling from the aged lake 325 surface water via evaporation to the local precipitation. The observed deviation highlights the potential of the database to reveal sub-regional anomalous local sources in the hydrological cycle. As a special case the post-2010 isoscapes can serve as benchmarks for background tritium activity for the region, helping to determine local increases of technogenic tritium from these backgrounds.
Our Regional model was able to provide better estimates than either of the global models for the study area. Prior to 1975 we 330 encourage the use of the GIDW model's estimations (Jasechko and Taylor, 2015) as a reference for studies dealing with precipitation tritium activity. The Regional model and the GIDW model should be spliced together at 1975 and can be used together in the need of a semi-centennial precipitation tritium activity dataset.

6.
Data format and availability 335 The final product, the spatially continuous annual (1976-2017) 1×1 km grids of precipitation amount-weighted annual mean tritium activity for the Adriatic-Pannonian Region is provided in a netCDF-4 (net-work common data form) format available at PANGAEA (https://doi.pangaea.de/10.1594/PANGAEA.896938) , compiled using the EPSG 3857 projection. A script written to be able to browse the dataset and convert the projection to EPGS 4326 is provided in the supplement. 340 https://doi.org/10.5194/essd-2019-244