The exploratory dataset of isotopic composition of different water sources across Kazakhstan

This work presents a dataset of stable water isotopes of oxygen and hydrogen measured in water samples from different sources (precipitation, surface water, groundwater, tap water) across Kazakhstan from 2017 to 2018 and from 2020 to 2023. The dataset includes the isotopic composition of 399 water samples: precipitation, both event-based (n = 108) and cumulative monthly (n = 22); surface water from lakes, reservoirs, brooks, rivers and channels (n = 175); shallow and artesian groundwater and springs (n = 85); and tap water (n = 9). For each sample, the name of the source, location, latitude, longitude, date of sampling and measurement uncertainty (one standard deviation) are available. The samples were assessed by plotting the data in dual δ18O vs. δ2H isotope space with reference to values found in the published literature and by fitting a linear regression equation to the Astana event precipitation data. Overall, this is the first dataset covering a wide range of water sources across Kazakhstan. It could be used in global and regional water resource assessments and in studies such as tracing of water sources, hydrograph separation and end-member analyses, isotope mass balance, evapotranspiration partitioning, residence time analysis and groundwater recharge.

Rainfall event samples were collected during abundant precipitation events using a large plastic container and immediately transferred into the vials and sealed. Monthly cumulative rain samples were collected by Palmex Rain Sampler (RS-1, Palmex d.o.o.) and at the end of the month or the first day of the next month samples were transferred into the sealed bottles. Snow samples were collected after the end of the snowfall event and melted at room temperature before filling in 20 mL borosilicate glass or plastic scintillation vials with screw caps. Lake water and streamflow samples were collected by grab sampling at the shoreline during field trips. Groundwater samples were collected from boreholes using a bailer and transferred to the sealed bottles. Lake water, groundwater and streamflow samples were usually collected in duplicate during each sampling. Immediately after collection, all samples were sealed with Parafilm M to avoid evaporation. The samples were stored at room temperature or in a refrigerator until analysis. Samples were analyzed using two methods (Instruments): (

Value of the Data
• This is the first dataset [1] on the isotopic composition of different water sources across Kazakhstan, and it can serve as a useful reference for further isotope studies of Kazakhstan's water resources.
• The dataset contains the first three years of event-based precipitation data (from samples collected during or immediately after precipitation events, n = 87) and monthly precipitation data (n = 22) for Northern Kazakhstan (Astana city), which allow a Local Meteoric Water Line to be derived for this region (Fig. 2).
• The presented data include isotope values from large endorheic lakes in Central Asia, such as Lake Balkhash, North (Small) Aral, Tushybas lake and Issyk-Kul Lake; from smaller regional lakes, e.g. Kambash lake in the Aral Sea Basin; and from rivers, e.g. Syr Darya, Nura, Yesil and Ertis (Irtysh).
• This dataset contains data on groundwater isotope composition from both shallow (unconfined) and artesian (confined) aquifers.
• It extends the previously published local dataset of precipitation, groundwater, stream and lake samples from Burabay [2], collected from 2015 to 2016 and used to study the isotope mass balance of lakes and regional hydrology.
• The dataset also contains the measurement error (one standard deviation), which can be used for uncertainty estimation.
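As an illustration of how a Local Meteoric Water Line can be derived from paired δ18O and δ2H values, the following minimal sketch fits an ordinary least-squares line of the form δ2H = slope × δ18O + intercept. The function name and the demonstration values are assumptions made for this example; the values are synthetic and are not taken from the dataset, and the sketch is not necessarily the regression procedure used by the authors.

```python
def fit_meteoric_water_line(d18o, d2h):
    """Ordinary least-squares fit of d2H = slope * d18O + intercept.

    Both inputs are sequences of delta values in per mil.
    """
    n = len(d18o)
    if n != len(d2h) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(d18o) / n
    mean_y = sum(d2h) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in d18o)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(d18o, d2h))
    if sxx == 0:
        raise ValueError("all d18O values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only (NOT values from the dataset):
# points lying exactly on a line with slope 8 and intercept 10.
demo_d18o = [-20.0, -15.0, -10.0, -5.0]
demo_d2h = [8.0 * x + 10.0 for x in demo_d18o]
slope, intercept = fit_meteoric_water_line(demo_d18o, demo_d2h)
print(f"d2H = {slope:.2f} * d18O + {intercept:.2f}")
```

With real data, the two input lists would be the δ18O and δ2H columns of the Astana event precipitation samples. Whether to weight the samples by precipitation amount is a separate choice that this sketch does not address.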

Background
The main goal in producing this dataset was to generate baseline exploratory data on the isotopic composition of water samples from different sources for the whole of Kazakhstan. Global and regional assessments of water resources and climate based on environmental tracers often rely on compilations of data from past published work [3,4] or on decades of observations [5]. In many cases, such analyses are limited to the data-rich regions of North America and Europe or rely on older data [6,7]. Moreover, for the published isotope studies in Kazakhstan [8,9], the underlying data are rarely available [2]. Considering the large area of Kazakhstan and its wide range of landscapes and climate zones, the dataset can serve as a good reference for global and regional groundwater recharge, isotope mass balance, paleoclimate and end-member analysis studies, or aid local investigations [10].

Data Description
The presented dataset contains the results of stable isotope characterization of water samples collected during 2017-2018 and 2020-2023 across Kazakhstan (Fig. 1), which is located in the center of Eurasia. The data attached to the paper [1] form an original dataset of oxygen and hydrogen isotopic ratios (δ18O and δ2H) expressed in per mil (‰) after normalization of the raw measurements to internationally accepted reference standards (i.e. VSMOW2 and SLAP2), along with the measurement error reported as one standard deviation. The dataset (n = 399) consists of the following water source types: precipitation (n = 130): event precipitation (n = 108), monthly

Field methods and sampling
The water samples were usually collected in 20 ml plastic or borosilicate glass scintillation vials with screw caps and sealed with Parafilm M to prevent evaporation. The samples were stored at room temperature in a dark place or in a refrigerator. Prior to analysis, the samples were usually filtered through a 0.45 μm PTFE syringe filter into 2 ml screw-top vials with PTFE caps, following the established procedure [2]. Samples were labeled in the field with sample ID, date and location. The coordinates were recorded using a mobile phone GPS or an available GPS device and verified afterwards with Google Earth.
Rainfall event samples were collected during abundant precipitation events using a large plastic container, from which the water was immediately transferred into the vials and sealed. Monthly cumulative rain samples were collected with a Palmex Rain Sampler (Model RS-1, Palmex d.o.o.) located at a height of approximately 1.5 m on the Nazarbayev University campus, at the end of the month or the beginning of the next month. Snow samples were collected after the end of the snowfall event and melted at room temperature before being filled into vials. Lake water and streamflow samples were collected during the ice-free period by grab sampling at the shoreline, avoiding poorly mixed zones. Groundwater samples were collected from boreholes using a bailer and transferred to sealed bottles, or from the taps of free-flowing (artesian) wells. Lake water, groundwater and streamflow samples were usually collected in duplicate during each sampling.

Laboratory analysis
The samples collected in 2017-2018 were analyzed at GFZ (German Research Centre for Geosciences, Potsdam, Germany) on a Picarro isotope analyzer (model L2130-i; see the detailed description below). Before shipping to GFZ, the samples were filtered through a 0.45 μm PTFE syringe filter into 2 ml screw-top vials with PTFE caps without headspace and sealed with Parafilm M. The samples were placed in a vial case, sealed in a polystyrene foam box to prevent freezing (the samples were shipped in winter) and posted via a fast courier service. The samples collected in 2020-2023 were analysed at Nazarbayev University on a Los Gatos Research (LGR) Liquid Water Isotope Analyser (model IWA-912, ABB Ltd; see the detailed description below). The samples in the dataset are marked accordingly in the column 'Measurement method' as Picarro or LGR.

The Picarro method
The stable water isotopes δ18O and δ2H were measured by Wavelength Scanning Cavity Ring-Down Spectroscopy (WS-CRDS) using a Picarro L2130-i water isotope analyser equipped with an A0211 vaporizer and an A0325 autosampler (Picarro Inc., USA) at the stable isotope laboratory of the German Research Centre for Geosciences in Potsdam. Aliquots of about 1.5 ml of filtered water sample were filled into glass vials and capped with silicone/Teflon septa. All samples were measured within an optimized sequence layout (i.e. with independently prepared quality control samples and two internal laboratory reference standards) in high precision mode with nitrogen as the carrier gas, and the sample size was always kept at the same amount of water [12].
The raw data were then corrected post-run for analytical effects, i.e. drift, and normalized to the VSMOW2/SLAP2 scale defined by the international certified standard reference materials, following the procedure described by van Geldern and Barth [12]. To reduce the memory effect, only the mean of the last 4 of 10 measurements of each sample was used for data evaluation.
The isotopic ratios are reported using the standard delta notation in per mil (‰), and the analytical precision (1σ standard deviation) is <0.05 ‰ for δ18O and <0.3 ‰ for δ2H.
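The two data-reduction steps described above, averaging only the last injections of each sample and normalizing to a reference scale defined by two standards, can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the procedure of van Geldern and Barth [12]: drift correction is omitted, the function names are hypothetical, and all numerical values (raw injections, measured and assigned standard values) are invented for the example.

```python
from statistics import mean


def mean_of_last_injections(injections, keep=4):
    """Average only the last `keep` injections of a sample.

    Early injections are discarded because they may still carry a
    memory (carry-over) signal from the previously measured sample.
    """
    if len(injections) < keep:
        raise ValueError("fewer injections than requested for averaging")
    return mean(injections[-keep:])


def two_point_normalize(raw, measured_low, measured_high,
                        assigned_low, assigned_high):
    """Linearly map a raw delta value onto a two-standard reference scale.

    measured_* are the raw instrument values of the two standards and
    assigned_* are their accepted values on the reference scale.
    """
    if measured_high == measured_low:
        raise ValueError("the two standards must have different measured values")
    slope = (assigned_high - assigned_low) / (measured_high - measured_low)
    return assigned_low + slope * (raw - measured_low)


# Hypothetical raw d18O injections (per mil) for one sample: the first
# values approach the plateau gradually, mimicking a memory effect.
raw_injections = [-9.2, -9.8, -10.1, -10.3, -10.4, -10.45,
                  -10.5, -10.5, -10.5, -10.5]
sample_raw = mean_of_last_injections(raw_injections, keep=4)

# Hypothetical standards: raw (measured) vs. assigned values in per mil.
sample_normalized = two_point_normalize(
    sample_raw,
    measured_low=-29.5, measured_high=0.5,
    assigned_low=-30.0, assigned_high=0.0,
)
print(f"raw mean = {sample_raw:.2f}, normalized = {sample_normalized:.2f}")
```

The same two functions apply unchanged to δ2H, using the δ2H values of the standards.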

The LGR method
The samples were analysed on a Liquid Water Isotope Analyser (Model IWA-912, ABB Ltd) with a PAL LSI autosampler (CTC Analytics AG, Switzerland) according to the manufacturer's procedures. Manufacturer's reported analytical errors of the instrument are ≤±0.δ 2 H = −82.1 ± 0.6 ‰ . The analyses were performed with 2 preparatory and 6-8 measured injections in high precision mode for each sample. The last four injections with minimal standard deviations were usually used for normalization. In case of large deviations (exceeding the instrument's uncertainties) between injections for a particular sample, the measurements were repeated to obtain a reproducible result. The post-processing was done in the LGR post-analysis software (Version 4.5.06) using bracketing normalization and spectral contamination screening.
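The injection handling described above can be sketched as follows: preparatory injections are dropped, the last four measured injections are averaged, and the sample is flagged for re-measurement when the spread between those injections exceeds a chosen threshold. This is only an illustration and not the manufacturer's post-analysis software; the function name, the threshold `max_sd` and the demonstration values are assumptions.

```python
from statistics import mean, stdev


def summarize_injections(injections, n_preparatory=2, n_used=4, max_sd=None):
    """Summarize repeated injections of one sample.

    Returns (mean, standard deviation, needs_repeat) computed from the
    last `n_used` injections after discarding the first `n_preparatory`.
    `max_sd` is a user-chosen threshold in per mil (an assumption of this
    sketch); with max_sd=None the sample is never flagged.
    """
    measured = injections[n_preparatory:]
    if len(measured) < n_used or n_used < 2:
        raise ValueError("not enough measured injections")
    used = measured[-n_used:]
    sd = stdev(used)  # sample standard deviation (n - 1 denominator)
    needs_repeat = max_sd is not None and sd > max_sd
    return mean(used), sd, needs_repeat


# Hypothetical d18O injections (per mil): 2 preparatory + 6 measured.
stable = [-9.0, -9.6, -10.0, -10.1, -10.0, -10.1, -10.0, -10.1]
noisy = [-9.0, -9.6, -10.0, -10.1, -10.0, -11.0, -9.0, -10.5]

for label, values in (("stable", stable), ("noisy", noisy)):
    avg, sd, repeat = summarize_injections(values, max_sd=0.2)
    print(f"{label}: mean = {avg:.2f}, sd = {sd:.2f}, repeat = {repeat}")
```

In practice the threshold would be set from the instrument's stated uncertainty for the isotope in question.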

Limitations
The present dataset is primarily exploratory in nature and is based mostly on wide-range sampling. The generated Local Meteoric Water Line for Astana is based on approximately two years of event-based collection. The monthly cumulative values include heatwave/drought years with low amounts of warm-season precipitation (2021-2022). There were also initial problems with the Palmex precipitation sampler, namely precipitation undercatch and the absence of a snow tube and siphon (winter) inlet (resolved in autumn 2023), which resulted in very small amounts of sample being collected. Where the sample amount was sufficient for analysis, it was processed and incorporated into the dataset, and a note was added to the comments column. The undercatch was resolved by upgrading from a 135 mm diameter funnel to a 235 mm diameter funnel. The sampling location in Astana at Nazarbayev University is a site of the Global Network of Isotopes in Precipitation (GNIP). The quality of the sample analysis was assessed through completion of a round-robin test supplied by the International Atomic Energy Agency's Isotope Hydrology Section, as outlined in [13].
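The effect of the funnel upgrade mentioned above can be checked with simple geometry: the collecting area scales with the square of the diameter, so moving from 135 mm to 235 mm increases the catch by a factor of about three. The sketch below assumes an ideal circular funnel with no wetting, splash or evaporation losses; the function names are hypothetical.

```python
import math


def funnel_area_cm2(diameter_mm):
    """Collecting area of a circular funnel in cm^2."""
    radius_cm = diameter_mm / 10.0 / 2.0
    return math.pi * radius_cm ** 2


def collected_volume_ml(diameter_mm, precipitation_mm):
    """Ideal collected volume in mL for a given precipitation depth.

    1 mm of precipitation equals a depth of 0.1 cm, and 1 cm^3 = 1 mL.
    Losses (wetting, splash, evaporation) are ignored.
    """
    return funnel_area_cm2(diameter_mm) * precipitation_mm / 10.0


ratio = funnel_area_cm2(235) / funnel_area_cm2(135)
print(f"area ratio 235 mm / 135 mm: {ratio:.2f}")
for diameter in (135, 235):
    volume = collected_volume_ml(diameter, precipitation_mm=1.0)
    print(f"{diameter} mm funnel: {volume:.1f} mL per 1 mm of precipitation")
```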
"International Partnerships for Sustainable Innovations", grant no.02WCL1552A .This work was also conducted with financial support from Nazarbayev University (CRP Research Grant No. 021220CRP2122 ).This research was also supported under the target program No. BR05236529 "Complex ecosystem assessment of Shchuchinsk-Borovoye resort area through the environmental pressure evaluation for the purposes of sustainable use of recreational potential" from the Ministry of Education and Science of the Republic of Kazakhstan.The acquisition of Liquid Water Isotope Analyzer was funded through the framework of the Environment & Resource Efficiency Cluster (EREC) by Nazarbayev University.We thank Elena Guseva for collecting samples in 2020-2021 in Burabay.