Two ultraviolet radiation datasets that cover China

Ultraviolet (UV) radiation has significant effects on ecosystems, environments, and human health, as well as atmospheric processes and climate change. Two ultraviolet radiation datasets are described in this paper. One contains hourly observations of UV radiation measured at 40 Chinese Ecosystem Research Network stations from 2005 to 2015. CUV3 broadband radiometers were used to observe the UV radiation, with an accuracy of 5%, which meets the World Meteorological Organization’s measurement standards. The extremum method was used to control the quality of the measured datasets. The other dataset contains daily cumulative UV radiation estimates that were calculated using an all-sky estimation model combined with a hybrid model. The reconstructed daily UV radiation data span from 1961 to 2014. The mean absolute bias error and root-mean-square error are smaller than 30% at most stations, and most of the mean bias error values are negative, which indicates underestimation of the UV radiation intensity. These datasets can improve our basic knowledge of the spatial and temporal variations in UV radiation. Additionally, these datasets can be used in studies of potential ozone formation and atmospheric oxidation, as well as simulations of ecological processes.


Introduction
Although the ultraviolet (UV) solar spectrum contributes only approximately 8.0% of the entire solar radiation at the top of the atmosphere (Gueymard, 2004), it is vital for ecosystems and the environment (Williamson et al., 2014), as well as for human health and climate change (Ferrero et al., 2006; Thomas et al., 2012). UV radiation has gradually become a major research topic, especially since the discovery of the ozone hole. It is an indicator of the amount of ozone that will form. UV radiation can inhibit plant photosynthesis by destroying leaves, which subsequently affects the balance of ecosystems. The absorption cross section of a gas multiplied by the amount of UV radiation gives the rate of photolysis, which determines atmospheric oxidation, affects photochemical reactions and produces secondary pollutants in the near-surface layer. Moreover, excess UV radiation may induce sunburn, skin cancer, and eye cataracts. In the stratosphere, the absorption of solar UV radiation by ozone counteracts the effect of radiative cooling caused by increases in carbon dioxide and water vapor.
The amount of UV radiation that penetrates to the surface depends mainly on the solar zenith angle, clouds, aerosols, ozone, and surface reflectivity (Bais et al., 1993; Arola, 2003; Bernhard et al., 2007; Kerr and Fioletov, 2008). The interactions between UV radiation and these factors are complex and not yet fully understood. den Outer et al. (2010) reconstructed UV radiation back to 1960 in Europe and found that UV levels have gradually increased over the last four decades, particularly since the 1980s. With regard to the period from 1980 to 2006, approximately two-thirds of the increase in UV radiation can be attributed to a decline in cloudiness or aerosol optical depth (AOD) and one-third can be attributed to reductions in ozone. Wei et al. (2006) analyzed the long-term changes in ozone and noontime erythemal UV radiation from 1978 to 2011 using Total Ozone Mapping Spectrometer data products. They found that, in the eastern and southern parts of China, the ozone layer is not the main reason for the trend in UV irradiance; rainfall and the related cloud variations have significant correlations with UV radiation in these regions, and approximately 40%-70% of the variability in UV radiation there can be explained by precipitation change. On the other hand, in the western and northern parts of China, variations in ozone can explain approximately 30%-70% of the variations in UV radiation. Measurements from October 2004 to September 2006 at Xianghe Station were used to analyze the relationships between UV radiation and meteorological factors (Xia et al., 2008). It was found that aerosols can cause a reduction of approximately 0.0091 in $F_{UV}$ (the ratio of UV radiation to the amount of solar radiation reaching the surface) per unit increase in AOD (500 nm) for precipitable water vapor content within a range of 1.5-2.0 cm. Additionally, an increase of 17% in $F_{UV}$ resulted from an increase of 1.0 cm in the precipitable water vapor content; $F_{UV}$ increased by 20% when the ground was covered with snow. It was difficult to quantify the role of each factor because the contribution of each factor differs between time periods and regions. Thus, accurate measurements of UV radiation and the main factors controlling its levels at the surface of the Earth are of great importance for achieving better understanding of the interactions between UV radiation, ozone, aerosols, clouds and surface reflectivity (Schwander et al., 1997; Mayer and Kylling, 2005).
Unfortunately, as UV radiation is not a routinely measured parameter, in situ measurements of UV radiation are very scarce, especially in China. The network for observing UV radiation in China started very late. In the early 1990s, a Brewer UV radiation observation system was first installed at the background stations of Waliguan and Zhongshan for long-term continuous observations. Additionally, domestic researchers have established UV radiation observation stations at a few widely scattered locations. Although many characteristics of UV radiation have been obtained, the records are very short, and the results are local rather than regionally representative. Therefore, it is important to carry out research on the measurement of UV radiation on a national scale.
The aim of this paper is to introduce two datasets describing UV radiation. One contains hourly observations of UV radiation from 40 Chinese Ecosystem Research Network (CERN) stations in China, and the other contains reconstructed daily UV radiation amounts at 724 China Meteorological Administration (CMA) stations. The latter used a UV estimation model combined with a hybrid model that was proposed and then improved by Yang et al. (2001, 2006). Basic information on the observed and reconstructed data is provided in section 2. A sample description of the datasets is presented in section 3, and some applications of the data are described in section 4.

Basic data information
Basic information on the dataset is presented in the data profile table, including the time range and geographical scope that it covers, the dataset's composition, and so on. In sections 2.1 and 2.2, detailed information on the observed and reconstructed properties of the data is described.

Collection of the observed UV radiation data
The UV radiation data were obtained from a national-scale network, CERN, which was the first standard network established to measure solar radiation for investigating the radiation budget and its spatial and temporal variations over China. CERN consists of 40 stations that provide in situ measurements of UV radiation, covering almost all typical ecosystems: fifteen agricultural stations, ten forest stations, six desert stations, three bay stations, two grassland stations, two lake stations, one marshland station, and one urban station. The spatial distribution of the 40 CERN stations in China is shown in Fig. 1, and the geographical locations, altitudes and ecosystem types of the stations are provided in Table 1. The urban station, in Beijing, which is located at the Institute of Atmospheric Physics of the Chinese Academy of Sciences, is the center of data collection, quality control and instrument calibration for the entire radiation observation system.
CUV3 broadband radiometers (Kipp & Zonen, The Netherlands), which have an accuracy of 5%, have been installed at all CERN stations to observe UV radiation (290-400 nm). This level of accuracy meets the World Meteorological Organization's (WMO) measurement standards. UV pyranometers are calibrated using standard lamps with a known spectroradiometer, and this calibration system is at the forefront of UV radiation research in China. A spectrometer measures the spectral irradiance of a standard lamp and then retrieves the spectral sensitivity under standard lamp conditions ($K_c$). Using the same method, the spectral sensitivity under sunshine conditions ($K_s$) can be deduced. $K_c$ is considered equal to $K_s$ in narrow wavebands. For each narrow waveband, $K_c$ can be obtained from the lamp spectral irradiance, and the spectroradiometer can then be used to measure solar irradiance. The UV radiation can be derived by integrating the solar spectral irradiance between 290 and 400 nm. At the same time, the calibrated UV pyranometers measure the response voltage. All CUV3 pyranometers are calibrated and inter-compared at the beginning and the end of data collection to ensure the accuracy of the calibration. Daily checks are made to ensure that the radiometers are free of dirt and positioned horizontally, which safeguards data quality. M520 (Vaisala) data collectors are used to collect the data. All radiation data are recorded at 1-min intervals, and hourly values are derived from the 1-min values through integration.

Quality control of the observed UV radiation
Quality control of the UV radiation data addresses two aspects. The first is the observation system, and the second is a quality assessment of the UV radiation measurement data.
In terms of the first aspect, all radiation sensors and data collectors used at the CERN stations meet the WMO's standards for radiation observations. The UV pyranometers at individual stations are calibrated monthly against reference instruments that have been calibrated against regional reference instruments, which in turn are calibrated against Chinese reference instruments every two years. Data that are missing due to human error or sensor failure are represented by 32766. As there is no UV radiation at night, the UV radiation cells in the Excel files are left blank when the solar elevation angle is negative.
The quality assessment for the UV radiation measurement data is primarily based on three principles. First, considering the cosine response, the UV radiation data are replaced with 9999 when the solar elevation angle is less than 5° (Huang et al., 2011). Second, each observed hourly UV radiation value should be less than the hourly extraterrestrial UV radiation at the top of the atmosphere at the same geographical location; otherwise, it is flagged as questionable data and replaced with 9999. The extraterrestrial UV radiation ($UV_{ext}$) is calculated using Eq. (1), which was proposed by Foyo-Moreno et al. (1999). In that equation, $I_{suv}$ is the ultraviolet spectral irradiance and is equal to 78 W m$^{-2}$; $\rho$ is the correction factor for the Sun-Earth distance; $\alpha$ is the solar zenith angle; and $w$ is the hour angle. Third, the ratio of UV radiation to the global solar radiation ($R_s$) should be limited to values between 0.02 and 0.08; otherwise, the data are replaced with 9999 (Hu et al., 2007). All the quality control procedures were carried out using a FORTRAN 90 program. The values in the last column of Table 1 are the effective data quantities after quality control.
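The three principles above, together with the two sentinel codes, can be sketched as a small function. This is a minimal illustration and not the authors' FORTRAN 90 program: the function name, the argument names, and the treatment of a non-positive global solar radiation value are assumptions, and `uv_ext` is taken as a precomputed input because Eq. (1) is not evaluated here.

```python
MISSING = 32766   # missing measurement (human error or sensor failure)
REJECTED = 9999   # value failed one of the three quality control principles


def quality_control(uv, uv_ext, rs, solar_elevation_deg):
    """Return the value to store for one hourly UV record.

    uv, uv_ext and rs must share the same units. uv is None when no
    measurement was recorded. A return value of None stands for a blank cell.
    """
    if solar_elevation_deg < 0:
        return None                      # night: cell is left blank
    if uv is None:
        return MISSING
    if solar_elevation_deg < 5:
        return REJECTED                  # principle 1: cosine response
    if uv >= uv_ext:
        return REJECTED                  # principle 2: extraterrestrial limit
    if rs <= 0:
        return REJECTED                  # assumption: ratio cannot be checked
    if not 0.02 <= uv / rs <= 0.08:
        return REJECTED                  # principle 3: UV / R_s ratio
    return uv
```

Keeping the two codes distinct lets a later user separate records that were never measured from records that were measured but judged questionable.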

Development of the reconstructed UV radiation data
The UV radiation observation system in China started relatively late compared to similar systems elsewhere around the world. Most of the observed UV datasets described above begin in 2005, and the observation sites are very sparse. To obtain high spatial resolution and long records of historical UV radiation data, either empirical and semi-physical methods or satellite inversion algorithms can be used. In this study, daily UV radiation datasets for the period before the use of instrumentation were obtained from solar radiation measurements through an all-sky UV estimation model. However, in the CMA radiation network, only around 120 sites can provide daily solar radiation. Moreover, due to the retrofitting of many of the radiation instruments before 1993, the accuracy of the observed solar radiation data was relatively low during that period (Tang et al., 2011). Instead of using these data, solar radiation amounts reconstructed with a hybrid model were used to calculate the UV radiation.

Reconstructing solar radiation using a hybrid model
The hybrid model put forward and improved by Yang et al. (2001, 2006) was applied to estimate solar radiation using the AOD and total column ozone measurements retrieved from satellite data and routine meteorological observations obtained from the CMA. The details of this hybrid model have been described by Yang et al. (2006), so only a brief description is given here. Physical processes, namely Rayleigh scattering, aerosol extinction, ozone absorption, water vapor absorption, permanent gas absorption, and the effects of clouds, which are represented by the transmittance functions $\tau_r$, $\tau_a$, $\tau_{oz}$, $\tau_w$, $\tau_g$ and $\tau_c$, respectively, are considered, and the simplicity of the Ångström correlation is also maintained in the hybrid radiative transfer model. The solar beam radiative transmittance ($\tau_b$) and the solar diffuse radiative transmittance ($\tau_d$) under clear skies can be calculated using Eqs. (2) and (3):
$$\tau_b = \tau_{oz}\,\tau_w\,\tau_g\,\tau_r\,\tau_a \tag{2}$$
$$\tau_d = 0.5\,[\tau_{oz}\,\tau_g\,\tau_w\,(1 - \tau_a\,\tau_r)] \tag{3}$$
Solar radiation reaching the surface of the Earth can be obtained using Eq. (4), in which $R_0$ is the solar radiation at the top of the atmosphere, $t$ is time, and $\Delta t$ is the integration period. More detail on the methods for calculating $\tau_r$, $\tau_a$, $\tau_{oz}$, $\tau_g$ and $\tau_c$ can be found in the paper by Yang et al. (2006). The model inputs, including surface pressure, surface relative humidity, air temperature, and sunshine duration, were obtained from routine observations at 724 weather stations (Fig. 2) with specified latitudes and longitudes, and underwent quality control by the CMA.
Other input values, such as the AOD and total column ozone at the 724 stations, were interpolated from satellite retrievals. The column ozone concentrations were obtained from the Solar Backscatter Ultraviolet Merged Ozone Data Set, Version 8.6 (http://acd-ext.gsfc.nasa.gov/Data_services/merged/index.html). The AOD was obtained from a MODIS data product (MOD08-M3, level 3, Collection 5.1) (http://ladsweb.nascom.nasa.gov/data/search.html) with a spatial resolution of 1° × 1°. As no AOD data were available before 2000, and no ozone data before 1970, climatic mean AOD and ozone values were used in the hybrid model for those periods. All the calculations involved in the hybrid model were conducted using a FORTRAN 90 program. Using this hybrid model, daily solar radiation values from 1961 to 2014 were obtained and could be used to reconstruct the daily UV radiation.
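As a rough illustration of how the clear-sky transmittances combine, the sketch below evaluates the beam and diffuse transmittances from the individual transmittance functions. The diffuse expression follows Eq. (3); the beam transmittance is assumed to be the plain product of the five clear-sky transmittance functions, and the function and argument names are hypothetical. Computing the individual transmittances requires the parameterizations of Yang et al. (2006) and is not attempted here.

```python
def clear_sky_transmittances(tau_oz, tau_w, tau_g, tau_r, tau_a):
    """Return (tau_b, tau_d) for clear skies.

    Each argument is a broadband transmittance between 0 and 1:
    ozone, water vapor, permanent gases, Rayleigh scattering, aerosols.
    """
    absorbed = tau_oz * tau_g * tau_w            # what survives gas absorption
    tau_b = absorbed * tau_r * tau_a             # assumed form of the beam term
    tau_d = 0.5 * absorbed * (1.0 - tau_a * tau_r)   # diffuse term, Eq. (3)
    return tau_b, tau_d


# Illustrative values only, not taken from the paper.
tau_b, tau_d = clear_sky_transmittances(0.98, 0.90, 0.99, 0.90, 0.80)
print(f"tau_b = {tau_b:.4f}, tau_d = {tau_d:.4f}")
```

The factor of 0.5 in the diffuse term expresses that half of the radiation removed from the beam by scattering is taken to reach the surface.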

Reconstructing UV radiation by combining the hybrid model results with an all-sky UV estimation model
As UV radiation is highly correlated with solar radiation, most published experimental results use measured solar radiation to calculate UV radiation by considering the ratio of UV radiation to solar radiation as an empirical constant (Calbó et al., 2005; Podstawczynska, 2010). Long and Ackerman (2000) used a power law equation to describe the dependence of solar radiation on the cosine of the solar zenith angle under clear-sky conditions. The clearness index ($K_t$) is defined as the ratio of the solar irradiance reaching the surface of the Earth to the extraterrestrial solar irradiance, and it provides a general indication of scattering and absorption processes due to aerosols, gases, clouds, etc. The effects of $K_t$ and the solar zenith angle on UV radiation have been analyzed and confirmed under different sky conditions in earlier studies (Hu et al., 2010; Wang et al., 2013; Liu et al., 2016). To develop the estimation model, the entire hourly dataset from Lhasa Station under all weather conditions was studied. Figure 3a displays the UV radiation plotted against the cosine of the solar zenith angle ($\mu$). Different colors represent different $K_t$ values. For a given $K_t$ interval, it is recommended that the relationship between UV radiation and the cosine of the solar zenith angle is calculated with the following power law equation:
$$UV = UV_0\,\mu^{e} \tag{5}$$
where $UV_0$ indicates the UV radiation for one unit of $\mu$, and $e$ determines how UV varies with $\mu$. Unfortunately, it is not straightforward to obtain direct measurements of $UV_0$, as that requires the solar zenith angle to be zero. The $K_t$ intervals started at 0.03 and increased with a step size of 0.01. The relationship between UV and $\mu$ was fitted using the power law equation [Eq. (5)] within each specific $K_t$ interval. The relationship between $UV_0$ and $K_t$ was then analyzed (Fig. 3b). The dependence of $UV_0$ on $K_t$ is described by Eq. (6), in which the units of $a$, $b$, $c$ and $d$ are W m$^{-2}$. Long-term daily values of $R_s$ could be easily obtained from the hybrid model, but long-term hourly values of $R_s$ were not obtainable. Therefore, Eqs. (5) and (6) were modified to calculate daily UV radiation amounts using the daily values of $R_s$, giving Eq. (7). In Eq. (7), $UV_{daily}$ is the daily amount of UV radiation, in units of MJ m$^{-2}$ d$^{-1}$; $K_t$ is the ratio of daily $R_s$ to daily extraterrestrial solar irradiance; $\bar{\mu}$ is the average of the cosine of the solar zenith angle from sunrise to sunset; $t_d$ is the daily sunshine duration (hours); and $A$, $B$, $C$, $D$ and $E$ are parameters that differ between climatic zones. Therefore, China was divided into eight climatic zones, and the UV radiation data were reconstructed using the corresponding values of the parameters $A$ to $E$, given in Table 2, and combined with the solar radiation estimates obtained from the hybrid model.
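One common way to fit a power law of this kind within a single K_t interval is ordinary least squares on the log-transformed data, since ln UV = ln UV_0 + e ln μ. The sketch below assumes that approach; the paper does not state which fitting procedure was used, and the function name and the synthetic sample values are illustrative only.

```python
import math


def fit_power_law(mu, uv):
    """Fit UV = UV_0 * mu**e by linear least squares in log-log space.

    mu: cosines of the solar zenith angle (0 < mu <= 1)
    uv: UV irradiance values observed within one K_t interval
    Returns (uv0, e). Non-positive samples are skipped.
    """
    pairs = [(math.log(m), math.log(u)) for m, u in zip(mu, uv) if m > 0 and u > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two positive samples are required")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("mu values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    e = sxy / sxx                          # slope: the exponent
    uv0 = math.exp(mean_y - e * mean_x)    # intercept: UV at mu = 1
    return uv0, e


# Synthetic, noise-free example: the fit should recover the inputs.
mu_samples = [0.2, 0.4, 0.6, 0.8, 1.0]
uv_samples = [30.0 * m ** 1.2 for m in mu_samples]
uv0, e = fit_power_law(mu_samples, uv_samples)
print(f"UV_0 = {uv0:.3f}, e = {e:.3f}")
```

Because UV_0 is the value of the fitted curve at μ = 1, it can be estimated from the intercept even when no observation exists with the Sun directly overhead.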

Validation of the reconstructed UV radiation estimates
To validate the accuracy of the UV radiation reconstruction model, the UV radiation amounts measured in situ at 40 CERN stations from 2005 to 2014 were chosen for comparison with the reconstructed UV radiation data obtained by applying the hybrid model and an all-sky UV estimation model to the nearest CMA station. Statistical estimators, namely the correlation coefficient (R), the mean bias error (MBE), the mean absolute bias error (MABE), and the root-mean-square error (RMSE), were used as benchmarks for the radiation products. In the definitions of these metrics, $E_i$ is the $i$th estimated value, $M_i$ is the corresponding measured value, $E_{ave}$ is the average of the estimated values, $M_{ave}$ is the average of the measured values, and $N$ is the number of observations. Table 3 shows the four statistical parameters calculated using the CERN-observed UV radiation and the calculated UV radiation at the nearest CMA stations. The correlation coefficient is larger than 0.8 for all stations except BNF and MXF. The MBE values are negative at 35 of the stations, which indicates that the reconstructed UV radiation values represent slight underestimates compared with the observations. Only MXF and THL have MABE values larger than 25% and RMSE values larger than 30%. All of the statistical results show that the reconstructed UV radiation values are reliable.
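The sketch below computes the four estimators for paired estimated and measured values. It assumes the standard definitions, with MBE, MABE and RMSE expressed as percentages of the measured mean; this is consistent with the percentage values reported here but is not confirmed by the text, and the function name is hypothetical.

```python
import math


def validation_metrics(estimated, measured):
    """Return R, MBE, MABE and RMSE for paired values.

    MBE, MABE and RMSE are given in percent of the measured mean
    (an assumed normalization). Sentinel values such as 32766 and 9999
    must be removed before calling this function.
    """
    if len(estimated) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    e_ave = sum(estimated) / n
    m_ave = sum(measured) / n
    diffs = [e - m for e, m in zip(estimated, measured)]
    scale = 100.0 / m_ave
    mbe = scale * sum(diffs) / n
    mabe = scale * sum(abs(d) for d in diffs) / n
    rmse = scale * math.sqrt(sum(d * d for d in diffs) / n)
    cov = sum((e - e_ave) * (m - m_ave) for e, m in zip(estimated, measured))
    var_e = sum((e - e_ave) ** 2 for e in estimated)
    var_m = sum((m - m_ave) ** 2 for m in measured)
    r = cov / math.sqrt(var_e * var_m)
    return {"R": r, "MBE": mbe, "MABE": mabe, "RMSE": rmse}


# Illustrative values: every estimate is 10% below the measurement.
print(validation_metrics([0.9, 1.8, 2.7], [1.0, 2.0, 3.0]))
```

A negative MBE, as in this example, corresponds to the underestimation reported at most stations, while MABE and RMSE measure the size of the errors regardless of sign.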

Sample description of the observed UV radiation values
The dataset, which contains UV radiation values measured in situ, is composed of 40 files. The format of the file names is "ABC.xlsx", where AB represents the station code, and C represents the type of ecosystem surrounding the station. All the corresponding information can be found in Table 1. Beijing Station is chosen as an example for describing the organization of the data (Table 4). Blank spaces in the last column indicate that the solar elevation is less than 0°, and there are no effective radiation data; 32766 represents a missing measurement caused by human error or sensor failure; and 9999 indicates that the UV radiation data do not satisfy the quality control principles described in section 2.2. The diurnal variation of UV radiation on 9 July 2015 and the annual average diurnal variation in UV radiation for 2015 at Beijing Station are presented in Fig. 4. The measured value of UV
The four columns represent the year, month, day and reconstructed daily UV radiation, respectively. The annual variation in reconstructed UV radiation in 2014 at station 54511 is shown in Fig. 6. For some stations, 32766 may appear in the fourth column, which represents the absence of reconstructed data. The distribution of annual mean reconstructed UV radiation in 2014 is presented in Fig. 7. This distribution is similar to the distributions of photosynthetically active radiation and surface solar radiation reconstructed by Tang et al. (2013a, 2013b).
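When the files are read, the blank cells and the two sentinel codes need to be separated from valid measurements before any statistics are computed. The sketch below shows one way to do this, assuming each cell arrives as a number, an empty string, or `None`; the function names, the status labels, and the sample file name `XYZ.xlsx` are hypothetical and not part of the datasets.

```python
from pathlib import Path


def split_station_file_name(file_name):
    """Split a name of the form 'ABC.xlsx' into (station code, ecosystem letter)."""
    stem = Path(file_name).stem
    if len(stem) != 3:
        raise ValueError(f"unexpected file name: {file_name}")
    return stem[:2], stem[2]


def decode_uv_cell(cell):
    """Return (value, status) for one UV radiation cell.

    status is 'night' for a blank cell, 'missing' for 32766,
    'rejected' for 9999, and 'valid' otherwise.
    """
    if cell is None or (isinstance(cell, str) and cell.strip() == ""):
        return None, "night"
    value = float(cell)
    if value == 32766:
        return None, "missing"
    if value == 9999:
        return None, "rejected"
    return value, "valid"


# Keep only valid values from a column of raw cells.
raw_column = ["", 32766, 9999, 12.5, 18.0]
valid = [v for v, status in map(decode_uv_cell, raw_column) if status == "valid"]
print(split_station_file_name("XYZ.xlsx"), valid)
```

The same decoding applies to the fourth column of the reconstructed files, where only the 32766 code is expected.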

Data applications
UV radiation may have serious effects on public health, such as skin cancer, accelerated aging of the skin, cataracts and other eye diseases. It can also reduce the ability to resist infectious diseases. Increased UV radiation may reduce growth in several plant species or diminish photosynthetic activity. UV radiation also plays important roles in atmospheric chemistry processes, such as the production of dimethyl sulfide, hydroxyl radicals, tropospheric ozone and ozone precursors. Thus, UV radiation datasets are essential in a wide range of fields, including cancer research, ecology, tropospheric chemistry, agriculture and climate science. The two UV radiation datasets described in this paper have the potential to improve our basic knowledge of the distribution of UV radiation over large temporal and spatial scales, which should contribute substantially to engineering applications in China and provide a scientific basis for sustainable energy and atmospheric environmental protection. Moreover, these datasets contain scientific data that support biological studies, ecological process simulations, and studies of atmosphere-land surface interactions. The datasets have been archived in the Science Data Bank (http://www.dx.doi.org/10.11922/sciencedb.332).