Comparison of ozonesonde measurements in the upper troposphere and lower stratosphere in Northern India with reanalysis and chemistry-climate-model data

The variability and trend of ozone (O3) in the upper troposphere and lower stratosphere (UTLS) over the Asian region need to be accurately quantified. Ozone in the UTLS radiatively heats this region and cools the upper parts of the stratosphere. This affects relative humidity, static stability in the UTLS region, and tropical tropopause temperature. A major challenge for understanding ozone chemistry in the UTLS is the sparseness of observations and, as a consequence, the uncertain representation of precursor gases in model emission inventories. Here, we evaluate ozonesonde measurements during August 2016 at Nainital, in the Himalayas, against ozone from multiple reanalyses and the ECHAM6-HAMMOZ model. We find that, compared to the measurements, both the reanalyses and the ECHAM6-HAMMOZ control simulation overestimate ozone mixing ratios in the troposphere (20 ppb) and in the UTLS (55 ppb). We performed sensitivity simulations using the ECHAM6-HAMMOZ model for a 50% reduction in the emission of (1) NOX and (2) VOCs. The model simulations with NOX reduction agree better with the ozonesonde observations in the lower troposphere and in the UTLS. Thus, neither reanalyses nor ECHAM6-HAMMOZ results can reproduce observed O3 over the South Asian region. For a better representation of O3 in the ECHAM6-HAMMOZ model, NOX emission should be reduced by 50% in the emission inventory. A larger number of observations of ozone and precursor gases over the South Asian region would improve the assessment of ozone chemistry in models.


Mean ozone (O3) profiles.
Figure 1a shows the ozone mixing ratio (O3) measured by all ozonesondes (grey) as a function of pressure (from 800 to 20 hPa), along with the mean ozonesonde profile and the mean profiles of ERA5, MERRA2, CAMS, and ECHAM-CTL. The vertical variation of ozone differs among the data sets.
In the troposphere, between 800 and 580 hPa, the ozonesonde profile agrees with ERA5 and CAMS, while MERRA2 and ECHAM-CTL overestimate the ozonesonde profile by 10 ppb and 18 ppb, respectively (Fig. 1a). Between 580 and 200 hPa, CAMS and MERRA2 agree with the ozonesonde data, while ERA5 underestimates the observed ozone by 14 ppb and ECHAM-CTL overestimates it by 55 ppb. Figure 1a shows that in the UTLS, between 200 and 100 hPa, all data sets overestimate ozone by ~ 20 ppb, and between 60 and 20 hPa by ~ 200-500 ppb. Between 100 and 80 hPa, the ECHAM-CTL simulation shows good agreement with the ozonesondes, while ERA5, CAMS, and MERRA2 show an overestimation by ~ 25 to 150 ppb. ECHAM-CTL shows an underestimation by ~ 75 ppb between 80 and 40 hPa and an overestimation by 300 ppb between 40 and 20 hPa. ERA5, CAMS, and MERRA2 show an overestimation by 300-700 ppb between 80 and 20 hPa. There is a large variation among the daily ozonesonde profiles. These temporal variations may be due to synoptic weather systems: during the first half of August 2016, several tropical storms occurred in the western Pacific, e.g. typhoon Omais 37,38 . There are also differences between the mean profiles of the ozonesondes, the reanalyses, and ECHAM-CTL. These may have several causes. One is spatial resolution: the ozonesonde measurements are taken at a point location, while the reanalysis and model values are taken at the grid point nearest to the station. Another is that the reanalyses and the ECHAM-CTL model use different emission inventories and assimilated data.
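Layer differences of the kind quoted above can be illustrated with a minimal Python sketch that averages the model-minus-ozonesonde difference over all levels inside a pressure layer. It assumes that both profiles have already been interpolated to a common set of pressure levels. The function name layer_mean_bias and all numerical values are hypothetical; they are neither the campaign data nor the procedure actually used for Fig. 1a.

```python
from statistics import mean


def layer_mean_bias(pressure_hpa, obs_ppb, model_ppb, p_bottom, p_top):
    """Mean of (model - obs) over levels with p_top <= p <= p_bottom.

    Assumes obs_ppb and model_ppb are given on the same pressure levels.
    """
    diffs = [
        m - o
        for p, o, m in zip(pressure_hpa, obs_ppb, model_ppb)
        if p_top <= p <= p_bottom
    ]
    if not diffs:
        raise ValueError("no levels fall inside the requested layer")
    return mean(diffs)


# Hypothetical illustrative profiles (NOT the Nainital measurements).
pressure = [800.0, 700.0, 600.0, 500.0, 400.0, 300.0, 200.0]  # hPa
sonde = [40.0, 42.0, 45.0, 50.0, 55.0, 60.0, 80.0]            # ppb
model = [58.0, 60.0, 63.0, 70.0, 75.0, 82.0, 100.0]           # ppb

lower = layer_mean_bias(pressure, sonde, model, p_bottom=800.0, p_top=580.0)
upper = layer_mean_bias(pressure, sonde, model, p_bottom=580.0, p_top=200.0)
print(f"800-580 hPa bias: {lower:.1f} ppb, 580-200 hPa bias: {upper:.1f} ppb")
```

A positive value indicates an overestimation relative to the ozonesondes and a negative value an underestimation.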
The scatter plot of ozone concentration from ozonesondes versus the ERA5, CAMS, MERRA2, and ECHAM-CTL data sets is shown in Fig. 1b. A large number of data points with ozone values between 20 and 100 ppb lie outside the 95% confidence level. From Fig. 1a, one can see that ozone values of 20-100 ppb are found in the troposphere (800-200 hPa). Thus, from Fig. 1a, b, we can infer that large variation among the data sets occurs in the troposphere. Similarly, data points with ozone values of 100-1000 ppb are also outside the 95% confidence level (see Fig. 1b). Figure 1a shows that ozone values of 100-1000 ppb are found between 200 and 60 hPa. However, data points with ozone values of 2000-10,000 ppb (between 60 and 20 hPa, see Fig. 1a) are mostly within the 95% confidence limits. Thus, differences between the data sets are smaller between 60 and 20 hPa and lie within the 95% confidence limits. Ref. 39 also found agreement between ozone measurements from balloon soundings and the Aura Microwave Limb Sounder (MLS) in the stratosphere (60-10 hPa) over the Tibetan Plateau.
Probability Density Function (PDF) analysis. Figure 2 illustrates the probability density functions (PDFs) of ozone mixing ratio from ozonesondes, ERA5, CAMS, MERRA2, and ECHAM-CTL for the campaign period in different slabs of atmospheric pressure levels: one in the troposphere (slab-1: 800-170 hPa) and three in the UTLS (slab-2: 170-100 hPa, slab-3: 100-70 hPa, slab-4: 70-40 hPa). Figure 2a shows the PDF for slab-1. The width of the PDF curve (the difference between its start and end points) is largest in ERA5 (21 to 135 ppb), followed by CAMS (28 to 120 ppb), MERRA2 (30.5 to 100.5 ppb), ozonesondes (19 to 82.5 ppb), and then ECHAM-CTL (45.5 to 95 ppb). Thus, the PDF from ECHAM-CTL is narrower than those of the other data sets. The PDFs of the reanalyses and ECHAM-CTL are bimodal, whereas the PDF of the ozonesonde measurements resembles a normal distribution. Figure 2a shows that the ozonesonde PDF peaks at an ozone concentration of 37 ppb, ERA5 at 39 ppb, CAMS at 44 ppb, MERRA2 at 46.6 ppb, and ECHAM-CTL at 58 ppb. All data sets thus overestimate the PDF peak by 2-21 ppb relative to the ozonesondes in the troposphere.
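The width and peak of a PDF in a pressure slab, as used above, can be sketched in Python as follows. The text does not state how the PDFs were estimated, so a simple fixed-width histogram is assumed here in place of whatever estimator was actually used. The function names, the 5 ppb bin width, and the sample values are hypothetical and only illustrate the idea.

```python
from collections import Counter


def slab_values(pressure_hpa, ozone_ppb, p_bottom, p_top):
    """Ozone values at levels with p_top <= p <= p_bottom."""
    return [o for p, o in zip(pressure_hpa, ozone_ppb) if p_top <= p <= p_bottom]


def histogram_pdf(values, bin_width):
    """Return {bin_centre: density}; the densities integrate to 1."""
    counts = Counter(int(v // bin_width) for v in values)
    n = len(values)
    return {
        (k + 0.5) * bin_width: c / (n * bin_width)
        for k, c in sorted(counts.items())
    }


def pdf_width_and_peak(values, bin_width):
    """Return ((start, end), peak): the value range and the modal bin centre."""
    pdf = histogram_pdf(values, bin_width)
    peak = max(pdf, key=pdf.get)
    return (min(values), max(values)), peak


# Hypothetical ozone samples (ppb) on hypothetical levels (hPa).
pressure = [800.0, 750.0, 700.0, 650.0, 600.0, 500.0, 400.0, 300.0, 200.0, 150.0]
ozone = [30.0, 34.0, 36.0, 37.0, 38.0, 39.0, 41.0, 47.0, 55.0, 120.0]

slab1 = slab_values(pressure, ozone, p_bottom=800.0, p_top=170.0)
(start, end), peak = pdf_width_and_peak(slab1, bin_width=5.0)
print(f"slab-1 PDF spans {start}-{end} ppb and peaks near {peak} ppb")
```

The difference between the peak values of two data sets in the same slab then gives the peak bias discussed in the text.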
In the UTLS, for slab-2 (Fig. 2b), the widths of the ozonesonde PDF (50.5 to 224 ppb) and the ECHAM-CTL PDF (51.5 to 235 ppb) are smaller than those of the other data sets: CAMS (37 to 399 ppb), ERA5 (32.5 to 478 ppb), and MERRA2 (18.3 to 628 ppb). For slab-2, all data sets show an overestimation (by 22-37 ppb) in comparison to the ozonesondes, and the PDFs of all data sets resemble a normal distribution. In slab-3 (Fig. 2c), the width of the MERRA2 PDF is the largest (11.5 to 7450 ppb), followed by ERA5 (14.5 to 7490 ppb) and CAMS (30.8 to 5180 ppb). The widths of the ozonesonde and ECHAM-CTL PDFs are almost the same (55.8 to 1100 ppb). The PDF peak is overestimated in all reanalysis and model data sets in comparison with the ozonesonde PDF (ERA5: 399 ppb, CAMS: 405 ppb, MERRA2: 313 ppb, and ECHAM-CTL: 214 ppb). In slab-4 (Fig. 2d), the PDF peaks of ERA5, CAMS, and MERRA2 show an overestimation by 200-380 ppb compared to the ozonesondes, while ECHAM-CTL shows an underestimation by 300 ppb.
The peak values of the PDF curves for the different pressure slabs are plotted in Fig. 3. Figure 3 also suggests that ECHAM-CTL overestimates ozone in the troposphere, between 800 and 100 hPa. A peak in the ECHAM-CTL PDF lies near that of the ozonesondes, while it is slightly underestimated above 70 hPa. The ERA5, CAMS, and MERRA2 data sets overestimate the ozone mixing ratio at all pressure levels (by 20 to 400 ppb). The mean vertical profile and the PDF analysis suggest that all data sets (ERA5, CAMS, MERRA2, and ECHAM-CTL) overestimate ozone in the troposphere compared to the ozonesondes. In the UTLS, ECHAM-CTL shows an underestimation, while ERA5, CAMS, and MERRA2 show an overestimation compared to the ozonesondes.
The ozone profiles at Pohang (36.02°N, 129.23°E) on the Korean Peninsula, in comparison with reanalysis products (MERRA2 and CAMS), also show a large overestimation (by 150 ppb) in the troposphere and stratosphere 40 .
Sensitivity simulations for NOX and VOCs emissions using the ECHAM6-HAMMOZ model. Figures 1a and 2 show that ozone concentrations are overestimated at pressure levels between 800 and 200 hPa in ECHAM-CTL. In South Asia, the chemical production of tropospheric ozone is mainly from NOX and VOCs. However, other ozone precursors also play a role in ozone production 8,9 . Hence, we reduce NOX and VOC emissions in the model emission inventory to reduce the ozone overestimation in the ECHAM6-HAMMOZ model. We performed two sensitivity experiments: (1) a reduction of NOX emissions by 50% (ECHAM-NOX) and (2) a reduction of all VOC emissions by 50% (ECHAM-VOCs). We then compare the monthly mean vertical profiles of ozone from ECHAM-NOX and ECHAM-VOCs with the ozonesondes.
In the troposphere, between 800 and 580 hPa, the ozonesondes show good agreement with ECHAM-NOX (Fig. 4a). Between 580 and 200 hPa, ECHAM-NOX shows an underestimation by 14-15 ppb and ECHAM-VOCs an overestimation by 14-18 ppb. The underestimation in ECHAM-NOX and the overestimation in ECHAM-VOCs between 580 and 200 hPa may be due to the influence of meteorology/winds in … Between 200 and 120 hPa, ECHAM-NOX shows an underestimation by ~ 8 to 10 ppb. The ECHAM-NOX profile shows good agreement with the ozonesondes between 120 and 40 hPa. Above 40 hPa, ECHAM-NOX and ECHAM-VOCs agree with each other and with the ozonesondes. Thus, the ECHAM-NOX profile agrees with the ozonesondes between 800 and 580 hPa and between 120 and 20 hPa.
The scatter plot of ozone concentration from ozonesondes versus the two sensitivity simulations is shown in Fig. 4b. A large number of data points with ozone values between 20 and 200 ppb lie outside the 95% confidence level. From Fig. 4a, one can see that ozone values of 20-200 ppb are present in the troposphere (800-200 hPa). Thus, from Fig. 4a, b, we can infer that large variation among the data sets occurs in the troposphere. Similarly, data points with ozone values of 200-1000 ppb are also outside the 95% confidence level (see Fig. 4b). Figure 4a shows that ozone values of 200-1000 ppb are present between 200 and 120 hPa. However, data points with ozone values of 2000-10,000 ppb (between 120 and 20 hPa, see Fig. 4a) are mostly within the 95% confidence limits. Hence, differences between the data sets are smaller between 120 and 20 hPa and lie within the 95% confidence limits.
Figure 5 illustrates the PDFs of O3 from ozonesondes, ECHAM-CTL, ECHAM-NOX, and ECHAM-VOCs for the campaign period in different slabs of atmospheric pressure levels: two in the troposphere (slab-1: 800-580 hPa and slab-2: 580-170 hPa) and three in the UTLS (slab-3: 170-100 hPa, slab-4: 100-70 hPa, slab-5: 70-40 hPa). Figure 5a shows the PDF for slab-1. The PDFs of the ozonesondes and ECHAM-NOX show a similar variation and width (36-38 ppb). In contrast, the PDFs of ECHAM-CTL and ECHAM-VOCs are narrow and contain large values (55-60 ppb). Figure 5a also shows that the ozonesonde PDF peaks at 38 ppb, which agrees with ECHAM-NOX (38 ppb). The ECHAM-CTL PDF peaks at 60.5 ppb and the ECHAM-VOCs PDF at 55 ppb. ECHAM-CTL and ECHAM-VOCs thus show an overestimation by 17-22 ppb relative to the ozonesondes in the troposphere. In contrast, the PDF of ECHAM-NOX for 800-580 hPa shows good agreement with the ozonesondes (Fig. 5a, f), indicating that the reduction in NOX emissions reduces ozone, which improves the agreement with observations. For slab-2, the widths of the ozonesonde and ECHAM-NOX PDFs are similar and the smallest (21-51 ppb), followed by ECHAM-VOCs (55-89 ppb) and then ECHAM-CTL (50.5-93 ppb). The ozonesonde PDF peaks at 46 ppb, which agrees with ECHAM-NOX (46 ppb), while ECHAM-CTL and ECHAM-VOCs show an overestimation by 26 ppb relative to the ozonesondes. In slab-3 (Fig. 5c), the widths of the ECHAM-CTL (51.5 to 235 ppb), ECHAM-NOX (37 to 235 ppb), and ECHAM-VOCs (47.5 to 270 ppb) PDFs are larger than that of the ozonesondes (50.5 to 224 ppb). In slab-3, the PDF peak of ECHAM-NOX shows a small underestimation by 9 ppb, and the peaks of ECHAM-CTL and ECHAM-VOCs an overestimation by 18-20 ppb, relative to the ozonesondes. In slab-4 (Fig. 5d), the width of the ECHAM-VOCs PDF is the largest (59.5 to 1820 ppb), followed by ECHAM-NOX (41.5 to 1540 ppb), ECHAM-CTL (55 to 1100 ppb), and the ozonesondes (22.4 to 1000 ppb).
The peak values of the PDF curves for the different pressure slabs are also shown in Fig. 5f. An improvement in simulated ozone is clearly seen for the ECHAM-NOX sensitivity experiment. It should be noted that the ECHAM-NOX sensitivity experiment shows agreement at 800-170 hPa and in the UTLS region … model data (Fig. 6a). On this day, the ERA5 ozone profile shows good agreement with the ozonesondes in the lower troposphere (800-600 hPa). ECHAM-CTL and CAMS also agree with the ozonesondes at 600-200 hPa. The ECHAM-CTL profile shows an underestimation compared to the ozonesondes between 100 and 20 hPa. The ERA5 and CAMS profiles overlap each other and show good agreement with the ozonesondes between 100 and 20 hPa. The MERRA2 ozone profile shows an overestimation at all pressure levels. In Fig. 6b, we compare the measured ozonesonde profile with the ECHAM-NOX and ECHAM-VOCs simulations. The ECHAM-NOX profile agrees with the ozonesondes in the lower troposphere, between 800 and 580 hPa. A similar agreement is also seen in the mean profile of the ECHAM-NOX simulations (Fig. 4a). Interestingly, between 100 and 20 hPa, ECHAM-NOX and ECHAM-VOCs agree well with each other and with the ozonesonde profile.
Further, we investigate the reason for the low ozone concentration between 140 and 100 hPa (363 to 380 K) on 15th August 2016 using the trajectory module of the three-dimensional Lagrangian chemistry transport model CLaMS (section "Trajectory calculations using the Chemical Lagrangian Model of the Stratosphere (CLaMS)").

Discussions and summary
The comparison of ozonesonde profiles with multiple reanalyses (ERA5, CAMS, and MERRA2) and high-resolution chemistry-climate simulations shows that in the troposphere, between 800 and 580 hPa, the ozonesonde profile agrees with ERA5 and CAMS, while MERRA2 and ECHAM-CTL overestimate the observed ozonesonde profile by 10 ppb and 18 ppb, respectively. Between 580 and 200 hPa, the CAMS and MERRA2 profiles agree with the ozonesonde measurements, while ERA5 underestimates the measured ozone by 14 ppb and ECHAM-CTL overestimates it by 55 ppb.
A probability density function (PDF) analysis applied to these data sets shows ozone biases in ERA5 (troposphere: 3-15 ppb; UTLS: 25-400 ppb), CAMS (troposphere: 8-16 ppb; UTLS: 20-200 ppb), MERRA2 (troposphere: 7-11 ppb; UTLS: 37-350 ppb), and ECHAM-CTL (troposphere: 15-21 ppb; UTLS: 80-300 ppb). Thus, our study shows that neither the reanalyses (ERA5, CAMS, MERRA2) nor the model simulations (ECHAM-CTL) can reproduce measured ozone profiles over the South Asian region during the monsoon season.
Since ozone mixing ratios are overestimated in the ECHAM-CTL simulations, we reduce the emissions of (1) nitrogen oxides (NOX) (ECHAM-NOX) and (2) all volatile organic compounds (VOCs) (ECHAM-VOCs) by 50% in the model's emission inventory. Of these reduced-emission simulations, ECHAM-NOX shows improved agreement with the ozonesonde observations in the lower troposphere (between 800 and 580 hPa) and in the UTLS (between 100 and 40 hPa). The ECHAM-NOX and ECHAM-VOCs simulations only slightly underestimate ozone (by 2-7 ppb) between 170 and 100 hPa. The ECHAM-NOX simulation also shows good agreement on 15 August 2016, a special case when low ozone and no Asian Tropopause Aerosol Layer (ATAL) were observed over Nainital. Our CLaMS trajectory analysis shows that on this day the clean air mass (containing low ozone and …
Our study demonstrates that anthropogenic NOX emissions are overestimated in the AEROCOM-ACCMIP-II emission inventory used in the ECHAM-HAMMOZ model simulations over South Asia; they should be reduced by 50% for a better representation of tropospheric ozone in chemistry-climate models. Appropriate simulation of ozone in chemistry-climate models will be helpful for the correct estimation of the oxidising capacity of the troposphere, ozone radiative forcing, ozone heating rates, and the implications for transport processes. Finally, our study suggests that a larger number of height-resolved trace gas observations over the South Asian region (including the variability caused by weather systems such as tropical cyclones) is required to improve the representation of ozone chemistry in models, in particular in the Asian monsoon region.
The measurements were conducted using an instrument payload comprising (1) an RS41-SGP radiosonde from Vaisala, Finland, for measurements of pressure, temperature, and relative humidity, and (2) an ozonesonde based on an Electrochemical Concentration Cell (ECC) from EN-SCI, USA, for the ozone mixing ratio 46,47 .
These sensors were used with the RS41-SGP XDATA interface and the Vaisala DigiCORA MW41 ground receiving sounding system 48 at a frequency of 1 Hz. Details of the measurement technique, resolution, and accuracy can be found in the white paper by Vaisala (WEA-MET-RS-Comparison-White-Paper-B211317En-B) and in Environmental Science (http://www.en-sci.com/). Ozonesondes measure ozone from the ground to 30 km with a high vertical resolution of ∼100 m. The 1σ uncertainty in the total ozone normalization factor in the tropics is 5.2% 49 . Details of the measurements at the campaign site are given in refs. 45,50 . In this study, we analysed 25 ozonesondes reaching the 20 hPa level 50 (see also Table S1).
Reanalysis data. We compare the ozonesonde measurements with three reanalysis data sets, namely ERA5, CAMS, and MERRA2. ERA5 is the fifth-generation, and latest, global reanalysis data set produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The ERA5 reanalysis is based on the newer IFS cycle 41R2 and provides several improvements compared to ERA-Interim, including higher spatial and temporal resolution 35 . The ERA5 ozone field results from combining the model with assimilated satellite observations. All Level-2 ozone products assimilated in ERA5 except METOP-B GOME-2, METEOR-3 and ADEOS-1 TOMS. From December 2014 onward, the assimilation was switched to the near-real-time product. Additional information on ozone in ERA5 is provided by ozone-sensitive channels of the nadir-viewing infrared sounders (HIRS, AIRS, IASI and CrIS) 35,51 . The data set has a temporal resolution of one hour and a spatial resolution of 0.25° × 0.25°, with a vertical range from 1000 to 1 hPa (137 vertical levels).
The CAMS global reanalysis data are produced by the Copernicus Atmosphere Monitoring Service (CAMS) 4 . The CAMS ozone field is the result of the assimilation of satellite observations (from GOSAT, METOP-A/B, EOS-Aqua, EOS-Terra, ENVISAT, EOS-Aura, and NOAA-14, -16, -17, -18, and -19) 33 , and it integrates data from SCIAMACHY, OMI, and GOME-2 as well as ozone profiles from MIPAS and MLS after 2005. The data set has a spatial resolution of 0.75° × 0.75°, with a vertical range from 1000 to 0.1 hPa on 60 hybrid sigma-pressure levels. Output is available every 3 h.
The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA2), is NASA's latest reanalysis, spanning the satellite observing era from 1980 to the present 52 . MERRA2 assimilates modern hyperspectral radiance and microwave observations, along with GPS-Radio Occultation data sets. It also uses NASA's ozone profile observations that began in late 2004. Additional advances in both the GEOS model and the GSI assimilation system are included in MERRA2 34 . The MERRA2 system produces 3-hourly analyses on 72 hybrid sigma-pressure layers between the surface and 0.01 hPa, with a horizontal resolution of 0.625° × 0.5°; the data are also provided on 42 pressure levels (1000 hPa to 1 hPa).
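Because the reanalysis values are taken at the grid point nearest to the station, the selection step can be sketched in Python for regular latitude-longitude grids with the three resolutions given above. This is only an illustration under stated assumptions: the station coordinates (about 29.35°N, 79.46°E) are an approximate value not given in the text, all grids are assumed to start at 90°S and 0°E (the real products may use different origins or orderings), and the function names are hypothetical.

```python
def nearest_index(coord, origin, step, n_points):
    """Index of the regular-grid point closest to coord, clamped to the grid."""
    idx = round((coord - origin) / step)
    return max(0, min(n_points - 1, idx))


def nearest_grid_point(lat, lon, dlat, dlon, lat0=-90.0, lon0=0.0):
    """Return (grid_lat, grid_lon) of the nearest point on a regular global grid.

    Assumes latitudes run from lat0 northward in steps of dlat and longitudes
    run from lon0 eastward in steps of dlon, wrapping around at 360 degrees.
    """
    n_lat = int(round(180.0 / dlat)) + 1
    n_lon = int(round(360.0 / dlon))
    i = nearest_index(lat, lat0, dlat, n_lat)
    # Longitude is periodic, so wrap the index instead of clamping it.
    j = round(((lon - lon0) % 360.0) / dlon) % n_lon
    return lat0 + i * dlat, lon0 + j * dlon


# Approximate station position (an assumption, not taken from the text).
STATION_LAT, STATION_LON = 29.35, 79.46

# (dlat, dlon) in degrees; 0.625 x 0.5 is read as longitude x latitude.
grids = {"ERA5": (0.25, 0.25), "CAMS": (0.75, 0.75), "MERRA2": (0.5, 0.625)}

for name, (dlat, dlon) in grids.items():
    glat, glon = nearest_grid_point(STATION_LAT, STATION_LON, dlat, dlon)
    print(f"{name}: nearest grid point at {glat}N, {glon}E")
```

On coarser grids the nearest grid point can lie up to about half a grid cell from the station, which is one reason why point measurements and gridded values may differ.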
Chemistry-climate model simulations. We used ozone profiles from simulations of the state-of-the-art ECHAM5-HAMMOZ aerosol-chemistry-climate model. It comprises the atmospheric general circulation model ECHAM5 53 , the tropospheric chemistry module MOZ 54 , and the aerosol module Hamburg Aerosol Model (HAM) 55 . The HAM module takes into account the primary aerosol compounds, namely sulphate (SU), black carbon (BC), organic carbon (OC), sea salt (SS), and mineral dust (DU). The chemistry of ozone, NOx, VOCs, and other gas-phase species follows the MOZART-2 chemical scheme, which is based on Ox-NOx-hydrocarbon chemistry with 63 tracers and 168 reactions 8,9,54 . The anthropogenic and fire emissions are based on the AEROCOM-ACCMIP-II emission inventory. Other details of the model and emissions are reported in refs. 5,28,56,57 .
The model simulations are performed at T256 spectral resolution, corresponding to 0.5° × 0.5° in the horizontal, while the vertical resolution is described by 31 hybrid σ-p levels from the surface up to 10 hPa (~ 50 km). The simulations were carried out with a time step of 20 min. Monthly varying Atmospheric Model Intercomparison Project (AMIP) sea surface temperature (SST) and sea ice cover (SIC) 58 were used as lower boundary conditions. We performed three sets of six-member ensemble simulations for the period 1 January 2015 to 31 August 2016. The analysis is performed for August 2016, with the preceding period treated as spin-up. Experiment (1), the control simulation (referred to as ECHAM-CTL), is compared with the ozonesondes and the reanalysis data. Because ECHAM-CTL overestimates ozone relative to the ozonesonde measurements, we performed two additional simulations: (2) a 50% reduction in anthropogenic emissions of nitrogen oxides (NOX), referred to as ECHAM-NOX, and (3) a 50% reduction in anthropogenic emissions of all species of volatile organic compounds (VOCs), referred to as ECHAM-VOCs. The spread among the six members in the three sets of experiments is shown in Fig. S1. The advantage of a high-resolution chemistry-climate model over regional models is its better performance for large-scale monsoon dynamical processes and ASAM 1,9,24,28,57,59 .
Trajectory calculations using the Chemical Lagrangian Model of the Stratosphere (CLaMS). Global Chemical Lagrangian Model of the Stratosphere (CLaMS) simulations 60 are generally driven by meteorological reanalyses. Here we employ the trajectory module of CLaMS; trajectories are calculated backward in time and are truncated when they reach the model boundary 37,38 . At the beginning of the (backward) trajectory calculation, each air parcel is placed at the location of the measurement in August 2016.
We use horizontal winds from the ERA5 reanalysis 35 provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). For the vertical velocities, the diabatic approach was applied, using the diabatic heating rate as the vertical velocity, including latent heat release 37 .
The aim is to analyse the transport pathways of air masses from the lower troposphere (i.e. from the model boundary) into the anticyclone region and the transport of stratospheric air masses around the Asian monsoon anticyclone to the location of the balloon measurements over Nainital. The (backward) trajectories allow the origin of air masses and their transport pathways to be identified 37,38,41 .

Data availability
The data used can be obtained from the corresponding author on request.