Satellite validation strategy assessments based on the AROMAT campaigns

The Airborne ROmanian Measurements of Aerosols and Trace gases (AROMAT) campaigns took place in Romania in September 2014 and August 2015. They focused on two sites: the Bucharest urban area and large power plants in the Jiu Valley. The main objectives of the campaigns were to test recently developed airborne observation systems dedicated to air quality studies and to verify their applicability for the validation of spaceborne atmospheric missions such as the TROPOspheric Monitoring Instrument (TROPOMI)/Sentinel-5 Precursor (S5P). We present the AROMAT campaigns from the perspective of findings related to the validation of tropospheric NO2, SO2, and H2CO. We also quantify the emissions of NOx and SO2 at both measurement sites. We show that tropospheric NO2 vertical column density (VCD) measurements using airborne mapping instruments are in principle well suited for satellite validation. The signal to noise ratio of the airborne NO2 measurements is one order of magnitude higher than its spaceborne counterpart when the airborne measurements are averaged at the TROPOMI pixel scale. However, we show that the temporal variation of the NO2 VCDs during a flight might be a significant source of comparison error. Considering the random error of the TROPOMI tropospheric NO2 VCD (σ), the dynamic range of the NO2 VCDs field extends from detection limit up to 37 σ (2.6 x 10^16 molec cm^-2) or 29 σ (2 x 10^16 molec cm^-2) for Bucharest and the Jiu Valley, respectively. For both areas, we simulate validation exercises applied to the TROPOMI tropospheric NO2 product.

• Lines 293-295 and Lines 323-327: The aircraft and mobile measurements are said to be simultaneous, but I don't see data or text convincing of that as the graph is just by longitude. Plume structures can change very quickly in time, so a time difference that may seem small could still have an impact (even if it's as little as a half hour or so); this could explain most of the mismatch as well. Unless they are coincident down to just a few minutes, I see reason to question the temporal effects.
The Mobile-DOAS was sampling under the plume while the aircraft was mapping the area. The aircraft measurements correspond to portions of 3 flight lines between 09:54 UTC and 10:17 UTC. For the figure, we extracted the Mobile-DOAS and airborne measurements along the road in this time window, so the maximum time difference is 23 min. Indeed, the plume structure may vary quickly, but we also obtained this kind of comparison during AROMAT-1, when we had several trips with the car under the plume, close in time to the aircraft overpass. Moreover, there is a theoretical explanation for such discrepancies invoking 3D effects of the radiative transfer, as we already discuss in the text. We added the time information.
The airborne data correspond to three portions of flight lines recorded between 09:54 and 10:17 UTC. The BIRA Mobile-DOAS instrument was sampling the plume during this time so the maximum time difference is 23 minutes.
And in the caption of figure S5.
Both airborne and ground-based data were recorded between 09:54 and 10:17 UTC.
• Overall statement: All flights and ground based datasets shown in the manuscript should have noted time windows for data collection.
We agree with the comment. We added the time information where we found it was missing, as in the H2CO figure description.
And in the captions of Figures 6, 7, 8, and 13. Presenting Figs. 7 and 13 in the manuscript, we added the time info as well.
• Line 183: Where is RADO? Should this be in the map in Figure 2? Or does it match up to one of the existing labels in Figure 2?
RADO is the INOE atmospheric observatory in Magurele, which is already in Fig. 2. We clarified the location in the text.
• Line 218: Specify which nadir-looking spectrometer was on the UGAL ultralight.
We changed the wording to 'the ULM-DOAS instrument', which is presented in the instrument list of the Supplement.
• Line 216: Splitted should be split
Corrected.
• Line 294: Which mobile DOAS instrument is this? Additionally, I get the feeling when I read this paper through a few times that there may be other instances where mobile DOAS is stated but not identifying which one. Check these types of details.
We have specified the instrument (BIRA). We have checked the other occurrences of Mobile-DOAS and completed when needed (section 4.4.1 and caption of Table 7).
• Line 565: I missed where it was quantified that the NO2 ground and airborne measurements agree within 7%. Can you clarify? Is it referring to the slope between MPIC/AirMAP from the supplement? If so, it is only between the two instruments. The sentence in the conclusion is broad, making it seem like all NO2 measurements are within that agreement.
Indeed, it is the aforementioned slope. We thought the scope was clear since we used 'These', which refers to the previous sentence giving the scope. But we agree it could be misleading, so we rephrased: In the AROMAT conditions, airborne measurements were consistent with ground-based observations within 7%...
• Line 579-581: The 'structure' was not shown for HCHO to be able to draw whether it is visible from daily satellite overpasses. Please rephrase.
We rephrased to:
Due to the lower signal to noise ratio of the H_2CO observations, it is difficult to use such daily measurements for satellite validation.
• Figure 7 and Table 3: IUP-UB nadir only compact spectrometer is not listed in Table 3 for NO2, but the caption says it's from the IUP-UB for both NO2 and HCHO. I also saw some inconsistencies in IUP-UB vs IUP-Bremen in terms of naming convention, too.
We have changed IUP-UB to IUP-Bremen for consistency, and we added this instrument in Table 3.
• Figure 9: Do the last two digits in the legend correspond to the hour? Is it in UTC? Please be clear in the caption. Another suggestion that will make this figure infinitely easier to investigate would be to make the y-axis only expand up to 2000 m or so. There is not really anything changing above 1.5 km.
Indeed. We clarified the caption. However, we prefer to keep the y-axis as it is, as it clearly points out that the upper troposphere is completely free of NO2.
• Figure 11: the color bar in the top left figure is different from the rest of the figure. Additionally, you mention Fig. S9 in the caption but looking at Fig. S9, I do not see 3 mobile DOAS sites.
We thank the referee for pointing out an error in this figure. We have corrected it.
• Figure S3 and S4: The colors for the lines in the caption don't appear to be right. The fit lines I see appear to be yellow (not blue) and are very thin and hard to see.
We have updated the figure to make the straight lines thicker and corrected the color in the caption.
• Figure S11: add in the caption where these in situ measurements are collected in that domain. Are they at the Turceni powerplant? And are they at the ground?
Following the remark of the editor, we removed this figure and others to trim the supplement, as we did not use them in the analysis.
• Section S2.6 doesn't have locations listed for these in situ measurements. And that section doesn't have SO2 or NO2, though Figure S11 shows NO2 and SO2 in situ measurements.
We removed Sect. S2.6 (same reason as previous reply).

Relevant changes
- New title
- Removed 4e15 for the temporal variation from the abstract and stress that it corresponds to a single flight in the conclusion
- Removed the material in the supplement which was not used in the analysis

Monitoring Instrument (TROPOMI)/Sentinel-5 Precursor (S5P). We present the AROMAT campaigns from the perspective of findings related to the validation of tropospheric NO2, SO2, and H2CO. We also quantify the emissions of NOx and SO2 at both measurement sites.
We show that tropospheric NO2 vertical column density (VCD) measurements using airborne mapping instruments are in principle well suited for satellite validation. The signal to noise ratio of the airborne NO2 measurements is one order of magnitude higher than its spaceborne counterpart when the airborne measurements are averaged at the TROPOMI pixel scale.
However, we show that the temporal variation of the NO2 VCDs during a flight might be a significant source of comparison error. Considering the random error of the TROPOMI tropospheric NO2 VCD (σ), the dynamic range of the NO2 VCDs field extends from detection limit up to 37 σ (2.6 x 10^16 molec cm^-2) or 29 σ (2 x 10^16 molec cm^-2) for Bucharest and the Jiu Valley, respectively. For both areas, we simulate validation exercises applied to the TROPOMI tropospheric NO2 product. These simulations indicate that a comparison error budget closely matching the TROPOMI optimal target accuracy of 25% can be obtained by adding NO2 and aerosol profile information to the airborne mapping observations, which constrains the investigated accuracy to within 28%. In addition to NO2, our study also addresses the measurements of SO2 emissions from power plants in the Jiu Valley, as well as an urban hotspot of H2CO in the center of Bucharest. For these two species, we conclude that the best validation strategy would consist in deploying ground-based measurement systems at well-identified locations.

Introduction
Since the launch of the Global Ozone Monitoring Experiment (GOME, Burrows et al. (1999)) in 1995, spaceborne observations of reactive gases in the UV-visible range have tremendously improved our understanding of tropospheric chemistry. GOME mapped the large urban sources of NO2 in North America and Europe, the SO2 emissions from volcanoes and coal-fired power plants (Eisinger and Burrows, 1998), and the global distribution of H2CO with its maxima above East Asia and the tropical forests (De Smedt et al., 2008). Subsequent air-quality satellite missions expanded on the observation capabilities of GOME.
Table 1 lists the past, present, and near-future nadir-looking satellite instruments dedicated to ozone and air quality monitoring with their sampling characteristics in space and time. The pixel size at nadir has shrunk from 320 x 40 km^2 (GOME) to 3.5 x 5.5 km^2 (TROPOMI, Veefkind et al. (2012); the original TROPOMI resolution of 7 x 5.5 km^2 was increased on 6 August 2019, MPC (2019)). This high horizontal resolution makes it possible, for instance, to disentangle contradictory trends in ship and continental emissions of NO2 in Europe (Boersma et al., 2015) or to distinguish the different NO2 sources in oil sand mines in Canada (Griffin et al., 2019). The satellite-derived air quality products are now reliable enough to improve the bottom-up emission inventories (e.g. Kim et al. (2009), Fioletov et al. (2017), Bauwens et al. (2016)) and to be used in operational services, for instance to assist air traffic control with the near-real time detection of volcanic eruptions (Brenot et al., 2014). The bottom lines of Table 1 present the near-future perspective in spaceborne observation of the troposphere: a constellation of geostationary satellites will provide hourly observations of the troposphere above east Asia (GEMS, Kim (2012)), North America (TEMPO, Chance et al. (2013)), and Europe (Sentinel-4, Ingmann et al. (2012)). These new developments will open up new perspectives for atmospheric research and air quality policies (Judd et al., 2018).
Validation is a key aspect of any spaceborne Earth observation mission. This aspect becomes even more important as the science matures and leads to more operational and quantitative applications. Validation involves a statistical analysis of the differences between measurements to be validated and reference measurements, which are independent data with known uncertainties (von Clarmann, 2006; Richter et al., 2014). The aim of validation is to verify that the satellite data products meet their requirements in terms of accuracy and precision. Richter et al. (2014) have discussed the challenges associated with the validation of tropospheric reactive gases. These challenges arise from the large variability in space and time of short-lived reactive gases, the dependency of the satellite products on different geophysical parameters (surface albedo, profile of trace gases and aerosols), the differences in vertical sensitivity between satellite and reference (ground-based or airborne) measurements, and the small signals. An ideal validation study would involve a reference dataset of VCDs whose well-characterized uncertainties would be small compared to those required for the investigated products. This reference dataset would cover a large number of satellite pixels with adequate spatial and temporal representativeness at different seasons, places, and pollution levels. Besides the VCDs, the ideal validation exercise would also quantify the geophysical parameters that impact the retrieval of the investigated satellite products. In the real world however, Richter et al. (2014) point out that "the typical validation measurement falls short in one or even many of these aspects".
The first validations of the tropospheric NO2 and H2CO VCD products of GOME involved in-situ samplings from aircraft (Heland et al., 2002; Martin et al., 2004). Such measurements may cover good fractions of satellite pixels but they miss the lower part of the boundary layer, where the trace gas concentrations often peak. Schaub et al. (2006) and Boersma et al. (2011) summarize other early validation studies for the tropospheric NO2 VCDs retrieved from GOME, SCIAMACHY, and OMI.
Several of these studies make use of the NO2 surface concentration datasets from air quality monitoring networks. Compared to campaign-based data acquisition, operational in-situ networks provide long-term measurements, but their comparison with satellite products relies upon assumptions on the NO2 profile. Other validation studies use remote-sensing from the ground and aircraft, in particular based on the Differential Optical Absorption Spectroscopy (DOAS) technique (Platt and Stutz, 2008), which is also the basis for the retrieval algorithms of the satellite-derived products. In comparison with in-situ measurements, DOAS has the benefit of being directly sensitive to the column density of a trace gas, i.e. the same geophysical quantity as the one retrieved from space. Heue et al. (2005) conducted the first comparison between a satellite-derived product (SCIAMACHY tropospheric NO2) and airborne DOAS data. Many validation studies also use ground-based DOAS measurements, in particular since the development of the Multi-AXis DOAS (MAX-DOAS) technique (Hönninger et al., 2004). MAX-DOAS measurements are valuable for validation due to their ability to measure integrated columns at spatial scales comparable to the satellite ground pixel size. Moreover, they broaden the scope of validation activities since they also provide limited profile information on both trace gases and aerosols (Irie et al., 2008; Brinksma et al., 2008; Ma et al., 2013; Kanaya et al., 2014; Wang et al., 2017; Drosoglou et al., 2018). The limitations of using the MAX-DOAS technique for validation arise from its still imperfect spatial representativeness compared to typical satellite footprints and to some extent from its limited sensitivity in the free troposphere. Spatial representativeness has often been invoked to explain the apparent low bias of the OMI tropospheric NO2 VCDs in urban conditions (Boersma et al., 2018).
These campaign activities quantified key pollutants (NO2, SO2, O3, H2CO, and aerosols) and assessed practical observation capabilities of future satellite instruments while preparing for their validation. They combined ground-based and airborne measurements. DISCOVER-AQ involved the deployment of the Geostationary Trace gas and Aerosol Sensor Optimization instrument (GEOTASO, Leitch et al., 2014; Nowlan et al., 2016) and of the Geostationary Coastal and Air Pollution Events (GEO-CAPE) Airborne Simulator (GCAS, Kowalewski and Janz, 2014; Nowlan et al., 2018). In Europe, the two AROMAT campaigns, which took place in Romania in September 2014 and August 2015, demonstrated a suite of new instruments such as the Airborne imaging DOAS instrument for Measurements of Atmospheric Pollution (AirMAP, Schönhardt et al., 2015; Meier et al., 2017), the NO2 sonde (Sluis et al., 2010), and the Small Whiskbroom Imager for atmospheric compositioN monitorinG (SWING, Merlaud et al., 2018). Different airborne imagers were intercompared and further characterized during the AROMAPEX campaign in April 2016 (Tack et al., 2019).
Two aforementioned publications focused on the AirMAP and SWING operations during the 2014 AROMAT campaign (Meier et al., 2017; Merlaud et al., 2018). In this work, we present the overall instrumental deployment during the two campaigns and analyze the relevance of these measurements for the validation of several air quality satellite products: tropospheric NO2, SO2 and H2CO VCDs. The datasets collected during AROMAT fulfill several requirements of the ideal validation study, as described above. We further investigate the strengths and limitations of the acquired datasets.
The paper is structured as follows: Section 2 describes the two target areas and the deployment strategy. Section 3 characterizes the investigated trace gas fields in the sampled areas. Section 4 presents a critical analysis of the strengths and limitations of the campaign results while elaborating on recommendations for future validation campaigns in Romania. Finally, we use the AROMAT measurements to derive NOx and SO2 fluxes from the two sites. The Supplement presents technical details on the instruments operated during the campaigns as well as additional information and measurements.

Target areas and deployment strategy
This section presents the two target areas of the AROMAT campaigns, Bucharest and the Jiu Valley. It also lists available studies on air quality at these two sites as well as logistical aspects of relevance.
Figure 1 presents a map of the tropospheric NO2 vertical column densities (VCDs) above Romania, derived from OMI measurements (Levelt et al., 2006) and averaged between 2012 and 2016. The map also indicates the position of the 8 largest cities of the country. Compared to highly polluted areas in western Europe such as northern Belgium or the Netherlands, Romania appears relatively clean at the spatial resolution of the satellite data. There are however two major NO2 sources clearly visible from space, which appear to be of similar magnitude with NO2 columns around 2.5 x 10^15 molec cm^-2: the Bucharest area and the Jiu Valley, northwest of Craiova. For the latter, the NO2 enhancement is due to a series of large coal-fired thermal power plants.

Bucharest
Bucharest (44.4 [...] The NO2 VCDs seen from space above Bucharest appear lower than over western European sites at the resolution of OMI (see Fig. S1 in the Supplement). However, this is partly due to the dilution effect for this relatively small and isolated source. Local studies based on the 8 air quality stations inside the city point out that, regarding local PM and NOx levels, Bucharest is amongst the most polluted cities in Europe (Alpopi and Colesca, 2010; Iorga et al., 2015). The city center is the most heavily polluted, with concentrations of pollutants well above the European thresholds. For instance, the annual mean concentration of NO2 at the traffic stations was about 57 µg m^-3 in 2017 (EEA, 2019), whereas the EU limit is 40 µg m^-3.

The Jiu Valley between Targu Jiu and Craiova
The second NO2 plume in Fig. 1 lies around 250 km west of Bucharest. It corresponds to a series of four thermal power plants located along the Jiu river between the cities of Targu Jiu (82,000 inhabitants, 45.03° N, 23.27° E) and Craiova (269,000 inhabitants, 44.31° N, 23.8° E). These plants were built in this area due to the presence of lignite (brown coal), which is burned to produce electricity.
The altitude of the valley ranges from 268 m a.s.l. in Targu Jiu to 90 m in Craiova. The valley is surrounded by moderately elevated hills (400 m a.s.l.). Due to the orography, the prevailing wind direction is from southwest to southeast.
Besides NO2, the SO2 emissions from these plants are also visible from space, as first reported by Eisinger and Burrows (1998) using GOME data. Since 2011, the OMI-derived trends above the area indicate that the emissions of SO2 have been decreasing, while those of NO2 are stable (Krotkov et al., 2016). This is related to the installation of flue gas desulfurization (FGD) systems, which was part of environmental regulations imposed on Romania following its entry in the European Union in 2007.
Figure 3 presents a map of the Jiu Valley area with the four power plants. The map also shows the tracks of the two airborne platforms (the FUB Cessna and an ultralight operated by UGAL) operated in this area during AROMAT-2. Table S1 in the Supplement presents the geographical positions, nominal capacities, and smokestack heights of the four power plants. From north to south, the plants are named according to their locations: Rovinari, Turceni, Isalnita and Craiova II.
During the AROMAT campaigns, we focused in particular on the emissions of the Turceni power plant (44.67° N, 23.41° E).
With a nominal capacity of 1650 MW, it is the largest electricity producer in Romania. The Turceni power plant is located in a rural area, 2 km ESE of the village of Turceni. The plant emits aerosols, NOx, and SO2 from the 280 m high smokestacks.
Scientific studies on air quality inside the Jiu Valley are sparse. Previous measurements performed by INOE during a campaign in Rovinari in 2010 indicated elevated volume mixing ratios of NO2 (up to 30 ppb) and of SO2 (up to 213 ppb) (Nisulescu et al., 2011; Marmureanu et al., 2013). The maximum ground concentrations occurred in the morning, before the planetary boundary layer development. Mobile-DOAS observations performed in 2013 revealed columns of NO2 up to 1 x 10^17 molec cm^-2 (Constantin et al., 2015).

Groups, instruments, and platforms
The AROMAT consortium consisted of research teams from Belgium (BIRA-IASB), Germany (IUP-Bremen, FUB, MPIC), The Netherlands (KNMI), Romania (University "Dunarea de Jos" of Galati, hereafter UGAL, National Institute of R&D for Optoelectronics, hereafter INOE, and National Institute for Aerospace Research "Elie Carafoli", hereafter INCAS), and Norway (NILU).The AROMAT consortium had a common focus on measuring the tropospheric composition using various techniques.

Geophysical results
This section presents selected findings related to tropospheric NO2, SO2 and H2CO in the two target areas. The Supplement gives details about the instruments involved in these observations and presents additional measurements in Bucharest and the Jiu Valley.

This suggests that along this portion of the flight, which was inside the plume but outside the city, the NO2 VMR measured at 300 m a.s.l. may be used as a proxy for the NO2 VCD. Indeed, the BLH was about 1500 m (Fig. S8 in the Supplement and discussion therein) during these observations. Assuming a constant NO2 VMR of 3.5 ppb in the boundary layer leads to a NO2 VCD of 1.4 x 10^16 molec cm^-2. This estimate is close to the AirMAP NO2 VCD observed in the plume (Fig. 6). When measured at 300 m a.s.l., the NO2 VMR thus seems a good estimate of its average within the boundary layer. Note that this finding is specific to the configuration in Bucharest where we flew at 10 km from the city center and does not apply to our measurements in the exhaust plume of the Turceni power plant (Fig. 9). Future campaigns should include vertical soundings inside the Bucharest plume to further investigate its NO2 vertical distribution.

We estimated the H2CO reference column for the airborne data using the Mobile-DOAS measurements. Both NO2 and H2CO are in good agreement when comparing their distributions as seen from the airborne and ground-based instruments. However, although the highest H2CO VCDs are found above the Bucharest city center, they do not coincide with the NO2 maximum, as can be seen by comparing the upper and lower panels of Fig. 7, for instance on the second Cessna flight line from the north.
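The boundary-layer estimate quoted above (a constant 3.5 ppb of NO2 over a 1500 m BLH) can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes standard surface pressure and temperature and a constant air density throughout the layer, neither of which is stated in the text, and the function names are hypothetical.

```python
# Illustrative check of the well-mixed boundary-layer estimate.
# Assumptions (not from the paper): surface conditions of 1013.25 hPa and
# 288.15 K, and a constant air number density over the whole layer.
BOLTZMANN = 1.380649e-23  # J/K


def air_number_density(pressure_pa=101325.0, temperature_k=288.15):
    """Air number density in molec cm^-3 from the ideal gas law."""
    return pressure_pa / (BOLTZMANN * temperature_k) * 1e-6


def well_mixed_vcd(vmr_ppb, blh_m, n_air_cm3=None):
    """VCD in molec cm^-2 for a constant VMR in a layer of depth blh_m."""
    if n_air_cm3 is None:
        n_air_cm3 = air_number_density()
    # ppb -> mixing ratio, metres -> centimetres
    return vmr_ppb * 1e-9 * n_air_cm3 * blh_m * 100.0


vcd = well_mixed_vcd(3.5, 1500.0)
print(f"{vcd:.2e} molec cm^-2")
```

Under these assumptions the result is about 1.3 x 10^16 molec cm^-2, close to the 1.4 x 10^16 molec cm^-2 given in the text; the small difference depends on the assumed air density.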
The H2CO hotspot observed above Bucharest is mainly anthropogenic. Indeed, biogenic emissions typically account for 1 to 2 x 10^16 molec cm^-2 (J.-F. Müller, personal communication), in agreement with the background VCDs measured by the Mobile-DOAS along the Bucharest ring. During the measurements, the wind was blowing from south and west. The difference between NO2 and H2CO spatial patterns may be explained by the different origins of NOx compared to H2CO or by the formation time of H2CO through the oxidation of VOCs.
Anthropogenic hotspots of H2CO have already been observed, e.g. above Houston (Texas), an urban area which includes significant emissions from transport and petrochemical industry (Parrish et al., 2012; Nowlan et al., 2018) [...] instruments is not straightforward in these conditions. Table S2 in the Supplement gives the typical AMFs used in this analysis for airborne and zenith-only Mobile-DOAS. When observed with the Mobile-DOAS, the plume shows higher NO2 VCDs and appears narrower than with the airborne instruments. This is partly related to air mass factor uncertainties, but they cannot alone explain such a discrepancy. Close to the power plant, the plume is very thin and heterogeneous, which leads to 3-D effects in the radiative transfer, as suggested in a previous AROMAT study (Merlaud et al., 2018). In these conditions, the 1-D atmosphere of the radiative transfer models used to calculate the airborne AMFs may not be realistic enough and may bias the VCDs measured from the aircraft.
Figure 9 shows those AROMAT-1 NO2 sonde measurements above Turceni which detected the plume. The NO2 is not well-mixed in the boundary layer, with maxima aloft and lower VMRs close to the surface. This is understandable so close to the source, as high-temperature NOx is emitted from the 280 m high stack. In these balloon-borne datasets, the observed maximum NO2 VMR is about 60 ppb inside the plume, and the NO2 VMR vanishes above 1200 m a.s.l. These results suggest that airborne measurements with the ULM-DOAS, which can fly safely at 1500 m a.s.l., can provide reliable measurements of the integrated column amount inside the plume.

As for NO2, it appears difficult to quantitatively relate the airborne and Mobile-DOAS SO2 VCD observations in the close vicinity of the power plant. As shown in Fig. S5 of the Supplement (lower panel), the maximum SO2 VCD measured from the ground on the road close to the factory amounts to 1.3 x 10^18 molec cm^-2 while from the aircraft, the SO2 VCD reached 8 x 10^17 molec cm^-2. Part of this difference can be explained by 3-D effects on the radiative transfer, as for NO2. As discussed below, it seems easier to compare the SO2 flux.

Discussion
In this section, we develop the lessons learned from our study for the validation of satellite observations of the three investigated tropospheric trace gases, namely NO2, SO2, and H2CO. For each molecule, we discuss the benefit of conducting such airborne campaigns as well as the choice of Romania as a campaign site. In the last part of the section, we also estimate the NOx and SO2 emissions from Bucharest and from the power plants of the Jiu Valley, using the different datasets of the campaigns.

[...] tropospheric VCD seen by TROPOMI would be around 2 x 10^16 molec cm^-2 (29 σ for TROPOMI).

Characterization of the reference measurements
Table 3 summarizes the NO2 observations during the AROMAT campaigns. For each instrument, the table indicates the measured range of NO2 VCDs (or VMRs), the ground sampling distance and a typical detection limit and bias. Regarding DOAS instruments, we estimated the detection limits on the NO2 VCDs from typical 1-σ DOAS fit uncertainties divided by typical air mass factors (AMF). Table S2 in the Supplement presents these typical AMFs and detection limits. The 1-σ DOAS fit uncertainty is instrument specific and an output of the DOAS fitting algorithms. The AMF depends on the observation geometry and on the atmospheric and surface optical properties. Uncertainties on the AMF usually dominate the systematic part of the error for the DOAS measurements. Therefore, for these instruments, the bias given in Table 3 corresponds to the uncertainty in their associated AMF.
Combined with the ground sampling distance, the detection limit enables one to quantify the random uncertainty of a reference observation at the satellite horizontal resolution. Indeed, considering reference measurements averaged within a satellite pixel, the random error associated with the averaged reference measurements decreases with the square root of the number of measurements, following Poisson statistics. For instance, a continuous mapping performed with SWING at a spatial resolution of 300 x 300 m^2 inside a TROPOMI pixel of 3.5 x 5.5 km^2 would lead to 214 SWING pixels. Averaging the NO2 VCDs of these SWING pixels would divide the SWING original uncertainty (1.2 x 10^15 molec cm^-2) by √214, leading to 8.2 x 10^13 molec cm^-2, about one tenth of the random error of TROPOMI (7 x 10^14 molec cm^-2) given in Table 2.
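The averaging argument above can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the paragraph; the function names are hypothetical, and the pixel count is simply the ratio of the two footprint areas, rounded to the nearest integer.

```python
import math


def pixels_per_footprint(sat_x_km, sat_y_km, ref_x_km, ref_y_km):
    """Number of reference pixels that tile one satellite pixel (area ratio)."""
    return (sat_x_km * sat_y_km) / (ref_x_km * ref_y_km)


def averaged_random_error(single_pixel_error, n_pixels):
    """Random error of the mean of n_pixels independent measurements."""
    return single_pixel_error / math.sqrt(n_pixels)


# Values quoted in the text: SWING at 300 x 300 m^2, TROPOMI at 3.5 x 5.5 km^2.
n_swing = round(pixels_per_footprint(3.5, 5.5, 0.3, 0.3))
sigma_avg = averaged_random_error(1.2e15, n_swing)  # molec cm^-2
ratio = sigma_avg / 7e14  # relative to the TROPOMI random error

print(n_swing, f"{sigma_avg:.1e}", f"{ratio:.2f}")
```

The ratio of roughly 0.12 corresponds to the "about one tenth" stated above. The calculation assumes that the errors of the individual SWING pixels are independent.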
However, the temporal variation of the NO2 VCDs further adds uncertainty to the reference measurements when comparing them with satellite data. The validation areas typically extend over a few tens of kilometers. At this scale, satellite observations are a snapshot in time of the atmospheric state, while an airborne mapping typically takes one or two hours. [...] DSCDs. For each AirMAP overpass, we averaged the NO2 VCDs at the horizontal resolution of TROPOMI (see previous section). The standard deviation of the differences between two averaged overpasses then indicates the random part of the NO2 VCDs temporal variation during an aircraft overpass. This standard deviation is 3.7 x 10^15 and 4.2 x 10^15 molec cm^-2, respectively, between the first and second, and the second and third overpasses. Hereafter, we used 4 x 10^15 molec cm^-2 as random error due to the temporal variation.
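As an illustration of the statistic described above, the following sketch computes the standard deviation of the pixel-wise differences between two overpasses. The VCD values are made up for the example and are not AROMAT data; the use of the sample (n - 1) standard deviation is also an assumption, since the text does not say which estimator was used.

```python
import statistics


def temporal_random_error(overpass_a, overpass_b):
    """Standard deviation of pixel-wise VCD differences between two overpasses.

    Both inputs hold VCDs already averaged on the same satellite-pixel grid.
    """
    diffs = [b - a for a, b in zip(overpass_a, overpass_b)]
    return statistics.stdev(diffs)


# Hypothetical pixel-averaged NO2 VCDs (molec cm^-2) for two overpasses.
first = [5.0e15, 1.2e16, 2.0e16, 8.0e15]
second = [7.0e15, 9.0e15, 2.3e16, 8.0e15]

error = temporal_random_error(first, second)
print(f"{error:.2e} molec cm^-2")
```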
Clearly, the NO2 VCD temporal variation depends on the characteristics of a given validation experiment, such as the source locations and the wind conditions during the measurements. The temporal variation also depends on the time of day, and we base our estimate here on measurements around 11:00 LT while the TROPOMI overpass is at 13:30 LT. In the studied case however, this error source is larger for the reference measurements than the TROPOMI precision (7 x 10^14 molec cm^-2). This is quite different from using static MAX-DOAS as reference. The latter are usually averaged within one hour around the satellite overpass. Compernolle et al. (2020) quantify the temporal error for MAX-DOAS NO2 VCDs, typically ranging from 1 to 5 x 10^14 molec cm^-2. In the next section, we investigate the effect of underestimating the temporal random error.

Simulations of validation exercises in different scenarios
We simulated TROPOMI Cal/Val exercises with the spatially averaged AirMAP observations described in Sect. 4.1.1. We considered these averaged AirMAP NO2 VCDs as the ground truth in simulated TROPOMI pixels, on which we added Gaussian noise to build synthetic satellite and reference NO2 VCDs datasets. For the synthetic satellite observations, the noise standard deviation corresponded to the TROPOMI random error (the precision in Table 2). For the synthetic airborne observations, we added in quadrature the aforementioned averaged airborne shot noise (e.g. 7 x 10^13 molec cm^-2 for SWING) and temporal error (4 x 10^15 molec cm^-2, which we assumed to be also realistic around Turceni). We then applied weighted orthogonal distance regressions to a series of such simulations to estimate the uncertainty on the regression slope. This led to slope uncertainties of about 6% and 10% in Bucharest and Turceni, respectively.
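The simulation procedure described above can be sketched as a small Monte Carlo experiment. This is not the authors' code: the "truth" field below is a made-up set of 40 pixels drawn uniformly up to 2.6 x 10^16 molec cm^-2 rather than the averaged AirMAP data, and Deming regression is used as a stand-in for the weighted orthogonal distance regression (for a straight line with constant errors in each dataset, the two should coincide). The resulting spread is therefore not expected to reproduce the 6% and 10% quoted in the text.

```python
import math
import random
import statistics


def deming_slope(x, y, delta):
    """Slope of a straight-line fit with errors in both variables.

    delta is the ratio var(y errors) / var(x errors), assumed constant.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    d = syy - delta * sxx
    return (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)


def simulate_slopes(truth, sigma_sat, sigma_ref, n_runs=500, seed=0):
    """Fit satellite vs. reference for n_runs noisy realisations of truth."""
    rng = random.Random(seed)
    delta = (sigma_sat / sigma_ref) ** 2
    slopes = []
    for _ in range(n_runs):
        ref = [t + rng.gauss(0.0, sigma_ref) for t in truth]
        sat = [t + rng.gauss(0.0, sigma_sat) for t in truth]
        slopes.append(deming_slope(ref, sat, delta))
    return slopes


SIGMA_SAT = 7e14                   # TROPOMI random error quoted in the text
SIGMA_REF = math.hypot(7e13, 4e15)  # shot noise and temporal error in quadrature

# Hypothetical truth field (molec cm^-2); NOT the AirMAP observations.
truth_rng = random.Random(1)
truth = [truth_rng.uniform(0.0, 2.6e16) for _ in range(40)]

slopes = simulate_slopes(truth, SIGMA_SAT, SIGMA_REF)
mean_slope = statistics.fmean(slopes)
slope_spread = statistics.stdev(slopes)
print(f"slope = {mean_slope:.3f} +/- {slope_spread:.3f}")
```

The spread of the fitted slopes over the runs plays the role of the slope uncertainty discussed in the text; it depends strongly on the number of pixels and on the dynamic range of the assumed truth field.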
In a real-world validation experiment, this regression slope would quantify the combined biases of the two NO 2 VCD datasets (satellite and reference). These biases mainly originate from errors in the AMFs, resulting in particular from uncertainties on the NO 2 and aerosol profiles, and on the surface albedo. To some extent, these quantities can be measured from an aircraft with the type of instrumentation deployed in the AROMAT activity. The ground albedo can be retrieved with the DOAS instruments by normalizing uncalibrated airborne radiances to a reference area with known albedo (Meier et al., 2017) or by using a radiometrically calibrated DOAS sensor (Tack et al., 2019). The NO 2 and aerosol profiles can be measured with in-situ instruments such as a CAPS NO 2 monitor and a nephelometer. For legal reasons, vertical soundings are difficult above cities. One can measure the NO 2 and aerosol profiles further downwind in the exhaust plume, once the latter is above rural areas. The conditions inside the city can be different and this motivates the deployment of ground-based instruments, e.g. sunphotometers and MAX-DOAS, inside the city.
Regarding uncertainties on the reference AMFs, the benefit of knowing the aerosol and NO 2 profiles appears when comparing the AMF error budget for airborne measurements above Bucharest (26%, Meier et al. (2017)) and above the Turceni power plant (10%, Merlaud et al. (2018)). In the latter case, there was accurate information on the local aerosol and NO 2 profiles thanks to the lidar and the balloon-borne NO 2 sonde, respectively. We used these two AMF uncertainties to estimate a total possible bias between reference and satellite observations. Table 6 presents total error budgets for different scenarios of validation exercises using reference airborne mapping to validate spaceborne tropospheric NO 2 VCDs. We estimated the random and systematic uncertainties between satellite and reference measurements with SWING and AirMAP, including (or not) profile information on the aerosols and NO 2 VMR, and for measurements over Bucharest or Turceni. Note that we considered 25% for the satellite accuracy. The temporal error of the airborne measurements clearly dominates the total random error, making the differences in detection limit between AirMAP and SWING irrelevant for this application. Adding the profile information, on the other hand, reduces the total multiplicative bias from 37% to 28% and 29% in Bucharest and Turceni, respectively. This quantifies the capabilities of such airborne measurements for the validation of the imaging capabilities of TROPOMI regarding the NO 2 VCDs above Bucharest and the Jiu Valley.
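The quoted totals can be checked with a short calculation. The sketch below assumes that the total multiplicative bias is the quadrature sum of three independent terms: the satellite accuracy (25%), the reference AMF uncertainty (26% without and 10% with profile information), and the regression slope uncertainty (6% in Bucharest, 10% in Turceni). This composition is our reading of the text rather than a reproduction of Table 6, and the names are illustrative.

```python
def combine_in_quadrature(*relative_errors):
    """Root-sum-square of independent relative uncertainties."""
    return sum(e ** 2 for e in relative_errors) ** 0.5

SAT_ACCURACY = 0.25                                      # satellite accuracy
AMF = {"without profiles": 0.26, "with profiles": 0.10}  # reference AMF error
SLOPE = {"Bucharest": 0.06, "Turceni": 0.10}             # simulated slope error

# Assumed composition of the total bias (not taken from Table 6 itself).
budget = {
    (site, case): combine_in_quadrature(SAT_ACCURACY, amf, slope)
    for site, slope in SLOPE.items()
    for case, amf in AMF.items()
}

for (site, case), total in budget.items():
    print(f"{site:9s} {case:16s} {100 * total:.0f}%")
```

Under these assumptions the totals round to 37% without profile information at both sites, and to 28% (Bucharest) and 29% (Turceni) with it, in line with the values given above.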
Finally, it should be noted that these regression simulations assume a correct estimation of the temporal random error. Underestimating this error propagates in the fit of the regression slope. Figure 12 presents the possible effect of such an underestimation when the a priori random error of the reference measurements is set at 1x10 15 molec cm −2 , using again the AirMAP observations of Fig. 5 (right panel) as input data. As the dynamic range of the reference measurements increases with the applied error, the fitted slope decreases. For a true error of 4x10 15 molec cm −2 , this leads for instance to an underestimation of the slope of about 5%. This effect is small but other sources of random error (e.g. undersampling the satellite pixels) would add up in a real-world experiment. Wang et al. (2017) observed such a systematic decrease of the regression slope when averaging MAX-DOAS measurements within larger time windows around the satellite overpass.

Lessons learned for the validation of space-borne H 2 CO VCDs
Table 4 is similar to Table 3 but for H 2 CO, which we only measured in significant amounts in and around Bucharest. The background level of the H 2 CO VCD around the city is around 1x10 16 molec cm −2 and the anthropogenic increase in the city center is up to 7x10 16 molec cm −2 (Fig. 7). The background falls within the TROPOMI H 2 CO spread (1.2x10 16 molec cm −2 ), and Fig. 7 indicates that the extent of the urban hotspot only corresponds to a few TROPOMI pixels, with a maximum at 6 σ. This limits the relevance of individual mapping flights for the validation of H 2 CO, yet systematic airborne measurements would improve the statistics. The information on the H 2 CO horizontal variability is nevertheless useful, as it justifies the installation of a second MAX-DOAS in the city center, in addition to background measurements outside the city.
Indeed, long-term ground-based measurements at two sites would be useful to investigate seasonal variations of H 2 CO, as already demonstrated at other sites (De Smedt et al., 2015). Averaging the H 2 CO over a season would reduce the random errors of the satellite measurements and it could reveal the horizontal variability of H 2 CO from space. The H 2 CO hotspot around Bucharest seems to be visible in the TROPOMI data of summer 2018 (I. De Smedt, personal communication).
Getting information on the profile of H 2 CO during an airborne campaign may also help to understand the differences between ground-based and space-borne observations. This could be done by adding to the BN-2 instrumental set-up an in-situ H 2 CO sensor such as the In Situ Airborne Formaldehyde instrument (ISAF, Cazorla et al. (2015)) or the COmpact Formaldehyde FluorescencE Experiment (COFFEE, St. Clair et al. (2017)).

Lessons learned for the validation of space-borne SO 2 VCDs
Table 5 is similar to Table 3 but for SO 2 , which we only measured in significant amounts in the Jiu Valley. The higher bias of the airborne measurements for SO 2 compared to NO 2 is due to the albedo. The latter is lower in the UV where we retrieve SO 2 , which leads, for the same albedo error, to a larger AMF uncertainty (e.g. Merlaud et al., 2018, Fig. 10).
Averaging the SO 2 VCDs from the airborne mapping of Fig. 10 at the TROPOMI resolution leads to 30 near-nadir TROPOMI pixels above a 2-σ error of 5.4x10 16 molec cm −2 . The maximum SO 2 tropospheric VCD seen by TROPOMI would be 2.4x10 17 molec cm −2 (7 σ). This tends to indicate that airborne mappings of SO 2 VCDs above large power plants could help to validate the horizontal variability of the SO 2 VCDs measured from space, to a limited extent in the AROMAT conditions due to the small dynamic range (7 σ). As for H 2 CO, systematic airborne measurements would improve the statistics.
However, it would be difficult to quantify the bias of the satellite SO 2 VCD with AROMAT-type airborne measurements. Adding in quadrature the biases of the SO 2 VCDs for airborne measurements (40%, Table 5) and for TROPOMI (30%, Table 2) already leads to a combined uncertainty of 50%, without considering any temporal variation or regression error. This best-case scenario is already at the upper limit of the TROPOMI requirements for tropospheric SO 2 VCDs (Table 2).
Similar to H 2 CO, the validation of the satellite-based SO 2 measurements should thus rely on ground-based measurements, which make it possible to improve the signal-to-noise ratio of the satellite and reference measurements by averaging their time series. An additional difficulty for validating SO 2 VCDs emitted by a power plant arises from the spatial heterogeneity of the SO 2 field around the point source, which renders ground-based VCD measurements complicated.
On the other hand, Fioletov et al. (2017) presented a method to derive the SO 2 emissions from OMI data and validated it against reported emissions. The SO 2 fluxes can be measured locally in several ways and we tested some of them during AROMAT-2 (see Sect. 4.4.2 below). To validate satellite-derived SO 2 products in Europe, it thus seems possible to compare satellite and ground-based reference SO 2 fluxes. Theys et al. (2019) already validated TROPOMI-derived volcanic SO 2 fluxes against ground-based measurements. In this context, a SO 2 camera pointing to the plant stack would be a valuable tool since it could be permanently installed and automated. One advantage of such a camera compared to the other tested remote-sensing instruments, besides its low operating cost, is that it derives the extraction speed from the measurements, avoiding dependence on low-resolution wind information. The next section presents the SO 2 fluxes derived with such a camera during the 2015 campaign.
Note that the SO 2 VCDs measured on 28 August 2015 around Turceni may be higher than in standard conditions due to a temporary shutdown of the desulfurization unit, which was reported by local workers. SO 2 VCDs in the area seem to have decreased (D. Constantin, personal communication). The first reported TROPOMI SO 2 measurements above the area pinpoint other power plants in Serbia, Bosnia-Herzegovina, and Bulgaria (Fioletov et al., 2020). For validation studies, it would be worthwhile to install automatic SO 2 cameras around these plants, until they are equipped with FGD units.

Emissions of NO x and SO 2 from Bucharest and the Jiu Valley
This section presents estimates of the NO x and SO 2 fluxes from Bucharest and the power plants in the Jiu Valley, combining our different 2014 and 2015 measurements and comparing them with available reported emissions. Campaign-based estimates of NO x emissions from large sources are relevant in a context of satellite validation since the high resolution of TROPOMI enables the derivation of such emissions on a daily basis (Lorente et al., 2019). Regarding SO 2 , as discussed in the previous section, the low signal-to-noise ratio of the satellite measurements implies averaging for several months to derive a SO 2 flux (Fioletov et al., 2020), yet campaign measurements are useful to select an interesting site and test the ground-based apparatus and algorithms.
The comparisons with reported emissions should not be overinterpreted since we compare campaign-based flux measurements performed during a few days in daytime with reported emissions which represent yearly averages. Nevertheless, they give interesting indications about the operations of the FGD units of the power plants and possible biases in emission inventories.
Our flux estimates are all based on optical remote sensing measurements. They involve integrating a transect of the plume along its spatial extent and multiplying the outcome by the plume speed, which may correspond to the stack exit velocity (camera pointing to the stack) or to the wind speed (Mobile-DOAS and imaging-DOAS). We refer the reader to previous studies for the practical implementations. Ibrahim et al. (2010) presented the method we used for Bucharest, where we encircled the city with the Mobile-DOAS. Meier et al. (2017) presented the AirMAP-derived flux estimations, while Johansson et al. (2014) derived industrial emissions from a car-based Mobile-DOAS instrument as we did for the Turceni power plant. Constantin et al. (2017) presented the fluxes based on the ULM-DOAS measurements. Regarding the SO 2 cameras, they are now commonly used to monitor SO 2 emissions from volcanoes (see McGonigle et al. (2017) and references therein), but their capacity to measure SO 2 fluxes from power plants has been demonstrated as well (e.g., Smekens et al., 2014).

NO x flux from Bucharest
We estimated NO x fluxes from the Bucharest urban area using the NO 2 VCDs measured with the UGAL Mobile-DOAS systems along the external ring and the wind data on 8 September 2014 and 31 August 2015. We derived the wind direction from the maxima of the NO 2 VCDs in the DOAS observations. For the wind speed, we took 1.1 m s −1 on 8 September 2014, the value Meier (2018) used for the AirMAP-derived flux, which originates from meteorological measurements at Baneasa airport. On 31 August 2015, we used the ERA5 wind data (C3S, 2017) at the time when the Mobile-DOAS crossed the NO 2 plume (15:00 UTC). The ERA5 database indicates a constant wind speed between 1000 and 900 hPa of 1.2 m s −1 . Finally, and similarly to Meier (2018), we took a value of 1.32 for the NO x to NO 2 ratio and estimated the chemical loss of NO x with a lifetime of 3.8 h and an effective source location in the center of Bucharest.
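The flux calculation can be sketched as follows for a single downwind transect. This is a simplified illustration under stated assumptions, not the implementation of the cited studies: the VCDs are taken as background-subtracted, the wind is uniform, the chemical loss is a first-order decay over the transport time from the effective source, and the function names, transect values, and 10 km source distance are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def no2_flux_mol_s(vcds, seg_lengths_m, wind_speed, wind_angle_deg=90.0):
    """Integrate VCDs (molec cm-2) along a transect and multiply by the wind.

    wind_angle_deg is the angle between the wind and the driving direction;
    only the wind component perpendicular to the transect carries the flux.
    """
    perp = wind_speed * math.sin(math.radians(wind_angle_deg))
    # 1e4 converts molec cm-2 to molec m-2
    molec_per_s = sum(v * 1e4 * ds * perp for v, ds in zip(vcds, seg_lengths_m))
    return molec_per_s / AVOGADRO

def nox_emission_mol_s(no2_flux, distance_m, wind_speed,
                       nox_to_no2=1.32, lifetime_h=3.8):
    """Scale an NO2 flux to NOx and correct for first-order chemical loss."""
    transport_time_s = distance_m / wind_speed
    decay_correction = math.exp(transport_time_s / (lifetime_h * 3600.0))
    return no2_flux * nox_to_no2 * decay_correction

# Hypothetical transect: ten 1 km segments with 1e16 molec cm-2 above background.
vcds = [1e16] * 10
segments = [1000.0] * 10
flux = no2_flux_mol_s(vcds, segments, wind_speed=1.2)
# Hypothetical distance of 10 km between the effective source and the transect.
nox = nox_emission_mol_s(flux, distance_m=10_000.0, wind_speed=1.2)
print(f"NO2 flux: {flux:.2f} mol/s, NOx emission: {nox:.2f} mol/s")
```

For a closed loop around a city, the same integral is evaluated along the whole circuit with the signed perpendicular wind component, so that the inflow is subtracted from the outflow.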
Table 7 presents the AirMAP and Mobile-DOAS derived NO x fluxes from Bucharest, ranging between 12.5 and 17.5 mol s −1 . On 8 September 2014, the Mobile-DOAS and airborne observations were coincident. Their estimated NO x fluxes agree within 20%. This gives confidence in the flux estimation, yet one should keep in mind that the same wind data were used for both estimations. Meier (2018) estimated the uncertainties on the AirMAP-derived NO x flux to be around 63%, while the uncertainty of the Mobile-DOAS-derived NO x flux typically ranges between 30% and 50% (Shaiganfar et al., 2017).
We compared our measured NO x fluxes with the European Monitoring and Evaluation Programme inventory (EMEP, https://www.ceip.at/). In practice, we summed the EMEP gridded yearly NO x emissions between 44.2° and 44.6° N and between 25.9° E and 26.3° E and we assumed the emissions are constant during one year. This led to NO x emissions of 6.14 and 6.33 mol s −1 for 2014 and 2015. Studying the reported emissions from several European cities including Bucharest, Trombetti et al. (2018) mention that the EMEP emissions are well below other inventories for all the pollutants. We thus also compared our flux with the Emissions Database for Global Atmospheric Research (EDGAR v4.3.2, Crippa et al. (2018)), which is only available until 2012. The same method led to a NO x flux of 18.4 mol s −1 , to be compared with the 2012 EMEP NO x emissions of 7.1 mol s −1 . Based on summer measurements, the AROMAT-derived NO x emissions do not include residential heating. The latter ranges between 10 and 40% of the total NO x according to Trombetti et al. (2018). This tends to confirm that the EMEP inventory underestimates the NO x emissions for Bucharest.
Figure 13 presents a scatter plot of the slant columns of NO 2 and SO 2 for the ultralight flight of 26 August 2015, which detected the four exhaust plumes of the Valley between 08:31 and 11:04 UTC. Two regimes are visible in the SO 2 to NO 2 ratio. When considering the longitude, the low SO 2 to NO 2 ratio (1.33) appears to correspond to the Rovinari exhaust plume, while the other power plants exhibit a higher ratio (13.55). The low ratio observed at Rovinari corresponds to the FGD units operating at this power plant.
We estimated the NO x and SO 2 fluxes from the power plants using several instruments: a Mobile-DOAS, the ULM-DOAS, and the SO 2 camera. For the DOAS instruments, we inferred the wind direction from the plume position and we retrieved the wind speed from the ERA5 database. Considering the observed vertical extent of the plume downwind of Turceni (Fig. 9), we took the wind speed at 950 hPa (ca. 500 m a.s.l.).
Figure 14 presents the ULM-DOAS-estimated fluxes of NO x and SO 2 from the power plants in Turceni, Rovinari, and Craiova for the flight on 26 August 2015. The figure also shows the reported emissions from the European Environment Agency (EEA) large combustion plants database (EEA, 2018), assuming constant emissions throughout the year. Turceni appears to be the largest SO 2 source (78 mol s −1 ), while Rovinari is the largest NO x source (8 mol s −1 ).
It is difficult to interpret the discrepancies between those measured fluxes and the yearly reported emissions since we observed large variations in the instantaneous emissions with the SO 2 camera (see below and Fig. 15). However, the ratio of the two fluxes appears interesting since we can assume its relative stability. This ratio for a given power plant depends on whether or not a desulfurization unit is operational at the plant. In Fig. 14, Turceni appears to have both the largest measured ratio and the largest discrepancy between the measured and reported ratios. This is consistent with a temporary shutdown of the desulfurization unit of the Turceni power plant, as was reported by the plant workers during the campaign. The ULM-DOAS measurements on 25 August 2015 (shown in Fig. S11 in the Supplement), which also sampled the Isalnita plume, are consistent with those of 26 August 2015. These measurements lead to estimated total NO x and SO 2 fluxes of about 22 and 147 mol s −1 , respectively.
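Table 8 focuses on the Turceni power plant and lists all estimates of the NO x and SO 2 emissions from this source. Meier (2018) estimated the NO x flux from the Turceni power plant using the AirMAP measurements of 2014 and 2015. This leads to similar values for the two flights on 11 September 2014 and 28 August 2015, of about 8 mol s −1 . On this second day, the UGAL Mobile-DOAS crossed the plume along the road in front of the power plant. These ground-based measurements lead to a NO 2 flux of 2.2 mol s −1 , much lower than the aforementioned AirMAP-derived value. However, Meier (2018) calculated the latter based on AirMAP measurements at 3.5 km from the source. At shorter distances, the AirMAP estimated NO 2 flux is smaller and close to the Mobile-DOAS observations. This is probably related to the fact that the NO/NO 2 ratio has not yet reached its steady state value above the road where we performed the Mobile-DOAS observations, which is only around 1 km from the stack. The agreement is better for SO 2 (25 and 32 mol s −1 ). On 25 August 2015, we had a coincidence of ULM-DOAS and Mobile-DOAS observations and we observed a similar range of values. This gives us confidence in our estimate of the NO x flux from the aircraft but confirms that the nearby road is too close to the plant to estimate a meaningful NO x flux from Mobile-DOAS NO 2 observations. Note that the conversion of NO into NO 2 is also visible right above the Turceni stack in the NO 2 imager data of 24 August 2015, as appears in Fig. 6 of Dekemper et al. (2016).
Figure 15 presents a time series of the SO 2 emissions from the Turceni power plant between 9:00 and 10:50 UTC on 28 August 2015. We derived SO 2 fluxes at different altitudes above the stack using a UV SO 2 camera which is an updated version of the Envicam2 system, used during the SO 2 camera intercomparison described by Kern et al. (2015). We converted the measured optical densities to SO 2 column densities using simultaneous measurements with an integrated USB spectrometer (Lübcke et al., 2013). We estimated the stack exit velocity from the SO 2 images, recorded with a time resolution of about 15 seconds, by tracking the spatial features of the plume. Dekemper et al. (2016) used a similar approach to derive the NO 2 flux from NO 2 camera imagery.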
The SO 2 fluxes retrieved for transects at 400 to 700 m vertical distances above the stack agree on average with each other within 20%. Emissions estimated 100 m above the stack are underestimated due to saturation (SO 2 column densities above 2 x 10 18 molec cm −2 ) and high aerosol concentration close to the exhaust.
The SO 2 emissions show large fluctuations. During the time of our observations they increased from 1 kg s −1 (15.6 mol s −1 ) to around 4 ± 1 kg s −1 (62.4 mol s −1 ). The images (Fig. S10 in the Supplement) also show a second and weaker source that emits SO 2 . This is probably the desulfurization unit, which was reported to be turned on again on this day, after the temporary shutdown. Indeed, as appears in Table 8, the SO 2 /NO 2 ratio measured from AirMAP is lower than the ones measured from the ULM-DOAS during the previous days, and the same holds true for the Mobile-DOAS measurements.
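The mass-to-molar conversion behind the camera fluxes quoted above can be verified with a few lines. The sketch assumes a molar mass of about 64.066 g mol −1 for SO 2 ; the function name is illustrative.

```python
M_SO2 = 64.066  # approximate molar mass of SO2 in g/mol (assumed value)

def kg_per_s_to_mol_per_s(rate_kg_s, molar_mass_g_mol=M_SO2):
    """Convert a mass emission rate (kg/s) to a molar rate (mol/s)."""
    return rate_kg_s * 1000.0 / molar_mass_g_mol

low = kg_per_s_to_mol_per_s(1.0)
high = kg_per_s_to_mol_per_s(4.0)
print(f"1 kg/s = {low:.1f} mol/s, 4 kg/s = {high:.1f} mol/s")
```

The same helper can be applied to other species by passing the corresponding molar mass, for instance when comparing camera fluxes in kg s −1 with DOAS-derived fluxes in mol s −1 .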

Conclusions
The two AROMAT campaigns took place in Romania in September 2014 and August 2015. They combined airborne and ground-based atmospheric measurements and focused on air quality-related species (NO 2 , SO 2 , H 2 CO, and aerosols). The AROMAT activity targeted the urban area of Bucharest and the power plants of the Jiu Valley. The main aims were to test new instruments, to measure the concentrations and emissions of key pollutants in the two areas, and to investigate the concept of such campaigns for the validation of air quality satellite-derived products.
We have shown that the airborne mapping of tropospheric NO 2 VCDs above Bucharest is potentially valuable for the validation of current and future nadir-looking satellite instruments. In the AROMAT conditions, airborne measurements were consistent with ground-based observations within 7% and covered a significant part of the dynamic range of the NO 2 tropospheric VCDs at an appropriate signal to noise ratio. Our simulations, based on campaign measurements and TROPOMI characteristics, indicate that we can constrain the accuracy of the satellite NO 2 VCDs within 28% or 37%, depending on whether information on the aerosol and NO 2 profile is available or not. This points to the importance of acquiring profile information to approach the TROPOMI optimal target accuracy for tropospheric NO 2 VCDs (25%).
A unique advantage of airborne mapping is its ability to validate the imaging capabilities of nadir-looking satellites. This feature becomes more important as the satellite horizontal resolution reaches the suburban scale. Judd et al. (2019) pointed out the difficulty for static ground-based measurements to represent the NO 2 VCDs measured from space in polluted areas, due to the horizontal representativeness error. This error cancels out by mapping the full extent of satellite pixels. The caveat is the temporal error, which can be larger than with static ground-based measurements. For a single morning flight above Bucharest, we have estimated the random part of this temporal error to be about 4 x 10 15 molec cm −2 . In the AROMAT conditions, underestimating this error would lead to a low bias in the regression slope between satellite and airborne measurements.
This temporal error varies with local conditions for a given experiment but the satellite air quality community should further investigate this effect. This indicates the usefulness of simultaneous ground-based measurements, which may also be useful to estimate the reference NO 2 VCDs in the airborne observations. These conclusions for NO 2 above Bucharest apply to other large polluted urban areas.
In addition to NO 2 , we also detected the signature of H 2 CO emissions in and around Bucharest, with an anthropogenic hotspot in the city center. Due to the lower signal to noise ratio of the spaceborne H 2 CO observations, it is difficult to use such daily measurements for satellite validation. We thus propose considering long-term ground-based MAX-DOAS measurements in the city for the validation of H 2 CO.
In the Jiu Valley, NO 2 is clearly visible from both satellite and aircraft, and the VCDs are comparable in magnitude with the signal detected above Bucharest. However, it appears more complicated to quantitatively compare the NO 2 VCD datasets in the thick exhaust plumes of the power plants. These plants also emit SO 2 but, as for H 2 CO, the low signal to noise ratio of satellite measurements reduces the validation relevance of individual airborne measurements.
In relation to the ideal validation study mentioned in the introduction, the relevance of international airborne campaigns is generally limited by their timespan of typically a couple of weeks, imposed by logistical and cost considerations. To overcome this limitation, we propose to consider routine airborne mapping of NO 2 VCDs by local aircraft operators and close to a well-equipped ground-based observatory. Such a set-up would reduce the fixed costs of the observations, which could then be allocated to flight hours in different seasons. Such an approach would combine the advantages of long-term ground-based and airborne measurements. In the longer term, high altitude pseudo-satellites (HAPS) could provide the necessary routine measurements above selected supersites, as needed to validate the observations from future sensors in geostationary orbit.
Competing interests. The authors declare that they have no conflict of interest.
Author contributions. AM, LB, D-EC, MDH, ACM, LG, DN, and MVR planned and organised the campaign. All coauthors contributed to the campaign either as participants or during campaign preparation and/or follow up data analysis, including the writing of this manuscript, which was coordinated by AM and MVR with feedback and contributions from all the coauthors.

Figure 8 presents the horizontal distribution of the NO2 VCDs in the Jiu Valley measured with the MPIC Mobile-DOAS on 23 August 2015 between 08:07 and 14:16 UTC.


Stefan et al. (2013) have shown the importance of local conditions and anthropogenic factors in air quality analysis in areas close to Bucharest, during two weeks of measurements in 2012. Iorga et al. (2015) and Grigoraş et al. (2016) showed that the main NO x contributions came from traffic and production of electricity, spread over about 10 medium-size thermal power plants within the city.

Figure 2 shows the Bucharest metropolitan area and the flight tracks of the two scientific aircraft used during AROMAT-2 (the FUB Cessna-207 and the INCAS BN-2). Note that the BN-2 tracks are actually a good indication of the Bucharest ring road. We were not allowed to cross the ring road with the BN-2, except in the north of the city. The figure also pinpoints important locations for the AROMAT campaigns. The FUB Cessna took off and landed at the Baneasa international airport, located 8 km north of Bucharest city center (44.502° N, 26.101° E). The INCAS BN-2 also used Baneasa airport during

Figure 4 illustrates the typical instrumental deployment during the campaigns. The set-up combined airborne and ground-based measurements to sample the 3-D chemical state of the lower troposphere above polluted areas. The Supplement presents the main atmospheric instruments operated during the two campaigns, classified into airborne, ground-based, remote sensing, and in-situ sensors. The primary target species during AROMAT-1 were NO 2 and aerosols while the observation capacities expanded in AROMAT-2 through the improvements of the AirMAP and SWING sensors for SO 2 measurements and the deployments of other instruments such as SO 2 cameras, DOAS instruments targeted to H 2 CO, and a PICARRO instrument to measure water vapor, methane, CO, and CO 2 . We used two small tropospheric aircraft: the Cessna-207 from FUB, and the Britten-Norman Islander (BN-2) from INCAS. The Cessna was dedicated to remote sensing. It mainly performed mapping flights at 3 km a.s.l. for the airborne imagers, while parts of the ascents and descents were used to measure aerosol extinction profiles with the FUBISS-ASA2 instrument. The BN-2, which was only used during AROMAT-2, was dedicated to in-situ measurements around Bucharest between surface and 3000 m a.s.l. In AROMAT-2, there was also an ultralight aircraft used by UGAL for nadir-DOAS observations in the Jiu Valley. The ultralight aircraft typically flew between 600 and 1800 m a.s.l. Two UAVs, operated by INCAS and UGAL, flew during AROMAT-1. These measurements were not repeated during AROMAT-2 since the coverage of the UAVs was too limited, both in horizontal and vertical direction. Finally, we also launched balloons carrying NO 2 sondes from Turceni and performed Mobile-DOAS measurements from several cars during both campaigns. The Supplement provides more details about the practical deployments during the campaigns.

Figure 5 presents two maps of the AROMAT NO 2 measurements performed with the AirMAP, CAPS, and MPIC Mobile-DOAS instruments above Bucharest, on 30 (Sunday afternoon, left panel) and 31 (Monday afternoon, right panel) August 2015. AirMAP is a remote sensing instrument that mapped the NO 2 VCDs from the Cessna at 3 km a.s.l. and produced the continuous map. The CAPS is an in-situ instrument; it was operated on the BN-2, sampled the air at 300 m a.s.l., and performed vertical soundings above Magurele. The MPIC Mobile-DOAS mainly drove along the Bucharest ring road. The datasets of Fig. 5 reveal large differences of NO 2 amounts on Sunday 30 August 2015 compared to Monday 31 August 2015. On Sunday afternoon, the NO 2 VCDs peak around 1.5 x 10 16 molec cm −2 . On Monday, the NO 2 plume spread from the
Figure 7 shows the H 2 CO and NO 2 VCD measurements from the IUP-Bremen nadir instrument operated onboard the Cessna on 31 August 2015 (morning flight), together with the MPIC Mobile-DOAS measurements. The airborne data shown correspond to the second overpass (07:46-08:23 UTC) while the Mobile-DOAS data were recorded between 08:13 and 10:00 UTC. The H 2 CO VCDs range between 1±0.25 x 10 16 molec cm −2 and 7.5±2 x 10 16 molec cm −2 , a maximum observed inside the city.

Figure 10 (upper panels) shows the AirMAP and SWING NO 2 VCDs measured around the Turceni power plant on 28 August 2015. The two airborne instruments largely agree, detecting NO 2 VCDs up to 8x10 16 molec cm −2 in the exhaust plume of the power plant. Figure S5 in the Supplement (upper panel) extracts the AirMAP and SWING NO 2 VCDs along the path of the ground-based BIRA Mobile-DOAS measurements and compares the three datasets. The airborne data correspond to three portions of flight lines recorded between 09:54 and 10:17 UTC. The BIRA Mobile-DOAS instrument was sampling the plume during this time so the maximum time difference is 23 minutes. This comparison confirms the good agreement for the airborne instruments but indicates that comparing airborne nadir-looking DOAS with ground-based zenith Mobile-DOAS

4.1 Lessons learned for the validation of space-borne NO 2 VCDs
4.1.1 Number of possible pixels and dynamic range at the TROPOMI resolution
Regarding Bucharest, the mapped area of Fig. 5 (right panel) virtually covers 43 TROPOMI near-nadir pixels. Averaging the high spatial resolution AirMAP NO 2 VCDs within these 43 hypothetical TROPOMI measurements reduces the dynamic range of the observed NO 2 field. The latter decreases from 3.5x10 16 to 2.6x10 16 molec cm −2 (37 σ where σ is the required precision on the tropospheric NO 2 VCD). Nevertheless, 33 of the 43 hypothetical TROPOMI pixels exhibit a NO 2 VCD above the required 2-σ random error for TROPOMI (1.4x10 15 molec cm −2 ).

Figure 11 illustrates our estimation of the temporal variation of the NO 2 VCDs comparing consecutive AirMAP overpasses above Bucharest from the morning flight of 31 August 2015.During this flight, the Cessna covered the same area three times in a row between 07:06 and 08:52 UTC. Figure S3 in the Supplement presents the corresponding AirMAP and SWING NO 2

4.4 Emissions of NO x and SO 2 from Bucharest and the Jiu Valley
This section presents estimates of the NO x and SO 2 fluxes from Bucharest and the power plants in the Jiu Valley, combining our different 2014 and 2015 measurements and comparing them with available reported emissions. Campaign-based estimates of NO x emissions from large sources are relevant in a context of satellite validation since the high resolution of TROPOMI

4.4.1 NO x flux from Bucharest
We estimated NO x fluxes from the Bucharest urban area using the NO 2 VCDs measured with the UGAL Mobile-DOAS systems along the external ring and the wind data on 8 September 2014 and 31 August 2015. We derived the wind direction from the maxima of the NO 2 VCDs in the DOAS observations. For the wind speed, we took 1.1 m s −1 on 8 September 2014, the value Meier (2018) used for the AirMAP-derived flux, which originates from meteorological measurements at Baneasa airport. On 31 August 2015, we used the ERA5 wind data (C3S, 2017) at the time when the Mobile-DOAS crossed the NO 2

4. 4
Figure 13 presents a scatter plot of the slant columns of NO 2 and SO 2 for the ultralight flight of 26 August 2015, which detected the four exhaust plumes of the Valley between 08:31 and 11:04 UTC. Two regimes are visible in the SO 2 to NO 2

Meier (2018) estimated the NO x flux from the Turceni power plant using the AirMAP measurements of 2014 and 2015. This leads to similar values for the two flights on 11 September 2014 and 28 August 2015, of about 8 mol s −1 . On this second day, the UGAL Mobile-DOAS crossed the plume along the road in front of the power plant. These ground-based measurements lead to a NO 2 flux of 2.2 mol s −1 , much lower than the aforementioned AirMAP-derived value. However, Meier (2018) calculated the latter based on AirMAP measurements at 3.5 km from the source. At shorter distances, the AirMAP estimated NO 2 flux is smaller and close to the Mobile-DOAS observations. This is probably related to the fact that the NO/NO 2 ratio has not yet reached its steady state value above the road where we performed the Mobile-DOAS observations, which is only around 1 km from the stack. The agreement is better for SO 2 (25 and 32 mol s −1 ). On 25 August 2015, we had a coincidence of ULM-DOAS and Mobile-DOAS observations and we observed a similar range of values. This gives us confidence in our estimate of the NO x flux from the aircraft but confirms that the nearby road is too close to the plant to estimate a meaningful NO x flux from Mobile-DOAS NO 2 observations. Note that the conversion of NO into NO 2 is also visible right above the Turceni stack in the NO 2 imager data of 24 August 2015, as appears in Fig. 6 of Dekemper et al. (2016).

Figure 1. The tropospheric NO2 VCD field seen from space with the OMI/AURA instrument above Romania (OMNO2d product, averaged for 2012-2016 with Giovanni, NASA GES DISC). The black stars pinpoint the largest cities of Romania.

Figure 2. The Bucharest area with important locations for the AROMAT campaigns: the INOE atmospheric observatory in Magurele, the Baneasa airport, and the Clinceni airfield. Built-up areas appear in grey. The red and black lines, respectively, show the BN-2 and Cessna flight tracks during AROMAT-2.

Figure 3. The Jiu Valley and its four power plants between Targu Jiu and Craiova. The scientific crew was based in Turceni during the AROMAT campaigns. The green and red lines, respectively, show the ultralight and Cessna flight tracks during AROMAT-2.

Figure 4. Geometry of the main measurements performed during the AROMAT campaigns. The Imaging-DOAS instruments map the NO2 and SO2 VCDs from 3 km altitude above the target area while the in-situ samplers measure profiles of trace gases and aerosols. Ancillary ground measurements include Mobile-DOAS to quantify trace gas VCDs and lidars to measure the aerosol optical properties.

Figure 6. Volume mixing ratio and VCDs of NO2 in and out of the pollution plume of Bucharest, as measured with the CAPS (on the BN-2, 12:30-12:55 UTC) and AirMAP (on the Cessna, 12:00-13:30 UTC) during the afternoon flights on 31 August 2015. Note that the plot shows the VCDs extracted at the position of the CAPS measurements.

Figure 8. Tropospheric vertical column densities of NO2 measured with the MPIC Mobile-DOAS instruments in the Jiu Valley on 23 August 2015 between 08:07 and 14:16 UTC.

Figure 9. Examples of NO2 sonde data from Turceni during AROMAT-1 (11 and 12 September 2014). The legend indicates the date, the last two digits being the hour of launch (UTC).

Figure 11. AirMAP measurements of NO2 VCDs degraded to the TROPOMI resolution during three overpasses of the morning flight of 31 August 2015 (left panels), together with the differences of these degraded NO2 VCDs for consecutive overpasses (right panels). The right panels also indicate the means (µ) and standard deviations (σ) of the two differences.

Figure 13. SO2 and NO2 SCDs measured from the ULM-DOAS above the Jiu Valley on 26 August 2015 between 08:31 and 11:04 UTC. Blue dots indicate the measurements above Rovinari, whereas the red ones are for all the other plants.

Figure 15. SO2 fluxes from the Turceni power plant on 28 August 2015. They were estimated with the Envicam2 SO2 camera for 4 transverses at altitudes above the stack of 100, 400, 500, and 700 m. The red line shows the estimated plume speed (m s^-1).
Table 2 presents such requirements for the TROPOMI-derived tropospheric vertical column densities (VCDs) of NO2, SO2, and H2CO (ESA
° N, 26.1° E) is the capital and largest city (1.9 million inhabitants according to the 2011 census) of Romania.

Table S5 in the Supplement summarizes the main measurement days during AROMAT-1, specifying whether the measurements were taken in Bucharest or in the Jiu Valley. The "golden days" of the AROMAT-1 campaign are 2, 8, and 11 September 2014. These days are particularly interesting due to good weather conditions and coincident measurements. On 2 September 2014, we operated the three Mobile-DOAS instruments together around Bucharest. On 8 September 2014, we flew AirMAP above Bucharest with the UGAL and MPIC Mobile-DOAS on the ground. Finally, on 11 September 2014, SWING and AirMAP were time-coincident above the Turceni power plant, and two balloons sampled the vertical distribution of NO2.

The AROMAT-2 campaign took place between 17 and 31 August 2015. We started in Bucharest with car-based Mobile-DOAS measurements and observations at RADO. The INOE mobile lab was installed in Turceni on 19 August 2015, followed by an SO2 camera (instrument described in Kern et al., 2015; Stebel et al., 2015) and an NO2 camera (Dekemper et al., 2016). Poor weather conditions limited the relevance of the measurements during the first days of the campaign. Two Mobile-DOAS teams moved from Bucharest to the Jiu Valley on 23 August 2015. From then on, the weather was fine until the end of the campaign, and valuable data were collected on all days between 24 and 31 August 2015.

In the Jiu Valley, the crew was based in Turceni and most of the static instruments were installed at a soccer field. Besides the INOE mobile lab with in-situ samplers, the scanning lidar, the SO2 cameras, and the NO2 camera pointed at the power plant plume. The NO2 camera acquired images until 25 August 2015. The car-based Mobile-DOAS operated in the valley between the different power plants. From 24 August, the SO2 cameras were split: one of them stayed at the soccer field, and the two others were installed at several points around Turceni. Also on 24 August, the UGAL ultralight took off from Craiova and flew to the Jiu Valley as far as Rovinari, carrying the ULM-DOAS instrument. This experiment was repeated on 25, 26, and 27 August. On 28 August 2015, the Cessna flew above Turceni with AirMAP and SWING.

In Bucharest, the BN-2 flew first on 25 August 2015. It took off from Strejnicu carrying various in-situ instruments (the TSI nephelometer and Aerosol Particle Sizer, the NO2 CAPS, the PICARRO, and the KNMI NO2 sonde) and flew in a loop pattern at 500 m a.s.l. around the city ring road. After this test flight, the aircraft performed 6 flights between 27 and 31 August 2015, which included soundings around Baneasa and Magurele, up to 3300 m a.s.l. On 30 and 31 August 2015, the Cessna mapped the city of Bucharest, performing two flights per day. It also performed soundings to measure AOD profiles with the FUBISS-ASA2 instrument (Zieger et al., 2007).

Table S6 in the Supplement summarizes the measurements of the AROMAT-2 campaign, specifying whether the measurements were taken in Bucharest or in the Jiu Valley. Compared to the AROMAT-1 campaign, a larger number of instruments took part and a larger number of "golden days" occurred. All the days between 24 and 31 August 2015 led to interesting measurements. Regarding intercomparison exercises for the airborne imagers, the best days are 28 August 2015 (Jiu Valley) and 31 August 2015 (Bucharest).
Nowlan et al. also deployed an airborne DOAS nadir instrument; they reported H2CO VCDs up to 5 × 10^16 molec cm^-2 in September 2013.

Figure 8 presents the horizontal distribution of the NO2 VCDs in the Jiu Valley measured with the MPIC Mobile-DOAS on 23 August 2015 between 08:07 and 14:16 UTC. The figure shows elevated NO2 VCDs close to the four power plants listed in Table S1 of the Supplement, with up to 8 × 10^16 molec cm^-2 downwind of Turceni and Rovinari. In comparison, the area east of Craiova is very clean, with typical NO2 VCDs under 1 × 10^15 molec cm^-2. The situation of Fig. 8 is characteristic of the conditions encountered in the Jiu Valley, with high NO2 VCDs observed north and west of the plants due to the prevailing wind directions. During both campaigns, the Mobile-DOAS instruments observed maximum NO2 VCDs of up to 1.3 × 10^17 molec cm^-2 close to the plants.
Figure 10 (lower panels) presents the SO2 horizontal distributions measured around Turceni with AirMAP (lower left panel) and SWING (lower right panel) on 28 August 2015. The maps show the plume from the Turceni plant transported in the northwest direction, and other areas with elevated SO2 VCDs to the east and south of Turceni. Meier (2018) presents these AirMAP SO2 observations in detail and compares them with the SWING results. Figure S4 in the Supplement shows the corresponding time series of SWING and AirMAP SO2 DSCDs. The AirMAP-derived SO2 columns inside the plume reach 6 × 10^17 molec cm^-2, and the AirMAP and SWING SO2 VCDs agree within 10%. Moreover, for these airborne data, the SO2 horizontal distribution broadly follows that of NO2. The discrepancies can be explained by the different lifetimes of the two species.

Table 4. Summary of the AROMAT measurements of H2CO.

Table 5. Summary of the AROMAT measurements of SO2.

Table 6. Total simulated error budget for the validation of spaceborne NO2 VCDs using airborne mapping at different resolutions, with or without profile information.

Table 7. NOx emissions from Bucharest estimated from the AROMAT measurements. Note that we use the UGAL and MPIC Mobile-DOAS measurements for the estimates on 8 September 2014 and 31 August 2015, respectively.

Table 8. NOx and SO2 emissions from the Turceni power plant estimated from the AROMAT measurements.