Chasing parts in quadrillion: applications of dynamical downscaling in atmospheric pollutant transport modelling during field campaigns

Atmospheric transport and dispersion models (ATDMs) are widely used to study and forecast pollution events. In the frame of the “Effect of Megacities on the transport and transformation of pollutants on the regional to global scales” (EMeRGe) project, ATDM forecasts were carried out to identify potential airborne sampling areas of perfluorocarbons (PFCs) emanating from controlled PFC releases. The forecasts involved short-distance transport over small-scale topographic maxima (Manilla, Philippines), short-distance transport over large-scale topographic maxima (Taipei, Taiwan) and long-distance transport over mixed topography (Nanjing, China, sampled over Taiwan). In situ aircraft measurements of PFC mixing ratios down to a few parts per quadrillion (ppqv) provide us with a unique dataset to explore the added benefits of dynamical downscaling. Transport simulations were repeated using FLEXPART driven by ERA5 and IFS meteorological data and FLEXPART-WRF with dynamically downscaled IFS data down to 1.1 km and four PBL parametrisations. Of the three cases studied, dynamical downscaling led to significant differences for the Manilla and Taipei releases that can be interpreted through changes in the modelled orographic flow regimes. The choice of PBL scheme also significantly impacted accuracy, but there was no systematically better-performing option, highlighting the benefits of ensemble forecasting. Results show how convergence and divergence between ensemble members can be utilised to help decision-making during field campaigns. This study highlights the role that dynamical downscaling can play as an important component in campaign planning when dealing with observations over orographically complex areas.


Introduction
Events leading to the release of significant amounts of atmospheric pollutants are ubiquitous around the globe. Releases from natural activities represent a relatively consistent forcing in the global emission budget and include forest wildfires that can, among others, emit carbonaceous aerosols, carbon monoxide (CO) and dioxide (CO$_2$), methane (CH$_4$), and non-methane organic compounds (NMOC) (Van der Werf et al. 2010; Akagi et al. 2011; Urbanski 2014; Alvarado et al. 2020; Daskalakis et al. 2022; Lin et al. 2023a) and volcanic eruptions that can release volcanic ash and gases, primarily CO$_2$, sulphur dioxide (SO$_2$), and bromine monoxide (BrO) (Bobrowski et al. 2007; von Glasow et al. 2009). Short-lived, high-impact releases from anthropogenic activities tend to be less consistent and generally occur due to accidents at power plants, pipelines or chemical storage facilities that can lead to atmospheric releases of pollutants, such as SO$_2$, nitrogen oxides (NO$_x$), CO$_2$ and CH$_4$ (Rashad and Hammad 2000; Chatzimouratidis and Pilavachi 2007; Jia et al. 2022), aerosols (e.g. ammonium nitrate, NH$_4$NO$_3$; Ur Rehman et al. 2021) and radioactive material such as caesium ($^{134,137}$Cs), xenon ($^{133}$Xe) and iodine ($^{129,131}$I) (Stohl et al. 2012; Le Petit et al. 2014; Achim et al. 2014). In this study, the term "pollutants" will be used as a catch-all term for the various scenarios just described; however, the main focus will be on aerosols and long-lived or chemically inert species.
In all cases, the transport of released pollutants can be influenced by a variety of factors, including specific weather patterns (Fero et al. 2009; Beig et al. 2021), topography (Cécé et al. 2016; Mathieu et al. 2018; Poulidis et al. 2018; Quan et al. 2020), chemical transformations (Yadav et al. 2016; Daskalakis et al. 2022) and the physical properties of the pollutants (Bagheri et al. 2015). Before deposition or dilution to background levels, the released material can travel large distances and have adverse effects on air quality and human health (Carvalho et al. 2011; Schmidt et al. 2015; Santoso et al. 2020; Stewart et al. 2022; Milford et al. 2023) and the climate (Stocks et al. 1998; Robock 2000; Gillett et al. 2004; Randerson et al. 2006; Carvalho et al. 2011; Urbanski 2014; Bethke et al. 2017; Aubry et al. 2022; Marshall et al. 2022), creating complex international hazards. As such, understanding atmospheric pollutant transport is critical for developing effective air quality management strategies and mitigation measures (Gulia et al. 2015; Miranda et al. 2015; Qu et al. 2023).
Atmospheric transport and dispersion models (ATDMs) are a critical component of pollution hazard mitigation, used to predict the transport and deposition of pollutants in the atmosphere (Eliassen 1980; Rentai et al. 2011; Folch 2012; Leelőssy et al. 2014). Such models can be used to cover hypothetical scenarios (pre-event hazard assessment) and as emergency forecast tools. In the case of emergencies, by accurately predicting the movement of pollutants, it is possible to implement targeted mitigation measures to reduce the exposure of people and ecosystems to harmful pollutants, for example, by issuing warnings and evacuating potentially affected areas (Kukkonen et al. 2012; Webster et al. 2012; Beckett et al. 2020). Resulting products of hazard assessment (e.g. hazard maps) are used to inform stakeholders and local authorities (Jenkins et al. 2015).
Global meteorological datasets (either forecast data or reanalysis products) are commonly used to drive ATDMs as they are easily accessible and available from organisations such as the National Center for Atmospheric Research (NCAR) in the United States or the European Centre for Medium-Range Weather Forecasts (ECMWF) in Europe (e.g. Arnold et al. 2015; Macedonio et al. 2016). Such datasets can provide valuable information on large-scale weather patterns and, as such, are especially relevant when it comes to transport over synoptic scales (Takemura et al. 2011). At their current resolutions, however, they cannot accurately represent small-scale or quickly evolving variations in weather conditions that can significantly impact pollutant transport and dispersion (Costa et al. 2013). These limitations can result in inaccurate predictions of pollutant concentrations, leading to poor decision-making for air quality management. This was seen, for example, after the Eyjafjallajökull eruption in 2010, when, partially due to the inability of ATDMs to correctly represent the dispersion and diffusion of fine ash in the atmosphere (Beckett et al. 2020), an overestimation of risk cost the aviation industry a total of $2 billion (Budd et al. 2011; Schmitt and Kuenz 2015).
One way to improve the accuracy of ATDMs is by using downscaling techniques to generate high-resolution data based on global data. In the case of dynamical downscaling, a regional model is used to provide high-resolution meteorological data with improved accuracy, owing to an increase in directly resolved physical phenomena, as well as the improved representation of the local topography and land use. The overall impact of these factors is an improved representation of the Planetary Boundary Layer (PBL) and local circulations (De Meij et al. 2019; Takemi and Ito 2020). This is particularly important in areas of complex topography, where near-surface meteorological flow is driven by the complex interplay between atmospheric stability and wind characteristics at the PBL and localised circulations enforced by the topography and thermal-gradient flows, such as the sea/land breeze and mountain/valley breeze circulations (Lu and Turco 1994). Local topographical flow phenomena are commonly known as "orographic effects" and include mountain waves (i.e. oscillations after the atmosphere impinges on a mountain; Smith 1989), strong downslope winds in the lee of mountains (Durran 1990) and flow stagnation points on the windward side (Smith 1989; Smolarkiewicz and Rotunno 1989). These orographic effects depend both on the characteristics of the mountain (Ólafsson and Bougeault 1996) and on the incoming flow (Smolarkiewicz and Rotunno 1989). The expected orographic flow regime (i.e. flow around or over the mountain) is commonly characterised by the Froude number ($Fr = U N^{-1} H^{-1}$, where $U$ is the incoming wind speed in m s$^{-1}$, $N$ is the Brunt-Väisälä frequency in s$^{-1}$, and $H$ is the mountain height in m). The combination of modified PBL characteristics and orographic effects can significantly impact areas of high pollution concentration and is, thus, especially important when dealing with transport over densely populated urban areas (Cécé et al. 2016).
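As a minimal sketch of how the Froude number defined above separates flow regimes, the snippet below evaluates $Fr = U N^{-1} H^{-1}$ for two hypothetical mountains. The wind speed, stability and heights are illustrative assumptions, not values taken from the campaign, and the function name is introduced here for illustration only.

```python
def froude_number(wind_speed, brunt_vaisala, mountain_height):
    """Return Fr = U / (N * H) (dimensionless).

    wind_speed      -- incoming wind speed U in m/s
    brunt_vaisala   -- Brunt-Vaisala frequency N in 1/s
    mountain_height -- mountain height H in m
    """
    if brunt_vaisala <= 0 or mountain_height <= 0:
        raise ValueError("N and H must be positive")
    return wind_speed / (brunt_vaisala * mountain_height)


# Hypothetical inputs: same incoming flow, two different mountain heights.
U = 10.0   # m/s (assumed)
N = 0.01   # 1/s (assumed, typical stable troposphere)
fr_moderate = froude_number(U, N, 1500.0)  # 1.5 km peak
fr_tall = froude_number(U, N, 3900.0)      # 3.9 km peak

for label, fr in (("1500 m", fr_moderate), ("3900 m", fr_tall)):
    # The smaller Fr is, the more the flow tends to be blocked or deflected
    # around the obstacle rather than passing over it.
    print(f"H = {label}: Fr = {fr:.2f}")
```

For a fixed incoming flow, the taller obstacle yields the smaller Froude number, which is the sense in which tall ranges favour blocking and flow-around regimes.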
Despite potential advantages, dynamical downscaling is not without its limitations. To begin with, regional models can be computationally expensive and require significant expertise (Xu et al. 2019). Additionally, the accuracy of the results depends on the quality of the input data and the choice of model configuration options (Jankov et al. 2007; Etherton and Santos 2008; Mulena et al. 2016; Poulidis and Iguchi 2021). Due to these sensitivities, standardising model configuration is difficult and evaluation needs to be carried out on a case-by-case basis. To account for these inherent uncertainties in the models, an ensemble approach is usually needed, i.e. the statistical analysis of multiple model simulations within a well-defined parameter space (Deppe et al. 2013; Angevine et al. 2014; Kioutsioukis et al. 2016).
Here, the benefits of dynamical downscaling were studied through the lens of atmospheric/chemical field campaign management, specifically looking at tracer gas release experiments. Tracer gas release experiments are an important complementary method to acquire model evaluation data (Ren et al. 2015) and involve the release of a chemically inert tracer gas, e.g. perfluorocarbons (PFC; Ren et al. 2015) or krypton-85 ($^{85}$Kr; Connan et al. 2013) into the atmosphere. Releases can take place under controlled conditions, allowing for a choice over meteorological and topographic complexity. After the release, pollutants can be tracked and measured directly, for example through aircraft observation missions (Ren et al. 2015) or in situ measurements (Connan et al. 2013).
Dedicated dynamical downscaling during gas tracer release field campaigns can be used for improved weather forecasting, allowing for a better understanding of local meteorological and chemical processes, ultimately leading to a better representation of the tracer plume. This can be especially relevant when considering that target mixing ratios can be down to units of parts per quadrillion with respect to volume (ppqv; Ren et al. 2015). Considering the cost of such field campaigns and the difficulty of repetition, accurate knowledge of the target region is of utmost importance. Despite this, as air tracer release campaigns require careful planning that spans multiple organisations, such as local universities, civil protection and aviation authorities, it can be difficult to allocate resources for dynamical downscaling. As such, use of atmospheric transport modelling has historically been carried out over global meteorological datasets instead (Stohl et al. 1998), a choice that has persisted until recent years (Connan et al. 2013; Freitag et al. 2014; Ren et al. 2015; Andrés Hernández et al. 2022).
Considering the inherent difficulties but potential benefits of operational dynamical downscaling during a field campaign, it is clear that further research into the potential advantages is necessary. The latter are examined here, using tracer gas releases during the "Effect of Megacities on the transport and transformation of pollutants on the regional to global scales" (EMeRGe) campaign (Andrés Hernández et al. 2022) as a test case and looking into the potential differences in target areas or measurements forecast by the ATDM during the campaign. The rest of the paper is organised as follows. Section 2 contains detailed information on the EMeRGe campaign PFC releases and a full description of the modelling configuration used. Section 3 includes a case-by-case presentation of model results against observations, an analysis of the overall performance of the models employed, and an analysis of the predicted areas of high PFC mixing ratios. Section 4 is used to discuss the role of orographic effects on the cases studied here and the potential benefits of dynamical downscaling during field campaigns and emergencies, while Sect. 5 provides a summary of the main conclusions.

The EMeRGe campaign-regional setting and observations
The release of PFC tracer gas was a key component during the EMeRGe campaign, an international effort to study the impact of major population centres (MPCs) on the atmosphere in Europe and Asia (Andrés Hernández et al. 2022; Förster et al. 2023; Lin et al. 2023a), with the primary objective to improve understanding of MPC outflow plume photochemical and heterogeneous processing along transport pathways. PFC tracer gas mixing ratios were measured during the EMeRGe campaign using the perfluorocarbon tracer system (PERTRAS), developed by the DLR as a companion to Lagrangian aircraft experiments and first used in the "Stratospheric ozone: Halogen Impacts in a Varying Atmosphere" (SHIVA) campaign in November 2011 (Ren et al. 2015). PERTRAS includes a tracer release unit, an adsorption tube sampler, and an analysing system (TD-GC-NICIMS) and is highly portable, allowing its use on different platforms (ground station, ship, and aircraft). The tracer release rate ranges between 0.1 and 500 ml min$^{-1}$ (2% uncertainty). It can release up to a total of 30 kg of three different PFCs. The system's laboratory detection limit was measured at 1.2 ppqv. The tube samples were analysed in the DLR laboratory using a gas chromatographic technique as described in detail by Ren et al. (2014).
Data from the three release experiments during the EMeRGe-Asia campaign (March-April 2018) were analysed here (Table 1). PFCs were released at three locations: Manilla, Philippines (afternoon release, building terrace), Taipei, Taiwan (afternoon release, building terrace), and Nanjing, China (morning release, building terrace). The release experiments and associated simulations will be referred to based on the name of the release location for the remainder of this study. In total, two types of PFCs were released, perfluoromethylcyclohexane (PMCH; Nanjing) and perfluorodimethylcyclohexane (PDMCH; Manilla, Taipei). Releases were carried out over 1-2 h, releasing 5-10 kg of PFC at rates between 47 and 370 ml min$^{-1}$ (assuming a density of 1.8 g ml$^{-1}$), depending on the location.
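The conversion between released mass and volumetric release rate can be sketched as below, using the liquid density of 1.8 g ml$^{-1}$ assumed in the text. The mass and duration passed in the example are hypothetical round numbers chosen for illustration; the actual per-release values are those listed in Table 1.

```python
PFC_DENSITY_G_PER_ML = 1.8  # liquid density assumed in the text


def release_rate_ml_per_min(mass_kg, duration_min, density=PFC_DENSITY_G_PER_ML):
    """Mean volumetric release rate (ml/min) for a given mass and duration."""
    if duration_min <= 0 or density <= 0:
        raise ValueError("duration and density must be positive")
    volume_ml = mass_kg * 1000.0 / density  # kg -> g -> ml
    return volume_ml / duration_min


# Hypothetical example: 5 kg released steadily over one hour.
example_rate = release_rate_ml_per_min(mass_kg=5.0, duration_min=60.0)
print(f"{example_rate:.1f} ml/min")
```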
Two short-scale (Manilla, Taipei) and one long-scale (Nanjing) transport experiments were carried out. For the Manilla experiment, observations were carried out ∼20 h after the release (20.03.2018, between 0234 and 0338 UTC). The location sampled was over sea, 1.8°-2.0° to the west of the release and at an average altitude of 1402 m asl (Fig. 3a). Similarly, for the Taipei release, observations were carried out ∼22 h after the release (28.03.2018, between 0553 and 0815 UTC), covering distances 2.0°-3.4° to the south-west of the release location, along the eastern shoreline of the Taiwan island, at an average altitude of 746 m asl. Finally, for the Nanjing experiment, observations were carried out over the same location as the Taipei release (i.e. along the eastern shoreline of the Taiwan island), ∼48 h after the release (07.04.2018, between 0110 and 0209 UTC), at a distance of 7.6°-9.5° from the release point at an average height of 1106 m asl.
Importantly, the areas surrounding both measurement sites have complex topographies (Fig. 1). In the case of Manilla (Luzon island, central Philippines), the most relevant topographic features to the release site are the Zambales mountain range, the Natib and Mariveles mountains to the west and the Southern Sierra Madre range to the east (Fig. 1a). In all cases, individual peaks can reach over 1500 m, but the width of the mountain ranges is small (20-30 km). Of significance is also the Mindoro mountain range to the south-west of the release point (maximum height 2586 m). In the case of the Taipei and Nanjing campaigns, the measurements were carried out over the west-south-west coast of Taiwan. The island is composed of two main mountain ranges: the Central Mountain Range (CMR), which is the dominant topographic feature on the island and is aligned from the south to the north-east with a maximum height of 3825 m, and the Xueshan Range, which is located from the centre of the island to the northern coast with a maximum height of 3886 m. Two adjacent secondary ranges are connected to the CMR, the Yushan Range (centre-south; maximum height 3952 m) and the Alishan Range (centre; maximum height 2663 m).
In addition to the PFC observations, hourly averaged values of wind speed and wind direction from 420 surface meteorological stations of the Taiwan Central Weather Bureau (CWB) Automatic Rainfall and Meteorological Telemetry System (ARMTS; e.g. Jian et al. 2022) were used to conduct an evaluation of the relevant modelled meteorological data. The location of the stations is shown in Fig. 1. The ARMTS is a dense observation network covering Taiwan with an average nearest-neighbour horizontal distance of 7.1 km. The network coverage is better over low altitudes, with 70 stations at altitudes over 500 m and 37 stations over 1 km, creating a possible model evaluation bias.

Modelling approach
To examine differences in the simulated transport and, by extension, the target areas identified for PFC sampling, ATDM simulations were carried out for each release experiment in two ways: (1) directly over global data and (2) over dynamically downscaled data. In total, six ATDM simulations were carried out per release, two over global data and four over dynamically downscaled data using different PBL parametrisation schemes. The impact of downscaling was initially examined through an error evaluation by comparing simulated against observed values from the EMeRGe campaign. Furthermore, the evolution of the simulated PFC mixing ratio maxima was cross-examined between the two ATDM methodologies, to highlight the possible differences in suggested areas for observations during the EMeRGe campaign. The overall methodology is presented as a flow chart in Fig. 2, while details on the simulations are provided in the following sections. In all cases, ATDM simulations were carried out at a grid spacing near the native grid spacing of the meteorological data.

Table 1 PFC release details during the EMeRGe campaign
$H_{\mathrm{rel}}$ is the height of the release point, $M_{\mathrm{mol}}$ is the molecular mass of the released PFC (350 for PMCH and 400 for PDMCH), $M_{\mathrm{rel}}$ is the mass released, $H_{\mathrm{obs}}$ is the heights of the HALO observations (average in brackets) and $r_{\mathrm{B}}$ is the background mixing ratio. The sampling area is shown in Fig. 3

Fig. 2 A flowchart describing the methodology employed. In total six meteorological datasets were used as input for ATDMs, two global datasets (ERA5, IFS) and four datasets produced using the WRF model based on IFS data and different PBL parametrisation schemes. WRF results for the Taipei and Nanjing releases were evaluated against surface observations and results from all ATDM simulations were evaluated against aircraft PFC observations and were then post-processed to identify the evolution of areas of high pollutant mixing ratios

Global data simulations
ATDM simulations over global data were carried out using version 10.2 of the FLEXible PARTicle (FLEXPART) model (Pisso et al. 2019). FLEXPART is a Lagrangian transport model that is used to solve the equation of motion (also accounting for turbulent fluctuation) for a large number of "particle parcels" (Lagrangian approach; e.g. Stohl et al. 1998) over a predefined meteorological field. Turbulence within the PBL is evaluated using the Hanna parametrisation scheme (Hanna 1982). FLEXPART is one of the most commonly used Lagrangian ATDMs, tested in a variety of settings, such as pollutant plume transport (Halse et al. 2013), volcanic ash transport (Dacre et al. 2016), and nuclear accidents (Yasunari et al. 2011). Here, two forward-time simulations were carried out per release point, using ECMWF's hourly ERA5 (Hersbach et al. 2020) and IFS (Roberts et al. 2018) datasets. Although the focus of the study is nominally on forecasting, ERA5 data are used here instead of the NCAR Global Forecasting System (GFS; Saha et al. 2014) as they share resolution (with a grid spacing of 0.25° or ∼30 km), but allow for an additional cross-dataset examination, i.e. high-resolution forecast data against lower resolution reanalysis data, while at the same time providing a representation of the accuracy expected at that resolution. In all simulations, $10^6$ trajectories were used. PFCs were included as passive tracers (i.e. no dry or wet deposition or chemical transformations) with the appropriate molecular mass for each release (Table 1). Model output was set at 10 min as the average values of 1-min trajectory states. Gridded mixing ratio is calculated internally in the model based on given output grids surrounding the areas of aircraft measurements. A horizontal grid spacing of 0.25° (0.09°) was chosen for the ERA5 (IFS) simulations, while in the vertical all simulations follow the grid height profile of the ECMWF 137 model levels (e.g. see Hersbach et al. 2020) up to a height of 6 km asl, then switch to one level per 1 km up to 10 km, topped with a final level until 20 km. In all cases, output data are used between the surface and 6 km asl.

Dynamical downscaling simulations
The Weather Research and Forecasting (WRF) numerical weather prediction model (version 4.2; Skamarock et al. 2019) was used to dynamically downscale IFS data for the three tracer release locations used. For all simulations, one-way nesting was used starting from an outer domain grid spacing ($\Delta x$) of 10 km, progressively focusing down to $\Delta x$ = 1.1 km over the study areas (Fig. 3). In the Nanjing release simulations, as the emission source is too far away to be included in the domain encompassing the measurements, an additional high-resolution domain ($\Delta x$ = 1.1 km) was used to improve the model representation around the release area. In the vertical, 59 levels were used for all simulations, with 24 levels in the bottom 2 km to allow for the appropriate in-model representation of the wind field within the PBL. The first 12 h of all simulations are used to account for spin-up time. The model top was set at 50 hPa, while a Rayleigh damping layer was imposed at the top 5 km to reduce the errors from spurious gravity waves being reflected on the top of the domain (Klemp et al. 2008). Model output interval was set at 3 h for the outermost domain, 30 min for D2, and 10 min for domains D3 and D4. In the case of the Nanjing release the output interval of D2 was also set at 10 min so that data from domains D2-4 can be used for ATDM simulations. The model domain placement is shown in Fig. 3 and summarised in Table 2.
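The progression from a 10 km outer domain to 1.1 km can be illustrated with a short sketch. It assumes a constant 3:1 parent-to-child grid ratio, which is a common choice in nested WRF set-ups but is not stated in the text; the actual domain configuration is the one summarised in Table 2. The function name is hypothetical.

```python
def nested_grid_spacings(outer_km, ratio, n_domains):
    """Grid spacing (km) of each nest, from the outermost domain inwards."""
    if ratio <= 1 or n_domains < 1:
        raise ValueError("ratio must exceed 1 and n_domains must be >= 1")
    return [outer_km / ratio**level for level in range(n_domains)]


# Assumed 3:1 ratio: two nesting steps take 10 km to roughly 1.1 km.
spacings = nested_grid_spacings(outer_km=10.0, ratio=3, n_domains=3)
for level, dx in enumerate(spacings, start=1):
    print(f"nest level {level}: dx = {dx:.2f} km")
```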
Within the WRF model, the Global 30 Arc-Second Elevation digital elevation model (∼900 m resolution; Gesch et al. 1999) and the Moderate Resolution Imaging Spectroradiometer (MODIS) Land-data processing dataset (Justice et al. 2002) were used as the sources for the topography and land use classification, respectively. Figure 4 highlights the changes in the representation of topography and land use at different scales in two areas studied here, i.e. around Manilla, Philippines and the island of Taiwan. The three different scales are shown for a grid spacing ($\Delta x$) of 1.1 km (the resolution used here and the nominal limit of the majority of mesoscale model PBL schemes), 10 km (representing the ECMWF's Integrated Forecasting System; IFS) and 31 km (representing the ECMWF's European Reanalysis 5; ERA5). The loss of fidelity is evident as grid spacing increases, in terms of both land use (classification details lost) and topography (significant smoothing of the topographic maxima), which significantly impact the representation of the PBL in the model. The smoothing of topography at lower resolutions particularly affects the area around Manilla as most mountains are relatively tall (> 1 km) but narrow (20-30 km), meaning that they are not wide enough to be properly represented in the model.
A PBL parametrisation scheme is used in the model to predict tendencies of the prognostic variables due to unresolved turbulent motions. PBL schemes can largely be divided into two groups, depending on the closure schemes employed: first order (1.0 non-local) and one-and-a-half or higher order (1.5, 2, 3 local), also known as turbulence kinetic energy (TKE) closure. Non-local schemes do not require additional prognostic equations to express the effects of turbulence on mean variables, while TKE closure schemes require one additional prognostic equation for TKE.
The two categories differ in two main ways.One is related to the extension of the region that is able to affect the PBL variables at one point: in local closure schemes, only the vertical layers symmetrically adjacent to a specific point can directly affect the variables at that point, while non-local schemes also consider variables in deeper (potentially multiple) layers.Furthermore, the PBL height calculation is different between the two categories: in local schemes the height is determined as the level at which the TKE profile decreases to a threshold, while in non-local schemes it is calculated as one level above height at which the bulk Richardson number (Ri b ) exceeds a threshold.The general expectation is that use of a suitable higher-order turbulence closure would lead to increased accuracy, but is tied with higher computational costs as it involves more prognostic equations that often require shorter integration time steps.
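The two PBL height diagnostics described above can be sketched on a single idealised column as follows. This is an illustration only and not the implementation used in any specific WRF scheme: the bulk Richardson number is written in its standard textbook form, $Ri_b = g\,(\theta_v - \theta_{v,0})\,z \,/\, [\theta_{v,0}\,(u^2 + v^2)]$, and the thresholds (0.25 for $Ri_b$, 0.1 m$^2$ s$^{-2}$ for TKE) as well as the profile values are assumptions made for this example.

```python
G = 9.81  # gravitational acceleration, m s-2


def bulk_richardson(theta_v, theta_v_sfc, z, u, v):
    """Bulk Richardson number between the lowest level and height z."""
    wind2 = max(u * u + v * v, 1e-6)  # guard against division by zero
    return G * (theta_v - theta_v_sfc) * z / (theta_v_sfc * wind2)


def pbl_height_nonlocal(z, theta_v, u, v, ri_crit=0.25):
    """One level above the first level where Ri_b exceeds ri_crit."""
    for k in range(1, len(z)):
        if bulk_richardson(theta_v[k], theta_v[0], z[k], u[k], v[k]) > ri_crit:
            return z[min(k + 1, len(z) - 1)]
    return z[-1]


def pbl_height_local(z, tke, tke_min=0.1):
    """First level where the TKE profile drops below tke_min."""
    for k in range(len(z)):
        if tke[k] < tke_min:
            return z[k]
    return z[-1]


# Idealised column (all values hypothetical).
z = [10.0, 100.0, 300.0, 600.0, 1000.0, 1500.0]      # m
theta_v = [300.0, 300.0, 300.1, 300.2, 302.0, 305.0]  # K
u = [2.0, 5.0, 6.0, 7.0, 8.0, 10.0]                   # m/s
v = [0.0] * len(z)                                    # m/s
tke = [0.8, 1.0, 0.9, 0.5, 0.05, 0.01]                # m2 s-2

h_nonlocal = pbl_height_nonlocal(z, theta_v, u, v)
h_local = pbl_height_local(z, tke)
print(h_nonlocal, h_local)
```

On this column the two diagnostics return different levels, which illustrates why PBL heights from local and non-local schemes are not directly interchangeable.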
The WRF model's representation of near-surface meteorological flow has a known and extensively studied sensitivity to the PBL scheme (e.g.Borge et al. 2008;Shin and Hong 2011;Gómez-Navarro et al. 2015;Banks et al. 2016;Avolio et al. 2017;Gunwani and Mohan 2017;Onwukwe and Jackson 2020).Despite the large number of studies conducted to evaluate WRF model configurations, results are commonly inconclusive (Shin and Hong 2011;Borge et al. 2008; Gunwani and Mohan 2017; Onwukwe and Jackson 2020), hinting at the systematic uncertainties tied with mesoscale models (Hanna and Yang 2001;Rife et al. 2004).As the releases studied here occurred within the PBL, two non-local and two local closure PBL schemes were used to create a "mini-ensemble".The  conditions, in addition to the explicit non-local transport in unstable conditions.Bougeault-Lacarrere (BouLac; Bougeault and Lacarrere 1989) and the Mellor-Yamada Nakanishi Niino Level 3 (MYNN3; Nakanishi and Niino 2006) schemes are 1.5 and 2-order, respectively.BouLac was developed for the meso-beta-scale and uses a prognostic equation for TKE, with the second-order moments parameterised based on an eddy coefficient approximation.The turbulence length scale is designed to capture turbulence induced by gravity waves over complex terrain and is based on the mean TKE of the layer and buoyancy.Finally, the MYNN3 scheme expresses stability and mixing length based on the results of large-eddy simulations rather than observations to minimise underestimation of TKE and convective layer growth associated with other Mellor-Yamada-type schemes.
Other than the PBL parametrisation, a full physics suite was used, consistent for all simulations: the Goddard microphysics scheme (Tao et al. 1989), the New Goddard radiation scheme (Chou and Suarez 1999; Chou et al. 2001), the Revised MM5 surface layer scheme (Gómez-Navarro et al. 2012), the Unified Noah Land Surface Model (Tewari et al. 2004), and the Kain-Fritsch cumulus scheme (Kain 2004), which was only used for the outermost domain. During the simulation design, the decision was made to focus solely on the PBL schemes, as they were expected to have the largest impact on the simulation results, due to the specific conditions of the three scenarios examined. The WRF model is also known to have sensitivities to the microphysical and land-surface parametrisation schemes (Borge et al. 2008; Baró et al. 2015; Rizza et al. 2018; Abdi-Oskouei et al. 2020). When it comes to pollutant transport, the microphysics scheme can be expected to have an impact when there is significant precipitation (Borge et al. 2008); however, for the cases studied there was little precipitation in the areas of interest between the release and observations of the PFCs. The sensitivity to land-surface schemes is commonly studied using online chemistry models such as WRF-chem (Grell et al. 2005), due to its combined impact on the near-surface flow characteristics (Misenis and Zhang 2010) and the emissions of dust (Rizza et al. 2018; Abdi-Oskouei et al. 2020) and biogenics (Zhao et al. 2016). However, in the cases studied here, only specified point emissions were introduced in the model, reducing the expected impact of the land-surface model. As such, only the PBL scheme was varied in the simulations presented here.
The FLEXPART-WRF model (version 3.1) was used to carry out ATDM simulations based on the dynamically downscaled meteorological data.FLEXPART-WRF is a specialised version of the FLEXPART model, adapted from version 9.02 to carry out Lagrangian calculations directly over WRF model output (Brioude et al. 2013).
As with the FLEXPART model, FLEXPART-WRF has been used in various ATDM applications from regional transport of pollutants (Madala et al. 2016) down to large-eddy simulations (Cécé et al. 2016). Mirroring the simulations over global data, a total of $10^6$ particle trajectories were used per release. However, particle splitting (i.e. splitting of a single trajectory into two, each representing half of the pollutant mass; Stohl et al. 2005) was used every 3 h to account for the increase in the simulated wind field complexity, leading to a total number of particles of $2.5\times10^{8}$ for the Manilla case, $5.1\times10^{8}$ for the Taipei case and $4.1\times10^{9}$ for the Nanjing case. FLEXPART-WRF simulations were carried out for each model domain, following the grid spacing of the original WRF data. In the vertical, grid heights follow the vertical profile of WRF levels up to 6 km asl, then switch to one level per 1 km up to 10 km, topped with a final level until 20 km. To account for turbulence, the Hanna scheme was used instead of turbulence estimates from the WRF model, as the latter has been shown to lead to insufficient mixing within the PBL (Brioude et al. 2013).
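The growth in particle number under repeated splitting can be checked with a short calculation. The sketch below assumes that every particle is split at each splitting step, so that the count doubles each time; the numbers of doublings used (8, 9 and 12) are inferred here because they reproduce the quoted totals to two significant figures, and are not stated in the text.

```python
def particles_after_splitting(n_initial, n_splits):
    """Particle count after n_splits doublings, each particle carrying
    1 / 2**n_splits of its original mass."""
    return n_initial * 2**n_splits


N_INITIAL = 10**6  # trajectories released per simulation (from the text)

# Number of doublings per case: an inference, not a value given in the text.
assumed_doublings = {"Manilla": 8, "Taipei": 9, "Nanjing": 12}

counts = {
    name: particles_after_splitting(N_INITIAL, k)
    for name, k in assumed_doublings.items()
}
for name, total in counts.items():
    print(f"{name}: {total:.2e} particles")
```

Because the count doubles at every step, each additional 3 h of simulated transport doubles the memory and compute cost under this assumption, which makes the long-range Nanjing case by far the most expensive of the three.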

Evaluation metrics
During the EMeRGe campaign, the PERTRAS system carried out continuous measurements, with output every 100 s. In order to compare model results against the measurements, values in the model were sampled every 5 s to account for the aeroplane's movement. All error analysis was carried out based on the average modelled PFC mixing ratio. Two commonly used error metrics (based on the error $E = P - O$, where $O$ is the observed value and $P$ is the prediction) are the Root Mean Square Error (RMSE $= \sqrt{\langle E^2 \rangle}$, where the angular brackets signify averaging over a chosen dimension) and the Mean Bias Error (MBE $= \langle E \rangle$). Despite its ubiquitous use in the literature, RMSE has well-documented asymmetries: for a variable with values in the non-negative real numbers, the error due to overestimation tends to infinity, while the error for underestimation tends to the average observation. For linearly changing quantities without a systematic bias, this is often neglected; however, > ) were used for the mixing ratio-based errors to provide an unbiased evaluation of the transport modelling; however, all values of PFC mixing ratio ($r_{\mathrm{PFC}}$) less than 1 ppqv were replaced by 0.1 to account for the device's detection limit. Finally, Kendall's rank correlation ($\tau$) was used for the mixing ratio instead of the more commonly used Pearson's correlation coefficient (p) as it is invariant under monotone transformations and thus more appropriate for exponentially changing variables (Kursa 2022). Conventional error metrics (RMSE, MBE) were used for the meteorological data analysis to provide values comparable to the literature.
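The conventional metrics defined above, together with Kendall's rank correlation, can be written out as a minimal pure-Python sketch. The Kendall coefficient is implemented in its simplest form (tau-a, with tied pairs counted as neither concordant nor discordant), which is a choice made for this illustration; the sample values are hypothetical and are not campaign data.

```python
import math


def rmse(pred, obs):
    """Root mean square error, sqrt(<E^2>) with E = P - O."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mbe(pred, obs):
    """Mean bias error, <E> with E = P - O."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical mixing ratios in ppqv.
observed = [1.5, 3.0, 12.0, 40.0]
predicted = [2.0, 2.5, 30.0, 25.0]

tau = kendall_tau(predicted, observed)
# Rank correlation is unchanged by a monotone transformation such as log.
tau_log = kendall_tau([math.log(p) for p in predicted],
                      [math.log(o) for o in observed])

print(f"RMSE = {rmse(predicted, observed):.2f} ppqv")
print(f"MBE  = {mbe(predicted, observed):.2f} ppqv")
print(f"tau  = {tau:.3f}, tau(log) = {tau_log:.3f}")
```

In this example the RMSE is dominated by the two largest values, while the rank correlation is identical before and after the logarithmic transformation, which is the invariance property referred to in the text.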

Meteorological flow and PFC mixing ratios during the observation period

Manilla release
In order to have a visual representation of the PFC transport and the meteorological flow field, the simulated and observed PFC mixing ratio (as a maximum across all measurement heights) is shown in Fig. 5, overlaid with the average wind field during the HALO measurements (shown as streamlines). A more detailed comparison of the modelled against observed values follows in Fig. 6. Across all simulations, the broad characteristics of the flow are similar: easterly winds during the release and subsequent transport led to westwards PFC dispersal, following the climatologically expected conditions (Matsumoto et al. 2020). Significant deviations, however, occur over and to the west of the Zambales mountain range. In both global datasets, there is a relatively smooth transition from an anticyclonic circulation to the north of the domain to easterlies in the middle and southern parts of the domain, with the centre point situated to the west of the Zambales range (Fig. 5a, b). This transition leads to northerlies over the island of Luzon, parallel to the range. The impact of the topography at all domain points is significantly different in the WRF simulations. However, the specific characteristics of the resolved flow depend on the PBL scheme (Fig. 5c-f). In the northern parts, there is no consistent anticyclonic flow west of the mountain range in all simulations, due to the complex representation of the lee flow over the mountain range, which shows a number of gap flows as individual peaks are resolved in the model. This behaviour leads to a southward shift in the dispersal direction. West of the [...]

The varied representation of the atmospheric flow leads to significant differences in the simulated PFC transport. In the simplest case, FLEXPART simulations based on ERA5 data show westwards advection closely following the average wind field (Fig. 5a). The PFC plume is elongated laterally with a single area of high PFC values stretching between 14.2° N and 15.2° N. Only a small part of the plume has been separated, located to the south-east of the main plume. Similar to the ERA5 case, the FLEXPART-IFS simulation also results in the plume situated to the west of the release point; however, there is also part of the plume lingering over the island of Luzon (Fig. 5b). This is caused by blocking of the surface flow, since at the IFS data resolution the Zambales range (west of the release point) is at least partially resolved. Unlike the ERA5-based simulations, the main plume shows a more complex structure with three local maxima, due to the impact of the Zambales range. Still, flow is largely towards the west of the release point, extending between 14 and 15.5° N. Simulated flow for both datasets shows that the HALO observations were carried out over the plume's location, but model results show significant overestimation.
In all WRF-based simulations, the plume shows a complex structure with multiple maxima reflecting the resolved atmospheric flow (Fig. 5c-f). The most significant change between the FLEXPART and FLEXPART-WRF simulations is the southwards component of the flow in the lee of the Zambales range, combined with overall higher wind speeds in the WRF data. This leads to similarities in the overall flow for all WRF-based simulations: westward to south-westward dispersal, with the maximum concentration of the plume near the lower left corner of the domain. Significant deviations from this are only seen for the WRF[MYNN3] simulation (Fig. 5f), where the southwards shift west of the release point is less significant, leading to most of the plume having already exited the domain by the time of the observations (i.e. the time period shown in Fig. 5). The resolution of the topography over the Zambales range leads to blocking west of the release point, causing part of the PFC plume to remain over Luzon, mirroring the IFS simulation results.
The overall southwards shift in the plume transport suggests that the HALO observations were carried out over areas of low PFC mixing ratios. This is also reflected in the observed mixing ratio values (Fig. 6). During the first hour of sampling, elevated values ($r_{\mathrm{PFC}} > 10$ ppqv) were only noted in two instances (Fig. 6a). Both simulations based on global data overestimated $r_{\mathrm{PFC}}$; for the ERA5-driven simulation this is true for the duration of the measurement period, while for the IFS-driven simulation this is mainly true for the first 20 min of the observations. On the other hand, $r_{\mathrm{PFC}}$ estimates from the WRF-based simulations remain within the range of the observed values; however, all simulations underestimate mixing ratios at the start of the observation period.
A comparison of the simulated against observed $r_{\mathrm{PFC}}$ range is shown in Fig. 6b. Aside from a few outliers, observations ranged between 1 and 10 ppqv for the duration of the measurement campaign. ERA5-based FLEXPART simulations systematically overestimate values by O(1), while IFS ranges from an O(1) overestimation to an O(2) underestimation, with the median still being an overestimation. FLEXPART-WRF $r_{\mathrm{PFC}}$ values are within the expected range for all PBL schemes except MYNN3, which led to systematic underestimation due to high wind speeds. Of the four PBL schemes, ACM2 provided the best match, followed by the BouLac scheme. The vertical PFC profiles in all cases show that the tracer gas is not mixed well within the PBL, with elevated tracer values near the PBL height for all cases except the MYNN3 simulation, where the less-mixed head of the plume has already passed by the sampled area. Overall, the calculated RMSLE values ranged between 1.21 and 1.59 (MBLE of 0.2 to 1.3, i.e. overall overestimation) for the FLEXPART simulations and between 1.05 and 1.2 (MBLE of −0.91 to −0.47, i.e. underestimation) for the FLEXPART-WRF simulations. All FLEXPART-WRF simulations ranked better than the FLEXPART simulations, with the best-matching simulation being FLEXPART-WRF[BouLac].

Taipei release
The Taipei experiment allows for the examination of a different transport setting, as the dominant elements in the topography were the main ridges on the island, which are overlaid with a number of smaller peaks that can reach a height of ∼4000 m asl. Even at a lower resolution, the main ridge structure of the island remains; however, individual peaks are smoothed, and the height of the island decreases (Fig. 4d-f). Despite the presence of the main ridge structure at low resolutions, the overall smoothing of the orography had a significant impact on the resolved flow around the island in the different models (Fig. 7). In all cases, the incoming wind direction is easterly to north-easterly (i.e. within the climatological expectations for March; Cheng 2001), meaning that the incoming flow meets the island of Taiwan at an angle. Results from all models show flow splitting on the eastern (windward) side and vortices on the western (lee) side; however, there are distinct differences between the global datasets (Fig. 7a, b), which show strong southerly winds to the west of Taiwan and a small lee vortex area over the southern tip of the island, and the WRF simulations (Fig. 7c-f), which feature a more extended lee area south-south-west of the island. The elongated vortices on the lee side better agree with the theoretically expected flow, given an incidence angle of −30° as described in Wells et al. (2008).

Fig. 7 As Fig. 5, but for the Taipei release and heights between 0.5 and 1.0 km
The different representation of flow on the release side and to the west of Taiwan leads to very different representations of PFC transport in the models. In the FLEXPART simulations (Fig. 7a, b), the PFC plume was transported as a nearly homogeneous unit towards the south-western coast of Taiwan. In the FLEXPART-WRF simulations, there were two significant flow splitting points on the western side of Taiwan, independently of the PBL scheme used (Fig. 7c-f): one immediately west of the release point (approximately 120.5° E, 25° N) and one on the south-western coast, similar to the global data simulations. These additional points of flow splitting lead to a divided PFC plume, with the main part of the PFC plume transported directly west of the release site and a secondary plume transported to the south of Taiwan, across the area of the HALO observations. Of the four PBL schemes used, three lead to a qualitatively similar flow (Fig. 7c-e). The exception is the MYNN3-based simulation, where the westwards-headed part of the plume has already left the computational domain by the time of the observations (Fig. 7f). A significant part of the tracer was transported to the eastern side of Taiwan.
PFC mixing ratio measurements were carried out in two periods on the same day, from 0550 to 0610 UTC and from 0800 to 0815 UTC, i.e. 24-26 h after the release. Elevated mixing ratio values over a background of 8 ppqv were observed in both periods, up to 27 ppqv in the first and 4.7 ppqv in the second (Fig. 8a). As with the Manilla case, ERA5-driven FLEXPART simulations overestimate concentrations during the initial measurement period, while IFS-driven simulations show a realistic representation of the upper range, but underestimate the lower range (Fig. 8a, b). FLEXPART-WRF simulations, on the other hand, tend to systematically underestimate the observations (Fig. 8b). However, the performance depends on the PBL scheme. The YSU PBL scheme produces the best match, capturing the quartiles and median values but not the full upper range, while the ACM2 scheme underestimates the PFC mixing ratio by up to O(2). In all models, the PFC vertical profile is not well mixed and tends to have multiple maxima with altitude, reflecting the complexity of the flow around the island (Fig. 8c). The FLEXPART simulations are associated with RMSLEs between 0.94 and 1.06 (MBLEs of 0.02 for the ERA5-driven simulation and −0.57 for the IFS-driven simulation), while the FLEXPART-WRF simulations are associated with RMSLEs between 0.99 and 1.48 (MBLEs of −1.4 to −0.7, i.e. systematic underestimation). Due to the significant underestimation of the observed mixing ratios by FLEXPART-WRF, the best-performing model configuration was FLEXPART[ERA5], followed by FLEXPART-WRF[YSU]. One point to acknowledge is that the Taipei case features the lowest amount of HALO data (approximately 35 min, leading to only 9 data points), potentially introducing a bias in the analysis.

Fig. 8 As Fig. 6, but for the Taipei release

Nanjing release
The Nanjing release is a case of long-range transport modelling. Compared to the Taipei release, a small change in the wind direction (north to north-easterly), and hence in the incidence angle of the incoming flow (from −30° to ∼0°), significantly simplifies the flow around the island, leading to flow splitting along the small axis of the island and a minimal affected area (Fig. 9). In all simulation results, the resolved flow is similar and adheres to the theoretical expectations for the flow regime (Cheng 2001). Most of the orographically based impact occurs either over the island (where there are areas of reverse flow) or on its southern (i.e. lee) side; however, the PFC plume is separated from the generated vortices. The most important difference between the global and the dynamically downscaled data is that the overall wind velocity is increased in the latter, leading to an approximately 1° southward shift in the centre of the PFC plume (i.e. comparing Fig. 9a, b to c-f).
The characteristics of the simulated PFC tracer mixing ratios are very similar across all model setups, reflecting the similarities in the overall meteorological flow and transport (Fig. 10). Observed PFC mixing ratios ranged between 0 and 7 ppqv over the background, with one exception at 20 ppqv. Most model setups result in a slight overestimation of the average; however, the observed range is largely reproduced for all models (Fig. 10a, b). Notably, no model was able to reproduce the short-lived maximum in the observed values. Unlike the short-range transport simulations of the Manilla and Taipei releases, results for the Nanjing release were qualitatively similar for all setups, with the RMSLE ranging between 0.65 and 0.68 (MBLEs of −0.27 to 0.32) for FLEXPART and between 0.64 and 0.74 (MBLEs of −0.34 to 0.33) for the FLEXPART-WRF simulations, with the best-performing configuration being FLEXPART-WRF[YSU]. The lowest RMSEs were associated with the FLEXPART[IFS] and FLEXPART-WRF[YSU] simulations, respectively. As expected for a long-range transport simulation, results from all simulations showed a well-mixed profile within the PBL (Fig. 10c).

Fig. 9 As Fig. 5, but for the Nanjing release

Overall model performance evaluation
The overall comparison between all model simulations carried out, including all 3 WRF resolutions, is shown in Fig. 11. As was highlighted during the individual case analysis, comparing FLEXPART to FLEXPART-WRF simulations did not show a systematic reduction of errors, although individual FLEXPART-WRF configurations did provide the best results in two out of the three release experiments (Fig. 11a). The RMSE of $r_{\mathrm{PFC}}$ was included to emphasise the possible bias in a case such as the one studied here (Fig. 11b): relative to the RMSLE, all RMSE values for FLEXPART-WRF simulations tend to be artificially low compared to FLEXPART simulations, due to the respective systematic underestimation and overestimation of $r_{\mathrm{PFC}}$ (Fig. 11c).
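The bias introduced by the RMSE in a case like this can be made concrete with a single hypothetical observation. In the sketch below, one prediction is a factor of ten too low and another a factor of ten too high; the values are invented for illustration, and the base-10 logarithmic error is an assumption about the form of the logarithmic metrics.

```python
import math

def linear_error(pred, obs):
    """E = P - O, the error entering RMSE and MBE."""
    return pred - obs

def log_error(pred, obs):
    """ASSUMPTION: base-10 logarithm of P/O as the logarithmic error."""
    return math.log10(pred / obs)

obs = 10.0         # hypothetical observation in ppqv
under = obs / 10   # prediction a factor of ten too low
over = obs * 10    # prediction a factor of ten too high

# The linear error penalises the overestimate ten times more strongly...
print("linear:", linear_error(under, obs), linear_error(over, obs))
# ...while the logarithmic error treats both as one order of magnitude.
print("log10: ", log_error(under, obs), log_error(over, obs))
```

A model that systematically underestimates can therefore never accumulate a linear error larger than the observation itself, whereas an overestimating model has no such bound.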
Between the two FLEXPART configurations, despite being based on forecast data, the IFS dataset consistently led to higher simulation accuracy for all metrics except for the Taipei case. When comparing the FLEXPART against the FLEXPART-WRF simulations, there was no systematic pattern of change between the different WRF resolutions. In the case of the Manilla release, RMSLE and the Kendall correlation coefficient slightly improve, but MBLE worsens with increased resolution (i.e. overall improved error and correlation, but consistent underestimation). In the Taipei release case, there is significant improvement between FLEXPART[IFS] and FLEXPART-WRF at a 10-km grid spacing for all metrics and most PBL schemes, but accuracy overall decreases with increased resolution. The Nanjing case, on the other hand, shows a relatively consistent increase in accuracy with increasing resolution.
When comparing the four PBL schemes used for the FLEXPART-WRF simulations, the BouLac scheme performed better than the other schemes at lower resolutions, which can be expected as it is a meso-beta scale parametrisation (Bougeault and Lacarrere 1989). Overall, the YSU scheme performed most consistently across all resolutions (in agreement with past studies on WRF PBL sensitivities; Banks et al. 2016; Avolio et al. 2017), while the MYNN3 scheme ranked the lowest. However, across all schemes, the differences in performance start relatively large at low resolution and converge towards similar values as resolution increases.
Up to this point, the evaluation focused exclusively on the observed PFC mixing ratios. An important question is whether the performance results for the transport modelling correlate with an evaluation against meteorological data from surface stations. To answer this, observations from the CWB meteorological station network were compared against near-surface simulation data for [...]

On both days, the overall evolution of wind speed and wind direction is similar, with relatively weak E to NE winds and slightly increased wind speeds observed during the Nanjing release experiment. On both days, and irrespective of the PBL scheme, the model is able to broadly replicate the evolution of wind speed (albeit with an average overestimation) and shows a very good reproduction of the observed wind direction. All PBL schemes lead to very similar behaviour in the resolved flow characteristics; only the YSU and BouLac schemes show a large error in wind direction roughly 18 h after the tracer release, but this is tied to very small wind speed values (Fig. 12c). The overall error diagnostics for the two simulations are shown in Table 4.
The distribution of wind speed and wind direction errors over Taiwan shows systematic differences between the two quantities (Fig. 13). Wind speed (first and third columns in Fig. 13) tends to be associated with higher RMSE values at the northernmost and southernmost edges of the island and mainly over low heights. In contrast, wind direction (second and fourth columns in Fig. 13) tends to be associated with increased RMSE values at the stations with higher elevation. The overall distribution of the errors over the island is similar between the Taipei (first and second columns) and the Nanjing (third and fourth columns) release days; however, wind speed during the Nanjing experiment is associated with overall higher RMSE values. As with the results from the station subset (Fig. 12), there are only minor differences between the errors for the different PBL schemes.
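Wind direction is a circular quantity, so a naive difference between, for example, 350° and 10° would give an error of 340° rather than 20°. The text does not state how the wind direction errors were computed; the sketch below shows one common approach, assumed here for illustration, in which the difference is wrapped into the interval [−180°, 180°) before RMSE and MBE are evaluated. The function names are hypothetical.

```python
import math

def direction_error(pred_deg, obs_deg):
    """Signed smallest angular difference P - O, wrapped into [-180, 180)."""
    return (pred_deg - obs_deg + 180.0) % 360.0 - 180.0

def direction_rmse(pred, obs):
    """RMSE of the wrapped wind direction errors, in degrees."""
    errors = [direction_error(p, o) for p, o in zip(pred, obs)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def direction_mbe(pred, obs):
    """MBE of the wrapped wind direction errors, in degrees."""
    errors = [direction_error(p, o) for p, o in zip(pred, obs)]
    return sum(errors) / len(errors)

# Hypothetical modelled and observed wind directions in degrees.
pred = [350.0, 10.0, 95.0]
obs = [10.0, 350.0, 90.0]
print([direction_error(p, o) for p, o in zip(pred, obs)])
print(direction_rmse(pred, obs), direction_mbe(pred, obs))
```

With this wrapping, errors on either side of north partially cancel in the MBE while still contributing to the RMSE.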
In order to examine the relation between the errors of the wind quantities and the observed wind speed, the RMSE and MBE of wind speed and wind direction for the merged dataset of the Taipei and Nanjing simulations are plotted against each other, shaded using the average observed wind speed (Fig. 14). For both wind speed and wind direction, 67% of the data are contained close to the RMSE and MBE values of the simulations (shown with the black contour in Fig. 14); however, in the case of wind speed the distribution is heavily skewed towards positive MBE values (model overestimation), while for wind direction it is symmetric across the x-axis.
The two wind-related quantities also show different behaviour when it comes to the higher errors. In the case of wind speed, high RMSE values are associated with either systematic overestimation (positive MBE; mainly occurring with low average wind speed values) or systematic underestimation (negative MBE; only high wind speed values). On the other hand, the wind direction MBE remains close to the zero MBE line even for higher RMSE values. Unlike wind speed, high wind direction RMSE values are associated primarily with low wind speed values. As with Figs. 12 and 13, the different PBL schemes do not lead to any systematic differences in the results.
In order to draw out broad conclusions for the impact of the PBL parametrisation, each option used was ranked based on the RMSE or RMSLE of wind speed, wind direction and tracer concentration. Results were ranked for each simulation, and the overall ranking based on the average across all simulations is shown in Table 5. Out of all PBL options, YSU was the best performing in terms of both the wind field and the tracer mixing ratio, followed by the ACM2 scheme, both being first-order nonlocal closure schemes. The BouLac scheme performed well based on the tracer transport, but is the worst performer for the meteorological evaluation, while MYNN3 performs best for the wind speed, but fidelity decreased for the wind direction and tracer transport. Still, as seen with the actual RMSE values (i.e. Table 4; Fig. 11), the differences are very small and often come down to decimal places.
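The ranking procedure described above can be sketched as follows: within each simulation, the options are ordered by their error (lowest error receives rank 1), and the ranks are then averaged across simulations. The scheme labels and error values below are placeholders invented for illustration and are not the values behind Table 5; ties are ignored in this minimal version.

```python
def mean_ranks(errors):
    """Average rank per option.

    errors: {simulation: {option: error}}, where a lower error is better.
    ASSUMPTION: no tied errors within a simulation.
    """
    ranks = {}
    for by_option in errors.values():
        ordered = sorted(by_option, key=by_option.get)  # best option first
        for rank, option in enumerate(ordered, start=1):
            ranks.setdefault(option, []).append(rank)
    return {option: sum(r) / len(r) for option, r in ranks.items()}

# Placeholder error values for three hypothetical options.
errors = {
    "simulation_1": {"A": 1.0, "B": 2.0, "C": 3.0},
    "simulation_2": {"A": 2.0, "B": 1.0, "C": 3.0},
}
overall = mean_ranks(errors)
for option in sorted(overall, key=overall.get):
    print(option, overall[option])
```

Averaging ranks rather than raw errors keeps one simulation with unusually large errors from dominating the overall ordering, at the cost of hiding how small the underlying differences may be.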

Impact of downscaling on PFC mixing ratio maxima
Despite potential benefits in the representation of the meteorological flow (e.g. Takemi and Ito 2020), there was no systematic improvement in the representation of the mixing ratio for the simulations presented. Although at least one FLEXPART-WRF configuration performed better than (or, in the Taipei release, on par with) the FLEXPART model, no PBL scheme led to a systematically better simulation. This outcome is also to be expected, as the sensitivity to the PBL scheme is one of the most well-known and studied aspects of the model (e.g. García-Díez et al. 2013; Banks et al. 2016; Avolio et al. 2017; Gunwani and Mohan 2017). Still, irrespective of the PBL [...]

Considering the low PFC values observed, based on the IFS-driven forecast simulations, it is also important to analyse the results in terms of the predicted "observation target area", i.e. the area where, according to the models, high PFC values were expected (and, by extension, the most appropriate area to sample). To do this, FLEXPART and FLEXPART-WRF were analysed as two mini-ensembles. For each ensemble member, areas with PFC mixing ratios over 3 and 10 ppqv were identified as "target areas", which were then averaged over the ensemble to allow for a probabilistic approach. Areas with at least a 50% probability of measurements over each limit (i.e. agreement of two or more different PBL simulations in the case of FLEXPART-WRF) are shown in Figs. 15 and 16, shaded depending on the occurrence timing of the maxima in hours before the end of the observation period.
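The probabilistic target area described above can be sketched as a per-cell exceedance probability across ensemble members. The example below is a minimal illustration on a tiny hypothetical grid: the member fields are invented, the function names are not from the original workflow, and the shading by occurrence time used in Figs. 15 and 16 is omitted.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members exceeding `threshold` in each grid cell.

    members: list of 2D grids (lists of rows) of mixing ratio in ppqv.
    """
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return [[sum(m[i][j] > threshold for m in members) / n
             for j in range(cols)]
            for i in range(rows)]

def target_area(members, threshold, min_probability=0.5):
    """Boolean mask of cells where enough members exceed the threshold."""
    probability = exceedance_probability(members, threshold)
    return [[p >= min_probability for p in row] for row in probability]

# Four hypothetical ensemble members (e.g. one per PBL scheme) on a 2x2 grid.
members = [
    [[12.0, 4.0], [0.0, 2.0]],
    [[11.0, 5.0], [0.0, 1.0]],
    [[2.0, 6.0], [0.0, 4.0]],
    [[1.0, 0.5], [0.0, 11.0]],
]
for threshold in (3.0, 10.0):
    print(threshold, target_area(members, threshold))
```

With four members, the 50% criterion corresponds to agreement between at least two members, matching the interpretation given in the text for FLEXPART-WRF. Cells where members individually predict high values but do not overlap drop out of the mask.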
When the flight track is compared against the target area based on the FLEXPART simulations, there is good agreement for both low and high mixing ratios. This is to be expected, since the actual forecasts that aided in the decision-making during the EMeRGe campaign were carried out over IFS data. Even though the HYSPLIT model (Stein et al. 2015) was used during the campaign rather than FLEXPART, a good match between the locations can be expected, as the two models have been shown to perform similarly (Hegarty et al. 2013). A comparison between the FLEXPART and FLEXPART-WRF results, however, effectively summarises the differences discussed in Sects. 3.1.1-3.1.3. For low PFC mixing ratios, there is overlap between the two models (i.e. comparing individual columns of Fig. 15). This reflects both the actual measurements and the simulated values, which, for most model configurations, tended to be over 3 ppqv. As expected, the largest difference between FLEXPART and FLEXPART-WRF simulations is seen for the Taipei release (Fig. 15b, e), in which case (according to the FLEXPART-WRF simulations) the observations mainly took place over an area of low model confidence (i.e. outside of the highlighted area in Fig. 15e), or 3-6 h too late. Notably, the area directly west of the release point is also [...]

Results change drastically when focusing on higher expected mixing ratios. Even for a target mixing ratio of 10 ppqv, there are significant differences across all cases (Fig. 16). For the Manilla release (Fig. 16a, d), FLEXPART simulations identify an extensive area, comparable to the results for 3 ppqv (Fig. 15a). On the other hand, when compared to FLEXPART-WRF simulations, the actual observation track is over the past location of the plume and to the north-east of the identified high-confidence area, i.e. according to the FLEXPART-WRF simulations the observations would have been better carried out at a different location or at an earlier time. The Taipei release shows the most significant disagreement between the two sets (Fig. 16b, e), as FLEXPART shows a large area of high confidence to the south of Taipei. In the case of the FLEXPART-WRF set, for the time of the observations, there is no identified high-confidence area to the south of the release point. That does not mean that FLEXPART-WRF does not predict high $r_{\mathrm{PFC}}$ values over southern Taiwan, but rather that all ensemble members simulate non-overlapping areas of high $r_{\mathrm{PFC}}$, highlighting the difficulty of an accurate prognosis over that region for the day. Agreement between the different ensemble members is still visible to the west of the release point, or 6-9 h before the observation period ends. Finally, for the Nanjing release (Fig. 16c, f), despite the similarities in the identified target areas, there is a notable southwards shift in the location, which is in better agreement with the location of the observed maxima (Fig. 9).

Discussion
A comprehensive performance evaluation of the WRF, FLEXPART and FLEXPART-WRF models has been presented for three different transport settings and using four PBL parametrisations (in the case of WRF and FLEXPART-WRF). Despite the fact that the PBL option had an overall significant impact on the resulting tracer mixing ratio, there was no systematically better-performing option across all simulations and tested quantities. This is an expected outcome and reflects the established literature (e.g. Borge et al. 2008; Onwukwe and Jackson 2020). Still, even though differences in the domain-wide performance were small, the two first-order PBL schemes (YSU and ACM2) performed marginally better than the other options. This also broadly reflects the consensus of similar studies (e.g. Banks et al. 2016; Avolio et al. 2017; Gunwani and Mohan 2017; He et al. 2022).
The central idea behind the evaluation carried out was to quantify the potential impact of downscaling on campaign management. In this context, even if the meteorological representation of flow is better at higher resolution, this would not add a significant benefit in terms of campaign management if the predicted area of concentration maxima remained the same. In general, using global data and the FLEXPART model led to relatively simple dispersal pathways, resulting in a concentrated plume with a systematic overestimation of observations, while using regional model data and FLEXPART-WRF led to complex dispersal pathways and separation of the plume into multiple smaller units, and was generally associated with systematic underestimation of observations. The analysis, based on all PFC measurements for the different cases, showed that errors associated with the FLEXPART-WRF simulations either stayed within the same order of magnitude as the FLEXPART simulations or marginally improved. However, in two out of the three cases studied (namely the short-range transport cases), the dynamical downscaling simulations revealed a considerably different representation of the flow and of the identified areas for observation. These differences in the simulations can be broadly explained based on orographic effects (Smith 1989).

General impact of topography
The topography surrounding Manilla plays a significant role in the meteorological flow over the islands, as the various topographic maxima can lead to a combination of flow splitting and gap flows west-south-west of Manilla. The resulting complex flow is known to be a deciding factor in rainfall over the Philippine islands (Pullen et al. 2015; Hilario et al. 2021) and atmospheric pollutant outflow from Manilla (Bagtasa and Yuan 2020). In the Manilla simulations, most of the topography west of the release point is practically removed at lower resolutions. Due to the complex topography, identifying an undisturbed "incoming" flow is difficult, but considering average wind speed values of 3 m s$^{-1}$ places the expected flow over the Zambales mountain range anywhere from a "flow over" regime ($Fr \sim 1$; ERA5 simulation) to a "flow around" regime ($Fr \sim 0.2$; WRF simulations). The changes seen in the pollutant transport pattern (and subsequently the predicted target area) between the two models can be expected from this change in the simulated flow regime.
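The dependence of the flow regime on the resolved mountain height can be illustrated with the Froude number. The sketch below assumes the commonly used form Fr = U / (N h), a typical Brunt-Väisälä frequency of N = 0.01 s$^{-1}$, and two hypothetical resolved mountain heights chosen only to give values of a similar magnitude to those quoted above; none of these choices is taken from the original analysis, and the single threshold at Fr = 1 is a simplification.

```python
def froude_number(wind_speed, mountain_height, brunt_vaisala=0.01):
    """Fr = U / (N h); ASSUMPTION: N defaults to a typical 0.01 s^-1."""
    return wind_speed / (brunt_vaisala * mountain_height)

def flow_regime(froude, critical=1.0):
    """Simplified classification using a single critical Froude number."""
    return "flow over" if froude >= critical else "flow around"

wind_speed = 3.0  # m/s, the average wind speed quoted in the text

# Hypothetical resolved mountain heights in metres for a coarse and a fine grid.
for label, height in (("coarse grid", 250.0), ("fine grid", 1500.0)):
    fr = froude_number(wind_speed, height)
    print(label, round(fr, 2), flow_regime(fr))
```

For a fixed wind speed and stratification, the Froude number scales inversely with the resolved height, so smoothing the topography at coarse resolution alone is enough to move the diagnosed regime from "flow around" towards "flow over".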
The topography of Taiwan presents another modelling challenge. The horizontal extent and height of the topography force the flow into a "flow around" regime and exert significant control over meteorological conditions, even impacting typhoon tracks (Yeh and Elsberry 1993; Lin et al. 2002). The actual wind flow around the island is a result of the interplay between the synoptically enforced dominant wind and the local sea breeze and orographic effects (Huang and Chang 2018; Wu et al. 2019; Cheng et al. 2022). On top of this, air quality over Taiwan, as well as the biochemical cycle of the surrounding area, is also known to be impacted by atmospheric flow acceleration over the Taiwan Strait, which can occur (e.g. as in the Nanjing case here) due to channelling between the mountains in south-east China and Taiwan (Lee and Hills 2003; Lin et al. 2012). Naturally, the orographic flow regime also controls pollution outflow from the island, with air quality critically impacted depending, amongst other parameters, on the incidence angle between the oncoming flow and the main island ridges (Cheng 2001; Hsu et al. 2023).
In the simulations here, results depended critically on the incidence angle and transport length. As shown in Fig. 4, the main orographic feature (CMR) in Taiwan is represented relatively well at lower resolutions. In the Taipei release case, this means that the expected flow regime remains similar across the different models (with $Fr$ ranging from ∼0.5 for the ERA5 data to ∼0.33 for IFS and ∼0.25 for the WRF simulations). In all cases, this places the expected flow within the "flow around" regime for an elongated mountain ridge (Ólafsson and Bougeault 1996). As such, the differences seen in the way PFC transport was represented across the different model resolutions mainly come down to the representation of the northernmost mountain range (Xueshan), as smoothing on the edges of the island eliminates the north-western flow splitting point, which, critically, is located just west of the release point. To add to this complexity, the resolved flow critically depends on the incidence angle of the incoming wind (Wells et al. 2008). In contrast to the Taipei release, in the Nanjing case the flow, impacted by channelling over the Taiwan Strait (Lee and Hills 2003; Lin et al. 2012), was aligned with the long axis of the mountain range, meaning that changes in the flow regimes were minimal. On top of this, the PFC plume was already well mixed and of significant spatial extent (similar to the size of Taiwan), meaning that even a moderate change in the simulated representation of the plume had a less significant impact on the forecasted target area.
Results broadly conform to theoretical expectations when the areas are analysed from the perspective of orographic effects. The Manilla release represents a case where topography is relevant but not resolved. Intuitively, dynamical downscaling can be expected to be important. On the other hand, the Nanjing release represents a case where the detailed resolution of topography can be expected to be largely irrelevant, as most of the transport occurs over water. The Taipei release represents the most interesting and challenging case presented, even though it could be considered an in-between case of the previously discussed two extremes. Critically examining the model's limitations in this case is especially important due to the model's inability to accurately reproduce the observed mixing ratios at higher resolution (i.e. ensemble-average error increasing with increasing resolution). Considering the limited amount of observations after the Taipei release, it is impossible to generalise the overall accuracy of the FLEXPART-WRF simulations, especially as all simulations included a splitting of the plume, with the main part heading westwards, towards an unsampled area.
Despite the conclusions drawn from the observations and simulations presented here, it is important to acknowledge that the study is still based on three cases; as such, generalisation of results is difficult. First, the transport evaluation was carried out over the limited data available, leading to a potential bias, especially due to the inhomogeneity of the PFC mixing ratio profiles and the low mixing ratios encountered in most experiments. Considering the important benefits that can be expected from downscaling, as well as the inherent difficulties in model evaluation, a more extensive and focused study on the subject should be carried out over an area with a dense data network to allow for a more robust analysis. It is also important to point out that all areas studied here involve complex topography; it stands to reason that over simple topography (for example, as represented here with the Nanjing case), the benefits of dynamical downscaling would be reduced.
Another important point is the fact that what was studied here are isolated cases of inert pollutant transport from isolated sources, which are more prone to short-lived or isolated biases (e.g. Poulidis and Iguchi 2021). Although important, as they can be considered representative of natural or anthropogenic disasters, results here do not apply to continuous emissions over a long time, or to emissions from multiple sources in an extended area, as local biases can be expected to cancel out. Finally, the analysis focused on inert pollutants. As such, care is needed when extrapolating results to chemically active species, as chemical transformations can be expected to introduce non-linear effects whose appropriate inclusion in the modelling can exert a first-order control over the transport results (Lin et al. 2023a, b).

Applications in campaign flight decision-making
A critical question is whether dynamical downscaling could have been used to change decision-making during the EMeRGe campaign. The use of model ensembles has already been suggested as a way to adjust observation campaign targets in order to tackle model uncertainty in inverse modelling (Dunbar et al. 2022); however, here, we focused on the forward problem, in a similar way to Dacre and Harvey (2018), i.e. by utilising divergence in the ensemble members. When comparing the FLEXPART and FLEXPART-WRF predicted "target areas" for 10 ppqv observations (i.e. Fig. 16), there are two points to focus on: (i) the existence of a high-confidence target area (west of the release point) missing from the FLEXPART simulations and (ii) divergent results over the south of Taiwan with increased model resolution. Following point (i) would mean that reconsideration of the area to be observed would be advisable (similar to the Manilla case, although for a different reason). Point (ii) could help identify that measurements over the south of Taiwan come at a higher risk of a low PFC mixing ratio. In this case, the suggested action would be either a reconsideration of the measurement timing on the same day or, if possible, a delay until a day with easier-to-forecast meteorological conditions (e.g. as in the Nanjing release).
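As an illustration of the reasoning above, the following minimal sketch computes, for each grid cell, the fraction of ensemble members that exceed a target mixing ratio and labels the cell accordingly. It is not the processing chain used in the study: the function names, the toy grid values and the 0.75/0.25 agreement cut-offs are assumptions chosen for illustration only; only the 10 ppqv target is taken from the text.

```python
def exceedance_fraction(members, threshold):
    """Fraction of ensemble members with a value >= threshold, per grid cell.

    members: list of 2D grids (lists of rows), one grid per ensemble member.
    """
    n_members = len(members)
    n_rows = len(members[0])
    n_cols = len(members[0][0])
    return [
        [
            sum(1 for grid in members if grid[i][j] >= threshold) / n_members
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


def classify(fraction, high=0.75, low=0.25):
    """Label a cell by ensemble agreement (cut-offs are hypothetical)."""
    if fraction >= high:
        return "high-confidence"
    if fraction > low:
        return "divergent"
    return "unlikely"


# Toy example: four members (e.g. one per PBL scheme) on a 1 x 3 grid,
# with mixing ratios in ppqv. The values are invented.
toy_members = [
    [[12.0, 11.0, 0.5]],
    [[15.0, 2.0, 0.1]],
    [[10.0, 13.0, 0.0]],
    [[11.0, 1.0, 0.2]],
]
toy_fraction = exceedance_fraction(toy_members, threshold=10.0)
toy_labels = [[classify(f) for f in row] for row in toy_fraction]
print(toy_fraction)
print(toy_labels)
```

In this toy case, the first cell is exceeded by all members, the second by half of them and the third by none, which mirrors the distinction between a high-confidence target area and an area where the members diverge.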
Naturally, the study here is a theoretical reanalysis exercise; due to the complexity of the field campaign organisation, which involves different partners and civil agencies, it might be impossible to amend campaign targets. Still, the combined results from the different cases showcase the strengths of an ensemble approach, which can be facilitated through the use of regional modelling due to the abundance of physical parametrisation options (e.g. Deppe et al. 2013). Furthermore, unlike pollutant releases outside of the PBL (e.g. Poulidis and Iguchi 2021), results for the near-surface releases studied here highlight that different PBL schemes at the same resolution are different enough to allow for an ensemble approach. Considering the cost and difficulties of repeating field campaigns, the ability to carry out probabilistic forecasting can be a deciding factor that helps identify days that should be avoided due to particularly difficult-to-forecast conditions; or, at least, it can provide additional data and help set well-defined expectations in order to take a calculated risk.
Although the focus of the study has been on field campaign management, the findings can be applied to any scenario that leads to a pollutant release from an isolated source, such as a power plant accident or volcanic eruption. When forecasting is employed in such cases, misidentifying high pollutant concentration areas can have important ramifications (Dacre and Harvey 2018). Furthermore, direct sampling, which can be vital in understanding the specific hazard, can be associated with a high risk, while the window of opportunity for sampling can be small. Given these extreme conditions, additional input that can be provided, as shown in the current study, could make the difference between a successful and failed risk management case (Budd et al. 2011; Schmitt and Kuenz 2015).

Conclusions
Dynamical downscaling is a commonly used technique to obtain high-resolution atmospheric data over a specified region of interest, which can be analysed directly or used to increase the accuracy of secondary models, such as ATDMs. Key applications of the latter include hazard management and field campaign organisation. However, dynamical downscaling requires resources, computational time, and expertise, potentially leading to increased complexity in the organisation of measurement efforts. The potential benefits of using dynamical downscaling were examined here, taking advantage of the EMeRGe campaign perfluorocarbon (PFC) release experiments, carried out at three locations in Asia in March-April 2018: Manilla (Philippines; 19.03.2018), Taipei (Taiwan; 27.03.2018), and Nanjing (China; 05.04.2018). Aside from an evaluation of model accuracy, differences in the location of simulated PFC mixing ratio maxima were identified in order to assess what would have been the ideal location for the sampling carried out during the campaign.
Overall, based on the limited observations available, the use of dynamically downscaled data to drive the ATDM did not lead to a consistent increase in model accuracy; the results varied significantly depending on the location and model configuration options. Furthermore, a model configuration that performs well in the meteorological evaluation does not necessarily perform well in secondary modelling applications (here, the tracer transport). However, the simulations highlighted fundamental differences between the global-based and dynamical-downscale-based simulations, stemming from the different representations of the study areas in the models. In the simple cases of the Manilla and Nanjing releases, increased resolution led to similar PFC transport patterns, but with significant spatial shifts. In the complex case of the Taipei release, dynamical downscaling revealed fundamentally different PFC transport patterns.
In addition to simulation accuracy, which is the benefit typically expected from dynamical downscaling, results were analysed in the context of ensemble modelling. For isolated emission sources within the boundary layer, simulations here showed that different WRF PBL schemes led to a sufficient spread in the ATDM results. Although no PBL scheme led to systematically improved accuracy, we demonstrated how the inherent difficulties in representing local flows over complex topography, which lead to divergence between ensemble members, can be exploited as a tool to aid in decision-making.

Fig. 1
Fig. 1 Topographic height (shown shaded) of: a central Philippines, b Taiwan. In (b), topographic height is overlaid with locations of the Taiwan Central Weather Bureau surface weather stations (round markers). Relevant mountains and mountain ranges and their corresponding names are shown with different colours. Note that the area shown as the Central Mountain Range (CMR) also includes the Yushan and Alishan mountain ranges

Fig. 3
Fig. 3 Maps of the extended areas around the measurement campaign, overlaid with the extents of the WRF modelling domains (D1-4) and the release (star marker) and HALO observation locations (solid lines) for: a the Manilla (orange), b the Taipei (green) and Nanjing (blue) experiments. Note that domain D4 shown in (b) is only used in the Nanjing WRF simulations

Fig. 4
Fig. 4 Examples of the model representation of land-use (main a-f) and topography (top and right sub-panels a-f) for the areas around Manilla, Philippines a, c, e and Taipei, Taiwan b, d, f, depending on the horizontal grid spacing (Δx): a, b 1.1 km, c, d 10 km, and e, f 31 km. Topography height (in km) is shown as the projection across latitudes (top sub-panels) and longitudes (right sub-panels), with different shading indicating an increase in latitude and longitude, respectively. Land-use classes shown are aggregated (shown in the colourbar) based on the original MODIS data and are shown for illustrative purposes. The coastline is plotted in black in all main panels for comparison

Fig. 5
Fig. 5 Maximum simulated PFC mixing ratio between 1.0 and 1.5 km asl (shaded) overlaid with observations (round markers) and the average wind field between the same altitudes and time period for simulations based on: a ERA5, b IFS, c WRF[YSU] (i.e. WRF using the YSU PBL scheme), d WRF[ACM2], e WRF[BouLac], and f WRF[MYNN3]. Note that results are shown only for the innermost WRF domain. The release location is shown with a star marker

Fig. 6
Fig. 6 Comparison of observed and simulated values of PFC mixing ratio following the Manilla release: a temporal evolution of PFC mixing ratio (r_PFC), b box-and-whiskers plot of r_PFC values, and c vertical r_PFC profiles. Note that the background is subtracted from the observed values (Table 1)

Fig. 10
Fig. 10 As Fig. 6, but for the Nanjing release

Fig. 11
Fig. 11 Error analysis of all simulations: a Root Mean Square Logarithmic Error (RMSLE), b Root Mean Square Error (RMSE), c Mean Bias Logarithmic Error (MBLE), and d Kendall's coefficient (τ). In all cases, different colours denote different release experiments, while in the case of the FLEXPART-WRF simulations, individual PBL schemes are shown separately
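For reference, the four diagnostics named in the caption can be sketched as below. The exact formulations used in the study are not restated in this section, so the choices made here (natural logarithm of 1 + r for the logarithmic errors, and the tie-corrected tau-b variant of Kendall's coefficient) are assumptions, as are the function names and the sample values.

```python
import math


def rmse(obs, sim):
    """Root Mean Square Error."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rmsle(obs, sim):
    """Root Mean Square Logarithmic Error, assuming the log(1 + x) form."""
    sq = [(math.log1p(s) - math.log1p(o)) ** 2 for o, s in zip(obs, sim)]
    return math.sqrt(sum(sq) / len(sq))


def mble(obs, sim):
    """Mean Bias Logarithmic Error (positive when the model overestimates)."""
    diffs = [math.log1p(s) - math.log1p(o) for o, s in zip(obs, sim)]
    return sum(diffs) / len(diffs)


def kendall_tau_b(obs, sim):
    """Kendall rank correlation (tau-b), computed over all pairs."""
    concordant = discordant = tied_obs_only = tied_sim_only = 0
    n = len(obs)
    for i in range(n):
        for j in range(i + 1, n):
            d_obs = obs[i] - obs[j]
            d_sim = sim[i] - sim[j]
            if d_obs == 0 and d_sim == 0:
                continue  # pair tied in both series: excluded from all counts
            if d_obs == 0:
                tied_obs_only += 1
            elif d_sim == 0:
                tied_sim_only += 1
            elif d_obs * d_sim > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt(
        (concordant + discordant + tied_obs_only)
        * (concordant + discordant + tied_sim_only)
    )
    return (concordant - discordant) / denom if denom > 0 else 0.0


# Invented sample values (ppqv), for illustration only.
sample_obs = [0.0, 3.0, 10.0, 25.0]
sample_sim = [1.0, 2.0, 14.0, 20.0]
print(rmse(sample_obs, sample_sim), rmsle(sample_obs, sample_sim))
print(mble(sample_obs, sample_sim), kendall_tau_b(sample_obs, sample_sim))
```

The logarithmic diagnostics weight errors by ratio rather than by absolute difference, which is one reason they are often preferred for mixing ratios spanning several orders of magnitude, whereas τ only measures whether the model ranks high and low observations in the correct order.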

Fig. 12
Fig. 12 Average values of: a, c wind speed and b, d wind direction, across 42 stations shown against time since the tracer release for the Taipei (a, b) and Nanjing (c, d) release experiments

Fig. 13
Fig. 13 Surface station RMSE of wind speed (first and third columns) and wind direction (second and fourth columns) for the Taipei (first and second columns) and Nanjing (third and fourth columns) simulations. Rows show results for the different PBL schemes used: a-d YSU, e-h ACM2, i-l BouLac, and m-p MYNN3

Fig. 14
Fig. 14 Surface station MBE against RMSE for wind speed (first column) and wind direction (second column), based on the combined dataset of the Taipei and Nanjing simulations. Rows show results for the different PBL schemes used: a, b YSU, c, d ACM2, e, f BouLac, and g, h MYNN3. In all panels, the limits where RMSE = |MBE| are shown using dashed lines
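The RMSE = |MBE| limits in the caption follow from the identity RMSE² = MBE² + σ², where σ² is the (population) variance of the errors: a station lies on the dashed line only when its error is a pure, constant bias. The sketch below illustrates this under stated assumptions: how the study handled the 0°/360° wraparound of wind direction is not described in this section, so the wrapping function is an assumption, and all names and values are hypothetical.

```python
import math


def wrap_direction_error(sim_deg, obs_deg):
    """Signed angular difference sim - obs, wrapped to [-180, 180) degrees."""
    return (sim_deg - obs_deg + 180.0) % 360.0 - 180.0


def mbe_rmse(errors):
    """Return (MBE, RMSE) for a list of model-minus-observation errors."""
    n = len(errors)
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mbe, rmse


def error_spread(errors):
    """Standard deviation of the errors, from RMSE^2 = MBE^2 + sigma^2."""
    mbe, rmse = mbe_rmse(errors)
    return math.sqrt(max(rmse ** 2 - mbe ** 2, 0.0))


# Wind direction example: errors of -20 and +20 degrees across north.
wd_errors = [
    wrap_direction_error(350.0, 10.0),
    wrap_direction_error(10.0, 350.0),
]
print(wd_errors, mbe_rmse(wd_errors))

# Pure bias example: constant error, so the point sits on RMSE = |MBE|.
bias_only = [5.0, 5.0, 5.0]
print(mbe_rmse(bias_only), error_spread(bias_only))
```

Points far inside the dashed limits therefore indicate errors dominated by scatter rather than by a systematic offset.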

Fig. 15
Fig. 15 Simulated areas with ≥ 50% probability of exceedance of r_PFC = 3 ppqv, shaded based on the time before the end of the EMeRGe campaign observation period for the releases from: a, d Manilla, b, e Taipei, and c, f Nanjing. The first row (a-c) shows results from the FLEXPART simulations, while d-f show results from the FLEXPART-WRF simulations. Overlaid are the observation locations (black lines), while the red contour indicates the area for the duration of the measurements

Table 2
WRF model domain configuration description per release location, including the number of grid points in both horizontal directions (n_x, n_y), horizontal grid spacing (Δx = Δy) and time step (Δt)

The approaches taken for each PBL scheme are presented in Table 3. The Yonsei University (YSU; Hong et al. 2006) and Asymmetric Convection Model 2 (ACM2; Pleim 2007) schemes are first-order. YSU uses an explicit entrainment layer and a parabolic K-profile in an unstable mixed layer. ACM2 represents non-local upward mixing and local downward mixing and has an eddy-diffusion component in stable

Table 3
Characteristics of the PBL schemes used in the study

Table 4
Error diagnostics for wind speed (WS; in m s⁻¹) and wind direction (WD; in degrees) based on the model evaluation against surface stations for the Taipei and Nanjing simulations. In each case, RMSE values are accompanied by MBE values in brackets

Poulidis et al. Progress in Earth and Planetary Science (2024) 11:37

Table 5
Model performance (for the 1.1 km domain) depending on the PBL scheme. Ranks (out of 4) are presented based on the results from all cases. Ranks marked with an asterisk (*) denote the same average point count. Note that the meteorological evaluation was only carried out for the Taipei and Nanjing simulations