Toward reduced transport errors in a high resolution urban CO 2 inversion system

We present a high-resolution atmospheric inversion system combining a Lagrangian Particle Dispersion Model (LPDM) and the Weather Research and Forecasting model (WRF), and test the impact of assimilating meteorological observation on transport accuracy. A Four Dimensional Data Assimilation (FDDA) technique continuously assimilates meteorological observations from various observing systems into the transport modeling system, and is coupled to the high resolution CO 2 emission product Hestia to simulate the atmospheric mole fractions of CO 2 . For the Indianapolis Flux Experiment (INFLUX) project, we evaluated the impact of assimilating different meteorological observation systems on the linearized adjoint solutions and the CO 2 inverse fluxes estimated using observed CO 2 mole fractions from 11 out of 12 communications towers over Indianapolis for the Sep.-Nov. 2013 period. While assimilating WMO surface measurements improved the simulated wind speed and direction, their impact on the planetary boundary layer (PBL) was limited. Simulated PBL wind statistics improved significantly when assimilating upper-air observations from the commercial airline program Aircraft Communications Addressing and Reporting System (ACARS) and continuous ground-based Doppler lidar wind observations. Wind direction mean absolute error (MAE) decreased from 26 to 14 degrees and the wind speed MAE decreased from 2.0 to 1.2 m s –1 , while the bias remains small in all configurations ( –1 ). Wind speed MAE and ME are larger in daytime than in nighttime. PBL depth MAE is reduced by ~10%, with little bias reduction. The inverse results indicate that the spatial distribution of CO 2 inverse fluxes were affected by the model performance while the overall flux estimates changed little across WRF simulations when aggregated over the entire domain. Our results show that PBL wind observations are a potent tool for increasing the precision of urban meteorological reanalyses, but that the impact on inverse flux estimates is dependent on the specific urban environment.


Introduction
Inversion of atmospheric tracers (Tarantola 2005) is a widely used method to determine the surface fluxes of greenhouse gases (GHGs) such as carbon dioxide (CO 2 ) (Tans et al. 1990; Enting and Mansbridge 1989) using observations from various observing platforms such as towers (Richardson et al. 2012, Andrews et al. 2014, Miles et al. 2016, Richardson et al. 2016), aircraft (e.g.Gerbig et al. 2003), ground-based remote sensing (e.g.Wunch et al. 2010) and more recently satellite measurements (Crisp et al. 2004).Inverse methods have the potential to support both global-and local-scale GHG emissions monitoring (Pacala et al., 2010), pending adequate accuracy and precision of the inverse flux estimates.Inverse GHG flux estimate methods require an accurate representation of the physical relationship between the GHG mole fractions measured by these various observing platforms and the original surface fluxes of GHGs ( Enting 2002).If atmospheric transport uncertainty is large, the observational data have limited impact in determining the true GHG surface fluxes.However, atmospheric transport errors are not typically rigorously quantified for inverse flux estimates, but research suggests that transport errors have a large impact on atmospheric GHG mole fraction estimates (Lin andGerbig 2005, Díaz-Isaac et al. 2014) and flux estimates (Baker et al 2006, Lauvaux et al. 2009, Lauvaux and Davis 2014).Few studies have attempted to assimilate jointly meteorological observations and atmospheric greehouse gas measurements (e.g.Kang et al., 2012;Agustí-Panareda et al., 2016).Therefore, improving atmospheric transport model performance should

RESEARCH ARTICLE
Toward reduced transport errors in a high resolution urban CO 2 inversion system Aijun Deng * , Thomas Lauvaux § , Kenneth J. Davis § , Brian J. Gaudet § , Natasha Miles § , Scott J. Richardson § , Kai Wu § , Daniel P. Sarmiento § , R. Michael Hardesty † , Timothy A. Bonin † , W. Alan Brewer † and Kevin R. Gurney ‡   We present a high-resolution atmospheric inversion system combining a Lagrangian Particle Dispersion Model (LPDM) and the Weather Research and Forecasting model (WRF), and test the impact of assimilating meteorological observation on transport accuracy.A Four Dimensional Data Assimilation (FDDA) technique continuously assimilates meteorological observations from various observing systems into the transport modeling system, and is coupled to the high resolution CO 2 emission product Hestia to simulate the atmospheric mole fractions of CO 2 .For the Indianapolis Flux Experiment (INFLUX) project, we evaluated the impact of assimilating different meteorological observation systems on the linearized adjoint solutions and the CO 2 inverse fluxes estimated using observed CO 2 mole fractions from 11 out of 12 communications towers over Indianapolis for the Sep.-Nov.2013 period.While assimilating WMO surface measurements improved the simulated wind speed and direction, their impact on the planetary boundary layer (PBL) was limited.Simulated PBL wind statistics improved significantly when assimilating upper-air observations from the commercial airline program Aircraft Communications Addressing and Reporting System (ACARS) and continuous ground-based Doppler lidar wind observations.Wind direction mean absolute error (MAE) decreased from 26 to 14 degrees and the wind speed MAE decreased from 2.0 to 1.2 m s -1 , while the bias remains small in all configurations (< 6 degrees and 0.2 m s -1 ).Wind speed MAE and ME are larger in daytime than in nighttime.PBL depth MAE is reduced by ~10%, with little bias reduction.The inverse results indicate that the spatial distribution of CO 2 inverse fluxes were affected by the model performance while the overall flux estimates changed little across WRF simulations when aggregated over the entire domain.Our results show that PBL wind observations are a potent tool for increasing the precision of urban meteorological reanalyses, but that the impact on inverse flux estimates is dependent on the specific urban environment.
substantially improve the accuracy and precision of inverse GHG flux estimates.
An objective of the Indianapolis Flux Experiment (INFLUX) is to quantify and improve the precision and accuracy of atmospheric inverse methods for determining urban CO 2 emissions (Davis et al. 2016, this issue).Improving the accuracy and precision of atmospheric transport model solutions is thus a critical element of this study.Methods for improving transport model solutions can be divided into four broad classifications, increasing model resolution, improving model physical parameterizations, improving input data used to drive the simulation, and assimilation of meteorological observations.Law et al. (2003) attempted to reduce transport model errors by increasing model resolution, potentially reducing the representation errors affecting global scale models (Ahmadov et al. 2007, Carouge et al. 2010).Higher resolutions such as regional applications of inverse modeling (e.g.Schuh et al., 2010;2013;Lauvaux et al. 2016), demand greater atmospheric transport fidelity due to highly dynamic continental meteorology (Law et al. 2008), and the presence of observations obtained largely within the planetary boundary layer (PBL) and in close proximity to strong sources and sinks (e.g.Peters et al. 2007, Miles et al. 2016, this issue).Simply increasing resolution is thus unlikely to solve the issue of transport model accuracy since resolution and complexity increase jointly.In addition, increased resolution does not ensure accuracy in model physical parameterizations or input data.Urban inversions are thus likely to suffer from significant transport errors similar to regional and global inversions (Feng et al. 2016).
The Weather Research and Forecasting (WRF) model is a state-of-the-science community-supported numerical weather prediction (NWP) and atmospheric simulation system.WRF has been used worldwide for both research and operational applications (Skamarock et al. 2008).Its ability to simulate atmospheric processes relevant to atmospheric transport and dispersion has been tested widely (Cintineo et al. 2014, Clark et al. 2015, Coniglio et al. 2013, Hariprasad et al. 2014, Rogers et al. 2013, Lauvaux et al. 2013, Karion et al. 2015).In addition to its advanced numerical scheme and continuously upgraded array of model physics parameterizations, WRF has a four dimensional data assimilation (FDDA) capability implemented by Penn State University (Deng et al. 2009) that allows meteorological observations to be continuously assimilated, allowing WRF to produce dynamic analyses at user-desired resolution.Rogers et al. (2013) investigated the effect of various FDDA strategies on the accuracy of the WRF-simulated mesoscale features over the Central Valley of California.The optimal FDDA configuration identified in Rogers et al. (2013) has been used in many recent modeling studies involving studying GHGs (Lauvaux et al. 2013, Karion et al. 2015).
Towards that end, we test the impact of assimilating various meteorological data on the transport model accuracy for the INFLUX region, and the effect of assimilating these meteorological data sources on the inverse CO 2 fluxes derived for the city.Since wind speed, wind direction, and the depth of mixing in the PBL are expected to be critical to accurate and precise GHG flux estimation, the regional operational weather observation network was enhanced with a commercial compact Halo Streamline Doppler lidar (Pearson et al. 2009) capable of continuous observations of those key atmospheric variables.The lidar data complement the standard WMO rawinsondes (only available at 00 and 12 UTC), and aircraft measurements from the commercial airline program Aircraft Communications Addressing and Reporting System (ACARS, Anderson 2010).We evaluate the performance of the inversion system over 2 months (September-October 2013) using WRF atmospheric simulations that assimilate different combinations of data sources from WMO surface stations, the lidar, and in-situ data from ACARS, and assess the overall impact on WRF performance for each of the meteorological instrumentation used in our assimilation system.Finally, we conduct several sensitivity studies to assess the impact of these different meteorological data assimilation choices on the inverse CO 2 flux estimates obtained over the two-month period.Section 2 briefly describes the WRF model used in this study.Model configurations and experimental design are discussed in Section 3. Model results and discussions are given in Section 4, and a summary and conclusions are given in Section 5.This study complements a companion study by Sarmiento et al (2016, this issue) that explores the dependence of model performance on both the choice of land surface and PBL parameterizations, and the input data used to describe the urban surface.

Model description
The modeling system used in this research consists of a mesoscale atmospheric modeling component that handles forward transport and dispersion, a Lagrangian Particle Dispersion Model (LPDM, Uliasz 1994) that is run backward in time and driven by the flow fields simulated by the atmospheric transport model, WRF, and a Bayesian the inversion modeling system to compute the posterior fluxes given the prior fluxes, the transport adjoint solutions, and the estimates of prior emissions errors, transport model errors and measurement errors.The mesoscale atmospheric model used here is the Weather Research and Forecasting (WRF) model, a mesoscale atmospheric modeling system that has been used worldwide for both research and operational applications (Skamarock et al. 2008).WRF's development is supported by the broad scientific community, along with very active participation of university scientists worldwide.WRF has a flexible, portable code that runs efficiently in computing environments ranging from massively parallel supercomputers and clusters to laptops.
WRF is a non-hydrostatic, fully compressible three dimensional (3D) primitive equation model with a terrainfollowing, hydrostatic pressure vertical coordinate, and is designed for simulating atmospheric phenomena across scales ranging from large eddies ( ~100 m) to mesoscale circulations and waves ( ~1 km to 100 km) to synoptic-scale weather systems ( ~1000 km).These applications include real-time NWP, model physics research, regional climate simulation, hazard prediction modeling and, with the addition of a chemistry module (WRF-Chem, Grell et al. 2005), air-quality studies.
The WRF model includes a complete suite of atmospheric physical processes that interact with the model's dynamics and thermodynamics core.These physical processes include cloud microphysics (MP), cumulus parameterization needed on coarser grids (Δx > O(10 km)) for representing the un-resolved atmospheric convection, atmospheric radiation, PBL/turbulence physics, and land surface models (LSMs).Selection of the model physics suite in this research is based on previous research (e.g., Lauvaux et al. 2013, Rogers et al. 2013).For cloud microphysics, this study uses the WRF single-moment five-class (WSM5) simple ice scheme (Hong et al. 2004) that assumes no mixed-phase conditions.For cumulus parameterization, the Kain-Fritsch scheme (Kain andFritsch 1990, 1993;Kain 2004) is used on the 9-km grid (see next section for grid configuration).For atmospheric radiation, the Rapid Radiative Transfer Model (RRTM; Mlawer et al. 1997) longwave (LW)/Dudhia shortwave (SW; Dudhia 1989) scheme is used.For PBL turbulent processes, the turbulent kinetic energy (TKE)-predicting Mellor-Yamada Nakanishi Niino (MYNN) Level 2.5 turbulent closure scheme (Nakanishi and Niino 2006) is used, along with the MYNN surface layer scheme to preserve consistency.The decision of selecting the MYNN PBL scheme is based on our experiences and previous studies where MYNN appeared to produce the most accurate PBL temperature and moisture profiles (Cintineo et al. 2014, Clark et al. 2015) as well as the most accurate PBL depth (Coniglio et al. 2013, Hariprasad et al. 2014) in simulations of the PBL over land, all of which are highly important to simulating transport and dispersion of surface emissions.For land surface processes, the Noah LSM (Chen andDudhia 2001, Tewari et al. 2004) is used.The Noah LSM is a four-layer soil temperature and moisture scheme and includes plant root zone, evapotranspiration, soil drainage, and runoff, taking into account vegetation categories, monthly vegetation fraction, soil texture, and snow and ice cover.
The WRF modeling system has four dimensional data assimilation (FDDA) capabilities that allow continuous assimilation of meteorological observations during the model simulation, unlike intermittent approaches used in variational data assimilation techniques.For retrospective applications, FDDA can be used in numerical models to produce accurate dynamic analyses at the desired temporal and spatial resolution.FDDA has been widely used in studying atmospheric transport and dispersion processes (Deng et al. 2004, 2006, Rogers et al. 2013, Lauvaux et al. 2013, Karion et al. 2015).The version of FDDA used in this research was originally developed for MM5 and was enhanced and implemented into WRF (Deng et al. 2009, Rogers et al. 2013).Further enhancements to the observation nudging technique in WRF have brought more flexibility to control how surface observations influence meteorology aloft.WRF users have freedom to choose different vertical weighting functions for the surface observations (Rogers et al. 2013).Unlike Rogers et al. (2013) in which various FDDA strategies were evaluated to identify the optimal FDDA settings to produce the most accurate model solutions to represent the meteorological conditions, this research uses the findings from Rogers et al. (2013) to focus on exploring the effect of assimilating various meteorological observation types on the model solution as well as on the inverse flux solutions driven by the WRF-simulated meteorological solutions.The FDDA strategy used in this research includes using analysis nudging and observation nudging on the 9-km coarse grid, and only observation nudging on the 3-and 1-km fine grids, with similar multiscale configurations and parameter settings to those used in Rogers et al. (2013), Lauvaux et al. (2013), and Karion et al. (2015).
To solve the inverse problem, surface fluxes are related to the observed atmospheric mole fraction via a representation of atmospheric dynamics referred to as the observation operator (Lauvaux et al. 2012).This linear operator is then inverted to allow for the optimization of fluxes given prior flux estimates, atmospheric observations and the associated uncertainties.Whereas the WRF model represents the projection of the surface fluxes into atmospheric mole fraction, the inverse of the operator is still required in order to solve for the inverse problem.To represent the inverse operator, also called influence function, we use the Lagrangian Particle Dispersion Model (Uliasz 1994) driven by the hourly mean wind fields, potential temperature, and the turbulence simulated by the WRF model as discussed above.The mean wind fields and the TKE will drive the LPDM backward in time so that areas of influence at the surface for any given tower observation can be represented by the model solution.LPDM has been used extensively in the past for various applications (e.g.Pielke and Uliasz, 1993, Schuh et al. 2010, Lauvaux et al. 2012), and more recently to perform urban inversions of CO 2 over Indianapolis (Lauvaux et al. 2016).Lauvaux et al. (2016) provides a complete description of the coupling between WRF and LPDM used here.
The inversion system is solving for the exact solution of the Bayesian inverse problem, i.e. producing the analytical flux solution with its associated error covariances (i.e.Kalman method) (Tarantola 2005).Prior CO 2 emissions estimates within the inner domain are drawn from the Hestia 2012 emissions product (Gurney et al. 2012).Hestia is a high-resolution (i.e., building-level) inventory-based data product created to estimate anthropogenic CO 2 emissions for the Indianapolis area.It combines several datasets such as energy consumption, traffic data, industrial productivity, and electricity generation from the power plant, with models such as a building energy model.Forwards-in-time atmospheric CO 2 simulations based on emissions from the Hestia dataset within WRF-Chem are only used to compare to the mole fractions estimated with WRF-LPDM.Time periods with large discrepancies between forwards and adjoint atmospheric CO 2 mole fractions reveal unusually difficult transport conditions, and these times are excluded from the inverse analyses.
The inverse system for Indianapolis has been described in Lauvaux et al. (2016) for the period September 2012 to May 2013.Here, we perform five-day emission inversions for September and October 2013 using the reference configuration as described in Lauvaux et al. (2016).The minimzation is performed using a Kalman filter producing the analytical solution to the inverse problem (Tarantola et al., 2005).Similar approaches have been used at global (e.g.Law et al., 2003), continental (Carouge et al., 2010), or regional (Lauvaux et al., 2008).The prior error structures prescribed in the inversion follow the urban land mask and a correlation length scale of 4 km, both convolved to generate the prior emission covariance matrix similar to Carouge et al. (2010) and Lauvaux et al. (2012).The control vector to be optimized in the system includes only the fossil fuel emissions, ignoring biogenic fluxes for this time of year.Turnbull et al. (2015) demonstrated the limited influence of biogenic signals during October to April (less than 5%) relatively too small to be optimized over that time of year.The background conditions were determined using the optimal tower location depending on wind speed, as detailed in Lauvaux et al. (2016).Of most importance, we use the wind direction to define the optimal background sites for each observation time and perform multiple inversions using identical assumptions only swapping the influence functions from the different WRF-FDDA simulations.This comparison allows us to isolate and quantify the sensitivity of the inverse estimate of CO 2 emissions to the different transport solutions.

Model configuration
The WRF modeling system used in this research is based on WRF V3.5.1, released in September 2013.The INFLUX WRF configuration consists of three nested grids with 9-/3-/1km horizontal resolutions (Figure 1a), with the focus on the 1-km grid.The topographic and landuse database needed to initialize the WRF model is based on the U.S. Geological Survey (USGS) 30-second terrain and 24-category landuse.As indicated by the landuse distributions (Figure1b, c, d), a significant fraction of the 1-km grid is characterized as urban (Figure 1d).In the vertical, fifty nine (59) vertical terrain-following layers are used, with the center point of the lowest model layer located ~7 m above ground level (AGL).The thickness of the layers increases gradually with height, with 24 layers below 850 hPa ( ~1550 m above sea level).The top of the model is set at 100 hPa.The WRF-Chem system was configured to run for a two-month period (Sept.-Oct.2013), in five-day segments with a 12-hour overlapping time-window.The WRF model solutions are then used to drive a LPDM that calculates the CO 2 footprints for each of the CO 2 tower observations (Lauvaux et al., 2016).The footprints or influence functions are used in the inversion system to compute the updated posterior CO 2 fluxes.

Meteorological observations
The meteorological observations assimilated in WRF-FDDA include the standard measurements of 10-m wind, 2-m temperature and moisture fields from World Meteorological Organization (WMO) surface stations at hourly intervals and radiosondes at 12-houly intervals, as shown in Figure 2.There are five upper-air observations spread across the 9-km grid.However, none of them are located on the 3-and 1-km grids and only three WMO stations are available on the 1-km grid.
The 20-minute wind profiles measured by the lidar are used in this study.The lidar is deployed in Lawrence, IN, about 15 km to the northeast of downtown Indianapolis.
The lidar has operated and provided data continually since April 2013, with a 6-month gap at the end of 2015 when the system was temporarily removed and upgraded.Additional information about the deployment and data are available online (http://esrl.noaa.gov/csd/projects/influx/).Pearson et al. (2009) provides a complete description of the system and its operating principles.For data presented and used here, range gates (i.e., spatial resolution) were set to be equally spaced at 38.4 m apart.Thus, scales of motion and turbulence larger than the range gate size were explicitly resolved by the lidar.
The lidar directly measures range resolved line-of-sight velocities and backscatter intensity from aerosols and other scatterers.The lidar performs a sequence of velocity azimuth display (VAD) scans at multiple elevations, range height indicator (RHI) scans, and stares.This suite of scans that repeats every 20 min is used to make profiles of the wind speed, wind direction, velocity variances, and backscatter intensity.These profiles are used together to estimate the PBL depth.The 20-minute wind profiles measured by the lidar are also available for assimilation into WRF.In addition to the WMO and lidar observations, the winds, temperatures and moisture fields observed from the ACARS are also assimilated.The ACARS observations are available at times when observations are taken from commercial aircraft, and are distributed in the vertical from near-surface to cruising altitudes above the tropopause, and in the horizontal along the flight tracks.PBL data density is highest near major airports as aircraft ascend or descend.

Experimental design
In order to evaluate the effect of assimilating various observations, as shown in Table 1, four different WRF configurations (or experiments) are conducted and results of both meteorological fields and posterior CO 2 fluxes are compared among the four experiments: 1) NOFDDA -No data assimilation of any form is applied, and WRF is purely driven by the North America Regional Reanalysis (NARR), an analysis product that is a combination of model background and observations but on a coarse spatial and temporal scale (i.e., 32 km); 2) FDDA_WMO -Only standard WMO hourly surface (winds) and 12-hourly upper-air observations (winds, temperature and water vapor) are assimilated; 3) FDDA_WMO_Lidar -In addition to WMO observations, wind profiles from the local INFLUX lidar are also assimilated; and 4) FDDA_WMO_Lidar_ACARS -In addition to the WMO and lidar observations, the ACARS observations are also assimilated.Since all five WMO sondes are outside the 3-and 1-km grids, the effect of assimilating WMO sondes can only propagate into the 3-and 1-km through the grid boundaries between the 3-and 9-km grids.Since in FDDA the impact of surface observations is limited to the lowest portion of atmosphere up to the model-simulated PBL top (Rogers et al. 2013), it is anticipated that assimilating additional upper air observations such as Lidar and ACARS observations (Figure 2b, 2c, 2d) can cause additional improvements in the WRF model solutions so that the transport error in the inversion system is further reduced.During model integration, observations are continuously assimilated using the WRF FDDA technique that is described in Deng et al. (2009) and Rogers et al. (2013).Similar to previous studies (i.e., Deng et al. 2004, 2006, Rogers et al. 2013, Lauvaux et al. 2013, Karion et al. 2015), assimilation of temperature and moisture observations are only allowed above the model-predicted PBL top so that the internal model PBL physics may operate without interference from the data assimilation, while winds are assimilated through the entire atmosphere.
All experiments use identical model physics for all grids except the cumulus parameterization scheme (that is: 4-layer Noah LSM, the 2.5-level MYNN PBL scheme, and the RRTM longwave/Dudhia shortwave scheme are used on all grids, while the KF cumulus scheme is used only on the 9-km grid).For the preliminary evaluations conducted in this paper, each WRF simulation segment of the four experiments is five days long, and was initialized with 3-hourly NARR at 32-km × 32-km resolution for the initial conditions and lateral boundary conditions (ICs/LBCs).The NARR analyses were downloaded from the Research Data Archive (RDA) maintained by the Computational and Information Systems Laboratory (CISL) at the National Center for Atmospheric Research (NCAR).
In addition to assimilating observations during the model integration, the initial condition fields are further enhanced by sonde and surface data through the WRF objective analysis process, Obsgrid, using a modified Cressman analysis method (Deng et al. 2009, Rogers et al. 2013).The three-dimensional (3D) analyses and the surface analysis fields used for analysis FDDA are also enhanced by the objective analysis process and are defined at threehour intervals (Deng et al. 2009), which means that WRF solutions are nudged towards more accurate analyses than NARR on the WRF grid by including observations.

Model evaluation methods
The WRF-simulated meteorological fields are evaluated quantitatively by comparing the error statistics of the model-simulated wind speed, wind direction, and temperature.Evaluation is performed on the 1-km grid only since the high-resolution grid is our primary interest.Mean absolute error (MAE) and root mean squared error (RMSE) are calculated to measure how close the model values are to the observed values.Mean error (ME) is calculated to measure the model bias for a given variable.MAE and ME are computed for both the surface and upper air observation locations.For the surface, the WRF (2-m temperature and 10-m wind) values derived from the lowest model layer using similarity theory are compared with the surface observations.For the upper air, the model values are interpolated onto the observation locations in both horizontal and vertical pressure space, and are then compared with the observations.A calm wind threshold was used in this study to remove situations with very light winds (less than or equal to 1 m s -1 ) for the wind direction statistics calculation because the wind direction for near-calm wind is undefined.Note that due to the limited size of the 1-km grid, there are only three WMO surface stations and no sondes available within the grid, thus reserving a separate set of WMO observations for independent verification (as done in Rogers et al. 2013) is not possible.Since the WRF FDDA technique uses a relaxation term in the model tendency equations rather than conform the model solution to an observation, one should not expect data assimilation to produce an exact match between the modeled and observed values, and spatial correlations in the nudging coefficients limit the extent of the observation influence.
As shown in Rogers et al. (2013), the accuracy of a simulation using FDDA scheme highly depends on the accuracy of the model dynamics.Therefore, the model statistics (i.e.MAE and ME) should remain similar with either assimilated or independent data.In addition, as shown in Rogers et al. (2013), meteorological observations within a distance (e.g., ~60 km) are correlated.For these two reasons, the complete set of meteorological observations including WMO, lidar and ACARS was used in model verification.However, the ACARS data could be considered to be independent observations for the first three numerical experiments since they are not assimilated in them; similarly, lidar is not assimilated in the first two experiments and so is independent of those simulations.
In addition to wind speed and wind direction, we present the evaluation of PBL depth which impacts directly the transport and dispersion of trace gases near the surface, and therefore the estimation of surface fluxes in the inversion.Note that PBL depth is not assimilated, thus it is an independent variable for validation.WRF model can produce a diagnosed PBL depth from its PBL sub-model or PBL scheme.Several methods have been used to diagnose the PBL depth within the different PBL schemes that have been proposed in the literature, usually based on either the vertical thermal profile or the vertical profile of TKE as predicted by the specific PBL scheme.Here, we used the 2.5-level MYNN scheme (Nakanishi and Niino 2006) which is a TKE-predictive scheme.We compare the model-predicted PBL-depth with the observed PBL depth to measure how well the model represents PBL processes.Nighttime comparisons of PBL depth were not performed due to the fact that modeled PBL depth in stable conditions is often less reliable and harder to define, and thus nighttime tower data are not currently utilized in our inverse CO 2 flux estimates.
To evaluate the impact of the different model configurations on the CO 2 emissions from the inversion, we compute the ratios of the Bayes Factors (BF) which represent the ratios of the marginal likelihoods for the different model configurations.BFs relate to the relative values of posterior conditional probabilities for a given model and can be expressed as follows: where H is the Jacobian, B and R the prior and data covariances, y the observations and x b the prior estimate.Larger values of BF correspond to more probable models.By calculating the ratios of BF across our model configurations, we can evaluate the improvement relative to one another.Ratios of BF greater than 10 suggest a strong evidence for improvement, moderate evidence between 3 and 10, and anecdotal evidence between 1 and 3. Table 3 shows ME and MAE of the WRF-predicted wind direction, wind speed, and temperature over the 1-km grid verified hourly (both day and night) against the lower tropospheric (below 2 km AGL) INFLUX lidar measurements (winds only) and ACARS measurements (winds and temperatures), averaged over the period between 00 UTC 27 August and 00 UTC 3 November 2013.The information shown in Table 3 is the same as Table 2 except that validations are now performed against all lower-tropospheric (below 2 km AGL) measurements excluding the three surface stations.Now we clearly see error magnitudes that gradually decrease as additional observations are assimilated into the WRF model from NOFDDA to the best model performance with FDDA_WMO_Lidar_ACARS.

Meteorological evaluation
As shown in Table 3, clear wind error reductions, both MAE and ME, are achieved by assimilating the INFLUX lidar wind measurements.For example, there is a 9-degree reduction in wind direction MAE and 0.7 m s -1 reduction in wind speed MAE.There are no temperature improvements since no temperature observations are available from the lidar.Assimilation of ACARS observations further reduces model MAE and ME error consistently in both wind and temperature fields except the wind speed ME which is already very small.For example, there is a 0.5 °C ME reduction and 0.6 °C MAE reduction in temperature error when ACARS observations are assimilated.Similar to the surface layer error statistics, the FDDA_WMO_Lidar_ACARS case has the smallest overall errors.
To demonstrate and summarize the effect of data assimilation more directly, Figure 3 and 4 show the scatterplots comparing the simulation without assimilating observations (i.e., NOFDDA) and the best simulation with assimilating all the available observations (i.e., FDDA_ WMO_Lidar_ACARS) for the INFLUX project.The improvement due to assimilating observations is evident and the error reduction is significant, especially for wind direction (e.g., nearly 40% reduction in the surface wind direction MAE).Assimilation of upper air wind observations from the INFLUX lidar and ACARS, and temperatures from ACARS clearly improves model performance, especially for wind speed and wind direction (e.g., nearly 50% reduction in upper-air wind direction MAE).
Model errors averaged in the vertical and over the entire two-month period do not represent the temporal and spatial error distributions, both of which are important for the inverse flux corrections at 1-km, five-day resolution.
Figure 5 shows the model error diurnal variation of the WRF-simulated wind direction and wind speed, averaged over the two-month period.It is shown that the wind direction MAEs for all experiments do not show apparent diurnal variations (Figure5a), while wind speed MAEs are larger in daytime than in nighttime (Figure 5b), likely due to larger wind speed in the daytime.It is also possible that the larger daytime errors are due to the effect of PBL large eddies that is not well represented in the model, while during the nighttime winds are more influenced by large-scale weather patterns that are controlled by the large-scale dynamics.FDDA reduces the model MAE quite significantly due to assimilating the three WMO surface observations (e.g., nearly 40% error reduction in the surface wind direction).Note that assimilating lidar in FDDA_ WMO_Lidar and ACARS in FDDA_WMO_Lidar_ACARS is not expected to further reduce the model surface errors since they are upper-air observations.Similar to the MAEs, the wind direction MEs (Figure 5c) do not show diurnal variations.Although the wind direction MEs are quite small (< 10 degrees), error reductions are noticeable in the three FDDA simulations.For wind speed (Figure 5d), like its MAEs, the MEs are larger and positive (model is faster than the observations) in daytime, and FDDA tends to reduce the daytime biases, although the overall MEs are quite small (< 0.6 m s -1 ).Note that assimilating upper-air observations from INFLUX lidar and ACARS pushed the model bias slightly to the negative side.Since the lidar and ACARS observations do not directly influence the surface errors and the magnitudes of the MEs are quite small (Figure 5d and Table 2), these small negative biases are likely the artifact of imbalances introduced by insertion of lidar and ACARS observations, or possibly due to the behavior of the model vertical mixing.
Figure 6 shows vertical MAE and ME distributions for WRF-predicted wind direction (Figure 6a and c) and wind speed (Figure 6b and d) within the lowest 2.5 km AGL, averaged over the two-month period, comparing among four numerical experiments listed in Table 1.For both wind speed and wind direction, NOFDDA has the largest error through the entire atmosphere below ~2 km AGL (except for the wind speed near the 1.5-km level where FDDA_WMO appears to have slightly larger error), with a ~30-degree wind direction error and 2-2.5 m s -1 wind speed error at the surface.Model errors generally decrease Although not significant, the improvement over a vertical layer is expected since the FDDA sub-model in WRF allows the influence of the surface observations to spread to the entire depth of the PBL depending on the stability regime (Deng et al. 2009, Roger et al. 2013); however, the magnitudes of improvements, especially for wind direction, gradually decreases with height.The addition of the INFLUX lidar winds to the data assimilation further reduces model error.Consistent with Tables 2 and 3, the large gap between the FDDA_WMO and FDDA_WMO_Lidar shows that addition of upper air observations creates substantially improved model solutions beyond what can be obtained from assimilating surface observations alone.
It is clear that addition of ACARS observations further reduces model errors although the further improvement is not large (likely because after assimilating the lidar data the model errors are already quite small).The wind direction biases (Figure 6c) are quite small for the low levels (< 5 degrees), and biases increases with height.It is apparent that FDDA reduces the biases.Wind speed biases (Figure 6d) in NOFDDA demonstrate large positive biases within the lowest 1 km although the magnitudes of biases decrease with height, and becomes somewhat negative in mid-and upper-PBL.Assimilating surface observation slightly reduces the low-level biases, and the error reduction appears to be limited to below 1 km, while assimilating lidar and ACARS observations more significantly reduces model biases.

Evaluation of model-predicted PBL depth using the Halo Doppler Lidar data
The PBL depth for this study is manually estimated from the lidar observations for each 20-min time period, identified by large gradients in Signal-to-Noise Ratio (SNR) and the height where the vertical velocity variance becomes small (less than ~0.1 m 2 s -2 ).NOAA is currently implementing and testing an algorithm to automate the PBL depth estimation process for the INFLUX data set.As an example, a comparison of PBL structures between WRF and the INFLUX lidar observations at Indianapolis for 29 and 30 August 2013 is shown in Figure 7. Generally, the model-predicted TKE structures are highly correlated with the lidar-observed vertical velocity variances and large gradients in SNR.The diurnal variation of the PBL structure can be clearly seen within both model output and observations.However, differences in the vertical extent of the PBL depth between the model output and observations are apparent.
Table 4 compares the MAE and ME of the WRF-predicted PBL depth on the 1-km grid verified hourly against the lidar estimates of PBL depth at the lidar site in Indianapolis.The evaluation of PBL depth is conducted only for the daytime period between 17 and 22 UTC when a wellmixed PBL is fully developed and quasi-stationary, for the 2-month period between 00 UTC 27 August and 00 UTC 3 November 2013.Our results show that for all experiments the MAE of model-predicted PBL depth is quite similar, in a range between 223 and 272 m.

Evaluation of inverse emissions
To evaluate the impact of the different WRF simulations on the posterior fluxes estimated from the inversion system, we coupled the WRF-FDDA modeled variables (mean winds and turbulence) to generate the corresponding LPDM tower footprints, or influence functions.Using the different WRF simulations discussed above and their corresponding LPDM tower footprints, the five-day inverse emissions were computed for whole-city emissions using a Bayesian inversion system at 1-km resolution over the urban area of Indianapolis, and the CO 2 mole fraction from the 11 of the 12 towers from the INFLUX tower network (Figure 8), for the entire two-month period ( Sept-Oct 2013).Inverse CO 2 emissions were computed over five-day periods and the configuration was similar to Lauvaux et al. (2016) and kept identical across the four inversions.Therefore, the differences in the inverse emissions correspond to the impact of different transport model realizations, and more precisely the impact of the FDDA systems, assimilating various data sources.
Figure 9 illustrates the different influence functions (or tower footprints) for the 12 tower locations for one single observation time.The extent and the main axes of the tower footprints vary significantly depending on the WRF-FDDA model results, which translates into varying spatial attributions of the observed atmospheric CO 2 mole fractions.Overall, four atmospheric inversions were performed for the period September-October 2013 producing five-day emissions estimates at 1 km resolution over the domain.Figure 10 shows the relative differences among the different transport solutions (influence functions) in space over the simulation domain (equivalent to the 1-km grid of WRF) represented by average pairwise differences.The maximum differences, up to 50%, are located at some of the tower locations emphasizing the importance of the near-field contribution and the high sensitivity of the footprints to wind direction changes.Differences average around 15% across the domain.Once combined with prior information, the impact of different transport fields is   2016), we assume that the error variances scale with the prior emissions, which amplifies the beltway, for example, as a likely locations for corrections in the optimization procedure.Figure 11 shows the maps of the differences across the four different CO 2 inverse flux estimates.As expected, differences are distributed following the error variances and not the spatial differences in the influence functions.As shown in Lauvaux et al. (2016), prior emission errors impact significantly the spatial distribution and the whole-city inverse emissions.Because large spatial gradients are present in the emissions and in the emissions error variances, the minimzation will attribute differences to high-variance areas, lessening the impact of the transport.Here, varying influence functions remains less influential on our inverse solutions than the constraints imposed by structures in the prior emission error covariances.The magnitudes of the flux corrections are as high as 15%, similar to the influence functions.However, the local maxima observed previously at the tower locations are not visible in the CO 2 flux differences.Finally, we show the aggregated impact over the domain for each five-day periods of the two-month inversion.Figure 12 shows that the total impact of meteorological data assimilation on each five-day segment is relatively small compared to the correction by the inversion (shown by the distance to the prior in blue).With the exception of few five-day periods showing larger differences (e.g.September 6-11, 2013), most inversion periods correspond to similar inverse fluxes; this is explained by the compensation of higher and lower emissions distributed spatially across the inversion domain.Differences remain small at the five-day time scale (less than 10%) and over the two months (less than 5% change of the total 2-month emissions).We conclude based on the inversion results that all the transport configurations do not introduce any significant bias in the solution, which confirms that the low systematic errors in the meteorological variables across the four simulations match the small differences in total posterior emissions.Random errors in meteorological variables propagates into the flux solution as additional posterior uncertainties but do not create any systematic errors in the fluxes.While this ensemble of simulations is not calibrated to meteorological observations (e.g.Grimit and Mass, 2007), and therefore does not necessarily represent the true transport errors in the simulation, we show here that for this experiment, the different observations used in the WRF-FDDA simulations do affect the spatial attribution of flux corrections but have a limited impact on total inverse emissions over the entire domain.Furthermore, we computed the Bayes Factors for each model configurations.Results are presented in Table 5.The ratios between the different WRF-FDDA configurations show strong evidences (ratio > 10) that the last two configurations (i.e. with lidar and ACARS observations) better represent the transport compared to the original FDDA (i.e.WMO surface stations only) or no assimilation of any data.The ratio between the last two configurations remains small (about 2.6) which suggests only a moderate evidence of an improvement.These results agree with the meteorological evaluation, allowing us to conclude that the improvement of the transport is also noticeable and statistically significant in the CO 2 flux inverse solutions.

Conclusions
Atmospheric transport model errors limit the accuracy and precision of inverse flux estimates.Assimilation of meteorological observations is a well-established method for reducing transport model errors.This paper represents a first quantification of the impact of meteorological data assimilation on urban-scale CO 2 inverse flux estimates.As expected, meteorological data assimilation significantly reduces transport model error.For example, ~40% error reduction in the surface wind direction and ~50% error reduction in upper-air wind direction are achieved due to data assimilation.This paper demonstrates that observations of winds throughout the PBL have significantly greater value in reducing random errors (or mean absolute error, MAE) in wind speed and wind direction than surface layer observations.Random errors in wind speed and wind direction in the PBL were reduced by approximately a factor of two when PBL wind measurements were assimilated.The transport model showed very small biases in wind speed and direction, and PBL depth prior to meteorological data assimilation, and these small biases were either reduced or unchanged with data assimilation, with the exception that the assimilation of only surface layer observations did cause a noticeable increase in the PBL depth bias.We expect that if our initial meteorological simulation had been biased, that meteorological data assimilation would have reduced these errors as well.Sarmiento et al., (2016, this issue), for example, show that for Indianapolis model-data biases can be significant at different times of year for this configuration, and that biases vary across choices of model configurations and land surface data.
Dedicated observational systems such as the lidar deployed for INFLUX, capable of continuous profiling of PBL winds, are recommended as a straightforward and potent means of minimizing transport errors for urban inversions.The relatively small domains for urban inversions make direct measurements of regional wind fields feasible.We also show, however, that commercial aircraft data from ACARS have a similarly potent influence on atmospheric transport errors.This is promising since most major urban centers are collocated with major airports, thus as long as commercial aircraft meteorological observations are recorded and reported, these data can be used to improve atmospheric transport model performance.
We were unable to explore the quantitative impact of data assimilation on the performance of a biased meteorological model configuration.Further, the data assimilated here did not significantly improve the random errors in PBL depth, and might not have a significant impact on a model configuration with a PBL depth bias.Lidar and ACARS observations, however, could clearly identify such biases, and additional data assimilation approaches could be adopted to address this issue.Improvements to model physics and input data (e.g.Sarmiento et al. 2016, this issue) represent another important approach to improving transport model performance.
Inverse emissions from the four simulations show a significant improvement when using the transport configurations assimilating vertical observations (with lidar and aircraft observations) as demonstrated by the Bayes Factors.The differences in space were directly related to the quality of the transport simulations, with local differences of about 15% in the emission corrections after inversion.However, the whole-city inverse emissions remained similar across the different model configurations (less than 5% change over the two months).This result is in agreement with the low biases in the different meteorological variables before and after meteorological data assimilation.It is reasonable to expect that if model meteorology was initially biased, meteorological data assimilation could have a substantial impact on the timeintegrated, whole-city emissions as well.
In summary, this work shows the benefit of meteorological data assimilation on urban transport model performance, especially in reducing PBL wind speed and direction MAE when assimilating PBL wind profile observations.This reduction in the transport errors for wind speed, direction, and PBL height will provide more robust CO 2 inverse emissions at the city-scale, by improving the spatial attribution of the emission corrections.We expect that similar results would be achieved for biases given transport simulations with initially biased wind speed and wind direction.

Figure 1 :
Figure 1: WRF grid configuration.a) WRF 9-/3-/1-km resolution grids, b) 9-km grid landuse, c) 3-km grid landuse, and d) 1-km grid landuse.Graph b corresponds to the full areas shown in graph a, and graphs c and d are enlargements corresponding to the areas indicated by the black boxes in graph a. Color notation for the landuse panels: Black: urban; Light Yellow: dryland cropland and pasture; Yellow: mixed dryland-irrigated cropland and pasture; Light Green: irrigated cropland and pasture; Green: forest; Blue: water bodies.DOI: https://doi.org/10.1525/elementa.133.f1

Figure 2 :
Figure 2: Distribution of the assimilated observations, for 21 UTC 15 October 2013.a) National Weather Service surface (open circles) and radiosonde (solid circles) observations, b) LIDAR and ACARS at 935 hPa level, c) LIDAR and ACARS at 885 hPa level, and d) ACARS at 250 hPa level.Note that star symbols denote the ACARS observations and solid square denotes the Halo lidar observations.DOI: https://doi.org/10.1525/elementa.133.f2

Figure 3 :Figure 4 :Figure 5 :
Figure 3: Surface layer model performance.WRF-predicted versus observed wind direction (a: NOFDDA, b: FDDA_ WMO_Lidar_ACARS), wind speed (c: NOFDDA, d: FDDA_WMO_Lidar_ACARS) and temperature (e: NOFDDA, f: FDDA_WMO_Lidar_ACARS) on the 1-km resolution grid, between 00 UTC 27 August and 00 UTC 3 November 2013, comparing between NOFDDA (left) and FDDA_WMO_Lidar_ACARS (right) experiments.Note that x-axis represents the model value and y-axis represents the observed value, and each data point represents a model-observation pair averaged for three National Weather Service surface stations inside the 1-km resolution grid at a given hour.The dashed line in the figure represents the y = x line that indicate a perfect model-observation match; thus the points above y = x line represents the situation where model underpredicts and points below represents the situation where model overpredicts.Note that ME represents mean error and MAE represents mean absolute error.DOI: https://doi.org/10.1525/elementa.133.f3

Figure 10 :Figure 11 :Figure 12 :
Figure 10: Spread of influence functions.Spatial distribution of influence function (averaged pairwise differences) over the two months (September-October 2013, 17-22UTC) representing the variability in the tower footprints across the different WRF-FDDA experiments.Note that numbers 1-12 indicate the tower locations as detailed in Figure 8 and two towers were not operational during the time period Oct 26-30.DOI: https://doi.org/10.1525/elementa.133.f10

Table 1 :
FDDA configuration for WRF simulations.DOI: https://doi.org/10.1525/elementa.133.t1 FDDA_WMOWMO surface and upper-air sonde observations of winds, temperature and water vapor FDDA_WMO_Lidar Same as above with addition of INFLUX Halo Doppler lidar winds FDDA_WMO_Lidar_ACARS Same as above with addition of ACARS wind, temperature and water vapor

Table 2 :
Mean error and mean absolute error of the WRF-predicted 10-m wind direction, wind speed and 2-m

Table 4 :
Mean error and mean absolute error of the WRF-predicted PBL depth on the 1-km grid verified against the Indianapolis INFLUX lidar measurements, averaged for all times between 17 and 22 UTC each day for the period between 00 UTC 27 August and 00 UTC 3 November 2013.DOI: https://doi.org/10.1525/elementa.133.t4

Table 5 :
Median ratio of the Bayes Factors derived from the Kalman matrix inversion when comparing the different model configurations for the period between 00 UTC 27 August and 00 UTC 3 November 2013.DOI: https://doi.org/10.1525/elementa.133.t5