R. Hamdi et al.: Coupling SURFEXv5 to ALADINcy36 and ALARO-0
Published by Copernicus Publications on behalf of the European Geosciences Union.

Abstract. The newly developed land surface scheme SURFEX (SURFace EXternalisée) is implemented into a limited-area numerical weather prediction model running operationally in a number of countries of the ALADIN and HIRLAM consortia. The primary question addressed is whether SURFEX can be used as a new land surface scheme, and thus what its potential is for use in an operational configuration instead of the original ISBA (Interactions between Soil, Biosphere, and Atmosphere) scheme. The results show that the introduction of SURFEX either improves or has a neutral impact on the 2 m temperature, 2 m relative humidity and 10 m wind. However, SURFEX tends to produce higher maximum temperatures at high-elevation stations during winter daytime, which degrades the 2 m temperature scores. In addition, surface radiative and energy fluxes improve compared to observations from the Cabauw tower. The results also show that promising improvements with a demonstrated positive impact on the forecast performance are achieved by introducing the town energy balance (TEB) scheme. The use of SURFEX was found to have a neutral impact on the precipitation scores. However, the implementation of TEB within SURFEX for a high-resolution run tends to cause rainfall to be locally concentrated, and the total accumulated precipitation clearly decreases during the summer. One of the novel features developed in SURFEX is the availability of a more advanced surface data assimilation using the extended Kalman filter. The results over Belgium show that the forecast scores are similar between the extended Kalman filter and the classical optimal interpolation scheme. Finally, concerning the vertical scores, the introduction of SURFEX either improves or has a neutral impact on the scores in the free atmosphere.


Introduction
Numerical weather prediction models need parameterizations of the surface processes to estimate the fluxes for physical budgets such as sensible heat, latent heat, momentum and radiation between the atmosphere and surface features such as soil, vegetation and sea. The budgets depend strongly on the characteristics of the underlying surface, and with the increase of resolution in most applications up to kilometre scales, the role of the surface interactions in atmospheric models is steadily increasing.
The international ALADIN (Aire Limitée Adaptation Dynamique Développement International) consortium (ALADIN, 1997) has over the past two decades developed a limited-area model (LAM) to serve the specific needs of its participating partners. Currently this consortium consists of 16 partners, covering Europe and the Mediterranean region and including some north African countries. The code of the ALADIN model (Bubnová et al., 1995) is mostly shared with the code of the French global ARPEGE (Action de Recherche Petite Echelle Grande Echelle) model and the IFS (Integrated Forecast System) of ECMWF (European Centre for Medium-Range Weather Forecasts). The lateral boundary conditions (LBCs) of the operational ALADIN model configurations are imposed by the Davies scheme (Davies, 1976; Radnóti, 1995; Termonia et al., 2012) at regular time intervals of 3 h (Termonia et al., 2009) with LBC data provided by either ARPEGE, IFS or a larger ALADIN domain. For the present study the version of Radnóti (1995) is used.
ALADIN has been further developed with a physics parameterization package called ALARO, which has been designed specifically to be run at convection-permitting resolutions. The key concept behind this package lies in the precipitation and cloud scheme called modular multiscale microphysics and transport (3MT), developed by Gerard and Geleyn (2005), Gerard (2007) and Gerard et al. (2009). The multi-scale behaviour of 3MT has been validated in an NWP context up to a spatial resolution of 4 km (see Gerard et al., 2009). The ALARO model version ALARO-0, which has been used for the present study, utilizes the ACRANEB scheme for radiation (Ritter and Geleyn, 1992), a semi-Lagrangian horizontal diffusion scheme called SLHD (Váňa et al., 2008), a pseudo-prognostic turbulent kinetic energy (TKE) scheme (pTKE, i.e. a Louis-type scheme for stability dependencies, but with memory, advection and auto-diffusion of the overall intensity of turbulence) and a statistical sedimentation scheme for precipitation within a prognostic-type scheme for microphysics (Geleyn et al., 2008). The ALARO physics package is coupled to the dynamics of the ALADIN model via a physics-dynamics interface based on a flux-conservative formulation of the equations proposed by Catry et al. (2007). The configuration of the model with these physics has been running operationally for the national NWP applications since 2008 in a number of countries of the ALADIN and HIRLAM consortia (At, Be, Cz, Hr, Hu, No, Pt, Ro, Se, Si, Sk and Tr).
Historically, the ARPEGE, ALADIN and ALARO models relied on the ISBA scheme developed by Noilhan and Planton (1989) and Noilhan and Mahfouf (1996) for the parameterization of the surface processes. It is also used within the ARPEGE climate model of Météo-France (Mahfouf et al., 1995). The ISBA scheme has also been implemented in the meso-NH model of Météo-France (Lafore et al., 1998). Masson (2000) developed the town energy balance (TEB) scheme for the simulation of the interactions with urban areas and this scheme became part of the meso-NH model. Within the ALADIN community the code also runs with the physics parameterization of meso-NH. This configuration is called the AROME model (Seity et al., 2011).
During the last decade, the surface scheme, including ISBA and TEB, has been externalized from the core of the atmospheric meso-NH model following the approach of Polcher et al. (1998) and Best et al. (2004). This led to the creation of the SURFEX scheme (SURFace EXternalisée). Additionally, parameterizations for all components of the surface (ocean and inland water) have been added to SURFEX. Recently, a new multilayer parameterization for the natural and urban canopy (Hamdi and Masson, 2008; Masson and Seity, 2009) was also added to SURFEX in the so-called CANOPY scheme. The rationale for this externalization was twofold. First, once the externalization is done and the scheme is plugged into an application, it becomes available within all the applications. Secondly, SURFEX contains the ISBA scheme for its soil and vegetation interactions, so there is a priori no need to maintain the ISBA scheme separately in the different model versions ARPEGE, ALADIN, AROME, ALARO, ARPEGE climate and ALADIN climate. In operational contexts it is important that the scheme is sufficiently numerically stable to run with the long time steps imposed by the operational applications. Hence the implicit coupling proposed by Best et al. (2004) has been used. The physiographic characteristics of the surface in SURFEX are specified by the ECOCLIMAP database (see Masson et al., 2003, and Champeaux et al., 2005). An extra advantage of this externalization is that SURFEX can be used in an offline mode for scientific applications where the atmospheric feedbacks are not taken into account, for instance for studying the urban heat island (UHI) evolution (Hamdi et al., 2009, 2011).
The value of operational weather forecasts is determined by verification scores. So if the particular ISBA scheme in one of the models other than the AROME model is replaced by the ISBA version in the SURFEX scheme, one would a priori expect to reproduce exactly the same model performance. This is the problem of reproducibility in model development. However, the implementation of ISBA in ALADIN and its evolution in meso-NH and then later its implementation in AROME diverged slowly, and as a result the versions are no longer interchangeable. An attempt to reproduce the model behaviour by replacing the old ISBA scheme within the ALADIN configuration by the SURFEX-ISBA did not succeed. Nevertheless, the question of the first rationale still stands: why should one maintain different ISBA schemes to serve a large community of users? The preferable version of the two ISBA schemes is the one within SURFEX, due to its higher potential for rapid scientific evolution. For instance, new surface data assimilation schemes are being developed for SURFEX, as will be briefly discussed in the last section of the present paper, and it will be necessary to switch to SURFEX in order to benefit from these developments.
The aim of the present study is to address the following question: can one, by exploiting the novel features developed in SURFEX over the past decade plus the additional options in the configuration of the atmospheric part of ARPEGE/ALADIN/ALARO models, reproduce forecast performance that is equivalent or better in terms of the set of verification scores that are put forth in the operational context of each of the participating ALADIN partners? Apart from the user-oriented goal of allowing a science-based decision for the configuration of the NWP system by each partner within the consortium, this provides a very extensive validation of the SURFEX scheme rather than a specific validation such as in Hamdi et al. (2012) for the use of TEB within ALARO. Finally, it should be stressed that the present paper does not address other important issues which represent crucial criteria, such as efficiency, code optimization, code design, the interface to the atmospheric part and the user-friendliness of the SURFEX implementation.

Model: description and configurations
The description of SURFEXv7.2 can be found in Masson et al. (2013) (in this SURFEX special issue). Note that at the time of testing, we used version 5 of SURFEX. Table 1 presents a summary of the different model configurations available within the model code.

Two radiation schemes
There are two radiation schemes available in the model code. AROME and ALADIN use the ECMWF radiation scheme (referred to as FMR hereafter). It has a shortwave radiation scheme (Fouquart and Bonnel, 1980) with 6 spectral bands, whereas the longwave radiation with 16 spectral intervals is computed by the rapid radiative transfer model (RRTM) code (Mlawer et al., 1997) using climatological distributions of ozone and aerosols. For the ozone monthly profiles it uses the analytical functions that have been fitted to the UK Universities' Global Atmospheric Modelling Programme (UGAMP) climatology (Li and Shine, 1995). Distributions of organic, sulfate, dust-like and black carbon, as well as uniformly distributed stratospheric background aerosols, are extracted from the Tegen climatology (Tegen et al., 1997).
The ALARO physics package has been developed with the ACRANEB scheme built on Ritter and Geleyn (1992). This is a two-stream approximation with a net exchange rate (NER) formulation for solving the thermal part. All the [...] (Malkmus, 1967). The scheme has been extended by using a Voigt-line profile for coping with the high model levels (Geleyn et al., 2005). These schemes, FMR and ACRANEB, represent two different approaches to the problem of the extensive computing cost of radiation schemes. FMR is called intermittently to save computing costs. Only the shortwave flux dependency on the zenithal solar angle is updated at every time step. The rest of the radiation computations are updated with a frequency of 1 h for ALADIN and 15 min for AROME. This is how SURFEX is used in the Météo-France versions of ALADIN (Masson et al., 2013). ACRANEB, on the other hand, is in itself designed for cost effectiveness and is called every time step. Both schemes can be called in all model versions of the ARPEGE/ALADIN/ALARO model configurations.

Urban effects
TEB is based on the canyon concept, where the town is represented by a roof, a road and two facing walls. The advantage is that relatively few individual surface energy balance evaluations need to be resolved and radiation interactions are simplified, and therefore computation time is kept low. Water, energy and momentum fluxes are computed by each parameterization and then aggregated at the grid-mesh scale according to the cover fraction of each tile.
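The aggregation step described above can be illustrated with a minimal sketch, assuming that the grid-mesh flux is simply the cover-fraction-weighted mean of the per-tile fluxes. The function name, the tile names and all numerical values are hypothetical and are not taken from the SURFEX code.

```python
def aggregate_tiles(fluxes, fractions):
    """Grid-cell flux as the cover-fraction-weighted mean of per-tile fluxes."""
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("tile cover fractions must sum to 1")
    return sum(fractions[tile] * fluxes[tile] for tile in fractions)


# Hypothetical sensible heat fluxes (W m-2) and cover fractions for one grid cell
sensible_heat = {"sea": 10.0, "lake": 20.0, "nature": 80.0, "town": 150.0}
cover = {"sea": 0.0, "lake": 0.1, "nature": 0.6, "town": 0.3}

grid_flux = aggregate_tiles(sensible_heat, cover)
print(f"grid-cell sensible heat flux: {grid_flux:.1f} W m-2")
```

In this illustration a tile with zero cover fraction contributes nothing, which mirrors the situation where a tile is not present in a grid cell.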
For operational applications running with long time steps, the TEB scheme is not activated and the town is replaced by rocks. The ISBA scheme is therefore used for all grid points of the domain because of numerical instabilities in the coupling with explicitly computed TEB variables at the time of testing. This is how SURFEX is used in the French double suite of ALADIN (Masson et al., 2013).
[...] interpolation between the lowest level and the surface, making use of the stability functions of the dry static energy and applying the Monin-Obukhov similarity theory for the surface boundary layer (Geleyn, 1988). However, Best and Hopwood (2001) found that the choice of stability functions at night can have a significant impact on both the surface temperature and the sensible heat flux and therefore on the diagnostic of screen temperature in stable situations.

Surface boundary layer computation
In order to improve the description of the physical coupling between the air and the surface, a one-dimensional surface boundary layer has been implemented in SURFEX (CANOPY scheme) following the methodology described in Hamdi and Masson (2008) and Masson and Seity (2009). With this version, six prognostic air layers (0.5, 2, 4, 6.5, 10, and 17 m above the ground) are added from the ground up to the lowest atmospheric level. The surface boundary layer is thus resolved prognostically (there is no need for analytical extrapolation such as in Geleyn, 1988), taking into account large-scale forcing, turbulence and, if any, drag and canopy forces.

Surface data assimilation
The initialization of the soil variables is very important in order to provide accurate short- and medium-range forecasts. Surface assimilation techniques mainly use screen-level observations of relative humidity and temperature to infer realistic estimates of the soil variables (i.e. soil moisture and soil temperature) by optimally combining the screen-level observations with a short-range forecast. Two common soil analysis techniques are Optimum Interpolation (OI) and the extended Kalman filter (EKF) or a simplified version of the extended Kalman filter (SEKF) in which the background error covariance matrix is kept constant.
A local OI algorithm is available in SURFEX. Its coefficients have an analytical formulation that mostly depends on the diurnal cycle and the vegetation fraction. The coefficients have been derived from Monte Carlo single-column experiments performed by Mahfouf (1991) with an analytical formulation proposed by Giard and Bazile (2000). A drawback of the OI is that it is difficult to incorporate new observation types that may improve the analysis. An alternative method is the EKF, for which it is easier to add new observation types. An EKF has been developed for SURFEX that is capable of assimilating screen-level observations (Mahfouf et al., 2009), and has been extended to include AMSR-E surface soil moisture retrievals (Draper et al., 2009), radar precipitation information (Mahfouf and Bližňák, 2011) and ASCAT surface soil moisture (Mahfouf, 2010). In contrast to OI, the EKF uses dynamical coefficients that depend on the Jacobian of the model observation operator which projects the model state into the observation space. The Jacobian elements are calculated using a finite differences approach by comparing a perturbed run to a reference run for each of the soil prognostic variables. In order to make the EKF computationally efficient, these runs are calculated using SURFEX in offline mode, i.e. with the surface scheme decoupled from the atmospheric model.
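The finite-difference estimate of the Jacobian can be sketched as follows. This is only an illustration of the perturbed-run versus reference-run idea: `toy_forecast` is a hypothetical linear stand-in for an offline surface run, and the state values and perturbation sizes are assumptions chosen for the example, not settings used in SURFEX.

```python
def finite_difference_jacobian(obs_operator, state, perturbations):
    """Approximate d(obs)/d(state) column by column from perturbed runs."""
    reference = obs_operator(state)  # reference run
    jacobian = [[0.0] * len(state) for _ in reference]
    for j, delta in enumerate(perturbations):
        perturbed = list(state)
        perturbed[j] += delta  # perturb one soil variable at a time
        response = obs_operator(perturbed)  # perturbed run
        for i, ref_value in enumerate(reference):
            jacobian[i][j] = (response[i] - ref_value) / delta
    return jacobian


def toy_forecast(state):
    """Hypothetical stand-in for an offline run: soil state -> (T2m, RH2m)."""
    t_s, t_2, wg_1, wg_2 = state
    t2m = 0.7 * t_s + 0.3 * t_2
    rh2m = 40.0 + 50.0 * wg_1 + 100.0 * wg_2
    return [t2m, rh2m]


state = [290.0, 288.0, 0.20, 0.25]      # T_s, T_2 (K), WG_1, WG_2 (m3 m-3)
perturbations = [0.1, 0.1, 1e-3, 1e-3]  # illustrative perturbation sizes
H = finite_difference_jacobian(toy_forecast, state, perturbations)
```

Each column of `H` costs one additional run of the observation operator, which is why running the surface scheme offline keeps the approach affordable.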

Operational validation
The use of SURFEX as a new land surface scheme for the ALADIN and ALARO models has been extensively tested during the last two years by several partners of the ALADIN consortium. In Masson et al. (2013), SURFEX was tested within the ALADIN model running over France and using the FMR radiation scheme. Those authors found that the introduction of SURFEX had a neutral impact on surface pressure, precipitation, total cloudiness and 10 m wind direction but improved the scores for the 2 m temperature and humidity and 10 m wind speed. In the present study a more complete set of tests will be presented over the operational Belgian domain, while giving pertinent illustrations for the other partners (Hungary, Morocco, Poland, Slovenia, and Turkey).
At the Royal Meteorological Institute (RMI) of Belgium, the operational version of the code is the ALARO configuration, running with the ACRANEB radiation scheme and ISBA, with resolutions of 7 km and 4 km (see Fig. 1). Tests were carried out to replace the ISBA scheme with SURFEX for the 7 km domain and, additionally, to make a comparison by switching on TEB for the 4 km high-resolution domain. The primary goal of this study is to examine the operational viability of ALARO coupled with SURFEX. As a result, the setup of the ALARO model was designed to mimic an operational configuration over the domain presented in Fig. 1. It is a regular grid on a Lambert projection, with its centre at (50.57° N, 4.55° E), and the domain is vertically divided into 46 layers, separated by hybrid pressure terrain-following levels (Simmons and Burridge, 1981). The height of the lowest layer is about 17 m above the ground. The model time step is 300 s and 180 s for the 7 km and 4 km domain, respectively. The ALARO model is run operationally four times a day (at 6 h intervals) based on the ALADIN France analyses; ALADIN France is also the model providing the 3 h lateral boundary coupling data. Forecasts of 60 h and 36 h are issued from the 00:00, 06:00, 12:00 and 18:00 UTC nominal analysis times for the 7 km and 4 km domain, respectively. Vertical model fields are post-processed by interpolation of fields onto pressure or altitude levels each hour. For non-urban surfaces, the SURFEX scheme diagnoses the 2 m temperature, 2 m relative humidity and 10 m wind by applying the interpolation method of Geleyn (1988). For urban areas, the standard 2 m temperature, 2 m humidity and 10 m wind are obtained from the diagnosed TEB canyon temperature, humidity and wind, respectively. Three tiles are activated (sea, nature, lakes); town is replaced by rock for the 7 km domain, while TEB is used for the 4 km domain. A three-layer force-restore version of ISBA is used (instead of the former two-layer version) with a one-layer snow scheme of Douville et al. (1995). The ECUME (Exchange Coefficients from Unified Multi-campaign Estimates) parameterization of sea surface fluxes is used over seas (Belamari and Pirani, 2007). It is a bulk iterative scheme developed in order to obtain an optimized parameterization covering a wide range of atmospheric and oceanic conditions, while ALARO used the classical Charnock formula (Charnock, 1955). Physiographic data have also been improved compared to those used by ALARO (GTOPO30, ECOCLIMAP (Masson et al., 2003), and FAO maps (FAO, 2006) for soil texture).
For two months (January 2010 and July 2010), a series of simulations is performed, with (OPER + SFX) and without SURFEX (OPER), with one simulation of 36 h (60 h for the 7 km domain) each day, starting at 00:00 UTC (from the operational ALADIN French forecast model analysis). The comparison with observations is then done every 3 h of forecast time. The results are presented separately for the two months, representing the two types of season, and for stations located in flat topography, high-elevation and coastal environments. The statistical scores computed are the bias and the root-mean-square error (rmse) between model and observations for all simulations (31 in January and 31 in July). The statistical significance of the differences between OPER + SFX and OPER simulations will be quantified by confidence intervals computed with bootstrap techniques (Wilks, 1995). Confidence intervals are calculated by re-sampling the 31-fold samples, for January and July, 1000 times and taking the 2.5 % and the 97.5 % percentiles of the difference |bias| {OPER + SFX} − |bias| {OPER} (or, for the rmse, rmse {OPER + SFX} − rmse {OPER}) as lower and upper values to get a 95 % confidence interval for the difference. For instance, if the upper value is negative, this means that the null hypothesis ("the difference of the two |bias| or rmse values is negative and there is an improvement when using SURFEX") is accepted with a 97.5 % confidence level.
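The bootstrap procedure can be sketched in a few lines. This is a minimal illustration and not the verification code used for the study: it assumes that the same resampled days are used for both experiments (paired resampling), it uses a simple sorted-index percentile, and the 31 daily errors are synthetic numbers generated only for the example.

```python
import random


def abs_bias(errors):
    """Absolute value of the mean error."""
    return abs(sum(errors) / len(errors))


def rmse(errors):
    """Root-mean-square error."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


def bootstrap_ci(err_sfx, err_oper, score, n_resamples=1000, seed=0):
    """95 % interval of score(OPER + SFX) - score(OPER) from resampled days."""
    rng = random.Random(seed)
    n_days = len(err_sfx)
    diffs = []
    for _ in range(n_resamples):
        # Draw days with replacement; the same days are used for both runs
        days = [rng.randrange(n_days) for _ in range(n_days)]
        diffs.append(score([err_sfx[d] for d in days])
                     - score([err_oper[d] for d in days]))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper


# Synthetic forecast-minus-observation errors for 31 days (illustration only)
data_rng = random.Random(1)
err_oper = [2.0 + data_rng.gauss(0.0, 1.0) for _ in range(31)]
err_sfx = [0.2 + data_rng.gauss(0.0, 1.0) for _ in range(31)]

lower, upper = bootstrap_ci(err_sfx, err_oper, abs_bias)
improved = upper < 0.0  # whole interval below zero: significant improvement
```

Passing `rmse` instead of `abs_bias` as the `score` argument gives the corresponding interval for the rmse difference.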
The parameters that are compared are 2 m temperature, 2 m relative humidity and 10 m wind. We recognize that single-station measurements cannot capture the spatial variability within the ALARO grid cells. In an ideal situation, a high sampling density of measurements would be used to provide a spatial average to validate the performance of the model.
[...] of eight stations belonging to the synoptic network of the Royal Meteorological Institute of Belgium), where the + sign means improvement, 0 means neutral effect, and the − sign means degradation of the scores with respect to the 95 % confidence levels calculated with the bootstrap method. During the winter night-time (which is longer in January than in July), forecasted 2 m temperatures are generally colder than observations over Belgium for both simulations, with and without SURFEX. The origin of the cold bias is that the model physics yields too little near-surface vertical turbulent mixing during calm night-time conditions (i.e. stable night-time low-level temperature inversions, referred to as the stable boundary layer). This problem is amplified in the cold season because of longer nights, the increased tendency during the cold season of night-time winds to become very weak, and the cooling effect of snow cover yielding even stronger night-time temperature inversions (see Hamdi (2009) for more details). Moreover, the night-time situation has a positive feedback character, because as the low-level inversion sets in, the surface vertical turbulent mixing of heat falls off, which in turn acts to strengthen the inversion and so forth. In addition, Best and Hopwood (2001) found that the choice of stability functions at night can have a significant impact on both the surface temperature and the sensible heat flux and therefore on the diagnostic of screen temperature in stable situations. In fact, using Monin-Obukhov similarity theory with log-linear stability functions cuts off the flux of heat with increasing stability too quickly compared to the observations (Best and Hopwood, 2001). This leads to incorrect lower surface temperatures as the warmer atmospheric air is no longer mixed down to the surface (Masson and Seity, 2009). The average mean bias and rmse for the Uccle station (Flat) are significantly reduced when using SURFEX. It can also be seen from Fig. 3 that the improvement of bias and rmse is statistically significant. The average mean bias is significantly reduced when using SURFEX, with an average of +2 °C for OPER versus almost zero for OPER + SFX at the Uccle station. It can also be seen from Table 2 that the OPER + SFX simulation gives better results at the coast. The improvement of bias and rmse during the summer is statistically significant. For the high-elevation synoptic station, the use of SURFEX has a neutral impact on the scores and the null hypothesis is not accepted during winter and summer.

2 m temperature
During winter, OPER + SFX has a tendency to produce maximum temperatures that are too high at a high-elevation station. The average mean bias is significantly warmer when using SURFEX, with an average of 1 °C for OPER versus 1.5 °C for OPER + SFX. It can also be seen from Table 2 that OPER + SFX did not give any improvement and the null hypothesis is not accepted during the winter. However, during the summer, OPER + SFX gives an improvement. For the flat topography and coastal synoptic stations, the use of SURFEX either improves or has a neutral impact on the scores. Masson and Seity (2009) found that the use of CANOPY improves the forecast of near-surface air temperature at night for strong stability conditions.

2 m relative humidity
Figure 5 presents the scores obtained for the Uccle station and Fig. 6 shows the improvement in bias and rmse obtained when using SURFEX. Table 2 shows the average daytime/night-time scores. The temperature results correlate with the 2 m relative humidity results, which show a large improvement during winter and summer. It can also be seen from Fig. 6 that during winter, OPER + SFX significantly improves the scores. However, during the summer, the improvement is only seen during the night-time.
Over the Slovenia domain, SURFEX has also been tested within the ALARO model using the FMR radiation scheme for two short test periods: 4-11 February 2011 and 12-17 July 2011. At its introduction, SURFEX was tested with two horizontal resolutions (4.4 and 9.5 km). Tables 3 and 4 present the average daytime/night-time 2 m temperature and relative humidity scores for five locations for the 9.5 km and 4.4 km horizontal resolution, respectively. For this short period, scores are in general neutral or marginally positive; only in some cases is there a medium deterioration (particularly in wintertime for the 9.5 km run with a cold bias at Novo Mesto, Kranjska Gora and Ljubljana stations). SURFEX yields improved relative performance for the high-resolution run. For the 4.4 km run almost all scores are neutral or positive in the winter and summer periods. Significant deterioration is only observed in Kranjska Gora for the 2 m relative humidity during winter night-time.
In Poland, SURFEX was tested during the last ten days of March 2011 within the ALADIN operational suite, and the results show a neutral impact on the 2 m temperature and relative humidity scores.
As can be seen from Figs. 7 and 8 and Table 2, the use of SURFEX has a neutral impact on the 10 m wind direction, while it improves the 10 m wind speed during the night for flat and coastal stations.

Surface fluxes: test with data from Cabauw tower
The Cabauw tower is situated in the central river delta in the south-western part of the Netherlands, 0.7 m below mean sea level. The surroundings are flat and consist of meadows and ditches with scattered villages, orchards and lines of trees.
The immediate surroundings of the tower are free of obstacles up to a few hundred metres in all directions, with the local surface consisting mainly of short grass. For the predominant wind direction (south-west), the flow is unperturbed over an upstream distance of about 2 km. The routine observations include profiles of wind speed, wind direction, air temperature and dew point temperature at 10, 20, 40, 80, 140 and 200 m above ground level. The temperature is also measured at 2 m, and fluxes of momentum and heat at 5 m. In addition, there are sensors for a number of surface radiation fluxes and precipitation at the site (see www.cosmo-model.org/srnwp/view/). Figure 9 presents the scores obtained with the 4 km domain for the 2 m temperature at the Cabauw station, and Fig. 10 shows the improvement in bias and rmse obtained when using SURFEX.

Night-time
Just as found for the station, the average mean bias and rmse for the Cabauw station are significantly reduced during the summer when using SURFEX (see Fig. 9). It can be seen from Fig. 10 that the improvement of bias and rmse is statistically significant. During the summer the average mean bias is significantly reduced, with an average of +1.5 °C for OPER versus almost zero for OPER + SFX. During the winter, the OPER + SFX simulation did not give any improvement and the null hypothesis is not accepted. As can be seen from Table 5, there is also a significant improvement of the upward longwave radiation and storage heat flux during the summer night-time. In fact, the average mean bias and rmse of the storage heat flux are significantly reduced when using SURFEX (not shown), with an average overestimation of 10 W m⁻² for OPER + SFX versus 34 W m⁻² for OPER. The use of SURFEX has a neutral impact on the partitioning between sensible and latent heat flux during summer and winter (their values are very small during the night).

Daytime
During daytime, the use of SURFEX has a neutral impact on the 2 m temperature at the Cabauw site. However, as can be seen from Table 5, there is a significant improvement of the upward shortwave radiation. In fact, the average mean bias and rmse of the upward shortwave radiation flux are significantly reduced (by up to 10 W m⁻², not shown) during the summer when using SURFEX. There is also a significant improvement of the surface heat flux, especially during the summer, with a reduction (not shown) of up to 20 W m⁻² in the rmse of sensible and latent heat flux. These improvements in the upward radiative flux and surface heat flux when using SURFEX are probably due to (i) the use of improved physiographic data within the ECOCLIMAP database compared to those used by ALARO, and (ii) the tiling approach used in SURFEX, since TEB was also activated for the 4 km domain. Finally, a three-layer force-restore version of ISBA is used within SURFEX instead of the former two-layer version used by ALARO.

Urban effect
Recently, in Hamdi et al. (2012), the TEB scheme was implemented within ALARO, running operationally at 4 km resolution. The primary question addressed was whether TEB works properly at this relatively coarse resolution, and thus what its potential is for use in an operational configuration to improve sensible weather performance over Belgium.
Results in Hamdi et al. (2012) show that promising improvements are achieved by introducing TEB. The 2 m temperature and 2 m relative humidity improve compared to measurements in urban areas. Important urban characteristics, such as increased heat storage and Bowen ratio and the urban heat island effect, were successfully reproduced. In addition, comparison of wind speed and wind direction above the urban canopy indicates that the structure of the flow in urban areas is better reproduced with TEB (Hamdi et al., 2012). These improvements of the treatment of the urban areas within ALARO have implications for simulating air chemistry processes over Belgium at this scale (Delcloo et al., 2012). The use of TEB within SURFEX has also been tested over Turkey using the ALARO model and the FMR radiation scheme at 4 km resolution. Figure 11 presents the rmse of 2 m temperature and 2 m relative humidity against observations at the Istanbul city station averaged over July 2010 for ALARO run with SURFEX and ALARO with SURFEX and TEB. The results show a positive impact when activating TEB within SURFEX. The forecasted 2 m temperature and 2 m relative humidity improve compared to measurements in Istanbul, especially during the night-time, which is due to the urban heat island effect of Istanbul.
During the night the average mean bias of the 2 m temperature is reduced (not shown), with an average cold bias of −1 °C for ALARO with SURFEX versus almost 0 °C for ALARO with SURFEX and TEB. Also during the day the average mean bias of the 2 m relative humidity is significantly reduced (not shown), with an average of +15 % for ALARO with SURFEX versus +8 % for ALARO with SURFEX and TEB.

Precipitation
In order to investigate the influence of introducing SURFEX on winter and summer precipitation, the precipitation fields of the run with (OPER + SFX) and without SURFEX (OPER) are verified against quantitative precipitation estimates from a radar-gauge merging method (Goudenhoofdt and Delobbe, 2009) using the SAL (structure, amplitude and location) method of Wernli et al. (2008). This method characterizes the quality of a forecasted precipitation field by means of three components: structure, amplitude and location. The structure component characterizes the size and shape of the precipitation objects and ranges from −2 (predicted precipitation objects too small or too peaked) to 2 (predicted precipitation objects too large or too flat). The value of S = 0 indicates that the model has the correct structure. The amplitude component also varies between −2 and 2, with a value of −2 indicating an under-predicted total precipitation amount, a value of 2 indicating an over-predicted total precipitation amount and 0 denoting a perfect forecast in terms of amplitude. Finally, the location component quantifies whether the predicted precipitation objects are situated at the correct location, and ranges from 0 (predicted precipitation objects at correct position) to 2 (predicted precipitation objects at incorrect position). Figure 12 shows the structure and amplitude precipitation scores for January 2010 for the ALARO 7 km with (SFX) and without (OPER) SURFEX against radar observations. As a sensitivity test, SAL scores were also computed for the run with SURFEX against the operational runs. Table 6 presents the average (for January and July 2010) SAL scores for the 4 km and 7 km runs with and without SURFEX.
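As an illustration of how the amplitude component behaves, the sketch below assumes the usual definition from Wernli et al. (2008), in which A is the normalized difference of the domain-averaged precipitation of the forecast and the observations. Only the amplitude is shown; the structure and location components require the identification of precipitation objects and are not reproduced here. The function names and the small precipitation fields are hypothetical.

```python
def domain_mean(field):
    """Domain-averaged precipitation of a 2-D field given as a list of rows."""
    values = [value for row in field for value in row]
    return sum(values) / len(values)


def sal_amplitude(model_field, observed_field):
    """Amplitude component A, bounded by -2 (too dry) and +2 (too wet)."""
    d_mod = domain_mean(model_field)
    d_obs = domain_mean(observed_field)
    if d_mod + d_obs == 0.0:
        raise ValueError("amplitude is undefined when both fields are dry")
    return (d_mod - d_obs) / (0.5 * (d_mod + d_obs))


# Hypothetical accumulated precipitation fields (mm) on a 2 x 3 grid
observed = [[0.0, 2.0, 4.0], [1.0, 3.0, 2.0]]
too_dry = [[0.5 * value for value in row] for row in observed]

a_dry = sal_amplitude(too_dry, observed)       # forecast with half the rain
a_perfect = sal_amplitude(observed, observed)  # identical fields
```

A forecast with half the observed domain-mean precipitation gives A = −2/3, which shows that A values close to zero correspond to small relative errors in the total amount.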
From Fig. 12 and Table 6, it appears that the use of SURFEX has a neutral impact on the three components of the SAL method when comparing the ALARO runs against the observations. However, it seems that the use of SURFEX tends to cause rainfall to be locally concentrated (S < 0), and the total accumulated precipitation decreases slightly (A < 0). When comparing the 4 km runs against observations during July 2010, this effect becomes clearer, with A = 0.0548 for OPER against A = 0.0161 for OPER + SFX. Thus the use of SURFEX slightly reduces the bias of the total precipitation amount (the cross marker is closer to the centre, not shown). Hamdi et al. (2012) found that the implementation of TEB within SURFEX for the 4 km run during the summer tends to cause rainfall to be locally concentrated, and the total accumulated precipitation obviously decreased, but extended validation would be needed to address this further.

Surface data assimilation
In order to compare OI and the EKF for surface assimilation, several experiments were run. All experiments have the same setup. The experiments are run with ALARO in combination with the external land surface model SURFEX. All runs were performed on the 4 km domain with 46 vertical levels. Surface assimilation is performed every 6 h. There is no atmospheric assimilation, as in Mahfouf et al. (2009). The screen-level relative humidity and temperature observations are taken from SYNOP and TEMP reports in the Meteorological Archival and Retrieval System (MARS). The screen-level observations are interpolated on the model grid using an optimum interpolation technique with high background error covariances to minimize the influence of the analysis background. The gridded observations are then used for the pointwise EKF or OI assimilation. The parameters used for the EKF are the following: the observation error covariance matrix R is a diagonal matrix with elements set to 1 K for 2 m temperature and 10 % for 2 m relative humidity. The background error covariance matrix B is also a diagonal matrix, with values of 2 K for the background errors of surface and deep soil temperature ($T_s$ and $T_2$) and $0.1\,(W_{fc} - W_{wilt})$ for surface and deep soil moisture content ($WG_1$ and $WG_2$), with $W_{fc}$ and $W_{wilt}$ the volumetric water content at field capacity and at permanent wilting point. The EKF is simplified by assuming a constant B matrix and is therefore a SEKF. The setup and values are the same as in Mahfouf et al. (2009). Runs have been performed with surface assimilation (EKF and OI), without assimilation where surface fields are taken from the previous 6 h forecast of the coupled model (free run), and without assimilation where surface fields are interpolated from an ARPEGE analysis (open loop). The experiments were run over the period of one month, July 2010.
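To make the roles of R and B concrete, the following sketch evaluates one pointwise SEKF analysis increment using the standard Kalman gain, K = B Hᵀ (H B Hᵀ + R)⁻¹, for the four control variables and two screen-level observations listed above. It is only a sketch under stated assumptions, not the SURFEX implementation: the function name, the soil properties, the Jacobian values and the innovations are hypothetical, and the quoted error values (1 K, 10 %, 2 K) are interpreted here as standard deviations and squared to obtain the diagonal variances.

```python
def sekf_increment(jacobian, b_var, r_var, innovation):
    """Analysis increment dx = K (y - H(x_b)) with diagonal B and R.

    jacobian   : 2 x n matrix H (rows: observations, columns: control variables)
    b_var      : n background error variances (diagonal of B)
    r_var      : 2 observation error variances (diagonal of R)
    innovation : 2 observation-minus-background departures
    The explicit 2 x 2 inverse below assumes exactly two observations.
    """
    n, m = len(b_var), len(r_var)
    # S = H B H^T + R
    s = [[sum(jacobian[k][i] * b_var[i] * jacobian[l][i] for i in range(n))
          + (r_var[k] if k == l else 0.0)
          for l in range(m)] for k in range(m)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det, -s[0][1] / det],
             [-s[1][0] / det, s[0][0] / det]]
    # K = B H^T S^-1
    gain = [[sum(b_var[i] * jacobian[k][i] * s_inv[k][l] for k in range(m))
             for l in range(m)] for i in range(n)]
    return [sum(gain[i][l] * innovation[l] for l in range(m)) for i in range(n)]


# Hypothetical soil properties (volumetric water content, m3 m-3).
w_fc, w_wilt = 0.30, 0.15
# Control variables: Ts, T2, WG1, WG2; observations: T2m (K), RH2m (%).
sigma_b = [2.0, 2.0, 0.1 * (w_fc - w_wilt), 0.1 * (w_fc - w_wilt)]
sigma_r = [1.0, 10.0]
b_var = [s ** 2 for s in sigma_b]
r_var = [s ** 2 for s in sigma_r]
# Hypothetical Jacobian; in practice it is estimated from perturbed model runs.
jacobian = [[0.3, 0.1, -5.0, -40.0],
            [-1.0, -0.5, 50.0, 300.0]]
innovation = [-1.0, 5.0]  # model too warm and too dry at screen level
print(sekf_increment(jacobian, b_var, r_var, innovation))
```

Because B is held constant from one cycle to the next, only the Jacobian and the innovations change between analyses, which is what distinguishes the simplified filter from a full EKF.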
Figure 13 shows the increments for $WG_1$ (top) and $WG_2$ (bottom) accumulated over the month of July 2010 for the OI run (left) and the EKF run (right). For $WG_1$ the spatial structure of the increments is similar for OI and EKF, but the increments of OI have larger values than those of EKF. This is due to the fact that the EKF has dynamical coefficients that are better able to simulate the weak link between the screen-level errors and the superficial soil moisture content (Mahfouf et al., 2009). The accumulated increments for $WG_2$ show more differences in spatial structure and sign between OI and EKF. The spatial structure of the OI increments is much smoother and the values of the EKF increments are somewhat higher. The irregular spatial structure of the $WG_2$ increments for the EKF and their differences with the OI increments stem from the different handling of negative soil wetness index (SWI) values in OI and the EKF. SWI is defined as $(WG_2 - W_{wilt})/(W_{fc} - W_{wilt})$. If the soil moisture content is between the wilting point and the field capacity (i.e. SWI between 0 and 1), the assimilated screen-level observations are sensitive to changes in the soil moisture content; that is, the gain coefficients will be different from zero (Balsamo et al., 2004). In regions where the SWI is below 0 or above 1, the screen-level variables are not sensitive to changes in soil moisture content.
In OI this sensitivity to the SWI value is explicitly coded. For soil moisture below the wilting point, only positive or zero increments are allowed, while for soil moisture above the field capacity only negative or zero increments are allowed. If the soil moisture is in the SWI sensitivity region, increments are allowed but limited in size so that they do not push the soil moisture content outside of the SWI sensitivity region.
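The rule described above can be written down in a few lines. The sketch below is a paraphrase of that description and not the operational OI code: the function names and the soil values in the example are hypothetical, and the clipping inside the sensitivity region is assumed to stop exactly at the wilting point and at field capacity.

```python
def swi(wg, w_wilt, w_fc):
    """Soil wetness index: 0 at the wilting point, 1 at field capacity."""
    return (wg - w_wilt) / (w_fc - w_wilt)


def limit_oi_increment(wg, increment, w_wilt, w_fc):
    """Apply the SWI-dependent constraints to a raw soil moisture increment."""
    if wg < w_wilt:
        # Below the wilting point: only positive or zero increments.
        return max(increment, 0.0)
    if wg > w_fc:
        # Above field capacity: only negative or zero increments.
        return min(increment, 0.0)
    # Inside the sensitivity region: keep wg + increment within [w_wilt, w_fc].
    return min(max(increment, w_wilt - wg), w_fc - wg)


# Hypothetical soil properties (m3 m-3).
w_wilt, w_fc = 0.15, 0.30
for wg, raw in [(0.10, -0.02), (0.10, 0.02), (0.16, -0.05), (0.35, 0.02)]:
    print(wg, swi(wg, w_wilt, w_fc), limit_oi_increment(wg, raw, w_wilt, w_fc))
```

The second case in the loop is the one that matters for the comparison with the EKF: a dry soil with negative SWI still receives a positive increment.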
In the EKF this sensitivity to the SWI value is present directly in the Jacobian values of the observation operator (and thus in the gain values that depend on them). For a negative SWI value (or a SWI value above 1) the screen-level variables do not change for a small perturbation of the soil moisture, and hence the Jacobian and gain values are zero at these locations, independent of whether the increment is positive or negative. Also, when the SWI value is in its sensitivity region, there is no check included to make sure that the increments do not push the soil moisture content outside of this sensitivity range. Thus, as soon as $WG_2$ drops below the wilting point at a certain location, the EKF will not give any increments (not even positive ones) until the soil moisture rises above the wilting point again, while OI will only block the negative increments in such a case and allow positive ones. Therefore it will be easier for OI than for the EKF to recover from negative SWI values, and OI will allow for more positive $WG_2$ increments. This results in regions with a small or negative accumulated $WG_2$ increment for the EKF where OI has a larger positive increment. For $WG_2$ above but close to the wilting point, the link between the root zone soil moisture and the screen-level variables is the largest, resulting in high gain coefficients and increments in the regions neighbouring the ones with negative SWI values.
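This behaviour can be reproduced with a one-variable toy problem. In the sketch below, the observation operator is a hypothetical stand-in for the land surface model: the 2 m temperature responds linearly to $WG_2$ inside the sensitivity region and not at all outside it. The soil properties, the 10 K amplitude of the response and the perturbation size are all assumptions chosen for illustration; only the error values (0.1 (W_fc − W_wilt) and 1 K, taken here as standard deviations) come from the setup described above.

```python
W_WILT, W_FC = 0.15, 0.30  # hypothetical soil properties (m3 m-3)


def toy_t2m(wg2):
    """Hypothetical observation operator: T2m (deg C) as a function of WG2.

    The response saturates outside the SWI range [0, 1].
    """
    swi = (wg2 - W_WILT) / (W_FC - W_WILT)
    swi_eff = min(max(swi, 0.0), 1.0)
    return 25.0 - 10.0 * swi_eff


def jacobian(wg2, delta=1e-4):
    """Finite-difference Jacobian dT2m/dWG2 from a small positive perturbation."""
    return (toy_t2m(wg2 + delta) - toy_t2m(wg2)) / delta


def gain(wg2, sigma_b=0.1 * (W_FC - W_WILT), sigma_r=1.0):
    """Scalar Kalman gain K = B H / (H B H + R)."""
    h = jacobian(wg2)
    b, r = sigma_b ** 2, sigma_r ** 2
    return b * h / (h * b * h + r)


innovation = -1.0  # observed T2m is 1 K colder than the background
for wg2 in (0.10, 0.16, 0.35):
    print(wg2, jacobian(wg2), gain(wg2) * innovation)
```

For the dry point (0.10) and the saturated point (0.35) the Jacobian vanishes, so the increment is zero whatever the sign of the innovation, whereas just above the wilting point (0.16) the cold departure is converted into a positive soil moisture increment.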
The EKF can be changed to include a limitation for the increments to make sure they are not too big and do not push the SWI value outside of the sensitivity range (as in Mahfouf et al., 2009). This is more similar to what is done in OI, although there will still be no positive increments allowed in the EKF for negative SWI values. When the EKF is modified in this way, the spatial structure is already less irregular and more like that of OI (see Fig. 14).
In general, there is a good correspondence between the increments of OI and EKF, with the EKF increments showing a more fine-grained spatial structure. Also the forecast scores (RMSE and BIAS) for $T_{2m}$ and $RH_{2m}$ are similar for EKF and OI (Fig. 15).

Vertical scores
In Hungary, SURFEX has been tested using the ALARO physics and the FMR radiation scheme over a continental European domain with an 8 km grid, based on atmospheric analyses coming from the ECMWF/IFS global model, which also provides the 3 h lateral boundary coupling data. The surface analysis was taken from ARPEGE because the surface schemes differ between IFS and ARPEGE/ALADIN/ALARO. Simulations are performed for two periods, (i) summer (1 July to 15 August 2010) and (ii) winter (10 to 29 December 2010), with (S003) and without SURFEX (A003), with a forecast range of 48 h, starting at 00:00 UTC. Scores are averaged over the whole domain.
Figures 16 (winter) and 17 (summer) present the effect of using SURFEX on the rmse (model against analysis) along the vertical as a function of forecast range averaged over the whole domain. The introduction of SURFEX either shows improvement for or has a neutral impact on the vertical. However, during the winter, SURFEX slightly deteriorates the temperature rmse for the lowest model levels. A recent test (H.M. S. Kullmann, personal communication, 2011) using SURFEX together with the CANOPY scheme gives better results for the lowest model levels. The introduction of SURFEX is neutral on the vertical profile of the wind speed (see Figs. 18 and 19).

Conclusions
This study was motivated by the desire to evaluate the performance of SURFEX as a new land surface scheme for the ALADIN and ALARO models. The aim of the present study is not to fully reproduce the model behaviour while replacing the old ISBA scheme with the SURFEX-ISBA scheme, but rather to show, by exhibiting the new features developed in SURFEX, that forecast performance is equivalent or better in terms of the set of verification scores. The results over Belgium show that the introduction of SURFEX either shows improvement for or has a neutral impact on the 2 m temperature, 2 m relative humidity and 10 m wind. However, it seems that SURFEX has a tendency to produce too high a maximum temperature at high-elevation stations during winter daytime, which degrades the scores. In addition, surface radiative and energy fluxes improve compared to observations from the Cabauw tower. The results also show that promising improvements with a demonstrated positive impact are achieved by introducing TEB. The 2 m temperature and 2 m relative humidity improve compared to measurements in urban areas, and important urban characteristics such as increased heat storage and Bowen ratio and the urban heat island effect were successfully reproduced. It was found that the use of SURFEX has a neutral impact on the precipitation scores. However, the implementation of TEB within SURFEX for the high-resolution 4 km run tends to cause rainfall to be locally concentrated, and the total accumulated precipitation obviously decreases during the summer. One of the recent evolutions within SURFEX is the development of a more advanced surface data assimilation using the extended Kalman filter. The comparison for Belgium shows that the forecast scores are at least similar between the extended Kalman filter and the classical optimal interpolation scheme. However, the use of the EKF will address some fundamental limitations of the optimal interpolation coefficients (e.g. in the use of satellite remote sensing and ground-based observed precipitation). Finally, concerning the vertical scores, the introduction of SURFEX either shows improvement for or has a neutral impact on the vertical. However, it was found that during the winter, SURFEX causes slight deterioration in the temperature scores for the lowest model levels. Overall, it can be stated that forecast performance can be improved on average when using SURFEX in ALARO.

Fig. 1. Domains corresponding to the 7 km (border of the panel) and 4 km (dashed lines) operational applications.

Figure 2
Figure 2 presents the scores obtained for the Uccle station, which is situated some 6 km south of Brussels in a suburban area (50.80 °N, 04.35 °E), and Fig. 3 shows the improvement in bias and rmse obtained when using SURFEX. The 95 % confidence intervals for $|\mathrm{bias}_{\mathrm{OPER+SFX}}| - |\mathrm{bias}_{\mathrm{OPER}}|$ and $\mathrm{rmse}_{\mathrm{OPER+SFX}} - \mathrm{rmse}_{\mathrm{OPER}}$ were calculated with the bootstrap method explained above. Table 2 shows the average daytime/night-time scores for the flat (less than 100 m altitude), high-elevation and coastal synoptic stations (total

Fig. 3. The improvement in bias (left) and rmse (right) of the 2 m temperature obtained when using SURFEX for January (top) and July (bottom). The 95 % confidence intervals for $|\mathrm{bias}_{\mathrm{OPER+SFX}}| - |\mathrm{bias}_{\mathrm{OPER}}|$ and $\mathrm{rmse}_{\mathrm{OPER+SFX}} - \mathrm{rmse}_{\mathrm{OPER}}$ were calculated with the bootstrap method.

Fig. 4. Statistical scores (rmse: top; bias: bottom) of 2 m temperature against observations at the Ouarzazat station for a winter period (1-20 January 2010) for ALADIN without SURFEX (black solid line), ALADIN with SURFEX (red dashed line) and ALADIN with SURFEX and CANOPY (green dashed line).

Fig. 6. The improvement in bias (left) and rmse (right) of the 2 m relative humidity obtained when using SURFEX for January (top) and July (bottom). The 95 % confidence intervals for $|\mathrm{bias}_{\mathrm{OPER+SFX}}| - |\mathrm{bias}_{\mathrm{OPER}}|$ and $\mathrm{rmse}_{\mathrm{OPER+SFX}} - \mathrm{rmse}_{\mathrm{OPER}}$ were calculated with the bootstrap method.

Fig. 7. Statistical scores of 10 m wind speed and direction against observations at the suburban Uccle station (bias: thick lines; rmse: thin lines) for January (top) and July (bottom) for ALARO without SURFEX (OPER, solid lines) and with SURFEX (OPER + SFX, dashed lines) simulations.

Fig. 8. The improvement in bias (left) and rmse (right) of the 10 m wind speed obtained when using SURFEX for January (top) and July (bottom). The 95 % confidence intervals for $|\mathrm{bias}_{\mathrm{OPER+SFX}}| - |\mathrm{bias}_{\mathrm{OPER}}|$ and $\mathrm{rmse}_{\mathrm{OPER+SFX}} - \mathrm{rmse}_{\mathrm{OPER}}$ were calculated with the bootstrap method.

Fig. 10. The improvement in bias (left) and rmse (right) of the 2 m temperature obtained when using SURFEX for January (top) and July (bottom). The 95 % confidence intervals for $|\mathrm{bias}_{\mathrm{OPER+SFX}}| - |\mathrm{bias}_{\mathrm{OPER}}|$ and $\mathrm{rmse}_{\mathrm{OPER+SFX}} - \mathrm{rmse}_{\mathrm{OPER}}$ were calculated with the bootstrap method.

Fig. 11. Rmse of 2 m temperature (top) and 2 m relative humidity (bottom) against observations at the Istanbul city station averaged over July 2010 for ALARO with SURFEX (solid line) and ALARO with SURFEX and TEB (dashed line).

Fig. 12. Structure and amplitude precipitation scores of the 7 km ALARO run for January 2010. Top: scores against radar observations for the OPER run (left) and the run with SURFEX (right). Bottom: scores of the run with SURFEX against the operational run (OPER). Each point corresponds to 1 day. The cross indicates the weighted mean.

Fig. 13. Soil moisture increment (mm) accumulated over the month of July 2010. Top left: superficial soil moisture, OI analysis; top right: superficial soil moisture, EKF analysis; bottom left: deep soil moisture, OI analysis; bottom right: deep soil moisture, EKF analysis.

Fig. 14. Deep soil moisture increment accumulated over the month of July 2010, produced by the EKF analysis where SWI is kept between 0 and 1.

Fig. 15. Root-mean-square error and bias for relative humidity at Uccle averaged over the month of July 2010 for optimum interpolation (OI), extended Kalman filter (EKF), open loop and free run.

Fig. 16. Top: vertical profile of the temperature rmse difference between a run with (S003) and without SURFEX (A003) as a function of forecast range averaged over a winter period (10 December 2010-29 December 2010) and over the whole domain. Red shaded areas mean that the use of SURFEX improves the scores. Bottom: temperature rmse of the run with (red line) and without (black line) SURFEX at different pressure levels (250, 500, 700 and 850 mb).

Fig. 17. Top: vertical profile of the temperature rmse difference between a run with (S003) and without SURFEX (A003) as a function of forecast range averaged over a summer period (18 July 2010-15 August 2010) and over the whole domain. Red shaded areas mean that the use of SURFEX improves the scores. Bottom: temperature rmse of the run with (red line) and without (black line) SURFEX at different pressure levels (250, 500, 700 and 850 mb).

Fig. 18. Top: vertical profile of the wind speed rmse difference between a run with (S003) and without SURFEX (A003) as a function of forecast range averaged over a winter period (10 December 2010-29 December 2010) and over the whole domain. Red shaded areas mean that the use of SURFEX improves the scores. Bottom: wind speed rmse of the run with (red line) and without (black line) SURFEX at different pressure levels (250, 500, 700 and 850 mb).

Table 1. Summary of the different model configurations available within the model code.

Table 2. The average daytime/night-time scores for the flat, high-elevation and coastal synoptic stations. The + sign means improvement, 0 means neutral effect and the − sign means degradation of the scores.
Columns: Winter night, Winter day, Summer night, Summer day.

Table 5. The average daytime/night-time scores for the radiative balance and energy balance at the Cabauw tower station. The + sign means improvement, 0 means neutral effect and the − sign means degradation of the scores.
Columns: Winter night, Winter day, Summer night, Summer day.

Table 6. The average (for January and July 2010) S (structure), A (amplitude) and L (location) scores for the 4 km and 7 km runs with (OPER + SFX) and without SURFEX (OPER) against radar observations. A third column is added for each run corresponding to the SAL scores for the run with SURFEX with respect to the operational run.
Columns for each run: OPER against OBS, OPER + SFX against OBS, OPER + SFX against OPER.