Impacts of high resolution model downscaling in coastal regions

The issue of appropriate resolution of coastal models is addressed in this paper. The quality of coastal predictions from three different spatial resolutions of a coastal ocean model is assessed in the context of simulation of the freshwater front in Liverpool Bay. Model performance is examined during the study period February 2008 using a 3-D baroclinic hydrodynamic model. Some characteristic lengthscales and non-dimensional numbers are introduced to describe the coastal plume and freshwater front. Metrics based on these lengthscales and the governing physical processes are used to assess model performance and these metrics have been calculated for the suite of downscaled models and compared with observations. Increased model resolution was found to better capture the position and strength of the freshwater front. However, instabilities along the front such as the tidal excursion led to large temporal and spatial variability in its position in the highest resolution model. By examining the spatial structure of the baroclinic Rossby radius in each model we identify which lengthscales are being resolved at different resolutions. In this dynamic environment it is more valuable to represent the governing time and space scales, rather than relying on strict point by point tests when evaluating model skill.


Introduction
Increasing spatial and temporal resolution appears to be an obvious route to more accurate forecasts in operational coastal models. Physical processes such as coastal baroclinic Rossby waves may also need increased resolution (Garvine, 1995; Chant, 2011). However, there are penalties for increasing resolution. The cost of running a higher resolution model may increase by several orders of magnitude: if the resolution in 1-D is doubled, the number of grid points in 2-D increases by a factor of $2^2$, and there will usually be a related reduction in time step, which may lead to an order of magnitude increase in the model run time for the same period of real time. Another problem is the introduction of high-frequency variability which is not necessarily deterministic. Thus a flow may appear more realistic by generating eddies, but simple statistics such as root-mean-square (rms) error and correlation may deteriorate because the model variability is not exactly in phase with the observations (Hoffman et al., 1995).
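The cost argument above can be illustrated with a short calculation. This is a minimal sketch, assuming that the number of vertical levels is unchanged and that the time step shrinks in proportion to the grid spacing (a CFL-type limit); the function name `relative_cost` and both assumptions are ours, not part of the model suite.

```python
def relative_cost(refinement, horizontal_dims=2, scale_timestep=True):
    """Approximate growth in run time when the grid spacing is divided by `refinement`.

    Assumptions (illustrative only): cost is proportional to the number of
    horizontal grid points times the number of time steps, and the time step
    scales linearly with the grid spacing.
    """
    if refinement <= 0:
        raise ValueError("refinement must be positive")
    cost = refinement ** horizontal_dims  # more grid points in the horizontal
    if scale_timestep:
        cost *= refinement  # proportionally more time steps for the same real time
    return cost


# Doubling the resolution: 2^2 more points, and twice as many time steps.
print(relative_cost(2))
# Refining from a 1.8 km grid to a 180 m grid (a factor of 10).
print(relative_cost(10))
```

Under these assumptions each factor-of-ten refinement costs roughly three orders of magnitude in run time, which is why the choice of nest resolution matters in an operational setting.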
Traditional error metrics, such as least squares methods, are not necessarily the best choice to illustrate model accuracy, e.g. small errors in the location of a front are translated to large differences in least squares of intensities (Ziegeler et al., 2012). Spatial error metrics have been developed in a number of studies (Gilleland et al., 2009, 2010; Marzban et al., 2009), many of which are in the atmospheric modelling discipline. By examining model output in terms of the length and timescales of the dominant physical process, rather than naive statistical measures, we will address the question "Do coastal predictions improve with higher resolution modelling?".
In collaboration with the UK Met Office, the National Oceanography Centre runs a suite of nested models (http://cobs.noc.ac.uk/modl/). These Irish Sea Observatory (ISO) models provide predictions 36 h into the future, generating ocean forecasts of currents, waves, temperature and salinity on a variety of scales, ranging from 12 km to 1.8 km. The forecast area covers the Northwest European Shelf, with a focus on the Irish Sea, and information from the Met Office's FOAM model (a product of MyOcean, http://www.myocean.eu/) is used as an open boundary forcing for the outermost model in the system.
We have used the ISO nested modelling suite, together with a further model nest covering Liverpool Bay at 180 m resolution, to investigate the impact of dynamical downscaling. The MyOcean product is thus downscaled through the nested models, with the aim of adding value to the coastal forecasts generated by this coarser product. Boundary information from the MyOcean model will impact upon each level of model nest, and ultimately affect the results in the 180 m model. At local scales and in limited coastal domains such as Liverpool Bay, boundary conditions become particularly important since the simulated field may well be controlled by boundary information. The zone of boundary influence is dynamic and process-dependent and has received relatively limited research attention in oceanography (with some exceptions, e.g. Sanchez-Arcilla and Simpson, 2002). Liverpool Bay is a region of freshwater influence (ROFI) and will be controlled by the land-ocean boundary forcing, as well as the open-boundary ocean forcing. Thus we choose to assess the position and strength of the freshwater plume, and of the front between mixed and stratified water, as our metrics.
In Section 2 we present the physical situation in the study area, and in Section 3 we discuss the dominant length and time scales which need to be resolved. In Section 4 the modelling tools are described together with the error metrics used to assess model accuracy. The results of model downscaling are presented in Sections 5 and 6 and discussed in Section 7 and some final conclusions are drawn in Section 8.

Case study for Liverpool Bay
Our study area covers a corner of the eastern Irish Sea extending from roughly 2.5°W-4.5°W and 53°N-54°N. Under the classification of Simpson (1997), Liverpool Bay is a corner source ROFI (region of freshwater influence) with strong horizontal density gradients. The bay is also strongly tidally dominated, with a high tidal range (mean spring tidal range 8.22 m) and extensive intertidal areas (Polton et al., 2011; Howarth and Palmer, 2011). Freshwater enters Liverpool Bay from several rivers, including the Mersey, Dee, Ribble, Conwy and Clwyd, which collectively maintain a strong salinity gradient. As part of their review paper, Simpson and James (1986) identify the eastern Irish Sea as a region where river inflow dominates the stratification.
In the nearshore, river discharge creates a plume of freshwater, which flows out over saltier water while remaining close to the coast. The front is defined as the point at which the stratified coastal plume meets the tidally well-mixed shelf waters. In the nearshore there is strong vertical stratification and the front will be salinity controlled. The strength and extent of the stratification are governed by the stratifying influence of freshwater discharge injecting buoyancy into the system, and the de-stratifying effect of tidal and wind mixing. When the water column is well mixed, the sea surface temperature (SST) will match the bottom temperature. During periods of high discharge, or at slack water when tidal mixing is low, freshwater at the surface can spread further in the horizontal and the front may become detached from the bottom. The SST can be a tracer for the salinity stratification, which is useful as this can be detected in satellite images. Proper resolution of a front requires identifying the straining leading to filaments, as well as capturing the strong density gradients forming the front. Conversely, the models must also contain an accurate representation of the turbulent mixing which breaks down these gradients.
Previous modelling studies of the Liverpool Bay ROFI have found the salinity particularly difficult to represent. O'Neill et al. (2012) evaluate the performance of POLCOMS models at 12 km and 1.8 km resolution in Liverpool Bay. When compared against observed temperature and salinity from CTD profiles and ferrybox data, POLCOMS was found to over-estimate the salinity range. They also found that, at these resolutions, POLCOMS displayed high errors in the region of the Mersey plume. In our study the freshwater plume front has been modelled at a range of resolutions. Fig. 1 shows snapshots of the modelled salinity from 17th February 2008. The horizontal extent of the plume, and the width and position of the freshwater front are affected by model resolution.

Lengthscales and timescales
The behaviour of freshwater plumes and fronts has been reviewed by Simpson and James (1986), and Garvine (1995) classifies the behaviour of buoyant plumes based on principal lengthscales. Two dominant lengthscales must be considered when we examine the freshwater plume.
The first lengthscale to consider is the first baroclinic Rossby radius. This is the natural scale of baroclinic motion in the ocean, often associated with boundary currents, eddies and fronts (Gill, 1982). The Rossby radius ($R_n$) is the scale at which rotational effects become as important as buoyancy effects. The first mode, $n = 0$, is only applicable for a barotropic ocean, but the next modes are baroclinic ones. The first baroclinic mode, $n = 1$, is the most important one as regards mesoscale motions. Here we will use only the first baroclinic Rossby radius (i.e. for $n = 1$), which will be calculated as $L_R = NH/f$, where $f$ is the Coriolis parameter, $H$ the water depth, and $N$ the Brunt-Väisälä frequency.
The second important lengthscale is the extent of the freshwater plume in the horizontal. $L$ is defined as the extent of the freshwater plume along the coast. Garvine (1995) defines the extent of the plume away from the coast as $\gamma L$, where $1/\gamma$ represents the 'slenderness' of the plume. This plume width, $\gamma L$, is measured from both models and observations and listed in Table 3. These two lengthscales can be combined through the Kelvin number, defined as the ratio of $\gamma L$ and $L_R$: $K = \gamma L / L_R$. Garvine (1995) separates plumes into non-rotating (of small or zero Kelvin number) and rotationally controlled (Kelvin number of order 1). Bolanos et al. (2013) use these, and other nondimensional numbers, to characterise properties within the tidal channels of the Dee estuary. However, in these channels the lengthscale will be constrained by the channel geometry, while the freshwater plume will be free to extend more widely before coming under rotational control.
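For reference, the quantities named in this section can be collected in one place. The expressions for the first baroclinic Rossby radius and the Kelvin number follow the text; the form of the buoyancy frequency is the standard Boussinesq textbook definition and is an assumption here, as are the symbols $\rho_0$ (a constant reference density), $g$ (gravitational acceleration), $\Omega$ (the Earth's rotation rate) and $\varphi$ (latitude), none of which are defined in the text.

```latex
\begin{align}
  N^2 &= -\frac{g}{\rho_0}\,\frac{\partial \rho}{\partial z}
      && \text{(buoyancy frequency, $z$ positive upwards)} \\
  f   &= 2\,\Omega \sin\varphi
      && \text{(Coriolis parameter)} \\
  L_R &= \frac{N H}{f}
      && \text{(first baroclinic Rossby radius)} \\
  K   &= \frac{\gamma L}{L_R}
      && \text{(Kelvin number)}
\end{align}
```

Written this way, it is clear that for a fixed plume width $\gamma L$ a weaker stratification (smaller $N$) gives a smaller $L_R$ and hence a larger $K$.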
Though the freshwater front itself is a persistent feature, its mean position displays considerable spatial and temporal variability. This variability includes a regular tidal excursion studied by Hopkins and Polton (2012) in which the front can move as much as 10 km over a flood-ebb cycle. The freshwater entering Liverpool Bay also has an 'age' defined as a time-scale for freshwater entering from rivers to be flushed out of the system. The bay has a flushing time of approximately 136 days and mean residence time of approximately 103 days (Phelps et al., 2013). Water exchange in the region is impacted seasonally by both thermal stratification and wind action, for example Dabrowski et al. (2010) found considerable seasonal variability in the residence time of the Irish Sea, which was found to range from 386 days in the summer to 444 days in the winter.

Modelling
The well-established 3D baroclinic circulation model POLCOMS (Holt and James, 2001) has been used. The model is discretised in the horizontal on a structured, regular, finite difference grid, and uses a series of nests to downscale from the outer to inner domains. Regularly spaced sigma co-ordinates are used in the vertical, and the number of layers used is specified in Table 1. POLCOMS uses an Arakawa B-grid and a sophisticated advection scheme, the Piecewise Parabolic Method (James, 1996). This has feature-preserving properties, making it ideal for the simulation of fronts (Proctor and James, 1996). POLCOMS is also coupled to the General Ocean Turbulence Model (GOTM) (Holt and Umlauf, 2008), and POLCOMS-GOTM will be used for the highest resolution simulations presented here.
The models used cover the Atlantic Margin (AMM), Irish Sea (IRS) and Liverpool Bay (LB), and their resolution and extents are summarised in Table 1. The Met Office Northwest European Continental Shelf (approximately 12 km resolution) mesoscale model was used to provide atmospheric forcing. Table 2 summarises some of the differences between the model configurations. For example, the LB models include flooding and drying, while the coarser models specify a minimum depth. In the AMM model the bottom friction is controlled by a constant drag coefficient, $C_d$. In all other models a quadratic drag momentum boundary condition is used at the bed (implemented in Holt and James, 2001). Following Ruddick et al. (1995), the bottom drag coefficient, $C_d$, is given by $C_d = (\kappa / \ln(z/z_0))^2$, where $\kappa$ is the von Karman constant, $z_0 = 0.005$ m is the bed roughness length and $z$ is the height above the bed. If $C_d$ is less than 0.005, it is assumed to be 0.005. The turbulence closure used is consistent between models and is described in Holt and Umlauf (2008).
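The drag law described above is simple enough to write out directly. The following is a minimal sketch rather than the POLCOMS implementation: the function name `bottom_drag_coefficient` is hypothetical, and the value of the von Karman constant (0.41) is an assumption, since the text does not state the value used.

```python
import math

KAPPA = 0.41  # von Karman constant; commonly quoted value, assumed here


def bottom_drag_coefficient(z, z0=0.005, cd_min=0.005):
    """Log-layer drag coefficient Cd = (kappa / ln(z / z0))**2 with a lower bound.

    z      : height above the bed (m)
    z0     : bed roughness length (m), 0.005 m in the text
    cd_min : floor applied to Cd, 0.005 in the text
    """
    if z <= z0:
        raise ValueError("z must exceed the roughness length z0")
    cd = (KAPPA / math.log(z / z0)) ** 2
    return max(cd, cd_min)


# Close to the bed the logarithmic expression exceeds the floor ...
print(bottom_drag_coefficient(1.0))
# ... while further from the bed the floor of 0.005 takes over.
print(bottom_drag_coefficient(5.0))
```

With these values the floor becomes active once the lowest model level sits more than a metre or two above the bed, so the effective drag depends on the vertical grid as well as on the roughness length.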
Fields of temperature and salinity require a long period of spin-up time. For the AMM and IRS runs, several years of data were already available from the Irish Sea Observatory operational model suite. In order to initialise the Liverpool Bay model from a similarly spun-up state, outputs from the IRS model were interpolated onto the finer LB grid for initialisation. The LB models were then run for a further two months of spin-up (December 2007-January 2008) before the study period of February 2008, which has been examined in depth. As a caveat, the results considered in this work cover only a short time. However, February is a suitable month in which to perform this study, as there is minimal surface heating and no seasonal thermal stratification (Hopkins and Polton, 2012).

Analysis
To determine what degree of downscaling gives an improvement in our study, we must define a set of event based metrics, specific to what we are trying to model. We focus on: (i) the position of the front in an East-West transect; (ii) the magnitude of temperature difference across the front and (iii) the spatial 'sharpness' of the front, $d^2T/dx^2$. When analysing the vertical structure, we also consider the absolute density difference (from observations), strength of stratification, baroclinic Rossby radius and the shape of the interface in a latitude-depth plane. O'Neill et al. (2012) performed a strict test of the models with no account taken of a slight difference in phase or location of features such as the Mersey river plume. They state that it is necessary to make use of quantitative skill score metrics to objectively and systematically compare the performance of different models. Some standard statistics of model performance are calculated: a mean, standard deviation, root mean square error (rmse) and coefficient of correlation ($r^2$) for the surface density ($\rho$); but these simple statistics were found to be unhelpful in understanding the models' abilities to represent governing physical processes.
We argue that it is instead more useful to evaluate the models in terms of some physical controls: here we use the vertical density difference and buoyancy frequency ($N^2$) as well as the nondimensional Kelvin number as suggested by Garvine (1995). Some characteristic lengthscales of the plume (width, sharpness, cross front temperature difference and Rossby radius) are also calculated for a single day (17th February 2008). The 17th is a period of particular interest, as it is at this time that the water column is seen to stratify close to the coast while remaining well mixed further offshore. For context, longer time series are considered covering the whole month in Section 5.
The cross-front temperature function takes the form of a cumulative normal distribution. The first derivative of this function is a Gaussian function, centred about the middle of the front ($x_0$). The longitude of $x_0$ is defined by Eq. (5). The width of this Gaussian is the thickness of the front ($x_W$). As the cross-front temperature profile is likely to be noisy, this approach allows us to consider an idealised curve, while maintaining the characteristic lengthscales.
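One way to extract a front centre and thickness consistent with this description is to treat the temperature gradient along the transect as a distribution and take its mean and standard deviation. The sketch below is an assumed estimator for illustration only and is not necessarily the definition used in this paper; the function names and the synthetic transect values are hypothetical.

```python
import math


def front_position_and_width(x, temperature):
    """Estimate front centre x0 and thickness xW from a cross-front transect.

    The temperature increments between neighbouring points are used as
    weights, so x0 is the gradient-weighted mean position and xW the
    gradient-weighted standard deviation. For an ideal cumulative normal
    profile these recover the centre and width of its Gaussian derivative.
    Noisy profiles can give negative weights, so smoothing may be needed first.
    """
    if len(x) != len(temperature) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    total = temperature[-1] - temperature[0]
    if total == 0:
        raise ValueError("no net temperature change across the transect")
    mids = [0.5 * (x[i] + x[i + 1]) for i in range(len(x) - 1)]
    steps = [temperature[i + 1] - temperature[i] for i in range(len(x) - 1)]
    x0 = sum(m * s for m, s in zip(mids, steps)) / total
    variance = sum(s * (m - x0) ** 2 for m, s in zip(mids, steps)) / total
    return x0, math.sqrt(max(variance, 0.0))


def cumulative_normal(x, x0, width, t_cold, t_warm):
    """Idealised cross-front temperature profile (cumulative normal)."""
    return t_cold + (t_warm - t_cold) * 0.5 * (1.0 + math.erf((x - x0) / (width * math.sqrt(2.0))))


# Synthetic transect: front centred at 3.3 degrees W with a width of 0.05 degrees.
lon = [-4.0 + 0.005 * i for i in range(281)]
temp = [cumulative_normal(v, -3.3, 0.05, 6.0, 8.0) for v in lon]
x0, xw = front_position_and_width(lon, temp)
print(round(x0, 3), round(xw, 3))
```

Because the estimate integrates over the whole transect, it is less sensitive to the choice of a single contour value than a threshold-based frontal position.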
To evaluate the models, they were compared with both satellite and in situ observations. Though water density in Liverpool Bay is salinity dominated, especially in winter months, SST can be used as a tracer for the position of the front (Hopkins and Polton, 2012). In order to validate the spatial position of the plume, modelled SST was compared with satellite observations to assess the performance of our downscaled models. The observations were provided by the Natural Environment Research Council (NERC) Earth Observation Data Acquisition and Analysis Service (NEODAAS). The NEODAAS observations have a spatial resolution of O(1 km); each weekly composite comprises all the cloud-free portions over seven days.
The vertical structure of the plume must also be assessed, and for this purpose observations from a grid of CTD stations across Liverpool Bay were used. We concentrate on two locations: site A (located at 53.5340N, 3.357833W in 22.2 m water depth), and site B (located at 53.449667N, 3.643833W in 24.7 m of water). At both sites temperature and salinity measurements were available daily throughout February 2008 from three depths: a SeaBird SBE 16plus CTD sensor at 0.5 m above the bed, and two SeaBird MicroCATs at 5 m and 10 m below the surface.
Fig. 3 shows the location of the 7 °C contour at three model resolutions (coloured lines), compared with satellite observations (in black). In the coarsest AMM model (blue) the front is too far offshore throughout Liverpool Bay, deviating by up to 11 km in places, and is represented as a smooth interface. When model resolution is increased, in the IRS model, the front is shifted too far towards the coast (with a difference of up to 20 km in places). Increasing the resolution further to the 180 m LB_12 model also moves the front towards land, while introducing more variability in its position. This is because the downscaling also changes the shape of the modelled front, with more small scale variability observed; the difference in position between modelled and measured front is as much as 23 km in the LB_12 case, reducing to 13 km in the LB_34 model. Particularly noticeable is the indentation of the front at the Wirral Peninsula, which is not captured in the AMM model. Contrasting the LB_12 and LB_34 models demonstrates the impact of vertical resolution. In this case the temperature front is brought more in line with observations when the number of vertical levels is increased.
A caveat to these results is that we are looking at temperature. This is only a diagnostic variable, as the salinity is controlling the buoyancy, and thus the position of the front. The spatial patterns are also sensitive to which contour is selected. Nonetheless the satellite SST is the only synoptic observation available and so proves useful as an overview. In Section 6 results for density are used as a more robust measure of the frontal position.

Assessing model skill
Next, a transect is taken through the modelled and observed temperatures at 53.5°N (presented in Fig. 4). Again the suite of models and NEODAAS observations are compared. The AMM model is seen to consistently underpredict temperatures, while the IRS and LB models are close to observations. The two LB models show very similar results, and are the most successful at simulating both the absolute temperature and cross front temperature difference. The IRS model has too small a cross-front temperature difference associated with a too-narrow front. The position of the front was measured using two methods: first the position of the $T = 7$ °C contour at 53.5°N, and secondly with the more complex $x_0$ method defined in Eq. (5). Similar longitudes were found using these two methods, with the AMM better representing the observations in both cases. In the higher resolution models the position of the front is closer to the shore, which can also be seen in Fig. 3.
Table 3 shows that the freshwater plume is too large in the AMM model, and consequently the front is too far west. When model resolution is increased, the plume width is seen to decrease, and the front moves closer to land. The front widths can be examined in terms of model grid resolution: in the AMM model the front is represented by 1-2 grid boxes, of order 10 grid boxes in the IRS model, and order 100 in the highest resolution LB models. This suggests that the front is under-resolved in the AMM model, while being captured in IRS and well resolved in the LB models. The NEODAAS observations have a very wide front when measured using this method. This may be due to the use of a composite image covering a full week, while the modelled fronts are representative of a single day. The mean temperature of the sections tells us that the AMM is too cold overall. The IRS model has a mean temperature for this section closer to the observations, but is not so strongly stratified as the AMM. Both LB models have a slightly cold mean value, but do well in terms of capturing the cross-front temperature difference.
The modelled Kelvin number was found to be much greater than 1 for all model resolutions, showing the freshwater plume to be under rotational control. With a highly variable Rossby radius ($L_R$) this is a difficult value to calculate, so a broad average $L_R$ was used for each calculation.
Figs. 5 and 6 show time series of density at sites A and B respectively. In Liverpool Bay salinity dominates the density. Both sites show the AMM and IRS models to have too low a density (by 5 and 2 kg m$^{-3}$ respectively), suggesting that the modelled river plume is extending too far from the shore. The LB models are closer to the observations, but at both sites around 1 kg m$^{-3}$ too dense. When the vertical resolution of LB is increased to match that of IRS, there was little effect on the modelled density at sites A and B. This shows that the number of vertical levels is not affecting the extent to which the water column stratifies at these sites.
From the observations during this period, we can see that the water column at site A is subject to periodic stratification (between days 42 and 52), while site B remains well-mixed throughout. Site A is closer to the coast, and thus more likely to be affected by the freshwater discharge. However, Fig. 7 shows that the stratification is dependent on the state of the tide, rather than the rate of freshwater input. Stratification occurs at site A during neap tides, when the surface and bottom series are seen to diverge in the observations. All models are seen to stratify at site A during the neap tide, though the IRS model stratifies more strongly. At site B stratification is not observed, and the water column remains well mixed throughout February. The LB models represent this correctly, while the coarser AMM and IRS models are seen to over-stratify, again showing that the freshwater plume is extending too far offshore in these models.
In Tables 4 and 5 some quantitative model skill metrics (rmse and $r^2$) are presented, together with the mean and standard deviations for modelled and observed density at sites A and B for the whole month of February 2008. These are standard objective measures used to assess model accuracy, but tell us little about the models' ability to represent the freshwater front. O'Neill et al. (2012) perform model skill assessments across the full Irish Sea Observatory CTD grid (34 sites in the region 4°W-3°W and 53.2°N-53.5°N). This grid includes site A, where they find $r^2$ values in the range 0.0-0.1 for the POLCOMS and NEMO surface salinity. They also state that the models do have some predictive skill, particularly in the open sea areas, which fits with our finding of an increased $r^2$ value for all models at site B.
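The point statistics named above are straightforward to compute, and a synthetic example shows why they can be misleading for a feature that is displaced in time or space. The sketch below is illustrative only: the function name `skill_metrics` and the sinusoidal test series are hypothetical and are not the data used in Tables 4 and 5.

```python
import math
import statistics


def skill_metrics(model, obs):
    """Mean, standard deviation, rmse and r^2 for paired model and observed series."""
    if len(model) != len(obs) or len(obs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(obs)
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    m_mean = statistics.fmean(model)
    o_mean = statistics.fmean(obs)
    cov = sum((m - m_mean) * (o - o_mean) for m, o in zip(model, obs))
    ss_m = sum((m - m_mean) ** 2 for m in model)
    ss_o = sum((o - o_mean) ** 2 for o in obs)
    # r^2 is undefined when either series has no variance.
    r2 = cov ** 2 / (ss_m * ss_o) if ss_m > 0 and ss_o > 0 else math.nan
    return {
        "mean_model": m_mean,
        "mean_obs": o_mean,
        "std_model": statistics.pstdev(model),
        "std_obs": statistics.pstdev(obs),
        "rmse": rmse,
        "r2": r2,
    }


# A 'model' with the correct mean and variability but a quarter-period phase lag.
obs = [math.sin(2 * math.pi * i / 24) for i in range(48)]
lagged = [math.sin(2 * math.pi * (i - 6) / 24) for i in range(48)]
print(skill_metrics(lagged, obs))
```

In this example the lagged series reproduces the mean and standard deviation of the observations, yet its correlation is close to zero and its rmse is as large as the signal amplitude, which is the behaviour that motivates the process-based metrics used here.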
The bottom-top density difference and buoyancy frequencies are also calculated at each site, showing the AMM and IRS to be overstratified at both sites. At both sites errors in modelled density difference decrease with increased resolution. At the shallower, nearshore site A, the errors are larger (error in mean density at site A is more than double that at site B). In all simulations the strength of the stratification is seen to decrease with increased resolution, as the modelled densities are brought closer to the observations. This reduction in turn reduces both $N^2$ and the baroclinic Rossby radius. In areas where there is no stratification, a baroclinic Rossby radius does not make sense, but the vanishing values for $N^2$, when model resolution increases, suggest that we are approaching a well-mixed water column.
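The link between the bottom-top density difference, $N^2$ and the Rossby radius can be made concrete with a two-level estimate. This is a minimal sketch under stated assumptions: a bulk finite-difference form of the standard buoyancy frequency, a reference density of 1025 kg m$^{-3}$, and hypothetical function names and input values. It is not the calculation used to produce the tabulated values.

```python
import math

G = 9.81        # gravitational acceleration (m s^-2)
RHO_0 = 1025.0  # reference density (kg m^-3); assumed value
OMEGA = 7.2921e-5  # Earth's rotation rate (rad s^-1)


def coriolis_parameter(latitude_deg):
    """f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(latitude_deg))


def buoyancy_frequency_squared(rho_top, rho_bottom, separation):
    """Bulk N^2 from two densities a vertical distance `separation` (m) apart.

    Positive when denser water lies below lighter water (stable stratification).
    """
    if separation <= 0:
        raise ValueError("separation must be positive")
    return (G / RHO_0) * (rho_bottom - rho_top) / separation


def rossby_radius(rho_top, rho_bottom, separation, depth, latitude_deg):
    """L_R = N H / f, or NaN where the column is well mixed or unstable."""
    n2 = buoyancy_frequency_squared(rho_top, rho_bottom, separation)
    if n2 <= 0:
        return math.nan  # no stable stratification, so L_R is not meaningful
    return math.sqrt(n2) * depth / coriolis_parameter(latitude_deg)


# Illustrative values only: 0.5 kg m^-3 difference over 16 m in 22.2 m of water.
print(rossby_radius(1024.0, 1024.5, 16.0, 22.2, 53.5))
# A well-mixed column returns NaN rather than a spurious lengthscale.
print(rossby_radius(1024.5, 1024.5, 16.0, 22.2, 53.5))
```

Returning NaN for a well-mixed or unstable column reflects the remark above that a baroclinic Rossby radius does not make sense where there is no stratification.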

Frontal structure
Next, the shape and vertical structure of the freshwater front is examined. Fig. 8 shows slices of modelled density through 53.5°N from 17th February 2008. A buoyant freshwater plume flows out at the surface, in agreement with observations by e.g. Hopkins and Polton (2012). The freshwater plume is only captured by one or two grid boxes in the AMM model, while both the IRS and LB_12 models better resolve the shape of the plume and front. The overall density of the AMM and IRS section is less than in the LB models, suggesting that the incoming freshwater is having an impact away from just the plume itself.
In both the LB models the freshwater front is located close to 3.2°W and little difference in vertical structure is seen between Fig. 8c and d. In Fig. 9 the coarser 12-level model has a more extensive plume, while the 34-level model appears more mixed in the vertical.
[Table residue, column headings not recovered. Unlabelled row: 3.4 × 10^-4, 2.0 × 10^-4, -3.3 × 10^-5, 5.1 × 10^-7, 2.1 × 10^-3. L_R (m): 2334, 2313, 145, 146, 1001. Unlabelled row: 8.1 × 10^-4, 1.3 × 10^-5, -1.6 × 10^-6, 3.5 × 10^-7, final value truncated.]
Maps of internal/baroclinic Rossby radius from three model resolutions are plotted in Fig. 10. The maximum values are found to agree well with the location of the modelled freshwater front in each case. As well as highlighting the front, the simulated internal Rossby radius gives an indication of the lengthscale of baroclinic instability present in the models. In the AMM model, the values are large (3-7 km), and unresolved by the 12 km grid. In the IRS model the values are of the order of 2 km. The IRS model has a resolution of 1.8 km, very close to the modelled internal Rossby radius, suggesting that the IRS is able to 'permit' the motion of eddies and internal tides at the first baroclinic Rossby radius.
The highest resolution 180 m LB model has sufficient resolution to capture the structure of the instability. In Fig. 10c much more structure emerges. The river channels of the Ribble, Dee, and Mersey stand out, due to the faster tidal flows of O(1 m s$^{-1}$) which are constrained by the narrow channels, rather than the background rotation rate. Bolanos et al. (2013) found Kelvin numbers less than 1 in both the Hilbre and Welsh channels of the Dee estuary, showing them to be controlled by the channel width. Away from the estuaries the distribution of monthly mean baroclinic Rossby radius is similar to that found in the IRS model, although the values are considerably lower: around 400 m for LB compared with 2000 m for IRS. However, Fig. 10(d) shows a daily mean in contrast with the monthly mean plotted in Fig. 10(c). Daily maps of $L_R$ in the LB models show large variability in baroclinic Rossby radius between 0 and 2 km. There is also a lot more spatial structure revealed, with filaments and patches of large $L_R$. Daily and monthly means for coarser models look very similar to one another, with none of the patchiness displayed by the LB models, suggesting there is much less temporal variability in the IRS and AMM models.
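The comparison between grid spacing and modelled Rossby radius made in this section can be summarised by counting grid cells per lengthscale. The sketch below is illustrative: the function names are hypothetical, and the thresholds separating 'unresolved', 'permitted' and 'resolved' (one and two cells per lengthscale) are assumptions chosen only to mirror the qualitative labels used in the text, not criteria taken from it.

```python
def cells_per_lengthscale(lengthscale_m, grid_spacing_m):
    """Number of grid cells spanning a physical lengthscale."""
    if grid_spacing_m <= 0:
        raise ValueError("grid spacing must be positive")
    return lengthscale_m / grid_spacing_m


def classify_resolution(lengthscale_m, grid_spacing_m, permit_at=1.0, resolve_at=2.0):
    """Label a lengthscale as 'unresolved', 'permitted' or 'resolved' on a grid.

    permit_at and resolve_at are illustrative thresholds (cells per lengthscale).
    """
    ratio = cells_per_lengthscale(lengthscale_m, grid_spacing_m)
    if ratio < permit_at:
        return "unresolved"
    if ratio < resolve_at:
        return "permitted"
    return "resolved"


# Approximate Rossby radii and grid spacings quoted in the text (metres).
for name, l_r, dx in [("AMM", 5000.0, 12000.0), ("IRS", 2000.0, 1800.0), ("LB", 400.0, 180.0)]:
    print(name, round(cells_per_lengthscale(l_r, dx), 2), classify_resolution(l_r, dx))
```

A stricter criterion for 'resolved' (for example several cells per Rossby radius) would change the labels, so the thresholds should be treated as a tunable choice rather than a fixed rule.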

Discussion
Simulating the same physical processes at a range of resolutions reveals the lengthscales which each model can and cannot resolve. For example, the frontal width is represented by 1-2 grid cells in AMM, of order 10 in the IRS model, and order 100 in the highest resolution LB models. Even though the front is not well resolved in the AMM, a sharp front is preserved. The sharp discontinuity may be explained by the numerical scheme used: POLCOMS' piecewise parabolic advection scheme was chosen because it helps to preserve horizontal features such as fronts (Holt and James, 2001). O'Neill et al. (2012) also found the front to be too strong in POLCOMS AMM, while the NEMO model, which uses a total variation diminishing (TVD) scheme, was too diffusive when applied in the same area.
The IRS, and to a lesser extent the AMM, model were seen to overstratify, while the LB models seemed to better capture the vertical structure of the plume. Increasing the number of vertical levels in the LB model from 12 to 34 did not increase the modelled stratification. There are two possible candidates for the difference in model stratification: freshwater inputs and the strength of vertical mixing. The LB runs use annually observed river data, while the IRS and AMM models use a 55 year climatology. The climatology for February has a mean river discharge of 418 m$^3$, which is around double that seen for the gauged outflow in February 2008. This additional freshwater could be driving the stronger stratification. Souza and Lane (2013) used the 180 m LB model to simulate sediment dynamics in Liverpool Bay. They found that the inclusion of daily-mean river flows in the model introduces baroclinic processes, which drive near bed sediments onshore, towards the shipping channels and the estuaries. It is likely that the position of the freshwater plume is also sensitive to the use of realistic river flow versus climatological river flow. In future it would be desirable to run the LB model with climatological rivers, to unpick the differences caused by the freshwater input and model resolution.
The calculated Kelvin numbers for the observed and modelled plumes in Liverpool Bay (Table 3) are all large (i.e. rotationally controlled). Garvine (1995) states that $K$ is a measure of the physical size of the plume in relation to its degree of stratification. So the weaker stratification in the LB models is generating the large Kelvin numbers. There is evidence of some stratification in all models, though it is most apparent in the IRS model (Fig. 8). The timing of onset of this stratification appears to be delayed in all models, when compared with observations. The onset of this stratification may have important impacts on the biology and the phytoplankton bloom, so the timings are another key feature to resolve and will be followed up in future work.
The structure of the fronts becomes more complex with increased resolution, because individual frontal fluctuations can be resolved. By resolving this scale, we introduce a lot more uncertainty in the instantaneous position of the front. When downscaling, is there a point beyond which there is no further advantage in increased resolution? At these high resolutions, we are resolving extra features in our models, rather than relying on well tuned parametrisations. Does adding this extra level of complexity to the modelling improve the skill? Instabilities along the front may be responsible for mixing across it. While these features will be missing in the coarser models, at high resolution this exchange dominates the horizontal dispersion in Liverpool Bay. Broad features such as mean surface density and plume width were seen to converge when model resolution was increased. However, the time and lengthscales of instability along the front were seen to reduce as the model resolution increased. The plot in Fig. 10(d) is still only a daily mean value. We would expect that calculating this value hourly would show more variability, e.g. the tidal excursions reported by Hopkins and Polton (2012).
The different methods of assessing model skill can be broken down into variables, statistics, and metrics. As an example, a variable could be the temperature bias, a statistic the rmse, and a metric the frontal width. Though statistics are a subset of metrics, in this case we are separating the traditional methods naively applied from a metric specifically designed for the process of interest. Comparing modelled and measured variables equates to evaluating absolute measures, such as the mean temperature. From this point of view, the mean temperature is seen to improve with increased resolution, as is the modelled density. When point statistical measures such as $r^2$, rmse, and STD are considered (Tables 4 and 5) all models seem to be behaving poorly, especially at the nearshore site (site A). However, using the front and plume metrics listed in Table 3 gives us a more integrated measure of the models' performance. The location of the front is good in all models, and front widths are also relatively consistent. By using a metric that can bypass the 'noise' of small scale processes while highlighting physically controlled lengthscales, we have designed a method to test the usefulness of our models to represent a process, in this case the behaviour of the large freshwater plume which dominates Liverpool Bay. In this way we can gain insights from including the small, physically realistic variations along the front which only act to generate noise in statistical measures.
Overall, it appears that the 1.8 km IRS model is sufficient to represent the mean position and strength of the front, as well as the extent of the freshwater plume. This is because the model resolution is of the order of one tenth of the lengthscales of the freshwater plume, giving sufficient detail to capture the plume and front well. When another characteristic lengthscale, the first baroclinic Rossby radius, is considered, the IRS model is no longer suitable. The Rossby radius governs the shape of the front, but not its overall position and lengthscale. The extra detail captured in the LB runs thereby introduces uncertainty in the frontal position in the LB models, which is not captured in the coarser simulations. By introducing extra resolution in the LB models we better capture this smaller controlling lengthscale, but at the cost of uncertainty in the front and plume lengthscales. The user of these model products must know what they are interested in, rather than assuming added resolution will automatically produce a superior result.
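The relation between grid spacing and a controlling lengthscale can be made concrete with a rough Python sketch. It assumes a two-layer approximation for the first baroclinic Rossby radius, R = sqrt(g' h1 h2 / (h1 + h2)) / |f|, which is a standard textbook form and not necessarily the definition used for the model diagnostics in this paper. The density contrast, layer thicknesses, and latitude (approximately that of Liverpool Bay) are hypothetical illustrative values, as are all function names.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate (rad/s)
G = 9.81           # gravitational acceleration (m/s^2)

def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))

def rossby_radius_two_layer(delta_rho, h_upper, h_lower, lat_deg, rho0=1025.0):
    """First baroclinic Rossby radius (m) under a two-layer assumption."""
    g_reduced = G * delta_rho / rho0                  # reduced gravity g'
    h_equiv = h_upper * h_lower / (h_upper + h_lower)  # equivalent depth
    return math.sqrt(g_reduced * h_equiv) / abs(coriolis(lat_deg))

def cells_per_lengthscale(lengthscale_m, dx_m):
    """Number of grid cells spanning a given lengthscale."""
    return lengthscale_m / dx_m

# Hypothetical stratification: 1 kg/m^3 contrast, 10 m over 20 m, at 53.5 N.
radius = rossby_radius_two_layer(delta_rho=1.0, h_upper=10.0,
                                 h_lower=20.0, lat_deg=53.5)

# Grid spacings of the two finer models discussed in the text.
ratios = {dx: cells_per_lengthscale(radius, dx) for dx in (1800.0, 180.0)}

print(f"Rossby radius: {radius / 1000.0:.2f} km")
for dx, n in ratios.items():
    print(f"dx = {dx:6.0f} m -> {n:5.1f} grid cells per Rossby radius")
```

With these assumed values the radius is of the order of 2 km, so a 1.8 km grid spans it with roughly one cell while a 180 m grid spans it with roughly ten, which is in keeping with the distinction drawn above between the IRS and LB models.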

Conclusion
On downscaling coarse resolution models to higher resolution, one might expect features to be 'focussed' more sharply, but otherwise little altered. This is not necessarily the case: with higher resolution comes improved physics, and thus altered behaviour.
In terms of model validation, observations at fixed sites A and B were found to be useful when examining vertical structure. However, point sampling is insufficient when examining the spatial structure of fronts. At higher resolution the models are able to resolve baroclinic instabilities.
When sensitivity to vertical resolution was examined in the LB model, the number of levels had little impact on the density at sites A and B. However, when slices of density were considered, the 34-level model appeared more well mixed in the vertical than the 12-level model.
Using controlling lengthscales to validate our models, rather than simple point-wise statistics, gives a better understanding of what physics each model nest is able to resolve. The model may appear to be behaving very poorly when a deterministic measure is used. A synoptic view given by frontal maps or slices is used to better represent the spatial structure of the freshwater plume, and to help understand the discrepancies between model and observations. The baroclinic Rossby radius was found to be resolved only by the highest resolution model, though the intermediate IRS model is of a scale to 'permit' instabilities at this lengthscale. Maps of Rossby radius from the 180 m model revealed increased temporal and spatial variability, which was not observed at coarser resolution.
Open questions remain which merit further work, namely: what is the impact of vertical resolution on how well the freshwater front is resolved, and what is the impact of including real river fluxes as opposed to a climatological mean?

Acknowledgements
NERC iCOASST Project NE/J005541/1. The authors would particularly like to thank the Boundary Layer And Sediment Transport (BLAST) group for their input and advice. We would also like to thank Jason Holt for his helpful conversations regarding the structure of the paper.