Automated continuous verification and validation for numerical simulation



Introduction
Since the development of the computer, numerical simulation has become an integral part of many scientific and technical enterprises. As computational hardware becomes ever cheaper, numerical simulation becomes more attractive relative to experimentation. Much attention is paid to the development of ever more efficient and powerful algorithms to solve previously intractable problems (Trefethen, 2008). However, in order to be useful, the user of a computational model must have confidence that the results of the numerical simulation are an accurate proxy for reality. A rigorous software quality assurance system is usually a requirement for any deployment to industry, and should be a requirement for any scientific use of a model. Catastrophic accidents such as the Sleipner platform accident, in which an offshore platform collapsed due to failures in finite element modelling, underscore the importance of such efforts.
Often, undergoing verification and validation is seen as a discrete event, happening after the computational model is completed. Yet many computational models that are used for practical applications are also under active software development; it does not make sense to assert that the code has been verified when the code has changed since the verification was performed. In principle, whenever the code is changed, verification must be applied to the new revision, for any previous results are irrelevant. Furthermore, the static view ignores the reality that the mathematical model itself may change in the course of development, as new subgrid-scale parameterisations or more appropriate equation sets are researched and implemented. In turn, this renders previous validation efforts incomplete or irrelevant. We therefore advocate the view that verification and validation are continuous processes, happening alongside both software development and usage. This is the only way to ensure that the entire modelling effort stays correct as it is developed.
This poses a difficulty. These processes are generally seen to be time-consuming, uninteresting work. Therefore, they should be automated in order to minimise the amount of manual intervention required in the scrutiny of the newly-changed code. In Knupp et al. (2007), it is implicitly argued that code verification takes too much effort to perform on the very latest version, and is thus relegated to release candidates; however, by automating the process, we have achieved great success in applying code verification to a codebase that changes daily. This philosophy is widespread in the software engineering community (Adrion et al., 1982), but is not yet the generally accepted practice among the scientific modelling community. We discuss a framework for automated continuous verification and validation with particular suitability to numerical models. The framework is of wide applicability to any numerical model, although emphasis is placed on modelling in a geoscientific context.

Fluidity-ICOM
Although the framework presented here is applicable to any numerical model, it is useful to present particular examples. For this we will use the example of Fluidity-ICOM, the primary software package to which this particular framework has been applied. Fluidity is a finite element, adaptive mesh fluid dynamics simulation package. It is capable of solving the full Navier-Stokes equations in compressible or incompressible form, or the Boussinesq equations. It is equipped with numerous parameterisations for subgrid-scale processes and has embedded models for phenomena as varied as ocean biology, traffic pollution and porous media. Fluidity supports a number of different finite element discretisations and is applied to flow problems in a number of scientific and engineering domains including, in particular, ocean flows. In this last context it is known as Fluidity, the Imperial College Ocean Model (Fluidity-ICOM), and this is the name we will use here. For particular information on Fluidity-ICOM as an ocean model, the reader is referred to Piggott et al. (2008). While verification and validation are essential for all scientific computer software, the complexity and wide applicability of Fluidity-ICOM make this a particularly challenging and critical requirement. The model is developed by several dozen scientists at a number of different institutions and there are approximately 15 commits (changes to the model) on an average work day. In the absence of constant verification and validation, development would be impossible.

Verification and validation tests
There are two general strategies to inspect the source code of a computational model.
-The simplest and least rigorous is to compare the output to previous output; this tests code stability, not code correctness, but can still be useful (Oberkampf and Trucano, 2002).
-The expected output could be output previously produced by the code that has been examined by an expert in the field. While flawed, asserting the plausibility of the results is better than nothing at all. The test in this case can again be that the output has not changed from previous runs, except where expected.
-The expected output could come from a high-resolution simulation from another verified computational model of the same discretisation. However, analytical solutions are to be preferred as they remove the possibility of common algorithmic error among the implementations.
-The expected output could come from an analytical solution. The test in this case could be the quantification of error in the simulation or numerically computing the rate of convergence to the true solution as some discretisation parameter (h, ∆t, ...) tends to 0. Comparing the obtained rate of convergence against the theoretically predicted rate of convergence is generally considered the most powerful test available for ensuring that the discretised model is implemented correctly, as it is very sensitive to coding errors (Roache, 2002; Knupp et al., 2007).
-The analytical solution could come from the method of manufactured solutions (Salari and Knupp, 2000; Roache, 2002). This method involves adding extra source terms to the governing equations being solved in order to engineer an equation whose solution is known. It is a general and powerful technique for generating analytical solutions for use in error quantification or convergence analyses.
-Once the code verification tests have completed, solution verification for a library of simulations may take place. The functional could be some a posteriori estimate of the discretisation error, and the expected output that it is below a given tolerance. Formally, this solution verification step is only necessary when the discretisation has changed.
-Once model verification has established the correspondence between the computational model and the mathematical model, the output of the computational model may then be used in the process of validation of the mathematical model. For this purpose, the expected output is derived from a physical experiment. Again, the test could assert that the rate of convergence to the physical result is the same as theoretically predicted. Comparing output to experimental data asserts the applicability of both the computational and mathematical models to the physical world.
It is to be emphasised that model verification should happen before model validation, for otherwise the error introduced in the mathematical modelling cannot be distinguished from discretisation or coding errors (Babuška and Oden, 2004). Formally, this model validation step is only necessary when the mathematical model itself has changed.
In general, a test case is a set of input files, a set of commands to be run on those input files, some functionals of the output of these commands to be computed, and some comparisons against independent data of those functionals. While the purpose and level of rigour of the test changes with the source of the comparison data, this is irrelevant for the execution of the test itself. Indeed, the generality of this view is a great benefit to the design of the framework: code stability tests, code verification, solution verification and model validation can all be performed by the same system. Note that this conception of a test case encompasses both static and dynamic analysis: in static analysis, the command to be run is the analysis tool; in dynamic analysis, the command to be run is the model itself.
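This four-part abstraction (inputs, commands, functionals, assertions) can be sketched in a few lines of Python; the class and function names below are purely illustrative, not the actual harness interface:

```python
import subprocess

class TestCase:
    """A test case: commands to run, functionals of their output,
    and assertions comparing those functionals against reference data."""

    def __init__(self, commands, functionals, assertions):
        self.commands = commands        # commands producing output files
        self.functionals = functionals  # name -> zero-argument function
        self.assertions = assertions    # callables taking the functional dict

    def run(self):
        for cmd in self.commands:
            subprocess.run(cmd, shell=True, check=True)
        values = {name: f() for name, f in self.functionals.items()}
        for check in self.assertions:
            check(values)  # a failed assert raises AssertionError
        return values

def check_energy(values):
    # Code-stability style assertion against a stored reference value.
    assert abs(values["ke"] - 0.044503) < 1e-6

case = TestCase(
    commands=["echo 0.044503 > energy.txt"],  # stands in for a model run
    functionals={"ke": lambda: float(open("energy.txt").read())},
    assertions=[check_energy],
)
```

Stability tests, code verification and validation differ only in where the reference data inside the assertion comes from; the runner itself is identical in each case.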

Verification and validation as continuous processes
In reality, most computational models are both in production use by end-users and undergoing continual development. Even if the verification procedure has been passed for a previous revision, it does not necessarily mean that the next revision of the computational model will also pass. New development can and does introduce new coding errors. Therefore, verification must be seen as a continuous process: it must happen alongside the software development and deployment. Integrating this process alongside software development greatly eases the burden of deploying stable, working releases to end users.
Regarding verification as a continuous process has many benefits for the parallel process of software development. As new features are added, feedback is immediately available about the impact of these changes on the accuracy of the computational model; since coding errors are detected soon after they are introduced, they are easier to fix as the programmer is still familiar with the newly introduced code. Furthermore, as software development teams become large, it can be difficult to predict the impact of a change to one subroutine on other users of the model; if those other uses of the software are exercised as part of the continuous verification process then unintended side-effects can be detected early and fixed. Since the code is run continuously to check for correctness, other metrics can be obtained at the same time: for example, profiling information can be collected to detect any efficiency changes in the implementation.
In many fields, the appropriate mathematical model is well-known (such as the Navier-Stokes equations for computational fluid dynamics), but in the geoscientific context, development of the mathematical model is much more frequent, as new subgrid-scale models are developed and new equation sets are implemented. This motivates also regarding the validation procedure as a continuous process.

Verification and validation should be automated
Verifying every change to a code base as part of a continuous code verification procedure is laborious and repetitive. It is therefore a natural candidate for automation.
Automating the process of code verification means that checking for correctness can be performed simultaneously on multiple platforms with multiple compilers, platforms to which an individual developer may not have access. It also means that more tests can be run than would be practical for a single human to run; these tests can therefore check more code paths through the software, and can be more pedantic and time-consuming than a human would tolerate. In practice, without automation, the amount of continuous code verification is limited to the problems of immediate interest to the currently active development projects. Depending on the frequency of changes to the mathematical model, it may also be economical to automate or partially automate the validation procedure. Once the automated framework described here has been implemented, the amount of marginal work required to automate some part of the validation procedure is very small, and we therefore include validation cases as part of the test suite.

A framework for automated continuous verification and validation
This section discusses the technical details of the automated continuous verification and validation procedures.

Commit to source
A canonical copy of the source code is kept in a source code control system. A source code control system is a suite of software for managing the software development process. Developers check out a copy of the source code, make changes locally, and commit them back to the source repository. The source code control system merges changes in the case where another developer commits a change between a checkout and a commit. Source code control systems such as Subversion (Collins-Sussman et al., 2004) are an essential component of modern software development.
When a developer commits to the source code repository, that means that the code has changed, and thus any previous verification is irrelevant to the new version. The source code control system emits a signal to the test framework, notifying it that the source has changed and that the code verification process should begin.

Automated build
The automated verification procedure is managed by buildbot. Buildbot is a software package for the automation of software test cycles. It is designed on the client-server architecture: each machine that builds and tests the newly changed software runs a buildslave client, while the overall process is managed by a buildmaster server. It is the buildmaster that is notified by the source code control system. When the buildmaster is notified of a software change, it instructs the buildslaves to execute the steps defined in the buildbot configuration. The buildslave updates the copies of the source code it holds, and compiles the source with the compilers specified in the configuration. Any errors in the build process halt the code verification process immediately and are reported via email to the developers, who can examine the output of the build process to inspect the error messages. Currently, the Fluidity-ICOM project (Piggott et al., 2008) compiles 32- and 64-bit versions of each revision, in single and double precision and in various build configurations, with GCC and the Intel compilers.
Assuming the software builds successfully, the test cycle begins. There are two types of tests considered here: unit tests and test cases.

Unit tests
A unit test operates at the level of an individual unit of source code, for example a subroutine in a procedural language or a class method in an object-oriented language.
A unit test passes input to a unit of code and makes assertions about its output. Unit testing is a very powerful and useful technique for programming, as it allows the programmer to write down in an executable manner what is expected of a unit. Examples of unit tests include asserting that the derivative of a constant is zero, asserting that the eigendecomposition of a specified input matrix is correct, or asserting that the residual of a linear solve is less than the specified tolerance. Regular unit testing of individual pieces of code allows them to be relied upon in other, more complex algorithms. This also allows computationally expensive debugging tools such as valgrind and ElectricFence to be applied to individual components rather than to the whole code at once. With the increasing trend towards common components in software, the possibility of code changes in third party libraries introducing subtle bugs also increases. Unit testing is an excellent way to guard against this possibility, as it defines precisely what the software expects of the libraries it depends upon.
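As a concrete illustration, one of the example assertions above (a correct eigendecomposition) might be written as a unit test like the following Python sketch; the test class and matrix are illustrative only:

```python
import unittest
import numpy as np

class TestEigendecomposition(unittest.TestCase):
    """Unit test asserting that the eigendecomposition of a
    specified input matrix is correct, i.e. A v = lambda v."""

    def test_eigenpairs_satisfy_definition(self):
        A = np.array([[2.0, 1.0],
                      [1.0, 2.0]])  # small symmetric test matrix
        eigenvalues, eigenvectors = np.linalg.eigh(A)
        for i in range(len(eigenvalues)):
            v = eigenvectors[:, i]
            # Residual of the eigenvalue equation should be ~machine zero.
            residual = np.linalg.norm(A @ v - eigenvalues[i] * v)
            self.assertLess(residual, 1e-12)
```

Such a test runs in milliseconds, so it can be executed on every commit without noticeably lengthening the test cycle.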

Test cases
Test cases operate at the level of the entire computational model. The buildbot invokes the test harness, a piece of software which manages the execution of the test case.
A test case typically runs the newly built revision on a given simulation and makes assertions comparing the output to some external data source. The purpose and level of rigour of the test, and thus its usefulness, is determined by the reliability of the source of the external data. One advantage of this framework is that stability tests, verification and validation may be performed automatically by the same means: code stability tests make assertions against output data obtained from previous runs of the model, code verification tests make assertions against analytical solutions (possibly obtained by the method of manufactured solutions), solution verification makes assertions about discretisation errors, and model validation tests make assertions against physical data.
Seen abstractly, a test case consists of four things: some input files, a set of commands to be executed, functionals to be computed from the output of those commands, and assertions to be made about the result of those functionals. In order for the test case to be automatable, the test must be completely described in a machine-parseable way: in this framework, a test case is described by an XML document, with functionals of the output and the assertions described in Python code fragments embedded in the XML file. An example XML file is given in Fig. 1. Each test problem is assigned a name, which is used for printing out status messages. The <problem_definition> tag gives information about the expected length of the problem, which is used by the test harness for scheduling decisions, and the number of processors on which the problem is designed to run. The <command_line> tag contains the commands to be executed; typically this will be the commands to run the software. Other possible commands might be to run a static analysis tool on the source tree or to run a post-processing tool on software output.
Once the commands to be executed have completed, functionals of the output are computed. A functional is computed by a fragment of Python code in a <variable> tag. Making use of a widely available general-purpose scripting language such as Python gives great power and flexibility to the test author. Reference results can be retrieved from a relational database on a networked server, or the computed functional value stored in a database for future reference. Powerful scientific libraries such as SciPy (Jones et al., 2001), VTK (Schroeder et al., 2006) and SAGE (Stein et al., 2007) are made available to the programmer. These can be used for the computation of diagnostics, linear regressions, statistical analyses, image and signal processing, etc.
Once the functionals are computed, the test assertions are made. These also take the form of Python fragments, and typically are composed of one or more assert statements. The test is deemed to fail if the code raises a Python exception indicating that an error has occurred (in Python, a failed assert statement raises an AssertionError exception). If no exception is raised, the test is deemed to have passed. If an individual test is defined in a <pass_test> tag, its failure causes the entire test to fail; if the test is defined in a <warn_test> tag, a warning is issued.
Warning tests are typically used in code stability tests to notify that the results have changed, without necessarily implying that the change is due to an error.
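The pass/warn distinction can be implemented simply by catching exceptions raised from the embedded fragments; a minimal sketch of this mechanism follows (the function name and calling convention are illustrative, not the harness's real API):

```python
def run_assertion(fragment, variables, severity="pass"):
    """Execute an embedded Python assertion fragment against the
    computed functionals. A raised exception means the assertion
    failed; 'pass' tests fail the whole test case, while 'warn'
    tests only issue a warning."""
    try:
        exec(fragment, dict(variables))  # fragment sees functionals by name
        return "passed"
    except Exception as exc:  # typically AssertionError from a failed assert
        if severity == "warn":
            print(f"warning: assertion failed: {exc!r}")
            return "warned"
        return "failed"

# Usage: a functional computed earlier, and two assertion fragments.
functionals = {"kinetic_energy": 0.044511}
print(run_assertion("assert abs(kinetic_energy - 0.044503) < 1e-3", functionals))
print(run_assertion("assert abs(kinetic_energy - 0.044503) < 1e-6",
                    functionals, severity="warn"))
```

The second call fails its tight tolerance but, being a warn test, only reports the discrepancy rather than failing the test case.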
The test harness can integrate with a cluster management system and push the execution of test cases out to a batch queue. In the Fluidity-ICOM test suite, longer test cases are executed in parallel on a dedicated test cluster, while shorter test cases are automatically executed on dedicated workstations.

Automated profiling
The binaries compiled in Sect. 3.2 are built with the compiler flags necessary to turn on the collection of gprof profiling data (Graham et al., 1982). The compiler augments the code with routines to profile the execution of the code. If the test case passes, the profiling data is processed and stored for future reference. This enables developers to inspect the efficiency of each subroutine or class method of the code as a function of time, and to correlate any changes in efficiency with code modifications. The automated collection of other code quality metrics, such as cache misses or the presence of memory leaks, could also be performed as part of this framework.
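The stored profiling history makes automated efficiency regressions straightforward to flag; below is a minimal sketch of such a check, where the file layout, function name and 20% threshold are illustrative assumptions rather than part of the actual framework:

```python
import json
from pathlib import Path

def check_timing(test_name, runtime, history_file="timings.json", slowdown=1.2):
    """Compare the latest runtime of a test against its stored history and
    flag it if it exceeds `slowdown` times the previous best. Returns True
    if the timing is acceptable."""
    path = Path(history_file)
    history = json.loads(path.read_text()) if path.exists() else {}
    best = min(history.get(test_name, [runtime]))
    acceptable = runtime <= slowdown * best
    # Store the new measurement for future revisions regardless of outcome.
    history.setdefault(test_name, []).append(runtime)
    path.write_text(json.dumps(history))
    return acceptable

# Usage: the first run establishes a baseline; later runs are compared to it.
print(check_timing("lid_driven_cavity", 10.0))  # baseline run
print(check_timing("lid_driven_cavity", 11.5))  # within the 20% tolerance
print(check_timing("lid_driven_cavity", 13.0))  # a 30% slowdown: flagged
```

In practice the runtimes would come from the gprof data already being collected, so this check costs nothing extra per revision.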

Fluid dynamics test cases
In this section a selection of the test cases used in the verification and validation of a computational and geophysical fluid dynamics code is described. To begin, a number of tests which are suitable for use with a standard fluid dynamics model are described; this is followed by a number of tests suitable for models which incorporate buoyancy and Coriolis effects, for example geophysical fluid dynamics codes and ocean models. Note that for the problems presented here the model has been set up so as to optimise the efficiency of the test, i.e. to give rigorous checks on the code in minimal computational time. The model being tested here uses finite element discretisation methods on tetrahedral or hexahedral elements in three dimensions and triangular or quadrilateral elements in two dimensions. The underlying equations considered in the tests presented here include the advection-diffusion of scalar fields, the Navier-Stokes equations, and the Boussinesq equations with buoyancy and Coriolis terms included. The model has the ability to adapt the mesh dynamically in response to evolving solution fields. For background to the model see Pain et al. (2005) and Piggott et al. (2008).

The method of manufactured solutions: tracer advection
To test the implementation of the advection-diffusion and Navier-Stokes equations, spatial convergence tests are performed using the method of manufactured solutions (MMS; Roache, 2002). MMS provides an easy way of generating analytical solutions against which to verify model code. A sufficiently smooth desired analytical solution is designed and a suitable source term added to the right-hand side to ensure the validity of the equation. The source is calculated by substituting the desired analytical solution into the underlying differential equation.
The numerical equation is then solved on a series of successively finer meshes. The solution on each of these is then compared to the known exact solution and the order of convergence compared to the expected order for that method. When convergence of the solution is not seen it is an excellent indicator of an error in the model code or the numerical formulation. MMS has been shown to be highly effective at finding such problems (Salari and Knupp, 2000), and continuous monitoring of the results through an automated system allows errors that affect the order of convergence to be detected immediately. To test tracer advection-diffusion the desired analytical solution is taken as:

T(x, y, t) = sin(25xy) − 2y/x^(1/2), (1)

while a prescribed, spatially varying velocity field, u = (u, v), is also specified. The source term, S, is calculated symbolically using SAGE (Stein et al., 2007) by substituting T and u into the advection-diffusion equation. The computational domain is 0.1 ≤ x ≤ 0.6; −0.3 ≤ y ≤ 0.1, and is tessellated with a uniform unstructured Delaunay mesh of triangles with characteristic mesh spacing of h in the x and y directions. The analytical solution (Eq. 1) is used to define Dirichlet boundary conditions along the inflowing lower and left boundaries, while its derivative is used to define Neumann boundary conditions on the remaining sides. Both the boundary conditions and source term are defined through Python functions in the Fluidity-ICOM preprocessor (Ham et al., 2009), where the diffusivity, κ, is taken as 0.7.
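The symbolic derivation of the source term can be reproduced with any computer algebra system; the sketch below uses SymPy rather than SAGE, and, since the prescribed velocity field is not reproduced here, assumes an arbitrary smooth velocity purely for illustration:

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
kappa = sp.Rational(7, 10)  # diffusivity 0.7, as in the test description

# Desired analytical solution, Eq. (1).
T = sp.sin(25 * x * y) - 2 * y / sp.sqrt(x)

# An assumed smooth velocity field (illustrative only, not the test's choice).
u = sp.sin(x + y)
v = sp.cos(x - y)

# Substituting T and u into the steady advection-diffusion equation
#   u . grad(T) - kappa * laplacian(T) = S
# yields the manufactured source term S.
S = sp.simplify(
    u * sp.diff(T, x) + v * sp.diff(T, y)
    - kappa * (sp.diff(T, x, 2) + sp.diff(T, y, 2))
)
print(S)
```

By construction, T is then an exact solution of the advection-diffusion equation with source S, so the discrete solution can be compared against it at any mesh resolution.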
As we are performing a spatial convergence test, the desired solution is temporally invariant. However, the equation contains a time derivative and requires an initial condition. This is set to zero everywhere, leading to a numerical solution that varies through time. The simulation is terminated once this reaches a steady state (to a tolerance of 10^(-10) in the infinity norm).

Once a steady state has been obtained on all meshes the convergence analysis may be performed. Given the error, E, on two meshes with characteristic mesh spacings h_1 and h_2, where, for example,

E_1 = C h_1^p, (3)

with C a constant, discretisation-specific factor independent of the mesh, p the order of convergence of the method and r the refinement ratio (r = h_1/h_2 = 2 in this case), the ratio of errors is given by:

E_1/E_2 = (h_1/h_2)^p = r^p,

and the order of convergence can be calculated as:

p = log(E_1/E_2)/log(r).

This can then be compared to the expected order of convergence for a particular method.
Several model configurations and discretisations are used in the full testing suite. Here, the results of a first-order upwinding control volume (CV) discretisation and a second-order piecewise-linear Galerkin (P1) discretisation are presented. In both cases the Crank-Nicolson method is used to discretise in time. Table 1 demonstrates that the expected order of spatial convergence, or better, is achieved for both discretisations. Figure 2 shows the source term, the numerical solution using the P1 discretisation, and the absolute difference between this and the analytical solution at steady state, on meshes with average nodal spacings, h, of 0.08 and 0.01.
The method of manufactured solutions can also be used to test more complicated sets of equations involving multiple coupled prognostic fields, such as the Navier-Stokes equations. Initially an incompressible, smooth and divergence-free desired velocity field, u = (u, v), is considered, along with a desired pressure, p. These are substituted into the momentum equations, with tensor-form viscosity, using SAGE (Stein et al., 2007) to derive the required momentum source. Table 2 presents the convergence results for velocity and pressure on a series of unstructured meshes with successively smaller average mesh spacings. For both velocity and pressure the expected order of convergence, or better, is observed.
Further variables may be introduced by considering the fully compressible Navier-Stokes equations with a divergent desired velocity field, a spatially varying density field, ρ, and a desired internal energy, e. It is then possible to define the desired pressure field using a stiffened gas equation of state. In this case, coupled momentum, continuity and internal energy equations are solved, each of which requires a source term, S_u, S_ρ and S_e respectively, to be calculated, where the deviatoric stress tensor, τ, is linearly related by the viscosity, µ, to the strain-rate tensor. The derivation of these sources is omitted here for clarity but, as with previous MMS test cases, they are easily found using a symbolic maths toolkit (e.g. SAGE, Stein et al., 2007). The problem is considered in the computational domain −0.1 ≤ x ≤ 0.7; 0.2 ≤ y ≤ 0.8, which is tessellated using an unstructured mesh of triangles with successively smaller average mesh lengths. As before, a Galerkin discretisation is used for velocity (piecewise-quadratic elements) and pressure (piecewise-linear elements), while a streamline upwind Petrov-Galerkin (SUPG) discretisation is used for the internal energy (piecewise-linear) and density (piecewise-quadratic). The desired velocity is imposed via strong Dirichlet boundary conditions on all sides of the domain, while the other variables are prescribed on the lower and left inflowing boundaries. All the sources and boundary conditions are input using Python functions in the Fluidity-ICOM preprocessor, taking the square of the speed of sound, c_B^2, the reference density, ρ_0, the specific heat ratio, γ, and the viscosity, µ, as 0.4, 0.1, 1.4 and 0.7, respectively.
Table 3 presents the order of spatial convergence for all the prognostic variables in the compressible Navier-Stokes test case, all of which demonstrate the expected order of convergence.
The method of manufactured solutions is an extremely versatile code verification tool, being easily applicable to a large range of equation sets. As well as increasing the complexity of the problems considered, as done above, equations may also be simplified. By considering the terms in equation sets individually within an automated testing platform, any new coding error introduced may be pinpointed almost instantaneously, even within a large code base. This has motivated the development of over forty MMS test cases in the Fluidity-ICOM verification suite, all of which assert that the expected order of convergence is maintained after each revision of the code.

The lid-driven cavity
The lid-driven cavity is a problem that is often used as part of the verification procedure for CFD codes. The geometry and boundary conditions are simple to prescribe and in two dimensions there are a number of highly accurate numerical benchmark solutions available for a wide range of Reynolds numbers (Botella and Peyret, 1998; Bruneau and Saad, 2006; Erturk et al., 2005). Here the two-dimensional problem at a Reynolds number of 1000 is given as an example.
The unsteady momentum equations with nonlinear advection and viscosity terms are solved in a unit square in the x and y directions, along with the continuity equation, which enforces incompressibility. No-slip velocity boundary conditions are imposed on the boundaries x = 0, 1 and y = 0, and the prescribed velocity u = 1, v = 0 is set on the boundary y = 1 (the "lid"). The problem is initialised with a zero velocity field and the solution allowed to converge to steady state via time-stepping. A subset of the benchmark data available from the literature is then used to test for numerical convergence. Here this involves the calculation of the kinetic energy, which is compared against the value 0.044503 taken from Bruneau and Saad (2006). In addition, the x-component of velocity and pressure are evaluated at 17 points along the line x = 0.5 and compared against the data from Botella and Peyret (1998).
Plots of the solutions and benchmark data are given in Fig. 3. Also shown is a plot of the error convergence with mesh spacing. A regular triangular mesh is used with progressive uniform refinement in the x, y plane. Second-order spatial convergence can clearly be seen for the three quantities compared. The automated assertions in this case are that second-order convergence is attained and that the magnitude of errors in the three quantities does not increase with code updates.

Flow past a sphere: drag calculation
In this validation test uniform flow past an isolated sphere is simulated and the drag on the sphere is calculated and compared to a curve optimised to fit a large amount of experimental data.

The sphere is of unit diameter, centred at the origin. The entire domain is the cuboid defined by −10 ≤ x ≤ 20, −10 ≤ y ≤ 10, −10 ≤ z ≤ 10. The unsteady momentum equations with nonlinear advection and viscous terms, along with the incompressibility constraint, are solved. Free-slip velocity boundary conditions are applied at the four lateral boundaries, u = 1 is applied at the inflow boundary x = −10, and a free stress boundary condition is applied to the outflow at x = 20. A series of Reynolds numbers in the range Re ∈ [1, 1000] is considered. The problem is run for a long enough period that the low Reynolds number simulations reach steady state, and the higher Reynolds number runs long enough that a wake develops behind the sphere and boundary layers on the sphere are formed. This is deemed sufficient for the purposes of this test, which is not an in-depth investigation of the physics of this problem, nor an investigation of the optimal set of numerical options to use. Here an unstructured tetrahedral mesh is used along with a mesh optimisation (adaptivity) algorithm (Pain et al., 2001). Figure 4 shows a snapshot of the mesh and velocity vectors taken from a Reynolds number 1000 simulation. The mesh can be seen to be resolving the wake and the boundary layers on the sphere with enhanced anisotropic resolution. At higher Reynolds numbers the dynamics become more complex, and if a full numerical study were being conducted here more care would be taken in the choice of mesh optimisation parameters and the use of averaged values from simulations allowed to run for longer periods. The drag coefficient is calculated from

C_D = F_x / (½ ρ u_0^2 A),    F_x = ∫_S (−p n_x + τ_xi n_i) dS,

where ρ is the density, taken here to be unity; u_0 is the inflow velocity, here unity; and A is the cross-sectional area of the sphere, here π/4. F_x is the force exerted on the sphere in the free stream direction; S signifies the surface of the sphere; n is the unit outward-pointing normal to the sphere (n_x is the x-component and n_i the i-th component; here summation over repeated indices is assumed); p is the pressure and τ is the stress tensor; see Panton (1996).

Figure 4 also shows a comparison between the computed drag coefficient and a correlation (to a large body of laboratory data) taken from Brown and Lawler (2003). The assertions tested are that the differences between the computed drag coefficient and the values from the correlation (Eq. 18) at a number of Reynolds numbers lie within acceptable bounds. Checks on the number of nodes produced by the adaptive algorithm for the given error measure choice and other options are also conducted. While all of these simulations can be run comfortably in serial, the Reynolds number 100 and 1000 cases are performed on 8 cores, both to accelerate the tests and as a test of the parallel implementation.
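The shape of such an assertion is easy to sketch. The snippet below uses the commonly quoted form of the Brown and Lawler (2003) sphere-drag correlation; the correlation coefficients are reproduced from the commonly cited published form and the "computed" drag values and the 15 % tolerance are illustrative placeholders, not Fluidity-ICOM output:

```python
import math

def brown_lawler_cd(re):
    """Sphere drag coefficient correlation of Brown and Lawler (2003),
    in its commonly quoted form (coefficients assumed, not from this paper)."""
    return 24.0 / re * (1.0 + 0.150 * re**0.681) + 0.407 / (1.0 + 8710.0 / re)

# Assertion pattern used by the test: the computed drag coefficient must lie
# within a tolerance of the correlation value at each Reynolds number.
# The model values below are hypothetical placeholders.
computed = {1: 27.0, 10: 4.3, 100: 1.1, 1000: 0.48}
for re, cd_model in computed.items():
    cd_ref = brown_lawler_cd(re)
    assert abs(cd_model - cd_ref) / cd_ref < 0.15
```

The same pattern extends naturally to checks on node counts produced by the adaptive algorithm.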

Geophysical fluid dynamics examples
In this section some of the test cases used for the model in its "oceanographic mode" (i.e. with the incorporation of buoyancy and Coriolis effects) are also presented. Further useful test problems can be found in Haidvogel and Beckmann (1999), Williamson et al. (1992), Ford et al. (2004) and Giraldo and Restelli (2008). Model validation can also be conducted by comparing against real-world data, for example tide gauge data in tidal simulations and tsunami arrival times (Mitchell et al., 2010).

Lock exchange
In this validation problem an advection-diffusion equation for density and the Boussinesq equations for velocity and pressure are used to solve for the evolution of a system in which fluid of two densities is initialised next to one another in a rectangular tank in the (x, z) plane. The dense water slumps under gravity and moves under the lighter fluid. A Kelvin-Helmholtz instability causes the generation of overturning billows at the interface between the two densities, which contributes to the eventual mixing of the water column (Simpson, 1999). Here a no-slip velocity boundary condition is used at the bottom of the domain, with free-slip at the top. The domain is defined by 0 ≤ x ≤ 0.8, 0 ≤ z ≤ 0.1. A constant time step of 0.025 is used and the mesh is adapted every 10 time steps. A full description of the physical parameters used to set up this problem is given in Fringer et al. (2006) and Härtel et al. (2000), and a comprehensive study of this scenario in Fluidity-ICOM is given in Hiester et al. (2010). Here a case with a Grashof number of 1.25×10^6 has been used.
The speeds of the gravity current head in the horizontal at the upper and lower boundaries are derived by first extracting the maximum and minimum x values of an isosurface of the density field, and then computing the linear growth of each with time after an initial relaxation time (Fig. 5). These values are then compared with the values quoted in Härtel et al. (2000) and Fringer et al. (2006), namely −0.012835 for the no-slip boundary and 0.015093 for the free-slip boundary. Härtel et al. (2000) use direct numerical simulation (DNS) to study this problem, and hence these metrics of the flow dynamics are considered as truth.
The automated assertions tested are that the head speeds, computed from a time series of model output via a Python script, agree with the DNS values to within an allowed tolerance. Checks on the number of nodes used in the calculation are also performed.
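The head-speed diagnostic can be sketched as a least-squares slope fitted to the front positions after a relaxation period. The data here are synthetic (a front moving at the free-slip reference speed of 0.015093); in the actual test the positions come from isosurface extrema of the density field, and the relaxation time and tolerance are assumptions for illustration:

```python
# Sketch of the head-speed diagnostic: fit the linear growth of front
# position with time, discarding samples before a relaxation time.
t_relax = 5.0                                       # assumed relaxation time
times = [0.5 * i for i in range(41)]                # 0 .. 20
front = [0.4 + 0.015093 * t for t in times]         # synthetic front positions

# Keep only samples after the relaxation period
pts = [(t, x) for t, x in zip(times, front) if t >= t_relax]
n = len(pts)
tbar = sum(t for t, _ in pts) / n
xbar = sum(x for _, x in pts) / n

# Least-squares slope: the head speed
speed = (sum((t - tbar) * (x - xbar) for t, x in pts)
         / sum((t - tbar) ** 2 for t, _ in pts))

assert abs(speed - 0.015093) < 1e-3   # compare against the DNS value
```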

Stommel's western boundary current
This test involves the steady-state wind-driven barotropic circulation in a rectangular domain, and compares against an analytical solution. Stommel (1948) was the first to describe why one observes the intensification of boundary currents on the western side of ocean basins, for example the Gulf Stream in the North Atlantic. The streamfunction equation in the domain 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, with homogeneous Dirichlet boundary conditions, is considered; see Hecht et al. (2000). Here β = 50 is the north-south derivative of the assumed linear Coriolis parameter, F = 0.1 is the strength of the wind forcing, which takes the form τ = −F cos(πy), and R = 1 is the strength of the assumed linear frictional force. The analytical solution to Eq. (19) is used as the reference. Figure 6 shows a comparison of results obtained with uniform and anisotropic adaptive refinement. The form of the streamfunction yields a velocity field with strong shear in the direction normal to the western boundary. The error measure and mesh optimisation algorithm used here yield a mesh which has long, thin elements aligned with the boundary. The error plots show the high error focused in the western boundary region in the case of the uniform resolution mesh.
The automatic assertions here involve ensuring that the errors from the uniform and adaptive mesh calculations are within acceptable bounds of the analytical solution. In particular, the L2 norm of the error obtained with the adapted mesh is checked to be an order of magnitude lower than that with the fixed mesh, with the adapted mesh using approximately one quarter of the number of nodes. For further details of this problem solved using Fluidity-ICOM with isotropic and anisotropic mesh adaptivity, see Piggott et al. (2009).
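The assertion logic itself is simple; a minimal sketch, in which the error norms and node counts are illustrative placeholders rather than Fluidity-ICOM output, and the factors 0.1 and 0.3 stand in for the tolerances the real test would use:

```python
# Sketch of the automated assertions for the Stommel test: the adapted-mesh
# error should be an order of magnitude below the fixed-mesh error, achieved
# with roughly a quarter of the nodes.  Values are hypothetical placeholders.
error_fixed, nodes_fixed = 4.2e-3, 10000    # uniform-mesh run (placeholder)
error_adapt, nodes_adapt = 3.1e-4, 2600     # adaptive run (placeholder)

assert error_adapt < 0.1 * error_fixed      # order of magnitude improvement
assert nodes_adapt < 0.3 * nodes_fixed      # roughly a quarter of the nodes
```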

As geoscientific simulations become ever more complex, the software complexity of the computational models increases with them; the standard of software engineering used to write and manage those models must therefore rise also. The widespread deployment of automated frameworks such as that described here is a necessary step if society at large is to trust the results of geoscientific models.

Static analysis involves (usually automated) inspection of the source code itself for issues such as uninitialised variables, mismatched interfaces and off-by-one array indexing errors. Dynamic analysis involves running the software and comparing the output (some functional of the solution variables) to an expected output. The source of the expected output determines the rigour and purpose of the test; various sources are possible.

The test cases presented here are selected to illustrate a range of problem formulations and test statistics, and not necessarily to optimise the accuracy of the overall calculation or of the particular metric being used. The actual test suite employed by Fluidity-ICOM contains well in excess of two hundred tests.

For the incompressible Navier-Stokes manufactured solution test case, the momentum source term is

S = ( ρ(cos(x)sin(x)sin²(y) + cos(x)sin(x)cos²(y)) + 2µ sin(x)cos(y) − sin(x)cos(y),
      ρ(cos(y)sin(y)sin²(x) + cos(y)sin(y)cos²(x)) − 2µ cos(x)sin(y) − cos(x)sin(y) ).

The incompressible Navier-Stokes equations are then solved in the computational domain 0 ≤ x ≤ π, 0 ≤ y ≤ π, tessellated using an unstructured Delaunay mesh of triangles. Velocity is discretised using a piecewise-quadratic Galerkin discretisation while pressure uses piecewise-linear elements. Strong Dirichlet boundary conditions for velocity are provided on all sides of the domain using the desired solution, while pressure has natural homogeneous Neumann boundary conditions enforced. Both the strong boundary conditions and the source term are defined through Python functions in the Fluidity-ICOM preprocessor (Ham et al., 2009), while the density, ρ, and viscosity, µ, are taken as 1.0 and 0.7, respectively.
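The source term above is consistent with the divergence-free manufactured solution u = (sin(x)cos(y), −cos(x)sin(y)), p = cos(x)cos(y); this solution is inferred from the quoted source term rather than stated explicitly here, so treat it as an assumption. The sketch below checks that inference by evaluating the momentum residual ρ(u·∇)u − µ∇²u + ∇p with finite differences and comparing it to the quoted source term:

```python
import math

rho, mu = 1.0, 0.7  # density and viscosity as in the text

# Manufactured solution (inferred from the quoted source term)
u = lambda x, y: math.sin(x) * math.cos(y)
v = lambda x, y: -math.cos(x) * math.sin(y)
p = lambda x, y: math.cos(x) * math.cos(y)

def source(x, y):
    """Momentum source term as quoted in the text."""
    sx = (rho * (math.cos(x)*math.sin(x)*math.sin(y)**2
                 + math.cos(x)*math.sin(x)*math.cos(y)**2)
          + 2*mu*math.sin(x)*math.cos(y) - math.sin(x)*math.cos(y))
    sy = (rho * (math.cos(y)*math.sin(y)*math.sin(x)**2
                 + math.cos(y)*math.sin(y)*math.cos(x)**2)
          - 2*mu*math.cos(x)*math.sin(y) - math.cos(x)*math.sin(y))
    return sx, sy

def residual(x, y, h=1e-4):
    """rho (u.grad)u - mu lap(u) + grad(p), by central finite differences."""
    def dx(f): return (f(x+h, y) - f(x-h, y)) / (2*h)
    def dy(f): return (f(x, y+h) - f(x, y-h)) / (2*h)
    def lap(f): return ((f(x+h, y) - 2*f(x, y) + f(x-h, y))
                        + (f(x, y+h) - 2*f(x, y) + f(x, y-h))) / h**2
    rx = rho*(u(x, y)*dx(u) + v(x, y)*dy(u)) - mu*lap(u) + dx(p)
    ry = rho*(u(x, y)*dx(v) + v(x, y)*dy(v)) - mu*lap(v) + dy(p)
    return rx, ry

# The PDE residual of the manufactured solution should match the source term
for (x, y) in [(0.5, 1.0), (1.3, 2.1), (2.9, 0.2)]:
    sx, sy = source(x, y)
    rx, ry = residual(x, y)
    assert abs(sx - rx) < 1e-5 and abs(sy - ry) < 1e-5
```

This is exactly the kind of self-consistency check the method of manufactured solutions relies on: the source term is chosen so that the prescribed fields solve the governing equations exactly.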

Fig. 1. Example test case, as described in Sect. 3.4. The test is described by an XML file. The command to be executed is recorded in the command line tag. Functionals of the output to be computed are recorded in variable tags. Assertions to be made about the values of the functionals are recorded in test tags.
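Based only on the tag names mentioned in the caption, such a test description might look roughly like the following sketch; the element and attribute names beyond those in the caption, the file names and the helper function are assumptions for illustration, not the exact Fluidity-ICOM schema:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch of a test description file -->
<testproblem>
  <name>stommel_western_boundary</name>
  <command_line>fluidity stommel.flml</command_line>
  <variables>
    <!-- Functionals of the output, computed by embedded Python -->
    <variable name="error_l2" language="python">
      error_l2 = compute_l2_error("stommel_out.vtu")  # hypothetical helper
    </variable>
  </variables>
  <pass_tests>
    <!-- Assertions about the values of the functionals -->
    <test name="error_within_bounds" language="python">
      assert error_l2 &lt; 1.0e-3
    </test>
  </pass_tests>
</testproblem>
```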
Automated continuous testing is widely regarded as industry best practice in the software engineering community, but this message has not yet fully penetrated the numerical modelling community. Rigorous verification and validation is necessary for users to have confidence in the model results, and is generally a requirement for deployment to industry. If the model is under active development, these processes must run continuously as the model changes; they should therefore be automated. This paper has presented an overview of the software infrastructure used to automate the Fluidity-ICOM test suite, as well as several of the test cases used. The deployment of the test suite has yielded dramatic improvements in code quality and programmer efficiency. Almost no developer time is now wasted investigating the failure of simulations that used to work. Since feedback about a change to the code is given almost immediately, any errors introduced by new development can be rapidly fixed. As the test suite acts to lock in correct behaviour of the computational model, the model becomes provably more efficient and more accurate over time.

Table 1. Spatial order of convergence results for the method of manufactured solutions advection-diffusion test case described in Sect. 4.1.1. The differences between the analytical and numerical solutions using a first-order control volume (CV) discretisation and a second-order piecewise-linear Galerkin (P1) discretisation are calculated in the L2 norm. The ratio between these on two spatial mesh resolutions, h1 and h2, is used to estimate the order of spatial convergence of the model for this problem. The expected order of convergence, or better, is observed for both spatial discretisations.

Table 2. Spatial order of convergence results for the method of manufactured solutions incompressible Navier-Stokes test case described in Sect. 4.1.2. The differences between the analytical and numerical solutions, using a piecewise-quadratic velocity, (u, v), and piecewise-linear pressure, p, Galerkin discretisation, are calculated in the L2 norm. The ratio between these on two spatial mesh resolutions, h1 and h2, is used to estimate the order of spatial convergence of the model for this problem. The expected order of convergence is observed for all variables.

Table 3. Spatial order of convergence results for the method of manufactured solutions compressible Navier-Stokes test case described in Sect. 4.1.2. The differences between the analytical and numerical solutions, using a piecewise-quadratic velocity, (u, v), piecewise-linear pressure, p, piecewise-quadratic density, ρ, and piecewise-linear internal energy, e, Galerkin discretisation, are calculated in the L2 norm. The ratio between these on two spatial mesh resolutions, h1 and h2, is used to estimate the order of spatial convergence of the model for this problem. The expected order of convergence is observed for all variables.