Multifidelity Monte Carlo Estimation for Efficient Uncertainty Quantification in Climate-Related Modeling

. Uncertainties in an output of interest that depends on the solution of a complex system (e.g., of partial differential equations with random inputs) are often, if not nearly ubiquitously, determined in practice using Monte Carlo (MC) estimation. While simple to implement, MC estimation fails to provide reliable information about statistical quantities (such as the expected value of the output of interest) in application settings such as climate modeling for which obtaining a single realiza-tion of the output of interest is a costly endeavor. Specifically, the dilemma encountered is that many samples of the output of 5 interest have to be collected in order to obtain an MC estimator having sufficient accuracy; so many, in fact, that the available computational budget is not large enough to effect the number of samples needed. To circumvent this dilemma, we consider using multifidelity Monte Carlo (MFMC) estimation which leverages the use of less costly and less accurate surrogate models (such as coarser grids, reduced-order models, simplified physics, interpolants, etc.) to achieve, for the same computational budget, higher accuracy compared to that obtained by an MC estimator or, looking at it another way, an MFMC estimator obtains 10 the same accuracy as the MC estimator at lower computational cost. The key to the efficacy of MFMC estimation is the fact that most of the required computational budget is loaded onto the less costly surrogate models, so that very few samples are taken of the more expensive model of interest. We first provide a more detailed discussion about the need to consider an alternate to MC estimation for uncertainty quantification. Subsequently, we present a review, in an abstract setting, of the MFMC approach along with its application to three climate-related benchmark problems as a proof-of-concept exercise. = t


Introduction
In many application settings -climate modeling being a prominent one -large computational costs are incurred when solutions to a given model are approximated to within an acceptable accuracy tolerance. In fact, this cost can be prohibitively large when one has to obtain the results of multiple simulations, as is the case for, e.g., uncertainty quantification, control and optimization, to name a few. Thus, there is often a need for compromise between the accuracy of simulation algorithms and 20 the number of simulations needed to obtain, say, in the uncertainty quantification setting, accurate statistical information.
For example, consider the following case which represents the focus of this paper. Suppose one has a complex system, say, a system of discretized partial differential equations, for which the input data depends on a vector of randomly distributed parameters z. Letting u(z) denote the solution of the discretized partial differential equations, we define an output of interest F u(z) = F (z) that depends on u(z) so that, of course, it also depends on the choice of z. Next, suppose we wish to use Monte Carlo sampling to estimate the expected value Q = E [F (z)] of the output of interest F (z), i.e., we have the Monte Carlo estimator of Q given by where {z m } M m=1 denotes a set of M independent and identically distributed samples of the random vector z. This is a common task in uncertainty quantification which provides useful statistical information in both predictive and inferential applications.
smoothness of solutions to achieve better accuracy. Furthermore, most, if not all, of these methods are superior to Monte Carlo sampling only for moderate dimension of the parameter vector z; see the references just cited.
A second approach towards reducing the cost of Monte Carlo (and for that matter for any type of) uncertainty estimation is 60 to use approximate solutions of the PDE system that are less costly to obtain compared to the cost of obtaining the approximation of actual interest. For example, using simulations obtained using coarser grids or using reduced-order models such as reduced-basis or proper orthogonal decomposition methods are less costly as are interpolation and support vector machine approximations; see, e.g., Cristianini and Shawe-Taylor (2000); Fritzen and Ryckelynck (2019) ;Keiper et. al. (2018) ;Quarteroni and Rozza (2014) ;Quarteroni et. al. (2016); Steinwart and Christmann (2008). However, such approaches, by definition, result 65 in less accurate approximations compared to the accuracy that one wants to achieve.
In this paper we do not consider any of the possible alternate sampling schemes nor do we exclusively consider using less costly and less accurate approximate solutions of the PDE system. Instead, because of the near ubiquity of its use in practice, our goal is to outperform traditional Monte Carlo estimation by using a nontraditional Monte Carlo sampling strategy and in so doing refrain from incurring any loss of accuracy. To meet this goal, we invoke multifidelity Monte Carlo estimation 70 which, in addition to the expensive and accurate PDE system approximation of interest (hereafter referred to as the "truth" approximation) also uses cheaper to obtain and less accurate approximations (which are referred to as the "surrogates").
The bottom line is that multifidelity Monte Carlo estimation meets our goal by leveraging increased sampling of the less accurate/less costly approximations alongside low sampling of the more expensive/more accurate truth approximation. The multifidelity Monte Carlo algorithm systematically determines the number of samples taken from each surrogate (i.e., there 75 is no guess work involved) and systematically (i.e., again no guess work involved) combines the samples of the surrogates to obtain the desired estimator. We note that multifidelity Monte Carlo estimation has already been shown to outperform Monte Carlo estimation in a variety of application settings; see, e.g., Clare et. al. (2022) ;Dimarco et. al. (2022); Konrad (2019) ;Law et. al. (2022); Modderman (2021) ;Quick et. al. (2019);Rezaeiravesh et. al. (2020);Romer et. al. (2020);Valero et. al. (2022) for some examples.

80
In Section 2, we review how the Monte Carlo and multifidelity Monte Carlo estimators are constructed in an abstract setting; here we follow the expositions in Peherstorfer et. al. (2016) and also . Then, in Sections 3 and 4 we demonstrate the effectiveness of multifidelity Monte Carlo estimation using three well-known climate-related benchmarks.
Sections 3.1 and 3.2 are respectively devoted to single layer shallow-water equations model for the SOMA Test Case of Wolfram et. al. (2015) and Test Case 5 of Williamson et. al. (1992). Section 4 is devoted to a benchmark case for the first-order 85 model for ice sheets; see Blatter (1995); Pattyn (2003) ;Perego et. al. (2012);Tezaur et. al. (2015). We close by providing some concluding remarks in Section 5.

Monte Carlo and multifidelity Monte Carlo estimators
An abstraction of the specific settings considered in Sections 3 and 4 involves first vector of parameters z ∈ Γ, where Γ denotes a parameter domain.
Note that the input data to this PDE system, e.g., forcing terms, initial conditions, coefficients, etc., could depend on one or more of the components of the random vector z. Moreover, -we are interested in situations such that, for any z ∈ Γ, obtaining u(z) is a costly endeavor.
In addition,

95
-we define a scalar output of interest (OoI) F 1 (z) = F 1 u(z) that depends on the solution u(z) of the (discretized) PDE system; of course, if obtaining u(z) is costly, then so is obtaining F 1 (z).
While they are not considered in this work, vector-valued outputs of interest can also be treated with multifidelity Monte Carlo techniques. OoIs could be, e.g., averages or extremal values of the energy, etc., associated with the solution u(z).

Having defined an OoI
where E [ · ] denotes the expected value with respect to z. Because the estimation of Q 1 = E F 1 is the central goal of this paper, -we refer to F 1 (z) as the "truth" output of interest.

105
-a Monte Carlo (MC) sampling method is used to (approximately) quantify the uncertainty in the chosen OoI F 1 (z).

Specifically,
-MC sampling is used to estimate the quantity of interest Q 1 = E F 1 corresponding to F 1 (z), i.e., we have that -the MC estimator of the QoI Q 1 = E F 1 is given by where {z m } M1 m=1 denotes a set of M 1 randomly selected points in the parameter domain Γ. We are then faced with the following dilemma: on the one hand, -obtaining an acceptably accurate MC estimator Q M C 1 of the QoI Q 1 requires obtaining the solution u(z) of the discretized PDE system at many randomly selected points z ∈ Γ, 115 and, on the other hand, -each of those approximate solutions u(z) are computationally costly to obtain.
Quantifying uncertainties in climate system settings are victimized by this two-headed dilemma to the extent that, e.g., accurate long-time integrations can often not be realized in practice.
Due to the issue of prohibitive computational cost, we turn to multifidelity Monte Carlo (MFMC) methods for uncertainty 120 quantification. MFMC methods leverage the availability of surrogate outputs of interest F k (z), k = 2, . . . , K, which have smaller computational complexity compared to that of the truth OoI F 1 (z). As mentioned in Section 1, there are many types of surrogates that can be used for this purpose, e.g., discretized PDEs with coarser spatial grids and larger time steps, reducedbasis and proper orthogonal decomposition-based reduced-order models, and interpolants of F 1 (z), to name just a few.
For the truth and the K − 1 surrogate OoIs, Monte Carlo (MC) sampling yields the K unbiased estimators where z 1 , . . . , z M k denote M k i.i.d. samples of z ∈ Γ. The goal of an MFMC method is using an appropriate linear combination of the K MC estimators Q M C k in (3) to estimate the QoI Q 1 = E F 1 . Specifically, we define the MFMC estimator as where {α k } K k=2 denotes a sequence of scalar weights and {M k } K k=1 denotes a nondecreasing sequence of integers defining the 130 numbers of samples.
Letting C k denote the cost of evaluating the k th output of interest F k , the costs of computing the respective MC and MFMC estimators are given as where C = {C k } K k=1 and M = {M k } K k=1 denote K-vectors formed from the previously introduced sequences.

135
The variance σ 2 k of the k th output of interest F k and the correlation ζ k,k ′ between the outputs of interest F k and F k ′ are given by, for k ′ , k = 1, . . . , K, respectively, where Cov [·, ·] denotes the statistical covariance. We then have that the mean-squared error (MSE) incurred by the MC estimator Q M C k of the QoI Q k is given by whereas the MSE incurred by the Q M F M C estimator of the QoI Q 1 is given by (see Peherstorfer et. al. (2016)) Note that it can be shown (see e.g. Peherstorfer et. al. (2016)) that the MSE for Q M F M C is lower than that for Q M C C k C 1 ζ 2 1,k − ζ 2 1,k+1 < 1.
Given a fixed computational budget B, MFMC aims to construct an optimal sampling strategy M = {M k } K k=1 along with an optimal set of weights α = {α k } K k=2 so that the MSE (6) of the multifidelity estimator Q M F M C is lower than the MSE (5) of the Monte Carlo estimator Q M C 1 . Viewed differently, this means that an appropriate estimator Q M F M C can achieve a fixed MSE ε > 0 at a smaller computational cost compared to that incurred for the Monte Carlo estimator Q M C 1 achieving MSE ε.
150 In Peherstorfer et. al. (2016, unique optimal values of M = {M k } K k=1 and α = {α k } K k=2 are analytically obtained by minimizing the MSE (6) of the Q M F M C estimator (4) over the real numbers. However, the values M must certainly be integers for practical use, and the heuristic of Peherstorfer et. al. (2016) which determines the optimal values can result in either a biased estimator of the expectation or (with naïve modification) a violation of the given budget B. For "small" computational budgets, the consequences of this can be quite severe as illustrated in .

155
Because "small" computational budgets are of high interest for climate modeling, here, instead of using the MFMC method of Peherstorfer et. al. (2016of Peherstorfer et. al. ( , 2018, we use the modified MFMC method of  which guarantees that the optimal sampling numbers are integers and that the computational budget is not exceeded. In that method, instead of simply minimizing the MSE e(Q M F M C ) defined in (6), the modified MFMC estimator is determined through the use of at least some of the sequential minimization problems given by 160 for k = 1, 2, . . . , K and, if k > 1, for given M 1,1 , . . . , M k−1,k−1 , minimize the functional where µ k ′ ,k for k ′ = k + 1, . . . , K, λ k , and ξ k > 0 are Lagrange multipliers.
In (8), the first term added to the MSE e(Q M F M C ) enforces the budget constraint, or, more precisely, for each k, enforces that the remaining available budget suffices to not exceed the given budget B after that budget has been depleted by the sampling already effected for the OoIs F k ′ , k ′ = 1, . . . , k − 1, where, of course, for k = 1 the whole budget B is available. The Remark 2.1. It is clear from (11) that the weights α * k ′ ,k depend only on input data, specifically on the variances and correlations for all k = 1, . . . , K and all k ′ = k + 1, . . . , K.
Thus, the set of weights {α * k ′ ,1 } K k ′ =2 determined from the minimization of L 1 suffices to determine α * k ′ ,k for all such k ′ and k, i.e., all α * k ′ ,k for all k ′ and k are determined once and for all by the minimization of L 1 . □ A. Optimal non-integer sampling numbers for the modified MFMC method; see  • Let {F k } K k=1 denote the set of computational outputs of interest with correlation coefficients {ζ 1,k } K k=2 and computational costs {C k } K k=1 that respectively satisfy The first requirement is easily satisfied by a reordering of the of the surrogate outputs of interest F 2 , . . . , F K . If after that reordering, the second requirement is not satisfied for some k ∈ {2, . . . , K}, then that F k is removed from the list of surrogates.
• Then, for each k = 1, . . . , K, the unique global minimizer (M * k , α * k ) of the functional L k is given by and Unfortunately, as was the case in Peherstorfer et. al. (2016Peherstorfer et. al. ( , 2018, the results given in box A do not immediately lead to a 175 practical MFMC method because the sample numbers M * k ′ ,k given in (10) and (11) are not, in general, integers. Moreover, as was also the case in Peherstorfer et. al. (2016Peherstorfer et. al. ( , 2018, the first and perhaps even the first few sampling numbers at stage k ′ = 1 may be < 1 in addition to being noninteger. In those papers, a rounding procedure is implemented, i.e., the sample numbers M k ′ ,1 are replaced by integers, though this is unsuitable for scenarios where M 1,1 < 1 as the choice M 1,1 = 0 leads to a biased estimator while the choice M 1,1 = 1 exceeds the computational budget. On the other hand, the modified MFMC method in 180 box B below is constructed to avoid these issues while preserving the optimality of the original solution (up to some amount of rounding). Note that in that box and elsewhere, ⌊ · ⌋ denotes rounding downwards to the nearest integer.

Ib. simply rounding downwards to the nearest integer produces integer sample numbers
, of sampling number optimal under the constraint M1,1 = · · · = M k ′ ,k ′ = 1 and preserves the computational budget. We then set K = k − 1 and exit to step IV.

Otherwise,
} which may or may not contain entries less than 1. If it does not, the model evaluation vector becomes . We then setK = k and exit to step IV.
Otherwise, III. we increment k → k + 1 and return to step Ia.
IV. The process terminates and the final set of sample numbers is given by 3 Tests for the single-layer rotating shallow-water equations Consider the single-layer rotating shallow-water equations (RSWEs) posed on the domain Γ × [0, T ] and given by (see Wallis (2012)) where Γ denotes the surface of a sphere or a subset of that surface,  h(x, t) denotes the fluid thickness, u(x, t) denotes the vector velocity field tangential to the surface of the sphere, G(h, u) denotes the a forcing term that depends on the specific setting, f denotes the Coriolis parameter, ρ denotes the density, g denotes the (constant) acceleration due to gravity, Note that the expression (k · ∇ × u + f )/h appearing in the velocity equation is known as the potential vorticity; see Wallis The RSWEs represent a useful simplification of the primitive equations (Wallis (2012)) which are commonly used in oceanic and atmospheric modeling and which are obtained by assuming a small ratio between the vertical and horizontal length scales.
In this way, the single-layer RSWEs describe the motion of a thin layer of fluid which lies on a rigid surface, yielding conditions 205 used extensively in the modeling of oceanic and atmospheric flows.
Spatial discretization of the system (12) is effected using the TRiSK scheme (Ringler et. al. (2010); Thuburn et. al. (2009)) which is a staggered C-grid mimetic finite difference/finite volume scheme that preserves desirable physical properties including conservation laws for mass, energy, and potential vorticity. TRiSK discretization involves the approximation h ℓ of the height h at the center of a grid cell P ℓ , the approximation u e of the normal to the edge component u · n of velocity at the Temporal discretization of the system (12) is effected using an explicit 4 th -order Runge-Kutta method, although many alternative time-stepping schemes have also been proposed for this purpose; see, e.g., Leng et. al. (2019);Meng et. al. (2020); 215 Trahan and Dawson (2012). We denote by {t n } Nt n=0 with t 0 = 0 and t Nt = T the time instants used for temporal discretization.

A SOMA test case for the RSWE system
The specific setting considered here is the benchmark test case referred to as "simulating ocean mesoscale activity" (SOMA) (Wolfram et. al. (2015)) which involves a geodesic basin with radius 1250km centered at latitude/longitude (35 • , 0 • ) on the surface of the Earth. Inside the basin, the depth of fluid varies from 2500m at the center to 100m on the coastal shelf, creating a 220 realistic topography which yields interesting dynamical behaviors which are useful for studying the propagation of fronts and eddies.
In (12), with f bottom (h, u) denoting a forcing term due to the bottom drag and f wind (h) denoting a forcing term due to wind drag. Here, 225 c bottom denotes the bottom drag coefficient chosen to be 10 −3 , ρ denotes the density, and τ wind denotes the surface wind stress.
See, e.g., Wolfram et. al. (2015) for a discussion of this choice for G(h, u). A representative SCVT meshing of Γ is illustrated in Figure 1. Note that for the SCVT grids used by the TRiSK scheme, the  The output of interest we consider is the maximum sea-surface height which is a common benchmark in oceanic RSWE simulations and which is relevant to, e.g., the detection of phenomena such as flooding. In particular, we simulate the RSWE until the final time of T = 3 days using the finest grid (8km resolution) of 120, 953 cells and then choose the OoI F 1 soma given by where h ℓ denotes the sea-surface height (also called fluid thickness) at the center of the cell P ℓ at the final time T = 3.
The quantity of interest we choose considers the effect that perturbations of the initial velocity u 0 = u(0) have on OoI defined in (13) depends on the choice of z in that interval, i.e., we have that F 1 soma = F 1 soma (z). In particular, we choose the QoI to be the expectation Q 1,soma = E F 1 soma of the output of interest F 1 soma (z) defined in (13). Note that in practical RSWE simulations, as is the case for more sophisticated models such as the primitive equations, an approximation of the initial data u 0 is often obtained from a pre-processing procedure in which the RSWE system is spun-up from rest, i.e., from a zero initial condition for u, up to some specified time; see Anderson et. al. (1975); Bleck and Boudra 245 (1986). This procedure is invoked so as to eliminate transient artifacts which are not present in the current ocean or atmosphere and produces an initial configuration which is closer to observed oceanic and atmospheric data. The outcome of the spin-up calculation over the spin-up time frame of 15 days is illustrated in Figure 2. The initial conditions u 0 and h 0 that supplement (12) are then simply set to the outcome of this pre-processing step, i.e., to the sea-surface height and velocity obtained at the end of the spin-up calculation.

MC and MFMC estimators
The MC estimator Q M C 1,soma of Q 1,soma = E F 1 soma is given by (2). Unfortunately, obtaining acceptable accuracy using a MC estimator suffers from a double shortcoming. First, for any given z, obtaining F 1 soma is a costly endeavor because it requires the solution of the discretized RSWE system to obtain the necessary 120,953 values of h ℓ (z). Second, to obtain an MC estimator that is acceptable accuracy requires obtaining F 1 soma for many sample values of z.

255
Naturally, to mitigate this double shortcoming, we turn to the MFMC estimator described in box B of Section 2. To do so, we define three surrogate OoIs F 2 soma (z), F 3 soma (z), and F 4 soma (z), all three of which are less costly to obtain compared to that of F 1 soma (z). Here, F 2 soma (z) and F 3 soma (z) are simply based on solving the discretized RSWE system using the coarser 16km and 32km grids, respectively. The third surrogate F 4 soma (z) is the piecewise-linear interpolant based on the values of F 1 soma (z) which is exact at the three points z = {−0.5, 0, 0.5}. These four OoIs constitute a reasonable multifidelity ensemble of models which are available during computational ocean studies, even more realistic ones such as primitive equation models.
Moreover, all four OoIs can be leveraged for MFMC estimation whereas more traditional MC and multilevel MC estimators only make use of the first or the first three OoIs, respectively. Approximations of the costs and correlations for the four OoIs 265 where C k for k = 1, 2, 3 denote the average computation time (in wall-clock seconds) necessary to advance the relevant discretized RSWE system one time step, computed using 500 time steps for the simulation parameter z = 0.001. Note that the cost C 4 is assigned arbitrarily because the cost of evaluating the interpolant is negligible. These cost-correlation pairs are ideal for MFMC estimation, i.e., the surrogates are very well correlated with F 1 soma (z) and are much less costly to obtain compared to F 1 soma (z). Note that the 100 samples of z used to determine the data in (14) can be reused when determining the MC and 270 MFMC estimators.
From (14), we observe that the more expensive surrogate F 2 soma (z) is slightly less correlated to F 1 soma (z) than is the cheaper surrogate F 3 soma (z). So, to satisfy the first criteria in (9), the correlations should be ordered in the decreasing order ζ 1,1 , ζ 1,3 , ζ 1,2 , and ζ 1,4 . However, it turns out that then the second criteria in (9) for k = 2 is not satisfied so that the F 2 soma (z) is removed from the list of surrogates when computing the MFMC estimator. That estimator is given by (4) with K = 4 and where k = 2 275 is excluded from the sum. This is an example which illustrates, as discussed in Section 2, that not all surrogates one chooses contribute to the efficiency of MFMC estimation and, on the other hand, their omission does no harm to the accuracy of that estimator.
As already alluded to, the goal is to compare, for the same budget, the MSEs of the approximations Q M C soma and Q M F M C soma to the exact QoI Q 1,soma = E F 1 produced by the MC and MFMC estimators. However, there is still an additional challenge 280 to be dealt with, namely that it is unclear how to choose a reference QoI that one can use to determine these MSEs, as only It is interesting that most of the computational budget is loaded onto the very crude piecewise-linear interpolant approximation F 4 soma (z), allowing MFMC to achieve not only a lower MSE but also an estimate

Test Case 5 for the RSWE system
We again consider the RSWE system (12) but now Ω denotes the whole surface of the sphere. Specifically, we consider the configuration of Test Case 5 defined in Williamson et. al. (1992), which is widely used as a benchmark and employed as a stepping stone towards more realistic atmospheric models; see, e.g., Leng et. al Williamson et. al. (1992). Note that for this example there are no additional forces acting on the system, so that G(h, u) = 0 in (12). Test Case 5 considers the flow over an isolated mountain centered at longitude λ c = 3π 2 and latitude θ c = π 6 with height where a = π 9 , r 2 = min{a 2 , (λ − λ c ) 2 + (θ − θ c ) 2 }, and λ, θ denote longitude and latitude, respectively. In (15), z 1 denotes a 305 random variable that is uniformly distributed over the interval [1km , 3km].
The initial tangential (to the sphere) velocity in the longitudinal and latitudinal directions is chosen to be u 0 (z 2 ) = (z 2 cos θ, 0), where z 2 denotes a random variable that is uniformly distributed over the interval [15m/s , 25m/s]. The initial sea-surface height h is chosen as where h = 5.96km, R = 6371.22km, and ω = 7.292 × 10 −5 s −1 . With this, the solution u(z) and h(z) of the RSWE system (12) depends on the random vector z = (z 1 , z 2 ) ∈ [1km , 3km] × [15m/s , 25m/s]. In Figure 4 we provide an example of the initial thickness h 0 (z 2 ) and the thickness h(z 1 , z 2 ) after 10 days for specific values of z 1 and z 2 .
For the simulation results given in Figure 4 and for other results in this subsection, we use the TRiSK scheme for spatial discretization and a 4th-order explicit Runge-Kutta method for temporal discretization. SCVT uniform gridding is employed 315 as in Section 3.1, although now the grid covers the whole surface of the sphere. A comparison of two such SVCT grids with 480 km and 240 km resolutions is provided in Figure 5. In constructing the MC and MFMC estimators, we use the three globally refined SCVT meshes of the whole sphere given by For the output of interest, we choose where N 1 ℓ denotes the number of cell edges for the finest resolution case of 120km and u e,ℓ (z) denotes the value of the normal component of velocity u e at the ℓ-th cell edge for any choice of the random vector z = (z 1 , z 2 ) ∈ [1km , 3km] × [15m/s , 25m/s]. We then have that the quantity of interest is given by 325

MC and MFMC estimators
Similar to before, the goal is now to construct and compare, for the same computational budget, MC and MFMC estimators of the QoI defined in (16). The MC estimator Q M C 1,test5 of the QoI Q 1,test5 = E F 1 test5 is given by (2). The MFMC estimator Q M F M C 1,test5 of Q 1,test5 = E F 1 test5 makes use not only of the MC estimator Q M C 1,test5 for the finest 120km grid, but also of the MC estimators Q M C 2,test5 ≈ E F 1 test5 and Q M C 3,test5 ≈ E F 1 test5 corresponding the coarser 240km and 480km grids, respectively.

330
In this case, 4500 uniform i.i.d. realizations of z are drawn for pre-computation, of which all models share 1200 and models the two lower-fidelity surrogates share the entire 4500. Using 100 samples of the random variable z, the costs and correlation coefficients of these models are respectively approximated by where the costs C 1 , C 2 , and C 3 denote the computation time (in wall-clock seconds) necessary to advance the relevant dis-335 cretized RSWE system one time step. As was the case for the SOMA experiment, these cost-correlation pairs are ideal for MFMC estimation, i.e., the surrogates OoIs F 2 test5 (z) and F 3 test5 (z) are very well correlated with F 1 test5 (z) and are much less costly to obtain compared to F 1 test5 (z). Note the 100 samples of z used to determine the data in (14) can be reused when determining the MC and MFMC estimators. From this point on, all other details about the construction and use of the MC and MFMC estimators are the same as for the SOMA test case of Section 3.1.

340
The results provided in Table 2 and Figure 6 show that MFMC estimation produces a much more precise estimate compared to MC estimation. As was true for the SOMA test case, especially telling is the bottom plot in Figure 6 from which it is obvious that, for the same budget, MFMC estimation results in an order of magnitude greater accuracy compared to that MC estimation, or looking at it another way, for the same relative MSE, MFMC estimation requires a much smaller budget compared to MC estimation.

345
4 First-order ice sheet model The next experiment we consider illustrates the effectiveness of MFMC estimation on a QoI important for the realistic modeling of ice sheets such as those found near, e.g., Greenland, Antarctica, and various glaciers.  model. Here, we provide a short review of that model; detailed descriptions are given in, e.g., Blatter (1995); Pattyn (2003) and also in Perego et. al. (2012);Tezaur et. al. (2015).
Let Ω denote the three-dimensional domain occupied by the ice sheet having boundary Γ = Γ s ∪ Γ b ∪ Γ ℓ given by   Also, let u 1 and u 2 denote the x 1 and x 2 components of the velocity vector u = (u 1 u 2 u 3 ) ⊺ . Then, the first-order model equations for ice sheets is given by the partial differential equations where µ denotes the viscosity coefficient, g denotes the gravitational acceleration, and ρ denotes the density. In (18), the strain-rate tensor (ϵ 1 , ϵ 2 ) is given by and the nonlinear viscosity coefficient µ is given by the Glen flow law µ = 1 2 A − 1 n ϵ 1 n −1 e with n = 3 being the usual choice. The effective strain rate ϵ e is given by and A is often chosen to obey the Arrhenius relation where T denotes the absolute temperature measured in degrees Kelvin, R denotes the universal gas constant, Q denotes the activation energy for creep, and a is an empirical flow constant often used as a tuning parameter.
Once the horizontal components u 1 and u 2 of the velocity are determined, the vertical velocity component w is determined by enforcing incompressibility, i.e., we have that 375 Because the right-hand side is known, this is an ordinary differential equation for w.
Of paramount interest in the modeling of ice sheets is the monitoring of the temporal evolution of the ice sheet domain Ω. However, one notices that there are no time derivatives of u 1 and u 2 appearing in (18), i.e., that system is a static one.
The reasoning behind this curiosity is twofold. Firstly, we have that the system (18) is coupled to an equation for the temporal evolution of the temperature T within the ice sheet, and also coupled to an equation for the temporal evolution of the top surface 380 of the ice-sheet which engenders changes in the ice-sheet domain Ω. Secondly, the time scale of changes in the temperature is much shorter (e.g., hourly) when compared to the time scale (e.g., at least many days) of changes in the ice sheet domain Ω.
Thus, in a computational model of ice-sheet dynamics, determination of the domain Ω is coupled to that of the velocity u. Precisely, the temperature and top surface are first advanced from a given domain/velocity pair (Ω, u) over several time steps, producing an evolution that prescribes a new domain/temperature pair (Ω, T ). Following this, the pair (Ω, T ) are used 385 to generate a new velocity u by solving (18) and (20). While both stages of this process are important, the present experiment focuses on the computation of u from Ω and T .

MC and MFMC estimation
The specific application setting we consider is based on Experiment C of the benchmark examples in Leng et. al. (2012);Pattyn et. al. (2008). Here, the ice domain is a rectangular parallelepiped having a square base with side length L and thickness H, 390 which lies on a slanted bed with slope parameter θ. The upper surface boundary is given as and the basal topography is given as The first-order ice-sheet model is discretized using the stabilized P1-P1 finite elements given in Zhang et. al. (2011). To 400 solve the resulting nonlinear system of discrete equations, 15 Picard iteration steps are carried out after which a switch is made to a Newton iteration, up to a maximum of a total of 40 iterations. However, if the residual error decrease is less than 75% relative to the residual error of the previous step, the nonlinear iteration is switched back to a Picard iteration. An example illustration of the discrete solution of this ice-sheet model for specific values of the parameters θ and β is provided in Figure 8 using the highest resolution grid.

405
Because the cracking and melting of ice sheets is an important indicator of climate change (Clark et. al. (1999);Hanna et. al. (2013)), we consider the the output of interest where N 1 = 307, 461 is the number of vertices of the finest 120×120×20 grid, u n (z) denotes the value of the discrete velocity u at the nth vertex for any choice of the i.i.d. random vector z = (θ, β) ∈ [0.2, 0.8] × [800m, 1200m]. This OoI provides a 410 measurement of how energetic the ice sheet is depending on its slope and friction coefficient, and can be loosely related to how vigorously the ice will deform given a configuration specified by z. We then choose the quantity of interest given by Two surrogates for the ice-sheet model are defined respectively using the coarser 60 × 60 × 10 grid with N 2 = 40, 931 and the even coarser 30 × 30 × 5 grid with N 3 = 5, 766. Then, from (3), we have the corresponding three Monte Carlo estimators {Q M C 1,ice , Q M C 2,ice , Q M C 3,ice } and, from (4), we have the MFMC estimator Q M F M C ice which makes use of these three MC estimators.
At this point, we proceed as was done in Section 3.2. For example, we now have that the approximate costs (measured in wall-clock seconds and averaged over 100 random samples) for computing F 1 ice (z), F 2 ice (z), and F 3 ice (z), as well as the MFMC and MC estimation over 250 "independent" runs using this common data set are provided in Table 3 and Figure 9. Here it is especially apparent that the budget-preserving modifications to MFMC that led to the algorithm in Box B were necessary, as B ≪ C · R and only one high-fidelity run is ever selected by the algorithm.
Despite this, it is clear from Table 3 and the plots in Figure 9 that, at these small budgets, MFMC estimation produces an estimator which is much more accurate and precise than does MC estimation. This is not surprising because MFMC estimation 430 has the flexibility to load most of its budget onto the cheaper lower-fidelity surrogate models, evaluating them many times in order to bring down the overall variance of their estimates. Again, this provides empirical validation for the use of MFMC estimation over MC estimation when estimating model statistics even in practical cases where data availability is low.

Concluding Remarks
This paper serves to introduce multifidelity Monte Carlo estimation as an alternative to standard Monte Carlo estimation for 435 quantifying uncertainties in the outputs of climate system models, albeit in very simplified settings. Specifically, we consider benchmark problems for the single-layer shallow water equations relevant to ocean and atmosphere dynamics and we also consider a benchmark problem for the first-order model of ice sheet dynamics. The computational results presented here are promising in that they amply demonstrate the superiority of MFMC estimation when compared to MC estimation on these examples. Furthermore, the use of MFMC as an estimation method will surely be even more efficacious when quantifying 440 uncertainties in more realistic climate modeling settings for which the simulation costs are prohibitively large, e.g., for longtime climate simulations. Thus, our next goal is to apply MFMC estimation to more useful models of climate dynamics (such as the primitive equations for ocean and atmosphere and the Stokes model for ice sheets) that are also coupled to the dynamics of other climate system components and also coupled to passive and active tracer equations.
Code and data availability. The climate simulation data used in this work along with Python code for reproducing the relevant experiments