Advancing the Universality of Quadrature Methods to Any Underlying Process for Option Pricing

Exceptional accuracy and speed for option pricing are available via quadrature (Andricopoulos, Widdicks, Duck, and Newton, 2003), extending into multiple dimensions with complex path-dependency and early exercise (Andricopoulos, Widdicks, Newton, and Duck, 2007). However, the exposition is incomplete, leaving many modelling processes outside the Black-Scholes-Merton framework unattainable. We show how to remove the remaining major block to universal application. Although this had appeared highly problematic, the solution turns out to be conceptually simple and implementation is straightforward (we provide code on the Journal of Financial Economics website). Crucially, the method retains its speed and flexibility across complex combinations of option features but is now applicable across other underlying processes.


Introduction
Numerical techniques are widely required in derivatives pricing, since it is often the case that no analytic equation has been found for the valuation of a particular class of option. Ideally, in place of numerical methods, we would eventually have a suite of analytic solutions to cover all derivatives pricing situations or, failing that, analytic approximations of sufficient accuracy and utility for all practical cases. For example, the work of Kristensen and Mele (2011) is highly encouraging, yet we remain a long way from generality along this route. Beyond the solutions of Black and Scholes (1973) and Merton (1973) and a limited set of other cases (generally those with no early exercise), numerical techniques are frequently required. The available numerical techniques are classified as trees (Cox, Ross, and Rubinstein, 1979), solution of partial differential equations usually by finite difference methods starting with the most basic explicit method (Brennan and Schwartz, 1977), Monte Carlo simulation (Boyle, 1977) and quadrature in the form of the QUAD technique (Andricopoulos, Widdicks, Duck, and Newton, 2003).
Each of these has been the subject of modification and refinement, especially in relation to handling early exercise with Monte Carlo (Longstaff and Schwartz, 2001) and pathdependent features with the other techniques. Andricopoulos, Widdicks, Newton, and Duck (2007) further developed QUAD into a flexible, robust option pricing tool of wide applicability, covering multiple dimensions, early exercise and heavy pathdependence in complex combinations of exercise features. QUAD is usually overwhelmingly fast, making it especially useful in those cases in which computation with other methods is inconveniently slow. However, it has largely been limited to the Black-Scholes-Merton framework. Overcoming this limitation is the subject of our paper, which completes the exposition of the method.
Just as the mathematics of trees, finite difference and Monte Carlo approaches were all known and used in the natural sciences and engineering long before their introduction into finance, basic quadrature goes back centuries. In essence, it is the calculation of an area under a graph via an approximation, splitting the area into a series of shapes, such as rectangles, and summing their individual areas. Taking smaller shapes produces more accurate results, converging on the correct one. Well-known methods for doing this are the Trapezium Rule, Simpson's Method and Gaussian quadrature, and there are others. Each has differing properties and is more or less easy to program, but of particular interest is the rate of convergence to a correct solution as the number of calculations is increased in progressively finer approximations.
A key concept in the financial application of quadrature, sometimes not appreciated, is that the mathematical quadrature component is merely a computational engine to be chosen appropriately to fit into the wider calculations of the particular options problem (Andricopoulos, Widdicks, Duck, and Newton, 2003;Andricopoulos, Widdicks, Newton, and Duck, 2007). Thus, even the very simple Trapezium Rule can be adequate when elements in the wider calculations are less refined. Similarly, Gaussian quadrature, though in itself a very fast scheme, only provides useful extra speed over what may be the best practical engine, Simpson's Method, where unusually heavy calculational demands are made on the quadrature component versus the rest of the computational scheme. We shall return to the engine analogy later, when we show how previously intractable problems in applying quadrature can be circumvented by including a second type of numerical "engine".
The foundation work was presented in the Black-Scholes-Merton framework but (as explained in Section 2.2) the technique applies whenever the conditional probability density function is known. This restricts the immediate use of the technique to the Black-Scholes-Merton setup, to Merton's jump-diffusion model (Merton, 1976) and to certain interest rate models such as those of Vasicek (1977) and Cox, Ingersoll, and Ross (1985). Extension to Merton's process is straightforward. The interest rate models are more subtle, though Heap (2008) has successfully extended the coverage to some (but not all) interest rate derivatives with mean-reverting underlying processes.
A notable advance was made by O'Sullivan (2005), who used the observation that many useful processes without a well-known density function do, nonetheless, have a well understood characteristic function. The density function, as the inverse Fourier transform (FFT) of the characteristic function, can be computed using fast Fourier transform and the output may then be inserted in the QUAD scheme to price derivatives. We refer to this method as FFT-QUAD. O'Sullivan's method applies in particular to exponential Levy processes. This made FFT-QUAD an important advance but it does suffer several drawbacks. First, it requires two integrations even for a derivative on a single underlying process. This brings the complexity of the algorithm to at least OðN 2 Þ, where N is the number of grid points used in the numerical integrations; by comparison, the original QUAD has a much better complexity of just O(N) for vanilla options. Second, it does not cover every option type; for example, the single-variable FFT-QUAD cannot be used to price heavily path-dependent options in stochastic volatility frameworks, since it does not keep track of the evolution of the volatility process in moving from one observation point to the next.
O'Sullivan's FFT-QUAD was improved considerably by the CONV technique of Lord, Fang, Bervoets, and Oosterlee (2007). We refer to this method as CONV-QUAD (Staunton, 2007). This excellent method uses the observation that the fundamental pricing integral may usually be regarded as the convolution (strictly speaking, the cross-correlation) of the payoff and the density function. The beauty of this insight is that the two integrals of FFT-QUAD may then be replaced by two fast Fourier transforms. This brings the complexity of the algorithm down to OðNlogðNÞÞ and, for example, for Bermudan options (on M observation points), the complexity remains at OðMN logðNÞÞ, which beats even QUAD's OðMN 2 Þ. The CONV-QUAD method applies to exponential Levy processes and, hence, in particular to the Black-Scholes-Merton model, thereby improving on the speed and accuracy of the plain QUAD technique of Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007). Due to its nearly linear speed, it clearly replaces plain QUAD as the fastest method for a great many cases.
Useful as these developments were, the road to full universality for underlying processes remains blocked. The CONV method cannot be applied to, for example, the CEV or the Heston processes with early exercise and, while a single-variable characteristic function for the latter has been used in O'Sullivan (2005) and Fang and Oosterlee (2008) to price European options, a universal QUAD-style treatment of these processes is still lacking.
In this paper we return to the methods of Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) and provide option pricing techniques for the missing underlying processes. At the core of this extension is the use of closed-form approximations for the appropriate single-or two-variable transition density functions. By using these approximations we can price complex combinations of option features precisely as if we were working in the Black-Scholes-Merton framework. Thus, we advance the range of the earlier papers without losing their generality; the universality promised in the title of the first paper (Andricopoulos, Widdicks, Duck, and Newton, 2003) is finally arrived at.

Basics
Descriptions of the QUAD method can be found in Andricopoulos, Widdicks, Duck, and Newton (2003), Andricopoulos, Widdicks, Newton, and Duck (2007) and Chen (2013). We also provide a detailed appendix on the Journal of Financial Economics website (http://jfe.rochester. edu).

QUAD in the Black-Scholes-Merton framework
Start with the well-known Black-Scholes-Merton partial differential equation for an option with an underlying Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i asset following geometric Brownian motion: where V is the price of the derivative product and is a function of S, the value of the underlying asset, and time, t. The risk-free interest rate is r, the volatility of the underlying asset is σ, and continuous dividend yield is D c . In the earlier versions of QUAD, we found it convenient to take the log transform of the underlying asset (the method works equally well using the actual asset price). Suppose y is the corresponding value of the transform of the underlying at time t þ Δt, where Δt is a time step: and It is important to note that Δt is not restricted to small time periods; for example, were QUAD to be applied to a plain European call option (no need; we have the analytic solution!) then the complete time to expiry would be taken in a single time step, Δt. At expiry, the final condition (payoff) becomes maxðe y À X; 0Þ, where X is the strike (exercise price). The solution for the value of the option at time t on an underlying asset S is then and B x; y ð Þ¼ exp Àðx À yÞ 2 where, in turn, For a plain vanilla call option, for example, the integrand becomes This integral is key. Next, any of the many quadrature methods can be employed as a valuation engine for what is a European option with known payoff. The integration covers a single time step, Δt. The (considerable) advantage comes from treating more complex and interesting options problems as equivalent to series of European options.
Regions where there is no boundary condition to deal with can each be jumped in a single step. Although the range of integration is infinite, this is easily handled by truncation of the range, provided the integrand outside the truncated range is suitably small. Highly accurate calculations are then possible to any level required. American exercise is readily handled by extrapolation from the Bermudan case. If the QUAD grid is constructed to coincide with the discontinuities such as a strike price, barrier or exercise boundary, then convergence is perfectly smooth and suitable for improvement through extrapolation via a simple Richardson-type procedure, because only distribution error remains. It is so smooth that extrapolated results can often be further extrapolated themselves. This extrapolation is applicable in all cases, including those with early exercise (but see Andricopoulos, Widdicks, Newton, and Duck, 2007, for adaptive quadrature). Assuming the discontinuities are correctly located, for Simpson's rule the extrapolated results converge as ðΔyÞ 8 , which is 1=N 8 . For comparison, a trinomial tree converges merely with 1/N or in some cases with only 1= ffiffiffiffi N p , and finite difference methods at best converge at the rate of ðΔS 2 ; Δt 2 Þ, where ΔS and Δt are the step sizes in the S and t directions respectively. Impressive as the outcomes are, they are limited in scope for the underlying processes. In order to address this, we will need to reformulate the QUAD method.

QUAD and transition densities
In Section 2.1, we took as our starting point the Black-Scholes-Merton partial differential equation and formulated the price of the option in a particular fashion in order to pave the way for numerical integration by quadrature. Mathematically, this was a use of the corresponding Green's function. As we move to settings where Green's functions are not available, it will become necessary to reformulate this in terms of probability density functions. For this we simply model the asset price directly.
In this vein, the starting point is where V remains the value of the option (this time as a function of the asset price), and f τ ðyjxÞ is the risk-neutral conditional density function of the asset price over time- To see the equivalence with the previous approach, we note that in the Black-Scholes-Merton setup the density reads as To get from Eqs. (9) to (4), we merely need to carry out a change of variables from y to log y. In the original equations, factors A(x) and Bðx; yÞ are isolated for computational purposes onlycollecting to A(x) the factors that are independent of y speeds up the computations, but conceptually Eq. (9) is the key.
In particular, we see that we can, in principle, use quadrature techniques as long as we have some method of computing the values f τ ðyjxÞ. One particularly natural way to achieve this is through characteristic functions. This is the route taken in O' Sullivan (2005) and Lord, Fang, Bervoets, and Oosterlee, (2007), and indeed progress can be made with this approach: it applies to a great many processes and the use of fast Fourier transform (see Carr and Madan, 1999) allows speedy implementation. As Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i explained in the Introduction, the CONV-QUAD method of Lord, Fang, Bervoets, and Oosterlee (2007) is particularly fast. This method hinges on one absolutely crucial condition: the transition density f τ ðyjxÞ for the log-asset price must depend only on y À x. This condition is satisfied (from their very definition) by exponential Levy processes, which makes CONV-QUAD the method of choice for this large class of processes. However, this condition is not satisfied in many interesting cases, including several local and stochastic volatility models, and we are left with a roadblock to the generality of the methods under the umbrella term QUAD.
We resolve this final difficulty to completion of the method by using approximations of the transition density. This opens up the remaining previously unattainable areas for pricing. Just as a fundamental feature of QUAD, in earlier papers, is interchangeability of quadrature methods as calculational engines, here, in the same spirit, any sufficiently accurate approximation to a transition density function can be employed. In some cases, even where a density function is available, it transpires that the approximation route is superior. To illustrate this, we use two sources: Aït-Sahalia (2008) and Henry-Labordère (2009), though we favour the former as the best all-purpose engine. In the process of formulating QUAD into this more generally applicable version, all the capabilities to handle complex combinations of features in Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) are retained.
In Section 3, we begin with a discussion of local volatility models, using this as a way of introducing approximations of the density function and covering several practically useful models. We then move to spot interest rate models, for which approximation techniques can be used to improve the techniques of Heap (2008) alluded to in the Introduction. Finally, we explain how approximations can be used to price options under stochastic volatility models such as Heston's and SABR (both with early exercise) that could not previously be priced via QUAD-style techniques. We also choose these models to illustrate the universality now achieved for QUAD methods because they are stochastically two dimensional and, thus, may be regarded as somewhat more difficult to implement than single dimensional processes.

Local volatility models
Local volatility models have been a popular route for practitioners to fit the day's volatility smile, from plain vanilla options, prior to pricing more complex options. Discrete and continuous derivations are due to Derman and Kani (1994) and Dupire (1994). Although calibrated to a current smile, these models generally fail to reproduce future volatility smiles; in other words, they are not well suited to modelling the dynamics of smiles (for early empirical work on the dynamics of implied volatility surfaces, see Dumas, Fleming, and Whaley, 1998). Therefore, in Section 5.2 we show how to handle fully stochastic volatility models while here we focus on (time-homogeneous) local volatility models, of the form where σðS t Þ is a deterministic function (the local volatility function) of the asset price S t , generalizing the constant volatility under Black-Scholes-Merton. For this family of models, we use an approximation of the appropriate density function. There are two reasons for this: first, the density function need not, in general, be known in a closed or even semi-closed form and, second, the approximation is sometimes easier to work with than the actual density function, even when this is known.
We next describe a general method of approximating the density function due to Aït-Sahalia. We then consider specific examples: the constant elasticity of variance (CEV) process of Cox (1996) and quadratic local volatility. Once the relevant approximations have been found, using them is as routine as using the density function for the Normal Distribution in the well-known Black-Scholes equations for calculating plain vanilla European option values.

Aït-Sahalia's algorithm
We now describe briefly Aït-Sahalia's method, which is explained with numerous examples worked in his paper (Aït-Sahalia, 1999). For theoretical justification of the computations, see his later paper (Aït-Sahalia, 2002). The algorithm begins from normalization of the process by modifying it to one with unit diffusion. From Ito's Lemma, and where an alternate σ is σðS t Þ. The rest of the algorithm involves working out an approximation for the transition density p Y ðt; yjy 0 Þ of Y t . This gives us the approximation we are looking for, as To find an approximation for p Y ðt; yjy 0 Þ, we first define The significance of this function is that Y t satisfies the where ϕ is the density of the normal distribution, ϕðzÞ ¼ e À z 2 =2 = ffiffiffiffiffiffi 2π p , and where the coefficients c l ðyjy 0 Þ are defined recursively as follows: set c 0 ðyjy 0 Þ ¼ 1 and define, for j Z1, Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i While the expressions may look somewhat complicated, the algorithm is very easy to implement. All the steps can be carried out on a symbolic algebra package such as Mathematica or Matlab/MuPAD. Our computations were carried out using a ten to fifteen line script in which the user can specify the order of the expansion (number L in the above equations), and the program carries out the recursion in a matter of seconds. The resulting approximation can then be ported to whichever programming language one wishes to use for more numerically intensive algorithms.
An alternate method is to work out γ, γ À 1 and μ Y with pen and paper, leaving the recursion to the computer. For QUAD applications, γðSÞ and γðS 0 Þ are then calculated within the routine computing the density function and plugged into the approximation for p Y . The advantage of this approach is that the density calculation becomes slightly faster because γðSÞ is precomputed and the approximation for p Y tends to be slightly easier to work with.
Finally, we note that with t¼0.1, first-and second-order expansions are typically sufficient for achieving root mean square errors of order 10 À 5 and 10 À 8 , respectively. Similar accuracy was reported by Aït-Sahalia (1999).

CEV process
We apply the procedure of Section 3.1 first to Cox's CEV process (Cox, 1996). We begin here for two reasons. First, for this model the density function exists in a (semi-) closed form, so that it is easy to compare the approximation p a t ðf jf 0 Þ with the actual value. Second, we demonstrate the highly convenient feature that use of approximations to the density function, far from being inferior, can in some cases be a superior route.

The density function and related issues
We use the following formulation for the CEV model: Moreover, we require that the process is absorbed by zero: The evolution of this process is fairly well understood. In particular, the transition density is known in a semiclosed form. The prefix "semi-" is explained by the presence of Bessel functions in the expression for the density function. In fact, if r¼0, we have (for S 4 0) where ν ¼ 1=2ð1 À βÞ and I ν denotes the modified Bessel function of the first kind: A process S t with a positive drift r 40 can be obtained from one with zero drift; call it X t , as follows: The density of S t is then related to p 0 t as follows: Whether r ¼0 or not, we also have a positive (though frequently negligible) probability of default Finally, by our assumption that the process absorbs at zero, we have p t ðSjS 0 Þ ¼ 0 for S o0.
In principle, we can now price exotic and path-dependent options using the algorithms of Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) by replacing the Green's function by Eq. (18). The only difference is that the CEV process admits a non-zero probability of S t hitting zero and this must in some way be incorporated into the QUAD scheme. We will return to this point shortly.
The CONV-QUAD method of Lord, Fang, Bervoets, and Oosterleen (2007) cannot be used in the CEV context. First of all, CONV requires that we work with log-asset price s t ¼ log S t , and it is unclear how this should be done when S t has a non-negative (and sometimes significant) chance of hitting zero. More seriously, the CEV process does not satisfy the CONV condition that the density function can be written as Notice that this would imply in particular that s t þ Δt À s t should be independent of s t , and this fails for the CEV process. To see why, we use Ito's Lemma to find that We, therefore, have so that s t þ Δt À s t is independent of the initial state s t if and only if β¼1, i.e. if we have geometric Brownian motion.

Implementation
We can now use the QUAD methods of Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) to price exotic options: we merely have to replace the Black-Scholes density function with the CEV density function. There are, however, two points of note.
First, with certain parameters the CEV has a significant probability of hitting S T ¼0 over a time step from t to T ¼ t þ τ, and this probability cannot be read directly from the density function. However, we have If this p 0 is non-negligible, we should replace Eq. (9) by In practice, it could make sense to compute whether S t ¼0 is significant over the entire lifespan of the option. If not, we can ignore this point and use precisely the same QUAD routines as with geometric Brownian motion; however, modifying the algorithm is a trivial matter. This absorbing barrier at zero is well known. Rebonato (2004), for example, provides a financial commentary and a review of technical solutions for alternate numerical methods. The second and more serious point is that the presence of the Bessel function in the density function equation (18) is troublesome. Function I ν ðzÞ eventually explodes and while the growth is, in theory, absorbed by the exponential term preceding it in Eq. (18), this can lead to complications as the implementation tries to evaluate something of the form 0 Á 1. This is particularly prone to occur when pricing options involving very small time steps, as the t in the denominator then pushes the argument of I ν toward hazardously large values.

Quadratic local volatility model
Compare an alternate local volatility form (see also Dumas, Fleming, and Whaley, 1998) given by where σ 4 0 and b are constants. We can find an approximation for the density of this process in the same way as before and obtain γ x γ À 1 y The first-order approximation for p X ðt; xjx 0 Þ is theñ where in turn c 0 ðyjy 0 Þ ¼ 1 and Here, we also opt not to reproduce the much more complicated c 2 ðyjy 0 Þ term.

Numerical results
QUAD, in its original and modified forms, is an exceptionally fast method. Our purpose next is to demonstrate that the final modification presented in this paper does not degrade performance to any significant degree.
Computation times depend, of course, on both computer and software. The computers used in this work changed between Andricopoulos, Widdicks, Duck, and Newton (2003) and parts of Andricopoulos, Widdicks, Newton, and Duck (2007), from 550 MHz to 2.4 GHz Pentium-based university cluster computers using an optimized Fortran compiler. For the present paper, work was carried out on a university high performance computing service, using one node of eight cores at 3.0 GHz. Parallel codes were compiled using an Intel C þ þ compiler and OpenMPI. As with the earlier papers, we seek high intrinsic speed and fast convergence for test cases, such that more complex cases requiring intensive calculations are completed in reasonable time periods. Therefore, our interest is in performances relative to the Black-Scholes-Merton QUAD of Andricopoulos, Widdicks, Duck, and Newton (2003) and Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i Andricopoulos, Widdicks, Newton, and Duck (2007). We use parameters as follows: Each of the processes has S 0 ¼ 10 and r ¼0.05. The Black-Scholes-Merton process has σ¼0.2; the CEV process has σ¼0.6 and β¼0.5; and the quadratic process has b¼ 1 and σ¼0.02 (in the last two cases, σis chosen so as to make the initial instantaneous volatility σðS 0 Þ=S 0 approximately equal to the Black-Scholes volatility 0.2).
Our proxy for accuracy of these computations, denoted by K, is related to the step size as follows: where T denotes the maturity of the option and M is the number of observation points. The thinking behind this somewhat bizarre looking equation is as follows. First, we want our algorithms to have a constant step size in all grids; second, as in Andricopoulos, Widdicks, Duck, and Newton (2003), we want an integer proxy that is inversely related to the step size; and, finally, we want the achieved accuracy to be about the same across the range of options we are interested in.
We initially took Δt to be a constant times T=KM, but this produced disproportionately dense grids for frequently observed options: the prices of these options were often an order of magnitude more accurate than those of less frequently observed options (and, of course, the computational times were longer). After experimenting with Δt equal to a constant times T= ffiffiffiffi ffi M p K, we finally settled for the equation above, the constant 7.5 being to some extent arbitrary, decided by trial and error. However, we include a proper mathematical treatment of K in the online appendix on the QUAD method (Journal of Financial Economics website at http://jfe.rochester.edu).
We record for each of these processes the root mean square error and the total time taken for computation of a package of eight options. For Bermudans, this package consists of options with strikes 10.0 and 10.5, maturities 0.5 and 1.0, giving four combinations; then each with numbers of observations 6 and 30, totalling eight combinations. For barrier options, we consider strikes 10.0 and 10.5, barriers 9.5 and 9.9; the numbers of observations 6 and 30 and the maturity constant at 0.5, totalling eight combinations. In both cases, the error is computed with respect to the QUAD price obtained using K ¼512. In the case of the CEV model, the reference is computed using the second-order approximation (see below for further comments on the choice of order). We record the root mean square errors of the prices obtained by Richardson extrapolation.
Computational time is for the entire bundle of options. The computational time for a single 6-step or 30-step option is typically about 1/60th or just under 1/4th of the total computational time. Also, 0.01% relative error for a 6-step option can be achieved in about one second for the Black-Scholes-Merton (BSM) process and in less than five seconds for the other processes on a common laptop equipped with Intel Core 2 Duo P7450 at 2.13 GHz.
Tables 1 and 2 illustrate for Bermudan options and down-and-out barrier options. The first comment we make on these results is that the QUAD prices for CEV and quadratic processes converge to the correct price at the same rate as those for the Black-Scholes-Merton process. Secondly, the pricing of each option under the CEV or the quadratic process takes three to five times as long as for the Black-Scholes-Merton process. This reflects the fact that our approximations are slightly more complicated than the usual Black-Scholes-Merton density and, therefore, take longer to compute. Given the enormous computational speed advantage of QUAD, this is a perfectly acceptable result.
The first conclusion was merely a consistency check: equal rate of convergence is what we expected. The second conclusion means that many exotic and path-dependent options can now be priced in local volatility situations essentially as quickly as if we were working in the Black-Scholes-Merton framework. In particular, there is no need to resort to Monte Carlo or solution of partial differential equations via finite difference methods.
Finally, we note the remarkable speed at which the extrapolated prices converge toward the correct value. The Bermudan prices are immediately within about 10 À 8 of the correct price. Bearing in mind that the first-order approximations are themselves accurate to order 10 À 5 , this means that these prices, obtained in less than a second for the Black-Scholes model and in about 1.5 and 3 seconds for CEV and quadratic models, respectively, are as accurate as the method allows. If even greater accuracy is required, second-order approximations must be used and then, including extrapolations, the computations can still be carried out in a matter of seconds.
In Table 3, we illustrate the computational cost of a complicated density function by "pricing" an option using a density function that is set to some constant. The option "price" is, of course, meaningless but the computational time indicates how much time is needed merely for setting up the grids and computing the numerical integrals. Even in Black-Scholes QUAD, 89% of the computational effort is used in the computation of grids and so on but for other processes the difference is greater. In practical implementations, therefore, it pays to optimize the computation of the density function!

Henry-Labordère's algorithm
Just as there is more than one engine of approximation via quadrature, there is more than one for approximating the density function. To demonstrate this, we note that an alternate method of approximating the density function has been found by Henry-Labordère (2009). His method applies nicely to processes that can be transformed into a process X t satisfying a stochastic differential equation of the form and so, in particular, the CEV model is covered. Moreover, it is easy to modify QUAD routines so that they use a forward rate as the random variable and, in this way, Henry-Labordère's method becomes as widely applicable as Aït-Sahalia's.
Here, we merely note that in the second-order his approximation is very similar in speed and accuracy to Aït-Sahalia's approximation in the first-order. In contrast with Aït-Sahalia's method, it is not easy to extend Henry-Labordère's expansions to higher order, since this requires knowledge of higher order heat kernel coefficients which cannot be deduced easily from the lower order terms. These expansions are, however, fast and accurate to evaluate (once established) so that they provide a viable alternative to the methods we have described so far.

Interest rate models
At this point, it is useful to note problems previously encountered with interest rate models used either with bonds or as a second underlying in more general derivatives pricing. Heap (2008) had some success in applying QUAD to these processes, especially Vasicek (1977), but for the CIR model (Cox, Ingersoll, and Ross, 1985), limitations remain. We show briefly how to circumvent these.
The CIR model is given by This process has a known transition density and QUAD type option pricing can be carried with certain restrictions, as before. To see how these restrictions arise, recall that the density function for this process is where γ ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi κ 2 þ 2σ 2 p , α ¼ 2γ=σ 2 ð1 À e γt Þ, q ¼ 2κθ=σ 2 À 1 and I q denotes the modified Bessel function of the first kind. The difficulties with this expression stem from the presence of the Bessel functions. The reasons are precisely as we noted in the context of the CEV model. As with the CEV model, the actual breakdown point depends on the parameters but, as observed by Heap (2008), financially reasonable parameters can be chosen for which the density cannot be evaluated for t o0:05. This makes the pricing of a one-year option with more than 20 observation points impossible.  As with the CEV process, Aït-Sahalia's approximations can be used to rescue the situation. Again, the approximation is the best compromise between speed and accuracy. In fact, the first-order approximation becomes more and more accurate when t approaches zero, so the exact formula and the approximation may be seen to complement one another. QUAD style techniques can then be implemented easily, as with Vasicek's model.

Stochastic volatility models
We now turn to stochastic volatility models. In these, the asset price, S t , follows a geometric Brownian motion within which the asset's variance, v t , itself follows a stochastic process.
Our aim is to show that Bermudan (leading to American, by extrapolation) and path-dependent options can now be priced via QUAD when the asset price follows such a process. This is necessarily slightly more involved than what has gone before, since we need to keep track of the variance process as well. We do this by considering the joint density function of the pair ðS t ; v t Þ. To see why this is necessary, consider a Bermudan put option with only two observation points T 1 and T 2 . To price this in a QUAD-like fashion, we start by computing the payoffs of the option for some terminal grid of S T 2 values. This presents no problems. We then move to time T 1 . At each grid point at time T 1 we can either exercise (in which case we can readily compute the pay-off) or stick with the option, in which case we have to price a European put option. The question then is what the volatility should be. The variance is stochastic, and the price of the put option is different for different values of v T 1 , so clearly we need to use a twodimensional grid and compute the value of the put option for each point ðS T 1 ; v T 1 Þ. After choosing the maximum between the exercise and non-exercise prices, we obtain a twodimensional grid of values. Finally, we compute the price of the option at the initial time by summing over the twodimensional grid. Here, we must weight each of the values by the two-dimensional density function f T1 ðS T 1 ; v T 1 jS 0 ; v 0 Þ (also by other weights, depending our choice of quadrature scheme).

Heston's model
The Heston (1993) model has proven highly popular, doubtless due to its relative tractability with European style options, so we use it here to demonstrate how the final modification of QUAD presented in this paper can handle such models, including American and other more complex features.

Definition and maximum likelihood expansion
In Heston's model, the asset price follows geometric Brownian motion where the variance v t is assumed to follow another, meanreverting process, of the CIR type and the underlying Brownian motions are correlated, that is dW The interpretation of this model is well documented. We merely remark that the variance reverts toward level θ at rate κ.
The characteristic function of s t ¼ log S t is known in a closed form (see Heston, 1993). This means that vanilla options can be priced quickly using Fourier inversion and FFT. Moreover, even the joint characteristic function is known (see Zhylyevskyy, 2010Zhylyevskyy, , 2012. Pricing, therefore, becomes theoretically straightforward: we compute the joint density function from the characteristic function and use two-dimensional QUAD in the spirit of Andricopoulos, Widdicks, Newton, and Duck (2007). This part of the process, though, is quite slow: Zhylyevskyy (2010) reports computation times of about 44 seconds for the Fourier inversion on a computer equipped with Intel Core-2 2.83 GHz and 4 gigabytes of RAM.
It is natural to ask whether some CONV-style idea could be used for computational efficiency but this, too, is unlikely. In fact, it is easy to see that the Heston process, regarded as a two-dimensional process, does not satisfy the (two-dimensional version of) the condition in Eq. (23). To see why, simply consider two scenarios for where v t is at or far away from the mean-reversion level θ. In the first case, v t þ Δ is likely to remain near θ, so that the increment is concentrated near zero. In the second case, v t þ Δt is likely to be inverting closer to θ, leading to a nonzero increment.
Once again, a convenient approximation can be used with QUAD. The (logarithm of the) joint density function has been found by Aït-Sahalia and Kimmel (2007). The idea of the method is to seek an approximation of the functional form where the C j ðs; vjs 0 ; v 0 Þ are functions yet to be defined and Ω is the determinant of the diffusion matrix with respect to a pair of uncorrelated Brownian motions, i.e.
in the case of the Heston model. We view this essentially as a power series with respect to the time variable. The coefficient functions C j ðs; vjs 0 ; v 0 Þ, which are independent of t, are then expanded as Taylor series about ðs 0 ; v 0 Þ and the coefficients of these expansions are worked out recursively using Fokker-Planck equations and, in practice, a symbolic algebra program such as Mathematica. Finally, we obtain an approximation for the density function simply by exponentiating L t ðs; vjs 0 ; v 0 Þ. More details can be found in Aït-Sahalia and Kimmel (2007, pp. 419-421).
In the notation of the present paper, t is the time step in the QUAD procedure, Δt. As with local volatility models, the approximation is asymptotic in time. The approximations, moreover, are sensitive: we find that even with a second-order approximation (i.e. J¼ 2) we should not expect the approximation to be accurate for larger steps than 0.1 (see Section 5. find that the approximation becomes less accurate as v 0 approaches zero and that further care needs to be taken with low-likelihood events. As explained above, we are working with a Taylor expansion on (s,v) and so, if we venture too far away from ðs 0 ; v 0 Þ, the series will blow up. In practice, this problem can be avoided by setting the density to zero when s is 5 to 6 standard deviations away from s 0 (though the blow ups, if left to occur, are very easy to detect through absurd output values, such as 10 80 ).

Implementation
Once a suitable likelihood approximation has been ported to a preferred programming language, such as Fortran or C þ þ, implementation of QUAD schemes is straightforward. The multi-dimensional QUAD techniques of Andricopoulos, Widdicks, Newton, and Duck (2007) are used, with one of the assets replaced by v 0 .
The asymptotic nature of the density approximation brings with it some problems, the solution of which is perhaps best explained simply by pricing a European call option under Heston's model. In theory, this can be done in one step by evaluating the expectation As we have noted, we cannot expect our approximation of the density to be accurate beyond t¼0.1. To circumvent this problem, we evaluate the expectation in steps. Mathematically, this amounts to repeated use of the tower property where 0 rs r t while in our implementation this means inclusion of dummy layers, where we store to each grid point ðs 1 ; v 1 Þ the simple expectation worked out from the previous grid. The same principle applies to the pricing of more complicated options. In terms of implementation, the best solution is to store a constant indicating how large a time step is tolerated, Tol (e.g. 0.1). We then work backward from one observation point to the next as usual, except that if the time step t exceeds this constant then we divide the step into equally spaced steps and add a total of N À 1 dummy layers in between the observations. The need for the dummy layers, of course, adds to the computational cost. To gain an idea of how this works, consider European options with varying maturities NTol while keeping the step sizes constant in both s and v directions. The variance process is mean reverting, so we work with the same number of grid points in this direction for all the layers. As for the (log-)asset price, the ith has C ffiffiffiffiffiffiffiffiffiffiffi i=Tol p grid points for some constant C. In total, we end up with computations. This compares unfavourably with Oð ffiffiffiffi N p Þ that the plain Black-Scholes QUAD would achieve but is far from disastrous.

The SABR model
The SABR model (Hagan, Kumar, Lesniewski, and Woodward, 2002) has proven popular with practitioners and can be linked to the LIBOR market model (e.g. see Rebonato, McKay, and White, 2009). This model addresses the problem with local volatility models that, although they are constructed to fit the day's smile, they do not deal correctly with the dynamics of volatility surfaces.
The SABR model may be regarded as a stochastic volatility variant of the CEV process. More precisely, we require that where 0 rβ r 1 and the volatility satisfies where σ V Z 0 and the Brownian motions are correlated as before.
The SABR model does not appear to have a characteristic function in a closed form and so the approach in the present paper is again required. Thus, in the work of Henry-Labordère (2009), approximations for the density function are found and these can be used for QUAD type pricing as an alternative to Aït-Sahalia (2008). Formally, pricing becomes identical to the two-variable QUAD of Andricopoulos, Widdicks, Newton, and Duck (2007).

Numerical results
We are now in a position to calculate option prices with the same flexibility and variety of features as in Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) but with other underlying processes.
We illustrate via the Heston and SABR models (twodimensional processes). The method starts from an extremely fast base, with European options generally giving error terms less than order of 10 À 10 within seconds. Consequently, just as in the earlier papers, although the computational burden increases exponentially with the number of observation points, Bermudan and (by extrapolation) American options are still calculated with great speed and precision.
We calculate the root mean square errors of a bundle of eight options using second-order Aït-Sahalia approximation with the same setup as for local volatility models in Section 3.4, with σ 0 ¼ 0:2. Benchmarking is done using K¼ 128. For Heston, r¼0.05, κ¼2, θ¼0.3, σ¼0.6 and ρ¼ À0.75; SABR has r¼0.05, σ¼0.6, β¼0.5 and ρ¼ À0.75. We again use K as a proxy for accuracy. Sufficient accuracy is attainable with low values of K but we tabulate to higher values to show the slowdown at higher values (we also use high K to obtain benchmark prices).
Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i In Tables 4 and 5, we report errors and computation times for Bermudan options and for down-and-out barrier options. In Tables 6 and 7, we demonstrate convergence of Bermudan prices toward American prices for put and call options with strike 10.5 and 9.8 and maturity 0.5, calculated against number of observations, with number of QUAD steps set at K ¼52 (the benchmark is calculated by Bermudan put option price extrapolated with 63 and 64 observation points using K ¼128 QUAD steps). Extrapolation against number of observation points is extremely effective in this casean error term of order 10 À 5 can be achieved within 40 observation points (the Bermudan prices obtained with K ¼52 themselves contain error term of order 10 À 5 for the Heston model and 10 À 6 for the SABR model).

Conclusion
The numerical techniques of quadrature introduced previously are applicable whenever the conditional probability density function is known, restricting the immediate use of the method to the Black-Scholes-Merton world, Merton's jump-diffusion model and certain interest rate models. The method can be extended via Fourier transform for processes without a known density function but with known characteristic functions but it then cannot handle the full range of option features. Worse, the road to full universality for underlying processes is blocked in this direction.
The distinguishing quality of QUAD has been its exceptional speed combined with flexibility in handling any option feature or combinations of features, and it is important while adding a new set of capabilities not to lose those previously developed. The universality of the method, promised in the title of the first paper (Andricopoulos, Widdicks, Duck, and Newton, 2003), was partly delivered by Andricopoulos, Widdicks, Newton, and Duck (2007) through extension to options involving simultaneously complex path dependency, early exercise features and multiple underlyings but remained severely limited in the range of underlying processes that could be handled. With the removal of this limitation, the exposition of the method is now complete. QUAD implementation is still straightforward (we provide code in an online appendix) and remains applicable to cases with complex combinations of features.
Gaining the capability to handle previously intractable underlying processes, while maintaining the full range of application previously established, involves returning to    Please cite this article as: Chen, D., et al., Advancing the universality of quadrature methods to any underlying process for option pricing. Journal of Financial Economics (2014), http://dx.doi.org/10.1016/j.jfineco.2014.07.014i the methods of Andricopoulos, Widdicks, Duck, and Newton (2003) and Andricopoulos, Widdicks, Newton, and Duck (2007) and finding techniques for the missing processes. At the core of this extension is the use of closedform approximations for the appropriate single or twovariable transition density functions. Just as plain QUAD works with any one of several quadrature methods as calculational engines, any sufficiently accurate approximation to a transition density function could, in principle, be used and we illustrate the method via the approximations of Aït-Sahalia (2008) and Henry-Labordère (2009), the former proving the better all-purpose choice. In this ultimate version of QUAD it can be the case that, even when a density function is available, the approximation route is superior.
Appendix A. Heston probability density, convergence and extrapolation As initial variance v 0 is set progressively closer to zero, the estimate of the probability density function for the Heston model eventually blows up (the extreme tail of the density surface explodes toward infinity). This is a wellknown property and a practical, if inelegant, solution is simply to set a lower bound for variance, so that the part of the density that has blown up is cut off in the lower tail of the density function. The cutoff value depends on how other parameters of the Heston model are set but is easily deduced using either Cþ þ or Mathematica. Percentage errors of the order of 10 À 5 are found for v 0 ¼0.02, for example, and so density estimates are sufficiently accurate. The only requirement for extra care would be with many observation points combined with particularly low v 0 , to check compounded cutoff errors, but in practice this can be avoided.
Turning to convergence and extrapolation, the total error under stochastic volatility models consists of two parts: quadrature error and density estimation error. Quadrature error depends upon the scheme chosen for the valuation (Trapezium, Simpson, Gaussian, etc.), with the rate of convergence varying in some order of the quadrature step size. Density estimation error is produced by Aït-Sahalia's algorithm (or a substitute, such as Henry-Labordère's) and, although its accuracy increases as step size decreases, it does not have a uniform convergence rate and increasingly becomes the dominant source of error.
Initially, convergence is very fast but slows down as the error term becomes almost entirely dominated by estimation error of the density function. Here, estimation error is of second-order in Aït-Sahalia's algorithm. Fig. A1 shows quadrature step range 350-400 where quadrature error can be neglected compared with density estimation error. The density estimation error shows convergence, though that convergence is not smooth. In this case, extrapolation performs poorly. In other words, the error term can be extrapolated with bigger step sizes when quadrature error is still dominant but, as step size becomes smaller, quadrature error becomes insignificant and extrapolation no longer works. Fig. A2 illustrates.