Quantum advantage of Monte Carlo option pricing

Quantum computers have the potential to provide quadratic speedup for Monte Carlo methods currently used in various classical applications. In this work, we examine the advantage of quantum computers for financial option pricing with the Monte Carlo method. Systematic and statistical errors are handled in a joint framework, and a relationship to quantum gate error is established. New metrics are introduced for the assessment of quantum advantage based on sample count and optimized error handling. We implement and analyze a Fourier series based approach and demonstrate its benefit over the more traditional rescaling method in function approximation. Our numerical calculations reveal the unpredictable nature of systematic errors, making consistent quantum advantage difficult with current quantum hardware. Our results indicate that very low noise levels, a two-qubit gate error rate below 10−6, are necessary for the quantum method to outperform the classical one, but a low number of logical qubits (ca. 20) may be sufficient to see quantum advantage already.


Introduction
Over the last decades, many algorithms have been proposed for quantum computers that offer a speedup over the best-known classical counterpart [1]. One of them is the Monte Carlo method, for which quantum computers promise a quadratic speedup over the classical Monte Carlo method [2]. Although this speedup is less impressive than the speedup in some other quantum algorithms, the broad applicability of Monte Carlo in physics, mathematics, finance, and many other fields makes it of great interest [3][4][5][6][7][8][9]. Hereafter we will refer to Monte Carlo methods utilizing quantum computers to speed up the convergence as 'quantum-accelerated Monte Carlo' or QAMC methods.
At the core of the QAMC method lies the Quantum Amplitude Estimation (QAE) algorithm, which can estimate the amplitude of a quantum state with an error of , where N denotes the number of samples [10]. Over the last years, many different realizations were proposed for QAE with decreasing quantum circuitry footprint [11][12][13][14][15].
Monte Carlo methods are widely used in the financial modeling industry for derivative pricing [5]. One of the simplest examples of a derivative is the European call option which has well-known properties, and an analytic solution under the Black-Scholes model [16]. Pricing options on quantum computers with Monte Carlo was proposed and implemented previously [5,6].
Errors in the QAMC method were previously analyzed to some extent by Chakrabarti et al [17], but they only considered the number of logical qubits needed and did not model the noise in detail. In other papers [18][19][20] the impact of hardware noise on the statistical errors of the QAMC was analyzed with the assumption of perfect depolarizing noise, but systematic errors of the algorithm were not taken into account.
In this paper, we combine earlier approaches by considering all kinds of noise effects. We propose a novel general metric to quantify the advantage of a specific algorithm over another. In this metric, we connect systematic and statistical errors, so when optimizing for this metric, we can determine and set free parameters in the quantum algorithm to boost performance and create earlier quantum advantage.
In a recent paper, Herbert et al [21] came up with an improvement on the previous function approximation techniques using a Fourier series based approach. This Fourier method has been implemented below and was compared to the more traditional 'rescaling' approach. We can demonstrate that the Fourier method leads to smaller errors with clear advantages in option pricing algorithm design.

Pricing European call options
Our chosen example is the European call option pricing problem. A European call option is a financial contract that gives the holder the right (without obligation) to buy a certain asset at a specific future maturity time T at a predefined strike price K. Financial theory says that the option pricing problem is equivalent to an expected value calculation and, in the general case, can be approximated with Monte Carlo methods [22]. In the industrystandard Black-Scholes-Merton model [23,24], the distribution comes out as log-normal, so there is a closedform formula for the price, and this can be compared to any wider scale Monte Carlo implementation to test the efficiency of the latter.
The Black-Scholes log-normal probability distribution of the underlying stock price at time t takes the form: where ν is called the volatility of the underlying, r is the risk-free rate, S 0 is the current price. The payoff function of the option is: The option pricing task is to calculate the price of the option, which is just the expected value of the payoff at maturity discounted with the risk-free rate:

Quantum-accelerated Monte Carlo method
We will briefly discuss the main steps of the QAMC method to introduce the reader to our notation. For a more detailed exposition, see [6].
The QAMC algorithm consists of 3 parts: state preparation, function application, and amplitude estimation.

State preparation
The problem has to be discretized for the QAMC method, and we have to truncate the probability distribution to fit into a finite number of bins. The distribution is truncated to the range S w S S Var , ( )]and discretized into 2 n bins, where w is a free parameter, n is the number of qubits. This discretization is one source of the systematic errors we will keep track of in the sequel.
Then the discrete probability distribution has to be loaded into the amplitudes of the quantum states as visualized in figure 1. In our work, this was done in an exact way with a state preparation circuit for arbitrary states [25].
where i n ñ | denotes the n-qubit basis states. Note that in a typical classical Monte Carlo application in finance, the probability distribution is not known a priori, but a stochastic process is posited to evolve the price from time 0 to time T. Usually, this is the hard part of Monte Carlo pricing classically. Here, in our discussion of the QAMC method, we assume that the distribution is known and available in functional form. A full-scale general QAMC implementation should include a fast quantum circuit to generate this distribution and create proper state preparation, but this is a preceding step that we do not consider here.

Function application
Apply the desired function f (i) on the probability distribution: The probability of measuring the last (payoff) qubit in the 1ñ | state will be equal to the expectation value of the function on the discrete distribution: The piece-wise linear behavior of the payoff function (2) is achieved with a comparator circuit on the quantum computer that sets one ancilla qubit to state 1ñ | only if the state qubits are in state iñ | , where S i > K. Our implementation of the comparator circuit [6] requires n ancilla qubits.
After the comparator, only the identity function needs to be applied to the payoff qubit. Because exact quantum arithmetic is unfeasible on NISQ devices, some approximation is needed. The 'rescaling technique' for the approximation of the linear part of the payoff function was proposed [6] based on the general method of Woerner et al [3]. This technique is based on the fact that the x sin 2 ( ) function can be applied with a linear number of rotation gates. If we rescale and shift the support of the distribution so that all function values are close to 1 2 , the first term of the Taylor series of x sin 2 ( ) can be used to generate a linear function: for small x. Rescaling is parametrized by a number c, such that only x values in the range [−c, c] are used. With repeated measurements of the payoff qubit, we could approximate the expectation value using the quantum circuit in (5). This is the point where we have to use the Quantum Amplitude Estimation algorithm to achieve (quadratic) speedup, as we will see shortly.
We would like to point out that, strictly speaking, the QAMC algorithm solves a different sampling task than its classical counterpart. The expectation value of the function on the discretized probability distribution is transformed into a Bernoulli distribution with the same expectation value, and this latter is the distribution sampled.

Quantum amplitude estimation
The circuit (5) gave us the desired probability on its last qubit. With the QAE algorithm, we can find this probability with the estimation error scaling O(1/N), i.e., quadratically faster than with traditional sampling.
The Grover operator is defined as: Application of the m k th power of the Grover operator on the original state gives: where 0 y ñ | and 1 y ñ | are n-qubit normalized quantum states coming from (4), and The Grover operator is applied in a sequence of {m k } that is called the 'schedule' of amplitude estimation.
The measurement probabilities of the payoff qubit in the computational basis P 1 are then sampled, and θ a is estimated with a method like quantum phase estimation or maximum likelihood.
As mentioned previously, we use n qubits for the state preparation, n qubits for the comparator circuit, and an additional qubit as the payoff qubit. Therefore, the total number of qubits necessary is 2n + 1.
If the number of quantum samples (shots) is N shot for every Grover step, the operators  and †  corresponding to the problem get evaluated N = N shot N q times in total, where N m )is the number of queries to the operator  (or †  ) per shot. The number N is interpreted as the number of samples for the quantum algorithm.
The two most costly and problematic parts are the qubit initialization into the desired discretized distribution and the application of the target function. The first is problematic because its computational cost grows exponentially with the number of qubits. The second one has to be done approximately. Since we use a finite number of qubits, these contribute to systematic errors beyond the usual statistical errors of Monte Carlo.

Quantum advantage metric
We can compare the relative efficiency of the quantum and the classical Monte Carlo methods by calculating the ratio of the pricing errors at a fixed sample size. Distinguishing statistical errors from systematic errors, we can write where ò c and ò q denote the full error (statistical and systematic) of the classical and the quantum method, respectively; σ c and σ q denote the statistical error only (standard deviation around the expected value of the measurements), while ò syst is the systematic error in the quantum method. We can reasonably assume that the classical method is free of systematic errors. Although later on, we always calculate the above quantity numerically, the behavior of the above metric can be studied analytically.
Consider a method that promises an algebraic (power ζ) speedup over another one at the expense of some systematic error. Then the above-defined metric 10 can be written in the following way: where N shot is the sample count for the quantum circuit in every Grover step, N q is the number of queries per shot, c s and q s are the standard deviations of the classical and quantum distributions being sampled. In the presence of depolarizing noise, there is an upper threshold for the number of Grover steps (m k ), where the first term in the previous formula (11) remains constant for increasing N q [18]. One can easily see that fewer quantum shots (N shot ) lead to a greater quantum advantage Q.
If we want to generalize this quantum advantage metric and make it more useful by selecting the same runtime instead of the number of samples, with known classical and quantum runtime per sample τ c and τ q , the modified formula reads: Where we omitted the time of the measurement and reset operations τ r , since generally τ r = τ q for practical problems. In the case of the QAE, it also means that the two projection operations in each step are not taken into account since, carefully engineered, they do not contribute much to the overall depth of the quantum circuit.

Depolarizing noise
The depolarizing noise on n qubits with 'coherence probability' η, or alternatively with depolarizing 'error parameter' ò = 1 − η, is defined as the following quantum error channel [26]: where ρ is the theoretically exact density matrix of the n perfect qubits after the gate operation in question. This noise channel acting on the readout qubit causes the amplitude amplification operator of m k iterations ( m k  ) to result in a noisy amplified probability where h  is the (effective) coherence probability of the Grover operator. With a bit of approximation, we neglected the small additional error of the  operation preceding the Grover operator . Following the hardware noise through every elementary gate operation would be more realistic. However, as shown in [20], this assumption leads to a very similar overall effect characterized by an effective η Q . This is demonstrated in figure 2, where we simulated the QAE noise on the gate level and then fitted the result with an effective h  on the Grover level with convincing accuracy. The one and two-qubit gate error parameters, ò 1 and ò 2 , in the depolarizing channel, can be related to IBM's randomized benchmarking measurements by some constant multiplicative factors. For instance, the reported 'Pauli-X error' on their platform is equal to 1 2 1  and the 'CNOT error' is 3

Fourier series based function application approach
A Fourier series based function approximation technique was recently introduced by Herbert et al [21]. The technique uses the easily calculable ax sin 2 q -( )function to calculate the ax sin( )and ax cos( )functions and constructs the payoff function (weighted with the distribution) as a Fourier series from them.
We adapted this technique to European call option pricing. The extensions include taking the effect of the comparator into account and making the linear part of the payoff smoothly periodic.
The main steps of the method are to construct distinct quantum circuits l  for the lth term in the Fourier series: where Θ is the Heaviside step function. This can be done easily with a linear number of rotation gates, giving us the probabilities of measuring the payoff qubit in the 1ñ | state: The method's feasibility relies on the fact that the Fourier coefficients decay as 1/n 3 . This is only true for functions that are continuous and have continuous first derivatives, with their second and third derivatives being piece-wise continuous and bounded. That is why we have to make the linear part of the payoff function continuously periodic and calculate the Fourier coefficients that satisfy the following equations (for the coefficients, see A6): where x u is the upper limit of the linear part, a, b, c, and d are the coefficients of the third-order polynomial that ensures the sufficiently smooth periodic continuation of the linear function, and W is the period of the function that should be strictly larger than x u . Following a short calculation (see section appendix A.1), we end up with the following formula of the expectation value for the option price with the above-mentioned Fourier coefficients: From the error analysis of the Fourier technique, it will be apparent that choosing the smallest possible x u will result in the smallest statistical error. By setting the shift , we minimize x u and ensure that only the linear part of the function gets evaluated.
Obviously, we have to choose a threshold l max where we truncate the Fourier series. This will be treated as a free parameter influencing one source of systematic error.

Statistical error
The noise-dependent variance of the estimates is calculated with the help of the Cramér-Rao error bound for the QAE described in [18]. In a short summary, they create a likelihood function from the measurement probabilities (14) and calculate its Fisher information matrix regarding the parameters η and θ. The Cramér-Rao inequality provides a lower bound on the variance of the estimates: where sin 2 q ( ) is the estimated probability,  is the Fisher information matrix, η is the parameter of the depolarizing channel, and {m k } is the schedule of the amplitude estimation.
The noise model is best interpretable in terms of qubit operation error parameters, and these are the parameters in our simulations too. We used the method described in section 3.1 to fit the coherence probability h  of the amplitude amplification step (Grover operator) on the simulated results with linear Grover schedule m k = {0, 1, 2, 3,K,20}.
When we use the Cramér-Rao bound in place of the variance, we implicitly assume that the estimate is efficient. This is only true in the N → ∞ limit. To take finite sample sizes into account, we made maximum likelihood simulations for a simple 2-qubit case with different numbers of samples. To calculate the standard deviation, the simulations were repeated 1000 times each. We could fit an exponential function on the ratio of the realized standard deviation and the lower bound that will be used later for correction (figure 3).

Rescaling technique
In the case of the rescaling technique (as its name suggests), the result of the QAE is rescaled, so the variance is amplified in most cases. If QAE 2 s denotes the variance of the estimated rescaled probability, the variance of the option price is: where x max is the highest value of the domain after truncation controlled by w, c is the rescaling parameter.

Fourier series based approach
In the case of the Fourier function application technique, we perform multiple amplitude estimation tasks for the P 0 (0, 0), P l (K, 0) and P l (K, π/4) with standard deviations of the estimated probability σ 0 , l Since the Fourier technique demands additional applications of the circuit , the number of queries and so the sample size is increased, which is the same as multiplying the variance with l 2 1 max + , i.e., the variance in our formulas will be:

Classical case
In the classical case, the variance was calculated from the Black-Sholes model in a closed form (section appendix A.3). The variance of the Monte Carlo estimate can be calculated from the variance of the Black-Sholes option price:

Systematic errors
As systematic errors are exempt from statistical uncertainty, they are easily calculated numerically. To mitigate the effect of large independent systematic errors canceling each other out randomly, we calculated the systematic errors coming from different sources separately and considered the quadratic sum. The discretization and truncation errors are not easily separated numerically. However, we can obtain an analytic formula for the truncation error (see section appendix A.4) and calculate the discretization error with it.
The total systematic error is calculated as follows: Where ò trunc , ò disc are the systematic errors coming from truncating and discretizing the probability distribution, and ò func is the systematic error coming from the function approximation. The above-defined systematic error is then added to the variance (20 or 21) to obtain the total error of the QAMC algorithm:

Simulation
All simulations were done with the IBM Qiskit SDK [28]. In noisy simulations, the depolarizing channel was used on one, and two-qubit quantum gates with all qubits treated equally. This noise model, although a simplification, has a good resemblance to the noise on real devices [18,29]. The effective coherence probability (h  ) was fitted to the simulated probabilities as discussed in section 3.1. Where possible, Qiskit's density matrix method was used to obtain exact expectation values even under noise. In our simulations, we used Qiskit's built-in transpilation with maximum optimization to transpile the circuits into a basis gate set of {CX, X, SX, RZ} defined by Qiskit. These are the standard native gates on IBM quantum computers. The depolarizing noise channel was applied after every gate operation except the RZ gate, similar to real quantum hardware. After one-qubit operations, a noise channel with depolarizing error parameter ò 1 , and after two-qubit operations (CX gates) with parameter ò 2 was applied.
Throughout the simulations, we fixed the ratio between one and two-qubit gate errors to ò 2 /ò 1 = 30, and the coupling map to all-to-all coupling. We checked different settings, but it had no substantial effect on the fitted noise model effective parameter (coherence probability) h  (see figure A1).
Due to increasing computational cost, we found it infeasible to simulate quantum circuits beyond 13 qubits. However, a higher number of qubits can be easily necessary for optimal quantum advantage. To overcome this issue, we used a simple formula (26) to extrapolate to more qubits. If we assume that every application of a CX gate with error parameter ò 2 introduces the same effective error to the Grover operator, we can approximate the effective coherence probability with: where n is the number of state preparation qubits. The exponent is the total number of CX gates in the Grover operator  (a n coming from state preparation, b n + c from the other parts).
The implementation of the option pricing circuit followed [6] with slight modifications. In the Grover operator, the 0  projection onto the all-zero state is a multi-controlled Toffoli (MCX) gate that can be very costly if implemented without ancilla qubits: its CX count scales exponentially with the number of qubits [30]. With ancilla qubits, linear scaling is achievable. Luckily we already have enough ancilla qubits because of the comparator, and with the 'v-chain-dirty' implementation of MCX, the dirty ancilla qubits can be reused [31]. Our code is available on GitHub (https://github.com/udvzol/option_pricing).

Results
We show results that were made with the exponentially increasing Grover schedule m k with a base of two, i.e., m 0 = 0, m k = 2 k−1 , ∀k > 0. The rationale behind this is that this schedule provides the largest quantum speedup, and its fast increase makes numerical optimizations easier. We have to note that other schedules can be more robust against noise [18], but from our metrics point of view, they performed similarly. The maturity of the options was chosen to be T = 1 for simplicity. This does not constrain the analysis since by changing the parameters ν and r, the same behavior can be achieved. Also, for simplicity, we consider at-the-money options, i.e., where the current spot price and the strike are the same, which is usually the most relevant setup in finance. In particular, we use S 0 = K = 2.

Numerical optimization of the quantum advantage
We aimed to identify the gate error threshold that makes QAMC advantageous over the classical one. The quantum method has several parameters such as the Grover schedule {m k }, the number of state preparation qubits n, the truncation parameter w, and the number of Fourier coefficients l max or, alternatively, the parameter c in the rescaling technique. Optimization for the quantum advantage metric Q made it possible to eliminate the dependence on many parameters and only retain the dependence on the gate error. Of course, there is a trade-off between systematic and statistical errors, but during the optimization, we treat them on equal footing.
In fact, if we were optimizing for the quantum advantage metric Q alone, which is a ratio of error bars, there would be no control over the absolute magnitudes of these errors. To circumvent this, we maximize Q while restricting the individual errors below a certain threshold ò thr : We investigated and optimized quantum advantage on the following parameter grid: n from 1 to 14 with step size 1; 30 evenly spaced values for w in [0.5, 10]; 30 logarithmically spaced values for c in [10 −5 , 10 0 ]; l max from 0 to 59 with step size 1, and the maximum of the exponential schedule went from 0 to 2 23 . We worked with relatively small sample sizes, e.g., N shot = 100, changing this parameter to achieve our ò thr error constraint. Note, however, that due to systematic errors, the precision cannot be increased beyond a certain point by increasing N shot . Therefore, our interest was rather how to get the maximum out of a small, optimized implementation with a reasonable (finite) number of quantum samples.
The optimal number of state preparation qubits n changes with varying gate errors. In agreement with our expectations, the lower the noise is, the more qubits are favored since it decreases the discretization error. This behavior can be seen in figure 4. For large noise, there is no point in decreasing the discretization error because that would require more qubits and larger circuits and thus would increase the overall error level of the algorithm elsewhere.
To compare the Fourier and the rescaling methodologies and quantify the quantum advantage across several option parameters, we made a grid of options with r = {0.05, 0.1, 0.2} and ν = {0.1, 0.2, 0.3, 0.4}. The grid choice was inspired by real-world options. Our results show great variability in the achievable quantum advantage due to the unpredictable and erratic impact of systematic errors. Nevertheless, from figure 5 the following conclusions can be drawn: • For most of the options tested, quantum advantage is achieved when the CX error rate goes down to 10 −6 or 10 −7 .
• The Fourier technique performs better on every option across a wide range of CX errors.
• The relative advantage of the Fourier approach is pretty stable and does not depend on the gate error.  Typical optimal parameters in the region where we see quantum advantage (Q > 1) are 5-10 for n, around 10 −2 for c, 5-10 for w, 10-30 for l max , greatly varying, and 10-20 for the length of the schedule.

Systematic errors
We found that systematic errors have high variability and can show unexpected behavior. To assess the importance of systematic errors, we calculated the ratio of the systematic to statistical errors in the optimum, see figure 6. The statistical error (standard deviation around the mean) is generally larger than the systematic error, but the ratio remains relatively high, around 0.7 for the rescaled function application technique. For the Fourier technique, the ratio is lower almost by one order of magnitude. The fact that it was possible to decrease the ratio to these levels highlights the benefit of the technique.

Conclusion
We examined the efficiency of the QAMC method in a realistic implementation of quantum options pricing. We found that systematic errors can produce large variations as parameters of the options or the quantum algorithm are changed, and this greatly influences the method's feasibility on a small noisy quantum computer.
We found clear evidence that the Fourier function approximation technique works better in the option pricing algorithm. This finding coincides with the limited comparison done by Herbert et al [21].
According to our calculations, a CX error level of around 10 −6 will be needed for quantum advantage to show up in European option pricing. Current IBM computers have a CX error rate of around 10 −2 − 10 −3 . Our target error rates can be achieved most probably with error correction only.
Our finding that about 20 qubits might be sufficient to achieve quantum advantage may contradict the 7500 qubits Chakrabarti et al [17] deemed necessary at first sight. However, given the much more complicated problem and objective they considered, the two results complement each other.
The state preparation step has a high impact on both the noise level and the errors from discretization and truncation. The problem can be tackled by quantum generative adversarial networks, which can reduce the computational cost of the step, but introduce another type of systematic error due to its approximate nature [32].
If quantum arithmetic becomes feasible, both the probability distribution evaluation and function evaluation could be done on the quantum computer eliminating most of the systematic errors at the expense of larger noise.
The quantum Fischer information could provide a lower error bound than the classical Fischer information calculated here. However, it is only obtainable through adaptive measurements. Equivalently, modified versions of the Grover operator could lead to higher classical Fischer information [33]. These types of modifications were out of scope for this paper; only the standard algorithm was considered.
It would be worth exploring other types of noise models, but unfortunately, they can be treated only numerically.
We use these formulas in the Fourier series to reach equation (18).

A.2. Coefficients of the third-order polynomial
The following coefficients satisfy the conditions required of the third-order polynomial 17: