Non-asymptotic analysis of quantum metrology protocols beyond the Cramér-Rao bound

Many results in the quantum metrology literature use the Cramér-Rao bound and the Fisher information to compare different quantum estimation strategies. However, several assumptions go into the construction of these tools, and these limitations are sometimes not taken into account. While a strategy that utilises this method can considerably simplify the problem and is valid asymptotically, a rigorous and fair comparison requires a more general approach. In this work we use a methodology based on Bayesian inference to understand what happens when the Cramér-Rao bound is not valid. In particular, we quantify the impact of these restrictions on the overall performance of a wide range of schemes, including those commonly employed for the estimation of optical phases. We calculate the number of observations and the minimum prior knowledge that are needed such that the Cramér-Rao bound is a valid approximation. Since these requirements are state-dependent, the usual conclusions drawn from the standard methods do not always hold when the analysis is performed more carefully. These results have important implications for the analysis of theory and experiments in quantum metrology.


I. INTRODUCTION
Quantum metrology employs quantum resources to enhance the estimation of unknown parameters of interest that are not directly measurable [1][2][3][4]. Its final aim is to find the strategy that can extract information with the greatest possible precision for a given amount of physical resources, and thus it is an optimization problem. To solve it, first we need to define a mathematical quantity that acts as a figure of merit and informs us about the error of the estimation process. We then minimize that quantity with respect to the elements that we can typically control, that is, the physical state of the system, the measurement scheme and the statistical functions employed in the analysis of the experimental data.
A widely used method to compare estimation schemes consists in minimizing the mean square error by approaching the Cramér-Rao bound, where the latter is defined in terms of the Fisher information [5][6][7]. Although this procedure has its merits and significantly simplifies the analysis of a given strategy, in general it is only suitable when the available prior knowledge is enough to adopt a local approach and the number of experimental observations is asymptotically large [4,[8][9][10]. The latter limitation has been addressed in the context of the maximum-likelihood strategy [11,12], and more recently with the quantum Ziv-Zakai and Weiss-Weinstein bounds [13,14], which also incorporate the effect of the prior information. Nevertheless, the previous restrictions are sometimes not taken into account, in spite of the fact that a naive use of the Fisher information can predict schemes with an apparent infinite precision [15][16][17] which are inefficient in practice [4,13,16,18,19]. Since in general it is not possible to foresee when and how the Cramér-Rao bound is going to fail in a concrete practical scenario from the asymptotic theory itself, a closer analysis of those schemes that are asymptotically optimal is needed.

* J.Rubio-Jimenez@sussex.ac.uk
The aim of this work is to investigate the regime of validity of the quantum Cramér-Rao bound for specific strategies that are commonly employed in the context of quantum metrology. Moreover, we provide quantitative results to understand what happens in practice with the conclusions extracted from the Cramér-Rao bound in the regime where it is not a valid approximation. This is achieved by utilising a versatile numerical framework that combines different known Bayesian techniques in a pragmatic way to answer the following question: if we have designed a quantum experiment using the criteria of the asymptotic theory, what is the impact of this simplification on the overall performance when the number of observations is not large enough?
The paper is organised as follows. We start by reviewing the Cramér-Rao bound as an asymptotic approximation for the Bayesian error and the basic tools of quantum estimation theory in Section II. Section III develops the methodology that we have followed, and our main results are presented and discussed in Section IV. In particular, we have first selected several states commonly used in optical interferometry and obtained the mean square error for an asymptotically optimal scheme, which gives us the exact value of the uncertainty for any number of observations. Secondly, we have studied the deviations from the asymptotic approximation and the number of observations needed such that the relative error between the quantum Cramér-Rao bound and the exact Bayesian error is small. In addition, we have shown that the numerical approximation of the exact calculation is consistent with the quantum Ziv-Zakai and Weiss-Weinstein bounds.
Our results verify that both the number of observations and the minimum prior knowledge needed to achieve the asymptotic regime are state-dependent. This has allowed us to show how the conclusions about the relative performance of different states change in the non-asymptotic regime for optical schemes. As a consequence, in general we can say that maximizing the Fisher information alone does not always guarantee the best precision for experiments with a limited number of observations.

II. BASIC THEORY
This section includes a summary of the context needed to understand in which sense the Cramér-Rao bound can be seen as an approximation and how this motivates our analysis. A more comprehensive introduction to estimation theory and its application to quantum metrology problems can be found in [4], and a reader already familiar with these ideas can skip straight to the methodology in Section III and our main results in Section IV.

A. Uncertainty in single-parameter estimation
Given an experiment where n = (n₁, n₂, ..., n_µ) are the outcomes of µ independent observations, an estimation function g(n) can be constructed to estimate the unknown parameter θ. The precision of this procedure is expressed with an error function ε[g(n), θ], and the uncertainty, averaging over the different values the underlying parameter can take as well as the different measurement outcomes that can be obtained, is defined as

$$\bar{\epsilon} = \int d\theta\, dn\, p(n, \theta)\, \epsilon[g(n), \theta], \qquad (1)$$

where p(n, θ) is the joint probability density function for the variables of the experiment. In addition, the product rule implies that p(n, θ) = p(θ)p(n|θ). The function p(θ) is the prior probability density, and it encodes what is known about the parameter before the experiment is performed. This information can be given, for instance, by the results of previous experiments, and it will typically include the domain a ≤ θ ≤ b in which we can expect to find the parameter. The information about the outcomes of the actual experiment is encoded in the likelihood function p(n|θ), and for a quantum system, the Born rule establishes that

$$p(n|\theta) = \prod_{i=1}^{\mu} \mathrm{Tr}\left[E_{n_i}\, \rho(\theta)\right], \qquad (2)$$

where we have considered the following protocol:

1. A probe state ρ₀ is prepared.

2. The parameter θ is encoded by a unitary transformation, so that ρ(θ) = U(θ)ρ₀U†(θ).
3. A positive-operator valued measure E ni is used to model the measurement scheme.
4. The previous three steps are repeated µ times.
When the parameter to be estimated is periodic, as is the case for optical phase shifts, a periodic error function is the most suitable choice. The simplest option that satisfies the requirements of this symmetry is [4]

$$\epsilon[g(n), \theta] = 4 \sin^2\left[\frac{g(n) - \theta}{2}\right]. \qquad (3)$$
However, since sin²(x) ≈ x² when x is small, for a parameter domain less than one period Eq. 1 can be approximated as

$$\bar{\epsilon}_{\mathrm{mse}} = \int d\theta\, dn\, p(n, \theta)\, [g(n) - \theta]^2, \qquad (4)$$

which is the mean square error [20]. The limitations of this approximation are discussed in Appendix A for the specific scenarios considered in Section IV.
Assuming that the prior of the experiment is given and the encoding operator is known, the optimization of the metrology protocol is achieved by minimizing Eq. 4 with respect to the estimator, the measurement scheme and the probe state. If we look at Eq. 4 as a functional of g(n), then the optimal estimator is determined classically by solving the variational problem [10]

$$\delta \bar{\epsilon}_{\mathrm{mse}}[g(n)] = \delta \int dn\, \mathcal{L}[n, g(n)] = 0, \qquad (5)$$

where $\mathcal{L}[n, g(n)] = \int d\theta\, p(n, \theta)\, [g(n) - \theta]^2$. As a result we have that

$$g(n) = \int d\theta\, p(\theta|n)\, \theta, \qquad (6)$$

where

$$p(\theta|n) = \frac{p(\theta)\, p(n|\theta)}{p(n)} \qquad (7)$$

is the posterior density function defined by means of the Bayes theorem. Hence, Eq. 4 becomes

$$\bar{\epsilon}_{\mathrm{mse}} = \int dn\, p(n)\, \epsilon(n), \qquad (8)$$

with $p(n) = \int d\theta\, p(\theta)\, p(n|\theta)$ and

$$\epsilon(n) = \int d\theta\, p(\theta|n)\, \theta^2 - \left[\int d\theta\, p(\theta|n)\, \theta\right]^2. \qquad (9)$$

Note that Eq. 9 is the variance of the parameter with respect to the posterior for the experimental data n [4]. The calculation of Eq. 8 is still very challenging in general, and therefore it is important to identify further approximations that simplify the problem in practice. To accomplish that task, let us imagine a hypothetical scenario where the likelihood p(n|θ), as a function of θ, becomes narrower and concentrated around a maximum whose value is the unknown parameter θ' when µ ≫ 1. In addition, the prior knowledge is enough to identify a region of the parameter domain in which θ' can be found, although the experimental information dominates in this regime. In that case, the posterior function p(θ|n) can be approximated by the Gaussian density [7,10]

$$p(\theta|n) \approx \sqrt{\frac{\mu F(\theta')}{2\pi}} \exp\left[-\frac{\mu F(\theta')}{2}(\theta - \theta')^2\right], \qquad (10)$$

where

$$F(\theta) = \int dn\, p(n|\theta) \left[\frac{\partial \log p(n|\theta)}{\partial \theta}\right]^2 \qquad (11)$$

is the Fisher information and n is the outcome for a single observation. Moreover, we further assume that the Fisher information does not depend on the parameter, so that F(θ) = F for all θ. Thus we are able to approximate Eq. 8 as

$$\bar{\epsilon}_{\mathrm{mse}} \approx \frac{1}{\mu F}. \qquad (12)$$

This result is known as the Cramér-Rao bound in the context of local estimation theory [4,6,8], although here we are using it as an approximation, valid under certain circumstances, to the Bayesian uncertainty defined by Eq. 1 and Eq. 3, and not as a proper bound. More concretely, Eq. 12 holds when the number of observations µ is very large and the prior information is enough to localize the relevant domain. These properties define the asymptotic regime.
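The convergence of the posterior variance towards 1/(µF) can be illustrated with a short simulation. The binary likelihood p(+|θ) = (1 + cos θ)/2 used here is an illustrative toy model chosen because its Fisher information is F(θ) = 1 for every θ; it is not one of the probes analysed later:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy binary likelihood p(+|theta) = (1 + cos theta)/2, for which the
# Fisher information is F(theta) = 1 for every theta.  This model is an
# illustrative assumption, not one of the probes analysed in the paper.
theta_true = 1.0
grid = np.linspace(0.0, np.pi, 2001)[1:-1]     # flat prior on (0, pi)
p_plus = (1.0 + np.cos(grid)) / 2.0

def posterior_variance(mu):
    """Variance of the posterior (Eq. 9) after mu simulated outcomes."""
    k = rng.binomial(mu, (1.0 + np.cos(theta_true)) / 2.0)   # '+' counts
    log_post = k * np.log(p_plus) + (mu - k) * np.log(1.0 - p_plus)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    mean = (post * grid).sum()
    return (post * grid**2).sum() - mean**2

variances = {mu: posterior_variance(mu) for mu in (10, 100, 1000)}
for mu, var in variances.items():
    print(mu, var, 1.0 / mu)   # posterior variance vs 1/(mu F), with F = 1
```

For small µ the posterior variance deviates noticeably from 1/(µF); only as µ grows do the two quantities agree, which is the content of the asymptotic regime described above.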
The details of this known heuristic argument are reviewed in Appendix B. Furthermore, a more rigorous approach based on the theory of local asymptotic normality can be found in [21,22]. According to Eq. 11, the Fisher information only depends on the likelihood function, which is constructed out of the measurement scheme and the transformed state. By maximising it over all the positive-operator valued measures, it is possible to prove the inequality [5,[23][24][25]

$$F(\theta) \leqslant F_q(\theta) = \mathrm{Tr}\left[\rho(\theta)\, L(\theta)^2\right], \qquad (13)$$

where F_q(θ) is the quantum Fisher information and the symmetric logarithmic derivative L(θ) satisfies

$$\frac{\partial \rho(\theta)}{\partial \theta} = \frac{1}{2}\left[L(\theta)\rho(\theta) + \rho(\theta)L(\theta)\right]. \qquad (14)$$

This bound is saturated if the measurement scheme is given by the projections onto the eigenstates of L(θ) [24,25].
Since the parameter is encoded with a unitary transformation, the quantum Fisher information will not depend on θ explicitly [4]. In that case, the saturation of Eq. 13 implies that the approximation in Eq. 12 becomes

$$\bar{\epsilon}_{\mathrm{mse}} \approx \frac{1}{\mu F_q}, \qquad (15)$$

which is known as the quantum Cramér-Rao bound in the local approach [4,8]. From this we can conclude that the asymptotic optimal precision is a function of ρ(θ) alone and that, to find optimal probes in this regime, we just need to maximize the quantum Fisher information. Nevertheless, from a physical perspective the number of observations is always limited by the available resources. In consequence, whenever two strategies are compared in terms of the quantum Cramér-Rao bound, in general it is also necessary to indicate how large µ needs to be such that Eq. 15 is a good approximation. Moreover, if the likelihood reaches its maximum for several values of the parameter, then we need enough prior knowledge to select a single peak. The verification of these crucial restrictions is not always done in the literature, a problem that can be overcome by using the framework of the next section.

III. METHODOLOGY
The procedure described in Section II does not specify the order of magnitude of µ nor the minimum prior knowledge that this strategy requires. Although the early proposal of [12] answers the former question by generalizing the likelihood equation in the local context, and [13] captures the influence of the prior probability to some extent, there is no method that takes into account the combined action of these restrictions simultaneously and exactly in practical scenarios. This motivates the search for a more general approach. A solution to this problem is provided by combining different known Bayesian techniques into a pragmatic methodology.

A. Experimental configuration and prior knowledge
Let us consider that we arrange an experiment such that a system described by ρ(θ) is measured with a scheme that is optimal with respect to the quantum Cramér-Rao bound. This configuration is then summarized with p(n|θ) through Eq. 2.
On the other hand, in Section II we discussed that the likelihood function needs to be concentrated around its highest peak in order to be able to use the approximation in Eq. 12 (see also the construction reviewed in Appendix B). This local behaviour implies that, for a given scheme, the width of the parameter domain must be such that the solution to the problem ∂p(n|θ)/∂θ = 0 includes an asymptotically unique absolute maximum. Hence, we introduce the quantity W_int, which we call the intrinsic width, and we define it as the width that fulfils the above criterion on average. Notice that if W₀ > W_int, where W₀ is the initial width of our scheme, then the experiment cannot distinguish between two or more equally likely values, and the mean square error tends to a constant when µ ≫ 1. In practice, the prior information is determined by the experimental configuration under consideration. We will see that different states are associated with a different W_int; consequently, only those states with a value for W_int that is greater than or equal to the width imposed by the experiment would be useful in a real scenario.
For optical phases, and assuming that the only information known a priori about the parameter includes the length of the relevant domain, a flat prior is a reasonable choice, since it does not modify the information of the likelihood in the region where it becomes narrower. In addition, it simplifies the calculations. Therefore, we will consider that this probability distribution is the uninformative intrinsic prior of our particular strategy, and we will use it for our analysis [26].
To find W int we can plot the posterior probability p(θ|n) as a function of θ directly, since its relative extremes coincide with those of the likelihood when the prior is flat. This procedure depends on the simulation of several random outcomes n for different values of the parameter, and thus the solution is necessarily probabilistic. However, this is enough for our purposes because our analysis only requires that this is satisfied in the asymptotic regime, where µ is large.
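This peak-counting procedure can be sketched numerically. We assume for illustration a NOON-like binary likelihood p(+|θ) = [1 + cos(Nθ)]/2 with N = 2 (a stand-in for the probes of Section IV), whose period makes two phase values inside [0, π] statistically indistinguishable, so the posterior is bimodal when W₀ exceeds π/2:

```python
import numpy as np

rng = np.random.default_rng(2)

# NOON-like binary likelihood with N = 2 (an illustrative assumption):
# p(+|theta) = [1 + cos(N theta)]/2 has period pi, so two phases inside
# [0, pi] produce identical statistics.
N, mu, theta_true = 2, 500, 0.7
p = lambda th: (1.0 + np.cos(N * th)) / 2.0
k = rng.binomial(mu, p(theta_true))          # simulated '+' counts

def n_posterior_peaks(width):
    """Count well-separated maxima of the posterior on a flat prior [0, width]."""
    grid = np.linspace(0.0, width, 1500)[1:-1]
    log_post = k * np.log(p(grid)) + (mu - k) * np.log(1.0 - p(grid))
    post = np.exp(log_post - log_post.max())
    maxima = (post[1:-1] > post[:-2]) & (post[1:-1] > post[2:]) & (post[1:-1] > 0.5)
    return int(maxima.sum())

n_wide = n_posterior_peaks(np.pi)        # W0 = pi  > W_int: ambiguous posterior
n_narrow = n_posterior_peaks(np.pi / 2)  # W0 = pi/2 = W_int: single peak
print(n_wide, n_narrow)
```

The transition from two equally likely maxima to a single one as the prior window shrinks is exactly the probabilistic criterion that defines W_int.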

B. Numerical strategy
We now have all the pieces needed to calculate Eq. 8 exactly, which is the next step of our strategy. Since this integral has (µ + 1) dimensions and we are interested in studying its behaviour as µ increases, in general we can only compute it numerically. While this is a purely numerical problem that arises in the Bayesian literature [4,10] and can be treated with well known numerical techniques [27,28], we believe that giving an explicit scheme of calculation in terms of physical arguments as part of the methodology offers conceptual clarity and insight. In particular, we have followed a three-step method:

1. If a collection of µ experimental outcomes n originated from the unknown parameter θ', and assuming the knowledge of p(n|θ) and W_int previously discussed, then the error of the estimation based on that particular experiment will be given by Eq. 9, that is, by the variance of the posterior probability p(θ|n). Moreover, this uncertainty is understood in [29] as the error that arises from gathering and processing data in a real experiment. The integral that defines this quantity can be calculated with a standard deterministic method after the simulation of n for a given θ', which implies that Eq. 9 depends on θ' through the values of the outcomes.
2. According to Eq. 9, different uncertainties ε(n) can be associated with the estimation depending on the particular values n. Therefore, if our aim is to simulate experiments whose performance is optimal on average, we need to calculate the average of the errors for all the possible experimental outcomes associated with θ', weighted by their likelihood, i.e.,

$$\bar{\epsilon}(\theta') = \int dn\, p(n|\theta')\, \epsilon(n). \qquad (16)$$

This is precisely what is done in [29]. The multidimensional integral in Eq. 16 can be solved using Monte Carlo techniques [27,28].
3. The previous quantity still depends on θ', which is not known. However, by taking the average weighted over our prior knowledge of θ' we finally obtain the mean square error

$$\bar{\epsilon}_{\mathrm{mse}} = \int d\theta'\, p(\theta')\, \bar{\epsilon}(\theta'), \qquad (17)$$

which is independent of the values of both the parameter and the outcomes. Following the previous discussion, ε̄_mse represents the uncertainty, on average, about the knowledge that we can acquire in principle with the experimental configuration that is being studied, and as such it is the suitable figure of merit to design experiments from theoretical considerations. The integral over θ' can be calculated by a deterministic numerical method once ε̄(θ') is known for different values of θ' from the second step.
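The three steps above can be sketched in a few lines. We assume for illustration a toy binary likelihood p(+|θ) = (1 + cos θ)/2 with constant Fisher information F = 1, rather than the optical likelihoods of Section IV; the grid sizes and Monte Carlo sample counts are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative stand-in for the optical likelihoods: a binary outcome
# with p(+|theta) = (1 + cos theta)/2, whose Fisher information is F = 1.
W0 = np.pi                                    # flat prior on [0, W0]
grid = np.linspace(0.0, W0, 1001)[1:-1]
p_grid = (1.0 + np.cos(grid)) / 2.0

def posterior_var(k, mu):
    # Step 1: variance of the posterior (Eq. 9) for k simulated '+' counts.
    log_post = k * np.log(p_grid) + (mu - k) * np.log(1.0 - p_grid)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    mean = (post * grid).sum()
    return (post * grid**2).sum() - mean**2

def eps_bar(theta_p, mu, n_mc=200):
    # Step 2: Monte Carlo average over outcomes drawn from p(n|theta') (Eq. 16).
    ks = rng.binomial(mu, (1.0 + np.cos(theta_p)) / 2.0, size=n_mc)
    return np.mean([posterior_var(k, mu) for k in ks])

def eps_mse(mu, n_theta=9):
    # Step 3: deterministic average of eps_bar over the flat prior.
    thetas = np.linspace(0.0, W0, n_theta + 2)[1:-1]
    return np.mean([eps_bar(t, mu) for t in thetas])

mu = 100
exact = eps_mse(mu)
print(exact, 1.0 / mu)   # exact Bayesian error vs asymptotic value 1/(mu F)
```

For this constant-F model the exact value lands close to 1/(µF) at µ = 100; for the optical probes of Section IV the same pipeline reveals when that agreement fails.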
Although there are other ways of implementing this calculation [30], the reason to choose the strategy described above is twofold. Firstly, it offers a clear physical motivation for the use of the measure of uncertainty defined in Eq. 1 as the figure of merit. Secondly, its numerical implementation is relatively straightforward, and it has turned out to be robust against small variations of the numerical parameters for a reasonable number of iterations.

C. Classical approximation threshold
Our final goal is to quantify the deviation of the quantum Cramér-Rao bound as a function of the number of observations. Once we know the exact value of Eq. 8 for our particular scheme, a simple way of achieving this is to introduce the relative error

$$\epsilon_{\mathrm{rel}} = \frac{\left|\bar{\epsilon}_{\mathrm{mse}} - 1/(\mu F_q)\right|}{\bar{\epsilon}_{\mathrm{mse}}} \qquad (18)$$

for ε̄_mse ≠ 0. This will give us the minimum number of observations µ_τ that is needed such that the approximation in Eq. 15 is valid for a given threshold ε_τ, which should be chosen according to the requirements of the specific experimental configuration that is being analysed.
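The threshold criterion can be sketched with a toy model where the exact Bayesian error has a closed form: a Gaussian prior of variance σ₀² = 0.25 with a unit-variance Gaussian likelihood (an assumption chosen for illustration, not one of the optical schemes). The exact error is then 1/(µ + 1/σ₀²), the asymptotic value is 1/µ, and the relative error reduces to (1/σ₀²)/µ:

```python
# Hypothetical Gaussian-conjugate example: exact Bayesian MSE = 1/(mu + c)
# with c = 1/sigma0^2, while the Cramer-Rao approximation gives 1/mu.
c = 4.0            # prior variance sigma0^2 = 0.25
eps_tau = 0.03     # chosen threshold for the relative error

def relative_error(mu):
    exact = 1.0 / (mu + c)
    crb = 1.0 / mu
    return abs(exact - crb) / exact     # simplifies to c / mu

mu_tau = next(mu for mu in range(1, 10**6) if relative_error(mu) <= eps_tau)
print(mu_tau)   # first mu whose relative error is below the threshold
```

In this toy case µ_τ = ⌈c/ε_τ⌉; for the states of Section IV the same search is performed on the numerically computed ε̄_mse instead.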

D. Bayesian quantum bounds
A different approach that can also identify the situations in which the Cramér-Rao bound fails is based on deriving alternative quantum bounds that are valid for all µ. This idea was precisely explored in [13,14], where the two main families of classical Bayesian bounds [31] were extended to the quantum regime. According to their results, the quantum Ziv-Zakai bound for a flat prior between a = 0 and b = W₀ is [13]

$$\bar{\epsilon}_{\mathrm{mse}} \geqslant \frac{1}{2}\int_0^{W_0} d\tau\, \tau \left(1 - \frac{\tau}{W_0}\right)\left[1 - \sqrt{1 - |f(\tau)|^{2\mu}}\right], \qquad (19)$$

where f(θ) = ⟨ψ₀|ψ(θ)⟩, |ψ₀⟩ is a pure state and |ψ(θ)⟩ encodes the parameter with a unitary transformation. In addition, the quantum Weiss-Weinstein bound of [14] provides a second lower bound on ε̄_mse, Eq. 20, which is likewise valid for any number of observations and can be evaluated from the same overlap function f(θ). There also exists a Bayesian version of the Cramér-Rao bound based on the van Trees inequality [32]. Unfortunately, its derivation requires that the prior satisfies the boundary conditions p(a) → 0 and p(b) → 0, and this excludes the case of the flat prior between a and b.
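As a sketch, the quantum Ziv-Zakai bound can be evaluated by direct quadrature. We assume the bound takes the flat-prior form ε̄ ≥ ½∫₀^{W₀} dτ τ(1 − τ/W₀)[1 − √(1 − |f(τ)|^{2µ})] and a NOON-state overlap |f(τ)| = |cos(Nτ/2)|; both choices are our illustrative reading, and the grid size is arbitrary:

```python
import numpy as np

def qzzb(mu, N=2, W0=np.pi / 2, n_grid=20000):
    """Quadrature of the quantum Ziv-Zakai bound for an assumed
    NOON-state overlap |f(tau)| = |cos(N tau / 2)|."""
    tau = np.linspace(0.0, W0, n_grid)
    fidelity = np.cos(N * tau / 2.0) ** (2 * mu)   # |f(tau)|^(2 mu)
    integrand = tau * (1.0 - tau / W0) * (1.0 - np.sqrt(1.0 - fidelity))
    return 0.5 * np.sum(integrand) * (tau[1] - tau[0])

bounds = {mu: qzzb(mu) for mu in (10, 100, 1000)}
for mu, b in bounds.items():
    print(mu, b)   # lower bound on the mean square error, decreasing in mu
```

Unlike the Cramér-Rao approximation, this quantity lower-bounds the error for every µ, which is what makes it useful as a consistency check.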
In spite of the utility of this method, the key advantage of using the direct calculation of the mean square error instead is that then we are evaluating the validity of the Cramér-Rao bound exactly. Nevertheless, we will still make use of these bounds as a consistency test for the numerical evaluation of Eq. 8.

IV. RESULTS AND DISCUSSION
The methodology that we have described is general enough to accommodate a wide range of estimation problems, and in this section we explore its application to phase estimation in optical interferometry [4,33]. These results constitute the main contribution of this work.
Let us assume that we are working in the number basis of a two-path interferometer, and that the parameter θ is encoded as a difference of phase shifts by means of the unitary transformation U(θ) = exp[−i(a₁†a₁ − a₂†a₂)θ/2], where aᵢ, aᵢ† are the annihilation and creation operators for the modes i = 1, 2. Here we focus on a collection of states that together represent the common techniques currently used in quantum metrology [4,29,34,35]. Concretely, we consider

- a coherent state $|\alpha/\sqrt{2}, -i\alpha/\sqrt{2}\rangle = U_{BS} D(\alpha)|0, 0\rangle$, (21)
- a NOON state $(|N, 0\rangle + |0, N\rangle)/\sqrt{2}$, (22)
- a twin squeezed vacuum $S_1(r) S_2(r)|0, 0\rangle$, (23)
- a squeezed entangled state $\mathcal{N}(|r, 0\rangle + |0, r\rangle)$, (24)

where U_BS = exp[−i(a₁†a₂ + a₂†a₁)π/4] is a 50:50 beam splitter, D(α) = exp(αa₁† − α*a₁) is the displacement operator, |r⟩ denotes a single-mode squeezed vacuum and $\mathcal{N}$ is a normalization constant.
Since coherent states present no quantum correlations, their precision is asymptotically given by the standard quantum limit. In contrast, NOON states have inter-mode and intra-mode correlations and can achieve the Heisenberg limit, although the twin squeezed vacuum also achieves a Heisenberg scaling while having intra-mode correlations only [4,34]. Finally, the squeezed entangled states, which have both types of correlations, constitute a precision improvement over the previous states [29]. Note that we have selected pure states for the sake of simplicity, but our methods would also be applicable to mixed states. A common property of these configurations is that they belong to the family of path-symmetric states introduced in [36]. Therefore, their classical Fisher information will reach the bound imposed in Eq. 13 by its quantum counterpart if we implement a photon-counting measurement after the action of a 50:50 beam splitter. This implies that any discrepancy between Eq. 8 and Eq. 15 must necessarily come from the approximation that we discussed in Section II.
The first step to apply our numerical strategy is to identify the intrinsic width W_int of each state for a given mean number of particles per probe n̄. Some of the random simulations that are required to achieve that goal are shown in Figure 1, which allow us to deduce the size of the maximum width by direct examination [37]. For a twin squeezed vacuum and a squeezed entangled state we have found that W_int = π/2, while coherent states have W_int = π. The latter value was also determined by a different method in [38]. Note that those results hold for any n̄. On the contrary, with NOON states we have that W_int = π/n̄ or W_int = π/(2n̄) depending on whether the value for N in Eq. 22 is even or odd. It can be observed that none of the states allows us to uniquely identify the relative phase shift when we have no information about its possible values, that is, if W₀ = 2π. Moreover, the NOON states present an intrinsic width smaller than 2π/n̄, which is their natural periodicity. We conclude then that the scheme that we are employing introduces some limitations on the estimation of the parameter, in spite of the fact that the measurement is optimal according to the quantum Cramér-Rao bound criterion.
Once W_int is known, we calculate Eq. 8, Eq. 15 and Eq. 18 with the uniform prior

$$p(\theta) = \frac{1}{W_0} \quad \text{inside the window of width } W_0, \qquad (25)$$

and p(θ) = 0 otherwise. The results are shown in Figure 2.a and Figure 2.b, where we have assumed that the experiment can only be repeated µ = 10³ times as an extra constraint. For this number of observations, the mean square error of coherent, NOON and twin squeezed vacuum states is close enough to the result predicted by the quantum Cramér-Rao bound. In particular, their relative error is smaller than the selected threshold ε_τ = 5. However, the minimum number of observations needed to reach that threshold is different for different states, and the squeezed entangled state does not even reach it in the regime that we are studying. This state-dependent phenomenon, whose concrete values are indicated in Table I, has important consequences.
If we first consider the comparison between a NOON state and a twin squeezed vacuum with n̄ = 2 and W_int = π/2, we can see that the latter is a better choice according to the Fisher information, but its error is higher for µ < 20. Even if we focus on the results of the asymptotic regime, the twin squeezed vacuum requires µ ∼ 10³ observations to achieve it, while the NOON state only needs µ ∼ 10². Thus a state whose Fisher information is maximum with respect to other probes can still produce a larger error if the experiment is operating outside of the asymptotic regime. Moreover, although it was shown that only the intra-mode correlations are crucial to surpass the standard quantum limit in the regime where the Fisher approach is valid [34,39,40], this comparison between a NOON state, which includes both types of correlations, and a twin squeezed vacuum, which has intra-mode correlations only, suggests that the role of quantum correlations in metrology should be revisited for the non-asymptotic regime.
On the other hand, a coherent state with n̄ = 2 and W_int = π is less precise than a NOON state with n̄ = 1 and W_int = π/2 when µ ∼ 1. This implies that there is a region in which a probe with fewer resources can still beat a scheme with more photons if the prior knowledge of the former is higher. By combining these observations with those extracted from the previous probes we conclude that the Cramér-Rao bound can both overestimate and underestimate the precision outside of its regime of validity. It is particularly relevant to draw attention to the latter case, since the fact that NOON and coherent states display a mean square error which is lower than their respective Cramér-Rao bounds for low values of µ demonstrates that the unbiased estimators of the local theory are not always optimal [41].
The analysis of the squeezed entangled state provides further details of the properties of the non-asymptotic regime. In particular, its performance is worse than that of all the previous cases for µ ∼ 10, and it only becomes the best choice when the number of repetitions is greater than µ ∼ 10². Surprisingly, this result shows that while states with an indefinite number of photons can do better than the optimal choice for a finite number of quanta, NOON states have the best absolute precision among the cases that we have studied if the number of observations is less than µ ∼ 10.
To have a fairer comparison, we have also repeated the calculation with a common width W₀ = π/3 and n̄ = 2. Figure 2.c and Figure 2.d show that, while the numerical values are slightly different, the qualitative conclusions are the same. Nonetheless, there is an important difference given that the prior knowledge is now higher. For the NOON and coherent states, µ_τ has increased with respect to the previous calculation, since the starting difference between the mean square error and the bound is now greater. On the other hand, for the twin squeezed vacuum there is a point where the mean square error now crosses the Cramér-Rao bound before a stable saturation is reached. This happens because for W₀ = W_int the mean square error approached the bound from above, while for W₀ = π/3 the error begins below the bound and then crosses it to achieve the asymptotic regime from above. This suggests that if we keep increasing our prior information and we make the width of the parameter domain very small, then the number of observations needed to approach the Cramér-Rao bound will grow.
It is possible to formalize the previous phenomenon and derive an intuitive and informative relation that detects states that are not well behaved. Firstly, we note that the uncertainty of an estimation that is made before we perform the experiment is represented by the variance of the prior probability,

$$\Delta\theta_p^2 = \int d\theta\, p(\theta)\, \theta^2 - \left[\int d\theta\, p(\theta)\, \theta\right]^2, \qquad (26)$$

which is W₀²/12 for a flat distribution of width W₀. On the other hand, we know that the precision is given by the Fisher information when µ ≫ 1; consequently, an estimation protocol is only worthwhile when

$$\frac{1}{\mu F_q(\rho)} \lesssim \Delta\theta_p^2(\rho) \qquad (27)$$

is asymptotically satisfied, where we have made explicit the dependence on the state to indicate that the values of µ and Δθ²_p must guarantee that the Cramér-Rao regime can be reached. If Eq. 27 were not fulfilled, then the experiment would not be telling us more than what we already knew. By reorganizing the terms we finally arrive at

$$\mu \gtrsim \frac{1}{F_q(\rho)\, \Delta\theta_p^2(\rho)}, \qquad (28)$$

which is a constraint based on practical requirements. According to Eq. 28, the number of required observations will increase when the Fisher information is fixed and the prior knowledge is improved, which is consistent with the results of Figure 2. Furthermore, we have seen that the prior width cannot be arbitrarily large if we want to employ certain states in an experiment. Thus, if we maximize the Fisher information at the expense of decreasing the maximum prior uncertainty, and the latter phenomenon is faster, then the number of observations will tend to infinity [42]. This is precisely the case of the family of one-mode states

$$|\psi\rangle = \sqrt{1 - \delta}\, |0\rangle + \sqrt{\delta}\, |N/\delta\rangle \qquad (29)$$

that was considered in [43], where 0 < δ < 1, N = n̄ and N/δ is an integer. To see this, we notice that the analysis of its periodicity for the unitary transformation U(θ) = exp[−i(a†a)θ] indicates that W_int ≈ 2πδ/n̄, which implies that Δθ²_p ≈ π²δ²/(3n̄²), while the quantum Fisher information is F_q = 4n̄²(1 − δ)/δ. Hence, we have that

$$\mu \gtrsim \frac{3}{4\pi^2\, \delta(1 - \delta)}. \qquad (30)$$

The Fisher information suggests that we can get an infinite precision in the limit δ → 0 for a fixed number of resources per observation n̄, but Eq. 30 shows that this conclusion only holds if the total number of resources is actually infinite, which is consistent with the results of [13,16]. From a physical point of view we conclude that it is not advantageous to use states for which the majority of our resources have to be employed in making our scheme as sensitive as the prior uncertainty that we already had.

Table I. Numerical values of W_int and µ_τ obtained in Figure 1 and Figure 2, respectively, for an asymptotically optimal strategy and a threshold ε_τ = 5.

  Probe state              n̄    W_int    µ_τ(W_int)    µ_τ(W₀ = π/3)
  |α/√2, −iα/√2⟩           2    π        3.9 · 10      4.97 · 10²
  NOON state (even N)      2    π/2      1.15 · 10²    2.67 · 10²
  NOON state (odd N)       1    π/2      5.26 · 10²    —
  S₁(r)S₂(r)|0,0⟩          2    π/2      8.74 · 10²    5.95 · 10²
  N(|r,0⟩ + |0,r⟩)         2    π/2      > 10³         > 10³

The representation of the posterior probability p(θ|n) for the squeezed entangled state that provides the value of its intrinsic width was very similar to that of the twin squeezed vacuum, and therefore it has been omitted in Figure 1 for brevity. In addition, note that we have chosen n̄ = 2 for most of our schemes in order to detect a significant improvement over the standard quantum limit.
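The criterion of Eq. 28 applied to this one-mode family makes the divergence explicit with a few lines of arithmetic, using F_q = 4n̄²(1 − δ)/δ and the prior variance π²δ²/(3n̄²); the values of δ and n̄ below are arbitrary illustrations:

```python
import numpy as np

def mu_min(delta, nbar):
    """Minimum mu from Eq. 28, with F_q = 4 nbar^2 (1 - delta)/delta and
    prior variance pi^2 delta^2 / (3 nbar^2); nbar cancels algebraically."""
    Fq = 4.0 * nbar**2 * (1.0 - delta) / delta
    prior_var = np.pi**2 * delta**2 / (3.0 * nbar**2)
    return 1.0 / (Fq * prior_var)    # equals 3 / (4 pi^2 delta (1 - delta))

for delta in (0.5, 0.1, 0.01, 0.001):
    print(delta, mu_min(delta, nbar=2))   # grows without bound as delta -> 0
```

The required number of observations is independent of n̄ and diverges as δ → 0, which is the quantitative content of the statement that the apparent infinite precision demands an infinite total amount of resources.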
To implement the last step that verifies the consistency of our numerical strategy, we need to calculate the alternative bounds that were introduced in Eq. 19 and Eq. 20. Figure 3 shows the results of this procedure. As we expected, both the quantum Ziv-Zakai and Weiss-Weinstein bounds are lower than the numerical mean square error, including the regions where the quantum Cramér-Rao bound fails. The reason is that these bounds are valid for both biased and unbiased estimators [13,14,31], and as such they correctly lower-bound the uncertainty for low values of µ, in contrast to the Cramér-Rao bound. Moreover, the Weiss-Weinstein bound is tight when µ ≫ 1, as proven in [14]. However, its rate of convergence is different from the exact rate obtained in Figure 2.b and 2.d, and the Ziv-Zakai bound is not perfectly tight in any regime. This justifies the use of the direct calculation of the mean square error as a more suitable strategy for this problem.

Figure 3. Quantum Ziv-Zakai and Weiss-Weinstein bounds together with the mean square error for: b) NOON state with n̄ = 2 and W_int = π/2, c) NOON state with n̄ = 1 and W_int = π/2, d) twin squeezed vacuum with n̄ = 2 and W_int = π/2, and e) squeezed entangled state with n̄ = 2 and W_int = π/2. This shows that the alternative bounds are valid for any µ. Interestingly, the Ziv-Zakai bound is tighter when µ ∼ 1, although the best choice in the asymptotic regime is the Weiss-Weinstein bound. In addition, the Weiss-Weinstein bound and the Cramér-Rao bound overlap for the squeezed entangled state, although they are different in the low-observation-number limit of the other probes.

V. CONCLUSIONS
We have explored the limitations of approximating the Bayesian mean square error by the quantum Cramér-Rao bound for practical scenarios that are relevant in quantum metrology. This study has been performed by simulating and calculating the mean square error exactly, a process that involves an analysis of the prior knowledge required by a given state and that provides an estimation for the number of observations that are needed to reach the asymptotic regime. Furthermore, we have shown that these results are consistent with the quantum Ziv-Zakai and Weiss-Weinstein bounds, which are always valid. This has allowed us to improve our understanding of both the non-asymptotic regime and the impact of the deviations that the asymptotic theory introduces in the overall performance.
We have applied this strategy to coherent, NOON, twin squeezed vacuum and squeezed entangled states for the estimation of phase shifts in optical interferometry, verifying that the conditions for approaching the Cramér-Rao bound crucially vary with the state of the system. Moreover, we have proposed a simple criterion to detect states whose required number of observations is infinite.
From the results of our simulations we can conclude that maximizing the Fisher information alone is not always enough to find the best precision. For instance, while a twin squeezed vacuum outperforms NOON states according to the Fisher information, we have found that this conclusion does not hold when the number of observations is low. Similarly, a squeezed entangled state is asymptotically better than the previous examples, but it is the worst choice for small values of µ. In fact, a coherent state with no correlations and a NOON state with fewer photons per observation outperform it when µ ∼ 10. An additional lesson extracted from Section IV is that future work should revisit the role of inter-mode and intra-mode correlations and the use of states with an indefinite number of quanta to enhance the precision in the non-asymptotic regime.
As a consequence, for a real experiment either we need to perform a fully Bayesian analysis, or we must estimate explicitly the number of observations required to guarantee that we are operating in the asymptotic regime if we want to follow the path of the Fisher information. This practice will improve the quality and fairness of the comparisons between states, helping us to understand the fundamental limits of estimation theory and aiding the design of quantum sensing protocols for quantum technologies.
Figure 4. Comparison between the prior uncertainty (µ = 0) given by a periodic error function and that associated with the mean square error, as a function of W₀. Most of our results in Section IV are calculated using the values W₀ = π/2 and W₀ = π/3.

$$\Delta\theta_p^2 = \frac{W_0^2}{12}, \qquad (A6)$$

which is the prior uncertainty that we would have found using the mean square error directly.
In Section IV we calculated the mean square error for NOON, twin squeezed vacuum and squeezed entangled states with W₀ = π/2, and W₀ = π/3 was also employed both with the previous states and for a coherent beam. According to Figure 4, which compares Eq. A5 and Eq. A6 as a function of the width W₀, the approximation is reasonable for these configurations when µ = 0. Moreover, |g(n) − θ| will not be greater than W₀ for µ > 0, and therefore a similar reasoning can be applied to Eq. 4. The only scheme for which this approximation is cruder is a coherent state with W₀ = π.
As a consequence, overall we can conclude that the results of Section IV are a reasonable numerical approximation to those that we would have obtained had we used the periodic error function instead, and they certainly constitute an improvement with respect to the usual asymptotic theory. Future work should provide an exact analysis of the non-asymptotic regime for phase estimation.
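The size of this approximation error at µ = 0 can be checked directly. The sketch below assumes a flat prior on [0, W₀] and takes the prior mean as the estimator (which is optimal here by symmetry), then averages both the periodic error 4 sin²[(g − θ)/2] and the squared error (g − θ)² over the prior:

```python
import numpy as np

def prior_errors(W0, n_grid=200001):
    """Average periodic error 4 sin^2[(g - theta)/2] and mean square error
    (g - theta)^2 over a flat prior on [0, W0], with estimator g = W0/2."""
    theta = np.linspace(0.0, W0, n_grid)
    dev = W0 / 2.0 - theta
    periodic = np.mean(4.0 * np.sin(dev / 2.0) ** 2)
    mse = np.mean(dev**2)            # approaches W0^2 / 12
    return periodic, mse

for W0 in (np.pi / 3, np.pi / 2, np.pi):
    per, mse = prior_errors(W0)
    print(W0, per, mse, (mse - per) / mse)   # relative deviation grows with W0
```

The relative deviation is at the percent level for W₀ = π/3 and W₀ = π/2 but noticeably larger for W₀ = π, in line with the observation that the coherent-state configuration is the crudest case of this approximation.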
Substituting this into Eq. 9, we arrive at

$$\epsilon(n) \approx \frac{1}{\mu F(\theta')}. \qquad (B7)$$

Finally, we notice that the states employed in this work satisfy F(θ) = F for all θ. Combining this fact with both Eq. B7 and $\int dn\, p(n) = 1$, we conclude that the optimal mean square error in Eq. 8 can be approximated by the Cramér-Rao bound in Eq. 12.