Probabilistic metrology or how some measurement outcomes render ultra-precise estimates

We show on theoretical grounds that, even in the presence of noise, probabilistic measurement strategies (which have a certain probability of failure or abstention) can provide, upon a heralded successful outcome, estimates with a precision that exceeds the deterministic bounds on the average precision. This establishes a new ultimate bound on the phase-estimation precision of particular measurement outcomes (or sequences of outcomes). For probe systems subject to local dephasing, we quantify this precision limit as a function of the probability of failure that can be tolerated. Our results show that the possibility of abstaining can mitigate the detrimental effects of noise.

Most quantum metrology schemes found in the literature, and their corresponding bounds, are deterministic. That is, these schemes are optimized to provide a valid estimate for each possible measurement outcome, in such a way that the average precision is maximized. Recently it has been shown that for a fixed probe state, and in the absence of noise, the precision of some particular (favorable) outcomes can be greatly enhanced, well beyond the limits set for deterministic strategies [32][33][34][35]. The possibility to post-select, i.e., to abstain from providing an estimate sometimes, can even change the uncertainty from standard-quantum-limit (SQL) to Heisenberg scaling. It has also been shown that the limit on the precision of these probabilistic metrology strategies agrees with that found for deterministic strategies when the optimization over probe states is performed. So, for pure states, probabilistic metrology can compensate for a bad choice of probe state, or in other words, it can attain the optimal precision bounds in situations where the probe state is a given.
Here we study the performance of probabilistic metrology in the presence of noise. We will show that probabilistic metrology can substantially lessen the effects of local dephasing noise, although not enough to overcome the infamous loss of asymptotic Heisenberg scaling [17,18]. In addition, and in contrast to the noiseless ideal case, the ultimate precision bounds for probabilistic metrology will be shown to exceed those attained by deterministic strategies optimized over probe states.
To put these results into context, we recall that in most quantum metrology schemes the probe is a composite made up of a large number of elementary quantum systems. We then envisage the following situation: an ensemble of fifty thousand two-level atoms has been prepared in a known probe state ρ and is awaiting a signal coming from a supernova, such as a gravitational wave or some byproduct of a gamma-ray burst. The experiment is designed in a way that the signal will leave an imprint on the state of the atoms, ρ → ρθ, that depends on the value of some relevant physical parameter θ of the signal. The experimenter will perform a measurement on the atoms and will try to infer from the outcome the unknown value of θ. The experimental setup is perfectly calibrated and characterized. At a certain time the long-awaited event occurs. Conditional on the obtained measurement outcome, the experimentalist reports a value of θ = 2.23 with a mean square error of σ² = 10⁻⁷. How should the community react if the reported error is smaller than the (deterministic) bounds found in the current literature, e.g., σ² < 1/n = 2 · 10⁻⁴?
The direct answer is that the community should celebrate the result without reservation. The error obtained in a single outcome can be smaller than the corresponding limits found in the literature, which are based on bounds on the average error over all outcomes. It is no surprise that the precision depends on what particular observation one happens to obtain: some observations are better, more informative, than others. The apparent contradiction disappears when one realizes that ultra-precise outcomes can only occur if infra-precise outcomes also exist (so as to respect the deterministic bounds).
Here we show that by a suitable choice of measurement it is possible to obtain ultra-precise outcomes, whose precision is still limited, but goes beyond the (average) bounds established for deterministic quantum metrology protocols. We also show that such ultra-precise outcomes can only occur with a small probability.
In a scenario where each outcome can have a different precision, the criterion for optimality is by no means unique. In a classical setting there is no compromising choice to be made: one can find the optimal estimator and precision for each measurement outcome independently. In the quantum case, however, one needs to fix the POVM through some criterion. The ultimate quantum limit is obtained by choosing a POVM that produces an outcome with the highest possible precision; deterministic protocols are optimized in order to produce the highest possible precision on average (over all possible outcomes). The protocols that we study here under the name of probabilistic quantum metrology interpolate between these two cases, by optimizing the precision of successful outcomes with the constraint that they occur with some prescribed probability.
Understanding the power of probabilistic operations in general quantum tasks is a highly non-trivial and relevant problem. Probabilistic operations introduce, through normalization, a very particular non-linearity that is in stark contrast with the linearity of deterministic quantum operations. Many no-go theorems stem from linearity, and probabilistic operations might revoke them, turning the once-thought impossible into possible. Notable examples include unambiguous state discrimination, whereby nonorthogonal states can be distinguished with no error [36] (see [37,38] for generalizations); probabilistic cloning [39]; and the KLM scheme [40], whereby Bell measurements can be realized by linear optical elements [41]. Also, related to the current work, we find probabilistic amplification [42][43][44][45][46][47] and weak-value amplification [48][49][50][51][52].
Although the probability of attaining the ultimate bounds is often small, at a fundamental level it is important to distinguish between ultimate versus de facto quantum limits. No matter how unlikely an event is, once it occurs it is a certainty; and certainties cannot violate ultimate bounds [53]. This fundamental distinction has also motivated the definition of a complexity class in quantum computing [54]. All in all, post-selection can be considered a resource per se in quantum information tasks, and this paper is devoted to the study of its power for metrological tasks in realistic noisy scenarios.
Our work becomes particularly relevant in applications where: a) There are high demands on precision. We are already at a stage where quantum metrology is required to push the limits of precision. Hence it might well be that a known optimal deterministic protocol, e.g., the phase covariant measurement, fails to provide the precision required for a specific task, whereas a probabilistic scheme does not. There are tasks for which having an estimate below a certain precision is at least as bad as having no estimate at all. For instance, when locating a tumor in radiation therapy, or some deeply buried magnetized material for its extraction, missing the true position of the target by more than a certain threshold value can have disastrous consequences. b) Resources are fixed. As in the first example above, it might be impossible to repeat an experiment (for a given instance of the unknown parameter).
In real-life applications, we care about the precision attainable in absolute terms. We wish to add consistent error bars to our estimates, not just demonstrate a particular scaling of the uncertainty as the number of resources increases. Our results hold both for finite and asymptotically large numbers of resources. Fixing the number of resources is not only necessary to state optimality in a clear-cut way. In many situations the limitations on the available resources are patent. It might be a given, as in the measurement of the magnetization of a particular magnet, or in the measurement of a parameter that changes rapidly with time (a requirement in some feedback schemes). Last but not least, there are situations in which the experiment is impossible to reproduce because the event under study is uncontrollable, e.g., a supernova or the arrival of a gravitational wave. It is precisely in observational astronomy where the (classical) Bayesian approach to statistical inference, on which our analysis relies, is widely used in current studies (see for instance [56]).
This paper is organized as follows. Section II describes the theoretical framework of this work. We state the general working assumptions and discuss the statistical approach(es) used throughout the paper. Section III contains the core findings of our work. To ease the presentation, we have divided it into subsections that contain the different results. We give general expressions for the uncertainty and probability of success for covariant measurements, closed expressions for asymptotically large numbers of resources, as well as the optimal probe states and the ultimate precision bounds. We also address the issue of the information left in the system after a discarded event. In the last section we state our main conclusions and discuss possible implementations of our scheme. More technical results are presented in the appendices; in particular, Appendix A introduces specific notation that is used in the derivation of some of these results.

II. FRAMEWORK
In this section we introduce our framework in detail. We set out our physical assumptions and goals, and discuss why, in view of these assumptions, the Bayesian formulation suits our purpose better, while also allowing for a straightforward extension to the probabilistic case. Alternatively, the minimax approach, or worst-case scenario, is also well suited. For the problem at hand, the latter is shown to be equivalent to the Bayesian formulation. The relationship with the frequentist approach is discussed at the end of the section.
Our framework consists of the following: a) A model. We assume there exists a model that formalizes the "state of the world" and the measurement. In our case the former is given by the quantum state ρ θ (see Sec. III) of a physical system of interest, where θ is a real parameter. It can, of course, take into account noise sources and other experimental imperfections. The measurement outcomes and the state of the world are related through a measurement model (Born's rule in our case) that gives the probability distribution of the outcomes for a given θ. The true value of θ is assumed to be unknown, while the rest of the model parameters are known with high precision. Our goal is to infer the value of θ from the observed measurement outcome.
b) A fixed amount of resources. We view the size of the probe state ρθ as the amount of resources of the problem. More precisely, we quantify the amount of resources by the number n of qubits that the state ρθ describes. Accordingly, if an experiment consists of repeating a measurement on independent copies of a system of N qubits a number ν of times, the total number of resources used is n = νN. c) An optimization over measurements. To establish ultimate bounds on the precision that can be achieved with given resources, we optimize over all measurements, the most general being a single collective measurement on the whole set of available resources, i.e., on ρθ. Hence, in our framework the inference protocols are single-shot. Namely, in each instance of the problem the state of the world is labeled by a particular (unknown) value of θ, and the measurement returns a single outcome χ out of the various possible outcomes of the measurement. Based on χ, an estimate θ̂χ of the true value of θ is produced.
Note that this characterization is fully general and includes strategies where each qubit or subsystem is measured individually, in which case every collective outcome is labelled by a sequence of individual measurement output labels. d) A report of precision. After a successful completion, the scheme should return an estimate θ̂ of the true parameter θ, together with suitable error bars. Error bars are essential in any scientific or technological discipline, as they quantify the confidence one should place in conclusions drawn from existing data. Such an assessment of the precision should be quantified bearing in mind that the whole set-up is single-shot, i.e., the experiment will not necessarily be repeated, and that the true value of the parameter is unknown. Hence, the precision assessment can only be based on the measurement outcome and on the precise knowledge of the model in a).
To carry out c) and d) we need to introduce a so-called cost or loss function ℓ(θ, θ̂) that quantifies how well our estimate θ̂ agrees with the true value of θ. There are a priori infinitely many such functions, but two common choices in metrology are the quadratic loss function ℓq(θ, θ̂) = (θ − θ̂)², and the periodic loss function ℓp(θ, θ̂) = 4 sin²[(θ − θ̂)/2] if θ belongs to a periodic domain. Note that they are equivalent to leading order in θ − θ̂, when the estimate approaches the true value, θ̂ ≈ θ.
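The stated equivalence of the two loss functions can be made explicit by a Taylor expansion (a standard computation, spelled out here for clarity):

```latex
\ell_p(\theta,\hat\theta) = 4\sin^2\!\frac{\theta-\hat\theta}{2}
  = (\theta-\hat\theta)^2 - \frac{(\theta-\hat\theta)^4}{12} + O\!\big((\theta-\hat\theta)^6\big)
  = \ell_q(\theta,\hat\theta)\,\Big[1 + O\big((\theta-\hat\theta)^2\big)\Big].
```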
The Bayesian formulation offers a very natural and rigorous way to assign a quantitative precision measure to a particular outcome θ̂. In this formulation the unknown parameter θ is treated as a random variable and is assigned a (prior) probability π(θ). This probability reflects the knowledge we have of the state of the world prior to the measurement. After performing the measurement, the observed outcome and our knowledge of the model are used to update the prior π(θ) to the posterior probability distribution p(θ|θ̂). Using Bayes' rule, we can write p(θ|θ̂) = p(θ̂|θ)π(θ)/p(θ̂), with p(θ̂) = ∫dθ p(θ̂|θ)π(θ). Then the uncertainty of an outcome θ̂ is defined as

L_θ̂ = ∫dθ ℓ(θ, θ̂) p(θ|θ̂).    (1)

Thus, L_θ̂ quantifies how the unknown value θ is scattered around its estimate θ̂, in light of the information gathered by the measurement. In general, the various outcomes of a given measurement have different precision. Hence, to quantify the overall performance of a metrology scheme by a single figure of merit we take the average uncertainty over all possible measurement outcomes,

L = ∫dθ dθ̂ ℓ(θ, θ̂) p(θ, θ̂),    (2)

where the joint probability is p(θ, θ̂) = p(θ̂|θ) π(θ). Eq. (2) is the expected loss L, given by the average over all possible values of the unknown parameter θ and over all measurement outcomes.
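The Bayesian update and the outcome-conditional uncertainty L_θ̂ are easy to evaluate numerically. The following minimal sketch (ours, not the paper's code) works on a discretized phase grid with a flat prior on (−π, π]; the likelihood p(χ|θ) ∝ cos²((θ − χ)/2) is an illustrative single-qubit model, not one taken from the text:

```python
import math

def outcome_uncertainty(chi, theta_hat, n_grid=2000):
    """Posterior p(theta|chi) from a flat prior on (-pi, pi], and the
    outcome-conditional uncertainty L = sum_theta ell_p(theta, theta_hat) p(theta|chi).
    The likelihood cos^2((theta - chi)/2) is an illustrative toy model."""
    thetas = [-math.pi + 2 * math.pi * (k + 0.5) / n_grid for k in range(n_grid)]
    prior = [1.0 / n_grid] * n_grid                       # uniform prior pi(theta)
    lik = [math.cos((t - chi) / 2) ** 2 for t in thetas]  # p(chi|theta)
    joint = [l * p for l, p in zip(lik, prior)]
    evidence = sum(joint)                                 # p(chi)
    post = [j / evidence for j in joint]                  # Bayes' rule
    loss = sum(4 * math.sin((t - theta_hat) / 2) ** 2 * p # periodic loss ell_p
               for t, p in zip(thetas, post))
    return post, loss
```

For instance, `outcome_uncertainty(chi=0.3, theta_hat=0.3)` returns the posterior (normalized to one) together with its uncertainty; for this particular likelihood the integral can also be done in closed form, which makes it a convenient sanity check.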
With this in mind, we now focus on probabilistic metrology. As discussed in the introduction, we can improve performance if we give up on the idea of deterministic protocols, by allowing for failures to perform the tasks they have been designed for. Accordingly, probabilistic metrology protocols will either succeed and provide a precise estimate θ̂, or warn of failure (abstain). Following these premises, the figure of merit for such protocols is given by the average uncertainty of the successful outcomes, i.e., by

L_s = ∫dθ dθ̂ ℓ(θ, θ̂) p(θ, θ̂|succ).    (3)

Conditional expectations such as this are the cornerstone of Bayesian estimation. Their use is widespread and established in a number of disciplines, such as control theory or signal processing, where an accurate and rigorous assessment of the precision is required; see for instance [57]. In order to give a complete characterization of the probabilistic protocol, one should supplement the attained uncertainty L_s with the corresponding probability of success, S. We will derive the tradeoff curve L_s(S) that gives the minimum uncertainty for every fixed value of the success probability S. In particular, by computing lim_{S→0} L_s(S) we will show that there is an ultimate quantum limit on the precision of an estimate inferred from any outcome of a quantum measurement.
At this point, it should be clear that a probabilistic protocol, as defined above, is not meant to be repeated until it succeeds [58]. Obviously, such a strategy would be ultimately deterministic (it will always end up providing an estimate) and, thus, it could not outperform the optimal deterministic protocol for the same total amount of resources. Only with some pre-established success probability can probabilistic metrology provide a guaranteed precision for a given amount of resources.
We next outline an alternative approach that is often used in quantum metrology and point out the differences with the global single-shot framework defined above. The so-called pointwise approach aims at minimizing the dispersion of the estimates θ̂ that results from the noise inherent to quantum measurements. It assumes that the true value of the parameter θ is fixed (i.e., it is not a random variable), and that the metrology protocol can be repeated an arbitrary number of times; it is a frequentist framework. It is customary to quantify the precision of the protocol by the mean square error MSE_θ, which indeed gives a measure of how the estimates would scatter around the true value if the protocol were repeated many times. Note that if a prior π(θ) were supplied, one could compute the average over θ, thus recovering the expression of the Bayesian expected loss in Eq. (2) for the quadratic loss function. The celebrated quantum Cramér-Rao bound [59] provides a lower bound to MSE_θ that can be readily computed. In addition, one can often argue that the Cramér-Rao bound can be attained in the asymptotic limit of a large number of resources by a suitable two-step adaptive protocol. However, the assumptions under which the quantum Cramér-Rao bound holds, and the conditions under which the bound is attained, entail some subtleties that are often ignored and that can lead to erroneous conclusions [60,61] and misleading accounting of resources (see for instance [62,63]). In the particular case of probabilistic metrology, the direct application of the pointwise approach can lead to unphysical results, as pointed out in [64].
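For reference, the standard pointwise definitions alluded to above read as follows (our reconstruction, using the usual conventions rather than the paper's own displayed equations):

```latex
\mathrm{MSE}_\theta = \int d\hat\theta\,(\hat\theta-\theta)^2\,p(\hat\theta|\theta),
\qquad
\mathrm{MSE}_\theta \;\ge\; \frac{1}{\nu\,F_Q(\rho_\theta)},
```

where $F_Q(\rho_\theta)$ is the quantum Fisher information of the probe state and $\nu$ is the number of independent repetitions of the protocol.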
The Bayesian approach has been widely used to assess the performance of quantum information protocols such as teleportation, state estimation, universal cloning and quantum memories. Despite its many advantages, which include a straightforward accounting of resources and its validity even for a small (non-asymptotic) number of resources, it also has some drawbacks: optimal bounds are usually hard to compute and there is no general prescription to choose the prior π(θ). In the case at hand (estimation of a phase θ) these drawbacks can be easily evaded, as the symmetry of the problem simplifies the calculations significantly while providing a valid justification (Laplace's principle of insufficient reason) to choose a uniform prior on (−π, π].
There is still another approach that suits our framework and does not require a prior distribution: the minimax approach, whereby the average over the unknown parameter θ is replaced by its worst-case value,

L_wc = min_{meas} max_θ ∫dθ̂ ℓ(θ, θ̂) p(θ̂|θ),    (5)

where the optimization is over all possible quantum measurements. As shown in Appendix F, for phase estimation the optimal worst-case loss, Eq. (5), and the expected loss, Eq. (3), are equivalent.

A. Optimal probabilistic measurement for n-qubits
Within the scope of this paper, metrology aims at estimating the parameter θ that determines the unitary evolution, U_θ := u_θ^{⊗n}, of a probe system of n qubits in the presence of local decoherence, where u_θ = exp(iθ|1⟩⟨1|).
As depicted in Fig. 1, the initial n-partite pure state |ψ⟩⟨ψ| = ψ (this shorthand notation will be used throughout the paper) is prepared and allowed to evolve. The state is affected by uncorrelated dephasing noise, which can be modeled by independent phase-flip errors occurring with probability p_f = (1 − r)/2 for each qubit. Its action on the n qubits is described by a map D that commutes with the Hamiltonian, so that it could equally well be understood as acting before or during the phase-imprinting process.
FIG. 1. Pictorial representation of a probabilistic metrology protocol with n qubits (depicted by small Bloch spheres). The probe state |ψ⟩, which need not be a product of identical copies, undergoes an evolution U_θ = u_θ^{⊗n} controlled by the unknown parameter θ. Experimental noise D decoheres the system before a collective measurement on all qubits is performed. The measurement apparatus either returns an ultra-precise estimate θ̂ of the parameter or shows a failure signal. In the event of a failure, some information could in principle be scavenged (see the last subsection in Results).
Next, the experimentalist performs a suitable measurement on ρθ = D(U_θ ψ U_θ†) and, based on its outcome, decides whether to abstain or to produce an estimate θ̂ for the unknown parameter θ. Note that this decision is based solely on the outcome of the measurement as, naturally, the actual value of θ is unknown to the experimentalist. Our aim is to find the optimal protocol, i.e., the measurement that gives the most accurate estimates for a given probe state and for a given maximum probability of abstention.
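The claim that the dephasing map D commutes with the phase imprinting can be checked explicitly on a single qubit. The sketch below (illustrative code of ours, not from the paper) applies the phase-flip channel D(ρ) = (1 − p_f)ρ + p_f ZρZ with p_f = (1 − r)/2, and the unitary u_θ = diag(1, e^{iθ}), in both orders:

```python
import cmath

def mat_mul(A, B):
    # 2x2 complex matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def u(theta):
    # phase imprinting u_theta = exp(i*theta*|1><1|) = diag(1, e^{i theta})
    return [[1, 0], [0, cmath.exp(1j * theta)]]

def dephase(rho, r):
    # phase-flip channel with p_f = (1 - r)/2; off-diagonals shrink by r
    pf = (1 - r) / 2
    Z = [[1, 0], [0, -1]]
    zrz = mat_mul(mat_mul(Z, rho), Z)
    return [[(1 - pf) * rho[i][j] + pf * zrz[i][j] for j in range(2)] for i in range(2)]

def evolve(rho, theta):
    return mat_mul(mat_mul(u(theta), rho), dagger(u(theta)))

# equatorial qubit (|0> + |1>)/sqrt(2)
rho = [[0.5, 0.5], [0.5, 0.5]]
a = dephase(evolve(rho, 0.7), 0.8)   # noise after the imprinting
b = evolve(dephase(rho, 0.8), 0.7)   # noise before the imprinting
```

Both orders produce the same state: the diagonal is untouched and the coherence is multiplied by r (here |ρ01| = 0.5 → 0.4), illustrating why D can be placed before or during the evolution.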
Motivated by the periodicity of the phase, we quantify the uncertainty of the estimated phase θ̂ by the periodic loss function ℓp(θ, θ̂) defined in Sec. II, and to assess the performance of the protocol we use the expected loss defined in Eq. (3) and Eq. (1),

σ² = (1/S) ∫dθ dθ̂ ℓp(θ, θ̂) p(θ, θ̂, succ),    (6)

where the success probability is S = ∫dθ dθ̂ p(θ, θ̂, succ). Throughout the rest of the paper we will refer to σ² as the uncertainty for brevity. The uncertainty and the probability of success S will fully characterize our probabilistic metrology strategies. In addition, in the asymptotic limit of a large number of resources the distribution p(θ|θ̂, succ) becomes peaked around the true value θ, and the uncertainty (expected loss) approximates the mean square error (expected loss for the quadratic loss function ℓq).
The set {ρθ} is a so-called covariant family of states [65], as it is generated by the action of a group of unitaries, {U_θ}_{θ∈(−π,π]} in our case. We also note that ℓp(θ, θ̂) is invariant under the same group action, namely, ℓp(θ + θ′, θ̂ + θ′) = ℓp(θ, θ̂) for all θ′ ∈ (−π, π]. Because of this, there is no loss of generality in choosing the measurement to be covariant [65]. Such covariant measurements are defined by

O_θ̂ dθ̂ = U_θ̂ Ω U_θ̂† dθ̂/(2π),

where Ω is the so-called seed of the measurement. In addition, we have the invariant measurement operator Π = 1 − ∫_{−π}^{π} dθ̂/(2π) U_θ̂ Ω U_θ̂† ≤ 1 that corresponds to the abstention event. With this, finding the optimal estimation scheme reduces to finding the operator Ω that minimizes the uncertainty for a fixed success probability [33],

σ² = (1/S) ∫ dθ̂/(2π) ℓp(0, θ̂) tr[U_θ̂ Ω U_θ̂† ρ],    (7)

S = tr[(1 − Π) ρ].    (8)

In deriving Eq. (7) we have used covariance to fix the value of θ to zero and thereby get rid of the integral over θ in Eq. (6), and have defined ρ = D(ψ) accordingly.
We now focus on probe states of n qubits that are initially prepared in a permutation-invariant state. This family includes most of the states considered in the literature, our case study of multiple copies of equatorial states, and also, as we will show below, the optimal probe state for probabilistic metrology. The input state is given by

|ψ⟩ = Σ_{m=−J}^{J} c_m |J, m⟩,    (9)

where J = n/2 is the maximum total spin angular momentum (hereafter spin for short) of n qubits and the set of states {|J, m⟩}_{m=−J}^{J} spans the fully symmetric subspace. Given the permutation invariance of the noisy channel, the state ρ = D(ψ) inherits the symmetry of the probe, and can be conveniently written in block-diagonal form in the total-spin bases [66,67],

ρ = ⊕_j p_j ρ_j ⊗ 1_j/ν_j,    (10)

where the state ρ_j has unit trace, p_j is the probability of ρ having spin j, and 1_j stands for the identity in the ν_j-dimensional multiplicity space of the irreducible representation of spin j. The sum over j in Eq. (10) runs from j_min = 0 (j_min = 1/2) for n even (odd) to the maximum spin J. Similarly, the measurement operators can be taken to have the same symmetry, and thus the same block-diagonal form. The minimum uncertainty σ²(S) for a fixed probability of success S can hence be expressed in terms of the uncertainty σ²_j(s_j) in each irreducible block and its corresponding success probability s_j,

σ²(S) = min_{ {s_j} } (1/S) Σ_j p_j s_j σ²_j(s_j), subject to Σ_j p_j s_j = S,

where σ²_j(s_j) [s_j] is defined by Eq. (7) [Eq. (8)] with Ω, ρ and U_θ̂ projected onto the subspace of total angular momentum j.
Integrating over θ̂, this formulation of the problem allows for a natural interpretation of the probabilistic protocol as a two-step process: i) a stochastic filtering channel that coherently transforms each basis vector as |j, m⟩ → f^j_m |j, m⟩, so that it modulates the input into a state with enhanced phase sensitivity, followed by ii) a canonical covariant measurement with seed Ω = Σ_j Σ_{m,m′} |j, m⟩⟨j, m′| ⊗ 1_j performed on the transformed state, from which the value of the unknown phase is estimated.
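Step i) is easy to simulate on the amplitudes within one spin block. The toy sketch below (ours; the Gaussian profile and function names are illustrative assumptions) applies a filter f_m to amplitudes c_m and reads off the heralded success probability as the norm of the filtered state:

```python
import math

def apply_filter(c, f):
    """Filter amplitudes c_m with 0 <= f_m <= 1.  Returns the heralded
    success probability s = sum_m |f_m c_m|^2 and the renormalized state."""
    assert all(0.0 <= fm <= 1.0 for fm in f)
    out = [fm * cm for fm, cm in zip(f, c)]
    s = sum(abs(a) ** 2 for a in out)
    return s, [a / math.sqrt(s) for a in out]

# Gaussian amplitude profile, flattened toward the phase-sensitive flat profile
M = 21
c = [math.exp(-((m - M // 2) ** 2) / 20.0) for m in range(M)]
norm = math.sqrt(sum(x * x for x in c))
c = [x / norm for x in c]
f = [min(c) / cm for cm in c]   # choose f_m so the filtered profile is flat
s, out = apply_filter(c, f)
```

Here the filter succeeds with probability s < 1 and leaves a perfectly flat amplitude profile, mimicking how abstention trades success probability for enhanced phase sensitivity.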
By defining the vector ξ_j with components ξ^j_m = f^j_m (ρ^j_{mm}/s_j)^{1/2}, and introducing the tridiagonal symmetric matrix H_j, with entries built from coefficients a^j_m, we can easily recast the former optimization problem as

σ²_j(s_j) = min_{ξ_j} ⟨ξ_j|H_j|ξ_j⟩,    (15)

subject to ⟨ξ_j|ξ_j⟩ = 1 and 0 ≤ ξ^j_m ≤ (ρ^j_{mm}/s_j)^{1/2}.    (16)
Note that the a^j_m, and in turn H_j, depend on the strength of the noise, but they take the same values for all symmetric probe states, since ρ^j_{mm′} ∝ c_m c_{m′}. For deterministic strategies (S = 1, i.e., s_j = 1 for all j) no minimization is required, and one only needs to evaluate the expectation value of H_j in the 'state' ξ^j_m = (ρ^j_{mm})^{1/2}.
For large enough abstention the problem becomes an unconstrained minimization, so σ²_j is the minimal eigenvalue of H_j and |ξ_j⟩ its corresponding eigenvector. From Eq. (16), we find that the corresponding filtering operation only succeeds with a certain probability s*_j, which in turn fixes an overall success probability S*. We will refer to S* as the critical success probability, since the precision does not improve by decreasing the success probability below this value: σ²(S) = σ²(S*) for S ≤ S*.
C. Asymptotic scaling: particle in a potential box

In order to compute the scaling of the uncertainty as the number of resources becomes very large, we need to solve the above optimization problem in the asymptotic limit n → ∞. We start by analyzing the uncertainty σ²_j(s_j) for blocks of large j. As shown in Appendix G, for each such block we define the ratios x = m/j, m = −j, −j+1, ..., j, which approach a continuous variable as j → ∞. In this limit, {√j ξ^j_m} approaches a real function of x, √j ξ^j_m → ϕ(x), and the expectation value in Eq. (15) becomes

σ²_j(s_j) = min_ϕ (1/j²) ∫_{−1}^{1} dx [ϕ′(x)² + V_j(x) ϕ²(x)],    (18)

where we have dropped some boundary terms that are irrelevant for this discussion, and where H_j := −d²/dx² + V_j(x) plays the role of a 'Hamiltonian', with a 'potential' V_j(x) whose explicit form follows from the continuum limit of H_j. Furthermore, in Eq. (18) the function ϕ(x) must also be differentiable and must satisfy the conditions

∫_{−1}^{1} dx ϕ²(x) = 1, 0 ≤ ϕ(x) ≤ s_j^{−1/2} φ̃(x),    (20)

where for a given large j we define φ̃(x) as the continuum profile of the unfiltered state, √j (ρ^j_{mm})^{1/2} → φ̃(x).    (21)

It is now apparent from Eqs. (18) through (21) that our optimization problem is formally equivalent to that of finding the ground-state wave function of a quantum particle in a box (−1 ≤ x ≤ 1) for the potential V_j(x), subject to boundary conditions that are fixed by the probe state, the strength of the noise, and the success probability. Other equivalent variational formulations can be found in [33,34,68] for pure states and in [69] for the pointwise approach.
Although our methods apply to general symmetric probes, for the sake of concreteness we study in full detail the paradigmatic case of a probe consisting of n identical copies of equatorial qubits,

|ψ_cop⟩ = [(|0⟩ + |1⟩)/√2]^{⊗n}.    (22)

Decoherence turns this symmetric pure state into a full-rank state, with a probability of having spin j given by a Gaussian-like distribution p_j; this approximation is valid around its peak, at the typical value j_0 = rJ. For each irreducible block, before filtering we have a signal that peaks at x = 0 with variance ⟨x²⟩ = (2rj)⁻¹,

φ̃(x) ≈ (2π⟨x²⟩)^{−1/4} exp[−x²/(4⟨x²⟩)].    (24)

For deterministic protocols (S = 1) the constraints completely fix the solution: ϕ(x) = φ̃(x). The corresponding uncertainty is obtained by computing the 'mean energy' σ²_j = ⟨H_j⟩_φ̃/j² in Eq. (18). For large j it is meaningful to use the harmonic approximation, V_j(x) ≈ V_j^0 + ω²_j x², with ω²_j = j(1 − r²)²/(4r). The leading contribution to σ²_j comes from the 'kinetic energy' [i.e., the first term in Eq. (18)], which gives ⟨p²⟩_φ̃ = (1/4)⟨x²⟩⁻¹ = jr/2, whereas the harmonic term gives a sub-leading contribution. One easily obtains σ²_j = (2jr)⁻¹. The leading contribution to the uncertainty of the deterministic protocol is given by σ²_j at the typical spin j_0: σ²_det = (2Jr²)⁻¹ = (nr²)⁻¹, in agreement with previously known (pointwise) bounds.
For unlimited abstention in a block of given spin j (s_j very small), the minimization in Eq. (18) is effectively unconstrained, and the solution (the filtered state) is given by the ground state ϕ_g(x) of the potential V_j(x). Within the harmonic approximation, we notice that the effective frequency of the oscillator grows as √j, and the corresponding Gaussian ground state is confined around x = 0 with variance ⟨x²⟩ = (r/j)^{1/2}(1 − r²)⁻¹. In this situation both the kinetic and harmonic contributions to the 'energy' are sub-leading, and so are the higher-order corrections to V_j(x). Thus, the uncertainty σ²_j for spin j is ultimately limited by the constant term V_j^0 of the potential. Up to sub-leading order one obtains σ²_j = (1 − r²)(2jr)⁻¹[1 + (r/j)^{1/2}]. The filtering of φ̃(x) into the Gaussian ground state ϕ_g(x) succeeds with probability s*_j ∼ e^{−2j log(1+r)} (see Appendix D). Note that in the absence of noise (r = 1) the potential V_j(x) vanishes and the ground state is solely confined by the bounding box −1 ≤ x ≤ 1. Then ϕ_g(x) = cos(πx/2), which results in a Heisenberg-limited precision (the ultimate pure-state bound): σ² = π²/n² [33,68].
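The particle-in-a-box picture is also easy to verify numerically. The sketch below (our own illustration, not the paper's code) discretizes H = −d²/dx² + V(x) on [−1, 1] with Dirichlet boundary conditions and finds the ground-state energy by inverse power iteration with a tridiagonal (Thomas) solve; for the noiseless case V = 0 it converges to π²/4, the 'energy' behind the σ² = π²/n² bound quoted above:

```python
import math

def ground_state_energy(V, M=400, iters=200):
    """Smallest eigenvalue of H = -d^2/dx^2 + V(x) on [-1, 1] with
    Dirichlet boundary conditions (finite differences, M interior points),
    found by inverse power iteration; assumes V(x) >= 0."""
    h = 2.0 / (M + 1)
    xs = [-1.0 + (i + 1) * h for i in range(M)]
    diag = [2.0 / h ** 2 + V(x) for x in xs]
    off = -1.0 / h ** 2
    v = [1.0] * M
    E = 0.0
    for _ in range(iters):
        # Solve H u = v with the Thomas algorithm (tridiagonal solve)
        c = [0.0] * M
        d = [0.0] * M
        c[0] = off / diag[0]
        d[0] = v[0] / diag[0]
        for i in range(1, M):
            denom = diag[i] - off * c[i - 1]
            c[i] = off / denom
            d[i] = (v[i] - off * d[i - 1]) / denom
        u = [0.0] * M
        u[-1] = d[-1]
        for i in range(M - 2, -1, -1):
            u[i] = d[i] - c[i] * u[i + 1]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
        # Rayleigh quotient E = <v|H|v>
        Hv = [diag[i] * v[i]
              + off * ((v[i - 1] if i > 0 else 0.0) + (v[i + 1] if i < M - 1 else 0.0))
              for i in range(M)]
        E = sum(a * b for a, b in zip(v, Hv))
    return E

E0 = ground_state_energy(lambda x: 0.0)   # noiseless case, V = 0
```

With V = 0 the result approaches π²/4 ≈ 2.467; dividing by j² = (n/2)² recovers the Heisenberg-limited σ² = π²/n². A noise-dependent V(x) can be plugged in the same way.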
If the optimal filtering is performed on typical blocks, j ≈ j_0, one obtains σ² = (1 − r²)/(nr²), which coincides with the ultimate deterministic bound found in [17,69]. This shows that a probabilistic protocol that uses the uncorrelated multi-copy probe state |ψ_cop⟩ can attain the precision bound of a deterministic protocol, for which a highly entangled probe is required. This bound is attained for a critical success probability S* ≈ s*_{j_0} ∼ e^{−nr log(1+r)}. More interestingly, we can push the limit further by post-selecting the block with highest spin (by choosing f^j_m ∝ δ_{j,J}) to obtain

σ² = (1 − r²)/(nr),    (25)

with a critical probability given by S* = p_J s*_J ∼ e^{−n log 2}, independently of the noise strength. We note that the leading order is a factor r smaller than the previously established (deterministic) bound, σ² = (1 − r²)/(nr²) [17,69]. This important enhancement in precision results from post-selection of high angular momentum, which does not commute with the noise channel. Hence, in contrast to the noiseless scenario, post-selection is not equivalent to a suitable choice of input state.
Having understood the two limiting cases of no abstention (the deterministic protocol) and unlimited abstention, we can now quantify the asymptotic scaling σ²(S) for an arbitrary success probability. We use the Karush-Kuhn-Tucker optimization method to minimize Eq. (18) under the constraints in Eq. (20). For a given value of s_j, the so-called complementary slackness condition [34,70] guarantees that the solution ϕ(x) of Eq. (18) saturates the inequality in Eq. (20) for x in a certain region called the coincidence set, while it coincides with an eigenfunction of the Hamiltonian H_j, defined after Eq. (18), for x outside this region. The continuity of ϕ(x) and its derivative provides matching conditions at the border of the coincidence set, and a unique solution can easily be found.
As shown in Figure 2, in the case of multiple copies the tails of ϕ(x) coincide with the Gaussian profile in Eq. (24) scaled by the factor s_j^{−1/2} for |x| > x_c (in the coincidence set), while for |x| < x_c the filter takes an active part in reshaping the peak into the optimal profile. Clearly, the wider the filtered region, the higher the precision and the abstention rate. A simple expression for the leading order can be obtained if we notice that with a finite abstention probability one can change the variance of the wave function in Eq. (24) but not its 1/j scaling. Hence, as in the deterministic case, only the kinetic energy and the constant term V_j^0 of the potential play a significant role. The solution can then be easily written in terms of the pure-state solution [34], which corresponds to a zero potential inside the box −1 ≤ x ≤ 1.

FIG. 2. Potential box equivalence:
Computing the action of the probabilistic filter and its precision is formally equivalent to computing the ground state and energy of a particle in a one-dimensional potential box. The state φ̃(x) (empty circles) before the probabilistic filter and the state ϕ(x) (solid circles) after the filter are represented together with the potential V(x) (diamonds), corresponding to j = nr/2 [see Eq. (19)], for success probability S = 0.75, noise strength r = 0.8, and n = 80 probe copies. The unfiltered state (empty circles) has been rescaled so that it coincides with the filtered state in the region |x| ≥ x_c = 9/32. The effective potential depends on the noise strength, as illustrated by the two additional dashed curves: for r = 0.2 (above) and r = 0.6 (below). Numerical (symbols) and analytical (lines) results are in full agreement.
where S̄ := 1 − S is the probability of abstention and σ²_pure is the uncertainty for pure states (r = 1) with an effective number of qubits n_eff = 2j_0. The prefactor r takes into account the scaling of the variance of the state in Eq. (24) as compared to the pure-state case. The first equality of Eq. (26) uses the fact that only abstention on blocks around the typical spin j_0 is affordable for finite S. This also fixes the value of S to be approximately s_{j_0}. The simple expression on the right of Eq. (26) is not an exact bound, but it does provide a good approximation for moderate values of S̄ (see Figure 3). We notice that for low levels of noise (r ≈ 1) one can already have a considerable gain in precision for finite abstention.
E. Finite n.
Up to this point we have given analytical results for asymptotically large n, the number of resources. In order to get exact values for finite n we need to resort to numerical analysis. The main observation here is that our optimization problem can be cast as a semidefinite program on the direct sum ⊕_j H_j. Semidefinite programs can be solved efficiently with arbitrary precision [70]. Figure 3 shows representative results for moderate, experimentally relevant numbers of qubits n. We plot the uncertainty as a function of the abstention probability S̄ for noise strength r = 0.8. We observe that for small values of n the precision increases (nσ² decreases) quite rapidly until the critical value S* is reached. Past this point the precision cannot be improved further. For larger n the initial gain is less dramatic, but the critical point (or plateau) is reached at higher abstention probabilities, hence allowing a higher precision to be attained. We see that for moderately large n, abstention can easily provide a 60% improvement in precision. When n is large enough, e.g., n = 20 (see the figure), there is a sharp improvement in precision as the success probability approaches the critical value. In the asymptotic limit, n → ∞, this gives rise to a critical behavior that interpolates between the ultimate precision limit, Eq. (25), and the precision for finite values of S̄, Eq. (26). Figure 4 shows the scaling of the uncertainty with the amount of resources, n, for a low level of noise, r = 95%, and different values of the abstention probability S̄. For low n all curves exhibit a similar n^{−1} (SQL) scaling. As we increase n, the curve corresponding to unlimited abstention (solid line) very soon shows a big drop, with a quantum-enhanced transient scaling given by n^{−(α+1)}, where α > 0 depends on the noise strength. For very large n (∼ 500) this curve saturates the ultimate asymptotic limit in Eq. (25) (blue dashed line), which again displays SQL scaling.
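The direct-sum structure over ⊕_j H_j is what keeps these numerics tractable: a ground-energy minimization over a block-diagonal operator decomposes into independent minimizations on each block. A toy numpy check, with random Hermitian blocks standing in for the actual H_j:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_herm(d):
    """Random real symmetric block, standing in for one H_j."""
    A = rng.normal(size=(d, d))
    return (A + A.T) / 2

blocks = [rand_herm(d) for d in (3, 5, 4)]

# Assemble the block-diagonal operator H = H_1 (+) H_2 (+) H_3
dim = sum(b.shape[0] for b in blocks)
H = np.zeros((dim, dim))
i = 0
for b in blocks:
    d = b.shape[0]
    H[i:i + d, i:i + d] = b
    i += d

# The global ground energy equals the best ground energy among the blocks,
# so each block can be treated independently
e_global = np.linalg.eigvalsh(H)[0]
e_blocks = min(np.linalg.eigvalsh(b)[0] for b in blocks)
print(e_global, e_blocks)
```

The same decomposition underlies the semidefinite formulation: the constraints and objective split block by block, so the problem size grows with the largest block rather than with the full Hilbert-space dimension.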
The numerical results for finite S̄ (circles, squares, diamonds) display the optimal scaling up to the point where they meet the asymptotic (dashed) straight lines given by Eq. (26). Past this point they fall on top of the corresponding straight lines, which display SQL scaling. The larger the abstention probability, the later this transition takes place. In addition, the figure shows the ultimate scaling for r = 99% to illustrate that for weaker noise the transient is more abrupt (α is larger).

FIG. 4 (caption, fragment): ... Eq. (26). The solid blue line corresponds to the ultimate limit (S̄ arbitrarily close to unity) for the same value of r, obtained via exact diagonalization of H_j in Eq. (15). Its asymptotic expression, given by the leading order in Eq. (25), is the straight line plotted in dashed blue. The ultimate limit for a lower level of noise, r = 0.99, is also plotted: the yellow dotted [dashed] line corresponds to the exact ultimate limit [its asymptotic leading-order expression in Eq. (25)].

F. Ultimate bound for metrology
So far we have studied the best precision bounds that can be attained for a fixed input state. A very relevant question of fundamental and practical interest is whether these bounds can be overcome by an appropriate choice of such a state. We answer this question in the negative: the precision bound given by the uncertainty in Eq. (25) is indeed the ultimate bound for metrology in the presence of local decoherence, and it can only be attained by a probabilistic strategy.
To this aim, we first show in Appendix B that for any probe state and any measurement that attain a certain precision (or, equivalently, σ²) with success probability S, we can find a new probe lying in the fully symmetric subspace (j = J) and a permutation-invariant measurement that attain the very same precision with the very same success probability. This shows that the formulation we have introduced, with probe states in the fully symmetric subspace, is actually completely general.
We now recall that the Hamiltonian H_j is independent of the choice of probe state and that this choice determines only the shape of the state φ̃(x) before filtering and the probability p_j of belonging to the subspace of spin j. Since the bound in Eq. (25) is attained by the ground state ϕ_g(x) of the potential V_J(x), the choice of probe cannot further improve the precision, only change the success probability. In particular, one might increase S by choosing a probe state that gives rise to the profile φ̃(x) = ϕ_g(x) for j = J, with no need for filtering within the block. In this case the critical success probability becomes S* = p_J = e^{−n[log 2 − log(1+r)]} (see Appendix E), which is larger than that attained by |ψ_cop⟩.
At the other extreme, for deterministic strategies, the calculation of σ²_opt(1) can be easily carried out by performing first the sum over j and then optimizing over the (n + 1)-dimensional probe state. In the continuum limit (large n) this calculation can again be cast as a variational problem, formally equivalent to finding the ground state of a particle in a box with the harmonic potential V(y) = n r^{−2}(1 − r²)(1 + y²), −1 ≤ y = m/J ≤ 1. The corresponding ground-state wave function and its energy provide, respectively, the optimal probe state and uncertainty. These results agree with their pointwise counterparts in [17, 69]. The presence of noise brings the pointwise and global approaches into agreement, as far as both the attainable precision and the optimal probe state are concerned. This agreement between the global and pointwise approaches has recently been shown to be a generic feature of noisy scenarios with shot-noise-limited precision [21]. This is in stark contrast with the noiseless case, where the probe ψ(y) = cos(yπ/2) is optimal for the global approach and gives σ²_opt = π²/n², while the NOON-type state |ψ⟩ = 2^{−1/2}(|J, J⟩ + |J, −J⟩) provides the optimal pointwise uncertainty σ²_opt = 1/n².

It remains an open question to find the optimal probe state for finite values of S̄. As argued above, a finite S̄ will only be able to moderately reshape the profile, without significantly changing the scaling of its width. We therefore expect the optimal state to be fairly independent of the precise (finite) value of S̄, and hence very close to that obtained for the deterministic case (S = 1). Numerical evidence (optimizing simultaneously over probes and measurements) suggests that this is indeed the case, provided S is not too small. We are thus led to conjecture that the optimal probe state is given by Eq. (30), independently of (finite) S̄, in agreement with Eq. (28) for asymptotically large n.
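As a consistency check of the limiting behaviour of this variational problem, the harmonic term in V(y) = n r^{−2}(1 − r²)(1 + y²) switches off as r → 1, so the noiseless box ground state cos(yπ/2) should be recovered. The sketch below verifies this numerically; mass and energy prefactors are deliberately omitted, so only the shape of the ground state (its overlap with the cosine profile) is meaningful here.

```python
import numpy as np

def box_ground_state(V, N=600):
    """Ground state of -d^2/dy^2 + V(y) on [-1, 1] with hard walls
    (finite differences; prefactors omitted in this sketch)."""
    y = np.linspace(-1, 1, N + 2)[1:-1]
    h = y[1] - y[0]
    H = (np.diag(2.0 / h**2 + V(y))
         - np.diag(np.full(N - 1, 1.0 / h**2), 1)
         - np.diag(np.full(N - 1, 1.0 / h**2), -1))
    phi = np.linalg.eigh(H)[1][:, 0]
    return y, phi / np.linalg.norm(phi)

n = 100
overlaps = {}
for r in (0.8, 0.999):
    # Harmonic potential of the deterministic variational problem
    y, phi = box_ground_state(lambda y, r=r: n * (1 - r**2) / r**2 * (1 + y**2))
    cosine = np.cos(np.pi * y / 2)
    cosine /= np.linalg.norm(cosine)
    overlaps[r] = abs(phi @ cosine)

print(overlaps)  # overlap with cos(pi y/2) approaches 1 as r -> 1
```

For strong noise the harmonic confinement squeezes the ground state into a narrower gaussian, and the overlap with the noiseless cosine profile drops accordingly.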
Note that the cosine prefactor guarantees that the solution converges to the optimal one for r → 1 and keeps the state confined in the box for all values of n and r. Such states continue to have a dominant typical value j = j_0, and in those blocks both the kinetic and harmonic contributions to the energy are of subleading order. Hence, for probes of the form in Eq. (30), the enhancement due to abstention is very limited, until very high abstention probabilities are reached, where one can afford to post-select high-spin states and attain the ultimate limit in Eq. (25).

G. Scavenging information from discarded events
The aim of probabilistic metrology is twofold. First, it should estimate an unknown phase θ encoded in a quantum state with a precision that exceeds the bounds of the deterministic protocols. Second, it should assess the risk of failing to provide an estimate at all (i.e., it should provide the probability of success/abstention). Probabilistic metrology protocols are hence characterized by a precision versus probability of success trade-off curve, or equivalently by σ²(S). As such, no attention is paid to the information on θ that might be available after an unfavorable outcome. Here, we wish to point out that one can attain σ²_opt(S) and still be able to recover, or scavenge, a fairly good estimate from the discarded outcomes (see Fig. 1).
The optimal scavenging protocol can be easily characterized in terms of the stochastic map F in Eq. (13), which describes the state transformation after a favorable event, and the analogous map F̄ associated with the unfavorable events, whose weights f̄^j_m are fixed by requiring that the combined map be complete: the addition of the two stochastic channels, F̄ + F, is trace-preserving, i.e., it describes a deterministic operation with no post-selection. The final measurement is given by the seed Ω defined after Eq. (13), for both favorable and unfavorable events. Thus, we can easily compute σ̄²(S) for the latter, as well as σ²_all(S), where all outcomes are included. Clearly, we must have σ²_all(S) ≥ σ²_det [58], as σ²_det refers to the optimal deterministic protocol.
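The completeness of the favorable/unfavorable pair can be checked on a toy diagonal filter. In this sketch the weights f_m are arbitrary hypothetical values in [0, 1] (not the optimal filter of Eq. (13)); the point is only that assigning the complementary weights 1 − f_m to the abstention branch makes the summed operation trace-preserving.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

f = rng.uniform(0, 1, size=d)            # hypothetical filter weights f_m in [0, 1]
A = np.diag(np.sqrt(f))                  # Kraus operator of the favorable branch
A_bar = np.diag(np.sqrt(1 - f))          # Kraus operator of the unfavorable branch

# A random density matrix to act on
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = M @ M.conj().T
rho /= np.trace(rho).real

success = A @ rho @ A.conj().T           # unnormalized state after a favorable event
failure = A_bar @ rho @ A_bar.conj().T   # unnormalized state after an unfavorable event

S = np.trace(success).real               # success probability on this state
print(S)

# Together the two branches form a deterministic (trace-preserving) operation,
# since A†A + A_bar†A_bar = 1
print(np.trace(success + failure).real)  # approx. 1, up to rounding
```

The unfavorable branch still carries information about θ, which is exactly what the scavenging protocol exploits.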
As shown in Figure 5, a protocol that is optimized for some probability of abstention S̄ performs only slightly worse when forced to always provide a conclusive outcome. In particular, we notice that if such a protocol is designed to work in the ultimate-limit regime, with uncertainty σ²_ult, which requires a very large abstention probability (S → 0) [58], its performance coincides with that of the optimal deterministic protocol. This observation actually follows (see Appendix I) from Winter's gentle measurement lemma (Lemma 9 in [71]), which states that a measurement outcome that occurs with near certainty causes only a little disturbance to the measured quantum state. This is in contrast with the claims in [58], where a random estimate is assigned to the discarded events.
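For reference, the lemma can be stated schematically as follows (the constant on the right-hand side differs between formulations; Winter's original bound is √(8ε)): if a POVM element Λ is obtained with probability at least 1 − ε on the state ρ, the post-measurement state is close to ρ in trace norm,

```latex
\operatorname{tr}(\Lambda\rho) \ge 1-\epsilon
\quad\Longrightarrow\quad
\bigl\|\,\rho - \sqrt{\Lambda}\,\rho\,\sqrt{\Lambda}\,\bigr\|_1 \le \sqrt{8\epsilon},
\qquad 0 \le \Lambda \le \mathbb{1}.
```

Applied here: when the success probability S vanishes, the abstention outcome occurs with probability 1 − S ≈ 1, so conditioning on it leaves the probe state, and hence the achievable precision, essentially unchanged.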

IV. DISCUSSION
We have shown that abstention or post-selection can counterbalance the adverse effects of noise in a metrology task. Our results are theoretical and concern abstract systems of n qubits. However, they apply to a variety of quantum metrology implementations, ranging from Ramsey interferometry for frequency standards [11, 12] and atomic magnetometry [5, 6] to quantum photonics (single- or multi-mode setups), where the number operator introduced here plays the role of the photon number.
Post-selection is already widely used for preparing quantum information resources, e.g., single photons from weak coherent pulses, heralded down-conversion for EPR-type states, or NOON states for metrology applications. Although some degree of post-selection is common in experiments, its tailored, optimized use is not fully exploited. Only recently have there been important developments in this direction, in the context of weak-value amplification [48, 49, 72]. We note in passing that these schemes can be considered a particular instance of our general setup, and hence are subject to our bounds.
The optimal probabilistic measurement presented here can be understood as a filtering process selecting the total angular momentum, followed by a modulating filter and a final standard covariant phase measurement. The latter can be implemented by the (almost) optimal adaptive scheme proposed in [55, 73]. The modulation could be implemented by sequential use of amplitude-damping channels, taking inspiration from recent experiments in state amplification [42, 44, 45]. In implementations that allow for individual control of the qubits, such as ion traps, the projection onto the angular momentum basis can be carried out efficiently [74]. For implementations with a lesser degree of control, the projection onto the fully symmetric subspace can, as a last resort, be implemented by post-selecting outcomes with this symmetry. For instance, a simple Stern-Gerlach measurement could lead to outcomes (m = J) with a precision beyond the deterministic limits.
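The simplest instance of such a symmetry post-selection can be made concrete for two qubits, where the fully symmetric subspace is the triplet (j = 1) sector. The following toy sketch builds the corresponding projector and computes the heralding probability for an arbitrary product state; the angles are hypothetical example values.

```python
import numpy as np

# Projector onto the symmetric (j = 1, triplet) subspace of two qubits,
# spanned by |00>, |11> and (|01> + |10>)/sqrt(2) -- a two-qubit toy
# stand-in for the fully symmetric subspace discussed in the text
s = np.zeros((4, 3))
s[0, 0] = 1.0                        # |00>
s[3, 1] = 1.0                        # |11>
s[1, 2] = s[2, 2] = 1 / np.sqrt(2)   # (|01> + |10>)/sqrt(2)
P_sym = s @ s.T

# Post-select a generic two-qubit product state onto the symmetric subspace
theta1, theta2 = 0.3, 1.1            # hypothetical single-qubit angles
psi1 = np.array([np.cos(theta1), np.sin(theta1)])
psi2 = np.array([np.cos(theta2), np.sin(theta2)])
psi = np.kron(psi1, psi2)

p_success = psi @ P_sym @ psi        # probability of the symmetric outcome
print(p_success)                     # equals 1 - sin^2(theta2 - theta1)/2
```

For product states the only rejected component is the singlet, so the heralding probability is always at least 1/2 in this two-qubit toy model.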
Regarding the implementation of our conjectured optimal probe state, one can use available nonlinear N²-type two-body interactions to turn the multi-copy gaussian profile into the wider optimal gaussian. Although our case study focuses on local dephasing noise, our methods can be adapted, and similar, if not greater, benefits are expected for more general and implementation-specific noise models, including correlated noise.
In conclusion, we have established the ultimate limits in precision reachable by any (deterministic or stochastic) quantum metrology protocol in a realistic scenario with local decoherence. We have derived the optimal bounds that can be reached when a certain rate of abstention is allowed, and have hence provided a full assessment of the risks and benefits of the probabilistic strategy. The benefits are clear both for finite and for asymptotically large numbers of copies, and the precision is strictly better than that attained by deterministic strategies, even with optimal preparation of probe states. The ultimate quantum metrology scaling limit is only reached with a large abstention rate; however, in that case we have shown that it is still possible to obtain estimates with standard (deterministic) precision from the discarded events. In this sense, seeking ultra-sensitive measurements is a low-risk endeavour.

V. ACKNOWLEDGEMENTS
This research was supported by the Spanish MINECO, contract FIS2013-40627-P, and the Generalitat de Catalunya CIRIT, contract 2014-SGR966.

... the standard Pauli matrix σ_z = diag(1, −1). For states of n qubits, this so-called dephasing channel D is most easily characterized through its action on the operator basis, where the parameter r is related to the error probability p_f through r = 1 − 2p_f. The effect of D on a general n-qubit state ρ = Σ_{b,b'} ρ_{b,b'} |b⟩⟨b'| can then be written as the Hadamard (or entrywise) product D(ρ) = D ◦ ρ, where D := Σ_{b,b'} r^{|b+b'|} |b⟩⟨b'|, and hereafter we understand that the sums over sequences run over all possible values of b (and b') unless otherwise specified. Note that the Hadamard product is basis-dependent.
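For a single qubit the entrywise form is easy to verify against the Kraus representation: dephasing with flip probability p_f multiplies the off-diagonal elements (in the σ_z basis) by r = 1 − 2p_f, which is exactly the Hadamard product with the matrix [[1, r], [r, 1]]. A minimal check, with arbitrary example values of p_f and the input state:

```python
import numpy as np

p_f = 0.15                    # single-qubit phase-flip probability (example value)
r = 1 - 2 * p_f               # noise parameter as defined in the text
Z = np.diag([1.0, -1.0])      # Pauli sigma_z

# A generic single-qubit density matrix
rho = np.array([[0.6, 0.2 - 0.1j],
                [0.2 + 0.1j, 0.4]])

# Kraus form of dephasing: with probability p_f a sigma_z error occurs
kraus = (1 - p_f) * rho + p_f * Z @ rho @ Z

# Entrywise (Hadamard) form: off-diagonal elements are multiplied by r
D = np.array([[1.0, r],
              [r, 1.0]])
hadamard = D * rho            # elementwise product in the sigma_z basis

print(np.allclose(kraus, hadamard))  # True: the two forms agree
```

The n-qubit channel is the n-fold tensor power of this map, which is what produces the weights r^{|b+b'|} in the entrywise matrix D.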
and the multiplicity is given accordingly; a^j_m in Eq. (14) then becomes Eq. (C7).
Appendix D: Relevant expressions for the multi-copy state.
If the input state is of the form given in Eq. (22), the probability to find the state in the fully symmetric subspace (j = J) is important when assessing the success probability of the ultimate bounds. Since the multiplicity for the maximum spin J is equal to one, the scaling of p_J can be readily obtained. The critical probability s*_j within a block can also be computed in the asymptotic limit j ≫ 1 from Eq. (17), where ξ^j_m is the gaussian ground state, with (ξ^j_j)² ∼ exp[−(1 − r²)j/(4r)]. For m = m' = j, Eq. (C2) gives D^j_{j,j} = (1 − r²)^{J−j}, which together with Eqs. (D1) and (D2) gives ρ^j_{j,j} ∼ exp[2j log(r + 1)]. This scaling dominates over that of ξ^j_m and hence determines the scaling of s*_j. From here we obtain the critical value for the overall success probability, S* = p_J s*_J ∼ e^{−n log 2}. (E3)

In the asymptotic limit the probability p_J can be estimated by noticing that the optimal distribution (ξ^j_m)² is much wider than n_{J−m}/D^J_{m,m} and can be replaced by (ξ^J_0)². Around m = 0 we can use the asymptotic formulas, which can be derived using the Stirling approximation and saddle-point techniques. Eq. (E4) also requires the Euler-Maclaurin approximation to turn the sum over k in Eq. (C2) into an integral, which can again be evaluated using the saddle-point approximation. Retaining only exponential terms, S* = p_J ∼ (1 + r)^n/2^n = e^{−n[log 2 − log(1+r)]}.
where |ψ_sym⟩ := Σ_{β=0}^{n} ψ_β |β⟩ ∈ H_+^{⊗n}. It follows from these results and Eq. (H1) that the very same uncertainty and success probability attained by any pair (|ψ⟩, Ω) of probe state and measurement seed is also attained by the state |ψ_sym⟩ ∈ H_+^{⊗n} and the fully symmetric seed Ω_sym. This completes the proof.

Now that we have learned that no boost in performance can be achieved by considering probe states more general than those in the symmetric subspace H_+^{⊗n} (the subspace of maximum spin j = J), we may wonder whether entangling the probe with some ancillary system could enhance the precision. Here we show that this possibility can be immediately ruled out, thus extending the generality of our result. For this purpose we take the general probe-ancilla state |Ψ⟩_PA = Σ_b ψ_b |b⟩|χ_b⟩, where the |χ_b⟩ are normalized (not necessarily orthogonal) states of the ancillary system. The action of the phase evolution and noise on the probe leads to a state of the form ρ_PA(θ) = Σ_{b,b'} r^{|b+b'|} e^{iθ(|b|−|b'|)} ψ_b ψ*_{b'} |b⟩⟨b'| ⊗ |χ_b⟩⟨χ_{b'}|. This state could just as well be prepared without an ancillary system by taking instead the initial probe state |ψ⟩ = Σ_b ψ_b |b⟩ and applying the trace-preserving completely positive map defined by |b⟩ → |b⟩|χ_b⟩ before implementing the measurement. This map can, of course, be interpreted as part of the measurement: it corresponds to a particular Neumark dilation of some measurement performed on the probe system alone, and hence is included in our analysis.
Appendix I: Scavenging at the ultimate precision limit.