Incompatibility in Quantum Parameter Estimation

In this paper we introduce a measure of genuine quantum incompatibility in the estimation task of multiple parameters, which has a geometric character and is backed by a clear operational interpretation. This measure is then applied to some simple systems in order to track the effect of a local depolarizing noise on the incompatibility of the estimation task. A semidefinite program is described and used to numerically compute the figure of merit when the analytical tools are not sufficient; among these tools we include an upper bound computable from the symmetric logarithmic derivatives only. Finally we discuss how to obtain compatible models for a general unitary encoding on a finite dimensional probe.


Introduction
Quantum metrology [1][2][3][4][5] is a branch of quantum information theory that focuses on the possibility of using quantum effects to improve the accuracy of conventional estimation procedures. Thanks to the huge variety of potential applications (which among others include the probing of delicate biological systems [6], optical interferometry [7,8], gravitational wave detection [9,10], magnetometry [11][12][13][14][15] and atomic clocks [16][17][18]), this research field is likely to play a fundamental role in the looming quantum technology revolution. As evident from the seminal works of Holevo [19] and Helstrom [20], it can be thought of as a quantum counterpart of Experimental Design [21,22]. Specifically, the main goal of quantum metrology is to efficiently plan different types of experiments by minimizing the effort invested to overcome noisy fluctuations that originate from fabrication errors, external fields, microscopic degrees of freedom that are only statistically taken into account, and intrinsic limitations related to the formal structure of quantum theory itself (e.g. the Heisenberg uncertainty principle). In recent years many significant results have accumulated in the domain of multi-parameter quantum metrology [23,24], i.e. processes where an agent tries to recover two or more attributes of a physical system (modeled by real numbers) via properly chosen measurements. The first studies that sparked experimental interest in this subject concerned the joint estimation of phase and phase diffusion [25][26][27][28][29], quantum imaging [30][31][32][33][34][35][36][37], and magnetometry [38,39].
What makes the problem intriguing is that in a purely quantum setting, due to constraints ultimately related to the incompatibility of non-commuting observables [40], it could be that an efficient experiment for the determination of one specific parameter leads to poor results in the precision of the others (while this may also be true in classical mechanics, there the phenomenon is related to the technological limits of the experimenter, and there is no reason to believe it to be fundamental). The aim of the present work is to quantify the genuine quantum incompatibility associated with the estimation task of multiple parameters. The analysis is then applied to some simple systems of qubits and qutrits in order to track the effect of a local depolarizing noise on the incompatibility of the estimation task. A semidefinite program is described and used to numerically compute the figure of merit when the analytical tools are not sufficient. Finally we notice that the strategies that allow us to encode information without incompatibility in the two-qubit scenario can be generalized to the case of a general unitary encoding on a finite dimensional probe. Before proceeding with the presentation, we add a clarification on terminology: with "quantum parameter estimation" we denote the task of extracting a parameter encoded on a given fixed state of a quantum system, while "quantum metrology" means that we have the possibility of choosing the probe that will undergo the encoding process. In this perspective the problem of parameter estimation is hence a sub-problem of quantum metrology. In this paper we will take the probe to be fixed and therefore we will be dealing with parameter estimation.
An outline of the manuscript follows. In section 2 we introduce the setting of quantum metrology, and isolate the form of incompatibility that we will characterize later on. In section 3.1 the incompatibility figure of merit for quantum estimation is defined and its well-definedness is proved in section 3.2. The geometric interpretation of the figure of merit is presented in section 3.3 and in section 3.4 we express it in terms of the Holevo-Cramér-Rao bound [19,41], proved to be achievable thanks to the quantum central limit theorem and the quantum local asymptotic normality (QLAN) [42][43][44][45][46][47][48]. This allows us to compute the incompatibility via the semidefinite program (SDP) reported in Appendix C, which is derived from the one presented in [49]. In section 3.5 an analytic upper bound for the incompatibility is presented, and in section 3.6 a version of the figure of merit for separable measurements is given. Section 4 is dedicated to some examples with systems of qubits and qutrits subject to local depolarizing noise; here we put the semidefinite program to work and observe some peculiar behavior of the incompatibility. In section 5 we describe three strategies to build a compatible statistical model for a quantum metrological task involving D-dimensional probe states. The mathematical environments Definition, Theorem, and Corollary will be used to highlight the most important concepts that we introduce.

Setting and definitions
A prototypical example of multi-parameter quantum metrology is provided by magnetometry [11][12][13][14][15][50], where a spin particle is used as a probe for evaluating the three components of a magnetic field $B := (B_x, B_y, B_z)$. In the most basic scenario the evolution of the particle is given by the unitary transformation $U_B := \exp[i(B_x S_x + B_y S_y + B_z S_z)t]$, where $S_i$ for $i = x, y, z$ are the components of the spin. By measuring the evolved state of the probe we can hence try to infer the values of $B_x$, $B_y$, $B_z$ by post-processing the measurement outcomes. What makes this procedure truly quantum in nature is that, for a fixed number of experimental repetitions, due to the non-commuting nature of the generators $S_i$, any attempt to improve the estimation accuracy of one of the Cartesian components of $B$ will have a negative impact on the accuracies of the other two [51]. An exact formalization of this problem can be obtained by considering a more general model where one is asked to determine $d$ parameters $\theta := (\theta_1, \theta_2, \dots, \theta_d) \in \Theta$ (an open subset of $\mathbb{R}^d$) that have been encoded in the input state $\rho$ of a probing quantum system via a mapping of the form $\rho \mapsto \rho_\theta := \mathcal{E}_\theta(\rho)$ (1), where now $\mathcal{E}_\theta$ is a completely positive, trace-preserving (CPT) transformation [52] which parametrically depends on $\theta$ and which, at variance with the simplified scenario detailed at the beginning of the section, might include noise disturbing the process. Given $N$ copies of $\rho_\theta$ we can now try to recover the needed information by performing on them some (possibly joint) positive operator valued measure (POVM) $M_N := \{M_N(\hat\theta)\}_{\hat\theta}$ whose elements are labelled by a classical outcome variable $\hat\theta$ that, without loss of generality [53], can be assumed to belong to the same set $\Theta$ as $\theta$. Accordingly $M_N$ can be thought of as an operation which, starting from $\rho_\theta^{\otimes N}$, induces a measure on $\Theta$, defined by the conditional probability distribution $p_N(\hat\theta|\theta) := \mathrm{Tr}[M_N(\hat\theta)\,\rho_\theta^{\otimes N}]$ (2), with the stochastic outcome $\hat\theta := (\hat\theta_1, \hat\theta_2, \dots, \hat\theta_d) \in \Theta$ playing the role of the estimator of $\theta$.
The two most important properties of the estimator $\hat\theta$ are the bias vector $b^{(N)}(\theta) := E[\hat\theta] - \theta$ (3) and the mean square error (MSE) matrix $\Sigma^{(N)}_{ij}(\theta) := E[(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)]$ (4), with $E$ representing the statistical average computed with the probability measure in (2). Ideally we would like to deal with estimators that are unbiased, meaning that $b^{(N)}(\theta) = 0$ for all $\theta \in \Theta$, but this may not always be possible. Accordingly in what follows we shall focus on sensing, i.e. we shall measure small variations of the parameters $\theta$ around a known value and assume that we are allowed to employ locally unbiased POVMs at such special point, that is measurements whose bias vector $b^{(N)}(\theta)$ satisfies the conditions $b^{(N)}_j(\theta) = 0$ and $\partial_i b^{(N)}_j(\theta) = 0$ for all $i, j$ (5). For these measurements the quantum Cramér-Rao (QCR) bound [3,54] gives a limit on the precision of the sensing task, formulated as a lower bound on the associated MSE matrix, i.e. $\Sigma^{(N)}(\theta) \ge F^{-1}(\theta)/N$ (6).
In this expression $F(\theta)$ is the so-called quantum Fisher information (QFI) matrix, which does not depend on the selected POVM $M_N$ and whose elements can be computed as $F_{ij}(\theta) := \frac{1}{2}\mathrm{Tr}[\rho_\theta \{L_i(\theta), L_j(\theta)\}]$ (7), with $\{\cdot,\cdot\}$ the anticommutator and $L_i(\theta)$ the symmetric logarithmic derivative (SLD) [54] associated to the $i$th component of the parameter vector $\theta$, i.e. the operator (possibly dependent on $\theta$) fulfilling the identity $\partial_i \rho_\theta = \frac{1}{2}[L_i(\theta)\rho_\theta + \rho_\theta L_i(\theta)]$ (8). For a pure state $\rho_\theta = |\psi_\theta\rangle\langle\psi_\theta|$ the above equation admits as solution $L_i(\theta) = 2\,\partial_i \rho_\theta$ (9), while an explicit solution for a general $\rho_\theta$ is given in [54] (10). Throughout the paper we will assume the QFI to be finite (i.e. $F(\theta) < \infty$) and non-singular (i.e. $F(\theta) > 0$). In particular the last requirement imposes that the maximum value of $d$ (the number of parameters) we can allow in our study is upper bounded by $D^2 - 1$, with $D$ being the dimension of the Hilbert space associated with the probing system (indeed values of $d$ greater than this limit necessarily force a linear dependence between the SLD operators $L_i(\theta)$, leading to a singular QFI matrix).
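As a concrete illustration of the pure-state case, the following minimal sketch evaluates the QFI matrix from the state derivatives, using the standard identity $F_{ij} = 4\,\mathrm{Re}(\langle\partial_i\psi|\partial_j\psi\rangle - \langle\partial_i\psi|\psi\rangle\langle\psi|\partial_j\psi\rangle)$ that follows from the pure-state SLD. The helper names, the choice of generators $\sigma_x$, $\sigma_y$ and the probe $|0\rangle$ are assumptions made only for this example.

```python
# Sketch: QFI matrix of a pure state from its parameter derivatives.
# Helper names are hypothetical; states are lists of complex amplitudes.

def inner(u, v):
    """Hermitian inner product <u|v>."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(op, psi):
    """Apply a matrix (list of rows) to a state vector."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def pure_state_qfi(psi, dpsi):
    """F_ij = 4 Re( <d_i psi|d_j psi> - <d_i psi|psi><psi|d_j psi> )."""
    d = len(dpsi)
    return [[4.0 * (inner(dpsi[i], dpsi[j])
                    - inner(dpsi[i], psi) * inner(psi, dpsi[j])).real
             for j in range(d)] for i in range(d)]

# Assumed generators: U = exp[i (theta_1 sigma_x + theta_2 sigma_y)].
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

psi0 = [1 + 0j, 0j]  # assumed probe state |0>
# At theta = 0 the derivatives are d_j |psi> = i H_j |psi>.
dpsi0 = [[1j * c for c in apply(H, psi0)] for H in (SX, SY)]

F = pure_state_qfi(psi0, dpsi0)
```

For this particular choice the QFI matrix is proportional to the identity, so both parameters are encoded with the same sensitivity.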

Achievability of the multi-parameter QCR bound
In general the multiparameter QCR bound (6) cannot be saturated, meaning that there is no locally unbiased POVM $M_N$ with an MSE matrix $\Sigma^{(N)}(\theta)$ equal to $F^{-1}(\theta)/N$ or, equivalently, which is capable of saturating the inequality $\mathrm{Tr}[G\,\Sigma^{(N)}(\theta)] \ge C_S(G,\theta)/N$, with $C_S(G,\theta) := \mathrm{Tr}[G\,F^{-1}(\theta)]$ (11), for all choices of a positive weight matrix $G \ge 0$. This is the form of metrological incompatibility that will be extensively studied in this paper. In order to better appreciate its meaning, suppose that we are interested in the estimation of an analytic function $f \in C^\omega(\Theta)$ of the unknown parameter vector $\theta$. The function $f$ will be evaluated on the estimator $\hat\theta$ extracted from the observations. By expanding $f(\hat\theta) - f(\theta)$ to first order around $\theta$ we get for the mean square error the expression $E[(f(\hat\theta) - f(\theta))^2] \simeq \sum_{ij} \partial_i f(\theta)\,\Sigma^{(N)}_{ij}(\theta)\,\partial_j f(\theta)$, which can be equivalently written as $\mathrm{Tr}[G\,\Sigma^{(N)}(\theta)]$, where we introduced the rank-1 matrix $G := |\partial f(\theta)\rangle\langle\partial f(\theta)|$, with $|\partial f(\theta)\rangle$ the vector of components $\partial_i f(\theta)$. Written in this form we can now use (11) to cast a bound on the accuracy of the estimation of $f(\theta)$. As a matter of fact a rank-1 $G$ can always be thought of as the weight matrix of some function $f(\theta)$. We will see that according to our definitions a rank-1 $G$ manifests no incompatibility: the error associated to a single tangent vector $|\partial f(\theta)\rangle$ on the statistical manifold can saturate the bound set by the QFI (this can be understood e.g. from the upper bound (28) discussed in section 3.5 below, which, for $G$ of rank 1, collapses to $C_S(G, \theta)$). On the contrary the gap manifests itself when the weight matrix $G$ has rank at least 2. This situation arises when we try to estimate at the same time multiple functions of the parameters $\theta$, named $f_i(\theta)$, which could also just be the components $\theta_1, \theta_2, \dots, \theta_d$ of the vector $\theta$. To each of the functions we associate a weight $g_i \ge 0$; the total error is then the weighted sum of the errors for the estimation of each $f_i(\theta)$, i.e. $\sum_i g_i\,E[(f_i(\hat\theta) - f_i(\theta))^2] \simeq \mathrm{Tr}[G\,\Sigma^{(N)}(\theta)]$ with $G := \sum_i g_i\,|\partial f_i(\theta)\rangle\langle\partial f_i(\theta)|$.

Incompatibility measure
In this section we introduce a figure of merit to gauge the incompatibility of multiparameter estimation procedures, which is based on the assumption that the agent is allowed to perform on the probes arbitrary locally unbiased POVMs. After showing its well-definedness we clarify its interpretation in the framework of information geometry. We then provide a semidefinite program to compute this incompatibility measure and an analytical upper bound. The figure of merit is then generalized to separable measurements.

Definition of the figure of merit
Given the encoding (1) and a generic weight matrix $G$, from (11) it follows that a bona fide evaluation of the precision attainable with a locally unbiased POVM $M_N$ can be obtained by considering the ratio $r_N(G, M_N, \theta) := N\,\mathrm{Tr}[G\,\Sigma^{(N)}(\theta)]/C_S(G,\theta) \ge 1$ (16), where $\Sigma^{(N)}(\theta)$ is the MSE matrix (4) associated with $M_N$. As indicated by the notation, the quantity (16) exhibits an explicit functional dependence on $G$ and $M_N$, which we remove by considering the term $r_N(\theta) := \inf_{M_N \in \mathcal{M}^{(LU)}_N} \sup_{G \ge 0} r_N(G, M_N, \theta)$ (17), where now $\mathcal{M}^{(LU)}_N$ is the set of locally unbiased POVMs acting on $N$ probes, and the supremum and the infimum are taken to compute the best worst case scenario, as in a typical min-max definition [55]. The figure of merit $r_N(\theta)$ quantifies the competition between optimal measurements for different parameters, and has a clear operational meaning. Because of the QCR bound in (16) we have $r_N(\theta) \ge 1$. The asymptotic figure of merit is then defined as $r(\theta) := \liminf_{N \to \infty} r_N(\theta)$ (18), which always exists and from $r_N(\theta)$ inherits the property $r(\theta) \ge 1$. In particular we have $r(\theta) = 1$ if and only if there exists a sequence of $M_N \in \mathcal{M}^{(LU)}_N$ which, for all $G \ge 0$, allows us to saturate the inequality (16) asymptotically in $N$. It is worth noticing that the incompatibility figure of merit could be defined for locally asymptotic covariant (LAC) measurements [42,48] as well, and it would be exactly equal to $r(\theta)$; see Appendix A for the details.

Well-definedness of the figure of merit
We now briefly show that $r_N(\theta)$ in (17) is invariant under reparametrization and therefore a well-defined quantity. This property carries over to $r(\theta)$, which is therefore a well-defined property of the statistical manifold (see section 3.3). Consider a reparametrization $\theta = \theta(\eta)$ having an invertible Jacobian $J_{ij} := \partial\theta_i(\eta)/\partial\eta_j$. Then the MSE matrix for the parameters $\theta$, defined in (4), can be written as $\Sigma^{(N)}(\theta) = J\,\Sigma^{(N)}(\eta)\,J^t$, where $\Sigma^{(N)}(\eta)$ is the MSE matrix for the parameters $\eta$. Similarly we write the inverse of the QFI matrix as $F^{-1}(\theta) = J F^{-1}(\eta) J^t$, with $F(\eta)$ computed from the symmetric logarithmic derivatives $L_i(\eta)$, which differ from the definition in (8) only in that the derivatives are taken with respect to $\eta_i$. It is now easy to show that $r_N(\theta) = r_N(\eta)$. By the cyclic property of the trace, the action of the Jacobian matrix on the MSE matrix and on the inverse QFI can be moved onto $G$, which becomes $J^t G J$ both in the numerator and in the denominator of the ratio in (16), where it multiplies $\Sigma^{(N)}(\eta)$ and $F^{-1}(\eta)$ respectively. Then we observe that the set of positive matrices is invariant under congruence by an invertible matrix, i.e. $J^t\{G \ge 0\}J = \{G \ge 0\}$, and therefore we get $r_N(\theta) = r_N(\eta)$. It is worth stressing that, by construction, the quantity $r(\theta)$ only depends on the input probe state $\rho$, the encoding $\mathcal{E}_\theta$, and the specific point of interest $\theta$. It is an intrinsic property of the statistical manifold defined by the trajectories (1). The need for a reparametrization invariant measure of incompatibility was already pointed out in [56], in which the figure of merit $r(\theta)$ was independently discovered.
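The key step of this argument, moving the Jacobian onto the weight matrix, can be checked numerically for one fixed $G$. The sketch below is only an illustration: the matrices are arbitrary values chosen for the example, and the function `ratio` is a hypothetical name for the quotient $N\,\mathrm{Tr}[G\Sigma]/\mathrm{Tr}[GF^{-1}]$.

```python
# Numerical check of reparametrization invariance for one fixed weight matrix.
# All matrices below are arbitrary illustrative values (assumptions).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def ratio(G, sigma, f_inv, n_copies=1):
    """N Tr[G Sigma] / Tr[G F^-1], the quotient entering the figure of merit."""
    return n_copies * trace(matmul(G, sigma)) / trace(matmul(G, f_inv))

sigma_eta = [[2.0, 0.5], [0.5, 1.0]]   # MSE matrix in the eta coordinates
f_inv_eta = [[1.0, 0.2], [0.2, 0.5]]   # inverse QFI in the eta coordinates
J = [[1.0, 2.0], [0.0, 3.0]]           # invertible Jacobian d(theta)/d(eta)
Jt = transpose(J)

# Transformation rules: Sigma(theta) = J Sigma(eta) J^t, same for F^-1.
sigma_theta = matmul(matmul(J, sigma_eta), Jt)
f_inv_theta = matmul(matmul(J, f_inv_eta), Jt)

G = [[1.0, 0.3], [0.3, 2.0]]           # weight matrix in theta coordinates
G_eta = matmul(matmul(Jt, G), J)       # congruent weight matrix J^t G J

r_theta = ratio(G, sigma_theta, f_inv_theta)
r_eta = ratio(G_eta, sigma_eta, f_inv_eta)
```

The two quotients agree up to rounding, so optimizing over all $G \ge 0$ in either coordinate system explores the same set of values.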

Geometric interpretation
The parameters $\theta \in \Theta$ can be interpreted as coordinates defining, via the map (1), a submanifold of the space of states $S(H)$, called the statistical manifold. The QFI matrix, being positive definite under our assumptions, can be thought of as a Riemannian metric on this manifold. This metric is generally non-trivial, as it explicitly depends on the coordinates $\theta$ and may have intrinsic curvature. The QFI is said to be a distinguishability metric [57,58]: given two very near states $\rho_\theta$ and $\rho_{\theta+d\theta}$, their infinitesimal distance in the QFI metric is $ds^2 = \sum_{ij} F_{ij}(\theta)\,d\theta_i\,d\theta_j$, which is negatively correlated with the fidelity $F(\rho_\theta, \rho_{\theta+d\theta})$ between $\rho_\theta$ and $\rho_{\theta+d\theta}$, defined as $F(\rho, \sigma) := (\mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}})^2$ [59]. In order to gain information about $\theta$ it is thus better to choose the probe state $\rho$ such that in the statistical manifold the encoded state $\rho_\theta$ is highly distinguishable from its neighbors $\rho_{\theta+d\theta}$, and has therefore the highest possible statistical distance from them. This picture clarifies why the inverse of the distinguishability metric, i.e. $F^{-1}(\theta)$, gives the precision to which a single point $\theta$ can be identified in $\Theta$, given the quantum state $\rho_\theta$. For $\rho_\theta^{\otimes N}$ the relevant metric is $F^{-1}(\theta)/N$. When a measurement is performed and an estimator $\hat\theta$ is chosen there is a new Riemannian metric on the statistical manifold: the positive semidefinite matrix $\Sigma^{(N)}(\theta)$. The key question is whether one can find a POVM $M_N \in \mathcal{M}^{(LU)}_N$ with an MSE metric that fully adapts to the underlying quantum metric $F^{-1}(\theta)/N$ of the manifold, i.e. whether the inequality (6) can be saturated (at a certain point $\theta$). In general this is not possible. Let us introduce a representation of $G$ as a sum of projectors $|v_i\rangle\langle v_i|$, each weighted with $g_i \ge 0$, i.e. $G := \sum_i g_i\,|v_i\rangle\langle v_i|$, so that $\mathrm{Tr}[G\,F^{-1}(\theta)]/N = \sum_i g_i\,\langle v_i|F^{-1}(\theta)|v_i\rangle/N$. According to the above expression, the information content is a weighted combination of the distinguishability of the manifold in the different directions defined by $|v_i\rangle$; see figure 1 for a 2D representation.
This has to be compared with the experimental weighted distinguishability, i.e. $\mathrm{Tr}[G\,\Sigma^{(N)}(\theta)] = \sum_i g_i\,\langle v_i|\Sigma^{(N)}(\theta)|v_i\rangle$, given by a particular measurement. The whole point of the non-commutative nature of the manifold is the impossibility of saturating the distinguishability in more than one direction at the same time. By taking the supremum over $G \ge 0$ of the ratio (16) we measure the worst case fitting of the $\Sigma^{(N)}(\theta)$ matrix on the metric $F^{-1}(\theta)/N$ at a point $\theta$, spanning all possible sets of tangent vectors and weights. Then we minimize over the classical metric (and hence over the POVM) to find the best adapted one. By taking the asymptotic limit of infinitely many probes (through the lim inf) we have completed the analysis of the definition (18) from the geometrical point of view. In summary, $r(\theta)$ measures, in the asymptotic scenario, the failure to find a metric on the statistical manifold, stemming from a measurement, which fully adapts to the underlying quantum metric (in all directions) at a specific point $\theta$.

Computation of the figure of merit
We would like to apply the existing results in local estimation theory to compute the incompatibility figure of merit. This requires the exchange of the sup and the inf in (17). In Appendix B we do that and show $r(\theta) = \sup_{G \ge 0} C(G,\theta)/C_S(G,\theta)$ (24), with $C_S(G, \theta)$ defined in (11), and with the numerator given by the quantity $C(G,\theta) := \liminf_{N\to\infty} \inf_{M_N \in \mathcal{M}^{(LU)}_N} N\,\mathrm{Tr}[G\,\Sigma^{(N)}(\theta)]$ (25). In [42] it has been proved that $C(G,\theta)$ coincides with the Holevo-Cramér-Rao bound $C_H(G,\theta)$ [41,60]. Exploiting these facts we can use (24) to deduce the equality $r(\theta) = \sup_{G \ge 0} C_H(G,\theta)/C_S(G,\theta)$ (26). In [61] the upper bound $C_H(G, \theta) \le 2C_S(G, \theta)$ is given, which implies $r(\theta) \le 2$. Because of this it makes sense to introduce $I(\theta) := r(\theta) - 1$ (27) as a proper quantifier of incompatibility: by construction it belongs to the interval $[0, 1]$, with $I(\theta) = 0$ indicating full compatibility and $I(\theta) = 1$ maximal incompatibility. The Holevo-Cramér-Rao bound can be computed via the semidefinite program in [49], which can be adapted to compute $I(\theta)$, as reported in Appendix C.

Upper bound on r(θ)
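In this section we propose an upper bound on $r(\theta)$ that relies only on the computation of the symmetric logarithmic derivatives defined in (8). It is essentially based on $C_Z(G,\theta)$, a well-known upper bound on $C_H(G, \theta)$, which reads $C_H(G,\theta) \le C_Z(G,\theta) := \mathrm{Tr}[G\,F^{-1}(\theta)] + \mathrm{TrAbs}[\sqrt{G}\,F^{-1}(\theta) A(\theta) F^{-1}(\theta)\sqrt{G}]$ (28), where $A(\theta)$ contains the expectation values of the commutators of the SLDs: $A_{ij}(\theta) := \frac{1}{2i}\mathrm{Tr}(\rho_\theta\,[L_i(\theta), L_j(\theta)])$ (29). In writing (28) we used $\mathrm{TrAbs}\,B$ to denote the sum of the absolute values of the eigenvalues of $B$. Combining (28) and (26) we get $r(\theta) \le 1 + \sup_{G \ge 0} \mathrm{TrAbs}[\sqrt{G}\,F^{-1}(\theta) A(\theta) F^{-1}(\theta)\sqrt{G}]/\mathrm{Tr}[G\,F^{-1}(\theta)]$ (30). The above inequality shows that a sufficient condition to have compatibility is $A(\theta) = 0$. In Appendix D we compute explicitly $\sup_{G \ge 0}$ in (30) and obtain $r(\theta) \le 1 + \|F^{-1/2}(\theta) A(\theta) F^{-1/2}(\theta)\|$ (31), where $\|\cdot\|$ is the operator norm. This translates to an upper bound on $I(\theta)$, i.e. $I(\theta) \le \tilde{I}(\theta) := \|F^{-1/2}(\theta) A(\theta) F^{-1/2}(\theta)\|$ (32).
This strengthens the interpretation of $A(\theta)$ as a measure of incompatibility [51]. The upper bound $\tilde{I}(\theta)$ was first defined in [63] and called $R$. It has already been used as a measure of incompatibility and "quantumness" and applied to qubits [56] and many-body systems [63,64]. By defining $I(\theta)$ we offer a more informative definition of incompatibility. It is noteworthy that for a D-invariant model [65] this bound is saturated and $I(\theta) = \tilde{I}(\theta)$.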
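As a small worked case (an illustration of ours, not part of the original derivation), for two parameters the bound takes a closed form. The sketch assumes that the bound is the operator norm of $F^{-1/2} A F^{-1/2}$ and that $A$ is real and antisymmetric, so that it is fixed by its single entry $A_{12}$:

```latex
% d = 2: A is proportional to the antisymmetric unit matrix
A = A_{12}\,\varepsilon, \qquad
\varepsilon := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
M \varepsilon M^{t} = \det(M)\,\varepsilon
\quad \text{for every } 2 \times 2 \text{ matrix } M .
% Choosing M = F^{-1/2}, which is symmetric with \det M = 1/\sqrt{\det F}:
F^{-1/2} A F^{-1/2} = \frac{A_{12}}{\sqrt{\det F}}\,\varepsilon
\quad \Longrightarrow \quad
\bigl\| F^{-1/2} A F^{-1/2} \bigr\| = \frac{|A_{12}|}{\sqrt{\det F}} ,
% since both singular values of \varepsilon are equal to 1.
```

In this case the bound only requires the determinant of the QFI matrix and one commutator expectation value.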

Incompatibility for separable measurements
We now go back to the first definition of a figure of merit presented in (18), but consider the minimization in (17) to be performed only on the set $\mathcal{M}^{(LU\text{-}S)}_N$ of locally unbiased separable measurements, which operate locally on $\rho_\theta^{\otimes N}$. This leads to the definitions of $r^s_N(\theta)$ and of its asymptotic limit $r_s(\theta)$, i.e. Definition 3.4 (Incompatibility figure of merit for separable measurements) in (34). Now we apply the result of [66], which gives us a lower bound on the precision of the estimation with $N$ probes when we use a measurement $M_N \in \mathcal{M}^{(LU\text{-}S)}_N$, expressed in terms of the MSE matrix $\Sigma^{(N)}(\theta)$ of $M_N$ and of the dimension $D$ of the Hilbert space of the single probe $\rho_\theta$. This translates to a lower bound on $r^s_N(\theta)$ which propagates to the definition of $r_s(\theta)$, giving Theorem 3.2 (Lower bound for separable measurements), where we have computed explicitly $\sup_{G \ge 0}$ using the AM-QM inequality and its saturation. Observe that the inequality (37) bears no reference to the details of the encoding process (1), because only the local unbiasedness is required in the proof. Therefore we can write where now At least for the case of a qubit probe ($D = 2$) the above expression allows us to exactly compute $r_s(\theta)$. Indeed as shown in [60,66] for this model one has leading to $r_s(\theta) = d$, which shows that in the case of a single qubit, multi-parameter estimation always exhibits incompatibility for separable locally unbiased measurements (remember that our analysis is explicitly restricted to the cases where $d \le D^2 - 1 = 3$).

Hierarchy of incompatibility measures
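Whether a certain estimation process is compatible or not depends on the set of measurements $M_N$ that we are allowed to perform. Consider a hierarchy of POVM sets $\mathcal{M}^{(1)}_N \subseteq \mathcal{M}^{(2)}_N \subseteq \dots$; for each of them we define the figure of merit $r^{(i)}_N(\theta)$ as in (17). Since an infimum over a smaller set cannot be smaller, we have $r^{(1)}_N(\theta) \ge r^{(2)}_N(\theta) \ge \dots$, which carries over to $r^{(1)}(\theta) \ge r^{(2)}(\theta) \ge \dots$ when taking the proper $N \to \infty$ limits (18). For example the set of separable locally unbiased measurements is a subset of the set of all locally unbiased measurements, i.e. $\mathcal{M}^{(LU\text{-}S)}_N \subseteq \mathcal{M}^{(LU)}_N$, so that $r_s(\theta) \ge r(\theta)$.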

Incompatibility of a noisy estimation task
In this section, by using the previously defined figures of merit $r(\theta)$ in (26) and $r_s(\theta)$ in (34), we study the incompatibility of the estimation process in a few simple cases concerning the sensing of two phases $\theta_1$ and $\theta_2$ encoded by the unitary transformation $U_\theta$ in (46) acting on individual qubits. The probes will be states of one and three qubits subject to local depolarizing noise, which is given by the map $\Lambda_\lambda(\rho) := \lambda\rho + (1 - \lambda)\,\mathbb{1}/2$ (47), with $\lambda \in [-1/3, 1]$ being a characteristic parameter of the model [67]. The transformation $\Lambda_\lambda$ induces a shrinking of the qubit Bloch vector by a factor given by the modulus $|\lambda|$, which can be used to gauge the intensity of the noise. In particular for $\lambda = 1$ the map (47) corresponds to the noiseless evolution, and for $\lambda = 0$ to the complete depolarization process, while negative values of $\lambda$ indicate the presence of an inversion of the Bloch sphere with respect to the origin [68]. We are interested in investigating whether the noise can force the system to a more classical behavior and therefore ensure compatibility in the estimation scenario, as it does for measurements [40]. We then turn to D-dimensional systems, and with the appropriate generalizations of (46) and (47) we explore the upper bound $\tilde{I}(\theta)$ in (32) for a generic system and the incompatibility $I(\theta)$ in (27) for a qutrit. Notice that the chosen noise is covariant and therefore in all our examples it could be applied before or after the encoding without changing the final output $\rho_\theta$. Table 1 contains a recap of the improvements and observed phenomena in the following examples.
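The action of the noise on the Bloch vector can be made concrete with a short sketch. It assumes the standard form of the depolarizing map, $\Lambda_\lambda(\rho) = \lambda\rho + (1-\lambda)\mathbb{1}/D$ with $D = 2$ for a qubit; the function names and the test state are hypothetical choices for this example.

```python
# Sketch of the depolarizing map acting on a density matrix, assuming the
# standard form lam * rho + (1 - lam) * identity / dim.

def depolarize(rho, lam):
    """Return lam * rho + (1 - lam) * Tr[rho] * identity / dim."""
    dim = len(rho)
    tr = sum(rho[k][k] for k in range(dim))
    return [[lam * rho[i][j] + ((1 - lam) * tr / dim if i == j else 0)
             for j in range(dim)] for i in range(dim)]

def bloch_vector(rho):
    """Bloch vector (a_x, a_y, a_z) of a qubit state rho = (1 + a.sigma)/2."""
    off = complex(rho[0][1])          # rho_01 = (a_x - i a_y) / 2
    a_z = (complex(rho[0][0]) - complex(rho[1][1])).real
    return (2 * off.real, -2 * off.imag, a_z)

# Illustrative mixed state with Bloch vector (0.4, 0.2, 0.6).
rho = [[0.8, 0.2 - 0.1j], [0.2 + 0.1j, 0.2]]
lam = -0.25                           # negative lam: shrink and invert

a_in = bloch_vector(rho)
a_out = bloch_vector(depolarize(rho, lam))
```

With a negative $\lambda$ the output Bloch vector points in the opposite direction and is shorter by the factor $|\lambda|$, which is the inversion mentioned above.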

Incompatibility for a one-qubit probe
Table 1.
System | Known results | Improvements/Observed phenomena
1 qubit | Computation of I(θ) (R) for qubit tomography and two phase estimation with pure states [56]. | Computation of I(θ, λ) for two phase estimation with depolarizing noise. Incompatibility with separable measurements.
qutrit | | Asymmetry around λ = 0 of the incompatibility measure I(θ, λ).

First of all we analyze the case of a single qubit probe. The fact that the figure of merit is parametrization invariant allows for an elegant exact solution of the qubit model for whatever probe state and encoded phases under depolarizing noise. In this example the measure $I(\theta)$ and its upper bound $\tilde{I}(\theta)$ will coincide. After the encoding by $U_\theta$ in (46), the probe undergoes the action of the noise map $\Lambda_\lambda$ in (47), so that its final state $\rho_\theta$ is described by the mapping (1) with $\mathcal{E}_\theta$ given by $\mathcal{E}_\theta(\rho) := \Lambda_\lambda(U_\theta\,\rho\,U_\theta^\dagger)$ (48). The purity of the encoded state $\rho_\theta = \mathcal{E}_\theta(\rho)$ is independent of $\theta$; this makes the statistical model D-invariant [62,65], and allows us to conclude that the Holevo-Cramér-Rao bound $C_H(G, \theta)$ coincides with $C_Z(G, \theta)$ defined in (28). Therefore the inequality (30) is saturated ($I(\theta) = \tilde{I}(\theta)$), and the incompatibility can be computed from the symmetric logarithmic derivatives only. We consider an arbitrary qubit probe state $\rho := \frac{1}{2}(\mathbb{1} + a \cdot \sigma)$. Its Bloch vector is $a := (a_x, a_y, a_z)$, with $\mathrm{Tr}\,\rho^2 = \frac{1}{2}(1 + |a|^2)$. After the encoding the Bloch vector of $\rho_\theta = \mathcal{E}_\theta(\rho)$ is $a_\theta := \lambda\,(a_x(\theta), a_y(\theta), a_z(\theta))$. We can perform an implicitly defined change of variables $(\theta_1, \theta_2) \to (\alpha, \beta)$ that brings us to $a_{(\alpha,\beta)} = \lambda\sqrt{2\,\mathrm{Tr}\,\rho^2 - 1}\,(\cos\alpha\cos\beta, \cos\alpha\sin\beta, \sin\alpha)$. For this model [69] we compute the matrices $F(\alpha, \beta)$ and $A(\alpha, \beta)$, which are $F(\alpha,\beta) = \lambda^2(2\,\mathrm{Tr}\,\rho^2 - 1)\,\mathrm{diag}(1, \cos^2\alpha)$ and an antisymmetric $A(\alpha,\beta)$ with $|A_{12}| = |\lambda|^3(2\,\mathrm{Tr}\,\rho^2 - 1)^{3/2}|\cos\alpha|$, and which substituted in (32) give $I(\theta, \lambda) = |\lambda|\sqrt{2\,\mathrm{Tr}\,\rho^2 - 1}$ (49). Equation (49) reveals that the noise intensity directly controls the compatibility. Indeed for a fixed input the value of $I(\theta, \lambda)$ reaches its maximum in the noiseless scenario ($\lambda \to 1$), providing full incompatibility $I(\theta, \lambda) \to 1$ for pure input states. On the contrary, as the noise sends $\rho_\theta$ to the completely mixed state ($\lambda \to 0$) the encoded information is dissipated and the compatibility increases, indeed $I(\theta, \lambda) \to 0$. Fundamentally the same result was discovered in [56] for qubit tomography. We finally remind the reader that, as anticipated at the end of section 3.6, for a single qubit we get $r_s(\theta) = d = 2$ independently of the noise. Again this result is valid $\forall\,\theta$ and for every input probe $\rho$.
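The qubit computation can be reproduced numerically with a few lines. The sketch below is an illustration under stated assumptions: because the Bloch vector has constant length, the SLDs are taken as $L_i = \partial_i a \cdot \sigma$; the matrices are taken as $F_{ij} = \mathrm{Re}\,\mathrm{Tr}(\rho L_i L_j) = \partial_i a \cdot \partial_j a$ and $A_{12} = \mathrm{Im}\,\mathrm{Tr}(\rho L_1 L_2) = (\partial_1 a \times \partial_2 a)\cdot a$ (this normalization of $A$ is an assumption); and the two-parameter bound is evaluated as $|A_{12}|/\sqrt{\det F}$. The function name is hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def qubit_upper_bound(lam, purity, alpha, beta):
    """|A_12| / sqrt(det F) for a depolarized qubit in (alpha, beta) coordinates.

    Assumes lam != 0 and cos(alpha) != 0, otherwise F is singular.
    """
    radius = lam * math.sqrt(2 * purity - 1)      # signed Bloch radius
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    a = (radius * ca * cb, radius * ca * sb, radius * sa)
    d_alpha = (-radius * sa * cb, -radius * sa * sb, radius * ca)
    d_beta = (-radius * ca * sb, radius * ca * cb, 0.0)
    f11, f12, f22 = dot(d_alpha, d_alpha), dot(d_alpha, d_beta), dot(d_beta, d_beta)
    a12 = dot(cross(d_alpha, d_beta), a)
    return abs(a12) / math.sqrt(f11 * f22 - f12 ** 2)

noisy = qubit_upper_bound(lam=0.5, purity=0.9, alpha=0.3, beta=1.1)
pure_inverted = qubit_upper_bound(lam=-0.3, purity=1.0, alpha=-0.8, beta=2.0)
```

Under these assumptions the result does not depend on $(\alpha, \beta)$ and equals $|\lambda|\sqrt{2\,\mathrm{Tr}\,\rho^2 - 1}$, consistent with the limits discussed in the text: it tends to 1 for a pure probe without noise and to 0 as $\lambda \to 0$.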

Incompatibility for three entangled qubits
Consider now the scenario in which we have at our disposal multiple copies of three entangled qubits and we encode them through $U_\theta \otimes U_\theta \otimes U_\theta$, with $U_\theta$ given in (46). This more complicated scenario gives us the opportunity to compute $I(\theta, \lambda)$ with the SDP and show the presence of a gap between $I(\theta, \lambda)$ and $\tilde{I}(\theta, \lambda)$. In this example we won't be able to compute $r(\theta)$ for every probe state, therefore we will concentrate on the probe state $|\psi\rangle$ of (50), where $|\phi_+\rangle$ and $|\phi_-\rangle$ are the eigenvectors of $\sigma_y$ corresponding to the positive and negative eigenvalue respectively. In [38] it is proved that the analogous state for the estimation of three phases with $N$ entangled qubits reaches Heisenberg scaling in the QFI in all three parameters. In contrast to the previous example, here we are able to compute the figure of merit for the probe $|\psi\rangle$ only at the point $\theta = 0$, through numerical evaluations via the semidefinite program reported in Appendix C; these indicate a non-zero $I(\theta = 0)$. We add a local depolarizing noise $\Lambda_\lambda$ on each qubit and compute $I(\theta = 0, \lambda)$ and its upper bound $\tilde{I}(\theta = 0, \lambda)$ as functions of $\lambda$ to see if the noise increases compatibility; the results are reported in figure 2. $I(\theta = 0, \lambda)$ and $\tilde{I}(\theta = 0, \lambda)$ have been computed for 100 values of $\lambda$ uniformly distributed in $(-1/3, 1)$. The addition of noise does not necessarily diminish the incompatibility; on the contrary $I(\theta = 0, \lambda)$ and $\tilde{I}(\theta = 0, \lambda)$ both display a non-monotonic behavior with respect to $|\lambda|$. This behavior of the incompatibility has already been observed in [49]. We notice that as the noise destroys the information encoded in $\rho_\theta$ both the incompatibility $I(\theta, \lambda)$ and its upper bound $\tilde{I}(\theta, \lambda)$ go to 0, but this doesn't seem to be a universal behavior [56]. We confirm a separation between $\tilde{I}(\theta, \lambda)$ and $I(\theta, \lambda)$, which has been evidenced in [56], and we conjecture that $\tilde{I}(\theta, \lambda) - I(\theta, \lambda)$ shrinks to zero as the amount of encoded information diminishes, as it happens in this example for $\lambda \to 0$. In this model also the relative gap $(\tilde{I}(\theta, \lambda) - I(\theta, \lambda))/I(\theta, \lambda)$ shrinks to zero as $\lambda \to 0$. For a generic noise this phenomenon depends on the behavior of $A(\theta, \lambda)$ as the disturbance is increased. The figure of merit $I(\theta, \lambda)$ appears not to be correlated with the information quantities $F^{-1}_{11}(\theta, \lambda)$ and $F^{-1}_{22}(\theta, \lambda)$ or with the purity of the encoded state, as these measures are all monotonic in the noise parameter $\lambda$. Also because of this we think of $I(\theta)$ as a genuine, non-trivial new property of the estimation process. Notice that for $\lambda = 0$ the state is unable to encode information ($F(\theta) = 0$).

Estimation on D-dimensional probes
In this section we study the incompatibility for a generic unitary encoding of $d$ parameters on a D-dimensional probe in $S(H)$, i.e. the mapping $\rho \mapsto U_\theta\,\rho\,U_\theta^\dagger$ with $U_\theta$ given in (51), where $H_j$ are traceless Hermitian operators acting on $H$. For the estimation around $\theta = 0$, these operators are the infinitesimal generators of the encoding. However for a generic point $\theta \neq 0$ this is not necessarily true. As explained in Appendix E, for a given probe state, the sensing procedure around a point $\theta \neq 0$ can however be described in terms of an effective set of new generators $H^{\mathrm{eff}}_j(\theta)$. Accordingly, since the results of the present section are valid for estimations around $\theta = 0$ for all possible choices of $H_j$, we can conclude that they hold true also $\forall\,\theta$ encoded by (51). Finally, as for the noise model, we replace (47) with $\Lambda_\lambda(\rho) := \lambda\rho + (1 - \lambda)\,\mathbb{1}/D$ (52), which for $\lambda \in [-1/(D^2 - 1), 1]$ is a proper generalization of the depolarization channel for a D-dimensional system [67,68].

Incompatibility for a D-dimensional probe
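Let us consider a single-probe scenario where the state of the system is described by the density matrix $\rho_\theta := \lambda\,|\psi_\theta\rangle\langle\psi_\theta| + (1 - \lambda)\,\mathbb{1}/D$ (53), with $|\psi\rangle$ being the pure input state of the system, and with $|\psi_\theta\rangle := U_\theta|\psi\rangle$. If we now call $L_i(\theta)$ the symmetric logarithmic derivative associated to the parameter $\theta_i$ in the absence of noise, i.e. the SLD of $|\psi_\theta\rangle$ given in (9), then it can be seen that for $\lambda \neq 1$ the operator $L^\lambda_i(\theta) = \frac{D\lambda}{2 + (D - 2)\lambda}\,L_i(\theta)$ (54) is the SLD in the noisy scenario. We obtain this expression by substituting $\rho_\theta$ defined in (53) in (10). From this result the QFI matrix $F(\theta, \lambda)$ and the commutator matrix $A(\theta, \lambda)$ are both found to be proportional to their noiseless counterparts $F(\theta)$ and $A(\theta)$ computed from $L_i(\theta)$, i.e. $F(\theta, \lambda) = \frac{D\lambda^2}{2 + (D - 2)\lambda}\,F(\theta)$ and $A(\theta, \lambda) = \frac{D^2\lambda^3}{[2 + (D - 2)\lambda]^2}\,A(\theta)$.
Replaced into (32) the above expressions lead to $\tilde{I}(\theta, \lambda) = \frac{D|\lambda|}{2 + (D - 2)\lambda}\,\tilde{I}(\theta)$, with $\tilde{I}(\theta)$ being the upper bound on the noiseless incompatibility figure of merit defined in (32). Notice that this expression is not symmetric around $\lambda = 0$, i.e. in general $\tilde{I}(\theta, \lambda) \neq \tilde{I}(\theta, -\lambda)$. We define the asymmetry factor $\kappa(\lambda)$ in Definition 4.1 (Asymmetry factor for $\tilde{I}(\theta, \lambda)$). The presence of an asymmetry in the properties of the D-dimensional depolarizing channel around $\lambda = 0$ was already pointed out in the context of communication in [68]. For a qubit model $D = 2 \implies \kappa(\lambda) = 0$. We show through a numerical example that this asymmetry exists not only for the upper bound $\tilde{I}(\theta, \lambda)$ but also for the actual figure of merit $I(\theta, \lambda)$. Consider the encoding of two near-zero phases ($d = 2$) on a qutrit ($D = 3$) via the unitary operator (51) where the generators are chosen to be which means $\tilde{I}(\theta, \lambda) - I(\theta, \lambda) \to 0$ for $\lambda \to 0$.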
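The asymmetry around $\lambda = 0$ can be illustrated with a short sketch. It relies on an assumption that we derived ourselves for a pure probe sent through the depolarizing channel $\rho_\theta = \lambda|\psi_\theta\rangle\langle\psi_\theta| + (1-\lambda)\mathbb{1}/D$: the noisy SLD is the noiseless one rescaled by $c(\lambda, D) = D\lambda/[2 + (D-2)\lambda]$, so that the upper bound is rescaled by $|c(\lambda, D)|$. The function names are hypothetical.

```python
def noise_factor(lam, dim):
    """Assumed rescaling c(lam, D) = D*lam / (2 + (D - 2)*lam) of the SLD."""
    return dim * lam / (2 + (dim - 2) * lam)

def bound_scaling(lam, dim):
    """Factor multiplying the noiseless upper bound (absolute value of c)."""
    return abs(noise_factor(lam, dim))

# Qubit: the scaling reduces to |lam| and is symmetric around lam = 0.
qubit_plus = bound_scaling(0.3, 2)
qubit_minus = bound_scaling(-0.3, 2)

# Qutrit: the admissible interval is [-1/8, 1]; compare lam = +0.1 and -0.1.
qutrit_plus = bound_scaling(0.1, 3)
qutrit_minus = bound_scaling(-0.1, 3)
```

Under this assumption the qubit values coincide, while for the qutrit the bound at $-\lambda$ is larger than at $+\lambda$, which is the kind of asymmetry discussed above.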

Design of compatible models for quantum metrology
The following section is somewhat disconnected from the previous discussions on the incompatibility measure $I(\theta, \lambda)$. Here we want to analyze some strategies that have been proposed in the past, and some generalizations of them, that allow us to produce a fully compatible statistical model in quantum metrology. We will only need to assess the condition $A(\theta) = 0$ to claim compatibility, according to the bound (32), and therefore we won't need the semidefinite program for $I(\theta)$. We first review what is already known for two qubits, each encoded via (46), and then generalize it to the D-dimensional encoding (51) when a pair of D-dimensional systems is available.

Known results for a two-qubit probe
In this section we analyze the compatibility of three different two-qubit encoding scenarios in the absence of noise and for some special instances of the input states. We do not claim originality for these results; they are reviewed here only in order to be generalized in the next sections. Consider first the ancilla-aided model, in which only one of the two qubits is subject to the unitary encoding (46). This means that the total evolution of the two qubits is 1 ⊗ U_θ. As input state for the probes we take a Bell state, which is known in the literature to be optimal for the estimation of SU(2) operations [70], for example the state (61), where |0⟩ and |1⟩ are the eigenvectors of σ_z. From a direct computation we see that for this state A(θ) = 0 ∀θ, which from (32) gives (62), leading to compatibility. This result was first reported in [70]. Interestingly enough, compatibility can also be obtained when operating on the maximally entangled state (61) with the encoding U_θ ⊗ U_θ. Indeed, by explicit computation we get again A(θ) = 0 ∀θ, which leads once more to (62). This result can be found in [51]. We will see in section 5.2 that these effects are just a special instance of a more general trend, since a maximally entangled state of a finite dimensional probe always gives full compatibility, both for one and two uses of the encoding unitary channel. We now give a last example, which we name the "anti-parallel spin strategy" for future reference. Take the input state going through the encoding U_θ ⊗ U_θ to be |+n⟩ ⊗ |−n⟩, where |+n⟩ and |−n⟩ have opposite Bloch vectors +n and −n. This state has the same QFI as the state of two parallel spins |+n⟩ ⊗ |+n⟩, but has A(θ) = 0 ∀θ (in contrast to |+n⟩ ⊗ |+n⟩), which means that it is fully compatible and a superior probe for the sensing task. This result can be obtained by direct computation or from the observation of section 5.3, where we generalize these ideas to finite dimensional probes.
The superiority of the anti-parallel spin state was already observed by Gisin and Popescu in [71], and in the context of parameter estimation in [72]. In all three examples, since the encoded state is pure, a measurement entangled across the two qubits only is sufficient to obtain compatibility [73].
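The three two-qubit statements above can be checked numerically with a minimal sketch. It rests on assumptions that are not taken from the text: the effective generators are fixed to the hypothetical choice σ_x and σ_y, and for a pure state the matrices are computed with the convention F + iA = 4(⟨G_l G_m⟩ − ⟨G_l⟩⟨G_m⟩), whose normalization may differ from the one used in the paper. Only the vanishing of A and the equality of the two QFI matrices are meant to be illustrated.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def fisher_and_curvature(psi, gens):
    """Return (F, A) for the pure state psi and Hermitian generators gens.

    Assumed convention (not from the paper):
    F + iA = 4 (<G_l G_m> - <G_l><G_m>).
    """
    mean = [np.vdot(psi, G @ psi) for G in gens]
    d = len(gens)
    Q = np.zeros((d, d), dtype=complex)
    for l in range(d):
        for m in range(d):
            Q[l, m] = np.vdot(psi, gens[l] @ gens[m] @ psi) - mean[l] * mean[m]
    return 4 * Q.real, 4 * Q.imag


def bloch_state(n):
    """Qubit state whose Bloch vector points along n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    _, vecs = np.linalg.eigh(n[0] * SX + n[1] * SY + n[2] * SZ)
    return vecs[:, 1]  # eigh sorts ascending: column 1 has eigenvalue +1


H = [SX, SY]  # hypothetical effective generators
ancilla_gens = [np.kron(I2, h) for h in H]                 # encoding 1 (x) U
twice_gens = [np.kron(h, I2) + np.kron(I2, h) for h in H]  # encoding U (x) U

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
n = [0.3, 0.5, 0.8]
anti = np.kron(bloch_state(n), bloch_state([-x for x in n]))
para = np.kron(bloch_state(n), bloch_state(n))

_, A_bell_ancilla = fisher_and_curvature(bell, ancilla_gens)
_, A_bell_twice = fisher_and_curvature(bell, twice_gens)
F_anti, A_anti = fisher_and_curvature(anti, twice_gens)
F_para, A_para = fisher_and_curvature(para, twice_gens)

print("Bell, 1 (x) U :", np.abs(A_bell_ancilla).max())
print("Bell, U (x) U :", np.abs(A_bell_twice).max())
print("anti-parallel :", np.abs(A_anti).max())
print("parallel      :", np.abs(A_para).max())
print("same QFI      :", np.allclose(F_anti, F_para))
```

In this sketch the parallel pair is the only probe with a non-vanishing A, while its QFI matrix coincides with that of the anti-parallel pair.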

Compatibility of the maximally entangled states
In this subsection we show that the results of section 5.1 are only a particular case of a general observation, by proving that the use of an ancilla maximally entangled with the probe can completely remove the incompatibility, leading to the identity (63). Consider hence as input the pure state (64), on which the evolution 1 ⊗ U_θ acts to produce the output state (65) (i.e. the Choi-Jamiołkowski state of the channel U_θ [74,75]). Following section 4.3 and Appendix E, the associated symmetric logarithmic derivatives (9) of ρ_θ := |ψ_θ⟩⟨ψ_θ| can be expressed as which lead to the following expressions for the F(θ) and A(θ) matrices: where in the last identity we used the fact that [H^eff_l(θ), H^eff_m(θ)] is a traceless operator. Accordingly the upper bound (32) imposes (63), which proves the claim: the addition of a sufficiently large ancilla makes it possible to remove entirely the quantum incompatibility for the LAC measurements.
A similar result also holds when the maximally entangled state (64) evolves through U_θ ⊗ U_θ. In this case (65) gets replaced by which leads to which gives again the full compatibility condition (63).
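The vanishing of A(θ) for a maximally entangled probe can be illustrated numerically. The sketch below is written under stated assumptions: the effective generators are random traceless Hermitian matrices (hypothetical stand-ins for H^eff_l(θ)), the input is taken to be Σ_i |i⟩|i⟩/√D, and the convention F + iA = 4(⟨G_l G_m⟩ − ⟨G_l⟩⟨G_m⟩) for pure states is assumed, which may differ from the paper's normalization.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_traceless_hermitian(D):
    """Hypothetical effective generator: random Hermitian, trace removed."""
    X = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
    H = (X + X.conj().T) / 2
    return H - (np.trace(H).real / D) * np.eye(D)


def fisher_and_curvature(psi, gens):
    """(F, A) with the assumed convention F + iA = 4 (<G_l G_m> - <G_l><G_m>)."""
    mean = [np.vdot(psi, G @ psi) for G in gens]
    d = len(gens)
    Q = np.zeros((d, d), dtype=complex)
    for l in range(d):
        for m in range(d):
            Q[l, m] = np.vdot(psi, gens[l] @ gens[m] @ psi) - mean[l] * mean[m]
    return 4 * Q.real, 4 * Q.imag


D, d = 5, 3                      # probe dimension, number of parameters
H = [random_traceless_hermitian(D) for _ in range(d)]
Id = np.eye(D)

# Maximally entangled state sum_i |i>|i> / sqrt(D), flattened to a vector.
phi = np.eye(D, dtype=complex).reshape(D * D) / np.sqrt(D)

ancilla_gens = [np.kron(Id, h) for h in H]                 # 1 (x) U
twice_gens = [np.kron(h, Id) + np.kron(Id, h) for h in H]  # U (x) U

F_anc, A_anc = fisher_and_curvature(phi, ancilla_gens)
F_two, A_two = fisher_and_curvature(phi, twice_gens)

# A generic random two-body state, for comparison.
chi = rng.normal(size=D * D) + 1j * rng.normal(size=D * D)
chi = chi / np.linalg.norm(chi)
_, A_generic = fisher_and_curvature(chi, ancilla_gens)

print("max |A|, 1 (x) U on phi:", np.abs(A_anc).max())
print("max |A|, U (x) U on phi:", np.abs(A_two).max())
print("max |A|, generic state :", np.abs(A_generic).max())
```

The first two numbers vanish up to rounding because the expectation of a commutator on the maximally entangled state reduces to its normalized trace, while a generic state gives a non-zero A.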

Generalized anti-parallel spin strategy
Now we generalize the "anti-parallel spin" strategy of section 5.1. Suppose that we only have two parameters to estimate (d = 2) and we take as probe the separable input state |ψ_1⟩ ⊗ |ψ_2⟩, which evolves through the mapping induced by U_θ ⊗ U_θ. The sufficient condition for compatibility A(θ) = 0 can be expanded as (72). The operator [H^eff_1(θ), H^eff_2(θ)] is skew-hermitian and is therefore diagonalizable with purely imaginary eigenvalues ±ia_j, where a_j > 0 for j = 1, 2, ..., ⌊D/2⌋, each associated with an eigenvector |±ia_j⟩. If the dimension D is odd, then we have an extra zero eigenvalue. Let us denote by V the unitary operator that performs this diagonalization, i.e. V†[H^eff_1(θ), H^eff_2(θ)]V = diag(±ia_1, ±ia_2, ...). Let us define S ⊆ {1, 2, ..., ⌊D/2⌋}, then a pair of states that guarantees compatibility is where ϕ^1_j and ϕ^2_j are arbitrary phases, s_j ∈ {0, 1}, and |S| is the cardinality of S. Notice that we are also free to add to the definition of either |ψ_1⟩ or |ψ_2⟩ the state V|0⟩, with |0⟩ being the eigenvector with null eigenvalue (in case there is one). With the above choice of probes the compatibility condition (72) is realized. Notice also that the QFI matrix of |ψ_1⟩ ⊗ |ψ_2⟩ is the sum of the QFIs of |ψ_1⟩ and |ψ_2⟩. These two states taken individually manifest incompatibility, but when measured jointly they gain full compatibility. This construction is the analogue of the "anti-parallel spin strategy" of section 5.1. We observe that a state |ψ⟩ that is an equal superposition of |ψ_1⟩ and |ψ_2⟩ is also fully compatible. The condition (72) also justifies the compatibility of |+n⟩ ⊗ |−n⟩ ∀θ in section 5.1. These two states form an orthogonal basis of the qubit Hilbert space, therefore (72) is satisfied and (63) holds. Incidentally, we observe also that for a d-parameter estimation the state |ψ_1⟩ ⊗ |ψ_2⟩ ⊗ ··· ⊗ |ψ_D⟩ is compatible when {|ψ_i⟩}, i = 1, ..., D, is a basis of the Hilbert space H, because then we would have
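The pairing construction can be sketched numerically as follows. Several choices here are assumptions made for illustration only: D = 4, the two effective generators are random real symmetric matrices (so that their commutator is real antisymmetric and its spectrum is guaranteed to come in pairs ±ia_j), S contains both pairs, and A_12 is computed with the assumed pure-state convention 4 Im(⟨G_1 G_2⟩ − ⟨G_1⟩⟨G_2⟩). The last lines check the basis statement through the trace of the commutator.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4


def curvature_12(psi, G1, G2):
    """A_12 = 4 Im(<G1 G2> - <G1><G2>) for a pure state (assumed convention)."""
    m1 = np.vdot(psi, G1 @ psi)
    m2 = np.vdot(psi, G2 @ psi)
    return 4 * (np.vdot(psi, G1 @ G2 @ psi) - m1 * m2).imag


def random_real_symmetric(D):
    """Hypothetical generator; real symmetric so the commutator spectrum pairs up."""
    X = rng.normal(size=(D, D))
    return (X + X.T) / 2


H1, H2 = random_real_symmetric(D), random_real_symmetric(D)
C = H1 @ H2 - H2 @ H1              # skew-Hermitian commutator

# 1j*C is Hermitian; the eigenvalues of C are -1j*k, with k sorted ascending,
# so column j and column D-1-j of W belong to opposite eigenvalues.
k, W = np.linalg.eigh(1j * C)

phases = np.exp(1j * rng.uniform(0, 2 * np.pi, size=4))  # arbitrary phases
psi1 = (phases[0] * W[:, 0] + phases[1] * W[:, 1]) / np.sqrt(2)
psi2 = (phases[2] * W[:, 3] + phases[3] * W[:, 2]) / np.sqrt(2)

Id = np.eye(D)
G1 = np.kron(H1, Id) + np.kron(Id, H1)   # generators of U (x) U
G2 = np.kron(H2, Id) + np.kron(Id, H2)

A_joint = curvature_12(np.kron(psi1, psi2), G1, G2)
A_single = (curvature_12(psi1, H1, H2), curvature_12(psi2, H1, H2))

# Basis statement: the curvatures of any orthonormal basis sum to zero,
# because the commutator is traceless.
Z = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
Q, _ = np.linalg.qr(Z)
A_basis = sum(curvature_12(Q[:, i], H1, H2) for i in range(D))

print("paired spectrum:", np.allclose(k, -k[::-1]))
print("A_12 of psi1, psi2 alone:", A_single)
print("A_12 of psi1 (x) psi2   :", A_joint)
print("sum over a random basis :", A_basis)
```

Each state alone has a non-zero A_12, the two contributions are opposite, and the product state is compatible in this sketch.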

Conclusions
One of the defining properties of quantum mechanics is the intrinsic incompatibility between the possible experiments that could be carried out on a quantum system. This places limits on the precision to which different characteristics of a quantum system can be known. Formally, this always comes from the non-commutativity of quantum operations. The main goal of this paper is to give a theoretical foundation to the measure of incompatibility in the quantum estimation task. For this purpose we define in section 3 a figure of merit with a well defined operational and geometrical meaning. The figure of merit r(θ) in (18) is built to be easily linked to the asymptotic results of estimation theory [42]. This allows us to compute it easily, as explained in Appendix C, and to give the upper bound in section 3.5.
The definition of incompatibility depends on the operations that we are able to perform, that is, on our level of control over the system. If we are only able to perform separable measurements on the probes, then the relevant incompatibility measure is r_s(θ) defined in (34). In section 4 the estimation is studied in the scenario where a depolarizing noise acts; this form of disturbance is often used to model the decoherence dynamics in metrology [76,77]. We observed some interesting phenomena, like the asymmetry of the incompatibility under inversion in the space of states in section 4.3. In section 5 we discuss some strategies able to produce compatible models for quantum metrology with generic D-dimensional systems, such as the use of maximally entangled states. As a further development it would be interesting to determine which state gives the maximum and minimum incompatibility for a certain encoding at a fixed point θ. This optimization is hard because the figure of merit is a complicated nonlinear function of the state. In a sense, the probe that maximizes incompatibility is the one that best captures the quantum properties of the encoding process. Finally, we would like to look for a link between the incompatibility and the Heisenberg scaling. In this context the only relevant measures are the ones that assume no constraints on the separability of the input, because a single giant entangled probe would be used.

Appendix A. Figure of merit for LAC measurements
In this appendix we define a version of the incompatibility figure of merit for measurements satisfying local asymptotic covariance (LAC) [42,48]. Consider the sequences of POVMs (M_k)_{k∈N_0} ∈ M^(LAC)_C that satisfy LAC at the point θ ∈ Θ, as defined in [42,48]. The kth measurement M_k of a sequence in M^(LAC)_C acts on k probes and has Σ^(k)(θ) as associated MSE matrix. Because of the definition of LAC, (M_k)_{k∈N_0} ∈ M^(LAC)_C admits a limiting MSE matrix, i.e. lim_{k→∞} kΣ^(k)(θ) := Σ(θ) (we also ask Σ(θ) < ∞). The new figure of merit is hence defined as in (A.1).
In [48] the authors have proven the validity of the Holevo bound for the LAC measurements and its achievability in the same class (for a full rank ρ_θ with non degenerate spectrum), that is We now prove that the inf and sup of (A.1) can be commuted. It is easy to prove that the set M^(LAC)_C is convex. The set of POVMs acting on S H ⊗N and having θ as outcome set can be thought of as a convex subset of a certain dual vector space V [78]. The set containing the infinite sequences of V is a vector space, and the set of sequences thereof that are sequences of POVMs is convex. Furthermore, the defining property of LAC [48] is stable under convex combination of two measurement sequences. In the Minimax theorem of [79] both spaces are required to be locally convex. Banach spaces like G and V ⊃ M_N are locally convex, and the countably infinite product space of multiple copies of V (which is the space of sequences) is also a locally convex space. Furthermore, given Σ_1(θ) and Σ_2(θ), the limiting MSE matrices of two sequences (M^1_k)_{k∈N_0} and (M^2_k)_{k∈N_0}, the asymptotic MSE of the convex combination . Consequently, just like in Appendix B, the Minimax theorem of Kneser [79,80] can be applied to swap the order of the inf over M^(LAC)_C and sup_{G≥0}. It is understood that the argument of (A.1) can be cast into the same form as (B.6), from which the linearity in (M_k)_{k∈N_0} and G, and the continuity in G, easily follow. We arrive therefore at

Appendix B. Exchanging sup and inf in the figure of merit definition
We will arrive at (24) through a series of lemmas.
It is important to assume that the set of measurements M_N ∈ M^(LU)_N contains only measurements with bounded MSE matrices, i.e. Σ^(N)(θ) ≤ C. Therefore we have which means the function r_N(G, M_N, θ) is Lipschitz continuous with constant N C F (θ) and therefore continuous. It will be useful later to notice that r N (F (θ) ] is also continuous in G ∈ G, because it is the composition of the continuous functions r_N(G, M_N, θ) and G → F (θ)  In this paper only the upper semicontinuity of r_N(G, M_N, θ) is actually used.
Lemma 2. The figure of merit r_N(θ) defined in (17) can be equivalently expressed as This Lemma is based on the application of the Minimax theorem of Kneser [79,80]. First of all we need to cast r_N(θ) in a suitable form. We start from the observation that the set of positive weight matrices G is invariant under congruence for the positive matrix F (θ) This means that the sup can be taken on the matrices F (θ) In the second line, in the domain of the supremum, we have again used that every F (θ) can be thought of as a (non-empty) subset of a dual vector space [78,81], which is a convex set because the locally unbiased measurements are stable under convex combination. The function N Tr[G · F(θ)^{1/2} Σ^(N)(θ) F(θ)^{1/2}] is linear in both its arguments. The linearity in G is self evident, so we only show linearity in the measurement. Suppose that we are given two POVMs denoted by M_1 and M_2, characterized respectively by the effects E^1_θ and E^2_θ associated to the outcome θ. We have dropped for simplicity the subscript N in M_1 and M_2, and we will also drop the superscript (N) in the MSE matrix Σ(θ). Consider the POVM given by the convex combination M := λM_1 + (1 − λ)M_2. By definition its effects are The Σ(θ) matrix associated to M is computed as an expectation value on the probability distribution The linearity of p(θ) translates to the linearity of Σ(θ), i.e.
This means that the whole argument of the inf sup in (B.6) is linear in the POVM, and it is additionally upper semicontinuous in G at fixed M_N (in Lemma 1 we proved continuity, which implies upper semicontinuity). We have now proved all the hypotheses required by the Minimax theorem of Kneser [79,80], which allows us to write It is worth stressing that without such an argument the quantity is by construction always smaller than or equal to Given N copies of the probe we can organize them into N_2 distinct subgroups, each of them containing N_1 probes. We now perform exactly the same measurement M_{N_1} on each group and use the N_2 outcomes to estimate θ by taking their arithmetic mean. Calling this measurement M_N, it follows that its MSE matrix Σ_N(θ) corresponds to Σ_{N_1}(θ)/N_2, where Σ_{N_1}(θ) is the MSE matrix of M_{N_1}. This holds true because the estimators are unbiased at θ. Therefore we have We now need to introduce a new quantity: where the supremum on G ≥ 0 has been removed. We can always take M^ε_{N_1} such that with ε > 0. Then we have where r_N(G, θ) has been defined in (B.18). Because of the arbitrariness of ε we have r_N(G, θ) ≤ r_{N_1}(G, θ). Taking sup_{G≥0} on both sides of this inequality finally gives the thesis. which is by construction non-decreasing in n, i.e.
Our goal is to show that due to Lemma 3, the inequality in the above expression is always saturated, or equivalently that which will lead automatically to r(θ) = r 1 (θ) = inf N ≥1 r N (θ). We can prove (B.23) by contradiction: assume that there exists m such that r m (θ) > r 1 (θ). This implies that there must exist k < m integer such that This however can't be true because thanks to Lemma 3 we must have is explicitly decreasing, i.e.
therefore its limit exists and it is easy to show that it coincides with inf_{N≥1} r_N(θ), as we see in the following. Take ε > 0, then ∃ N_ε such that therefore ∀ ε, ∃ n_ε := N_ε such that Consider next the supremum over G of r_1(G, θ), this can be evaluated as  [80,82]. In order to use this result it must be proved that is Ky Fan concave-convex on G × N_0. This condition is equivalent to having Ky Fan concavity in G for every fixed N and Ky Fan convexity in N for every fixed G. Let us fix an arbitrary N ∈ N_0. Given the combination G_α : finally, substituting r_N(G, θ), we get Putting together (B.43) and (B.45) gives which for ε → 0 is the (Ky Fan) concavity condition for r_N(G, θ). Let us now prove the Ky Fan convexity in N. Consider N_1, N_2 ∈ N_0 and an arbitrary G ∈ G, we have This is true because thanks to Lemma 3 we have r_{N_1 N_2}(G, θ) ≤ r_{N_1}(G, θ) and r_{N_1 N_2}(G, θ) ≤ r_{N_2}(G, θ). Lemma 5 proves that r_N(G, θ) is upper semicontinuous in G for every fixed N; this concludes the check of the hypotheses for the application of the Ky Fan Minimax theorem, according to which Expanding this expression we arrive at (24), with C(G, θ) defined in (25).

Appendix C. Formulation of the semidefinite program
We start from (26) and write The semidefinite program for C_H(G, θ) reported in [49] is See the work [49] for the definitions of all the objects appearing in this program; it is not necessary to understand them in order to follow our derivation. Equation (C.1) becomes The objective function Tr[F(θ)^{1/2} G F(θ)^{1/2} · V] is linear and continuous in both G and V. The domains of the sup and the min are both convex, with that of G being compact. We can therefore apply again the Minimax theorem of [79], as done in Appendix B. Having sup_{G∈G} as the innermost operation, we can solve it and write Now the objective of the minimization is the spectral norm of F(θ)^{1/2} V F(θ)^{1/2}. We can introduce a dummy variable t and write the program as The condition on F(θ)^{1/2} V F(θ)^{1/2} can be written as Because of the Schur complement condition for positive semidefinite matrices [83] the optimization becomes (C.6), from which we compute I(θ) according to (27). This semidefinite program is solved by means of the modeling system CVX for Matlab [84].
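The step that turns the inner supremum into a spectral norm, and then into a constraint on the dummy variable t, can be illustrated with a small numerical sketch. It is not the program (C.6): a random positive semidefinite matrix X stands in for F(θ)^{1/2} V F(θ)^{1/2}, and the weight matrices are assumed to be normalized as G ≥ 0 with Tr G = 1 (an assumption made here for illustration). Under these assumptions the supremum of Tr[G X] equals the largest eigenvalue of X, and minimizing t subject to t·1 − X ≥ 0 returns the same value.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3

# PSD stand-in for F^{1/2} V F^{1/2} (hypothetical data).
B = rng.normal(size=(d, d))
X = B @ B.T

vals, vecs = np.linalg.eigh(X)
lam_max = vals[-1]                 # spectral norm of a PSD matrix
v = vecs[:, -1]
G_opt = np.outer(v, v)             # unit-trace weight attaining the supremum


def random_weight():
    """Random weight matrix with G >= 0 and Tr G = 1 (assumed normalization)."""
    M = rng.normal(size=(d, d))
    G = M @ M.T
    return G / np.trace(G)


# No sampled weight exceeds lam_max.
sampled = max(np.trace(random_weight() @ X) for _ in range(2000))


def is_feasible(t, tol=1e-12):
    """Check the semidefinite constraint t*1 - X >= 0."""
    return np.linalg.eigvalsh(t * np.eye(d) - X).min() >= -tol


print("largest eigenvalue :", lam_max)
print("Tr[G_opt X]        :", np.trace(G_opt @ X))
print("best sampled weight:", sampled)
print("t slightly above   :", is_feasible(lam_max + 1e-9))
print("t slightly below   :", is_feasible(lam_max - 1e-3))
```

The smallest feasible t is therefore the largest eigenvalue of X, which is why the min-max problem collapses to a single semidefinite constraint in this sketch.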

Appendix D. Explicit computation of r (θ)
In this section we prove that sup_{G≥0} in the definition (30) of r (θ) can be computed exactly, and we obtain the explicit expression for r (θ) in (31). First of all we define A (θ) := F(θ)^{-1/2} A(θ) F(θ)^{-1/2}. This means we can write (30)  The maximum of the above expression is λ_1^2, realized for a G having G_11 + G_22 = 1 and all the other matrix elements null. Notice that F this is realized for G_11 = G_22 = 1/2. Therefore we have implicitly built a matrix G that saturates the sup and gives (31).

Now, the idea is to perform the rotation U_{−θ} on the probe after the encoding with U_{θ+δθ}, in such a way that we compensate for the known component of the rotation U_{θ+δθ} and leave only a term depending on the new unknown variables δθ. In the last expression we have collected the terms multiplied by δθ_1 and δθ_2 respectively, which have been named H^eff_1(θ) and H^eff_2(θ). Notice that the commutator of two skew-hermitian operators like iH and iδH is again skew-hermitian; this applies to all the elements of the exponentiated sum in (E.4), and means that the right hand side of (E.4) is a unitary operator even if we have neglected higher order terms in δθ. The exponentiated sum is equal either to iδθ_1 H^eff_1(θ) or to iδθ_2 H^eff_2(θ) when we set either δθ_2 = 0 or δθ_1 = 0, therefore the effective generators are also hermitian operators. Consider a probe ρ encoded by U_{θ+δθ}, i.e. ρ_{θ+δθ} := U_{θ+δθ} ρ U†_{θ+δθ}. All the informational quantities remain the same if a known unitary is applied to the state, since its effect can always be absorbed at the measurement stage (if the selected measurement set allows it). By choosing U_{−θ} to be this unitary we get U_{−θ} U_{θ+δθ} ρ U†_{θ+δθ} U†_{−θ} = U_{δθ} ρ U†_{δθ}, with U_{δθ} := U_{−θ} U_{θ+δθ}. We observe that the traces of H^eff_1(θ) and H^eff_2(θ) can be neglected without consequences, since they contribute only to a global phase.
Also, if the gate U_θ is used multiple times on an entangled state, so that the encoding is U_θ ⊗ U_θ ⊗ ··· ⊗ U_θ, the traces of the generators only give an irrelevant global phase. We now further manipulate the encoding and look for a parameterization in which the generators are orthogonal. Two qubit operators H_1 and H_2 are said to be orthogonal if {H_1, H_2} = 0. As null-trace hermitian operators on a qubit, H^eff_1(θ) and H^eff_2(θ) can be written H^eff_1(θ) = α_1(θ)σ_x + α_2(θ)σ_y + α_3(θ)σ_z, (E.6) H^eff_2(θ) = β_1(θ)σ_x + β_2(θ)σ_y + β_3(θ)σ_z, (E.7) with α(θ) := (α_1(θ), α_2(θ), α_3(θ)) ∈ R^3 and β(θ) := (β_1(θ), β_2(θ), β_3(θ)) ∈ R^3. The anticommutator is then {H^eff_1(θ), H^eff_2(θ)} = 2α(θ) · β(θ) 1, so that the generators are orthogonal when α(θ) · β(θ) = 0. We can decompose H^eff_2(θ) into a term proportional to H^eff_1(θ) and one orthogonal to it as follows (E.10) A rotation of the reference system can align H^⊥eff_i(θ) with σ_y and σ_z; remember though that the probe state must also be transformed. Let us introduce the unitary V_θ such that V_θ H^⊥eff_1(θ) V†_θ = σ_y and V_θ H^⊥eff_2(θ) V†_θ = σ_z, then V_θ U_{δθ} V†_θ = e^{i(δθ_1 σ_y + δθ_2 σ_z)}, while the probe state becomes V_θ ρ V†_θ.
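The effective generators and their orthogonalization can be sketched numerically. The encoding U_θ = exp[i(θ_1 σ_x + θ_2 σ_y)] used below is a hypothetical choice made for illustration and is not the encoding (46) of the paper; for it U_{−θ} coincides with U†_θ. The generators are obtained by finite differences as H^eff_l(θ) = −i U†_θ ∂_l U_θ, their Bloch vectors play the role of α(θ) and β(θ), and the orthogonal part of H^eff_2(θ) is obtained by removing from β its projection on α. The function names are likewise introduced here only for the example.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [SX, SY, SZ]


def expi(H):
    """exp(iH) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * w)) @ V.conj().T


def U(theta):
    # Hypothetical encoding chosen for illustration, not the paper's (46).
    return expi(theta[0] * SX + theta[1] * SY)


def effective_generators(theta, eps=1e-6):
    """H_l = -i U(theta)^dagger dU/dtheta_l, by central finite differences."""
    theta = np.asarray(theta, dtype=float)
    U0 = U(theta)
    gens = []
    for l in range(len(theta)):
        step = np.zeros_like(theta)
        step[l] = eps
        dU = (U(theta + step) - U(theta - step)) / (2 * eps)
        G = -1j * U0.conj().T @ dU
        gens.append((G + G.conj().T) / 2)   # remove rounding-level asymmetry
    return gens


def bloch(H):
    """Components of a traceless Hermitian qubit operator on the Pauli basis."""
    return np.array([np.trace(H @ s).real / 2 for s in PAULIS])


theta = np.array([0.4, 0.7])
H1, H2 = effective_generators(theta)
alpha, beta = bloch(H1), bloch(H2)

# Part of H2 orthogonal to H1: subtract the projection of beta on alpha.
beta_perp = beta - (alpha @ beta) / (alpha @ alpha) * alpha
H2_perp = sum(b * s for b, s in zip(beta_perp, PAULIS))

anti = H1 @ H2 + H2 @ H1
anti_perp = H1 @ H2_perp + H2_perp @ H1

# First-order check of U_{-theta} U_{theta+dtheta} ~ exp(i sum_l dtheta_l H_l).
dtheta = np.array([1e-3, -2e-3])
lhs = U(theta).conj().T @ U(theta + dtheta)
rhs = expi(dtheta[0] * H1 + dtheta[1] * H2)

print("{H1, H2} = 2 alpha.beta 1:", np.allclose(anti, 2 * (alpha @ beta) * np.eye(2), atol=1e-8))
print("{H1, H2_perp} = 0        :", np.allclose(anti_perp, 0, atol=1e-8))
print("first-order mismatch     :", np.linalg.norm(lhs - rhs))
```

The residual mismatch in the last line is of second order in δθ, consistent with the higher order terms that are neglected in the expansion.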