Entropic measurement uncertainty relations for all the infinite components of a spin vector

The information-theoretic formulation of quantum measurement uncertainty relations (MURs), based on the notion of relative entropy between measurement probabilities, is extended to the set of all the spin components for a generic spin s. For an approximate measurement of a spin vector, which gives an approximate joint measurement of the spin components, we define the device information loss as the maximum loss of information per observable occurring in approximating the ideal incompatible components with the joint measurement at hand. By optimizing over the measuring device, we define the notion of minimum information loss. By using these notions, we show how to give a significant formulation of state independent MURs in the case of infinitely many target observables. The same construction works as well for finitely many observables, and we study the related MURs for two and three orthogonal spin components. The minimum information loss also plays the role of a measure of incompatibility and in this respect it allows us to compare quantitatively the incompatibility of various sets of spin observables, with different numbers of components and different values of s.

In this work, our aim is to develop entropic MURs for all the infinite components of a spin s in the case of an approximate measurement of the full spin vector. The idea of formulating MURs for all the components of a generic spin s was introduced in [8]: the measurement of a spin vector is seen as an approximate joint measurement of its infinite components and the aim is to have a quantitative bound on the accuracy with which all these observables can be jointly approximated by such a device. In [8] the approximation error is quantified by Wasserstein distances between target and approximating distributions. Our approach, instead, is to see a measurement approximation as a loss of information and to quantify it by the use of the relative entropy [25-27]. In information theory, the relative entropy is the notion which quantifies the loss of information due to the use of an approximate probability distribution instead of the true distribution. This quantification is independent of a dilation of the measurement units and of a reordering of the possible values. In this context it is possible to arrive at MURs for any set of observables and to quantify their amount of incompatibility.
In [25] we succeeded in formulating state independent MURs for any set of n general observables taking a finite number of possible values. The lower bound appearing in these MURs was named entropic incompatibility degree, and it was shown to play the role of an entropy-based measure of incompatibility. The generalization to position and momentum was given in [26]. However, the formulation given in these two articles does not extend to infinitely many observables. In [27] we treated the case of all the infinite components of a spin-1/2 system, by an approach based on a mean over the directions. However, this approach cannot be extended to sets of observables for which a natural mean does not exist, and, in any case, it is very difficult to apply it to higher spins.
In this article we show how to quantify the 'inaccuracy' in an approximate measurement of the full spin vector, for any value of s, by introducing the notion of device information loss (section 3.1). Then, by optimizing on the measuring apparatus, we define the minimum information loss (section 3.2), by which the entropic MURs for a spin vector can be expressed, in a state independent form (section 3.3). A key point in the formulation of the MURs is the characterization of the class of approximate joint measurements of all the components of the spin vector (section 2.2). The main difference between the present approach and the one introduced in [25] is that now our focus is on the worst loss of information per observable, while previously it was on the total loss of information.
An important point is that the construction we propose for the spin case allows us to formulate MURs for finite and infinite sets of target observables on the same footing, always in a way that ensures independence from the measurement units, as invariant information-theoretic quantities are involved. As a byproduct, this approach also produces a 'normalized quantity of incompatibility' (the minimum information loss) for different choices of the target observables; this index can be used to compare sets with different numbers of observables from the point of view of incompatibility. So, after the construction of MURs for all the spin components in a measurement of the full spin vector, we also study the case of an approximate joint measurement of only 2 or 3 orthogonal spin components and show how the minimum information loss allows the quantitative comparison of the various cases (different numbers of components, different values of s). As already stressed in [8], a joint measurement of three orthogonal components is not equivalent to a joint measurement of all the components, in arbitrary directions, and only the case of infinite components respects the rotation symmetry of an angular momentum. So, it is meaningful to highlight the differences between the case of the spin components in all directions and the case of orthogonal components.

Scheme of the article
In section 2 we present the approximate joint measurements of all the spin components that we are going to analyze. These are based on approximate measurements of a spin vector, that is, generalized observables on the sphere (section 2.2): given a positive operator valued measure (POVM) on the sphere, we process it into an approximate joint measurement of all the spin components by a projection and discretization procedure on its output (section 2.2.1). After a general analysis of the rotationally covariant approximate measurements of a spin s, more explicit results are given for small spins in section 2.3. In section 3 we introduce the minimum information loss associated to any approximate measurement of a spin vector. Such a quantity is the lower bound in the state independent MURs for all the spin components, formulated in remarks 9 and 11. We also show that the information loss is minimized within the family of rotationally covariant POVMs on the sphere. In section 3.4 we show the connections between our entropic quantity and the incompatibility measures based on generalized noisy versions of the target observables. The numerical values of the minimum information loss are computed in section 3.5 for $s = 1/2$, in section 3.6 for $s = 1$ and in section 3.7 for $s = 3/2$. In section 3.5 we also present a state dependent form of the MURs in the special case $s = 1/2$. The MURs for two and three orthogonal components and the corresponding bounds are introduced in section 4. We also show that the minimum information loss plays the role of a figure of merit quantifying the incompatibility. The ordering from the least incompatible set to the most incompatible one is given in section 4.3, for different numbers of spin components (including the case of infinite components) and different spin values s. Section 5 presents conclusions and outlooks.

Approximate joint measurements of all spin components
In this section we introduce the general notations we shall use, our target observables (the set of all spin components) and the class of their approximating joint measurements.
We fix a Cartesian system $x, y, z$, determined by the orthonormal unit vectors $\mathbf i, \mathbf j, \mathbf k$. Let $s \in \{1/2, 1, 3/2, \ldots\}$ be the spin, with Hilbert space $\mathcal H \simeq \mathbb C^{2s+1}$. The corresponding state space (the space of all the statistical operators on $\mathcal H$) will be denoted by $\mathcal S$. In particular, in some discussions, we shall need the maximally mixed state $\rho_0 = \mathbb 1/(2s+1)$. We denote by $A^x$, $A^y$, $A^z$ the projection valued measures associated with the self-adjoint operators $S_x$, $S_y$, $S_z$ (respectively), by $X = \{-s, -s+1, \ldots, s\}$ the set of possible eigenvalues $m$, and by $A^{\mathbf n}(m)$ the eigen-projections of the spin component in the direction $\mathbf n$. As usual we shall identify $\mathbf n \cdot \mathbf S$ and $A^{\mathbf n}$ by calling both of them 'spin component'.
The set of observables which we are going to approximate by joint measurements (the reference or target observables) consists of all the spin components (the full spin vector). Let us now introduce the usual polar angles $\theta, \phi$ in the fixed reference system and denote by $\mathbf n(\theta,\phi)$ the unit vector in the direction determined by the polar angles $\theta$ and $\phi$: $\mathbf n(\theta,\phi) = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)$. In the following we shall need the rotation operator corresponding to a counterclockwise rotation of an angle $\theta$ around the unit vector $\mathbf n(\pi/2, \phi + \pi/2)$, see appendix A. Such a rotation brings the $\mathbf k$ axis to the $\mathbf n(\theta,\phi)$ one. Finally, the spin components enjoy the covariance property $U(R)\, A^{\mathbf n}(m)\, U(R)^* = A^{R\mathbf n}(m)$, where $U(R)$ is the (projective) representation of $SO(3)$ introduced in appendix A.

Approximate joint measurements
We are interested in a measurement of a spin vector, which can only be an approximate measurement, since otherwise it would be a joint measurement of its components, which are all incompatible. Then, an approximate measurement of a spin vector will be seen as an approximate joint measurement of its infinite components. In some sense, this is even an equivalence, if one follows the idea of [8, Section 4.1] that a joint measurement of all components of a vector is a positive operator valued measure (POVM) whose output is a vector. We shall come back to this point in remark 4 and in section 5. For a presentation of POVMs, also called resolutions of the identity, see [10, Sections 4.6, 9.3].

The distribution of an observable $A$ in a state $\rho$ will be denoted by $A_\rho$. The first step is to introduce the set of the approximate measurements of the spin vector. As formally the length of a spin is constant, we normalize it to 1 and we consider POVMs on the unit sphere $\mathbb S^2$ in $\mathbb R^3$. We denote by $\mathfrak F(\mathbb S^2)$ the set of all the POVMs on $\mathbb S^2$. The second step will be to approximate the target observables $A^{\mathbf n}$ with compatible observables $M^{\mathbf n}$ that share the same output space $X$ as $A^{\mathbf n}$; this will be done in section 2.2.1 by processing the output of a POVM on the sphere.
On the physical ground ([19, Chapter 4], [8, Section 4.4]), an essential physical property of a measurement of an angular momentum vector is its covariance under the rotation group. Moreover, even when an arbitrary POVM on the sphere is considered to model a possible measurement of an angular momentum, not necessarily rotationally covariant, one could expect covariance to emerge naturally from any reasonable optimality requirement; this happens in [8] and the present paper is no exception. Of course, the special properties of rotationally covariant POVMs on $\mathbb S^2$ will be the basis of some of our results. So, here we introduce the covariant POVMs on the sphere and give their properties. Remark 1. We denote by $\mathfrak F_{\rm cov}(\mathbb S^2)$ the set of all the rotationally covariant POVMs on $\mathbb S^2$. The covariance of a POVM $F \in \mathfrak F_{\rm cov}(\mathbb S^2)$ means that, for any Borel subset $B \subseteq \mathbb S^2$ and any rotation $R \in SO(3)$, $U(R)\, F(B)\, U(R)^* = F(RB)$. The structure of the POVMs in $\mathfrak F_{\rm cov}(\mathbb S^2)$ has been completely characterized in [19, Section 4.10], [8, p 24]: any covariant POVM on $\mathbb S^2$ can be expressed as a convex combination, with weights $\lambda_\ell$, of extreme covariant POVMs $F_\ell$. In particular, the normalization of the mixture $F$ for any choice of the $\lambda$'s implies the normalization of the measures $F_\ell$. Let us note that the choice of the z-axis is arbitrary.

Post-processing
By a natural post-processing procedure, we are now able to construct the compatible observables $M^{\mathbf n}$ on $X$, approximating the spin components $A^{\mathbf n}$. Let $\xi$ be the result obtained from a measurement on the system of $F \in \mathfrak F(\mathbb S^2)$. Given the observed value $\xi$, for every direction $\mathbf n$ we want a value for the ideal spin component $\mathbf n \cdot \mathbf S$, obtained by a suitable discretization of $\xi \cdot \mathbf n$. This discretization could be based on different criteria, such as angles of the same amplitude, or projections on $\mathbf n$ of the same length. In order to have a sufficiently large class of approximate measurements, we do not ask for such restrictions; we only ask for symmetry with respect to positive and negative values, so that we can identify $\mathbf n \cdot \mathbf S$ with $-\mathbf n \cdot \mathbf S$ up to a change of sign in the output value $m$.
Let us consider a set of angles dividing the interval $[0, \pi]$ into $2s+1$ pieces, symmetrically placed with respect to $\pi/2$. In other terms, let $C_m(\mathbf n)$, $m \in X$, be the $2s+1$ parts of the sphere obtained by using this discretization procedure around $\mathbf n$; by construction we obtain the expression (15), which defines a POVM belonging to $\mathfrak M(X^k)$.
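The projection-and-discretization step can be sketched numerically. In the minimal sketch below (function and variable names are ours, not the paper's), a single outcome ξ of the POVM on the sphere is mapped, for every direction n, to an outcome m according to which of the 2s+1 angular zones around n contains ξ; since every direction is processed from the same ξ, the resulting marginals are automatically compatible.

```python
import numpy as np

def discretize(xi, n, thetas):
    """Map a sphere point xi to a spin outcome m in {s, s-1, ..., -s},
    according to which zone C_m(n) contains xi; the zones are delimited by
    the angles 0 = theta_0 < theta_1 < ... < theta_{2s+1} = pi around n.
    Illustrative implementation, not the paper's code."""
    angle = np.arccos(np.clip(np.dot(xi, n), -1.0, 1.0))  # angle between xi and n
    zone = np.searchsorted(thetas[1:-1], angle)           # index of the zone containing xi
    s = (len(thetas) - 2) / 2                             # 2s+1 zones -> 2s inner cuts
    return s - zone                                       # outcome m = s for the zone around n

# spin 1: three zones, symmetric inner cuts theta_1, pi - theta_1 with cos(theta_1) = a
a = 0.4
thetas = np.array([0.0, np.arccos(a), np.pi - np.arccos(a), np.pi])

xi = np.array([0.0, 0.6, 0.8])          # a single outcome of the POVM on the sphere
for n in (np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])):
    m = discretize(xi, n, thetas)       # one xi yields a value for every direction:
    print(n, m)                         # this is why the marginals are all compatible
```

One measurement outcome on the sphere thus produces, by classical processing alone, a whole family of values, one per direction.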
Remark 3. By the construction we have followed, the POVMs (15) enjoy many properties; the most relevant ones are the following.
(i) When $k$ and $\mathbf n_1, \ldots, \mathbf n_k$ vary, the POVMs (15) are all compatible, because they are obtained by classical post-processing from a unique measure $F$.
(ii) By the fact that we have a measure on the space of the directions (the set $\mathbb S^2$) and that the post-processing is described by the intersections in (15), the introduced POVMs are invariant under any permutation of the couples $(\mathbf n_1, m_1), \ldots, (\mathbf n_k, m_k)$. (iii) Again by the structure (15), the introduced POVMs vanish any time the corresponding intersection among the sets $C_{m_i}(\mathbf n_i)$ is empty; the symmetric choice of the angles (14) also implies the symmetry property under the change $\mathbf n \to -\mathbf n$, $m \to -m$. The set of all these compatible POVMs implicitly defines a measure $M_F$ on $X^{\mathbb S^2}$ for all the spin components; then, the measures (15) are k-dimensional marginals of $M_F$. We denote by $\mathcal M_\infty$ the class of POVMs we get by this procedure: $F \in \mathfrak F(\mathbb S^2)$ followed by the post-processing described above.
Exactly for this reason, here we follow [8] in starting from measures on the sphere, and we study only POVMs belonging to $\mathcal M_\infty$; as a matter of fact, we will prove that they allow one to minimize the information lost in the approximation.

Remark 6. Inside the class of covariant measurements there are $2s + \lfloor s \rfloor$ free parameters: $2s$ parameters from the $\lambda$'s and $\lfloor s \rfloor$ from the angles $\theta$; $\lfloor s \rfloor$ is the integer part of $s$.

The structure of the covariant approximating spin components
We now study the structure of the covariant POVMs in $\mathcal M_\infty^{\rm cov}$. The univariate marginal $M^{\mathbf n}[\lambda]$ represents the admissible approximation of $A^{\mathbf n}$ and its expression is given below. The compatible univariate POVMs $M^{\mathbf n}[\lambda]$ will be central in our formulation of the MURs and we shall call them 'approximate spin components'.
In order to study the MURs for spin observables (section 3), we need a more explicit form of $M^{\mathbf n}[\lambda](m)$, for which the following probabilities are needed: $q_\ell(m|h;\theta)$, the probability of getting the result $m$ in a measurement of $M^{\mathbf k}_\ell$ when the system is in the eigenstate $\rho_h$ of $S_z$. The vector $\theta$ is the set of the discretization angles (12), defining the sets $C_m(\mathbf k)$ (22).
As stated in the following theorem, the q-coefficients involve the Wigner small-d-matrix [28, Section 3.6], defined by $d^s_{m'm}(\theta) = \langle m'|\, \mathrm e^{-\mathrm i \theta S_y} |m\rangle$, where $|m\rangle$ is the normalized eigenvector of $S_z$ of eigenvalue $m$.

Theorem 1. Each admissible approximate measurement of $\mathbf n \cdot \mathbf S$ (21) is diagonal in the basis of the eigenvectors of $\mathbf n \cdot \mathbf S$; indeed, the approximate spin components (23) have the form (26), in which the q-coefficients (24) appear. Moreover, these coefficients turn out to be given by (27), in terms of the Wigner small-d-matrix defined in (25). Finally, the properties (28) hold.
Proof. By using the expressions (22) and (7) inside the probabilities (24) we get (27). By (23) this proves (26).
The first property in (28) follows from (27) and the fact that $q_\ell(\,\cdot\,|h;\theta)$ is a probability distribution; therefore the strict positivity in (28) holds. The second property in (28) follows from (A.6) and the symmetry of the angles in the discretization (12).
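The q-coefficients above are built from the Wigner small-d-matrix. As a self-contained illustration (not taken from the paper), the standard sum formula for $d^j_{m'm}(\beta)$ can be implemented directly; half-integer spins are handled exactly by passing $2j$, $2m'$, $2m$ as integers.

```python
from math import factorial, cos, sin

def wigner_small_d(two_j, two_mp, two_m, beta):
    """Wigner small-d matrix element d^j_{m',m}(beta) = <m'| exp(-i beta J_y) |m>,
    from the standard sum formula; 2j, 2m', 2m are integers."""
    jpm, jmm = (two_j + two_m) // 2, (two_j - two_m) // 2        # j+m, j-m
    jpmp, jmmp = (two_j + two_mp) // 2, (two_j - two_mp) // 2    # j+m', j-m'
    dm = (two_mp - two_m) // 2                                   # m' - m
    pref = (factorial(jpm) * factorial(jmm)
            * factorial(jpmp) * factorial(jmmp)) ** 0.5
    total = 0.0
    # k runs over all values keeping every factorial argument nonnegative
    for k in range(max(0, -dm), min(jpm, jmmp) + 1):
        total += ((-1) ** (k + dm)
                  / (factorial(jpm - k) * factorial(k)
                     * factorial(jmmp - k) * factorial(k + dm))
                  * cos(beta / 2) ** (jpm + jmmp - 2 * k)
                  * sin(beta / 2) ** (2 * k + dm))
    return pref * total

print(wigner_small_d(1, 1, 1, 0.7))   # d^{1/2}_{1/2,1/2}(0.7) = cos(0.35)
print(wigner_small_d(2, 0, 0, 0.7))   # d^{1}_{0,0}(0.7) = cos(0.7)
```

The two printed values reproduce the textbook closed forms for $j = 1/2$ and $j = 1$, which is a quick sanity check of the sign and phase conventions.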

Noise and compatibility
Definition 1 says that the q-coefficients are probabilities with respect to $m$; then, the quantities $q_\ell(\,\cdot\,|\,\cdot\,;\theta)$ are transition matrices, independent of the system state $\rho$. Then, equations (26) and (31) can be interpreted by saying that, given the direction $\mathbf n$, each covariant approximating spin component $M^{\mathbf n}[\lambda]$ could be obtained by measuring exactly the target observable $A^{\mathbf n}$ and then perturbing the result with some classical noise, through a one-step stochastic evolution given by one of the transition matrices just introduced. As we have seen in remark 3, the univariate POVMs $M^{\mathbf n}[\lambda]$ are all compatible because they are obtained by classical post-processing from the unique POVM $F[\lambda]$; the compatibility is not implied by the structure (26) alone. The use of classical transition matrices (Markov kernels) to transform incompatible observables into compatible ones has already been exploited in related problems [8, 10, 29].
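The Markov-kernel reading of (26) and (31) can be illustrated with toy numbers: the approximating distribution is the exact one pushed through a state-independent transition matrix. The matrix below is purely illustrative, not one of the paper's q-matrices.

```python
import numpy as np

# Classical post-processing: the approximating distribution is the exact one
# pushed through the state-independent transition matrix q[h, m]
# (probability of reporting m when the exact value is h).
q = np.array([[0.70, 0.20, 0.10],
              [0.15, 0.70, 0.15],
              [0.10, 0.20, 0.70]])     # rows sum to 1: a Markov kernel on X
p_exact = np.array([0.5, 0.3, 0.2])    # distribution of A^n in some state rho
p_approx = p_exact @ q                 # distribution of M^n in the same state
print(p_approx)                        # still a probability: nonnegative, sums to 1
```

Since the kernel does not depend on the state, the same stochastic map degrades the statistics of every state in the same way.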
A different approach [9,10,30,29,31,32] to the construction of compatible observables is to consider noisy versions of the target observables.

Definition 2.
If $D$ is an observable and $N$ another POVM with the same value space, the mixture $M = \eta\, D + (1-\eta)\, N$ is said to be a noisy version of the observable $D$ with noise $N$ and visibility $\eta$.
Given the target observables $D_j$, $j \in I$, and the class of permitted noises, the problem considered in the quoted references is to see how much noise has to be added to the target observables in order to get compatible POVMs of the form $\eta D_j + (1-\eta) N_j$. The various approaches in the literature differ in the classes of admissible noises; often only classical noise is considered, i.e. $N_j(m) = p_j(m)\, \mathbb 1$, where $p_j$ is a classical probability, independent of the system state [9, 30, 31]. A review of some choices for the noise classes introduced in the literature is given in [32]; the typical choices are: (a) classical noises, (b) noises represented by compatible POVMs, (c) general POVMs.
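Definition 2 and the classical-noise choice can be illustrated for a spin-1/2 sharp observable. In this toy example of ours, the projections of $\sigma_z$ are mixed with uniform classical noise and the result is checked to still be a POVM.

```python
import numpy as np

# Noisy version of a sharp observable (definition 2): M(m) = eta*D(m) + (1-eta)*N(m),
# here with classical noise N(m) = p(m)*Id for the PVM of sigma_z (outcomes +-1/2,
# labeled +1/-1 for brevity).  Illustrative sketch, not the paper's code.
D = {+1: np.diag([1.0, 0.0]), -1: np.diag([0.0, 1.0])}   # projections of sigma_z
p = {+1: 0.5, -1: 0.5}                                   # uniform classical noise
eta = 0.5                                                # visibility

M = {m: eta * D[m] + (1 - eta) * p[m] * np.eye(2) for m in D}

assert np.allclose(sum(M.values()), np.eye(2))           # normalization: sum_m M(m) = Id
assert all(np.linalg.eigvalsh(E).min() >= 0 for E in M.values())  # positivity
print(M[+1])                                             # effect for outcome +: diag(3/4, 1/4)
```

Lowering the visibility η shrinks the effects toward the state-independent noise, which is exactly how incompatibility is eventually washed out.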
The marginals of an approximating joint measurement in $\mathcal M_\infty^{\rm cov}$ can be expressed as noisy versions of the corresponding target observables, in a way which will be useful for comparisons, as stated in the following remark.
It is easy to see that $N^{\mathbf n}[\lambda,\theta](m)$ is positive and that $N^{\mathbf n}[\lambda,\theta]$ is indeed a POVM; then, the proof of the decomposition (32) is trivial. This simple expression is due to the fact that each target POVM $A^{\mathbf n}$ and its approximating POVM $M^{\mathbf n}[\lambda]$ are diagonal in the same basis. Due to covariance, the visibility $\eta_{\lambda,\theta}$ does not depend on $\mathbf n$. Due to the strict positivity (28) of the q-coefficients, the visibility is strictly positive; moreover, it cannot be 1, which would be possible only if the target observables were already compatible.
In expressing $M^{\mathbf n}[\lambda]$ as a mixture of $A^{\mathbf n}$ and some 'noise', the decomposition is not unique. In writing the decompositions (32) we have chosen the maximum possible value of the visibility $\eta_{\lambda,\theta}$, without imposing conditions on the class of allowed noises. As we remarked above, the last class of noises discussed in [32] is indeed that of general POVMs. If the class of noises is restricted, the value of the visibility can only diminish, as we see in the example of $s = 1/2$, section 2.3.1.

Unbiased measurements
Symmetry is not the only requirement used to restrict the class of possible approximate joint measurements of incompatible target observables. In [4, 33, 34] spin measurements with unbiased marginals are considered; by this it is meant that the outcomes of the measurement are uniformly distributed when the system is in the maximally mixed state. Note that in the field of inferential statistics this term has a different meaning, cf. [19, Chapter 6]. Taking into account that our target observables $A^{\mathbf n}$ are indeed unbiased in this sense, it could be reasonable to ask for this restriction also for the approximating observables. In the case of covariant approximate joint measurements, by (31) and (30), asking for the uniform distribution of the outcomes in the maximally mixed state $\rho_0$ (1) immediately implies the strong restriction (35). This choice corresponds to discretizing $\xi \cdot \mathbf n$ by dividing the interval $[-1, 1]$ into subintervals of equal length. Using the minimization of the information loss as criterion of goodness, as done in section 3, the best approximate joint measurement does not always satisfy this restriction (see sections 3.6, 3.7), and we do not ask for unbiasedness. Also in other contexts, biased measurements have turned out to be optimal [31].
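The unbiasedness restriction, i.e. dividing $[-1, 1]$ into $2s+1$ subintervals of equal length, fixes the discretization angles completely. A short sketch (the function name is ours) that recovers $a = 1/3$ for $s = 1$ and $a = 1/2$ for $s = 3/2$, the values quoted in section 2.3:

```python
import numpy as np

def unbiased_cut_angles(two_s):
    """Angles theta_j whose cosines divide [-1, 1] into 2s+1 subintervals of
    equal length (the unbiasedness restriction (35)); illustrative helper."""
    cuts = 1.0 - 2.0 * np.arange(1, two_s + 1) / (two_s + 1)  # cos(theta_j), j = 1..2s
    return np.arccos(cuts)

# spin 1 (2s = 2): cuts at cos(theta) = 1/3, -1/3  ->  a = cos(theta_1) = 1/3
print(np.cos(unbiased_cut_angles(2)))
# spin 3/2 (2s = 3): cuts at 1/2, 0, -1/2  ->  a = cos(theta_1) = 1/2
print(np.cos(unbiased_cut_angles(3)))
```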
2.3. Covariant approximate joint measurements for spin 1/2, 1, 3/2
For small spins we can get explicit results by particularizing the discretization procedure of section 2.2.1 and using the q-coefficients computed in appendix A.2.

Spin 1/2
In this case only three angles appear in the post-processing and they are completely determined by (12): $\theta_0 = 0$, $\theta_1 = \pi/2$, $\theta_2 = \pi$. So, no free parameter is introduced by the discretization of the directions and a single free parameter remains, coming from the $\lambda$'s, see remark 6. These angles automatically satisfy (35) and this means that for $s = 1/2$ any observable in $\mathcal M_\infty^{\rm cov}$ is unbiased in the sense of section 2.2.4. The most general expression of the approximate spin components (21), (22) has already been obtained in [27], Section 5, but it can be computed also from the explicit form of the q-coefficients given in (A.9); note that $2\mathbf S$ is the vector of the Pauli matrices.

Spin 1
In this case the discretization introduces a single free parameter, $a := \cos\theta_1$, $a \in (0,1)$ (see (41)). To get unbiased marginals, according to (35), we would have to take $a = 1/3$; as we already wrote in section 2.2.4 we do not ask for this and we leave the parameter $a$ free.

Spin 3/2
Also in this case the discretization introduces a single free parameter, $a := \cos\theta_1$, $a \in (0,1)$; other three free parameters come from the $\lambda$'s, see remark 6. The q-coefficients are computed in appendix A.2.3; then, the approximate spin components are given by (26), (21) and the probability distribution by (31) (we do not write them explicitly, because the formulae are very long). To get unbiasedness, according to (35), we would have to take $a = 1/2$.

Entropic MURs for the set of all the spin components
A spin vector cannot be exactly measured, as its components are incompatible observables and a joint measurement can only approximate them. In information theory [35-37] the relative entropy is the quantity introduced to measure the error made when one uses an approximating probability distribution in place of the true one. Let us stress that the relative entropy is an intrinsic quantity: it is independent of the measurement units of the involved observables and of any renaming or reordering of the possible values. Such a property does not hold for non-entropic measures of the error.
In [25] we used as error function the sum of the relative entropies, each one involving a single target observable, because this sum represents the total loss of information; however, this approach cannot be extended to infinitely many observables. To overcome this difficulty, instead of the sum, we shall consider the maximum of the relative entropies over all target observables: this maximum represents the loss of information for the worst direction. Then, we consider the worst case also with respect to the system state. Finally, we shall optimize with respect to all approximating joint measurements. This is indeed the procedure used in [4, 5, 8], apart from the starting point (distances between distributions for them).

The device information loss
Let us recall that $\mathcal A_\infty$ (3) is the set of all the spin components (our target observables), that $\mathcal M_\infty$ is the class of the approximate joint measurements for all the spin components, and that the relative entropy of two probabilities $p$, $q$ on $X$ is
$$S(p\|q) = \sum_{m \in X} p(m) \log \frac{p(m)}{q(m)},$$
where the logarithm is to base 2: $\log \equiv \log_2$. Recall that the form $0 \log 0$ is taken to be zero and that the relative entropy can be $+\infty$ when the support of the first probability distribution is not contained in the support of the second one. When a covariant measurement is considered, by using the expression (31) of the distributions, the relative entropy takes the form (45). As all the q-coefficients are strictly positive (28), the relative entropy (45) is always finite.
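The conventions just stated can be written down directly. A minimal helper of ours: base-2 logarithm, $0 \log 0 = 0$, and $+\infty$ when the support condition is violated.

```python
import math

def rel_entropy(p, q):
    """Relative entropy S(p||q) = sum_m p(m) log2(p(m)/q(m)), with the
    conventions 0*log0 = 0 and S = +inf when supp(p) is not in supp(q)."""
    s = 0.0
    for pm, qm in zip(p, q):
        if pm == 0.0:
            continue                  # 0 log 0 = 0
        if qm == 0.0:
            return math.inf           # support of p not contained in support of q
        s += pm * math.log2(pm / qm)
    return s

print(rel_entropy([1.0, 0.0], [0.75, 0.25]))   # log2(4/3): sharp vs. smeared distribution
print(rel_entropy([0.5, 0.5], [1.0, 0.0]))     # infinite: support violation
```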
The relative entropy (44) depends on the state and on the choice of the observable (the direction $\mathbf n$). To characterize an information loss due only to the measuring device, represented by the multi-observable $M$ approximating all the observables in $\mathcal A_\infty$, we consider the worst case of (44) with respect to the system state and the measurement direction. So, we define the device information loss by
$$\mathcal D_\infty(M) := \sup_{\rho \in \mathcal S}\, \sup_{\mathbf n \in \mathbb S^2} S\big(A^{\mathbf n}_\rho \,\big\|\, M^{\mathbf n}_\rho\big). \qquad (46)$$

This quantity is the analogue of the entropic divergence introduced in [25, definition 2]; using the worst case over the directions instead of the sum of the relative entropies, as done there, allows us to consider also infinitely many target observables. Alternatively, in [27] we started from the mean of the relative entropies over all the directions, but this approach gives rise to computations intractable outside the case $s = 1/2$, and without possible extensions to cases in which an invariant mean does not exist.
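For a covariant measurement the double supremum reduces, as stated below in theorem 2, to a maximum over the eigen-projections, where the target distribution is a point mass at the exact value. Our reading of this reduction, sketched with a purely illustrative transition matrix (the diagonal entry $q(h|h)$ is all that survives in $S(\delta_h \| q(\cdot|h))$):

```python
import numpy as np

def device_information_loss(q):
    """Device information loss of a covariant approximate measurement with
    state-independent transition matrix q[h, m] (probability of outcome m
    given exact value h).  On the eigen-projection with value h the target
    distribution is the point mass at h, so S(delta_h || q(.|h)) = -log2 q(h,h);
    the loss is the maximum over h.  A sketch of our reading of theorem 2,
    not the paper's code."""
    return max(-np.log2(q[h, h]) for h in range(q.shape[0]))

# toy spin-1 kernel (illustrative numbers, not a computed q-matrix)
q = np.array([[0.70, 0.20, 0.10],
              [0.15, 0.70, 0.15],
              [0.10, 0.20, 0.70]])
print(device_information_loss(q))   # -log2(0.7): worst loss over the eigenstates
```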
Theorem 2. The device information loss (46) is always strictly positive. In the case of a covariant measurement, the double supremum in the definition (46) of the device information loss is a maximum.

Moreover, the maximum over the states is realized in an eigen-projection of the spin component. To prove (48), we need the notion of symmetrized version of a generic POVM on the sphere. By (53), (8), and the convexity of the relative entropy, we get (48). Then, the device information loss (46) can be written in the form (51), which is finite because of the strict positivity (28) of the q's.

The minimum information loss
By optimizing over the class $\mathcal M_\infty^{\rm cov}$ of the physical approximating measurements we get a lower bound for the device information loss; we call it minimum information loss. An analogous quantity can be defined also for the larger class $\mathcal M_\infty$. The two minimum information losses turn out to be equal, as shown in theorem 3. The quantity has interesting properties; in particular, as shown in theorem 3, it is strictly positive. Moreover, in the spin definition given in section 2.1 we have used $\hbar = 1$, but (54) is independent of this choice, because of the invariance properties of the relative entropy. The minimum information loss will appear in the formulations of the MURs (section 3.3) and it can be used as a measure of the incompatibility of the set of the target observables. The expression (54) can be elaborated and a more explicit form can be obtained.
Theorem 3. The two information losses (54) and (55) are equal. The minimum information loss (54) can be expressed in terms of the q-coefficients (24) as (57), where $\theta$ is the set of angles satisfying the discretization conditions (12) and involved in the expression (27) of the q-coefficients. Moreover, the bounds (58) hold.
Proof. One inequality is immediate; the opposite inequality is implied by (48), so equality (56) is proved. To prove the first inequality in (58) we rely on the results of [25]: the entropic incompatibility degree for two target observables, defined in equation (10) of [25], is strictly positive when the two observables are incompatible [25, Theorem 2, point (v)].

Entropic MURs
By the strict positivity of the minimum information loss proved in theorem 3, the definitions (54), (55), and the equality (56), we get a first formulation of the MURs, in a state independent form, which is analogous to that given in [8, (11)].
Remark 9 (MURs, first version). For every approximate joint measurement $M$ of all the spin components, the device information loss (46) is bounded from below by the strictly positive minimum information loss (54). The upper bound in (58) is surely not tight, as it has been obtained by starting from the uniform distribution on the sphere; this can be checked in the explicit cases of small spins given below. However, the role of this bound is at least to say that, when a device information loss is greater than it, the approximating measurement is not optimal.
By the fact that the device information loss of a covariant approximation is a maximum and has the form (50), we have immediately the following formulation of the MURs for covariant approximate spin measurements.

Remark 11 (MURs for covariant measurements, second version). For a covariant approximate joint measurement the state independent MURs hold with the supremum over the states attained: such a state $\rho$ is one of the eigen-projections of $\mathbf n \cdot \mathbf S$.
So, in a physical approximate joint measurement $M$ of all the spin components $A^{\mathbf n}$, $\mathbf n \in \mathbb S^2$, the loss of information per direction $\mathbf n$ cannot be arbitrarily reduced. It depends on the state $\rho$ and on the direction $\mathbf n$, but for every $\mathbf n$ it can potentially be as large as this bound. We shall compute analytically the minimum information loss in the cases $s = 1/2, 1, 3/2$. For higher spins, a numerical approach is possible, as the computation has been reduced to the optimization problem (57) over a finite number of real parameters, appearing in the integrals (27) of known polynomials related to the Wigner small-d-matrix (appendix A.1). Equation (62) gives a simple relation between the maximal visibility (61), (33) and the minimum information loss (54), (57) in the case of the spin vector. By our construction, we have also obtained that, inside the class of covariant measurements $\mathcal M_\infty^{\rm cov}$, maximizing the visibility and optimizing the information loss give the same optimal measurement. Let us note that this result is due to the fact that the target observable and the approximating POVM are jointly diagonal.

Minimum information loss and noisy versions of the target observables
Our aim in introducing the device information loss (46) and the minimum information loss (54), (55) was to have uncertainty measures, based on information theory, by which MURs could be expressed in a simple way (section 3.3); this construction also produced an incompatibility measure, the minimum information loss. The result above gives a link with the robustness measures [9, 10, 29-32], which quantify the incompatibility by maximizing the visibility; in other terms, these measures are based on the ability of the target observables to maintain incompatibility against noise.

Spin 1/2
In this case no free parameter comes from the angle discretization and the approximate spin components (37) are very simple; the resulting marginal is an unbiased noisy version of $A^{\mathbf n}$ (cf. section 2.3.1). Directly from the definition (54) and the expression (63) we obtain (64). Since there is no freedom in the choice of the $\theta$'s and the infimum is reached for $\lambda_1 = 1/2$, the measurement $M^{1/2}$ is optimal. Then, by (37) we get the form of the marginal (65). In (65) we have written the marginal of the optimal measurement in two different ways. Firstly, as a noisy version with classical noise, with visibility 1/2. Then, by the expression (32) with general noise and visibility 3/4; it is this last visibility which is related to our minimum information loss, see (62).
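The two decompositions of the optimal spin-1/2 marginal can be checked directly. In this sketch of ours, the general noise is identified by equating the two mixtures, which forces it to be the flipped projection; the labels and matrices are our illustration of (65), not the paper's code.

```python
import numpy as np

# M(m) written two ways (outcomes m = +-1/2, labeled +1/-1; P_m are the
# eigen-projections of n.S for a chosen direction n, here the z axis):
#   classical noise, visibility 1/2:  M(m) = (1/2) P_m + (1/2) (1/2) Id
#   general noise,   visibility 3/4:  M(m) = (3/4) P_m + (1/4) P_{-m}
P = {+1: np.diag([1.0, 0.0]), -1: np.diag([0.0, 1.0])}
Id = np.eye(2)

for m in (+1, -1):
    classical = 0.5 * P[m] + 0.5 * (0.5 * Id)   # visibility 1/2, uniform classical noise
    general = 0.75 * P[m] + 0.25 * P[-m]        # visibility 3/4, noise given by a POVM
    assert np.allclose(classical, general)      # same effect: diag(3/4, 1/4) for m = +1
print("both decompositions agree")
```

Allowing a general POVM as noise thus raises the visibility from 1/2 to 3/4 without changing the measurement at all, which is the point made in the text.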
Let us remark that, actually, $M^{1/2}$ enjoys a useful additional property. By using the state representation (39) and the explicit expressions (40) for the probabilities, we obtain, for $s = 1/2$, the expression (66). The parameter $r$ is the Bloch vector characterizing the state $\rho$. By taking the c-derivative, we see that it is strictly negative, which implies that $s(c, x)$ decreases when $c$ increases. This means that $M^{1/2}$ minimizes (66) for any state $\rho$. This peculiarity of the case $s = 1/2$ makes it possible to state that $M^{1/2}$ is optimal even when we know the system state $\rho$, and to easily formulate also a form of state dependent MURs.

Spin 1
In this case there is a single parameter (41) coming from the angle discretization; then, the minimum information loss and the optimal measurement can be computed.
The first term in the minimum decreases with a and the second one increases; this means that the supremum over a is reached when these two terms are equal, which happens when (70) holds. This proves (69). It is possible to check that (71) is the unique real solution of (70) and that this gives the properties (72).
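The equalization argument (the supremum over $a$ of the minimum of a decreasing and an increasing term is attained at their crossing) can be sketched with a generic bisection; the two functions below are hypothetical monotone stand-ins, not the actual terms of (69).

```python
def crossing(f1, f2, lo, hi, tol=1e-12):
    """Find a with f1(a) = f2(a) on [lo, hi], assuming f1 decreasing and
    f2 increasing; at this point sup_a min(f1(a), f2(a)) is attained."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f1(mid) > f2(mid):
            lo = mid      # still left of the crossing: min is f2, still growing
        else:
            hi = mid      # right of the crossing: min is f1, now decreasing
    return 0.5 * (lo + hi)

f1 = lambda a: 1.0 - a    # decreasing stand-in for the first term
f2 = lambda a: 2.0 * a    # increasing stand-in for the second term
a0 = crossing(f1, f2, 0.0, 1.0)
print(a0)                 # the maximizer of min(f1, f2): here a0 = 1/3
```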
Equation (69) also implies that the optimal measurement is the one with $a_0$ given in theorem 6. The expression of the optimal noise could be obtained as well, but it would be involved and we do not give it explicitly here. By direct computations one can check that the optimal measurement is biased: on the maximally mixed state $\rho_0$ it gives a non-uniform distribution of the outcomes.

 
Let us note that there is an optimal POVM, the one with the parameter c determined by r, the same as the one appearing in [5, 9, 25], where different optimality criteria were used. By using this measurement it would be possible to give a state dependent version of the MURs, as done in remark 12.

The bounds from optimal cloning
For s > 1/2 we can get a bound on the minimum information loss by using the POVM obtained from optimal cloning because, by construction, the minimum information loss is bounded from above by the device information loss of any specific joint measurement.

The visibilities above have been obtained by allowing for general noises, not only classical ones. Inside the noise robustness approach to incompatibility, the two visibilities for spin 1/2 have already been obtained in [32]; they are in the class called incompatibility generalized robustness, which means that general POVMs are allowed as noises. By comparing with section 3.4, we can say that we have shown how to generalize this approach to the case of infinitely many observables, such as the spin vector. Moreover, by using information loss measures, we have shown how to link this problem with the one of uncertainty measures and MURs. Let us also stress that formulae like (100) and (62) hold only in these particular cases; they do not have general validity. The case of non-orthogonal spin components [25, 32] could be a promising test to see the differences. In principle, our minimum information loss does not rely on the noisy versions of the target observables.

Conclusions
The entropic formulation of MURs has the advantage of being well grounded in information theory (in particular in the notion of information loss) and of being independent of the measurement units of the observed physical quantities and of any reordering of their possible values [25][26][27]. By using the case of the spin components, in this article we have shown that the approach based on the relative entropy can be extended so as to treat on the same footing finitely or infinitely many observables, and that a quantitative uncertainty bound can be constructed. By introducing the worst information loss with respect to the target observables and the system states, we have defined the device information loss in the various cases (46), (86), (88). Then, by optimizing with respect to the approximating joint measurements, we have defined the minimum information loss (54), (87), (89). These two quantities allow for a clear formulation of state independent MURs, see sections 3.3 and 4.2.1.
To realize the minimum information loss one needs also to optimize the approximating measurement; an interesting point is that the 'best' approximating measurement of a target spin observable is not necessarily a noisy version of the target with classical noise: more general noise structures can be involved, as discussed in sections 2.2.3, 3.4, 4.4.
Moreover, the lower bound appearing in the state independent MURs, the minimum information loss, also plays the role of a measure of incompatibility and allows one to order different sets of target observables according to increasing incompatibility, as done in the inequalities (78), (91), (99).
However, computing the two 'information losses' requires solving difficult optimization problems, and we have carried out these computations only for small values of s, sections 3.5, 3.6, 3.7, 4.2.2. To compute the minimum information loss for other values of the spin, numerical computations will surely be needed.
Another open problem is the conjecture given after inequality (78): is it true that the minimum information loss grows with s? For the cases of two and three orthogonal components we proved that the minimum information loss is upper bounded by a value independent of s, see (98). However, for the case of infinitely many components we proved only the existence of the upper bound (58), which grows with s; the asymptotic behaviour of the minimum information loss remains open, even with classes of measurements larger than M^∞. Indeed, one could consider post-processing procedures different from ours, or even general POVMs on S^2 × X that are not constructed by post-processing of a POVM on S^2. Our conjecture is that even these more general POVMs cannot give a lower information loss.