General measure for macroscopic quantum states beyond "dead and alive"

We consider the characterization of quantum superposition states beyond the pattern "dead and alive". We propose a measure that is applicable to superpositions of multiple macroscopically distinct states, superpositions with different weights, as well as mixed states. The measure is based on the mutual information to characterize the distinguishability between multiple superposition states. This allows us to overcome limitations of previous proposals, and to bridge the gap between general measures for macroscopic quantumness and measures for Schrödinger-cat type superpositions. We discuss a number of relevant examples, provide an alternative definition using basis-dependent quantum discord and reveal connections to other proposals in the literature. Finally, we also show the connection between the size of quantum states as quantified by our measure and their vulnerability to noise.


I. INTRODUCTION
When going from single microscopic particles to composite systems with many degrees of freedom, quantum mechanics shows enormous complexity. Genuine quantum features such as entanglement or nonlocality fall into several subclasses, and notions such as "maximally entangled state" cannot be generalized in a straightforward way. A full characterization of meso- or even macroscopic quantum systems seems to be out of reach, not only for practical reasons. However, one can identify global properties in such systems that are only mildly influenced by microscopic details. One such aspect is the macroscopic quantumness in large quantum systems.
A historic example that has played an important role is the so-called Schrödinger-cat state [1]. The pictorial idea of Schrödinger's cat in a macroscopic superposition of dead and alive and entangled with a radioactive atom is easy to grasp. However, as first emphasized by Leggett [2], the so-called macroscopic distinctness of the two superposed components |alive⟩ + |dead⟩ is a particular feature that is not present in every quantum effect brought to macroscopic scales. As a counterexample, Leggett mentions superconductivity on visible scales. In order to further elaborate on this difference, several proposals to formalize the concept of macroscopic distinctness based on the "dead and alive" structure of a quantum state have been put forward [3][4][5][6][7][8]. For instance, the redundancy of information encoding in subparts of the system (like in the cells of the biological cat) [5], or the distance measured in units of "microscopic steps" [6] have been suggested. Even though these approaches are conceptually appealing, they suffer from some shortcomings. A general pure state does not have a Schrödinger-cat-like structure, and, though one can always try to find a decomposition of a state into "dead and alive", such a decomposition is never unique. Even in the case where a natural choice seems to exist, this may not automatically lead to the maximal result [9]. In addition, extensions to superpositions with different weights, or to mixed states, are not straightforward. This limits the proposals to the analysis of ideal situations, while experimental data are difficult to interpret.
Other measures are directly formulated for arbitrary quantum states [2][10][11][12][13][14]. Some of them are based on a pre-chosen observable of the system and define generalized notions of "macroscopically distinct" as the spread of the wave function in the spectrum. The variance of this observable for pure states is closely connected to some proposals [10][11][12][13]. The more general approaches are, however, sometimes criticized for lacking the conceptual beauty and clear physical intuition (as given by the distinctness of the two components for the other measures).
In this paper, we close the gap between these two basic approaches. We propose a measure that is applicable to superpositions of multiple states with unequal weights and is readily extendable to mixed states, thereby overcoming the shortcomings of previous proposals. We start with the intuition that |alive⟩ and |dead⟩ are macroscopically distinct if the two states can be distinguished by "classical" detectors [7], i.e., detectors that do not in general completely collapse the system into perfectly orthogonal states upon measurement but only weakly disturb the system. Needless to say, such detectors also do not perfectly extract information about any state of the system; hence they are said to have a limited precision or resolution. Considering general pure states |Ψ⟩ without specifying a subdivision into two branches, we quantify how much information about |Ψ⟩ can be extracted by such detectors. The informative content is measured with the mutual information between system and measurement apparatus in the relevant bases. Then, we attribute an effective size to |Ψ⟩ as the robustness of this information with respect to the detector's resolution. The concept of macroscopic distinctness is hence formalized as "macroscopically extractable information". This idea is generalized to mixed states via a convex-roof construction (i.e., considering the "worst case"). In contrast to [8], which follows a similar approach using pairwise distinguishability, using the mutual information ensures a holistic treatment, and allows for a connection between different approaches.
The paper is organized as follows. In Sec. II, we formalize the intuitive idea and define our measure for pure states. We discuss a paradigmatic example of multiple superposition states and establish a connection to the variance. In Sec. III, we present an extension of our measure to mixed states using a convex-roof construction, and illustrate it with a simple example. In Sec. IV, we introduce alternative ways to formalize our intuition. For one of them, we use the basis-dependent discord. We discuss implications and connections between the alternatives and to other proposals from the literature. In Sec. V, we provide a connection between macroscopic distinctness as formalized by our measure and fragility of entanglement to another system. We summarize and conclude in Sec. VI.

II. MACROSCOPICALLY EXTRACTABLE INFORMATION
A. Abstract definition
Consider a set S = {√p_ℓ |A_ℓ⟩}_{ℓ=1}^{N} of N quantum states |A_ℓ⟩ and N corresponding probabilities p_ℓ. This set can be used to define a superposition state
$$|\Psi_S\rangle = \sum_{\ell=1}^{N} \sqrt{p_\ell}\, |A_\ell\rangle \qquad (1)$$
or a "micro-macro" entangled state
$$|\Psi_{mM}\rangle = \sum_{\ell=1}^{N} \sqrt{p_\ell}\, |\ell\rangle_m |A_\ell\rangle_M \qquad (2)$$
between the system and some microscopic system with N orthogonal states {|ℓ⟩}_{ℓ=1}^{N} called "the atom" [15]. We wish to construct a meaningful definition for the size of such superposition states, based on some notion of generalized macroscopic distinctness of the superposed components S.
Following [7], we assume that we measure the macroscopic system with a measurement device that has a rather coarse-grained resolution ∆ (i.e., "low resolution" means large ∆). Let us consider a game in which Bob draws a random variable ℓ described by a probability distribution p_ℓ and sends the corresponding state |A_ℓ⟩ to Alice. She measures the received state with the detector (characterized by ∆), and obtains some outcome x. The information that she collects on the random variable held by Bob can be quantified by the mutual information (MI) of the probability distribution p(ℓ, x),
$$I_\Delta(A:\ell) = H(p_\ell) + H(p_x) - H(p_{\ell,x}), \qquad (3)$$
with the Shannon entropy H(p_ℓ) = −Σ_ℓ p_ℓ log p_ℓ [16]. Note that the MI can never exceed the Shannon entropy of the initial probability distribution H(p_ℓ). Hence, the maximal MI for a set of N orthogonal states with equal weights is given by b_max = log_2(N).
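The game between Bob and Alice can be made concrete with a few lines of code. The following is a minimal sketch, assuming the joint distribution p(ℓ, x) is available as a finite table (a discretized detector); the function names `shannon_entropy` and `mutual_information` and the two toy detectors are illustrative choices, not part of the original text.

```python
import numpy as np


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def mutual_information(joint):
    """MI in bits for a joint table joint[l, x] = p(l, x)."""
    joint = np.asarray(joint, dtype=float)
    p_l = joint.sum(axis=1)  # Bob's prior p_l
    p_x = joint.sum(axis=0)  # Alice's outcome distribution p(x)
    return shannon_entropy(p_l) + shannon_entropy(p_x) - shannon_entropy(joint)


# Two toy detectors for N = 4 equally likely states (assumed for illustration).
N = 4
perfect_detector = np.eye(N) / N            # outcome x reveals l exactly
blind_detector = np.full((N, N), 1 / N**2)  # outcome x is independent of l

print(mutual_information(perfect_detector))
print(mutual_information(blind_detector))
```

The perfect detector reaches the bound b_max = log_2(N) = 2 bits stated above, while the blind detector yields no information.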
The intuition inherited from the macroscopic distinctness of the cat's two states |dead⟩ and |alive⟩ tells us that a truly macroscopic superposition does not require technologically advanced detectors with high resolution in order to collapse the superposition to a single branch (or equivalently to learn the state of the atom in Eq. (2)) [7]. To quantify this intuition we define the effective size of |Ψ_S⟩ or |Ψ_mM⟩ as the maximal ∆ of the detector that still allows Alice to gain b bits of information about the preparation of Bob,
$$\mathrm{MIC}_b(S) = \max\{\Delta \,:\, I_\Delta(A:\ell) \geq b\}, \qquad (4)$$
with MIC standing for the Macroscopicness of Information Content of the superposition. The minimal information b is a parameter of the proposed measure, whose role we discuss later.

B. Model of a coarse-grained measurement
Up to this point we were quite unspecific about the measurement device. Indeed, the definition above only assumes that there is a meaningful way to attribute a resolving parameter ∆ to the measurement device (and to continuously vary this parameter). In general, the detector does not have to be uniquely characterized by ∆, but can have additional knobs. In such a case, an additional optimization is necessary, as one is interested in the largest possible MI. However, we do not consider this more complicated situation in the following. As a first example, note that low resolution can come from inefficiencies modeled by a loss channel preceding an ideal measurement, in which case ∆ is associated to the probability to not (or only partially) measure the system (see also [5,17]).
In the following, however, we will consider the von Neumann pointer model with weak coupling between system and pointer. Suppose one would like to measure the system with the observable
$$A = \sum_\ell a_\ell\, |A_\ell\rangle\langle A_\ell|, \qquad (5)$$
which, for simplicity, is supposed to have a non-degenerate discrete spectrum (if this is not the case, replace the sum with an integral). For the formal definition of our measure the choice of A is irrelevant. However, it does determine which states are considered to be macroscopically distinguishable. Typically, we choose operators A with a classical limit, such as collective spin operators for atomic ensembles, or the number of photons and quadrature operators for photonic states. The measurement is done via a pointer P (i.e., an auxiliary system), which first interacts with the system and is subsequently read out in a preferred basis. Consider a pointer system modeled by a particle on a one-dimensional line with the usual commutation relation for position and momentum, [x, p] = i (with ħ = 1). We assume the pointer's initial state to be
$$|\xi_\Delta\rangle = \int \xi_\Delta(x)\, |x\rangle\, dx, \qquad (6)$$
with ∆ characterizing the width of the distribution |ξ_∆(x)|², and we choose a real-valued function ξ_∆(x). The system interacts with the pointer via the unitary U = e^{−i A⊗p}. Afterwards, the pointer is measured in the x-basis, leaving the system in the state
$$\rho_x = \frac{K_x \rho K_x^\dagger}{p(x)}, \qquad p(x) = \mathrm{tr}\, K_x \rho K_x^\dagger, \qquad (7)$$
with K_x = ⟨x|U|ξ_∆⟩ = ξ_∆(x − A). On an abstract level, this protocol realizes a general measurement with POVM elements K_x². Trivially, if the width ∆ of the initial pointer state tends to zero, one recovers the usual "strong" projective measurement, ξ_∆²(x − A) → δ(x − A). In contrast, the coupling becomes effectively weaker as ∆ increases. The system is less disturbed by the measurement and, consequently, the measurement progressively loses resolution and becomes less informative. This is sometimes called a weak measurement.
In case one does not postselect on (or does not have access to) the measurement result x, the post-measurement state of the system, after tracing out the pointer, reads
$$\rho_{\mathrm{out}} = \mathrm{tr}_P\!\left[ U \left( \rho \otimes |\xi_\Delta\rangle\langle\xi_\Delta| \right) U^\dagger \right] = \int \mu(p)\, e^{-ipA} \rho\, e^{ipA}\, dp, \qquad (8)$$
where µ(p) = |⟨p|ξ_∆⟩|². In words, if the measurement outcome is ignored, the effect of the weak measurement on the state is a dephasing channel generated by the observable A. Note that ⟨p|ξ_∆⟩ and ξ_∆(x) are connected via a Fourier transform, such that, in general, the weaker the measurement, the lower the strength of the induced dephasing.
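The dephasing channel of Eq. (8) is easy to evaluate numerically. The sketch below assumes a Gaussian pointer whose distribution |ξ_∆(x)|² has standard deviation ∆, and a state ρ written in the eigenbasis of A; under these assumptions each coherence ρ_ij is multiplied by the overlap of the two shifted pointer wave functions, exp(−(a_i − a_j)²/(8∆²)). The function name `dephase_gaussian_pointer` is hypothetical.

```python
import numpy as np


def dephase_gaussian_pointer(rho, eigenvalues, delta):
    """State of the system after a weak measurement whose outcome is discarded.

    Assumptions (not fixed by the text): rho is given in the eigenbasis of A,
    and |xi_Delta(x)|^2 is a Gaussian with standard deviation delta.
    """
    a = np.asarray(eigenvalues, dtype=float)
    diff = a[:, None] - a[None, :]
    # Overlap integral of xi(x - a_i) * xi(x - a_j) over x for a Gaussian pointer.
    damping = np.exp(-diff**2 / (8 * delta**2))
    return np.asarray(rho, dtype=complex) * damping


# Equal superposition of the eigenstates with eigenvalues 0 and N = 10.
cat = np.full((2, 2), 0.5)
strong = dephase_gaussian_pointer(cat, [0.0, 10.0], delta=0.1)    # sharp pointer
weak = dephase_gaussian_pointer(cat, [0.0, 10.0], delta=100.0)    # broad pointer

print(abs(strong[0, 1]), abs(weak[0, 1]))
```

A sharp pointer removes the coherence almost completely, whereas a broad pointer leaves it nearly intact, in line with the statement that weaker measurements induce weaker dephasing.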
To be more specific, we consider two examples for the pointer function ξ_∆(x) in the following. In Sec. II C, we assume the distribution of the pointer to be square with a width ∆, such that an outcome x corresponds to a POVM element
$$E_\Delta(x) = \frac{1}{\Delta} \sum_{\ell:\, |a_\ell - x| \leq \Delta/2} |A_\ell\rangle\langle A_\ell|. \qquad (9)$$
Another important example is when ξ_∆²(x) is a Gaussian function with spread ∆, that is,
$$\xi_\Delta^2(x) = g_\Delta(x) = \frac{1}{\sqrt{2\pi}\,\Delta}\, e^{-x^2/(2\Delta^2)}. \qquad (10)$$

C. Example: Equally spaced peaks
We now illustrate our formalism with a simple example of the equally weighted superposition of k + 1 equally spaced eigenstates, A |ℓN/k⟩ = (ℓN/k) |ℓN/k⟩ with ℓ = 0, ..., k, all contained in the interval [0, N], and a square pointer E_∆(x) of Eq. (9). As the distribution is uniform one has H(p_ℓ) = log_2(k + 1). First, note that the probability to observe an outcome x only depends on k and the ratio r = ∆/(2N). So we directly move to the scale-invariant problem with k + 1 eigenstates S = {(1 + k)^{−1/2} |ℓ/k⟩}_{ℓ=0}^{k} contained in the interval [0, 1], and the square pointer E_{2r}(y) of width 2r = ∆/N. The calculation of the MI is mainly a combinatorial problem. Lengthy but straightforward arithmetic (see Appendix A) gives, for r ≥ 1/2, and, for r < 1/2 with c = 2rk, This implies with the hyperfactorial H!(k) = Π_{n=1}^{k} n^n. In Fig. 1 we plot I_∆ for several numbers of peaks, as well as the limiting case k → ∞. For ∆ ≥ N the maximal MI is obtained for two peaks and is given by I_∆ = N/∆. Accordingly, for b ≤ 1 this is also the state that maximizes MIC_b. For ∆ < N things are more complicated, but numerical evidence shows that for b = log_2(k + 1) MIC_b is maximized by the state with k + 1 peaks. Combining the two we find, for any b, the maximal size attained by a state with equally spaced peaks in the interval [0, N] To conclude this example, let us remark that with the results above this family of states can be used for calibration of the measure MIC_b for any state.
Concretely, for any superposition state, in addition to attributing a value MIC_b for each b, one says that the state under consideration is as macroscopic as k + 1 equally spaced peaks in the interval [0, N], for some k and N easily obtained from Eq. (17) and Eq. (16).
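The two-peak result I_∆ = N/∆ for ∆ ≥ N can be checked with a short numerical sketch. It assumes the square pointer makes p(x|ℓ) uniform on [a_ℓ − ∆/2, a_ℓ + ∆/2], so the outcome axis splits into segments on which the set of compatible peaks is constant. The helper names `mi_square_pointer` and `mic_on_grid` are hypothetical, and the grid scan is only a crude stand-in for the maximization over ∆ in the definition of MIC_b, not the closed-form results of Appendix A.

```python
import numpy as np


def mi_square_pointer(positions, weights, delta):
    """MI in bits when p(x|l) is uniform on [a_l - delta/2, a_l + delta/2]."""
    a = np.asarray(positions, dtype=float)
    p = np.asarray(weights, dtype=float)
    h_prior = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    # Between consecutive edges the set of peaks compatible with x is fixed.
    edges = np.unique(np.concatenate([a - delta / 2, a + delta / 2]))
    h_posterior = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = 0.5 * (lo + hi)
        joint = p[np.abs(a - mid) <= delta / 2] * (hi - lo) / delta
        joint = joint[joint > 0]
        if joint.size == 0:
            continue  # no peak can produce an outcome in this segment
        p_seg = joint.sum()
        post = joint / p_seg  # posterior p(l|x), constant on the segment
        h_posterior += p_seg * -np.sum(post * np.log2(post))
    return float(h_prior - h_posterior)


def mic_on_grid(positions, weights, b, deltas):
    """Largest delta on the grid that still yields at least b bits."""
    ok = [d for d in deltas
          if mi_square_pointer(positions, weights, d) >= b - 1e-9]
    return max(ok) if ok else None


two_peaks = ([0.0, 1.0], [0.5, 0.5])  # k = 1, N = 1
print(mi_square_pointer(*two_peaks, delta=2.0))
print(mic_on_grid(*two_peaks, b=0.5, deltas=np.arange(0.5, 4.0, 0.25)))
```

For two peaks with N = 1, a pointer of width ∆ = 2 yields half a bit, and the grid scan returns ∆ = 2 as the largest width that still delivers b = 0.5 bits, consistent with I_∆ = N/∆.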

D. Role of b and calibration of the measure
The proposed measure is parametrized by b, that is, the amount of extractable information in the protocol of Sec. II A measured in bits. This might seem to be a flaw of our approach, adding some arbitrariness to the definition. But this is not so: b can be understood as the "rank" of the macroscopic superposition, as it counts the effective number of different components that are superposed. This is an important characterization of the state that is independent of, and irreducible to, its "size". For example, the state in the famous thought experiment of Schrödinger's cat, |↑⟩|alive⟩ + |↓⟩|dead⟩, is undeniably a very large macroscopic superposition; still, it is a superposition of only two components and can never yield more than one bit of information. Similarly, one can easily think of a microscopic state that is a superposition of many components yielding a large amount of information b ≫ 1, which nevertheless has a small size MIC_{b=1} even for one bit.
It is then appealing to introduce an archetypal reference state for each value of b, which can be used for the calibration of the size measure. In view of the results above, this can be naturally done using the family of (k + 1)-peak states. Concretely, for any value of b_k = log_2(k + 1) we can identify the state with k + 1 equally spaced peaks in the interval [0, N]. Then, for a general state |Ψ_S⟩ and for each value b_k, in addition to attributing a value MIC_{b_k}, one can conclude that the state |Ψ_S⟩ is as macroscopic as the state with k + 1 peaks distributed on the interval of width N_{b_k}(|Ψ_S⟩), obtained using the result of Eq. (17). N_{b_k}(|Ψ_S⟩) can be interpreted as a calibration of the size measure.

E. Connection to the variance
The variance of a state, V(|Ψ⟩, A) = ⟨Ψ|A²|Ψ⟩ − ⟨Ψ|A|Ψ⟩², is a natural measure of how large the spread of a state in the eigenbasis of A is. So it is natural to study the relation of our measure to the variance. For this, we consider Gaussian pointers, Eq. (10), for which the MI can be expressed as since p(x|ℓ) = g_∆(x − a_ℓ). In Appendix D, we prove that MI is always upper bounded by the variance One might wonder if there also exists a lower bound involving the variance. However, with the following example it is easy to see that no such bound can exist. For an appropriate choice of parameters p and N the superposition state √p |0⟩ + √(1 − p) |N⟩ can have an arbitrarily low MI and an arbitrarily large variance. Consequently, the two are inequivalent and the requirement for a large MI is strictly more restrictive than for a large variance. Nevertheless, the inequality (21) becomes tight when ∆ is sufficiently large This shows that, for a weak Gaussian measurement and for pure states, our measure is connected to earlier proposals [10][11][12][13] where the variance V(|Ψ⟩, A) plays a role to measure the macroscopic distinctness. Equation (22) is further useful to evaluate our measure for small b.
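The relation between MI and variance can be explored numerically. The sketch below assumes p(x|ℓ) = g_∆(x − a_ℓ) with g_∆ a normalized Gaussian of standard deviation ∆, and evaluates the MI as the differential entropy of the outcome distribution minus that of the Gaussian noise. As a reference it uses the standard Gaussian-channel bound I ≤ ½ log_2(1 + V/∆²), which is a textbook inequality and is not claimed to be the exact form of Eq. (21). The function names and parameter values are illustrative.

```python
import numpy as np


def mi_gaussian_pointer(positions, weights, delta, grid_points=20001):
    """MI in bits between l and x for p(x|l) = Gaussian(mean a_l, std delta)."""
    a = np.asarray(positions, dtype=float)
    p = np.asarray(weights, dtype=float)
    x = np.linspace(a.min() - 10 * delta, a.max() + 10 * delta, grid_points)
    dx = x[1] - x[0]
    cond = np.exp(-((x[None, :] - a[:, None]) ** 2) / (2 * delta**2))
    cond /= np.sqrt(2 * np.pi) * delta
    p_x = p @ cond  # outcome density p(x)
    nz = p_x > 0
    h_out = -np.sum(p_x[nz] * np.log2(p_x[nz])) * dx       # entropy of p(x)
    h_noise = 0.5 * np.log2(2 * np.pi * np.e * delta**2)   # entropy of g_Delta
    return float(h_out - h_noise)


def variance(positions, weights):
    a = np.asarray(positions, dtype=float)
    p = np.asarray(weights, dtype=float)
    return float(p @ a**2 - (p @ a) ** 2)


# Balanced two-peak state probed by a broad pointer: MI is close to the bound.
balanced = ([0.0, 1.0], [0.5, 0.5])
mi_bal = mi_gaussian_pointer(*balanced, delta=5.0)
bound_bal = 0.5 * np.log2(1 + variance(*balanced) / 5.0**2)

# Strongly unbalanced state: large variance, yet MI cannot exceed H(p).
unbalanced = ([0.0, 1000.0], [1e-3, 1 - 1e-3])
mi_unb = mi_gaussian_pointer(*unbalanced, delta=1.0)

print(mi_bal, bound_bal)
print(mi_unb, variance(*unbalanced))
```

The second example illustrates the counterexample in the text: the variance is close to 10³ while the MI stays near the Shannon entropy of the weights, about 0.011 bits.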

III. MIXED STATES AND CONVEX ROOF
In practice, quantum states ρ are mixed. On the conceptual level, one can treat the macroscopicness and the quantumness of ρ as two independent aspects. The mixedness of a state ρ can then be attributed to the decay of its quantumness, while its macroscopicness, stemming from S, is left unchanged. Nevertheless, this is not satisfactory in our case. First, we would like the MIC measure to be a single quantity that encompasses both the macroscopicness and the quantumness of the state. Second, a mixed state ρ = Σ_i q_i |Ψ_i⟩⟨Ψ_i| admits infinitely many ensemble decompositions, which can yield different average MIC, since different elements |Ψ_i⟩ correspond to different S_i and do not necessarily have the same size.
To get a MIC defined on all states ρ and non-increasing on average under mixing, one uses the convex-roof extension
$$\mathrm{MIC}_b(\rho) = \min_{\{q_i, |\Psi_i\rangle\}} \sum_i q_i\, \mathrm{MIC}_b(|\Psi_i\rangle). \qquad (23)$$
In words, one finds the ensemble partition Σ_i q_i |Ψ_i⟩⟨Ψ_i| = ρ that has the least average size, and defines this value as the size of ρ. This is by construction non-increasing under mixing, given any measure defined on pure states. Note that, despite the uncountable number of pure-state decompositions of ρ, the number of pure states in an extremal ensemble is limited to d², where d is the rank of ρ. They also form a closed manifold, as there is a one-to-one mapping between decompositions of ρ and partitions of identity, or POVMs (see Appendix B).
As an example, we consider quantum states lying in the span of two eigenstates of the observable, |0⟩ and |N⟩, as in Sec. II C. The most general state of this form reads expressed in the basis {|0⟩, |N⟩} in the superposition scenario, or {|0, 0⟩_mM, |N, N⟩_mM} in the micro-macro entanglement scenario. To shorten the notation we define r = (x_ρ, y_ρ, z_ρ). As the size is invariant under rotation of the state around the z-axis, we assumed y_ρ = 0 in Eq. (24). For pure states (i.e., r² = 1) and a square pointer as defined in Eq. (9), the MI of Eq. (3) can be easily computed with a convex function Ĩ_0. Consequently, for pure states the size is given by where the spread of the state N appears as a factor (as only the relative size of the spread to the pointer width is relevant). Given this expression one can compute the average size for any ensemble partition of ρ. It remains then to find the optimal ensemble. Though this step can be done analytically, it turns out to be quite tedious (see Appendix C). The final result is a function that depends linearly on N, but has a complicated dependence on the state ρ.
In Figure 2 we plot the rescaled size mic_b(x_ρ, z_ρ) for two values of b, b = 0.082 and b = 1/3. These values are chosen to allow a comparison with the example of [7], where equally weighted pure superpositions (|0⟩ + |N⟩)/√2 are characterized by the probability P_c to correctly distinguish between the two branches (|0⟩ ≡ |A⟩ and |N⟩ ≡ |D⟩ in [7], with the choice P_c = 2/3 in the example). The value of P_c does not uniquely determine the MI obtained by the measurement. Indeed, as entropy is concave, given the two distributions p(x|0) and p(x|N) with the fixed guessing probability, the MI can range from 0.082 bits for one extremal case, where for each outcome x either p(0|x) = 2/3 and p(N|x) = 1/3 or vice versa, to 1/3 bits for the other extreme, where with probability 1/3 one obtains an outcome for which either p(0|x) = 0 or p(N|x) = 0, and p(0|x) = p(N|x) = 1/2 for all the others.
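The two quoted values of b follow from elementary entropy arithmetic for a uniform prior over the two branches (one bit of prior entropy). The sketch below reproduces them; the variable names are illustrative, and the weight of conclusive outcomes in the second case is obtained from the condition P_c = w + (1 − w)/2.

```python
import numpy as np


def binary_entropy(q):
    """Entropy in bits of the distribution (q, 1 - q)."""
    if q in (0.0, 1.0):
        return 0.0
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))


p_c = 2 / 3  # guessing probability used in the example of [7]

# Extreme 1: every outcome leaves the posterior (2/3, 1/3) or (1/3, 2/3).
mi_low = 1.0 - binary_entropy(p_c)

# Extreme 2: with probability w the outcome is conclusive (one full bit),
# otherwise the posterior is (1/2, 1/2); p_c = w + (1 - w) / 2 fixes w.
w = 2 * p_c - 1
mi_high = w * 1.0

print(mi_low, mi_high)
```

Both measurement models share the same P_c = 2/3 yet give about 0.082 bits and 1/3 bits respectively, which is why P_c alone does not fix the MI.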

IV. ALTERNATIVES AND COMPARISON
The convex-roof extension for mixed states, Eq. (4), is conceptually straightforward. However, the convex roof of an inverted function is generally difficult to handle in proofs or in calculations of specific examples. In this section, we present alternative formulations and compare them to recent contributions in the literature. We present two variants. For the first one, we start with the original idea using the MI, but do the convex-roof extension of the MI instead of MIC. The second alternative uses a slightly different motivation to directly measure the nonclassical part of the macroscopically extractable information. We find a formulation that turns out to be equivalent to the so-called basis-dependent quantum discord.
A. Direct convex-roof extension of the MI
Instead of the convex-roof extension of MIC we consider the direct convex roof of the MI and define a slightly different version of the MIC, namely For the example of Eq. (24) this definition gives the size of a pure state MIC_b(ρ) = MIC_b(x_ρ) with the same x_ρ in Eq. (27), as it follows from the convexity of Ĩ_0(x). So it has the advantage of being more straightforward to compute. In addition, this alternative definition allows us to find the following connection.
In [18], a set of criteria was proposed for quantities that aim to capture the macroscopic coherence of a state. These are in the same spirit as the criteria for good entanglement measures. The most important ones say that a valid measure should (C1) vanish if and only if a state is an "incoherent" mixture of the form Σ_ℓ p_ℓ |A_ℓ⟩⟨A_ℓ|; (C2) not increase under any "covariant" operation. An operation is covariant when it commutes with transformations of the form e^{−itA}. This set captures all the possible operations which cannot create a superposition of the |A_ℓ⟩ and which respect the "scale" |a_i − a_j| of a superposition |A_i⟩ + |A_j⟩. (C2) can be broken down into two versions, (C2a) for deterministic processes, and (C2b) for stochastic processes under which the measure cannot increase on average. In addition, one can demand that a measure be (C3) convex (i.e., non-increasing under mixing) and (C4) increasing with respect to the scale |a_i − a_j|.
We show in Appendix E that this extended I_∆(A : ℓ)_ρ, assuming a Gaussian pointer, satisfies all criteria (C1-4). In addition, MIC_b satisfies (C2a), (C4) and a modified version of (C1), namely In other words, MIC is well-behaved in the sense that it vanishes for states that are close to incoherent mixtures, cannot increase under covariant operations, and is increasing with the scale of a superposition.
It is also worth noting that the relations between the MI and the variance for pure states of Eq. (21) are directly generalized to mixed states via the convex roof of MI where the quantum Fisher information F(ρ, A) [19] of the state ρ with respect to the operator A is known to equal four times the convex roof of the variance [20]. It follows that MIC_b satisfies This allows one to obtain upper bounds on the size. Moreover, since the quantum Fisher information is known to satisfy all criteria (C1-4), we conclude that MIC_b fulfills them in the limit of b → 0. Note that the quantum Fisher information plays a central role in one of the general proposals for macroscopic distinctness [13], so this measure is in some sense contained in the presented family as a limiting case.
In particular, the insights we have about the quantum Fisher information can be used to apply our measure for small b to real experimental data [21].

B. Alternative measure using quantum correlations
In this section, we build up an alternative measure which is conceptually similar but has a slightly different motivation. As argued earlier, the distinguishability of a set of states under a noisy measurement can be captured by the mutual information between Bob, who prepares the ensemble of states, and Alice, who reads out the measurement device. Put differently, when a measurement device interacts with a system in the superposition state |Ψ_S⟩ = Σ_ℓ √p_ℓ |A_ℓ⟩, the correlations I(A : ℓ) between the macroscopic system M and measurement device P are related to how well the device discriminates the branches of the superposition. However, I(A : ℓ) can be non-zero even when the system is initially in an incoherent mixture Σ_ℓ p_ℓ |A_ℓ⟩⟨A_ℓ|. This issue can be avoided by using the convex-roof constructions for mixed states. This is conceptually appealing, but comes at the price of making things hard to compute, as illustrated with the example of Section II C. But is there a more direct way to avoid this issue that retains the nice physical intuition behind the MI?
Here we introduce the quantity C_∆(ρ, A) that is related to I_∆(A : ℓ) in spirit, but can be directly applied to mixed states. We start by introducing it in two different ways, and then show that they are equivalent. Let the system start in an arbitrary state ρ, and the measurement device in the initial pointer state |ξ_∆⟩. We call ρ_PM the overall state after the interaction U = e^{−iA⊗p}, ρ_M the post-measurement state of the system, given in Eq. (8), and ρ_P the final state of the pointer. Using the von Neumann entropy S(ρ) = −Tr ρ log ρ, define
$$C_\Delta(\rho, A) = S(\rho_M) - S(\rho)$$
as the entropy difference between the post-measurement state ρ_M, given in Eq. (8), and the initial state ρ. Intuitively, the entropy increase in the system can only come from its correlations to the pointer created by the interaction. Hence, C_∆(ρ, A) captures how much information is potentially available to the pointer about the system [22]. Note that this quantity avoids all problems associated with mixed states. In particular, a system state with no coherence is not affected by the interaction with the pointer, implying C_∆(G(ρ), A) = 0. An alternative definition can be given via the quantum mutual information (QMI) I(P : M)_ρ := S(ρ_P) + S(ρ_M) − S(ρ_PM) between the system and the pointer after the interaction. As we show in the next paragraph,
$$C_\Delta(\rho, A) = I(P:M)_\rho - I(P:M)_{G(\rho)} \qquad (40)$$
is given by the QMI for the initial state ρ minus the QMI for its incoherent version G(ρ). Here, the issue of mixed states is resolved even more explicitly, as the incoherent contribution to the QMI is simply subtracted. In fact, the definition (40) also makes it clear that C_∆(ρ, A) corresponds to the (fixed-basis) quantum discord of the final state [23]. As the quantity with ρ_{P|ℓ} = tr_M[ρ_PM |A_ℓ⟩⟨A_ℓ|] / tr[ρ_PM |A_ℓ⟩⟨A_ℓ|] and tr[ρ_PM |A_ℓ⟩⟨A_ℓ|] = p_ℓ, is equal to the conditional entropy of the pointer on the system measured in the eigenstates of A (note that the measurement commutes with the interaction).
So, since the initial state of the pointer is pure, I(P : M)_{G(ρ)} = J(P|M)_ρ gives the QMI between the system and the pointer, available upon the measurement of A. Note that the usual approach is to maximize the classical correlations over all possible measurements on M; but since we have a fixed observable A of interest here, it is natural to fix the measurement.
Let us now show that the two definitions are equivalent. Since the interaction is unitary, we have S(ρ_PM) = S(ρ ⊗ |ξ_∆⟩⟨ξ_∆|) = S(ρ), thus I(P : M)_ρ = S(ρ_M) + S(ρ_P) − S(ρ). For the classical correlations, we write σ = G(ρ) and σ_PM = Σ_ℓ p_ℓ |A_ℓ⟩⟨A_ℓ| ⊗ e^{−i a_ℓ p} |ξ_∆⟩⟨ξ_∆| e^{i a_ℓ p}. It follows that The rest of this section is devoted to examining the properties of C_∆(ρ, A). We assume a Gaussian pointer from now on. As before, we can define another version of MIC: Just as before, we find that C_∆ satisfies all the coherence measure criteria (C1-4); see Appendix G for the proof. Again, MIC_b satisfies (C2a), (C4) and a modified version of (C1), Let us look at the behaviour of C_∆ in the limit of a weak Gaussian measurement, where ∆ is large. We find that, for pure states, to leading order, where h(t) = −t log t. This contrasts a little with Eq. (22) for I_∆(A : ℓ), where the leading order was ∆^{−2}. See Appendix H for the proof.
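The entropy-difference form of C_∆(ρ, A), that is, the entropy of the post-measurement state minus the entropy of the initial state, can be evaluated directly for small systems. The sketch below assumes a Gaussian pointer whose distribution |ξ_∆(x)|² has standard deviation ∆ and a state written in the eigenbasis of A, so that the dephasing multiplies ρ_ij by exp(−(a_i − a_j)²/(8∆²)). The function names are hypothetical and entropies are in bits.

```python
import numpy as np


def von_neumann_entropy(rho):
    """Von Neumann entropy in bits of a density matrix."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]  # discard numerically vanishing eigenvalues
    return float(-np.sum(ev * np.log2(ev)))


def c_delta(rho, eigenvalues, delta):
    """Entropy gain S(rho_M) - S(rho) under Gaussian-pointer dephasing.

    Assumption: |xi_Delta(x)|^2 is a Gaussian with standard deviation delta
    and rho is expressed in the eigenbasis of A.
    """
    a = np.asarray(eigenvalues, dtype=float)
    diff = a[:, None] - a[None, :]
    rho = np.asarray(rho, dtype=complex)
    rho_m = rho * np.exp(-diff**2 / (8 * delta**2))
    return von_neumann_entropy(rho_m) - von_neumann_entropy(rho)


spectrum = [0.0, 10.0]
cat = np.full((2, 2), 0.5)     # equal superposition of the two eigenstates
mixture = np.diag([0.3, 0.7])  # incoherent mixture, diagonal in the A basis

print(c_delta(cat, spectrum, delta=0.01))     # sharp pointer
print(c_delta(cat, spectrum, delta=1000.0))   # very broad pointer
print(c_delta(mixture, spectrum, delta=0.01))
```

The superposition gains one full bit of entropy under a sharp pointer and almost none under a very broad one, while the incoherent mixture is unaffected, matching the property C_∆(G(ρ), A) = 0.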
Recently, another measure for macroscopic distinctness has been proposed [24]. Although formulated differently, it is closely related to the quantity C_∆(ρ, A). In particular, both measures fulfill the proposed set of criteria for macroscopic coherence [18]. As noted in [24], small ∆ leads to the counter-intuitive result that the measure assigns a larger value to some product states than to a superposition of two extremal eigenstates of A. This reveals the role of ∆ in these measures as a characteristic scale. Using the measure tells us how much quantum coherence the state of interest provides on this scale. In our framework it is MIC_b(ρ), rather than C_∆(ρ, A), that quantifies the macroscopic quantumness of the state. But along the same line, MIC_b(ρ) can be understood as the maximal scale at which the state provides the desired amount of coherence.

V. IMPLICATIONS ON FRAGILITY
In this section we consider the micro-macro entangled state |Ψ_mM⟩ = Σ_ℓ √p_ℓ |ℓ⟩_m |A_ℓ⟩_M. Under the assumption ⟨A_j|A_k⟩ = δ_jk, Eq. (2) gives the Schmidt decomposition of |Ψ_mM⟩. For the particular case (|0⟩_m |A⟩_M + |1⟩_m |D⟩_M)/√2 we know that the micro-macro entanglement is more fragile for a larger size of the macroscopic part of the state, using the framework of [7,17]. Here we will show that a similar relation between the size of the state as defined in Eq. (4) and the fragility of entanglement under a certain type of noise persists in the general case. The intuition behind this is rather simple: if the noise channel can be interpreted as an imprecise measurement of the system by the environment, then the size of the state relates to the amount of information extractable by the environment. The decay of entanglement through the channel is related to the information obtained by the environment. Since at least the mathematical modeling of the environment and of a measurement pointer is similar, we denote the environment as P as well.

A. Entanglement of formation
Concretely, we consider the entanglement of formation E_F(ρ_AB). E_F is an entanglement measure [25] on bipartite states, defined as the convex roof of the entropy of entanglement,
$$E_F(\rho_{AB}) = \min_{\{q_k, |\Psi_k\rangle\}} \sum_k q_k\, S(\rho^B_k),$$
where the minimum is taken over all pure-state decompositions ρ_AB = Σ_k q_k |Ψ_k⟩⟨Ψ_k|, and the entropy of entanglement is by definition an entanglement measure on bipartite pure states given by the von Neumann entropy S(ρ^B_k) = S(ρ^A_k) of the partial states ρ^B_k = tr_A |Ψ_k⟩⟨Ψ_k|. Because we assume all the branches of |Ψ_mM⟩ to be orthogonal, its entanglement of formation reads E_F(|Ψ_mM⟩) = H(p_ℓ). Note that the entanglement in the state is invariant under local unitary transformations, so its amount is independent of the spread of the state in the spectrum of A and of the macroscopicness of the state.
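For a pure state with orthogonal branches, the entropy of entanglement reduces to the Shannon entropy of the weights p_ℓ and does not change under a local unitary on the macroscopic side. The following sketch checks both statements for an illustrative three-branch example; the amplitude-matrix representation c[ℓ, j] of the state, the chosen weights and the function names are assumptions made for the illustration.

```python
import numpy as np


def entropy_bits(probabilities):
    """Shannon entropy in bits, ignoring numerically vanishing entries."""
    q = np.asarray(probabilities, dtype=float)
    q = q[q > 1e-12]
    return float(-np.sum(q * np.log2(q)))


def entropy_of_entanglement(amplitudes):
    """Entropy of the atom's reduced state for a pure state with
    amplitudes[l, j] multiplying |l>_m |j>_M."""
    rho_atom = amplitudes @ amplitudes.conj().T
    return entropy_bits(np.linalg.eigvalsh(rho_atom))


p = np.array([0.5, 0.25, 0.25])
# Orthonormal branches |A_l> = |l>_M give a diagonal amplitude matrix.
amplitudes = np.diag(np.sqrt(p)).astype(complex)

# Right-multiplication by a unitary acts only on the macroscopic index.
rng = np.random.default_rng(0)
random_matrix = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
unitary, _ = np.linalg.qr(random_matrix)
rotated = amplitudes @ unitary

print(entropy_bits(p))
print(entropy_of_entanglement(amplitudes))
print(entropy_of_entanglement(rotated))
```

All three printed values coincide at 1.5 bits, which illustrates why the amount of entanglement alone carries no information about the spread of the state in the spectrum of A.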

B. Noise as measurement by environment
Any Kraus representation K = {K_x} of a channel E,
$$\mathcal{E}(\rho) = \sum_x K_x \rho K_x^\dagger,$$
can be interpreted as a measurement of the system by the environment, described by the POVM elements induced by the Kraus operators, {E_x = K_x† K_x} (in the expression above the sum is replaced by an integral if the Kraus representation is continuous). For simplicity, the channel is supposed to act only on the macroscopic part (see Fig. 3). Hence, the output of the channel is a mixture of states ρ_x = K_x ρ K_x† / p(x), with p(x) = tr E_x ρ, conditional on the environment observing outcome x. Note that the POVM elements do not uniquely specify the channel, as the same element E_x can correspond to physically different Kraus operators U K_x = √E_x. Now consider the action of the channel on the state ρ = |Ψ⟩⟨Ψ|_mM, for which all the conditional states are also pure. Similarly to Eq. (3), we define the MI between the microscopic system and the measurement (that arises from the Kraus representation K of the channel) carried out by the environment, I_{E,K}(P : ℓ). One has where p(ℓ|x) = tr |ℓ⟩⟨ℓ|_m ρ_x. Note that it does not matter whether the projection {|ℓ⟩⟨ℓ|} on the atom's side or the measurement {E_x} by the environment is performed first. The Shannon entropy of the distribution p(ℓ|x) upper-bounds the von Neumann entropy of the partial states This inequality allows one to obtain a bound on the average partial entropy of the post-channel state The left-hand side is the average entropy of entanglement of the state E(|Ψ_mM⟩) that corresponds to its pure-state partition provided by the Kraus representation K. Consequently, by definition of the entanglement of formation one has In words, the decay of entanglement of formation through a channel is at least as large as the MI obtained by the environment via the measurement induced by any Kraus representation of the channel.

C. Examples
(i) Dephasing generated by the observable $A$, of strength $\delta$ given by the width of the distribution $\mu(\lambda)$. As already mentioned in (8), this noise corresponds to a coarse-grained measurement of $A$ by the environment. The noise distribution $\mu(p) = |\langle p|\xi\rangle|^2$ is related to the resolution of the measurement $\xi_\Delta^2(x) = |\langle x|\xi\rangle|^2$ by a Fourier transform, implying $\delta \sim 1/\Delta$. This shows that the quantity $I_\Delta(A:\ell)$ yields a lower bound on the decrease of entanglement in the state after the action of the channel $\mathcal{E}^X_\Delta$. Similarly, $1/\mathrm{MIC}_b(|\Psi\rangle_{mM})$ gives an upper bound on the amount of noise $\delta \sim 1/\Delta$ that leaves $E_F(|\Psi\rangle_{mM}) - b$ bits of entanglement in the system.
In the case of a channel $\mathcal{E}$ describing weak Gaussian noise from the environment, Eq. (22) shows that
It also turns out that $C_\Delta$ lets us say something about the degradation of quantum correlations between $m$ and $M$. Note that $\mathcal{E}(|\Psi\rangle_{mM})$ has the structure of a "maximally correlated state", which is generally written as $\sum_{i,j} \rho_{ij}\, |i\rangle\langle j| \otimes |i\rangle\langle j|$. It is known that the entanglement of a maximally correlated state is often the same as the coherence of the corresponding single-system state $\sum_{i,j} \rho_{ij}\, |i\rangle\langle j|$ [27][28][29]. For example, this is true for the distillable entanglement $E_D$ [25] and the relative entropy of coherence $C_R$ [30], which can be written as
$$C_R(\rho) = S\big(\rho \,\|\, G(\rho)\big),$$
that is, the relative entropy between a state and its fully dephased version $G(\rho)$. It also has the simple expression $C_R(\rho) = S(G(\rho)) - S(\rho)$.
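The two expressions for the relative entropy of coherence can be compared numerically. The following is a minimal sketch, assuming that $G$ removes all off-diagonal elements in a fixed basis and using a randomly generated full-rank state; the function names are hypothetical.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def matrix_log2(rho):
    """Base-2 logarithm of a full-rank Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log2(w)) @ v.conj().T

def dephase(rho):
    """Full dephasing G: keep only the diagonal in the fixed basis."""
    return np.diag(np.diag(rho))

def relative_entropy(rho, sigma):
    """S(rho || sigma) = tr[rho (log rho - log sigma)] in bits."""
    diff = matrix_log2(rho) - matrix_log2(sigma)
    return float(np.real(np.trace(rho @ diff)))

# Random full-rank qutrit state, used only as test input.
rng = np.random.default_rng(2)
g = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = g @ g.conj().T
rho = rho / np.trace(rho).real

c_r_relative = relative_entropy(rho, dephase(rho))
c_r_entropy = von_neumann_entropy(dephase(rho)) - von_neumann_entropy(rho)
print(c_r_relative, c_r_entropy)
```

The agreement follows because $\log G(\rho)$ is diagonal, so only the diagonal of $\rho$ contributes to $\mathrm{tr}[\rho \log G(\rho)]$.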
The channel E leaves the fully dephased part of a state unchanged. Therefore we simply have where the final line uses the approximation (46) for a weak measurement.
(ii) A loss channel $L_\eta$ with "efficiency" $\eta$ (corresponding to the efficiency of the measurement device) models a process where each particle (subsystem) is lost to the environment with probability $1-\eta$; in other words, the initial states of the particle and the environment are eventually swapped. This symmetry implies that the transmitted state of the system $L_\eta(\rho)$ and the partial state of the environment $\rho^E_{\eta'}$ are the same if $\eta' = 1 - \eta$ (the roles of the "transmitted" and "reflected" systems are exchanged). For a family of states $S$ one defines $I_\eta(A:\ell)$ as the maximal MI that is obtainable with a measurement device of efficiency $\eta$, and the corresponding $\mathrm{MIC}_b(S)$ as the minimal efficiency that allows one to obtain $b$ bits. Again, for the state $|\Psi\rangle_{mM}$ the quantity $I_{1-\eta}(A:\ell)$ gives a bound on the decay of entanglement through the loss channel $L_\eta$, while $1 - \mathrm{MIC}_b$ is the minimal transmission of the channel that leaves at least $E_F(|\Psi\rangle_{mM}) - b$ bits of entanglement in the system.

VI. CONCLUSION AND DISCUSSION
Starting with the intuition that the macroscopic distinctness between two states "dead and alive" can be understood as "the ease to distinguish" the two states, we lift this intuition to superpositions of multiple components by looking at "the ease to obtain information" about the state. We formalize this idea into a general measure that is also useful for mixed states. More precisely, we first quantify how much information one can extract from a pure state by measuring it with a classical detector of limited resolution. Second, the minimal resolution that is required to extract the desired amount of information $b$ is associated with a measure that quantifies the "macroscopic distinctness" of the state, which we call macroscopicness of information content (MIC). Throughout a large part of the paper we use the von Neumann model for a weak measurement of a fixed observable $A$ to model the classical detector.
To extend our measure to mixed states we use a convex roof construction, and illustrate it on a simple example. It is argued that the parameter $b$ in our family of measures attributes a kind of "macroscopicity rank" to the superposition, as it counts the effective number of components that are superposed. We also establish a relation between our measure and the variance of the state with respect to the operator $A$ (its quantum Fisher information for a mixed state), which plays a central role in previously defined measures. In particular, we show that for a Gaussian pointer they are equal in the limit of small $b$.
Later, we present an alternative formulation of our measure, which stems from the same intuition but allows one to deal directly with mixed states without the detour of a heavy convex roof construction. It turns out to be equal to the basis-dependent quantum discord, and is also closely related to the measure for macroscopic distinctness that has been proposed in [24]. In particular, it also fulfills the proposed set of criteria for macroscopic coherence [18]. So we can interpret it as the maximal scale at which the state provides the required amount of coherence, as quantified by $b$.
Finally, we study the relation between the fragility of the state and its macroscopicness quantified by our measures. Concretely, we analyze the decay of the entanglement of formation in a micro-macro state when noise is applied on the macro side. We show that regardless of the model of the classical detector used to quantify the size, there is always a noise channel for which the fragility of entanglement is directly related to the macroscopicity of the state. This result is then applied to two models of classical detector: a weak measurement of A central to the paper, and a generic inefficient detector modeled by a loss channel preceding an unknown measurement.
Our work provides a novel tool to analyze and compare recent and future experiments aiming at the observation of quantum effects at larger and larger scales.
Acknowledgments.-We thank Nicolas Sangouard for interesting discussions. This work was supported by the National Swiss Science Foundation (SNSF) projects P2GEP2 151964 and 00021 149109, and the Austrian Science Fund (FWF) P28000-N27, and the European Research Council (ERC MEC).
Appendix A: Details to equally-spaced-peaks example, Sec. II C

The following calculation gives some details for the example discussed in Sec. II C. The probability of a measurement outcome $y$ is given where $\#_{(a,b]} \in \mathbb{N}$ counts the number of peaks (elements of $S$) in the interval $(a, b]$. Its value can be expressed as the difference $\#_{(a,b]} = \#_{\le b} - \#_{\le a}$ of the number of peaks with $y \le b$ (respectively $y \le a$), which reads with $\lfloor x \rfloor$ denoting the integer part of $x$. The outcome $y$ can be equally well triggered by any peak from the interval $(y - r, y + r]$, hence the conditional entropy is given by The MI reads
$$I(A:\ell) = \log_2(k+1) - \int p(y)\, \log_2 \#_{(y-r,\,y+r]}\, dy. \qquad \text{(A4)}$$
Since both the probability of an outcome and the conditional entropy are uniquely determined by the number of peaks in the corresponding interval, one can rephrase the problem in terms of the random variable $n$ that encompasses all the outcomes compatible with $n$ peaks. One has P n = δ n

Here, we discuss a simplification in the convex-roof construction for measures defined for mixed states. Let us assume that $\rho$ is a full-rank state. If this is not true, one simply restricts the Hilbert space to the support of $\rho$. The number of ensemble decompositions of any non-pure density matrix $\rho$ is infinite. Moreover, even the number of pure states in such an ensemble is not bounded; one can even have ensembles defined by a non-discrete probability density on the manifold of pure states. This being said, the number of pure states in an extremal ensemble is actually limited to $d^2$, where $d$ is the rank of $\rho$. This can be seen from the following argument.
First, there is a one-to-one correspondence between decompositions of $\rho$ (not necessarily in pure states) and partitions of identity (or POVMs), known as $\rho$-distortion [31]. For each POVM $\{E_i\}$ with $\sum_i E_i = 1$, the operator $\rho_i = \sqrt{\rho}\, E_i \sqrt{\rho} / p_i$, with $p_i = \mathrm{tr}(\rho E_i)$, is a valid state, and $\sum_i p_i \rho_i = \rho$. Second, $\rho_i$ is a pure state iff $E_i$ is rank one. This yields a one-to-one correspondence between ensemble partitions (into pure states) of $\rho$ and POVMs composed of rank-one operators $\{E_i\}$. Finally, it is known that in dimension $d$ an extremal POVM, i.e. a measurement that does not correspond to a mixture of different POVMs (such a procedure physically corresponds to randomly choosing the measurement to perform and forgetting the choice), has at most $n = d^2$ elements [32]. Via the correspondence above the same holds for extremal ensemble decompositions of $\rho$, and by construction the minimal size of Eq. (23) is attained by an extremal ensemble.
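The correspondence between POVMs and ensemble decompositions can be illustrated numerically. The sketch below assumes the standard form of the $\rho$-distortion map, $\rho_i = \sqrt{\rho} E_i \sqrt{\rho}/p_i$ with $p_i = \mathrm{tr}(\rho E_i)$, and uses a hypothetical rank-one POVM with $n = d^2 = 4$ elements built from the rows of a random isometry; it checks that the resulting states are pure and that they average back to $\rho$.

```python
import numpy as np

def sqrtm_psd(rho):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.conj().T

rng = np.random.default_rng(3)
d, n = 2, 4   # qubit state, POVM with n = d**2 rank-one elements

# Random full-rank state rho (test input only).
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = g @ g.conj().T
rho = rho / np.trace(rho).real

# Rank-one POVM from an n x d isometry W (W^dagger W = identity):
# E_i = |phi_i><phi_i| with phi_i the conjugated i-th row of W.
iso, _ = np.linalg.qr(rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d)))
povm = [np.outer(iso[i].conj(), iso[i]) for i in range(n)]

# rho-distortion: each POVM element defines one ensemble member.
root = sqrtm_psd(rho)
probs = [float(np.real(np.trace(rho @ e))) for e in povm]
states = [root @ e @ root / p for e, p in zip(povm, probs)]

reconstructed = sum(p * s for p, s in zip(probs, states))
purities = [float(np.real(np.trace(s @ s))) for s in states]
print(probs, purities)
```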
Appendix C: Convex roof example of Sec. III

a. Restriction to the XZ plane

Consider an ensemble decomposition $\sum_i q_i |\Psi_i\rangle\langle\Psi_i| = \sum_i \frac{q_i}{2}(1 + \mathbf{r}_i \cdot \boldsymbol{\sigma}) = \rho$. We show that there exists another decomposition that has a smaller or equal size but only involves states that lie in the XZ plane. To do so notice that $\rho = \sum_i q_i \tilde\rho_i$, with $\tilde\rho_i = \frac{1}{2}(1 + \tilde{\mathbf{r}}_i \cdot \boldsymbol{\sigma})$, also holds for each $\tilde{\mathbf{r}}_i = (x_i, 0, z_i)$ restricted to the XZ plane. This is not a partition in pure states, but it naturally gives one, since each $\tilde\rho_i$ can be decomposed in pure states with the corresponding Bloch sphere vectors r i + y 2 i | ≥ |x i | and the size is monotonically increasing. Consequently the new decomposition yields a lower or equal size. It follows that from the beginning it is sufficient to only consider the ensembles where all elements lie in the XZ plane. As follows from [], extremal ensembles of this form involve three states at most.
b. Optimal ensemble

Recall that the size of pure states in Eq. (27) is zero for small $|x|$ (such that $\tilde I_0(x) < b$) and then increases monotonically with $|x|$. In addition, $I_0(x)$ is convex in the regions $(r \equiv I_0^{-1}(b), 1]$ and $[-1, -r)$. If $|x_\rho| \le r$, then the size of the state is zero. For example, this follows from the vertical decomposition: $\rho$ is a mixture of the two pure states that have the same $x = x_\rho$, which both have zero size. So in the following we assume $x_\rho > r$ without loss of generality. It follows that in the ensemble decomposition of $\rho$ there is at least one pure state that lies in the right white sector of the XZ circle, to the right of $\rho$, see Fig. 5. Actually, there are two possibilities: either (i) all the states lie in the right white sector, i.e. all these states satisfy $x \in (r, 1]$, or (ii) some lie outside. The case (ii) is more involved; however, it can be simplified by the following remark. Let us label all the pure states in the ensemble with $x \le r$ by $|\Phi_i\rangle$ and all the states with $x > r$ by $|\Psi_i\rangle$. We then have with both $\sigma$ and $\tau$ valid density matrices, that are represented by the triangle and the empty square in Fig. 5(b). The figure also directly suggests a decomposition that has a smaller average size than the one we started with. This is given by where $\tau'$ (represented by the empty circle in Fig. 5(b)) lies on the intersection of the $x = r$ line and the line passing through $\tau$, $\rho$ and $\sigma$, and has a zero average size (think of its vertical decomposition). In addition, one easily sees that $p' \le p$, implying that this decomposition indeed yields a lower average size.
The two previous observations imply that the minimal ensemble consists of at most four pure states: the two states $\Psi$ with $x = n_x$, where $n_x \ge x_\rho$ [33]. In addition, all such ensembles have the same average size since the total weight only depends on $n_x$, see Fig. 5(c). Furthermore the case (i), discussed above, corresponds to the extremal case of $\mathrm{MIC}_b$ with $n_x = x_\rho$, for which $q_{n_x} = 1$. Finally, note that from above $n_x$ is bounded by $n_x \le n_{\max}$, as follows from a simple geometrical argument. This allows one to write the size of $\rho$ from Eq. (23) in the form with $r = I_0^{-1}(b)$, which is a well-behaved function that can be easily computed.

For the inequality (21), we calculate the relative entropy between an arbitrary distribution $p(x)$ and the Gaussian $g_\Delta(x - \bar{x})$, where $\bar{x} := \int dx\, x\, p(x)$. The relative entropy between two distributions $p(x)$, $q(x)$ is defined as
$$D(p \,\|\, q) = \int dx\, p(x) \log \frac{p(x)}{q(x)},$$
where we have used the fact that p(x) is a convolution of p and $g_\Delta(x)$, under which the variance is additive. The inequality then follows from the non-negativity of the relative entropy.
To show (22), we first prove the following useful result relating to classical statistics: Let $A$, $X$ be random variables and $B := X + tA$. For sufficiently small $t$,
$$H(B) \approx H(X) + \frac{t^2}{2}\, F_c(X)\, V(A),$$
where $F_c(X)$ is the classical Fisher information of $X$, and $V(A)$ is the variance of $A$.
Denote the density functions of $X$, $B$ by $g(x)$, $p(x)$ respectively, and let $A$ have values $a_\ell$ with probabilities $p_\ell$. The classical Fisher information of $X$ is defined by
$$F_c(X) = \int dx\, \frac{g'(x)^2}{g(x)}.$$
From the definition of $B$, we have
$$p(x) = \sum_\ell p_\ell\, g(x - t a_\ell) \approx g(x) - t \langle A \rangle\, g'(x) + \frac{t^2}{2} \langle A^2 \rangle\, g''(x),$$
where we have done an expansion to $O(t^2)$. Similarly, it is easily shown that In order to find $H(B)$, we integrate by parts assuming $g$ is sufficiently regular that $\lim_{x \to \pm\infty} g'(x) \ln g(x) = 0$. Hence we have from which the result follows. Now it can be verified that $I_\Delta(A:\ell)$ is unchanged under a simultaneous rescaling $\Delta \to \alpha\Delta$, $A \to \alpha A$. So $I_\Delta(A:\ell)$ in the limit of large $\Delta$ is the same as taking small $t$ in $H(tA + X) - \frac{1}{2} \log(2\pi e / t^2)$, where $t = 1/\Delta$ and $X$ is a standard Gaussian of unit variance and zero mean. Applying the above result, we get and $F_c(X) = 1$.
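The weak-measurement limit can be probed numerically. The sketch below rests on an assumption that is not spelled out in the text above: for a Gaussian pointer of width $\Delta$, the de Bruijn-type small-$t$ expansion with $F_c(X) = 1$ and $t = 1/\Delta$ is taken to give $I_\Delta(A:\ell) \approx V(A)/(2\Delta^2)$ in nats. The spectrum `values`, the weights `probs` and the width are hypothetical. The MI is computed as the average relative entropy between each conditional pointer distribution and the mixture, which avoids subtracting two nearly equal entropies.

```python
import math

def gaussian(y, mean, width):
    """Normal density with standard deviation `width`."""
    z = (y - mean) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))

def mutual_information_nats(values, probs, width, steps=4000, span=10.0):
    """I(A : outcome) for outcome = A + Gaussian noise, trapezoid rule."""
    lo = min(values) - span * width
    hi = max(values) + span * width
    dy = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * dy
        weight = 0.5 if k in (0, steps) else 1.0
        cond = [gaussian(y, a, width) for a in values]
        p_y = sum(p * g for p, g in zip(probs, cond))
        for p, g in zip(probs, cond):
            if g > 0.0:
                # sum_l p_l * g_l(y) * ln(g_l(y) / p(y))
                total += weight * dy * p * g * math.log(g / p_y)
    return total

values = [-1.0, 0.0, 2.0]     # assumed eigenvalues a_l of A
probs = [0.3, 0.5, 0.2]       # assumed weights p_l
width = 20.0                  # pointer width, large compared to the spread

mean = sum(p * a for p, a in zip(probs, values))
variance = sum(p * (a - mean) ** 2 for p, a in zip(probs, values))

mi = mutual_information_nats(values, probs, width)
prediction = variance / (2.0 * width ** 2)
print(mi, prediction)
```

Decreasing `width` makes the two numbers drift apart, as expected once the measurement is no longer weak.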
(ii) then immediately follows, since $[\mathcal{E}, U_k] = 0$ by definition. Instead, tracing out $P$ with position eigenstates, we have showing (iii); part (iv) follows from this expression.
Proof. (a) We need to use the fact that, for any quantum channel $\mathcal{N}$, $\mathrm{tr}[\mathcal{N}(\rho) \log \sigma] \le \mathrm{tr}[\rho \log \mathcal{N}^\dagger(\sigma)]$, which is a consequence of the concavity of the logarithm [34] (Lemma 3.6). From this, we have having used property (iv) for the third line.
(b) From property (i) above, where for the third line, we have used the fact that U k leaves the entropy unchanged.
(d) The scale-invariance is immediate from the fact that $\Phi_\Delta$ multiplies the matrix element $|A_i\rangle\langle A_j|$ by a function of $(a_i - a_j)/\Delta$. The properties of MIC b follow exactly the same logic as for MIC b above.