Geometry of two-qubit states with negative conditional entropy

We review the geometric features of negative conditional entropy and the properties of the conditional amplitude operator proposed by Cerf and Adami for two-qubit states in comparison with entanglement and nonlocality of the states. We identify the region of negative conditional entropy in the tetrahedron of locally maximally mixed two-qubit states. Within this set of states, negative conditional entropy implies nonlocality and entanglement, but not vice versa, and we show that the Cerf-Adami conditional amplitude operator provides an entanglement witness equivalent to the Peres-Horodecki criterion. Outside of the tetrahedron this equivalence does not hold in general.


I. INTRODUCTION
The feature of entanglement is the basis for many fascinating phenomena in quantum information and quantum communication, such as quantum teleportation [1,2] or quantum cryptography [3][4][5]. Although the division of quantum states into entangled and separable states is well-defined mathematically, checking whether a given state is entangled or not often proves to be extraordinarily difficult. Consequently, a plethora of inequivalent criteria and measures is available for the detection and classification of entanglement [6][7][8]. The spectrum of available methods ranges from entanglement monotones such as the concurrence [9][10][11] or negativity [12], and geometric entanglement detection criteria in the form of so-called entanglement witnesses [13][14][15][16], to measures that directly quantify the utility of a state for specific tasks requiring entanglement. One of the most prominent such tasks, which challenges our preconceptions of the reality of nature [17], is the test of a Bell inequality [18,19], distinguishing between so-called local and nonlocal states, for which the inequality is satisfied or violated, respectively. However, while entanglement is required to violate a Bell inequality, entanglement and nonlocality are not the same concept. As discovered by Werner [20], certain mixed states, although entangled, cannot be used to violate a Bell inequality and hence behave like strictly local states.
In contrast to methods that directly relate entanglement to physically measurable quantities stand information-theoretic approaches based on the entropies of quantum states. In both classical and quantum information theory entropies play a crucial role. Quite generally, entropy represents the degree of uncertainty (the lack of knowledge) about a (quantum) system. More specifically, the von Neumann entropy of a quantum state can be interpreted [21,22] as the minimal amount of information necessary to fully specify the state, be it separable or entangled. For the quantification of the correlations between two subsystems A and B, two particularly interesting entropies are the mutual entropy (or mutual information) $S(A:B)$ and the conditional entropy $S(A|B)$. In analogy to the classical case, the mutual entropy $S(A:B)$ corresponds to the amount of information contained in the joint state that exceeds the information locally available to A and B, i.e., $S(A:B)$ is a measure for the degree of correlation between subsystems A and B. On the other hand, $S(A|B)$ is the entropy of the state of subsystem A conditioned on the knowledge of the state of subsystem B. In a series of papers [23][24][25][26][27] investigating the conditional entropy and the mutual entropy by means of so-called mutual and conditional amplitude operators (CAO), Cerf and Adami concluded that the quantum conditional entropy, in contrast to its classical counterpart, can become negative for entangled states. This provides a connection between quantum nonseparability and the conditional or mutual entropy, which we wish to investigate further in this article. The purpose of our article is hence to review the geometry of quantum states with negative conditional entropy and to compare it with the different regions of nonlocality, entanglement and separability.

* nicolai.friis@uibk.ac.at, † reinhold.bertlmann@univie.ac.at
In particular, we want to focus on the paradigmatic case of two qubits, which is one of the very few examples where the different methods for detection and quantification of entanglement and nonlocality described above are practically computable and can be compared both numerically and geometrically. In this sense, albeit being a system of comparatively small complexity, the two-qubit case is of high significance, since it serves as a guiding example for developing the geometric understanding and intuition necessary to study more complicated systems.
Our investigation confirms that for the interesting class of locally maximally mixed states, the requirement of negative conditional entropy is a strictly stronger constraint than that of nonlocality, i.e., all states with negative conditional entropy are nonlocal, and therefore entangled, but the converse statements do not hold. We then consider an entanglement criterion based on the Cerf-Adami conditional amplitude operator and show that it is equivalent to the Peres-Horodecki criterion [13,28] for the set of locally maximally mixed states, but not for all two-qubit states.
This paper is structured as follows. We begin with a pedagogical review of the basic methods in Sect. II, discussing the geometric entanglement and separability characteristics in Sects. II A and II B, the boundary between local and nonlocal states in Sects. II C and II D, and the entropic correlation measures in Sects. II E and II F. We then present the results of our investigation in Sect. III, where we discuss the geometric aspects of the conditional entropy within the set of locally maximally mixed states in Sect. III A, provide an example for the general inequivalence of the Peres-Horodecki and the CAO criterion in Sect. III B, and discuss extensions to generalized entropies in Sect. III C. Finally, we draw conclusions in Sect. IV.

II. METHODS
In this section we will provide a pedagogical review of the methods for the detection and quantification of entanglement and nonlocality relevant to this study. The reader already well familiar with the geometry of separable, entangled, and nonlocal states for two-qubits may skip directly to Sect. III, where we present our results.

A. Entanglement & Separability
Quantum states are described by density operators $\rho$, i.e., positive semi-definite ($\rho \geq 0$), hermitean ($\rho = \rho^{\dagger}$, which of course follows from positivity) operators with unit trace, $\mathrm{Tr}(\rho) = 1$. These operators form a convex subset $\tilde{\mathcal{H}} \subset L(\mathcal{H})$ in the Hilbert-Schmidt space $L(\mathcal{H})$ of linear operators over the Hilbert space $\mathcal{H}$ of pure states. Given a bipartition of the Hilbert space into two subsystems A and B with respect to the tensor product, $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, one may classify the quantum states into separable and entangled states. The set $\mathcal{S}$ of separable states is defined as the convex (and compact) hull of product states, i.e., as the set of all states of the form

$$\rho = \sum_n p_n\, \rho_A^n \otimes \rho_B^n, \qquad 0 \leq p_n \leq 1, \quad \sum_n p_n = 1, \tag{1}$$

where $\rho_A^n$ and $\rho_B^n$ are density operators in $\tilde{\mathcal{H}}_A \subset L(\mathcal{H}_A)$ and $\tilde{\mathcal{H}}_B \subset L(\mathcal{H}_B)$, respectively. In contrast, any state that is not separable, i.e., which cannot be expressed as a convex combination of product states, is called entangled. The set $\mathcal{S}^c$ of entangled states hence forms the complement of the set of separable states, such that $\mathcal{S} \cup \mathcal{S}^c = \tilde{\mathcal{H}}$.
Here we would like to emphasize that the characterization of a given state as entangled or separable depends very much on the choice of factorizing the algebra of the corresponding density matrix [29,30]. From a practical point of view, this choice of bipartition is often suggested by the experimental setup, e.g., by the spatial separation of the observers Alice and Bob corresponding to subsystems A and B, respectively. From the perspective of a theorist, on the other hand, one is free to choose the bipartition into two subsystems. While a given density operator may well be separable with respect to the decomposition $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, it may be entangled with respect to another factorization $\mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$. Since such a different choice of bipartition corresponds to a change of basis in $\mathcal{H}$, it can be represented by a (global) unitary transformation. As shown in Ref. [29], every separable pure state admits a unitary operator transforming it to an entangled state, and vice versa. Interestingly, for mixed states this switch between separability and entanglement is only possible above a certain bound of purity. This implies that there exist quantum states which are separable with respect to all possible factorizations of the composite system into subsystems. This is the case if $U \rho\, U^{\dagger}$ remains separable for any unitary transformation $U$. Such states are called absolutely separable states [31][32][33]. Geometrically one may think of the absolutely separable states as a convex and compact [34] subset $\mathcal{A}$ of the separable states $\mathcal{S}$, much like $\mathcal{S}$ forms a convex subset of all states. In particular, when $\dim(\mathcal{H}_A) = \dim(\mathcal{H}_B) = d$, one may inscribe a ball of maximal radius $r_{\max} = 1/\bigl(d\sqrt{d^2-1}\bigr)$ into the set $\mathcal{S}$, where the distance of a state $\rho$ from the central maximally mixed state $\rho_{\rm mix} = \mathbb{1}_{d^2}/d^2$ is measured by the Hilbert-Schmidt distance $\|\rho - \rho_{\rm mix}\| = \sqrt{\mathrm{Tr}(\rho - \rho_{\rm mix})^2}$. All states within this so-called Kuś-Życzkowski ball [31] are separable.
Moreover, since the condition $r \leq r_{\max}$ translates to the purity as $\mathrm{Tr}(\rho^2) \leq 1/(d^2-1)$ and (global) unitaries leave the purity invariant, all states within this maximal ball are also absolutely separable. However, note that not all absolutely separable states lie within this ball, see, e.g., Refs. [29,31,35].
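The purity condition above is easy to check numerically. The following is a minimal sketch, assuming NumPy and using the two-qubit Werner family purely as a test case; the helper names `inside_max_ball`, `hs_distance_from_mix`, and `werner` are illustrative choices, not notation from the text.

```python
import numpy as np

def hs_distance_from_mix(rho):
    """Hilbert-Schmidt distance of rho from the maximally mixed state."""
    D = rho.shape[0]                      # total dimension, D = d^2
    delta = rho - np.eye(D) / D
    return np.sqrt(np.trace(delta @ delta).real)

def inside_max_ball(rho, tol=1e-12):
    """Purity test Tr(rho^2) <= 1/(D-1), equivalent to r <= r_max."""
    D = rho.shape[0]
    purity = np.trace(rho @ rho).real
    return purity <= 1.0 / (D - 1) + tol

# Test case (an assumption of this sketch): Werner states of two qubits.
psi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)

def werner(alpha):
    return alpha * np.outer(psi_minus, psi_minus) + (1 - alpha) * np.eye(4) / 4

r_max = 1 / (2 * np.sqrt(3))              # d = 2: r_max = 1/(d*sqrt(d^2-1))
for alpha in (0.3, 1 / 3, 0.5):
    print(alpha, hs_distance_from_mix(werner(alpha)), inside_max_ball(werner(alpha)))
```

Since the test only uses the purity, it is manifestly invariant under global unitaries, which is the reason the ball consists of absolutely separable states.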
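The convex nesting hierarchy $\mathcal{A} \subset \mathcal{S} \subset \tilde{\mathcal{H}} \subset L(\mathcal{H})$ of absolutely separable states, separable states, and all density operators within the Hilbert-Schmidt space holds for (bipartite) quantum systems of arbitrary dimensions $\dim(\mathcal{H}_A) = d_A$ and $\dim(\mathcal{H}_B) = d_B$. The density operators of such systems can be written in a generalized Bloch-Fano decomposition [36,37] as

$$\rho = \frac{1}{d_A d_B}\Bigl(\mathbb{1}_A \otimes \mathbb{1}_B + \sum_m a_m\, \sigma^A_m \otimes \mathbb{1}_B + \sum_n b_n\, \mathbb{1}_A \otimes \sigma^B_n + \sum_{m,n} t_{mn}\, \sigma^A_m \otimes \sigma^B_n\Bigr), \tag{2}$$

where the hermitean operators $\sigma^i_m$ for $i = A, B$ are the generalizations of the Pauli matrices, i.e., they are orthogonal in the sense that $\mathrm{Tr}(\sigma^i_m \sigma^i_n) = 2\delta_{mn}$, and traceless, $\mathrm{Tr}(\sigma^i_m) = 0$, and they coincide with the Pauli matrices for dimension 2. The coefficients $a_m, b_n \in \mathbb{R}$ are the components of the generalized Bloch vectors $\vec{a}$ and $\vec{b}$ of the subsystems A and B, respectively, which completely determine the reduced states $\rho_A = \mathrm{Tr}_B(\rho)$ and $\rho_B = \mathrm{Tr}_A(\rho)$. The real coefficients $t_{mn}$ are the components of the so-called correlation tensor. Note that $a_m$, $b_n$, and $t_{mn}$ cannot be chosen completely independently, but are jointly constrained by the positivity of $\rho$.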
An interesting subset of the state space is given by the set $\mathcal{W}$ of locally maximally mixed states or Weyl states, that is, the set of quantum states with vanishing Bloch vectors, $a_m = b_n = 0\ \forall m, n$, such that $\rho_A = \mathbb{1}_A/d_A$ and $\rho_B = \mathbb{1}_B/d_B$. The set $\mathcal{W}$ contains all the maximally entangled states (for which the marginals $\rho_A$ and $\rho_B$ are maximally mixed) and the uncorrelated maximally mixed state $\rho_{\rm mix} = \mathbb{1}_{AB}/(d_A d_B)$, and all states in $\mathcal{W}$ are fully determined by their correlation tensors $t = (t_{mn})$. For $d_A = d_B = d$, the singular value decomposition of the correlation tensor allows bringing $t$ to a diagonal form $\tilde{t} = \mathrm{diag}\{\tilde{t}_n\}$ using two orthogonal transformations $R_1$ and $R_2$, such that $R_1\, t\, R_2 = \tilde{t}$. Moreover, these orthogonal transformations can be realized by local unitaries $U_1 \otimes U_2$, which do not change the entanglement (or the entropy) of the state. This means that, up to local unitaries, all Weyl states for $d_A = d_B = d$ can be represented by vectors in $\mathbb{R}^{d^2-1}$ with components $\tilde{t}_n$ and density operators

$$\rho = \frac{1}{d^2}\Bigl(\mathbb{1} + \sum_n \tilde{t}_n\, \sigma_n \otimes \sigma_n\Bigr).$$

The vector components $\tilde{t}_n$ are constrained by the positivity of $\rho$, and the allowed vectors map out a convex set in $\mathbb{R}^{d^2-1}$. In the case of two qubits, i.e., when $d = 2$, the Weyl states can hence be nicely illustrated in $\mathbb{R}^3$, where $-1 \leq \tilde{t}_n \leq +1$ and (up to local unitaries) the set $\mathcal{W}$ forms a tetrahedron, shown in Fig. 1. The four Bell states $|\Phi^{\pm}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|00\rangle \pm |11\rangle\bigr)$ and $|\Psi^{\pm}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|01\rangle \pm |10\rangle\bigr)$, where $|0\rangle$ and $|1\rangle$ are the eigenstates of the third Pauli matrix with eigenvalues $+1$ and $-1$, respectively, are located at the four corners of the tetrahedron at $(\tilde{t}_1, \tilde{t}_2, \tilde{t}_3) = (1, -1, 1)$ for $|\Phi^{+}\rangle$, $(-1, 1, 1)$ for $|\Phi^{-}\rangle$, $(1, 1, -1)$ for $|\Psi^{+}\rangle$, and $(-1, -1, -1)$ for $|\Psi^{-}\rangle$, while the maximally mixed state $\rho_{\rm mix} = \tfrac{1}{4}\mathbb{1}_4$ is located at the origin $(0, 0, 0)$.
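To make the tetrahedron picture concrete, here is a small sketch (assuming NumPy) that builds a two-qubit Weyl state from its coordinates and tests positivity. The function names `weyl_state` and `is_valid_state` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [SX, SY, SZ]

def weyl_state(t):
    """Two-qubit Weyl state (1/4)(1 + sum_n t_n sigma_n x sigma_n)."""
    rho = np.eye(4, dtype=complex)
    for t_n, s in zip(t, PAULIS):
        rho = rho + t_n * np.kron(s, s)
    return rho / 4

def is_valid_state(rho, tol=1e-12):
    """A point belongs to the tetrahedron iff the operator is positive."""
    return np.linalg.eigvalsh(rho).min() >= -tol

# The corner (1, -1, 1) should reproduce the Bell state |Phi+>.
phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)
print(np.allclose(weyl_state((1, -1, 1)), np.outer(phi_plus, phi_plus)))
print(is_valid_state(weyl_state((1, 1, 1))))   # a cube corner outside the tetrahedron
```

Only four of the eight corners of the cube $-1 \leq \tilde{t}_n \leq 1$ correspond to positive operators, which is what carves the tetrahedron out of the cube.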
The region of separability is determined by the so-called positive partial transpose (PPT) criterion established by Peres [28] and the Horodecki family [13]. The criterion allows one to identify a bipartite quantum state as entangled if the partial transposition of its density operator does not yield a positive operator. Given a density matrix $\rho$ in the general Bloch decomposition of Eq. (2), the partial transposition corresponds to the transposition of the (generalized) Pauli operators in one of the subsystems, e.g., $(\sigma^A_n)_{kl} \rightarrow (\sigma^A_n)_{lk}$. In $2 \times 2$ and $2 \times 3$ dimensions the PPT criterion is necessary and sufficient to detect entanglement, but in higher dimensions entangled states can have a positive partial transpose. In our example of the Weyl states, the positivity constraint of the partial transpose identifies the separable Weyl states to lie within a double pyramid (see Refs. [15,38,39]) with corners at $(\tilde{t}_1, \tilde{t}_2, \tilde{t}_3) = (\pm 1, 0, 0)$, $(0, \pm 1, 0)$ and $(0, 0, \pm 1)$, as illustrated in Fig. 1. The maximal Kuś-Życzkowski ball [31] of absolutely separable states lies within the double pyramid and touches the faces of the pyramids at the points where $|\tilde{t}_1| = |\tilde{t}_2| = |\tilde{t}_3| = \tfrac{1}{3}$, four of which mark the closest separable states to the four Bell states. The entangled Weyl states are located in the four corners of the tetrahedron outside the double pyramid, extending from the separable states to the maximally entangled Bell states at the tips.

FIG. 1. The singular values $(\tilde{t}_1, \tilde{t}_2, \tilde{t}_3)$ of the correlation tensor serve as coordinates. The set $\mathcal{S}$ of separable states forms a double pyramid (blue) and the entangled states are located in the remaining corners outside. The maximally mixed state $\rho_{\rm mix}$ is located at the origin, and the maximal ball of absolutely separable states (purple) around $\rho_{\rm mix}$ is contained within $\mathcal{S}$, but touches the double pyramid at the central points of its eight faces. The states located at the tips of the double pyramid, for instance, the Narnhofer state $\rho_N$ at the coordinates $(1, 0, 0)$, are the separable states with maximal purity. All states that cannot violate the CHSH-Bell inequality lie within the dark-yellow, curved surfaces drawn outside the double pyramid, which includes also some entangled states, whereas all states with positive conditional entropy lie within the outermost curved surfaces (red), as discussed in Section III A.
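The PPT test for two qubits can be sketched in a few lines. The code below assumes NumPy and transposes subsystem B (the choice of subsystem does not affect the spectrum); `partial_transpose`, `is_ppt`, and `weyl_state` are illustrative names. For Weyl states the result can be compared against the double-pyramid condition $|\tilde{t}_1| + |\tilde{t}_2| + |\tilde{t}_3| \leq 1$, which describes the octahedron with the corners listed above.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def weyl_state(t):
    """Two-qubit Weyl state with coordinates t = (t1, t2, t3)."""
    rho = np.eye(4, dtype=complex)
    for t_n, s in zip(t, (SX, SY, SZ)):
        rho = rho + t_n * np.kron(s, s)
    return rho / 4

def partial_transpose(rho, dA=2, dB=2):
    """Transpose the indices of subsystem B only."""
    r = rho.reshape(dA, dB, dA, dB)           # indices (a, b, a', b')
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def is_ppt(rho, tol=1e-12):
    return np.linalg.eigvalsh(partial_transpose(rho)).min() >= -tol

for t in [(0.2, 0.2, 0.2), (0.5, -0.5, 0.5), (-0.6, -0.6, -0.6)]:
    in_pyramid = sum(abs(x) for x in t) <= 1
    print(t, "PPT:", is_ppt(weyl_state(t)), "inside double pyramid:", in_pyramid)
```

Because $\sigma_y^{T} = -\sigma_y$, the partial transposition simply reflects $\tilde{t}_2 \rightarrow -\tilde{t}_2$, so the separable Weyl states are the intersection of the tetrahedron with its mirror image.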

B. Entanglement Witnesses
The geometric picture that presents itself for the two-qubit Weyl states, i.e., the separation of separable from entangled states by planes (the faces of the double pyramid), can indeed be generalized to arbitrary dimensions. Owing to the convex structure of the set $\mathcal{S}$ and the Hahn-Banach theorem (see, e.g., Ref. [40, p. 75]), one may define so-called entanglement witness operators via the following theorem [13][14][15].

Theorem II.1. A state $\rho$ is entangled if and only if there exists a hermitean operator $W \in L(\mathcal{H})$, called an entanglement witness, such that

$$(W|\rho)_{\rm HS} < 0, \tag{4a}$$
$$(W|\rho_{\rm sep})_{\rm HS} \geq 0 \quad \forall\, \rho_{\rm sep} \in \mathcal{S}, \tag{4b}$$

where the Hilbert-Schmidt inner product is defined as $(A|B)_{\rm HS} := \mathrm{Tr}(A^{\dagger} B)$ for any $A, B \in L(\mathcal{H})$ and $\mathcal{S}$ denotes the set of separable states from Eq. (1).

Geometrically, a witness operator for a given state $\rho$ defines a hyperplane in the state space that separates the set $\mathcal{S}$ from the point representing the state $\rho$. An entanglement witness $W_{\rm opt}$ is called optimal if, in addition to the requirements of Eq. (4), there exists a separable state $\rho_{\rm sep} \in \mathcal{S}$ such that $(W_{\rm opt}|\rho_{\rm sep})_{\rm HS} = 0$. The operator $W_{\rm opt}$ then defines a tangent plane to the convex set of separable states.
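On the other hand, the minimal (Hilbert-Schmidt) distance of an entangled state $\rho$ from the set $\mathcal{S}$, the Hilbert-Schmidt measure $D(\rho)$ given by

$$D(\rho) = \min_{\rho_{\rm sep} \in \mathcal{S}} \|\rho_{\rm sep} - \rho\| = \|\rho_0 - \rho\|, \tag{5}$$

can be viewed as a measure of entanglement, where the state $\rho_0$ is called the nearest separable state to $\rho$. An interesting connection between the Hilbert-Schmidt measure and the entanglement witness inequality arises when we define the maximal violation of the entanglement witness inequality as

$$B(\rho) = \max_{W}\Bigl(\min_{\rho_{\rm sep} \in \mathcal{S}} (W|\rho_{\rm sep})_{\rm HS} - (W|\rho)_{\rm HS}\Bigr). \tag{6}$$

Here, the minimum is taken over all separable states and the maximum over all possible entanglement witnesses $W = W^{\dagger} \in L(\mathcal{H})$ that are suitably normalized, i.e., $\|W\| = 1$. With this, we can formulate the following theorem [15].

Theorem II.2. The maximal violation of the entanglement witness inequality is equal to the Hilbert-Schmidt measure, i.e.,

$$B(\rho) = D(\rho),$$

and is achieved for $\rho_{\rm sep} \rightarrow \rho_0$ and $W \rightarrow W_{\rm opt}$, where the optimal entanglement witness is given by

$$W_{\rm opt} = \frac{\rho_0 - \rho - (\rho_0|\rho_0 - \rho)_{\rm HS}\, \mathbb{1}}{\|\rho_0 - \rho\|}. \tag{8}$$

As an example, consider the totally antisymmetric Bell state

$$\rho^{-} = |\Psi^{-}\rangle\langle\Psi^{-}| = \tfrac{1}{4}\bigl(\mathbb{1} - \vec{\sigma} \otimes \vec{\sigma}\bigr), \tag{9}$$

where $\vec{\sigma} \otimes \vec{\sigma} = \sum_{n=1}^{3} \sigma_n \otimes \sigma_n$ is used as a shorthand and the $\sigma_n$ are the usual Pauli matrices. The optimal entanglement witness $W^{\rho^-}_{\rm opt}$ for this state is given by

$$W^{\rho^-}_{\rm opt} = \frac{1}{2\sqrt{3}}\bigl(\mathbb{1} + \vec{\sigma} \otimes \vec{\sigma}\bigr).$$

The Hilbert-Schmidt product with the entangled state of Eq. (9) yields

$$(W^{\rho^-}_{\rm opt}|\rho^{-})_{\rm HS} = -\frac{1}{\sqrt{3}} < 0, \tag{11}$$

as required for an entanglement witness in inequality (4a). To confirm that also the second inequality (4b) is satisfied, first note that any separable state can be written as a convex combination of product states with local Bloch vectors $\vec{a}_i$ and $\vec{b}_i$, that is,

$$\rho_{\rm sep} = \sum_i p_i\, \tfrac{1}{4}\Bigl(\mathbb{1} + \vec{a}_i \cdot \vec{\sigma} \otimes \mathbb{1} + \mathbb{1} \otimes \vec{b}_i \cdot \vec{\sigma} + \sum_{m,n} a^m_i b^n_i\, \sigma_m \otimes \sigma_n\Bigr).$$

The correlation tensor of any separable state hence has components $t_{mn} = \sum_i p_i\, a^m_i b^n_i$ and its trace is given by

$$\mathrm{Tr}(t) = \sum_i p_i\, \vec{a}_i \cdot \vec{b}_i = \sum_i p_i\, |\vec{a}_i|\,|\vec{b}_i| \cos\delta_i,$$

where $\delta_i$ is the angle between the Bloch vectors $\vec{a}_i$ and $\vec{b}_i$. For two qubits we have $|\vec{a}_i|, |\vec{b}_i| \leq 1$ and hence $-1 \leq \mathrm{Tr}(t) \leq 1$. If we then compute the Hilbert-Schmidt product of $W^{\rho^-}_{\rm opt}$ with an arbitrary separable state we therefore find

$$(W^{\rho^-}_{\rm opt}|\rho_{\rm sep})_{\rm HS} = \frac{1}{2\sqrt{3}}\bigl(1 + \mathrm{Tr}(t)\bigr) \geq 0,$$

as required in (4b). The nearest separable state $\rho_0$, for which $(W^{\rho^-}_{\rm opt}|\rho_0)_{\rm HS} = 0$, is given by

$$\rho_0 = \tfrac{1}{4}\bigl(\mathbb{1} - \tfrac{1}{3}\,\vec{\sigma} \otimes \vec{\sigma}\bigr),$$

which can be seen to lie on the face (closest to the corner representing $|\Psi^{-}\rangle$) of the double pyramid illustrated in Fig. 1. The optimal witness $W^{\rho^-}_{\rm opt}$ from Eq. (8) hence defines the plane containing this face of the pyramid. Finally, we can now easily compute the Hilbert-Schmidt measure $D(\rho^{-})$ from Eq. (5) and compare it to Eq. (11), obtaining

$$D(\rho^{-}) = \|\rho_0 - \rho^{-}\| = \frac{1}{\sqrt{3}} = -(W^{\rho^-}_{\rm opt}|\rho^{-})_{\rm HS}.$$

Since $\min_{\rho_{\rm sep} \in \mathcal{S}} (W|\rho_{\rm sep})_{\rm HS} = (W^{\rho^-}_{\rm opt}|\rho_0)_{\rm HS} = 0$ we can therefore conclude from Eq. (6) that, indeed, $D(\rho^{-}) = B(\rho^{-})$, as claimed in Theorem II.2.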
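The singlet example can be reproduced numerically. The sketch below (assuming NumPy) builds the witness from the nearest separable state $\rho_0 = \tfrac{1}{4}(\mathbb{1} - \tfrac{1}{3}\vec{\sigma}\otimes\vec{\sigma})$ via the general expression $W_{\rm opt} = [\rho_0 - \rho - (\rho_0|\rho_0-\rho)_{\rm HS}\mathbb{1}]/\|\rho_0 - \rho\|$ and evaluates the relevant Hilbert-Schmidt products; the helper `hs` is an illustrative name.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def hs(A, B):
    """Hilbert-Schmidt inner product (A|B) = Tr(A^dagger B), real for hermitean inputs."""
    return np.trace(A.conj().T @ B).real

sig_sig = sum(np.kron(s, s) for s in (SX, SY, SZ))
rho_minus = (np.eye(4) - sig_sig) / 4          # singlet state
rho_0 = (np.eye(4) - sig_sig / 3) / 4          # nearest separable state

diff = rho_0 - rho_minus
D = np.sqrt(hs(diff, diff))                    # Hilbert-Schmidt measure
W_opt = (diff - hs(rho_0, diff) * np.eye(4)) / D

print("D(rho-)        =", D)
print("(W|rho-)       =", hs(W_opt, rho_minus))
print("(W|rho_0)      =", hs(W_opt, rho_0))
# A product state |00><00| as a spot check of the separable side.
rho_00 = np.diag([1.0, 0, 0, 0])
print("(W||00><00|)   =", hs(W_opt, rho_00))
```

The witness expectation value vanishes on $\rho_0$ and is negative on $\rho^-$ by exactly the distance $D(\rho^-)$, which is the content of the theorem for this example.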

C. Bell Inequalities & Nonlocality
Let us now turn from the geometric aspects of entanglement to the property referred to as nonlocality. A quantum state is said to be nonlocal if it allows for the violation of a Bell inequality [18,19]. This terminology originates in Bell's locality hypothesis for local hidden-variable theories. In such models, the possible measurement outcomes A and B of two (distant) parties are determined by a hidden parameter λ. These theories are local in the sense that the values A = A(λ, a) and B = B(λ, b) depend on their local measurement settings a and b, respectively, but not on the setting of the other party. As can be shown [18,19], combinations of expectation values of local hidden-variable models are constrained by Bell inequalities, which may be violated by certain (entangled) quantum states.
To be more specific, we consider the Clauser-Horne-Shimony-Holt (CHSH) inequality [19,41] which, in analogy to the entanglement witness inequalities (Theorem II.1), can be written as

$$(\rho_{\rm loc}\,|\,2 \cdot \mathbb{1} - B_{\rm CHSH})_{\rm HS} \geq 0, \tag{17}$$

where the CHSH-Bell operator $B_{\rm CHSH}$ is given by

$$B_{\rm CHSH} = \vec{a} \cdot \vec{\sigma} \otimes (\vec{b} - \vec{b}') \cdot \vec{\sigma} + \vec{a}' \cdot \vec{\sigma} \otimes (\vec{b} + \vec{b}') \cdot \vec{\sigma},$$

and the vectors $\vec{a}, \vec{b}, \vec{a}', \vec{b}' \in \mathbb{R}^3$ denote the measurement directions. All local (and all separable) states $\rho_{\rm loc}$ satisfy the inequality (17). On the other hand, for nonlocal states, like the maximally entangled Bell state $\rho^{-}$ of Eq. (9), the CHSH inequality can be violated for some choice of measurement directions. That is, there exist states $\rho_{\rm nonloc}$ and settings $\vec{a}, \vec{b}, \vec{a}', \vec{b}'$ such that

$$(\rho_{\rm nonloc}\,|\,2 \cdot \mathbb{1} - B_{\rm CHSH})_{\rm HS} < 0,$$

mirroring Eq. (4a) in Theorem II.1. Since all separable states are local, the operator $(2 \cdot \mathbb{1} - B_{\rm CHSH})$ can be seen as a witness for nonlocality, and as a (non-optimal) entanglement witness. Unfortunately, this witness is not useful for arbitrary measurement settings. However, the cumbersome task of explicitly determining the directions $\vec{a}, \vec{b}, \vec{a}', \vec{b}'$ can be circumvented via another powerful theorem by the Horodecki family [42].

Theorem II.3 (CHSH operator criterion). Let $\rho$ be the density operator of a two-qubit state with correlation tensor $t = (t_{mn})$, see Eq. (2), and let $\mu_1$ and $\mu_2$ be the two largest eigenvalues of $M_\rho = t^{T} t$. The state is nonlocal if $B^{\max}_{\rm CHSH}$, the maximally possible expectation value of the Bell-CHSH operator, is larger than 2, i.e., if

$$B^{\max}_{\rm CHSH} = 2\sqrt{\mu_1 + \mu_2} > 2.$$

Using the CHSH operator criterion, it is straightforward to verify that there are quantum states that are entangled, but nonetheless local in the sense of the CHSH inequality. This is best exemplified by a certain family of bipartite mixed states, the so-called Werner states [20], given by

$$\rho_W(\alpha) = \alpha\, \rho^{-} + \frac{1 - \alpha}{4}\, \mathbb{1}_4 = \tfrac{1}{4}\bigl(\mathbb{1} - \alpha\, \vec{\sigma} \otimes \vec{\sigma}\bigr).$$

For the parameter range $0 \leq \alpha \leq 1$, the state $\rho_W(\alpha)$ can be viewed as an incoherent mixture of the maximally entangled Bell state $|\Psi^{-}\rangle$ with probability $\alpha$ on one hand, and the maximally mixed state $\tfrac{1}{4}\mathbb{1}_4$ with probability $(1 - \alpha)$ on the other. However, $\rho_W(\alpha)$ represents a valid density operator also for the range $-\tfrac{1}{3} \leq \alpha \leq 0$. Geometrically this can be understood as $\alpha$ parameterizing a straight line in Fig. 1 that connects the corner representing $|\Psi^{-}\rangle$ (for $\alpha = 1$) with $\rho_{\rm mix} = \tfrac{1}{4}\mathbb{1}_4$ (for $\alpha = 0$) at the origin, but continues onward until it intersects the opposite face of the double pyramid for $\alpha = -\tfrac{1}{3}$. As Werner discovered [20], the state $\rho_W(\alpha)$ is entangled for half its parameter range, that is, for $\tfrac{1}{3} < \alpha \leq 1$, the partial transpose of $\rho_W(\alpha)$ has a negative eigenvalue. However, the correlation tensor for this state is found to be $t_\alpha = -\alpha\, \mathbb{1}_3$, such that $M_\rho = \alpha^2\, \mathbb{1}_3$ and $B^{\max}_{\rm CHSH} = 2\sqrt{2}\,|\alpha|$. The Werner states hence violate the CHSH inequality only for $\alpha > \tfrac{1}{\sqrt{2}}$, although they are entangled already for $\alpha > \tfrac{1}{3}$.

Interestingly, there exist other Bell inequalities that are more efficient than the CHSH inequality in the sense that one may find states which violate the former, but not the latter. For instance, in Ref. [43], a Bell-type inequality was introduced for which the Werner states show nonlocality already when $\alpha > 0.7056$, which is slightly smaller than $\tfrac{1}{\sqrt{2}} \approx 0.7071$. At the same time, a recent improvement [44] of a known bound [45] has revealed that Bell inequalities based on projective measurements cannot be violated by Werner states with $\alpha \leq 0.682$, leaving only a small window of uncertainty.
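The CHSH operator criterion translates directly into code. The following sketch (assuming NumPy) extracts the correlation tensor via $t_{mn} = \mathrm{Tr}(\rho\, \sigma_m \otimes \sigma_n)$ and evaluates $2\sqrt{\mu_1 + \mu_2}$; the names `correlation_tensor`, `chsh_max`, and `werner` are illustrative.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [SX, SY, SZ]

def correlation_tensor(rho):
    """t_mn = Tr(rho sigma_m x sigma_n) for a two-qubit state."""
    return np.array([[np.trace(rho @ np.kron(sm, sn)).real for sn in PAULIS]
                     for sm in PAULIS])

def chsh_max(rho):
    """Maximal CHSH expectation value 2*sqrt(mu1 + mu2)."""
    t = correlation_tensor(rho)
    mu = np.sort(np.linalg.eigvalsh(t.T @ t))   # ascending order
    return 2 * np.sqrt(mu[-1] + mu[-2])

psi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)

def werner(alpha):
    return alpha * np.outer(psi_minus, psi_minus) + (1 - alpha) * np.eye(4) / 4

for alpha in (0.5, 0.7, 0.72, 1.0):
    print(alpha, chsh_max(werner(alpha)), chsh_max(werner(alpha)) > 2)
```

For the Werner family the function reduces to $2\sqrt{2}\,\alpha$, so the sign change of `chsh_max(...) - 2` occurs at $\alpha = 1/\sqrt{2}$, well inside the entangled range $\alpha > 1/3$.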
By employing general positive-operator-valued measurements (POVMs), one may in principle even go beyond the results for projective measurements and the Werner states may be nonlocal also for values of α below 0.682. Bounds on the region of nonlocality have also been obtained in this case. In Ref. [44] it was shown that the correlations of Werner states with α < 0.4547 can be explained by local hidden-variable models for any measurement (improving on the previously known bounds 0.416 [46] and 0.4519 [47]).
In general one may in fact even find states with positive partial transposition that can violate certain Bell inequalities [48]. The relationship of nonlocality with the PPT criterion, bound entanglement [49][50][51], or steering criteria [52] is hence complicated. For example, there are states whose entanglement is bound (no pure entangled state may be distilled from any number of copies of the state), which may yet violate a Bell inequality. Conversely, there are states with non-positive partial transposition (NPT) that do not violate any Bell inequality. And while all entangled states with positive partial transpose are bound entangled, it is not known whether a non-positive partial transposition implies distillability. For the remainder of this paper we will therefore focus on nonlocality in the sense of the violation of the CHSH inequality.
To incorporate this notion of nonlocality into our geometric picture, one can systematically apply the CHSH operator criterion to all Weyl states, noting that all locally maximally mixed states for which $\max_{i \neq j} 2\sqrt{\tilde{t}_i^{\,2} + \tilde{t}_j^{\,2}} > 2$, i.e., for which $\tilde{t}_i^{\,2} + \tilde{t}_j^{\,2} > 1$ for some pair $i \neq j$, are nonlocal. The resulting region of nonlocality is illustrated in Fig. 1, where it is situated in the four corners of the tetrahedron outside the dark-yellow "parachutes". The region of local states can be found within these parachutes and contains all separable but also a number of (mixed) entangled states [29,53].
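Since Weyl states are fully described by three coordinates, the regions of Fig. 1 can be sketched without any matrix algebra. The function below is a minimal illustration in plain Python; it assumes the closed-form spectrum of a two-qubit Weyl state and the double-pyramid condition $|\tilde{t}_1| + |\tilde{t}_2| + |\tilde{t}_3| \leq 1$, and the name `classify_weyl` and its return labels are hypothetical.

```python
def classify_weyl(t, tol=1e-12):
    """Classify a two-qubit Weyl state by its coordinates t = (t1, t2, t3)."""
    t1, t2, t3 = t
    # Four eigenvalues of (1/4)(1 + sum_n t_n sigma_n x sigma_n), times 4.
    eigs = [1 - t1 - t2 - t3, 1 - t1 + t2 + t3, 1 + t1 - t2 + t3, 1 + t1 + t2 - t3]
    if min(eigs) < -tol:
        raise ValueError("point lies outside the tetrahedron of states")
    if abs(t1) + abs(t2) + abs(t3) <= 1 + tol:
        return "separable"
    squares = sorted(x * x for x in t)
    if squares[1] + squares[2] > 1 + tol:       # two largest squared coordinates
        return "nonlocal (CHSH)"
    return "entangled, CHSH-local"

for point in [(0.2, 0.2, -0.2), (-0.5, -0.5, -0.5), (-0.8, -0.8, -0.8)]:
    print(point, classify_weyl(point))
```

The middle case is the interesting one: it sits outside the double pyramid but inside the "parachute", i.e., it is entangled yet cannot violate the CHSH inequality.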

D. Hidden Nonlocality
Since, as Werner demonstrated [20], certain entangled mixed states may satisfy all possible Bell inequalities, locality is not a sufficient criterion for separability. At this point it is important to note that the definition of nonlocality that we have used here is not the only one possible. Indeed, we call states nonlocal only if they can be directly used to violate a Bell inequality. However, as shown by Gisin [54], for some initially local quantum states the entanglement may be amplified by local filtering operations to allow for the violation of a Bell inequality. In this way the nonlocal character of the quantum system can be revealed (see also Ref. [55] in this connection).
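To understand this phenomenon, we consider a family of quantum states that arise as mixtures of pure (entangled) states $\rho_\theta = |\psi_\theta\rangle\langle\psi_\theta|$, where

$$|\psi_\theta\rangle = \sin\theta\, |01\rangle + \cos\theta\, |10\rangle \tag{21}$$

for $0 < \theta < \tfrac{\pi}{2}$, with the mixed state given by

$$\rho_{\rm top} = \tfrac{1}{2}\bigl(|00\rangle\langle 00| + |11\rangle\langle 11|\bigr).$$

The Gisin states [54] $\rho_G$ are hence given by

$$\rho_G(\lambda, \theta) = \lambda\, \rho_\theta + (1 - \lambda)\, \rho_{\rm top}$$

for real probability weights $0 \leq \lambda \leq 1$. Note that the Gisin states are in general not locally maximally mixed, i.e., the local Bloch vectors do not vanish for the whole parameter range. Only the subset for which $\theta = \pi/4$ can be represented in the tetrahedron of Weyl states as a line connecting the state $\rho_{\rm top}$ at the upper corner of the separable double pyramid with the maximally entangled state $|\Psi^{+}\rangle$, as shown in Fig. 1.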
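With the help of the PPT criterion one immediately finds that the Gisin state is entangled if and only if $1 - \lambda < \lambda \sin(2\theta)$. We can furthermore quantify the entanglement of $\rho_G$ using an entanglement monotone called concurrence [9][10][11]. For an arbitrary two-qubit density operator $\rho$, the concurrence $C[\rho]$ is given by

$$C[\rho] = \max\bigl\{0,\ \sqrt{\lambda_1} - \sqrt{\lambda_2} - \sqrt{\lambda_3} - \sqrt{\lambda_4}\bigr\}, \tag{24}$$

where the $\lambda_i$ ($i = 1, 2, 3, 4$) are the (nonnegative) eigenvalues of $\rho\, (\sigma_y \otimes \sigma_y)\, \rho^{*}\, (\sigma_y \otimes \sigma_y)$ in decreasing order ($\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$), and $\rho^{*}$ is the complex conjugate of $\rho$ with respect to the computational basis. For the Gisin state, a simple calculation reveals that

$$C[\rho_G(\lambda, \theta)] = \max\bigl\{0,\ \lambda \sin(2\theta) - (1 - \lambda)\bigr\},$$

which is illustrated in Fig. 2 for the allowed range of $\lambda$ and $\theta$. In contrast, we can determine the parameter range for which $\rho_G(\lambda, \theta)$ is nonlocal using the CHSH operator criterion of Theorem II.3. Reading off the matrix elements of the correlation tensor from the Bloch decomposition, $t = \mathrm{diag}\{\lambda \sin(2\theta),\ \lambda \sin(2\theta),\ 1 - 2\lambda\}$, one finds

$$B^{\max}_{\rm CHSH}[\rho_G] = 2\sqrt{\lambda^2 \sin^2(2\theta) + \max\bigl\{\lambda^2 \sin^2(2\theta),\ (1 - 2\lambda)^2\bigr\}}.$$

The parameter region for which the Gisin states are nonlocal is indicated in Fig. 2. Similar to the Weyl states in Fig. 1, some of the Gisin states may be local although they are more entangled (as measured by the concurrence) than some of the nonlocal Gisin states.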
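The concurrence formula can be checked against the closed form $\lambda\sin(2\theta) - (1-\lambda)$ for the Gisin states. The sketch below assumes NumPy and the parameterization $|\psi_\theta\rangle = \sin\theta|01\rangle + \cos\theta|10\rangle$, $\rho_{\rm top} = \tfrac12(|00\rangle\langle00| + |11\rangle\langle11|)$; the function names are illustrative.

```python
import numpy as np

SY = np.array([[0, -1j], [1j, 0]])

def gisin_state(lam, theta):
    """rho_G = lam |psi_theta><psi_theta| + (1 - lam) rho_top."""
    psi = np.array([0, np.sin(theta), np.cos(theta), 0])
    rho_top = np.diag([0.5, 0, 0, 0.5])
    return lam * np.outer(psi, psi) + (1 - lam) * rho_top

def concurrence(rho):
    """Wootters concurrence of a two-qubit density matrix."""
    yy = np.kron(SY, SY)
    R = rho @ yy @ rho.conj() @ yy
    ev = np.sort(np.linalg.eigvals(R).real)[::-1]   # decreasing order
    roots = np.sqrt(np.clip(ev, 0, None))           # guard against tiny negatives
    return max(0.0, roots[0] - roots[1] - roots[2] - roots[3])

lam, theta = 0.8, np.pi / 6
print(concurrence(gisin_state(lam, theta)))
print(lam * np.sin(2 * theta) - (1 - lam))
```

The eigenvalues are computed with a general (non-hermitean) solver because the product $\rho\,(\sigma_y\otimes\sigma_y)\,\rho^{*}(\sigma_y\otimes\sigma_y)$ is not hermitean in general, even though its spectrum is real and nonnegative.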
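However, the most interesting feature of the Gisin states is revealed by applying a local filtering procedure. That is, suppose that after sharing the state $\rho_G(\lambda, \theta)$ for some $\theta$ between 0 and $\pi/4$, Alice and Bob locally amplify their qubit states $|0\rangle_A$ and $|1\rangle_B$, respectively. Since in that case $\sin(\theta) < \cos(\theta)$, this increases the component of $|01\rangle$ with respect to that of $|10\rangle$ in $|\psi_\theta\rangle$, which effectively moves the state closer to the maximally entangled state $|\Psi^{+}\rangle$. Likewise, if $\pi/4 < \theta < \pi/2$, amplifying $|1\rangle_A$ and $|0\rangle_B$, respectively, will have the same effect. Mathematically, these filtering operations are represented by a family of local, completely positive, and trace-nonincreasing maps $\mathcal{F}_\theta$, parameterized by $\theta$, and given by

$$\mathcal{F}_\theta[\rho] = F_\theta\, \rho\, F_\theta^{\dagger}.$$

Here, we choose Kraus operators satisfying $F_\theta^{\dagger} F_\theta \leq \mathbb{1}$ which are given by

$$F_\theta = \begin{cases} F_0(\theta) \otimes F_1(\theta) & \text{for } 0 < \theta \leq \pi/4, \\ F_1(\tfrac{\pi}{2} - \theta) \otimes F_0(\tfrac{\pi}{2} - \theta) & \text{for } \pi/4 < \theta < \pi/2, \end{cases}$$

where the local operations are

$$F_0(\theta) = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{\tan(\theta)} \end{pmatrix}, \quad \text{and} \quad F_1(\theta) = \begin{pmatrix} \sqrt{\tan(\theta)} & 0 \\ 0 & 1 \end{pmatrix}.$$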
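The probability for successful filtering can be computed as

$$p_F = \mathrm{Tr}\bigl(F_\theta\, \rho_G\, F_\theta^{\dagger}\bigr) = \min\{\tan\theta, \cot\theta\}\, \bigl[\lambda \sin(2\theta) + 1 - \lambda\bigr].$$

With this, we get the normalized quantum state after the filtering procedure, i.e.,

$$\rho_F(\lambda, \theta) = \frac{F_\theta\, \rho_G\, F_\theta^{\dagger}}{p_F} = \frac{\lambda \sin(2\theta)\, \rho^{+} + (1 - \lambda)\, \rho_{\rm top}}{\lambda \sin(2\theta) + 1 - \lambda}.$$

The filtered Gisin state now fully lies within the set of Weyl states. In fact, the set of filtered Gisin states coincides with the set of unfiltered Gisin states for $\theta = \pi/4$, represented by the dashed red lines from $\rho_{\rm top}$ to $\rho^{+} = |\Psi^{+}\rangle\langle\Psi^{+}|$ in Fig. 1 and Fig. 2. Moreover, we can easily evaluate the concurrence of the filtered Gisin states, obtaining

$$C[\rho_F(\lambda, \theta)] = \max\Bigl\{0,\ \frac{\lambda \sin(2\theta) - (1 - \lambda)}{\lambda \sin(2\theta) + 1 - \lambda}\Bigr\},$$

which is illustrated in Fig. 3. As for the unfiltered state, we see that $\rho_F$ is entangled if (and only if) $\lambda \sin(2\theta) > (1 - \lambda)$. This means that the filtering is not able to entangle initially separable states. However, since the denominator satisfies $\lambda \sin(2\theta) + 1 - \lambda \leq 1$, all the already entangled states can be seen to become more entangled.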
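The filtering step can be sketched numerically for $0 < \theta \leq \pi/4$. The code assumes NumPy and local filters with entries $\sqrt{\tan\theta}$, i.e., $\mathrm{diag}(1, \sqrt{\tan\theta})$ for Alice and $\mathrm{diag}(\sqrt{\tan\theta}, 1)$ for Bob; this is the choice under which the filtered state takes the Weyl form $[\lambda\sin(2\theta)\rho^{+} + (1-\lambda)\rho_{\rm top}]/[\lambda\sin(2\theta) + 1 - \lambda]$ quoted in the text. Function names are illustrative.

```python
import numpy as np

def gisin_state(lam, theta):
    psi = np.array([0, np.sin(theta), np.cos(theta), 0])
    rho_top = np.diag([0.5, 0, 0, 0.5])
    return lam * np.outer(psi, psi) + (1 - lam) * rho_top

def local_filter(theta):
    """Kraus operator F_A x F_B for 0 < theta <= pi/4 (assumed sqrt(tan) entries)."""
    g = np.sqrt(np.tan(theta))
    return np.kron(np.diag([1.0, g]), np.diag([g, 1.0]))

def filter_state(rho, theta):
    """Return the normalized filtered state and the success probability."""
    F = local_filter(theta)
    out = F @ rho @ F.conj().T
    p = np.trace(out).real
    return out / p, p

lam, theta = 0.8, np.pi / 6
rho_F, p = filter_state(gisin_state(lam, theta), theta)

# Compare with the expected Weyl form, a Gisin state at theta = pi/4.
w = lam * np.sin(2 * theta)
expected = gisin_state(w / (w + 1 - lam), np.pi / 4)
print(np.allclose(rho_F, expected), p)
```

The success probability is strictly smaller than one for $\theta \neq \pi/4$; discarding the failed runs is what allows the entanglement of the surviving ensemble to increase.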
Although the filtering operation is local, this is possible since the part of the initial quantum state that does not pass the filters is disregarded. If we were to complete the (trace-nonincreasing) quantum operation $\mathcal{F}_\theta$ to a (trace-preserving) quantum channel $\tilde{\mathcal{F}}_\theta$ with Kraus operators $F_\theta$ and $\tilde{F}_\theta = \bigl(\mathbb{1} - F_\theta^{\dagger} F_\theta\bigr)^{1/2}$, then the entanglement would not increase.
Having noted that this amplification of the entanglement leaves separable states separable, it is now interesting to consider the effect on nonlocality. We calculate the maximally possible expectation value of the CHSH operator from Theorem II.3, which yields

$$B^{\max}_{\rm CHSH}[\rho_F] = 2\sqrt{\tilde{\lambda}^2 + \max\bigl\{\tilde{\lambda}^2,\ (1 - 2\tilde{\lambda})^2\bigr\}}, \qquad \tilde{\lambda} = \frac{\lambda \sin(2\theta)}{\lambda \sin(2\theta) + 1 - \lambda}.$$

Focussing on the parameter region where $\rho_F$ is entangled, i.e., for $\lambda > [1 + \sin(2\theta)]^{-1}$, where $B^{\max}_{\rm CHSH}[\rho_F] = 2\sqrt{2}\,\tilde{\lambda}$, we find the condition for the filtered Gisin state to be nonlocal as

$$\lambda > \bigl[1 + (\sqrt{2} - 1) \sin(2\theta)\bigr]^{-1}.$$

As illustrated in Fig. 3, the nonlocal parameter region for $\rho_F(\lambda, \theta)$ includes the entire region of nonlocality of the unfiltered state, but is also strictly larger. Some previously local (entangled) states become more strongly entangled and even nonlocal due to the filtering. The amplification of entanglement hence reveals the hidden nonlocality of some of the Gisin states, while others remain local. Although this separation may be attributed to the choice of filtering operation, it should be remarked here that not every entangled state can become nonlocal under local filtering operations [56].
Further note that, in contrast to Gisin's nonunitary but local filtering operations, one may instead use a unitary but nonlocal operation to increase the entanglement of the Gisin state. This simply corresponds to another choice of factorizing the algebra of a density matrix [29]. In this case, the mixedness of the state would remain unchanged. For instance, consider the unitary transformation given by

$$U_\theta = \frac{1}{\sqrt{2}} \begin{pmatrix} \sqrt{2} & 0 & 0 & 0 \\ 0 & f_{+}(\theta) & f_{-}(\theta) & 0 \\ 0 & -f_{-}(\theta) & f_{+}(\theta) & 0 \\ 0 & 0 & 0 & \sqrt{2} \end{pmatrix},$$

where $f_{\pm}(\theta) = \cos(\theta) \pm \sin(\theta)$. Since $\rho_\theta$ from Eq. (21) is transformed to the maximally entangled state $\rho^{+}$, i.e., $U_\theta\, \rho_\theta\, U_\theta^{\dagger} = \rho^{+}$, and $\rho_{\rm top} = U_\theta\, \rho_{\rm top}\, U_\theta^{\dagger}$ is left invariant by the unitary transformation, the Gisin states become

$$\rho_U(\lambda) = U_\theta\, \rho_G(\lambda, \theta)\, U_\theta^{\dagger} = \lambda\, \rho^{+} + (1 - \lambda)\, \rho_{\rm top}.$$

The unitarily transformed Gisin states are independent of $\theta$, and more specifically, $\rho_U(\lambda) = \rho_G(\lambda, \pi/4)$. The unitary hence corresponds to vertically moving states in Fig. 2 towards the dashed red line of $\theta = \pi/4$ while keeping $\lambda$ fixed. It can easily be seen that this allows for separable states to become entangled, and even nonlocal with respect to the new factorization.
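The action of such a global unitary is easy to verify numerically. The sketch below assumes NumPy and one explicit matrix that is consistent with the functions $f_{\pm}(\theta) = \cos\theta \pm \sin\theta$: a rotation in the subspace spanned by $|01\rangle$ and $|10\rangle$ that leaves $|00\rangle$ and $|11\rangle$ untouched. The specific matrix layout is an assumption of this example, and `u_theta` is an illustrative name.

```python
import numpy as np

def gisin_state(lam, theta):
    psi = np.array([0, np.sin(theta), np.cos(theta), 0])
    rho_top = np.diag([0.5, 0, 0, 0.5])
    return lam * np.outer(psi, psi) + (1 - lam) * rho_top

def u_theta(theta):
    """Rotation in span{|01>, |10>} mapping |psi_theta> to |Psi+> (assumed form)."""
    f_plus = np.cos(theta) + np.sin(theta)
    f_minus = np.cos(theta) - np.sin(theta)
    U = np.eye(4)
    U[1:3, 1:3] = np.array([[f_plus, f_minus], [-f_minus, f_plus]]) / np.sqrt(2)
    return U

lam, theta = 0.6, np.pi / 8
U = u_theta(theta)
rho_U = U @ gisin_state(lam, theta) @ U.T

print(np.allclose(U @ U.T, np.eye(4)))                      # unitarity
print(np.allclose(rho_U, gisin_state(lam, np.pi / 4)))      # theta-independence
print(np.trace(rho_U @ rho_U), np.trace(gisin_state(lam, theta) @ gisin_state(lam, theta)))
```

In contrast to the local filter, the purity (last line) is unchanged, while the state is moved onto the line $\theta = \pi/4$ of the Weyl tetrahedron.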

E. Classical and Quantum Entropy Measures
Let us now turn to another major category of quantities used for the characterization of correlations. Many fundamental features of multi-party (quantum) systems can be captured by entropy, a key concept in both classical and quantum physics. In classical information theory, the basic quantity is the Shannon entropy. For a random variable A whose possible values $a$ are encountered with probability $p(a)$, the Shannon entropy is given by

$$H(A) = -\sum_a p(a) \log p(a), \tag{35}$$

where the logarithm is understood to be to base 2. The Shannon entropy $H(A)$ represents the uncertainty for the occurrence of the values $a$ in the sense that it quantifies the amount of information (in bits) that is gained on average by sampling the random variable once. For a bipartite system with independent random variables A and B, the joint entropy is additive, $H(A, B) = H(A) + H(B)$. In general, the correlations between A and B are captured by the conditional entropy

$$H(A|B) = H(A, B) - H(B), \tag{37}$$

which is nonnegative, and by the mutual information $H(A : B) = H(A) + H(B) - H(A, B)$.

However, when we extend these entropic measures to quantum systems, we will encounter some interesting differences to the classical case, especially when entangled systems are considered. The quantum analogue to the classical entropy of Eq. (35) is the von Neumann entropy $S(\rho)$, defined as the Shannon entropy of the spectrum of the density operator $\rho$ representing the quantum state, that is,

$$S(\rho) = -\mathrm{Tr}(\rho \log \rho) = -\sum_n p_n \log p_n,$$

where $\rho = \sum_n p_n |\psi_n\rangle\langle\psi_n|$ for some orthonormal basis $\{|\psi_n\rangle\}$. Similar to the Shannon entropy, the von Neumann entropy represents the uncertainty (the lack of information) we have about the state represented by $\rho$. This definition naturally applies to bipartite systems with density operators $\rho_{AB}$, such that the joint entropy is

$$S(A, B) = S(\rho_{AB}) = -\mathrm{Tr}(\rho_{AB} \log \rho_{AB}).$$

Since the von Neumann entropy of pure states vanishes, one may quantify the entanglement of bipartite pure states $|\psi\rangle_{AB}$ by the entropy of the reduced states, i.e., one can define the entropy of entanglement $E(|\psi\rangle_{AB})$ as

$$E(|\psi\rangle_{AB}) = S(A) = S(B),$$

where $S(A) \equiv S(\rho_A)$ and $S(B) \equiv S(\rho_B)$. However, when generalizing this concept to mixed states, it becomes problematic to distinguish the contributions of the joint state entropy and entanglement to the entropy of the subsystems.
This necessitates the introduction of a complicated optimization procedure when defining the so-called entanglement of formation $E_{oF}$ of a mixed state as
$$E_{oF}(\rho) = \min_{\{p_n, |\psi_n\rangle\}} \sum_n p_n\, E(|\psi_n\rangle),$$
where the minimization is carried out over all pure-state ensembles realizing the density operator $\rho = \sum_n p_n |\psi_n\rangle\langle\psi_n|$. It is not known how to practically carry out this optimization in general, but for some special cases, $E_{oF}(\rho)$ can be computed explicitly. Amongst these, the most prominent is the case of two qubits, where the entanglement of formation is found to be a monotonically increasing function of the concurrence [10] of Eq. (24), i.e.,
$$E_{oF}(\rho) = h\!\left(\frac{1 + \sqrt{1 - C^2[\rho]}}{2}\right), \qquad h(x) = -x \log x - (1-x)\log(1-x).$$

[...] does not separate genuine quantum correlations (i.e., entanglement) from purely classical correlations. Instead, as emphasized by Cerf and Adami [23], the quantum mutual information $S(A:B)$ is a measure of the overall correlations. Moreover, $S(A:B)$ has an interesting interpretation in the context of quantum thermodynamics. The quantum mutual information can be shown to be proportional to the work cost of its creation from an initial thermal bath [57]. That is, the maximal amount of correlation, as measured by the mutual information, that can be created between two initially thermal, noninteracting systems at temperature $T$ at the expense of the work $W$ is $S(A:B)$ [...]. As we shall see in the next section, introducing the generalization of the conditional entropy to the quantum regime brings some more surprises.
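The entropic quantities of this section are straightforward to evaluate numerically for two qubits. The following is a minimal NumPy sketch, not part of the original analysis: it computes the von Neumann entropy, the entropy of entanglement of a pure state, and the entanglement of formation via Wootters' closed formula in terms of the concurrence. All function names, the tensor ordering $A \otimes B$, and the test states are illustrative assumptions.

```python
import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]])

def von_neumann_entropy(rho):
    """S(rho) in bits: Shannon entropy of the spectrum of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                      # convention: 0 log 0 = 0
    return float(-np.sum(p * np.log2(p)))

def reduced_state_B(rho_ab):
    """Partial trace over qubit A (ordering A ⊗ B)."""
    return np.einsum('abac->bc', rho_ab.reshape(2, 2, 2, 2))

def concurrence(rho):
    """Wootters concurrence of a two-qubit density matrix."""
    yy = np.kron(SIGMA_Y, SIGMA_Y)
    rho_tilde = yy @ rho.conj() @ yy      # spin-flipped state
    ev = np.linalg.eigvals(rho @ rho_tilde).real
    lam = np.sort(np.sqrt(np.clip(ev, 0.0, None)))[::-1]
    c = lam[0] - lam[1] - lam[2] - lam[3]
    return float(min(1.0, max(0.0, c)))   # clamp rounding errors

def binary_entropy(x):
    """h(x) = -x log2 x - (1-x) log2 (1-x), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

def entanglement_of_formation(rho):
    """Two-qubit E_oF as a function of the concurrence."""
    c = concurrence(rho)
    return binary_entropy((1 + np.sqrt(1 - c**2)) / 2)

# Pure Bell state |Psi^-> = (|01> - |10>)/sqrt(2)
psi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho_bell = np.outer(psi_minus, psi_minus)
E_pure = von_neumann_entropy(reduced_state_B(rho_bell))  # entropy of entanglement
E_of = entanglement_of_formation(rho_bell)

# Mixed example: Bell state with 20% white noise
rho_mixed = 0.8 * rho_bell + 0.2 * np.eye(4) / 4
print(E_pure, E_of, concurrence(rho_mixed), entanglement_of_formation(rho_mixed))
```

For the pure Bell state both measures should agree (one ebit), while for the noisy state the concurrence is reduced and the entanglement of formation drops accordingly.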

F. Conditional Entropy and Conditional Amplitude Operator
A straightforward generalization of the conditional entropy of Eq. (37) to bipartite density operators $\rho_{AB}$ on a joint Hilbert space is
$$S(A|B) = S(A,B) - S(B). \tag{50}$$
In contrast to its classical counterpart, this quantity can become negative: for a maximally entangled Bell state one has $S(A,B) = 0$ and $S(B) = 1$, such that $S(A|B) = -1$. A physical interpretation for the negative quantum conditional entropy was given in the context of state merging protocols between two observers [61]. There it was found that positive values of $S(A|B)$ quantify the partial information in qubits that need to be sent from $A$ to $B$, whereas a negative conditional entropy indicates that, in addition to successfully running the protocol, a surplus of qubits remains for potential future communication. Moreover, a classical analogue of negative partial information can also be given [62]. Other physical interpretations of negative conditional entropy arise in quantum thermodynamics [63], and when considering measurements of quantum systems, where the negative conditional entropy quantifies the amount of information in the post-selected ensembles [64]. These interesting interpretations motivate considering "entropic Bell inequalities" whose violation implies a negative conditional entropy [65].
Here, we want to better understand the relationship of negative conditional entropy and entanglement. In order to do so, let us first discuss a different way to extend the classical conditional entropy of Eq. (37) to the quantum case. That is, we consider the conditional amplitude operator $\rho_{A|B}$ proposed by Cerf and Adami [23,24], which is given by
$$\rho_{A|B} = \exp\!\left[\log \rho_{AB} - \log(\mathbb{1}_A \otimes \rho_B)\right], \tag{53}$$
where the exponential map is understood to be to base 2. The conditional amplitude operator is a positive semidefinite Hermitian operator defined on the support of $\rho_{AB}$ that takes over the role of the classical conditional probability $p(a|b)$ in the sense that one can now define the conditional entropy as
$$S(A|B) = -\mathrm{Tr}\!\left[\rho_{AB} \log \rho_{A|B}\right]$$
in analogy to Eq. (37). To see that this definition is equivalent to Eq. (50), simply note that
$$-\mathrm{Tr}\!\left[\rho_{AB} \log \rho_{A|B}\right] = -\mathrm{Tr}\!\left[\rho_{AB} \log \rho_{AB}\right] + \mathrm{Tr}\!\left[\rho_{AB} \log(\mathbb{1}_A \otimes \rho_B)\right] = S(A,B) - S(B).$$
Despite this close analogy between the conditional probability distribution and the conditional amplitude operator there are some fundamental differences. Whereas $p(a|b)$ is a probability distribution satisfying $0 \leq p(a|b) \leq 1$, its quantum analogue $\rho_{A|B}$ is not a density matrix in general. While $\rho_{A|B}$ is Hermitian and positive semidefinite, it can have eigenvalues larger than one, and hence $\rho_{A|B} \nleq \mathbb{1}$. Ultimately, this is what can lead to the negativity of the conditional entropy. As we have seen, a state for which this occurs is the maximally entangled Bell state. Moreover, we can immediately note that the spectrum of $\rho_{A|B}$ (and thus the conditional entropy) is invariant under any local unitary transformation of the form $U_A \otimes U_B$, which also leaves entanglement unchanged. This already suggests that the spectrum of the Cerf-Adami operator $\rho_{A|B}$ is related to the separability of quantum states. Indeed, the following theorem due to Cerf and Adami [24] can be formulated.

Theorem II.4. The operator
$$\sigma_{AB} = -\log \rho_{A|B} = \log(\mathbb{1}_A \otimes \rho_B) - \log \rho_{AB} \tag{56}$$
is positive semi-definite if the bipartite quantum states characterized by $\rho_{AB}$ are separable.
Theorem II.4 implies that any separable bipartite state satisfies the condition $\rho_{A|B} \leq \mathbb{1}$. In turn, this means that the conditional entropy is non-negative, $S(A|B) \geq 0$, for any separable state. States with negative conditional entropy must hence necessarily be entangled. Here, it is important to note that the negativity of the conditional entropy implies that (some of) the eigenvalues of $\rho_{A|B}$ exceed the physical boundary of unity, but the converse is not true, as we demonstrate by several examples in Sect. III A.
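To make the role of the conditional amplitude operator concrete, the following NumPy sketch evaluates the spectrum of $\rho_{A|B}$ and the conditional entropy $S(A|B) = \mathrm{Tr}[\rho_{AB}\,\sigma_{AB}]$ for a Bell state and for a classically correlated (separable) state. It is an illustration under stated assumptions rather than the authors' procedure: the operator is evaluated directly on the support of $\rho_{AB}$, the function names are hypothetical, and the tensor ordering is $A \otimes B$.

```python
import numpy as np

def reduced_state_B(rho_ab):
    """Partial trace over qubit A (ordering A ⊗ B)."""
    return np.einsum('abac->bc', rho_ab.reshape(2, 2, 2, 2))

def sigma_on_support(rho_ab, tol=1e-10):
    """Return (V, sigma), where the columns of V span the support of rho_AB
    and sigma = V^dag [log2(1 ⊗ rho_B) - log2(rho_AB)] V."""
    w, v = np.linalg.eigh(rho_ab)
    keep = w > tol
    V = v[:, keep]
    wb, vb = np.linalg.eigh(np.kron(np.eye(2), reduced_state_B(rho_ab)))
    # eigenvalues below tol are floored; they lie outside the support of rho_AB
    log_1b = (vb * np.log2(np.clip(wb, tol, None))) @ vb.conj().T
    sigma = V.conj().T @ log_1b @ V - np.diag(np.log2(w[keep]))
    return V, (sigma + sigma.conj().T) / 2

def cao_eigenvalues(rho_ab):
    """Eigenvalues of rho_{A|B} = 2^(-sigma) on the support of rho_AB."""
    _, sigma = sigma_on_support(rho_ab)
    return np.exp2(-np.linalg.eigvalsh(sigma))

def conditional_entropy(rho_ab):
    """S(A|B) = -Tr[rho_AB log2 rho_{A|B}] = Tr[rho_AB sigma]."""
    V, sigma = sigma_on_support(rho_ab)
    rho_s = V.conj().T @ rho_ab @ V
    return float(np.real(np.trace(rho_s @ sigma)))

# Maximally entangled Bell state |Psi^->
psi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho_bell = np.outer(psi_minus, psi_minus)

# Classically correlated separable state (|00><00| + |11><11|)/2
rho_cc = np.diag([0.5, 0.0, 0.0, 0.5])

for name, rho in [("Bell", rho_bell), ("classical", rho_cc)]:
    print(name, cao_eigenvalues(rho).max(), conditional_entropy(rho))
```

For the Bell state the largest eigenvalue of $\rho_{A|B}$ exceeds one and the conditional entropy is negative, whereas the separable state respects $\rho_{A|B} \leq \mathbb{1}$ and has $S(A|B) = 0$, in line with Theorem II.4.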
Moreover, the condition $\rho_{A|B} \leq \mathbb{1}$ and the non-negativity of $S(A|B)$ are only necessary for the separability of the quantum states, but are in general not sufficient. As realized in Ref. [24], there exist entangled quantum states $\rho_{AB}$ for which the operator $\sigma_{AB}$ from Eq. (56) is positive semi-definite, $\sigma_{AB} \geq 0$, and hence $\rho_{A|B} \leq \mathbb{1}$ and $S(A|B) \geq 0$. Such cases are of interest to the present work when it comes to detecting entanglement and nonlocality. The results of our investigation in $2 \times 2$ dimensions are presented in Sect. III A. Before we finally turn to these results, also note that an operator analogous to that of Eq. (53) can be defined for the mutual information [23]. The mutual amplitude operator $\rho_{A:B}$, defined as
$$\rho_{A:B} = \exp\!\left[\log(\rho_A \otimes \rho_B) - \log \rho_{AB}\right],$$
gives rise to the mutual information of Eq. [...]
III. RESULTS

A. Geometry of Two Qubit States with Negative Conditional Entropy
We now wish to incorporate the negativity of the conditional entropy and the conditional amplitude operator bound into the geometric picture of two-qubit entanglement. To this end, we first consider again the Werner states $\rho_W(\alpha)$ from Eq. (20). As we have previously argued, these locally maximally mixed states form a line in the tetrahedron of Weyl states, reaching from the maximally entangled state $|\Psi^-\rangle$ at $\alpha = 1$, through the maximally mixed state at the origin for $\alpha = 0$, to the opposite side of the separable double pyramid at $\alpha = -1/3$, see Fig. 1. The Werner states are entangled for $\alpha > 1/3$ and violate the CHSH inequality for $\alpha > 1/\sqrt{2}$.
When we now compute the conditional entropy for the Werner state, we note that the eigenvalues of $\rho_W$ are $\tfrac{1-\alpha}{4}$ (thrice degenerate) and $\tfrac{1+3\alpha}{4}$, while $S(B) = 1$ because the reduced state is maximally mixed. With this we find that the boundary between negative and non-negative conditional entropy is given by the state $\rho_W(\alpha_0)$, where $\alpha_0 \approx 0.7476 > 1/\sqrt{2}$ is the solution of the transcendental equation
$$-3\,\frac{1-\alpha}{4}\log\frac{1-\alpha}{4} - \frac{1+3\alpha}{4}\log\frac{1+3\alpha}{4} = 1.$$
The condition $S(A|B) < 0$ is hence a strictly stronger condition than nonlocality for the family of Werner states, as illustrated in Fig. 4. Indeed, a numerical analysis shows that this is the case for all Weyl states, i.e., the curved surfaces beyond which the conditional entropy becomes negative lie strictly outside of the local region within the orange parachutes in the tetrahedron of locally maximally mixed states, see Fig. 1. Examining, on the other hand, the conditional amplitude operator for the Werner states, one finds $\rho_{A|B} = 2\rho_W$, since $\rho_B = \mathrm{Tr}_A(\rho_W) = \tfrac{1}{2}\mathbb{1}$ such that $\log(\mathbb{1}_A \otimes \rho_B)$ commutes with $\log(\rho_W)$. Therefore, the condition $\rho_{A|B} \leq \mathbb{1}$ is met as long as $\alpha \leq 1/3$, i.e., as long as $\rho_W$ is separable, whereas $\rho_{A|B}$ has an eigenvalue larger than 1 for $\alpha > 1/3$. For the Werner states the Cerf-Adami condition $\rho_{A|B} \nleq \mathbb{1}$ is thus equivalent to the PPT criterion [13,28], a fact already noticed by Cerf and Adami [24].
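The three thresholds along the Werner line can be checked with a few lines of code. The sketch below assumes the eigenvalues and the reduced-state entropy quoted above, and locates the sign change of $S(A|B)$ by plain bisection; the helper names are illustrative.

```python
import numpy as np

def werner_conditional_entropy(alpha):
    """S(A|B) = S(rho_W) - 1 for the Werner state, using S(B) = 1."""
    p = np.array([(1 - alpha) / 4] * 3 + [(1 + 3 * alpha) / 4])
    p = p[p > 0]                          # convention: 0 log 0 = 0
    return float(-np.sum(p * np.log2(p)) - 1.0)

def bisect_root(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_ent = 1 / 3                 # entanglement (PPT and CAO) threshold
alpha_chsh = 1 / np.sqrt(2)       # CHSH violation threshold
alpha_0 = bisect_root(werner_conditional_entropy, alpha_ent, 1.0)

print(f"entangled for alpha > {alpha_ent:.4f}")
print(f"CHSH-nonlocal for alpha > {alpha_chsh:.4f}")
print(f"S(A|B) < 0 for alpha > {alpha_0:.4f}")
```

The ordering of the three printed values reflects the hierarchy discussed in the text: negative conditional entropy is the most restrictive of the three conditions on the Werner line.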
The observations we have made for the Werner states also hold for all other Weyl states beyond this one-parameter subfamily. A numerical evaluation of the conditional entropy of the locally maximally mixed states of Eq. (3), presented in Fig. 1, shows that the negativity of $S(A|B)$ is a strictly stronger condition than nonlocality for these states: the red, curved surfaces indicating where the conditional entropy changes sign lie outside the orange parachute surfaces marking the boundary of nonlocality for all Weyl states. Moreover, we can also formulate the following conditional amplitude operator (CAO) criterion.
Theorem III.1 (CAO Criterion). For every locally maximally mixed state $\rho \in \mathcal{W}$, the criterion $\rho_{A|B} \leq \mathbb{1}$ for the Cerf-Adami conditional amplitude operator $\rho_{A|B}$ given by Eq. (53) is equivalent to the PPT criterion, i.e.,
$$\rho_{A|B} \leq \mathbb{1} \iff \rho\ \text{has a positive partial transpose}.$$

Proof. For the proof of Theorem III.1 we recall Wootters' concurrence [9][10][11] from Eq. (24). For calculating $C$ we need the "spin-flipped" state $\tilde{\rho} = (\sigma_y \otimes \sigma_y)\,\rho^*\,(\sigma_y \otimes \sigma_y)$, which is equal to the density operator $\rho$ for all Weyl states, $\tilde{\rho} = \rho$. The square roots of the eigenvalues of $\rho\tilde{\rho}$ needed for the concurrence are hence just the eigenvalues $p_n$ ($n = 1, 2, 3, 4$) of $\rho$, which satisfy $\sum_n p_n = 1$. Consequently, the concurrence of all Weyl states can be written as
$$C[\rho] = \max\{0,\, 2p_1 - 1\}, \tag{59}$$
where the largest eigenvalue $p_1$ must exceed the value of $1/2$ for $\rho$ to be entangled. Next, recall that for all Weyl states we have $\rho_A = \rho_B = \tfrac{1}{2}\mathbb{1}$ and the Cerf-Adami conditional amplitude operator is hence given by
$$\rho_{A|B} = \exp\!\left[\log \rho - \log\!\left(\tfrac{1}{2}\mathbb{1}\right)\right] = 2\rho.$$
Consequently, we have $\rho_{A|B} \nleq \mathbb{1}$ when the largest eigenvalue of $\rho_{A|B} = 2\rho$ exceeds 1, i.e., when the largest eigenvalue of $\rho$ exceeds $1/2$. By virtue of Eq. (59) this means that the state is entangled. Conversely, all entangled Weyl states must have an eigenvalue larger than $1/2$ such that $\rho_{A|B} \nleq \mathbb{1}$. The fact that all entangled two-qubit states have nonzero concurrence and a non-positive partial transpose concludes the proof.
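Theorem III.1 lends itself to a quick numerical sanity check. The following sketch samples random Bell-diagonal (Weyl) states, evaluates $\rho_{A|B}$ from its definition, and compares the CAO criterion with the sign of the smallest eigenvalue of the partial transpose. The sampling from a flat Dirichlet distribution, the numerical tolerances, and all function names are assumptions made for illustration; such a check does not replace the proof.

```python
import numpy as np

def bell_projectors():
    """Projectors onto the four Bell states (ordering A ⊗ B)."""
    s = 1 / np.sqrt(2)
    vecs = [np.array([s, 0, 0, s]), np.array([s, 0, 0, -s]),
            np.array([0, s, s, 0]), np.array([0, s, -s, 0])]
    return [np.outer(v, v) for v in vecs]

def herm_fun(m, f):
    """Apply the scalar function f to a Hermitian matrix via its spectrum."""
    w, v = np.linalg.eigh(m)
    return (v * f(w)) @ v.conj().T

def partial_transpose_B(rho):
    """Transpose the indices of qubit B only."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

def conditional_amplitude_operator(rho):
    """rho_{A|B} = 2^(log2 rho - log2(1 ⊗ rho_B)); assumes rho has full rank."""
    rho_b = np.einsum('abac->bc', rho.reshape(2, 2, 2, 2))
    h = herm_fun(rho, np.log2) - herm_fun(np.kron(np.eye(2), rho_b), np.log2)
    return herm_fun((h + h.conj().T) / 2, np.exp2)

rng = np.random.default_rng(seed=1)
projs = bell_projectors()
n_samples, n_agree, n_entangled = 500, 0, 0
for _ in range(n_samples):
    p = rng.dirichlet(np.ones(4))                 # random Bell-diagonal weights
    rho = sum(pk * Pk for pk, Pk in zip(p, projs))
    cao_violated = np.linalg.eigvalsh(conditional_amplitude_operator(rho)).max() > 1 + 1e-9
    npt = np.linalg.eigvalsh(partial_transpose_B(rho)).min() < -1e-9
    n_agree += int(cao_violated == npt)
    n_entangled += int(npt)

print(f"criteria agree on {n_agree}/{n_samples} states, {n_entangled} entangled")
```

Both criteria reduce to the condition that the largest Bell-state weight exceeds $1/2$, so agreement on every sampled state is the expected outcome.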

B. Inequivalence of the CAO and PPT Criteria
Having established the significance of the conditional amplitude operator and the relationship of entanglement, negative conditional entropy, and nonlocality for the Weyl states, we are curious whether the observations we have made also hold for other states. We therefore consider the unitary orbit of one of the Weyl states that takes us outside this set. Starting from the Narnhofer state $\rho_N = \tfrac{1}{4}(\mathbb{1}_2 \otimes \mathbb{1}_2 + \sigma_x \otimes \sigma_x)$, situated at the corner of the double pyramid of separable states half-way on the line connecting $|\Psi^+\rangle$ and $|\Phi^+\rangle$ in the tetrahedron of Fig. 1, we apply a suitable global unitary transformation $U$. The resulting state $\rho_V = U \rho_N U^\dagger$ lies outside of the set $\mathcal{W}$ due to the occurrence of the term $\tfrac{1}{2}(\sigma_z \otimes \mathbb{1}_2 + \mathbb{1}_2 \otimes \sigma_z)$. The purity $\mathrm{Tr}(\rho_N^2) = \mathrm{Tr}(\rho_V^2) = \tfrac{1}{2}$ of the state is left unchanged by the unitary transformation, but the final state $\rho_V$ is entangled. In fact, the concurrence takes the maximally possible value at this fixed purity, $C[\rho_V] = \tfrac{1}{2}$, i.e., the state $\rho_V$ belongs to the class of maximally entangled mixed states (MEMS) [66,67]. In other words, no global unitary may entangle this state any further.
With this in mind, we now consider a family of states in the two-qubit Hilbert space along the line from $\rho_V$ to $\rho_{\rm top}$ from Eq. (22), i.e., we define
$$\rho_V(\nu) = \nu\,\rho_V + (1-\nu)\,\rho_{\rm top},$$
where $0 \leq \nu \leq 1$. The eigenvalues of the partial transpose of $\rho_V(\nu)$ are $\tfrac{\nu}{4}$ (twice degenerate) and $\tfrac{1}{4}\left[2 - \nu(1 \pm \sqrt{2})\right]$. The states along the line are hence entangled if $\nu > 2(\sqrt{2} - 1)$. Now, if we consider the CAO criterion, we first compute the reduced state and the spectrum of $\rho_V(\nu)$, given by
$$\rho_B = \mathrm{Tr}_A[\rho_V(\nu)] = \tfrac{1}{2}\left(\mathbb{1}_2 + \tfrac{\nu}{2}\sigma_z\right), \qquad \mathrm{spec}[\rho_V(\nu)] = \left\{\tfrac{1}{2},\, \tfrac{\nu}{2},\, \tfrac{1-\nu}{2},\, 0\right\}. \tag{65}$$
To compute the spectrum of $\rho_{A|B}$, note that $\rho_V(\nu)$ has (at least) one vanishing eigenvalue [see Eq. (65)], which is problematic when evaluating $\log \rho_V(\nu)$. However, a simple work-around is to replace the vanishing eigenvalue by $\epsilon > 0$ throughout the computation and take the limit $\epsilon \to 0$ at the end. With this procedure we obtain the eigenvalues of $\rho_{A|B}$ as
$$\left\{0,\ \frac{2}{2+\nu},\ \frac{2(1-\nu)}{2-\nu},\ \frac{2\nu}{\sqrt{4-\nu^2}}\right\}.$$
The first three eigenvalues never exceed 1, but the last eigenvalue becomes larger than one when $\nu > 2/\sqrt{5} > 2(\sqrt{2} - 1)$. We thus see that the PPT criterion and the CAO criterion are inequivalent in general. Nonetheless, the conditional entropy of the state $\rho_V(\nu)$ remains nonnegative for all values of $\nu$, and none of these states allows for a violation of the CHSH inequality either.
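The gap between the two thresholds, $2(\sqrt{2}-1) \approx 0.828$ for PPT and $2/\sqrt{5} \approx 0.894$ for the CAO criterion, can be reproduced numerically. The sketch below makes two assumptions that should be read as such: it uses the representative state $\rho_V = \tfrac{1}{2}|00\rangle\langle 00| + \tfrac{1}{2}|\Psi^+\rangle\langle\Psi^+|$, which has purity $1/2$ and reproduces the partial-transpose spectrum quoted above, and it evaluates $\rho_{A|B}$ directly on the support of the state, which is taken here to coincide with the $\epsilon \to 0$ limit described in the text.

```python
import numpy as np

S = 1 / np.sqrt(2)
KET_00 = np.array([1.0, 0, 0, 0])
KET_11 = np.array([0, 0, 0, 1.0])
PSI_PLUS = np.array([0, S, S, 0])

# Assumption: representative rank-2 state of purity 1/2 (see lead-in).
RHO_V = 0.5 * np.outer(KET_00, KET_00) + 0.5 * np.outer(PSI_PLUS, PSI_PLUS)
RHO_TOP = 0.5 * (np.outer(KET_00, KET_00) + np.outer(KET_11, KET_11))

def rho_v_family(nu):
    """Line of states from rho_top (nu = 0) to rho_V (nu = 1)."""
    return nu * RHO_V + (1 - nu) * RHO_TOP

def partial_transpose_B(rho):
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

def min_pt_eigenvalue(rho):
    return float(np.linalg.eigvalsh(partial_transpose_B(rho)).min())

def cao_max_eigenvalue(rho, tol=1e-10):
    """Largest eigenvalue of rho_{A|B}, evaluated on the support of rho."""
    w, v = np.linalg.eigh(rho)
    keep = w > tol
    V = v[:, keep]
    rho_b = np.einsum('abac->bc', rho.reshape(2, 2, 2, 2))
    wb, vb = np.linalg.eigh(np.kron(np.eye(2), rho_b))
    log_1b = (vb * np.log2(np.clip(wb, tol, None))) @ vb.conj().T
    h = np.diag(np.log2(w[keep])) - V.conj().T @ log_1b @ V
    return float(np.exp2(np.linalg.eigvalsh((h + h.conj().T) / 2)).max())

for nu in (0.80, 0.86, 0.95):
    rho = rho_v_family(nu)
    print(f"nu = {nu:.2f}: min PT eigenvalue = {min_pt_eigenvalue(rho):+.4f}, "
          f"max CAO eigenvalue = {cao_max_eigenvalue(rho):.4f}")
```

At the intermediate value $\nu = 0.86$ the partial transpose already has a negative eigenvalue while all eigenvalues of $\rho_{A|B}$ remain below one, which is the inequivalence stated in the text.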
To incorporate also negative conditional entropy and nonlocality into the picture, we hence turn again to the Gisin states $\rho_G(\lambda, \theta)$ from Eq. (23). The spectrum of the density operator $\rho_G(\lambda, \theta)$ is given by
$$\mathrm{spec}[\rho_G(\lambda, \theta)] = \left\{0,\ \lambda,\ \tfrac{1-\lambda}{2},\ \tfrac{1-\lambda}{2}\right\}, \tag{67}$$
while the reduced states $\rho_A$ and $\rho_B$ are already diagonal and have eigenvalues $\tfrac{1}{2}\left[1 \pm \lambda \cos(2\theta)\right]$. The graphical analysis of the parameter region for $\lambda$ and $\theta$ for which the conditional entropy is negative reveals an interesting feature. As can be seen in Fig. 5 (a), while some Gisin states are both nonlocal and have negative conditional entropy, some have only one of these properties, but not the other. That is, contrary to what was found for the Weyl states, in general not all states for which $S(A|B) < 0$ are also nonlocal. And, as before, not all nonlocal states have negative conditional entropy.
Following up on this surprise, let us quickly examine the condition $\rho_{A|B} \leq \mathbb{1}$ for the Gisin states. As noted in Ref. [24], there exist entangled states for which $\rho_{A|B} \leq \mathbb{1}$ indeed holds, but due to Theorem III.1, these must lie outside the set $\mathcal{W}$. The Gisin states are hence perfect examples of such states.
When computing the spectrum of $\rho_{A|B}$ for $\rho_G$ we again encounter (at least) one vanishing eigenvalue [see Eq. (67)]. As before, we therefore replace the vanishing eigenvalue by $\epsilon > 0$ in the computation and consider the limit $\epsilon \to 0$ at the end. With this method, one obtains the four eigenvalues $\kappa_i$ ($i = 1, \ldots, 4$) of $\rho_{A|B}$. While $\kappa_2$ and $\kappa_3$ are smaller than 1 for all values of $\lambda$ and $\theta$, the fourth eigenvalue $\kappa_4$ can become larger than 1. The corresponding region, delimited by the purple lines in Fig. 5 (a), is contained within the region of entangled states, but there is a region of entanglement where $\kappa_4 < 1$ and hence $\rho_{A|B} \leq \mathbb{1}$. This clearly demonstrates that the condition $\rho_{A|B} \leq \mathbb{1}$ for the Cerf-Adami operator in general provides a necessary but not sufficient condition for separability. For the sake of completeness and illustration, let us also re-examine the filtered Gisin states $\rho_F(\lambda, \theta)$ from Eq. (29). Since these are Weyl states, Theorem III.1 applies and the boundary between $\rho_{A|B} \leq \mathbb{1}$ and $\rho_{A|B} \nleq \mathbb{1}$ coincides with the boundary between separability and entanglement. To determine the conditional entropy of $\rho_F$ we note that the nonzero eigenvalues of the filtered Gisin states are $\frac{\lambda \sin(2\theta)}{1 - \lambda + \lambda \sin(2\theta)}$ and $\frac{1 - \lambda}{2\left[1 - \lambda + \lambda \sin(2\theta)\right]}$, where the latter eigenvalue is twice degenerate. With these eigenvalues, we can evaluate the conditional entropy and find that the region where it is negative is contained within the region of nonlocality, see Fig. 5 (b).

FIG. 5. [...] For the unfiltered states in (a), the region of entangled states whose conditional amplitude operator is bounded by unity, $\rho_{A|B} \leq \mathbb{1}$, is delimited in purple, showing that the condition $\rho_{A|B} \nleq \mathbb{1}$ is strictly weaker than the PPT criterion for two qubits. In addition, it can be clearly seen in (a) that there is no clear hierarchy between the conditions of nonlocality and negative conditional (von Neumann) entropy. That is, there exist local states with negative conditional entropy, as well as nonlocal states with non-negative conditional entropy, $S(A|B) \geq 0$. However, the hierarchy is recovered when considering conditional Rényi entropies $S_\alpha(A|B)$ from Eq. (71) for $\alpha \geq 2$. The corresponding boundaries for $\alpha = 2, 3, \ldots, 10$ are shown as dashed lines. In (b), the boundaries for $S(A|B) < 0$, nonlocality, and $\rho_{A|B} \nleq \mathbb{1}$ for the unfiltered states are indicated by the respective dashed lines.
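The absence of a hierarchy for the unfiltered Gisin states can be illustrated with explicit parameter values. The following sketch assumes the form $\rho_G(\lambda,\theta) = \lambda|\psi_\theta\rangle\langle\psi_\theta| + (1-\lambda)\rho_{\rm top}$ with $|\psi_\theta\rangle = \sin\theta|01\rangle + \cos\theta|10\rangle$, since Eq. (23) is not reproduced in this section. CHSH nonlocality is tested with the Horodecki quantity $M$ (the sum of the two largest eigenvalues of $T^{T}T$ for the correlation matrix $T$), and $\rho_{A|B}$ is evaluated on the support of the state. The three parameter pairs are hand-picked illustrations, not values taken from the paper.

```python
import numpy as np

PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def gisin_state(lam, theta):
    """Assumed form of the Gisin state (see lead-in)."""
    psi = np.array([0.0, np.sin(theta), np.cos(theta), 0.0])
    rho_top = np.diag([0.5, 0.0, 0.0, 0.5])
    return lam * np.outer(psi, psi) + (1 - lam) * rho_top

def reduced_state_B(rho):
    return np.einsum('abac->bc', rho.reshape(2, 2, 2, 2))

def entropy(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def conditional_entropy(rho):
    """S(A|B) = S(A,B) - S(B)."""
    return entropy(rho) - entropy(reduced_state_B(rho))

def chsh_M(rho):
    """Horodecki quantity: CHSH can be violated if and only if M > 1."""
    T = np.array([[np.real(np.trace(rho @ np.kron(si, sj))) for sj in PAULIS]
                  for si in PAULIS])
    ev = np.sort(np.linalg.eigvalsh(T.T @ T))
    return float(ev[-1] + ev[-2])

def min_pt_eigenvalue(rho):
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return float(np.linalg.eigvalsh(pt).min())

def cao_max_eigenvalue(rho, tol=1e-10):
    """Largest eigenvalue of rho_{A|B}, evaluated on the support of rho."""
    w, v = np.linalg.eigh(rho)
    keep = w > tol
    V = v[:, keep]
    wb, vb = np.linalg.eigh(np.kron(np.eye(2), reduced_state_B(rho)))
    log_1b = (vb * np.log2(np.clip(wb, tol, None))) @ vb.conj().T
    h = np.diag(np.log2(w[keep])) - V.conj().T @ log_1b @ V
    return float(np.exp2(np.linalg.eigvalsh((h + h.conj().T) / 2)).max())

local_negative = gisin_state(0.90, 0.34)          # CHSH-local, S(A|B) < 0
nonlocal_positive = gisin_state(0.75, np.pi / 4)  # CHSH-nonlocal, S(A|B) > 0
entangled_bounded = gisin_state(0.65, 0.34)       # entangled, rho_{A|B} <= 1

for name, rho in [("local_negative", local_negative),
                  ("nonlocal_positive", nonlocal_positive),
                  ("entangled_bounded", entangled_bounded)]:
    print(name, conditional_entropy(rho), chsh_M(rho),
          min_pt_eigenvalue(rho), cao_max_eigenvalue(rho))
```

The first two states show that neither of the conditions $S(A|B) < 0$ and CHSH nonlocality implies the other outside the set of Weyl states, and the third is an entangled state that the CAO criterion fails to detect.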

C. Negativity of Generalized Conditional Entropies
For the conditional entropy based on the von Neumann entropy $S(\rho)$, no clear hierarchy with nonlocality can hence be established in general. Some states may be nonlocal and satisfy $S(A|B) \geq 0$, while other states may have negative values of $S(A|B)$ whilst being local (in the sense of the CHSH inequality). An interesting way out of this confusion is employing generalized entropy measures. One candidate for such an extension is the Rényi $\alpha$-entropy, defined as
$$S_\alpha(\rho) = \frac{1}{1-\alpha} \log \mathrm{Tr}(\rho^\alpha),$$
with the associated conditional entropy
$$S_\alpha(A|B) = S_\alpha(A,B) - S_\alpha(B), \tag{71}$$
where $S_\alpha(A,B) \equiv S_\alpha(\rho_{AB})$ and $S_\alpha(B) \equiv S_\alpha(\rho_B)$. These generalized conditional entropies can be shown [68] to be nonnegative for all separable states, such that $S_\alpha(A|B) < 0$ implies that the quantum state is entangled. Moreover, it was shown in Ref. [68] that the negativity of the conditional Rényi 2-entropy, $S_2(A|B) < 0$, already is a necessary condition for nonlocality in the CHSH sense. In other words, the positivity of $S_2(A|B)$ means that the CHSH inequality cannot be violated, i.e., [...], where $\vec{a}$ and $\vec{b}$ are the Bloch vectors [see Eq. (2)] of the two qubits, respectively. As an entanglement criterion, the negativity of the conditional Rényi 2-entropy is hence strictly stronger than nonlocality for two qubits, in the sense that it detects all CHSH-nonlocal states and more, but this is not the case in higher dimensions [68]. This is illustrated for the Weyl states in one sector of the tetrahedron in Fig. 6. Moreover, it was proven in Ref. [39] that a Weyl state is separable if and only if all of its conditional Rényi $\alpha$-entropies are non-negative.

FIG. 6. The sector of the tetrahedron of Weyl states defined by $\tilde{t}_1, \tilde{t}_2 \geq 0$ and $\tilde{t}_3 \leq 0$ is shown. In addition to the boundaries shown in Fig. 1, the boundary between states with positive and negative conditional Rényi 2-entropy from Eq. (71) is illustrated. As can be seen, all local states lie within the set of states for which $S_1(A|B) \geq 0$, whereas all nonlocal states have negative conditional Rényi 2-entropy. The dashed red line indicates the filtered Gisin states in this sector of the tetrahedron, corresponding to (parts of) the dashed red lines in Fig. 1 and Fig. 2.
The positivity of the entire family of conditional Rényi $\alpha$-entropies hence provides an entanglement criterion equivalent to the PPT criterion for locally maximally mixed states. For other two-qubit states things are again less clear. For instance, for the unfiltered Gisin states $\rho_G(\lambda, \theta)$ from Eq. (23), the boundaries of the conditions $S_\alpha(A|B) \geq 0$ are shown in Fig. 5 (a) for $\alpha = 2, 3, \ldots, 10$. As entanglement criteria, these are clearly stronger than nonlocality, but weaker than PPT or $\rho_{A|B} \leq \mathbb{1}$.
Nonetheless, conditional entropies and the conditional amplitude operator provide straightforward entanglement witnesses that can in principle be employed in systems of arbitrary dimension. For example, for some specific two-mode Gaussian states (where the Hilbert space is infinite-dimensional), the Tsallis $q$-conditional entropy gives results comparable to the two-qubit case [69]. In general, however, the exact relationship between entanglement, nonlocality, and conditional entropies is complicated.
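For the Werner states the conditional Rényi entropies depend only on the spectrum quoted in Sect. III A, so the location of their sign change is easy to compute. The sketch below uses the standard definition $S_\alpha(\rho) = \frac{1}{1-\alpha}\log \mathrm{Tr}(\rho^\alpha)$ for orders $\alpha > 1$ and finds the sign change by bisection. To avoid a clash with the Rényi order, the Werner parameter is called `a` in the code; all names are illustrative assumptions.

```python
import numpy as np

def renyi_entropy(p, order):
    """Rényi entropy (base 2) of a probability vector, for order > 1."""
    p = np.asarray(p, dtype=float)
    return float(np.log2(np.sum(p ** order)) / (1 - order))

def werner_conditional_renyi(a, order):
    """S_order(A|B) for the Werner state with mixing parameter a."""
    spectrum = [(1 - a) / 4] * 3 + [(1 + 3 * a) / 4]
    return renyi_entropy(spectrum, order) - renyi_entropy([0.5, 0.5], order)

def sign_change(order, lo=1 / 3, hi=1.0, iters=200):
    """Bisection for the Werner parameter at which S_order(A|B) changes sign."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if werner_conditional_renyi(mid, order) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

thresholds = {order: sign_change(order) for order in (2, 3, 5, 10)}
for order, a_star in thresholds.items():
    print(f"order {order:2d}: S(A|B) < 0 for a > {a_star:.4f}")
print(f"entanglement threshold 1/3 = {1 / 3:.4f}, CHSH threshold = {1 / np.sqrt(2):.4f}")
```

For order 2 the sign change occurs at $1/\sqrt{3}$, between the entanglement threshold $1/3$ and the CHSH threshold $1/\sqrt{2}$, and the thresholds move towards $1/3$ as the order grows, in line with the statement that the whole family reproduces the PPT criterion for Weyl states.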

IV. CONCLUSION
We have reviewed the geometry of entanglement for two-qubit systems. Despite its simplicity, this bipartite system already reveals many of the intricacies in the relationship of the numerous criteria for entanglement and separability, and is hence an important guiding example. In particular, we have focussed on highlighting the roles of negative conditional entropy and the conditional amplitude operator criterion as entanglement detection methods. Since many technical complications already arise for the simple two-qubit case, we have placed specific emphasis on the family of locally maximally mixed Weyl states. For the latter, a clear hierarchy emerges, in which the set of CHSH-nonlocal states fully contains the set of states with negative conditional (von Neumann) entropy, while it is itself fully contained within the set of states with negative conditional Rényi 2-entropy. At the same time, we have shown that the conditional amplitude operator criterion is equivalent to the PPT criterion for all Weyl states, but not in general, as we have demonstrated for several examples.
Our article hence provides both an introduction to the topic of entanglement geometry and a step towards the exploration of conditional amplitude operators as general entanglement detection tools. Specifically, it may be of interest for future research to investigate possible generalizations of conditional entropy operators, and to clarify whether the violation of the criterion $\rho_{A|B} \leq \mathbb{1}$ implies a nonpositive partial transpose in general.