Fundamental limits on key rates in device-independent quantum key distribution

In this paper, we introduce intrinsic non-locality as a quantifier for Bell non-locality, and we prove that it satisfies certain desirable properties such as faithfulness, convexity, and monotonicity under local operations and shared randomness. We then prove that intrinsic non-locality is an upper bound on the secret-key-agreement capacity of any device-independent protocol conducted using a device characterized by a correlation $p$. We also prove that intrinsic steerability is an upper bound on the secret-key-agreement capacity of any semi-device-independent protocol conducted using a device characterized by an assemblage $\hat{\rho}$. We also establish the faithfulness of intrinsic steerability and intrinsic non-locality. Finally, we prove that intrinsic non-locality is bounded from above by intrinsic steerability.


Introduction
In principle, quantum key distribution (QKD) [BB84, Eke91, SBPC+09] provides unconditional security [May01, SP00, LCT14] for establishing secret key at a distance. In the standard QKD setting, Alice and Bob (two spatially separated parties) trust the functioning of their devices. That is, it is assumed that they know the ensemble of states that their sources are preparing and the measurement that their devices are performing. However, this is a very strong set of assumptions.
It is possible to consider other scenarios in which the trust assumptions are relaxed while still obtaining unconditional security. When one of the devices is untrusted, the protocol is referred to as one-sided device-independent (SDI) quantum key distribution [TR11, BCW+12]. If both the devices are untrusted, then we are dealing with the scenario of device-independent (DI) quantum key distribution [MY98, ABG+07, VV14, AFDF+18].
It is interesting to note that the three scenarios of QKD mentioned above are in correspondence with a hierarchy of quantum correlations [WJD07]. The standard QKD approach requires that Alice and Bob share entanglement [HHHH09] or that they are connected by a channel that can preserve entanglement. In SDI-QKD, a requirement for Alice and Bob to generate secret key is that their systems violate a steering inequality [BCW+12, CS17]. For DI-QKD, Alice and Bob's systems should violate a Bell inequality [CHSH69, ABG+07, BCP+14].
non-locality. This quantity was then proven to be an upper bound on device-independent secret key rates that are secure against a no-signaling adversary with classical inputs and outputs. This is the scenario in which an untrusted no-signaling device is shared by the honest parties (as in our model), while the inputs and outputs of an adversary Eve are assumed to satisfy only the no-signaling conditions. The latter is more 'liberal' than the models considered in our work.

Restricted intrinsic steerability
In this section, we recall the definition of restricted intrinsic steerability, which was introduced in [KWW17]. We begin by recalling the notion of an assemblage. Let $\rho_{AB}$ be a bipartite quantum state shared by Alice and Bob. Suppose that Alice performs a measurement labeled by $x \in \mathcal{X}$, with $\mathcal{X}$ denoting a finite set of quantum measurement choices, and she gets a classical output $a \in \mathcal{A}$, with $\mathcal{A}$ denoting a finite set of measurement outcomes. An assemblage [Pus13] consists of the state of Bob's subsystem and the conditional probability of Alice's outcome $a$ (correlated with Bob's state) given the measurement choice $x$. This is specified as $\{p_{\bar{A}|X}(a|x), \rho^{a,x}_B\}_{a \in \mathcal{A}, x \in \mathcal{X}}$. The sub-normalized state possessed by Bob is $\hat{\rho}^{a,x}_B := p_{\bar{A}|X}(a|x)\rho^{a,x}_B$. Taking $p_X(x)$ as a probability distribution over measurement choices, we can then embed the assemblage $\{\hat{\rho}^{a,x}_B\}_{a,x}$ in a classical-quantum state as follows:
$$\rho_{X\bar{A}B} := \sum_{a,x} p_X(x)\, [x\, a]_{X\bar{A}} \otimes \hat{\rho}^{a,x}_B. \tag{1}$$
Notation 1 In the above and what follows, we employ the shorthand $[x\, a]_{X\bar{A}}$ to denote $|x\rangle\langle x|_X \otimes |a\rangle\langle a|_{\bar{A}}$.
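Assemblages are restricted by the no-signaling principle. That is, the reduced state of Bob's system should not depend on the input $x$ to Alice's black box if the measurement output $a$ is not available to him:
$$\sum_a \hat{\rho}^{a,x}_B = \sum_a \hat{\rho}^{a,x'}_B \quad \forall x, x' \in \mathcal{X}. \tag{2}$$
This is equivalent to $I(X;B)_\rho = 0$ for all input probability distributions $p_X(x)$, where $I(X;B)_\rho := H(X)_\rho + H(B)_\rho - H(XB)_\rho$ is the mutual information of the reduced state $\rho_{XB} = \operatorname{Tr}_{\bar{A}}(\rho_{X\bar{A}B})$. An assemblage is referred to as LHS (local-hidden-state) if it arises from a classical shared hidden variable $\Lambda$ in the following sense:
$$\hat{\rho}^{a,x}_B = \sum_\lambda p_\Lambda(\lambda)\, p_{\bar{A}|X\Lambda}(a|x,\lambda)\, \rho^\lambda_B. \tag{3}$$
We now recall a measure of steerability that was introduced in [KWW17]:
Definition 2 (Restricted intrinsic steerability [KWW17]) Let $\{\hat{\rho}^{a,x}_B\}_{a,x}$ denote an assemblage, and let $\rho_{X\bar{A}B}$ denote a corresponding classical-quantum state. Consider a no-signaling extension $\rho_{X\bar{A}BE}$ of $\rho_{X\bar{A}B}$ of the following form:
$$\rho_{X\bar{A}BE} = \sum_{a,x} p_X(x)\, [x\, a]_{X\bar{A}} \otimes \hat{\rho}^{a,x}_{BE}, \tag{4}$$
$$\sum_a \hat{\rho}^{a,x}_{BE} = \sum_a \hat{\rho}^{a,x'}_{BE} \quad \forall x, x' \in \mathcal{X}. \tag{5}$$
The restricted intrinsic steerability of $\{\hat{\rho}^{a,x}_B\}_{a,x}$ is defined as follows:
$$S(\bar{A};B)_{\hat{\rho}} := \sup_{p_X} \inf_{\rho_{X\bar{A}BE}} I(X\bar{A};B|E)_\rho, \tag{6}$$
where the supremum is with respect to all probability distributions $p_X$ and the infimum is with respect to all no-signaling extensions of $\rho_{X\bar{A}B}$ as specified above. Furthermore, the conditional mutual information of a tripartite state $\sigma_{KLM}$ is defined as
$$I(K;L|M)_\sigma := H(KM)_\sigma + H(LM)_\sigma - H(M)_\sigma - H(KLM)_\sigma. \tag{7}$$
Using the no-signaling constraints, which imply that $I(X;B|E)_\rho = 0$, and the chain rule for conditional mutual information, $I(X\bar{A};B|E)_\rho = I(X;B|E)_\rho + I(\bar{A};B|EX)_\rho$, it follows that
$$S(\bar{A};B)_{\hat{\rho}} = \sup_{p_X} \inf_{\rho_{X\bar{A}BE}} I(\bar{A};B|EX)_\rho. \tag{8}$$

Quantum non-locality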

Correlations
Consider a two-component device that takes in two inputs and gives out two outputs. Let one component be with Alice and the other component be with Bob. Let us set some notation now. Alice's component takes in an input letter $x \in \mathcal{X}$ and outputs $a \in \mathcal{A}$. Similarly, Bob's component accepts an input letter $y \in \mathcal{Y}$ and outputs $b \in \mathcal{B}$. We consider $\mathcal{X}$ and $\mathcal{Y}$ to be finite sets of quantum measurement choices and $\mathcal{A}$ and $\mathcal{B}$ to be finite sets of measurement outcomes. For simplicity, we consider $\mathcal{X} = \mathcal{Y} = [s]$ and $\mathcal{A} = \mathcal{B} = [r]$. The conditional probability distribution $\{p(a,b|x,y)\}_{a,b \in [r],\, x,y \in [s]}$ corresponding to the device is traditionally called a "correlation." Correlations can be divided as follows according to the constraints that they fulfill.
• Local correlations: A correlation is said to have a local-hidden variable (LHV) description or be a local correlation if it can be written as $p(a,b|x,y) = \sum_\lambda p_\Lambda(\lambda)\, p(a|x,\lambda)\, p(b|y,\lambda)$ (9), where $\Lambda$ is a local hidden variable, $p_\Lambda(\lambda)$ is the probability that the realization $\lambda$ of the local hidden variable $\Lambda$ occurs, $p(a|x,\lambda)$ is the probability of obtaining the outcome $a$ given $x$ and $\lambda$, and $p(b|y,\lambda)$ is the probability of obtaining the outcome $b$ given $y$ and $\lambda$. Let $\mathcal{L}$ denote the set of correlations that can be written as in (9). A device characterized by local correlations is known as a local box.
• Quantum correlations: The set $\mathcal{Q}$ of quantum correlations corresponds to the set of correlations that can be written as $p(a,b|x,y) = \operatorname{Tr}([\Lambda^a_x \otimes \Lambda^b_y]\rho_{AB})$ (10), where $\rho_{AB}$ is a bipartite quantum state and $\{\Lambda^a_x\}_a$ and $\{\Lambda^b_y\}_b$ are POVMs characterizing Alice and Bob's respective measurements, with $\Lambda^a_x, \Lambda^b_y \geq 0$ for all $a \in \mathcal{A}$ and $b \in \mathcal{B}$, and $\sum_a \Lambda^a_x = I$ and $\sum_b \Lambda^b_y = I$.
• No-signaling correlations: The set $\mathcal{NS}$ corresponds to the set of correlations that fulfill the following no-signaling principle: $\sum_a p(a,b|x,y) = \sum_a p(a,b|x',y)$ for all $x, x' \in \mathcal{X}$, $y \in \mathcal{Y}$, $b \in \mathcal{B}$ (11), and $\sum_b p(a,b|x,y) = \sum_b p(a,b|x,y')$ for all $x \in \mathcal{X}$, $y, y' \in \mathcal{Y}$, $a \in \mathcal{A}$ (12). The no-signaling constraints (11) and (12) can be expressed equivalently in terms of conditional mutual informations, namely $I(X;\bar{B}|Y)_p = 0 = I(Y;\bar{A}|X)_p$ for all $p(x,y)$ (13), with respect to the joint distribution $p(a,b,x,y) = p(x,y)p(a,b|x,y)$, and where $p(x,y)$ ranges over probability distributions on $\mathcal{X}$ and $\mathcal{Y}$.
It is well known that local correlations are contained in the set of quantum correlations, that is, $\mathcal{L} \subset \mathcal{Q}$. Since the correlations in $\mathcal{Q}$ fulfill the constraints in (11) and (12), we have that $\mathcal{Q} \subset \mathcal{NS}$. For more details on correlations, please refer to [BCP+14]. An example of a correlation that belongs to the no-signaling correlations, but not the quantum correlations, is a Popescu-Rohrlich (PR) box [RP94], which is defined as follows:
Definition 3 (PR box) A PR box is a device with binary inputs and outputs corresponding to the following correlation $p(a,b|x,y)$: $p(0,0|x,y) = p(1,1|x,y) = 1/2$ for $(x,y) \neq (1,1)$, $p(0,1|1,1) = p(1,0|1,1) = 1/2$, while $p(a,b|x,y) = 0$ for all other quadruples (14). Equivalently, $p(a,b|x,y) = 1/2$ if $a \oplus b = x \cdot y$ and $p(a,b|x,y) = 0$ otherwise. This correlation is no-signaling between Alice and Bob, as defined in (11) and (12).
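The PR box of Definition 3 and the no-signaling constraints (11) and (12) can be checked numerically. The following is a minimal sketch, assuming the array layout `p[a, b, x, y]` and NumPy; the function names `pr_box`, `is_no_signaling`, and `chsh_value` are illustrative and not part of the paper.

```python
import itertools
import numpy as np

def pr_box():
    """Return p[a, b, x, y] = 1/2 if a XOR b == x*y, and 0 otherwise."""
    p = np.zeros((2, 2, 2, 2))
    for a, b, x, y in itertools.product(range(2), repeat=4):
        if (a ^ b) == x * y:
            p[a, b, x, y] = 0.5
    return p

def is_no_signaling(p, tol=1e-12):
    """Check that Bob's marginal does not depend on x and Alice's not on y."""
    bob = p.sum(axis=0)    # bob[b, x, y]
    alice = p.sum(axis=1)  # alice[a, x, y]
    bob_ok = np.allclose(bob, bob[:, :1, :], atol=tol)
    alice_ok = np.allclose(alice, alice[:, :, :1], atol=tol)
    return bool(bob_ok and alice_ok)

def chsh_value(p):
    """CHSH expression sum_{x,y} (-1)^(x*y) E(x, y), with E the +/-1 correlator."""
    total = 0.0
    for x, y in itertools.product(range(2), repeat=2):
        corr = sum((-1) ** (a ^ b) * p[a, b, x, y]
                   for a, b in itertools.product(range(2), repeat=2))
        total += (-1) ** (x * y) * corr
    return total

p = pr_box()
print(is_no_signaling(p), chsh_value(p))
```

The PR box attains the algebraic maximum 4 of the CHSH expression, above the Tsirelson bound $2\sqrt{2}$ for quantum correlations, which is why it lies in $\mathcal{NS}$ but not in $\mathcal{Q}$.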

Local operations and shared randomness
Physically, local operations and shared randomness [FWW09, FW11] refers to an operation in which Alice and Bob share unlimited free randomness between their two components and can perform local operations on
• the inputs given by Alice and Bob to their respective components,
• the outputs of the two components to give the final outputs to Alice and Bob.
The local operations and shared randomness act on the initial correlation $p_i(a,b|x,y)$ corresponding to the device, in order to yield a final, modified correlation $p_f(a_f,b_f|x_f,y_f)$. These operations can be parametrized as follows [GA17]:
$$p_f(a_f,b_f|x_f,y_f) = \sum_{a,b,x,y} O^{(L)}(a_f,b_f|a,b,x,y,x_f,y_f)\, p_i(a,b|x,y)\, I^{(L)}(x,y|x_f,y_f). \tag{15}$$
Here, $I^{(L)}$ corresponds to a local correlation for a local device that takes in the inputs $x_f$ and $y_f$ from Alice and Bob, uses shared randomness, and performs local operations to yield new inputs $x$ and $y$ for the main device characterized by $p_i$. This can be written as
$$I^{(L)}(x,y|x_f,y_f) = \sum_{\lambda_2} p_{\Lambda_2}(\lambda_2)\, I_A(x|x_f,\lambda_2)\, I_B(y|y_f,\lambda_2), \tag{16}$$
where $p_{\Lambda_2}(\lambda_2)$ corresponds to the probability distribution of the shared classical variable $\Lambda_2$, $I_A(x|x_f,\lambda_2)$ corresponds to the probability of obtaining $x$ given $x_f$ and $\lambda_2$, and $I_B(y|y_f,\lambda_2)$ corresponds to the probability of obtaining $y$ given $y_f$ and $\lambda_2$. Once the initial device $p_i$ generates the outputs $a$ and $b$, they can be post-processed by a local device that is characterized by the local correlation $O^{(L)}$. This can be written as
$$O^{(L)}(a_f,b_f|a,b,x,y,x_f,y_f) = \sum_{\lambda_1} p_{\Lambda_1}(\lambda_1)\, O_A(a_f|a,x,x_f,\lambda_1)\, O_B(b_f|b,y,y_f,\lambda_1). \tag{17}$$
This device takes in $a, b, x, y, x_f, y_f$ and gives the final outputs $a_f, b_f$ by using shared randomness and performing local operations on the inputs. Here, $p_{\Lambda_1}(\lambda_1)$ is a probability distribution over the classical shared random variable $\Lambda_1$, $O_A(a_f|a,x,x_f,\lambda_1)$ is a conditional probability distribution for obtaining $a_f$ given $x, x_f, \lambda_1, a$, and $O_B(b_f|b,y,y_f,\lambda_1)$ is a conditional probability distribution for obtaining $b_f$ given $y, y_f, \lambda_1, b$. See Figure 1 for a pictorial representation of the most general transformation of local operations and shared randomness on a correlation $p_i(a,b|x,y)$.
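The parametrization above (a local pre-processing box on the inputs, the device itself, and a local post-processing box on the outputs) can be sketched as a tensor contraction. This is a minimal illustration, assuming NumPy and the array layouts stated in the comments; the function name `losr` and the particular example boxes (identity on the inputs, a shared random bit that flips both outputs) are hypothetical choices, not taken from the paper.

```python
import itertools
import numpy as np

def losr(p_i, p1, OA, OB, p2, IA, IB):
    """Apply local operations and shared randomness to p_i[a, b, x, y].

    IA[x, xf, l2], IB[y, yf, l2]: local input maps; p2[l2]: shared randomness.
    OA[af, a, x, xf, l1], OB[bf, b, y, yf, l1]: local output maps; p1[l1].
    Returns p_f[af, bf, xf, yf].
    """
    I = np.einsum("l,xXl,yYl->xyXY", p2, IA, IB)            # local input box
    O = np.einsum("l,AaxXl,BbyYl->ABabxyXY", p1, OA, OB)    # local output box
    return np.einsum("ABabxyXY,abxy,xyXY->ABXY", O, p_i, I)  # composition

# Initial device: a PR box, p_i[a, b, x, y] = 1/2 if a XOR b == x*y.
p_i = np.zeros((2, 2, 2, 2))
for a, b, x, y in itertools.product(range(2), repeat=4):
    if (a ^ b) == x * y:
        p_i[a, b, x, y] = 0.5

# Input box: pass x_f and y_f through unchanged (no shared randomness needed).
IA = np.zeros((2, 2, 1))
for x in range(2):
    IA[x, x, 0] = 1.0
IB = IA.copy()
p2 = np.array([1.0])

# Output box: both parties XOR their output with a shared uniform bit l1.
OA = np.zeros((2, 2, 2, 2, 2))
for a, x, xf, l in itertools.product(range(2), repeat=4):
    OA[a ^ l, a, x, xf, l] = 1.0
OB = OA.copy()
p1 = np.array([0.5, 0.5])

p_f = losr(p_i, p1, OA, OB, p2, IA, IB)
print(np.allclose(p_f, p_i))
```

In this example the jointly flipped outputs leave $a \oplus b$ unchanged, so the final correlation is again a PR box; other choices of the local boxes generally yield a different correlation.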
In the resource theory of Bell non-locality [dV14, GA17], the resources are non-local correlations $p(a,b|x,y)$. Local operations and shared randomness are one possible set of free operations in this resource theory [dV14]. It can be shown from the definition of a local correlation that the action of local operations and shared randomness transforms a local correlation to a correlation in $\mathcal{L}$. Furthermore, a quantum correlation remains in the set $\mathcal{Q}$ when acted upon by these free operations. To see this, replace the local boxes $O^{(L)}$ and $I^{(L)}$ in (15) by separable states shared between Alice and Bob, with the local states encoding the probability distributions required in (16) and (17) and with the measurements taken to be projective measurements.
In [GA17], a larger set of free operations known as wirings and prior-to-input classical communication (WPICC) was considered. It was also shown in Lemma 6 of [GA17] that any quantifier that is monotone under local operations and shared randomness is also monotone under WPICCs.

Intrinsic non-locality
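To calculate the amount of non-locality present in the correlation $p(a,b|x,y)$, we introduce a function $N : p(a,b|x,y) \to \mathbb{R}_{\geq 0}$, which we call intrinsic non-locality. Consider a correlation $p(a,b|x,y) \in \mathcal{NS}$. Now embed the correlation $p(a,b|x,y)$ into a classical-classical state as
$$\rho_{\bar{A}\bar{B}XY} := \sum_{a,b,x,y} p(x,y)\, p(a,b|x,y)\, [a\, b\, x\, y]_{\bar{A}\bar{B}XY}, \tag{18}$$
where $p(x,y)$ is a probability distribution for the measurement choices $x$ and $y$. Consider a no-signaling extension $\rho_{\bar{A}\bar{B}XYE}$ of $\rho_{\bar{A}\bar{B}XY}$:
$$\rho_{\bar{A}\bar{B}XYE} = \sum_{a,b,x,y} p(x,y)\, p(a,b|x,y)\, [a\, b\, x\, y]_{\bar{A}\bar{B}XY} \otimes \rho^{a,b,x,y}_E, \tag{19}$$
such that $\operatorname{Tr}_E(\rho_{\bar{A}\bar{B}XYE}) = \rho_{\bar{A}\bar{B}XY}$, and the following no-signaling constraints hold:
$$\sum_a p(a,b|x,y)\, \rho^{a,b,x,y}_E = \sum_a p(a,b|x',y)\, \rho^{a,b,x',y}_E \quad \forall x, x' \in \mathcal{X},\ y \in \mathcal{Y},\ b \in \mathcal{B}. \tag{20}$$
It is then easy to see that, given the value in system $Y$, the state of system $X$ and systems $\bar{B}E$ is product. This is equivalent to the following constraint on conditional mutual information:
$$I(X;\bar{B}E|Y)_\rho = 0. \tag{21}$$
Similarly, the following no-signaling constraints hold:
$$\sum_b p(a,b|x,y)\, \rho^{a,b,x,y}_E = \sum_b p(a,b|x,y')\, \rho^{a,b,x,y'}_E \quad \forall y, y' \in \mathcal{Y},\ x \in \mathcal{X},\ a \in \mathcal{A}. \tag{22}$$
It is easy to see that, given the value in system $X$, the state of system $Y$ and systems $\bar{A}E$ is product. This is equivalent to the following constraint on conditional mutual information:
$$I(Y;\bar{A}E|X)_\rho = 0. \tag{23}$$
Finally, we have that
$$\sum_{a,b} p(a,b|x,y)\, \rho^{a,b,x,y}_E = \sum_{a,b} p(a,b|x',y)\, \rho^{a,b,x',y}_E = \sum_{a,b} p(a,b|x',y')\, \rho^{a,b,x',y'}_E. \tag{24}$$
The first equality follows from (20), and the second equality follows from (22). This implies that the state of Eve's system is independent of the measurement choices, i.e., $I(XY;E)_\rho = 0$ for all $p(x,y)$. We can then quantify the amount of non-local correlations in the correlation $p(a,b|x,y)$ as $\inf_{\rho_{\bar{A}\bar{B}XYE}} I(\bar{A};\bar{B}|XYE)_\rho$, where the infimum is with respect to no-signaling extensions $\rho_{\bar{A}\bar{B}XYE}$ of the above form. Since Alice and Bob want to maximize the non-local correlations of the two black boxes, we maximize over input probability distributions $p(x,y)$, leading us to the following definition:
Definition 4 (Intrinsic non-locality) The intrinsic non-locality of a correlation $p(a,b|x,y) \in \mathcal{NS}$ is defined as
$$N(\bar{A};\bar{B})_p := \sup_{p(x,y)} \inf_{\rho_{\bar{A}\bar{B}XYE}} I(\bar{A};\bar{B}|XYE)_\rho,$$
where $\rho_{\bar{A}\bar{B}XYE}$ is a no-signaling extension of the state $\rho_{\bar{A}\bar{B}XY}$, i.e., subject to the constraints in (20) and (22).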

Quantum intrinsic non-locality
We now introduce a function $N^Q : p(a,b|x,y) \to \mathbb{R}_{\geq 0}$, which we call quantum intrinsic non-locality, with $p(a,b|x,y) \in \mathcal{Q}$. As stated above, a correlation in the set $\mathcal{Q}$ arises from some underlying state $\rho_{AB}$ and POVMs of Alice and Bob characterized by $\{\Lambda^a_x\}_a$ and $\{\Lambda^b_y\}_b$, respectively. Now, consider a quantum state $\rho_{ABE}$ such that $\operatorname{Tr}_E(\rho_{ABE}) = \rho_{AB}$. We call $\rho_{ABE}$ an extension of the state $\rho_{AB}$. Then, one possible extension of the classical-classical state $\rho_{\bar{A}\bar{B}XY}$ as defined in (18) is
$$\rho_{\bar{A}\bar{B}XYE} = \sum_{a,b,x,y} p(x,y)\, p(a,b|x,y)\, [a\, b\, x\, y]_{\bar{A}\bar{B}XY} \otimes \rho^{a,b,x,y}_E, \tag{27}$$
where
$$p(a,b|x,y)\, \rho^{a,b,x,y}_E = \operatorname{Tr}_{AB}\!\left[(\Lambda^a_x \otimes \Lambda^b_y \otimes I_E)\, \rho_{ABE}\right].$$
By definition, this extension is also a no-signaling extension and is subject to the constraints in (20) and (22). We call the extensions of the form in (27) quantum extensions.
For $p \in \mathcal{Q}$, the set of no-signaling extensions of $p$ is strictly larger than the set of quantum extensions. For example, in the CHSH game, a correlation $p(a,b|x,y)$ reaching the Tsirelson bound only admits a trivial quantum extension, i.e., with constant $\rho^{a,b,x,y}_E$ independent of $a$, $b$, $x$, and $y$. By contrast, the no-signaling extensions of such a correlation are not extremal, as can be seen by writing $p(a,b|x,y)$ as a convex combination of a PR box (with necessarily constant $\rho^{a,b,x,y}_E$ as an extension) and a local box (where $\rho^{a,b,x,y}_E$ contains the local hidden variable). Therefore, to consider the regime in which there is an underlying quantum model, we define quantum intrinsic non-locality as follows:
Definition 5 (Quantum intrinsic non-locality) The quantum intrinsic non-locality of a correlation $p(a,b|x,y) \in \mathcal{Q}$ is defined as
$$N^Q(\bar{A};\bar{B})_p := \sup_{p(x,y)} \inf_{\rho_{\bar{A}\bar{B}XYE}} I(\bar{A};\bar{B}|XYE)_\rho,$$
where $\rho_{\bar{A}\bar{B}XYE}$ is a quantum extension of the state $\rho_{\bar{A}\bar{B}XY}$ that is subject to the constraints in (27).
Proposition 6 If $p(a,b|x,y) \in \mathcal{Q}$, then
$$N(\bar{A};\bar{B})_p \leq N^Q(\bar{A};\bar{B})_p. \tag{30}$$
Proof. This follows from the observation that a quantum extension $\sigma_{\bar{A}\bar{B}XYE}$ of $\rho_{\bar{A}\bar{B}XY}$ is a particular kind of no-signaling extension, so that the infimum defining $N$ is taken over a larger set.

Properties of intrinsic non-locality and quantum intrinsic non-locality
In this section, we prove that intrinsic non-locality and quantum intrinsic non-locality are faithful, monotone with respect to local operations and shared randomness, superadditive, and additive with respect to tensor products of correlations. These are the properties that are desirable for a measure of Bell non-locality to possess. We also prove that the quantum intrinsic non-locality of a correlation is never larger than the intrinsic steerability of an associated assemblage.
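Proposition 7 Intrinsic non-locality and quantum intrinsic non-locality vanish for correlations having a local hidden-variable model; i.e., if $p(a,b|x,y) \in \mathcal{L}$, then $N(\bar{A};\bar{B})_p = 0$ and $N^Q(\bar{A};\bar{B})_p = 0$.
Proof. Since $p(a,b|x,y) \in \mathcal{L}$, it can be written as in (9). Embed this in a classical-classical state with $p(x,y)$ an arbitrary probability distribution over $x, y$:
$$\rho_{\bar{A}\bar{B}XY} = \sum_{a,b,x,y} p(x,y) \sum_\lambda p_\Lambda(\lambda)\, p(a|x,\lambda)\, p(b|y,\lambda)\, [a\, b\, x\, y]_{\bar{A}\bar{B}XY}.$$
Then, consider the following quantum extension:
$$\rho_{\bar{A}\bar{B}XYE} = \sum_{a,b,x,y,\lambda} p(x,y)\, p_\Lambda(\lambda)\, p(a|x,\lambda)\, p(b|y,\lambda)\, [a\, b\, x\, y]_{\bar{A}\bar{B}XY} \otimes |\lambda\rangle\langle\lambda|_E.$$
Then, by inspection, $\bar{A}$ and $\bar{B}$ are independent given $XYE$. This implies that $\inf_{\rho_{\bar{A}\bar{B}XYE}} I(\bar{A};\bar{B}|XYE)_\rho = 0$. Since this equality holds for an arbitrary probability distribution $p(x,y)$, we can then conclude that $N^Q(\bar{A};\bar{B})_p = 0$. Then, by (30), we conclude that $N(\bar{A};\bar{B})_p = 0$.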
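The argument above can be illustrated numerically for a simple local box. The following is a minimal sketch, assuming NumPy, a hypothetical local box in which both parties output a shared uniform bit $\lambda$ regardless of their inputs, and uniform $p(x,y)$; the helper names `entropy` and `cmi` are illustrative. Because Eve's system is classical here, the conditional mutual information reduces to Shannon entropies of marginals.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy in bits of an array of probabilities."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def cmi(joint, a_axes, b_axes, c_axes):
    """I(A;B|C) = H(AC) + H(BC) - H(C) - H(ABC) for a classical joint array."""
    all_axes = set(range(joint.ndim))

    def h(keep):
        drop = tuple(sorted(all_axes - set(keep)))
        return entropy(joint.sum(axis=drop)) if drop else entropy(joint)

    return (h(a_axes + c_axes) + h(b_axes + c_axes)
            - h(c_axes) - h(a_axes + b_axes + c_axes))

# joint[a, b, x, y, lam] = p(x, y) p(lam) p(a|x, lam) p(b|y, lam)
# with a = b = lam, p(lam) = 1/2, and p(x, y) = 1/4 (assumed example).
joint = np.zeros((2, 2, 2, 2, 2))
for x, y, lam in itertools.product(range(2), repeat=3):
    joint[lam, lam, x, y, lam] = 0.25 * 0.5

with_lambda = cmi(joint, (0,), (1,), (2, 3, 4))            # Eve holds lambda
without_e = cmi(joint.sum(axis=4), (0,), (1,), (2, 3))      # trivial extension
print(with_lambda, without_e)
```

With the trivial extension the outputs look perfectly correlated, but once Eve holds $\lambda$ the conditional mutual information vanishes, which is the mechanism used in the proof.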
We later prove in Theorem 19 that N (Ā;B) p = 0 or N Q (Ā;B) p = 0 implies that p ∈ L.
We expect any quantifier of non-locality to be monotone under the free operations of local operations and shared randomness. That is, a free operation should not increase the amount of non-locality in the device. We state this in the following proposition: Proposition 8 (Monotonicity of intrinsic non-locality) Let p i (a, b|x, y) be a correlation, and let p f (a f , b f |x f , y f ) be a correlation that results from the action of local operations and shared randomness on p i (a, b|x, y), so that we can write the final probability distribution as follows: where y, x f , y f ) are local boxes as described in (16) and (17). Then, Proof. First, we embed p f (a f , b f |x f , y f ) in a quantum state: where p(x f , y f ) is an arbitrary probability distribution for x f and y f . Then invoking (15), (16), and (17), we obtain An arbitrary extension of the state in (36) is given by A particular extension of the state in (36) is given by This in turn is a marginal of the following state: Consider that inf ext. in (38) The first inequality follows from considering a particular extension in (39). The second inequality follows from data processing of conditional mutual information. The second equality follows be- The last equality follows from the chain rule for conditional mutual information. Now, let us consider each term in (45). By inspection, Upon re-arranging, we obtain So, given X, Y , the states ζ x,ȳ ABE and ζ x,y X f Y f Λ 1 are in tensor product. Therefore Then, by inspection Here, X and Y are independent given X f , Y f , and Λ 1 . Therefore, Since (52) is true for an arbitrary no-signaling extension of ρĀB XY , the above inequality holds after taking the infimum over all possible no-signaling extensions ζĀB XY E . Finally, we can take the supremum over all the measurement choices, and we find that This concludes the proof.
Proposition 9 (Monotonicity of quantum intrinsic non-locality) Let p i (a, b|x, y) ∈ Q, and let p f (a f , b f |x f , y f ) result from the action of local operations and shared randomness on p i (a, b|x, y). We can write the final probability distribution as follows: where y, x f , y f ) are local boxes as described in (16) and (17). Then, where p(x f , y f ) is an arbitrary probability distribution for x f and y f . The set of quantum correlations Q is closed under the action of local operations and shared randomness, implying that is also a quantum correlation, we know that there exists an underlying state σ AB and POVMs Λ An arbitrary quantum extension of the state in (56) is given by where σ and σ ABE is an extension of σ AB . Now, we know that and that the correlations Therefore, there exist separable states ρ XY and ρ A F B F , along with POVMs that result in the correlations I (L) and O (L) . That is, Furthermore, we know that the correlation p i (a, b|x, y) is a quantum correlation. Therefore, it has an underlying state ρ AB and POVMs characterized by {Λ a x } a and Λ b y b . Then Since ρ XY is a separable state, we can write it as ρ XY = A particular quantum extension of the state in (56) is given by where Then it follows that This in turn is a marginal of the following state: Then, following arguments similar to that given in Proposition 8, we obtain Proposition 10 (Convexity of intrinsic non-locality) Let p(a, b|x, y) and q(a, b|x, y) be two correlations, and let λ ∈ [0, 1]. Let t(a, b|x, y) be a mixture of the two correlations, defined as t(a, b|x, y) = λp(a, b|x, y) + (1 − λ) q(a, b|x, y). Then Proof. First, we embed the correlation t(a, b|x, y) in the following classical-classical state τĀB XY : where p(x, y) is an arbitrary probability distribution. 
Similarly, embed p(a, b|x, y) in ρĀB XY and q(a, b|x, y) in γĀB XY : Next, consider an arbitrary no-signaling extension of τĀB XY : Similarly, consider an arbitrary no-signaling extension of ρĀB XY and γĀB XY : Now, consider the following particular no-signaling extension of τĀB XY : The first inequality follows from choosing a particular no-signaling extension. The equality follows from properties of conditional mutual information. Since this holds for all non-signaling extensions of the form in (73) and (74), we conclude that inf ext. in (72) Taking the supremum over all measurement choices, we find that This completes the proof.
Proposition 11 (Convexity of quantum intrinsic non-locality) Let p(a, b|x, y) and q(a, b|x, y) be correlations in Q, and let λ ∈ [0, 1]. Let t(a, b|x, y) be a mixture of the correlations defined as t(a, b|x, y) = λp(a, b|x, y) + (1 − λ) q(a, b|x, y). Then Proof. Since Q is a convex set [Pit86], we know that t(a, b|x, y) ∈ Q. First, we embed the correlation t(a, b|x, y) in the following quantum state τĀB XY : where p(x, y) is an arbitrary probability distribution. Similarly, embed p(a, b|x, y) in ρĀB XY and q(a, b|x, y) in γĀB XY : Next, consider an arbitrary quantum extension of τĀB XY : Similarly, consider an arbitrary quantum extension of ρĀB XY and γĀB XY : Let ρ AB be a quantum state that, along with the POVMs characterized by Λ a x and Λ b y , yield the correlation p(a, b|x, y). Let ρ ABE be an extension of ρ AB . Similarly, let γ AB be a quantum state that, along with the POVMs characterized by M a x and M b y , yield the correlation q(a, b|x, y). Let γ ABE be an extension of γ AB . Then, a particular quantum state that realizes the correlation t(a, b|x, y) is the following: where it is understood that Alice is measuring σ Z on her system A and Bob is measuring σ Z on B , in addition to the other measurements on their systems A and B. Now, consider the following extension of τ ABA B : Furthermore, consider the following particular quantum extension of τĀB XY : Then following similar arguments given in the proof of Proposition 10, we obtain concluding the proof.
Proposition 12 (Superadditivity and additivity of intrinsic non-locality) Let p(a 1 , a 2 , b 1 , b 2 |x 1 , x 2 , y 1 , y 2 ) be a correlation for which the following no-signaling constraints hold: Let t(a 1 , b 1 |x 1 , y 1 ) and r(a 2 , b 2 |x 2 , y 2 ) be correlations corresponding to the marginals of the probability distribution p(a 1 , a 2 , b 1 , b 2 |x 1 , x 2 , y 1 , y 2 ). Then the intrinsic non-locality is super-additive, in the sense that If p(a 1 , b 1 , a 2 , b 2 |x 1 , x 2 , y 1 , y 2 ) = t(a 1 , b 1 |x 1 , y 1 )r(a 2 , b 2 |x 2 , y 2 ), then the intrinsic non-locality is additive in the following sense: Proof. Consider the classical-classical state ρĀ 1Ā2B1B2 X 1 Y 1 X 2 Y 2 with the following arbitrary nosignaling extension: where p(x 1 , x 2 , y 1 , y 2 ) is an arbitrary probability distribution. From the chain rule of mutual information and non-negativity of conditional mutual information, we obtain From the no-signaling constraints in the statement of the proposition and (94), we obtain We first embed t(a 1 , b 1 |x 1 , y 1 ) in τĀ 1B1 X 1 Y 1 E , and r(a 2 , b 2 |x 2 , y 2 ) in γĀ 2B2 X 2 Y 2 E and consider the following arbitrary no-signaling extensions: Since is a particular no-signaling extension of γĀ 2B2 X 2 Y 2 , we obtain the following inequality: Since (102) holds for an arbitrary no-signaling extension of ρ, we obtain inf ext. in (94) Since the above equation holds for arbitrary probability distributions, we can take a supremum over all probability distributions to obtain Since we have considered a supremum over product probability distributions for the measurement choices on the LHS, we can relax this to consider the supremum over all probability distributions p(x 1 , y 1 , x 2 , y 2 ) of the measurement choices. This concludes the proof of (92). Now we give a proof for additivity of intrinsic non-locality with respect to product probability distributions. 
Since intrinsic non-locality is super-additive, it is sufficient to prove the following sub-additivity property for product probability distributions: Consider the following states b 1 ,x 1 ,y 1 ,a 2 ,b 2 ,x 2 ,y 2 p(x 1 , x 2 , y 1 , y 2 ) t(a 1 , b 1 |x 1 , y 1 ) r(a 2 , b 2 |x 2 , y 2 ) [a 1 b 1 x 1 y 1 a 2 b 2 x 2 y 2 ] ⊗ ρ a 1 ,b 1 ,x 1 ,y 1 ,a 2 Now, consider a particular extension of the state ρĀ 1Ā2B1B2 X 1 X 2 Y 1 Y 2 : b 1 ,x 1 ,y 1 ,a 2 ,b 2 ,x 2 ,y 2 p(x 1 , x 2 , y 1 , y 2 ) t(a 1 , b 1 |x 1 , y 1 ) r(a 2 , b 2 |x 2 , y 2 ) Then, we have the following set of inequalities: inf ext. in (107) The first inequality follows from a particular choice of an extension. The first equality follows from the chain rule. For the second equality, observe the following: where ζ x 1 ,x 2 ,y 1 ,y 2 Then, from (114) and (115), it follows that This is equivalent to Then by inspection of (108), and from the nosignaling constraints, it follows that inf ext. in (107) Since the above statement holds for an arbitrary no-signaling extension of the form in (107), it follows that inf ext. in (107) Since the above inequality holds for an arbitrary probability distribution p(x 1 , x 2 , y 1 , y 2 ), we find that sup p(x 1 ,x 2 ,y 1 ,y 2 ) inf ext. in (107) This concludes the proof.
Let $\rho_{AB}$ be a quantum state, and let $\{p_{\bar{A}|X}(a|x)\rho^{a,x}_B\}_{a,x}$ be an assemblage that arises from the quantum state $\rho_{AB}$ and some measurement $\{\Lambda^a_x\}_a$. We then prove that the intrinsic steerability of the assemblage $\{p_{\bar{A}|X}(a|x)\rho^{a,x}_B\}_{a,x}$ is never smaller than the quantum intrinsic non-locality of all the bipartite correlations that can arise from this assemblage.
Proposition 14 Let $p(a,b|x,y)$ be a quantum correlation that is obtained by performing a POVM $\{\Lambda^b_y\}_b$ on the assemblage $\{p_{\bar{A}|X}(a|x)\rho^{a,x}_B\}_{a,x}$. Then the quantum intrinsic non-locality of the correlation $p$ does not exceed the intrinsic steerability of the assemblage $\hat{\rho}$. That is,
$$N^Q(\bar{A};\bar{B})_p \leq S(\bar{A};B)_{\hat{\rho}},$$
where we recall that $\hat{\rho}$ is a shorthand to denote the assemblage.
Proof. Let $p(a,b|x,y)$ be a quantum correlation that arises from the assemblage $\{p_{\bar{A}|X}(a|x)\rho^{a,x}_B\}_{a,x}$. That is, $p(a,b|x,y) = \operatorname{Tr}\!\left[\Lambda^b_y\, p_{\bar{A}|X}(a|x)\rho^{a,x}_B\right]$.
Let pĀ |X (a|x)ρ a,x BE be a particular no-signaling extension of pĀ |X (a|x)ρ a,x B . Then one possible nosignaling extension of p(a, b|x, y) is From [SBC + 15], it follows that the above is also a quantum extension. Let p(x, y) be an arbitrary probability distribution. Let p(a, b|x, y) be a correlation embedded in a classical-classical state ρĀB XY with the following particular no-signaling extension: and an arbitrary quantum extension: Similarly, let ρĀ XB be a state into which the assemblage pĀ |X (a|x)ρ a,x B is embedded, and let ρĀ XBE be a particular extension, where Let Then, This follows from the chain rule of conditional mutual information and inspection of (129). Observe that Bob can perform a local operation and transform the state ρĀ BXY E to ρĀB XY E . Then, from the data-processing inequality, we find that This means that for every no-signaling extension ρĀ BXE of the state ρĀ BX that encodes the assemblage pĀ |X (a|x)ρ a,x B , we can find a quantum extension ρĀB XY E of ρĀB XY that encodes the correlation p(a, b|x, y) derived from the assemblage pĀ |X (a|x)ρ a,x B , such that (131) is true. Therefore, we obtain the following: concluding the proof.

Intrinsic non-locality of a PR box
In this section, we calculate the intrinsic non-locality of a PR box.

Proposition 15 The intrinsic non-locality of a PR box is equal to 1, i.e., $N(\bar{A};\bar{B})_p = 1$, where $p$ is the correlation defined in (14).
Proof. Consider the state where p(x, y) is an arbitrary probability distribution. Consider a no-signaling extension of the state The no-signaling constraints are From (14), and the no-signaling constraint in (137), we arrive at the following constraints on the possible states of Eve's system: In the matrices given above, the rows and columns are indexed by (y, b). The first matrix on the left corresponds to x = 0, and the second one on the right corresponds to x = 1. The constraints in (139) can also be written as By following 1 → 7 → 4 → 6 → 2 → 8 → 3 → 5 → 1 in the above, we obtain ρ x,y,a,b E = ρ x ,y ,a ,b E ∀x, x , y, y ∈ [s] and a, a , b, b ∈ [r]. This implies that ρĀB XY has a trivial tensor product no-signaling extension. Hence, = 1.
It is easy to check that, given realizations $x, y$ of $X, Y$, the entropies satisfy $H(\bar{A}|\bar{B})_{\rho^{x,y}} = 0$ and $H(\bar{A})_{\rho^{x,y}} = 1$.
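Since the proof shows that a PR box admits only a trivial no-signaling extension, its intrinsic non-locality reduces to $\sup_{p(x,y)} I(\bar{A};\bar{B}|XY)_\rho$. The following minimal sketch, assuming NumPy and the array layout `p[a, b, x, y]`, evaluates this quantity as $\sum_{x,y} p(x,y)\,[H(\bar{A}) - H(\bar{A}|\bar{B})]_{\rho^{x,y}}$; the function names and the particular non-uniform input distribution are illustrative.

```python
import itertools
import numpy as np

def shannon(p):
    """Shannon entropy in bits of an array of probabilities."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def trivial_extension_cmi(p, pxy):
    """I(A;B|XY) = sum_{x,y} p(x,y) [H(A) + H(B) - H(AB)] at fixed (x, y)."""
    total = 0.0
    for x in range(p.shape[2]):
        for y in range(p.shape[3]):
            pab = p[:, :, x, y]
            total += pxy[x, y] * (shannon(pab.sum(axis=1))
                                  + shannon(pab.sum(axis=0))
                                  - shannon(pab))
    return total

# PR box: p[a, b, x, y] = 1/2 if a XOR b == x*y, else 0.
p = np.zeros((2, 2, 2, 2))
for a, b, x, y in itertools.product(range(2), repeat=4):
    if (a ^ b) == x * y:
        p[a, b, x, y] = 0.5

uniform = np.full((2, 2), 0.25)
skewed = np.array([[0.1, 0.2], [0.3, 0.4]])   # arbitrary example distribution
print(trivial_extension_cmi(p, uniform), trivial_extension_cmi(p, skewed))
```

Each input pair contributes exactly one bit, so the value is 1 for every input distribution, in agreement with Proposition 15.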

Faithfulness of restricted intrinsic steerability
In this section, we solve an open question from [KWW17], regarding the faithfulness of restricted intrinsic steerability. (146) Proof. The forward direction ("if") was established in [KWW17, Proposition 12]. We now give a proof for the reverse direction ("only if") of the theorem. Let us first construct a proof strategy for a uniform probability distribution p X (x) = 1 |X | , and then we generalize it to a proof for an arbitrary distribution p X (x). This proof shares some ideas from the proof for faithfulness of squashed entanglement [LW18].
Invoking Theorem 5.1 of [FR15], we know that there exists a recovery channel R XE→ĀXE such that where systemsĀ 1 andĀ 2 are isomorphic to systemĀ, and systems X 1 and X 2 are isomorphic to X. In the above, we have invoked the no-signaling condition I(X; BE) ρ = 0, which implies that ρ BE and ρ X are product as written. Now, let us apply this recovery channel again. We then have that which follows from the monotonicity of trace distance with respect to R X 3 E→Ā 3 X 3 E •Tr X 2Ā2 . Then, combining the above equation with (147) via the triangle inequality, we obtain For j ∈ {4, . . . , n}, again apply the channels R XE→Ā j X j E •TrĀ j−1 X j−1 , along with the monotonicity of trace norm under quantum channels, combining the equations via the triangle inequality, to obtain the following inequality: The recovery channel R X i E→Ā i X i E can be taken as [Wil15] R XE→ĀXE (·) = ρ 1 2 +iω for some ω ∈ R. Let σĀn X n BE denote the following state: = a n ,x n p X n (x n )qĀn |X n (a n |x n ) |x n x n | X n ⊗ |a n a n |Ān ⊗ σ a n ,x n BE .
= a n ,x n p X n (x n )qĀn |X n (a n |x n ) |x n x n | X n ⊗ |a n a n |Ān ⊗ σ a n ,x n B . (157) = a n ,x n p X n (x n )qĀn |X n (a n |x n ) |x i where A [n]\{i} = A 1 A 2 · · · A i−1 A i+1 · · · A n and similarly X [n]\{i} = X 1 X 2 · · · X i−1 X i+1 · · · X n . Furthermore, qĀn |X n (a n |x n ) is a probability distribution for a n given x n after the application of the recovery channels R X i E→Ā i X i E . From (151), we obtain for all i ∈ {1, 2, . . . , n} that The application of the recovery channels generates the data (x 1 , a 1 ), (x 2 , a 2 ), . . . , (x n , a n ). The x i correspond to the measurement choices, and the a i correspond to the measurement outcomes. This data is called the "cheat sheet" and acts like a hidden variable λ. The formulation of the cheat sheet is similar to the construction of a local hidden-variable model in [TDS03].
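We now devise an algorithm to generate $\tilde{a}$ from $\tilde{x}$ by using the cheat sheet. The generated state $\sigma_{\tilde{X}\tilde{A}B}$ is a local hidden state, with the cheat sheet as the hidden variable. We then prove that $\sigma_{\tilde{X}\tilde{A}B}$ is close to the original state $\rho_{X\bar{A}B}$.
Alice receives $\tilde{x}$. She searches for all the values of $i$ for which $x_i = \tilde{x}$, and generates $i$ uniformly at random among them, i.e., with probability $p_{I|\tilde{X}X^n}(i|\tilde{x}, x^n) = \delta_{x_i \tilde{x}} / N(\tilde{x}|x^n)$, where $\delta_{x_i \tilde{x}}$ is the Kronecker delta function and where $N(\tilde{x}|x^n)$ is the number of times that the letter $\tilde{x}$ appears in the sequence $x^n$. Then, she outputs $\tilde{a}$ with probability $p_{\tilde{A}|A^n I}(\tilde{a}|a^n i) = \delta_{\tilde{a}, a_i}$.
Therefore,
$$p_{\tilde{A}|\tilde{X}X^nA^n}(\tilde{a}|\tilde{x}\, x^n a^n) = \sum_{i=1}^n p_{\tilde{A}|A^n I X^n \tilde{X}}(\tilde{a}|a^n\, i\, x^n\, \tilde{x})\, p_{I|\tilde{X}X^nA^n}(i|\tilde{x}\, x^n a^n). \tag{163}$$
Substituting the two distributions above, the right-hand side equals $\sum_{i=1}^n \delta_{\tilde{a}, a_i}\, \delta_{x_i \tilde{x}} / N(\tilde{x}|x^n)$ whenever $N(\tilde{x}|x^n) > 0$. If $\tilde{x}$ does not belong to the sequence $x^n$, then she generates $\tilde{a}$ randomly. This sequence of actions can be expressed in terms of the following conditional probability distribution: $p_{\tilde{A}|\tilde{X}X^nA^n}(\tilde{a}|\tilde{x}, x^n, a^n) = \sum_{i=1}^n \delta_{\tilde{a}, a_i}\, \delta_{x_i \tilde{x}} / N(\tilde{x}|x^n)$ if $\tilde{x} \in x^n$, with $\tilde{a}$ drawn at random otherwise (166). It is easy to check that $\sum_{\tilde{a}} p_{\tilde{A}|\tilde{X}X^nA^n}(\tilde{a}|\tilde{x}, x^n, a^n) = 1$. We now use the notion of robust typicality [OR01] for the analysis.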
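The cheat-sheet procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and the choice of a uniform distribution when $\tilde{x}$ is absent from the cheat sheet is an assumption, since the text only says that $\tilde{a}$ is generated randomly in that case.

```python
import random

def sample_output(x_tilde, xs, a_vals, outcomes, rng=random):
    """Sample a_tilde given input x_tilde and cheat sheet (xs, a_vals)."""
    matches = [i for i, x in enumerate(xs) if x == x_tilde]
    if not matches:
        # x_tilde does not occur in x^n: uniform output (assumed convention).
        return rng.choice(outcomes)
    i = rng.choice(matches)   # uniform over the N(x_tilde | x^n) matching slots
    return a_vals[i]          # output the recorded outcome a_i

def output_distribution(x_tilde, xs, a_vals, outcomes):
    """Exact conditional distribution p(a_tilde | x_tilde, x^n, a^n)."""
    matches = [i for i, x in enumerate(xs) if x == x_tilde]
    if not matches:
        return {a: 1 / len(outcomes) for a in outcomes}
    return {a: sum(a_vals[i] == a for i in matches) / len(matches)
            for a in outcomes}

# Example cheat sheet with n = 4 entries (hypothetical data).
xs = [0, 1, 0, 0]
a_vals = [1, 0, 1, 0]
print(output_distribution(0, xs, a_vals, outcomes=[0, 1]))
```

For the example cheat sheet, input 0 occurs three times with recorded outcomes 1, 1, 0, so the output distribution puts weight 2/3 on outcome 1 and 1/3 on outcome 0.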
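Definition 17 (Robust typicality [OR01]) Let $x^n$ be a sequence of elements drawn from a finite alphabet $\mathcal{X}$, and let $p(x)$ be a probability distribution on $\mathcal{X}$. Let $N(x|x^n)$ be the number of times that the letter $x$ appears in $x^n$, so that $\frac{1}{n}N(x|x^n)$ is the empirical distribution of $x^n$. Then the $\delta$-robustly typical set $T^{X^n}_{\delta}$ for $\delta > 0$ is defined as
$$T^{X^n}_{\delta} := \left\{ x^n : \left| \tfrac{1}{n} N(x|x^n) - p(x) \right| \le \delta\, p(x) \ \ \forall x \in \mathcal{X} \right\}.$$
The following result holds for $0 < \delta < 1$:

Property 18
The probability for a sequence $x^n$ to be in the robustly typical set is bounded from below as
$$\Pr\!\left[ X^n \in T^{X^n}_{\delta} \right] \ge 1 - 2|\mathcal{X}| \exp\!\left( -\frac{n \delta^2 \mu_X}{3} \right),$$
where $\mu_X := \min_{x : p(x) > 0} p(x)$.
The state generated after the application of the algorithm in (166) is as follows:
$$\sum_{x^n, a^n} p_{\tilde A|\tilde X X^n A^n}(\tilde a|\tilde x, x^n, a^n)\, p_{X^n}(x^n)\, q_{\bar A^n|X^n}(a^n|x^n)\, |\tilde a\rangle\langle \tilde a|_{\tilde A} \otimes \sigma^{a^n, x^n}_{B}. \qquad (170)$$
Then, define the following sets:
• $S_1(\tilde x)$: set of sequences $x^n$ such that $\tilde x \in x^n$ and $x^n \in T^{X^n}_{\delta}$,
• $S_2(\tilde x)$: set of sequences $x^n$ such that $\tilde x \notin x^n$ and $x^n \in T^{X^n}_{\delta}$,
• $S_3$: set of sequences $x^n$ such that $x^n \notin T^{X^n}_{\delta}$.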
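The three sets above can be made concrete with a short Python sketch. It assumes the standard robust-typicality condition of [OR01], namely $|N(x|x^n)/n - p(x)| \le \delta\, p(x)$ for every letter $x$, and the bound $2|\mathcal{X}| \exp(-n\delta^2\mu_X/3)$ that appears as $\varepsilon_1$ in the analysis. The function names and the dictionary representation of $p$ are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def is_robustly_typical(xs, p, delta):
    """Check |N(x|x^n)/n - p(x)| <= delta * p(x) for every letter x."""
    n = len(xs)
    counts = Counter(xs)
    return all(abs(counts.get(x, 0) / n - px) <= delta * px
               for x, px in p.items())

def atypical_bound(n, delta, p):
    """Upper bound 2|X| exp(-n delta^2 mu_X / 3) on the atypical mass."""
    mu = min(px for px in p.values() if px > 0)
    return 2 * len(p) * math.exp(-n * delta ** 2 * mu / 3)

def classify(x_tilde, xs, p, delta):
    """Return which of S1, S2, S3 the sequence xs falls into."""
    if not is_robustly_typical(xs, p, delta):
        return "S3"
    return "S1" if x_tilde in xs else "S2"

# Uniform binary source: with delta < 1 and full support, S2 stays empty.
p = {0: 0.5, 1: 0.5}
labels = {classify(0, list(seq), p, 0.9) for seq in product([0, 1], repeat=4)}
```

For this toy source every length-4 sequence lands in `S1` or `S3`, which mirrors the argument in the text that a robustly typical sequence cannot miss a letter of positive probability when $\delta < 1$.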
So we can write the state σÃX B as x n ∈S 1 (x),a n p(ã|x, x n , a n ) |ã ã| ⊗ q(a n , x n )σ a n ,x n B + x n ∈S 2 (x),a n p(ã|x, x n , a n ) |ã ã| ⊗ q(a n , x n )σ a n ,x n B + x n ∈S 3 ,a n p(ã|x, x n , a n ) |ã ã| ⊗ q(a n , x n )σ a n ,x n B , From the triangle inequality, we obtain the following: where Let us analyze each term individually, beginning with pX (x) |x x|X ⊗ x n ∈S 3 ,a n p(ã|x, x n , a n ) |ã ã| ⊗ q(a n , x n )σ a n ,x n B 1 (176) x n ∈S 3 ,a n p(x n )q(a n |x n )p(ã|x, x n , a n ) |x x| ⊗ |ã ã| ⊗ σ a n ,x n B 1 p(x n ) a n q(a n |x n ) ã p(ã|x, x n , a n ) ≤ ε 1 , where ε 1 = 2|X | exp − nδ 2 µ X 3 . The first inequality follows from convexity of trace distance, and the second inequality follows from the definition of S 3 and (168).
Let us now consider $S_2(\tilde x)$, that is, the set of sequences $x^n$ such that $\tilde x \notin x^n$ and $x^n \in T^{X^n}_{\delta}$. From Definition 17, we know that for the robustly-typical set, the following condition holds:
$$\left| \tfrac{1}{n} N(\tilde x|x^n) - p_X(\tilde x) \right| \le \delta\, p_X(\tilde x).$$
For a robustly-typical sequence to have $N(\tilde x|x^n) = 0$, it is required that $\delta \ge 1$. So, we restrict $\delta \in (0, 1)$. Thus, by the fact that $p_X(x) > 0$ for all $x \in \mathcal{X}$, it is impossible for $N(\tilde x|x^n) = 0$ and $x^n \in T^{X^n}_{\delta}$. That is,
Consider that
where $x_{[n]\setminus\{i\}}, \tilde x$ refers to a sequence $x^n$ with $x_i = \tilde x$. We now want to give an upper bound on the second term in (175):
where
Let us define the following sets:
• $S_1(x_i)$: set of sequences $x^n$ such that $x_i \in x^n$ and $x^n \in T^{X^n}_{\delta}$,
• $S_2(x_i)$: set of sequences $x^n$ such that $x_i \notin x^n$ and $x^n \in T^{X^n}_{\delta}$,
• $S_3$: set of sequences $x^n$ such that $x^n \notin T^{X^n}_{\delta}$.
Then,
Then, using the convexity of trace distance with (183) and typicality arguments similar to (178) and (180), we find that
where $\sigma^{(1)}$ and
Invoking (179), we find that
where $\delta \in (0, 1)$. After combining (178), (180), (188), and (191), we obtain
Minimizing over all possible no-signaling extensions, as required by the definition, we find that
Since $\rho_{\bar A X B}$ and $\sigma_{\bar A X B}$ are classical-quantum states with $p_X(x) = \frac{1}{|\mathcal{X}|}$, we obtain
This implies that the following inequality holds for all $x \in \mathcal{X}$:
This means that we can average the above to get a bound for any arbitrary distribution $p(x)$ on $x$. Therefore, we can now relax the assumption of a uniform probability distribution, in order to obtain the following bound for an arbitrary probability distribution:
which implies that
Given $S(\bar A; B)_{\hat\rho} \le \varepsilon$ (as required by the condition of faithfulness), choose $n = (1/\varepsilon)^{1/4}$ and $\delta = \varepsilon^{1/16} |\mathcal{X}|^{1/2}$ (recall that we require $\delta \in (0, 1)$). We know by the Chernoff bound [OR01] that
Substituting these values, we find that
This concludes the proof.
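To see how the parameter choice at the end of the proof behaves, the following Python sketch evaluates the typicality term $\varepsilon_1 = 2|\mathcal{X}| \exp(-n\delta^2\mu_X/3)$ for $n = (1/\varepsilon)^{1/4}$ and $\delta = \varepsilon^{1/16}|\mathcal{X}|^{1/2}$. It assumes the uniform input distribution used in this part of the proof, so that $\mu_X = 1/|\mathcal{X}|$, and it treats $n$ as a real number rather than rounding it to an integer. The function name `chernoff_term` is hypothetical.

```python
import math

def chernoff_term(eps, alphabet_size):
    """Evaluate 2|X| exp(-n delta^2 mu_X / 3) for the stated choice of n, delta.

    Assumes uniform p_X, i.e. mu_X = 1/|X| (illustrative assumption).
    """
    n = (1.0 / eps) ** 0.25                      # n = (1/eps)^(1/4), not rounded
    delta = eps ** (1.0 / 16.0) * math.sqrt(alphabet_size)
    if not 0.0 < delta < 1.0:
        raise ValueError("the proof requires delta in (0, 1)")
    mu = 1.0 / alphabet_size
    return 2 * alphabet_size * math.exp(-n * delta ** 2 * mu / 3)

# Under these assumptions the exponent simplifies to eps^(-1/8) / 3.
value = chernoff_term(1e-24, 2)
closed_form = 2 * 2 * math.exp(-(1e-24) ** (-1.0 / 8.0) / 3)
```

Under the uniform assumption the exponent reduces to $\varepsilon^{-1/8}/3$, so the term vanishes as $\varepsilon \to 0$; the requirement $\delta < 1$ is the same constraint $\varepsilon^{1/16}|\mathcal{X}|^{1/2} < 1$ that restricts the admissible values of $\varepsilon$.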

Faithfulness of intrinsic non-locality
The following theorem, combined with Proposition 7, establishes that intrinsic non-locality is faithful.
Theorem 19 (Faithfulness of intrinsic non-locality) For every no-signaling or quantum correlation $p(a, b|x, y)$, the intrinsic non-locality satisfies $N(\bar A; \bar B)_p = 0$ if and only if $p$ has a local hidden variable description. Quantitatively, if $N(\bar A; \bar B)_p \le \varepsilon$, where $0 < \varepsilon^{1/16} d^{1/2} < 1$ for $d = |\mathcal{X}| \cdot |\mathcal{Y}|$, there exists a probability distribution $l(a, b|x, y)$ having a local hidden variable description, such that
where $\rho_{\bar A X \bar B Y}$ corresponds to the classical-classical state $p_{XY}(x, y)\, p(a, b|x, y)$ and $\gamma_{\bar A X \bar B Y}$ is the classical-classical state corresponding to $p_{XY}(x, y)\, l(a, b|x, y)$.
Proof. The proof closely follows the proof for faithfulness of intrinsic steerability. We first construct a strategy for $p_{XY}(x, y) = \frac{1}{|\mathcal{X}|} \cdot \frac{1}{|\mathcal{Y}|}$ and then generalize it to an arbitrary distribution. Invoking [FR15], we know that there exists a recovery channel $R_{XE \to \bar A X E}$ such that
Since $I(\bar B E; X|Y)_\rho = 0$ from (21), and $p_{XY}(x, y) = \frac{1}{|\mathcal{X}|} \cdot \frac{1}{|\mathcal{Y}|}$, we can write $\rho_{\bar B X Y E} = \rho_{\bar B Y E} \otimes \rho_X$. Following an argument similar to (148)-(151), we obtain the following inequality:
where
Since the distributions $p_X(x)$ and $p_Y(y)$ are independent, we have
From the no-signaling constraints, we have
This implies that
Since the systems $\bar A^n X^n E$ of $\omega_{\bar A^n X^n \bar B Y E}$ are obtained from the application of the recovery channel on systems $X^n E$ of the state $\rho_{X^n Y E \bar B}$, we can use quantum data processing for mutual information to obtain the following inequality:
This implies that
$$\omega_{\bar A^n X^n \bar B Y} = \sum_{x^n, a^n, y, b} p(x^n)\, q(a^n|x^n)\, p(y)\, q(b|a^n x^n y)\, [x^n a^n b\, y]_{X^n \bar A^n \bar B Y}.$$
Proof. The if-part of the proof follows from Proposition 7. The only-if part follows from Proposition 6 and Theorem 19.
Upper bounds on secret key rates in device-independent quantum key distribution
We now consider the task of device-independent quantum key distribution. We consider two honest parties, Alice and Bob, who share a two-component device and want to extract a shared secret key from this device. Many prior works in the device-independent literature have devised lower bounds on the key rates for particular protocols, as done in [ABG + 07, AFDF + 18]. By a protocol, we mean a sequence of steps in which Alice and Bob interact with their devices and communicate publicly with each other.
Here, we are interested in a different question. We fix the black-box device that is shared by Alice and Bob. We suppose that the correlations generated from this device are characterized by a correlation p(a, b|x, y). We then pose the following question: Given a device characterized by p(a, b|x, y), what is a non-trivial upper bound on the secret-key rate that can be extracted from this device with any possible protocol?
We answer this question for an i.i.d. device, which means that in each round of the protocol, the device considered is characterized by the correlation p(a, b|x, y). The inputs of the device in a particular round can be correlated with the inputs of the device in other rounds. The assumption that the device is characterized by the correlation p(a, b|x, y) is not a drawback since we are interested in determining upper bounds on secret-key rates here. In what follows, we prove that the quantifiers introduced above are upper bounds on the secret-key rates that can be generated from the device.
In device-independent quantum key distribution, we assume the presence of an eavesdropper who obtains all of the classical data communicated between Alice and Bob during the protocol. Furthermore, the system held by the eavesdropper can have joint correlations with the systems held by Alice and Bob. Let Alice and Bob share a quantum correlation $p(a, b|x, y)$ as defined in (10). Let the correlation shared between Alice, Bob, and Eve be defined by $p(a, b|x, y)\,\rho^{a,b,x,y}_E$. If $p(a, b|x, y)\,\rho^{a,b,x,y}_E$ has an underlying quantum strategy as described in (27), then we call the eavesdropper a quantum Eve. If $p(a, b|x, y)\,\rho^{a,b,x,y}_E$ only fulfills the constraints given in (20) and (22), then we call the eavesdropper a no-signaling Eve.

Device-independent protocols
We now state the general form of a device-independent protocol with a no-signaling eavesdropper for which our upper bounds hold. Such protocols have previously been considered in [BHK05, Mas09, MRC + 14]. Let $n \in \mathbb{Z}^+$, $R \ge 0$, and $\varepsilon \in [0, 1]$. Let $p(a, b|x, y)$ be the correlation of the device shared between Alice and Bob. We define an $(n, R, \varepsilon)$ device-independent secret-key-agreement protocol as follows:
• Alice and Bob give the inputs $x^n$ and $y^n$ to their devices according to $p_{X^n Y^n}(x^n, y^n)$. The device is used $n$ times, and the distribution $p_{X^n Y^n}(x^n, y^n)$ is independent of Eve. Alice inputs $x_i$ and obtains the output $a_i$. Bob inputs $y_i$ and obtains the output $b_i$, where $i \in \{1, \ldots, n\}$. The input and output distributions are embedded in the state $\sigma_{\bar A^n \bar B^n X^n Y^n}$, where
$$\sigma_{\bar A^n \bar B^n X^n Y^n} := \sum_{x^n, y^n, a^n, b^n} p_{X^n Y^n}(x^n, y^n)\, p^n(a^n, b^n|x^n, y^n)\, [a^n b^n x^n y^n]_{\bar A^n \bar B^n X^n Y^n},$$
and $p^n(a^n, b^n|x^n, y^n)$ is the i.i.d. extension of $p(a, b|x, y)$. The joint state held by Alice, Bob, and Eve is a no-signaling extension $\sigma_{\bar A^n \bar B^n X^n Y^n E}$ of $\sigma_{\bar A^n \bar B^n X^n Y^n}$.
• Alice and Bob perform local operations and public communication, where $C_A$ denotes the classical register communicated from Alice to Bob, $\bar C_A$ is a classical register held by Eve that is a copy of $C_A$, $C_B$ is the classical register communicated from Bob to Alice, and $\bar C_B$ is a classical register held by Eve that is a copy of $C_B$. This protocol yields a state $\omega_{K_A K_B E \bar C_A \bar C_B X^n Y^n}$ that satisfies
for all no-signaling extensions, where
A rate $R$ is achievable for a device characterized by $p$ if there exists an $(n, R - \delta, \varepsilon)$ device-independent protocol for all $\varepsilon \in (0, 1)$, $\delta > 0$, and sufficiently large $n$. The device-independent secret-key-agreement capacity $DI(p)$ of the device characterized by $p$ is defined to be equal to the supremum of all achievable rates.
Theorem 21 The intrinsic non-locality N (Ā;B) p is an upper bound on the device-independent secret-key-agreement capacity of a device characterized by p and sharing no-signaling correlations with an eavesdropper: Proof. For an arbitrary (n, R, ε) protocol, consider that ≤ I(Ā n ;B n |EX n Y n ) σ + ε , where In the above equations, σ X nĀnBn Y n is the classical-classical state obtained from the device after Alice and Bob enter in the measurement inputs. Alice, Bob, and Eve hold a no-signaling extension σ X nĀnBn Y n E . Alice performs a local operation L A to obtain M A and C A . She communicates C A to Bob, and Eve also obtains a copyC The first inequality follows from the uniform continuity of conditional mutual information [Shi17, Proposition 1]. The second inequality follows from data processing. The second equality and third inequality follow from the chain rule of conditional mutual information, as well as the fact thatC A is a classical copy of C A andC B is a classical copy of C B . The last inequality follows from data processing for conditional mutual information. Since the above inequality holds for an arbitrary no-signaling extension of σĀnBn X n Y n , we find that This implies that nR ≤ N (Ā n ;B n ) p + ε .
By the assumption that the device is i.i.d., we can invoke the additivity of intrinsic non-locality from Proposition 12 to obtain
Taking the limit as $n \to \infty$ and $\varepsilon \to 0$ then leads to $DI(p) \le N(\bar A; \bar B)_p$.
Now, let us consider a class of device-independent protocols in which the eavesdropper is restricted by quantum mechanics. These models have previously been studied in [ABG + 07, AFDF + 18]. The general form of a device-independent protocol with a quantum eavesdropper remains the same except that we now consider a quantum extension (27) of the state in (217). We then arrive at the following theorem:
Theorem 22 The quantum intrinsic non-locality $N^Q(\bar A; \bar B)_p$ is an upper bound on the device-independent secret-key-agreement capacity of a device characterized by $p$ and sharing quantum correlations with an eavesdropper:
Proof. The proof of the theorem is similar to that of Theorem 21.
We should explicitly point out that the general form of protocols that we consider allows both Alice and Bob to exchange public classical information. Therefore, the upper bounds via intrinsic non-locality and quantum intrinsic non-locality hold for two-way error correction as well. It has been observed in device-dependent QKD that two-way error-correcting protocols surpass the threshold of one-way error-correcting protocols [BA07,WMUK07,KL17]. This question has only recently been explored in DI-QKD in [TLR19]. Therefore, it is possible that the upper bound via the intrinsic non-locality will not be tight for the existing DI-QKD protocols [ABG + 07, AFDF + 18], which consider only one-way error correction.
Another point to make is that in the protocols we consider, Alice and Bob announce their measurement choices. That is, $X$ and $Y$ are known to Eve. The secret key is extracted from $\bar A$ and $\bar B$. There are certain protocols in the device-independent literature where the outputs $\bar A$ and $\bar B$ are broadcast and the local randomness variables $X$ and $Y$ are the basis of the key [RPMP15] (note that [SARG04] introduced this concept in the device-dependent QKD literature). For such DI-QKD protocols, our upper bounds do not hold.

Other considerations
Bounds on device-independent QKD protocols based on certain states were also previously discussed in [HM15].
There is yet another way to model a no-signaling adversary in device-independent secret-key-agreement protocols, which has been considered in [BHK05]. This model is set in "box world," in which each player, including the eavesdropper, has a set of possible inputs and outputs. Therefore, it becomes natural to model the joint system with a conditional probability distribution $P_{ABE|XYZ}$. In [WDH19], the authors introduced squashed non-locality to provide an upper bound on key rates of device-independent protocols with the aforementioned model of the eavesdropper. This is in contrast to the model that we consider, where the eavesdropper is a quantum no-signaling adversary but is not equipped with a number of measurements.

One-sided-device-independent protocol
Let $n \in \mathbb{Z}^+$, $R \ge 0$, and $\varepsilon \in [0, 1]$. We define an $(n, R, \varepsilon)$ one-sided-device-independent secret-key-agreement protocol for an assemblage $\hat\rho := \{p_{A|X}(a|x)\,\rho^{a,x}_B\}_{a,x}$ as follows:
• Alice gives input $x^n$ to get an output $a^n$. The assemblage shared by Alice and Bob is then
$$\rho_{\bar A^n X^n B^n} := \sum_{x^n, a^n} p_{X^n}(x^n)\, p_{A^n|X^n}(a^n|x^n)\, [x^n, a^n]_{X^n \bar A^n} \otimes \rho^{a^n, x^n}_{B^n},$$
where $\{p_{A^n|X^n}(a^n|x^n)\,\rho^{a^n, x^n}_{B^n}\}_{a^n, x^n}$ is an i.i.d. extension of the assemblage $\{p_{A|X}(a|x)\,\rho^{a,x}_B\}_{a,x}$. Alice, Bob, and Eve hold a no-signaling extension of the above assemblage:
$$\rho_{\bar A^n X^n B^n E} := \sum_{x^n, a^n} p_{X^n}(x^n)\, p_{A^n|X^n}(a^n|x^n)\, [x^n, a^n]_{X^n \bar A^n} \otimes \rho^{a^n, x^n}_{B^n E}.$$
• Bob inputs y i and obtains the output b i , where i ∈ {1, . . . , n}. Let the measurement corresponding to y n be a set {Y n b n } b n of measurement operators, such that b n (Y n b n ) † Y n b n = I. The state shared between Alice, Bob and Eve is then σĀn X nBn Y n E . σĀn X n Y nBn E := x n ,a n p X n (x n )pĀn |X n (a n |x n ) [x n , a n ] X nĀn ⊗ • Alice and Bob perform local operations and public communication, with C A being the classical register communicated from Alice to Bob,C A is a classical register held by Eve that is a copy of C A , the classical register C B is communicated from Bob to Alice, andC B is a classical register held by Eve that is a copy of C B . This protocol yields a state ω K A K B EC ACB X n Y n that satisfies for all no-signaling extensions, where A rate R is achievable for a device characterized byρ if there exists an (n, R − δ, ε) one-sided device-independent protocol for all ε ∈ (0, 1), δ > 0, and sufficiently large n. The one-sided device-independent capacity SDI(ρ) of the device characterized byρ is defined to be equal to the supremum of all achievable rates forρ.
Theorem 23 The restricted intrinsic steerability S(Ā;B)ρ is an upper bound on the one-sided device-independent secret-key-agreement capacity SDI(ρ) of a device characterized byρ: Proof. For obtaining an upper bound in the one-sided device-independent setting, we continue from (226) as follows: ≤ I(Ā n ;B n Y n |EX n ) σ + ε (239) The first inequality follows from the chain rule of conditional mutual information. The last inequality follows from data processing. Since the above inequality holds for an arbitrary no-signaling extension of ρĀn X n B n , we obtain nR ≤ inf ρĀn X n B n E I(Ā n ; B n |X n E) ρ + ε .
Since we assume an i.i.d. device, we find by applying the additivity of restricted intrinsic steerability Taking the limit as n → ∞ and ε → 0 then leads to the desired inequality SDI(ρ) ≤ S(Ā; B)ρ.
In the following proposition, K D (ρ AB ) refers to the distillable key of the state ρ AB . For the exact definition, please refer to Definition 8 of [HHHO09].
Proposition 24 Let ρ AB be a bipartite state,ρ a,x B an assemblage resulting from the action of a POVM on Alice's system, and p(a, b|x, y) a quantum correlation resulting from the action of an additional POVM on Bob's system. Then, the device-independent secret-key-agreement capacity of the quantum correlation p does not exceed the one-sided device-independent secret-key-agreement capacity ofρ, which in turn does not exceed the distillable key of the state ρ AB : Proof. The proof is a consequence of the following observation: the DI secret-key-agreement protocol is a special case of the SDI secret-key-agreement protocol with the measurements on Bob's side corresponding to i.i.d. measurements. Similarly, the SDI secret-key-agreement protocol is a special case of a secret-key-agreement protocol acting on the state ρ AB with the local operations on Alice's side consisting of i.i.d. measurements.

Device-independent protocol
We now consider a device that is characterized by the correlation $p$ which has the following quantum strategy: Alice and Bob share a two-qubit isotropic state $\omega^p_{AB} = (1 - p)\Phi_{AB} + p\,\pi_A \otimes \pi_B$, where $\Phi_{AB} = \frac{1}{2} \sum_{i,j=0}^{1} |ii\rangle\langle jj|$ and $\pi$ denotes the maximally mixed state. This state arises from sending one share of $\Phi_{AB}$ through a depolarizing channel. Alice's measurement choices $x_0$, $x_1$, and $x_2$ correspond to $\sigma_z$, $\frac{\sigma_z + \sigma_x}{\sqrt{2}}$, and $\frac{\sigma_z - \sigma_x}{\sqrt{2}}$, respectively. Bob's measurement choices $y_1$ and $y_2$ correspond to $\sigma_z$ and $\sigma_x$, respectively. The correlation resulting from this setup is then $p(a, b|x, y)$, with $x$ taking values from $\{x_0, x_1, x_2\}$, the variable $y$ taking values from $\{y_1, y_2\}$, and $a, b \in \{0, 1\}$ being the measurement results. A specific device-independent protocol was studied in [ABG + 07], which was then used to obtain a lower bound on the key rate from the above specified correlation.
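As an illustration of the setup just described, the following pure-Python sketch builds the isotropic state and the stated measurements as explicit matrices and evaluates $p(a, b|x, y)$ together with a CHSH combination on the settings $x_1, x_2, y_1, y_2$. The helper names, the convention that outcome 0 corresponds to eigenvalue $+1$, and the particular CHSH sign pattern are assumptions made for this example.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def trace_of_product(A, B):
    """Tr[A B] for square matrices of equal size."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

I2 = [[1.0, 0.0], [0.0, 1.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]
SX = [[0.0, 1.0], [1.0, 0.0]]

def combo(cz, cx):
    """Observable cz * sigma_z + cx * sigma_x."""
    return [[cz * SZ[i][j] + cx * SX[i][j] for j in range(2)] for i in range(2)]

R = 1.0 / math.sqrt(2.0)
ALICE = [SZ, combo(R, R), combo(R, -R)]   # x_0, x_1, x_2
BOB = [SZ, SX]                            # y_1, y_2

def projector(obs, outcome):
    """Projector (I + (-1)^outcome * obs) / 2; outcome 0 <-> eigenvalue +1."""
    sign = 1 - 2 * outcome
    return [[(I2[i][j] + sign * obs[i][j]) / 2 for j in range(2)] for i in range(2)]

def isotropic_state(p):
    """(1 - p) * Phi + p * (I/2 tensor I/2) in the basis |00>, |01>, |10>, |11>."""
    phi = [[0.0] * 4 for _ in range(4)]
    for i in (0, 3):
        for j in (0, 3):
            phi[i][j] = 0.5
    return [[(1 - p) * phi[i][j] + (p / 4 if i == j else 0.0) for j in range(4)]
            for i in range(4)]

def correlation(p, a, b, x, y):
    """p(a, b | x, y) for setting indices x in {0, 1, 2} and y in {0, 1}."""
    measurement = kron(projector(ALICE[x], a), projector(BOB[y], b))
    return trace_of_product(measurement, isotropic_state(p))

def correlator(p, x, y):
    return sum((-1) ** (a + b) * correlation(p, a, b, x, y)
               for a in (0, 1) for b in (0, 1))

def chsh(p):
    """CHSH value on Alice's x_1, x_2 and Bob's y_1, y_2."""
    return (correlator(p, 1, 0) + correlator(p, 1, 1)
            + correlator(p, 2, 0) - correlator(p, 2, 1))
```

With these conventions the CHSH value works out to $2\sqrt{2}(1 - p)$, which drops to the local bound of 2 at $p = 1 - 1/\sqrt{2}$, the threshold quoted in the text.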
The secret-key rate in a device-independent protocol is bounded from above as follows (Theorem 22):
The idea is now to consider some quantum extension of the probability distribution obtained from the black box, and then bound the quantum intrinsic non-locality from above. The technique presented below is similar to the technique used in [GEW16] to obtain upper bounds on the squashed entanglement of a depolarizing channel. An isotropic state is Bell local if $p \ge 1 - \frac{1}{\sqrt{2}}$ [HHH95]. This implies that the quantum intrinsic non-locality of a correlation derived from $\omega^p_{AB}$ is equal to zero for $p \ge 1 - \frac{1}{\sqrt{2}}$ (Proposition 7). For $\varepsilon \le p \le 1 - \frac{1}{\sqrt{2}}$, we can write the probability distribution $q_{\omega^p}(a, b|x, y)$ obtained from $\omega^p_{AB}$ as a convex combination of probability distributions obtained from $\omega^{\varepsilon}$ and $\omega^{1 - 1/\sqrt{2}}$. That is, for some $0 \le \alpha \le 1$, we have
$$q_{\omega^p}(a, b|x, y) = (1 - \alpha(\varepsilon))\, q_{\omega^{\varepsilon}}(a, b|x, y) + \alpha(\varepsilon)\, q_{\omega^{1 - 1/\sqrt{2}}}(a, b|x, y).$$
By simple algebra, we obtain
$$\alpha(\varepsilon) = \frac{p - \varepsilon}{1 - \frac{1}{\sqrt{2}} - \varepsilon}.$$
Equation (246) can be written as
Then, from convexity of quantum intrinsic non-locality (Proposition 11), we obtain
Since the above equation is true for all $\alpha$, we find that
This implies that
where $q_{\omega^{\varepsilon}}$ is encoded in $\rho_{\bar A \bar B X Y}(\varepsilon)$ with $\rho_{\bar A \bar B X Y E}(\varepsilon)$ as the quantum extension. Let us choose a trivial extension of the state $\rho^{\bar x, \bar y}_{\bar A \bar B}(\varepsilon)$. It is easy to see that
Therefore,
We plot this upper bound in Figure 3, and we interpret it and explain the relative entropy of entanglement bound in the next subsection.
Figure 3: Upper bound on the secret-key rate for the device-independent protocol described in Section 7.1. The relative entropy of entanglement of a qubit-qubit isotropic state is given in [VPRK97]. For further explanation of this plot, see the next section.
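One way to evaluate a bound of this type numerically is sketched below in Python. It rests on several assumptions that are not spelled out in the surviving text: that $\alpha(\varepsilon) = (p - \varepsilon)/(1 - 1/\sqrt{2} - \varepsilon)$, which follows from the isotropic state being linear in its noise parameter; that the trivial extension is used, so the conditional mutual information reduces to $I(\bar A; \bar B|XY)$; and that the largest such value is attained when both parties measure $\sigma_z$, giving $1 - h_2(\varepsilon/2)$ for $\omega^{\varepsilon}$. The function names are hypothetical and the sketch is not claimed to reproduce Figure 3.

```python
import math

P_LOCAL = 1.0 - 1.0 / math.sqrt(2.0)   # noise level at which CHSH violation ends

def h2(t):
    """Binary entropy in bits."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def alpha(p, eps):
    """Weight on the local correlation in the convex decomposition."""
    return (p - eps) / (P_LOCAL - eps)

def trivial_extension_rate(eps):
    """I(A;B) when both parties measure sigma_z on omega^eps (assumed maximizer).

    The outcomes are uniform and disagree with probability eps / 2.
    """
    return 1.0 - h2(eps / 2.0)

def upper_bound(p, grid=2000):
    """Minimize (1 - alpha(eps)) * (1 - h2(eps / 2)) over eps in [0, p]."""
    if p >= P_LOCAL:
        return 0.0
    best = float("inf")
    for k in range(grid + 1):
        eps = p * k / grid
        best = min(best, (1.0 - alpha(p, eps)) * trivial_extension_rate(eps))
    return best
```

Under these assumptions the bound equals 1 at $p = 0$, vanishes for $p \ge 1 - 1/\sqrt{2}$, and for intermediate $p$ is never larger than the two endpoint choices $\varepsilon = 0$ and $\varepsilon = p$.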
We plot this bound in Figure 4. Because squashed entanglement is an upper bound on the rate at which secret key can be distilled from an isotropic state [CEH + 07, Wil16], and because the above protocols are particular protocols for secret key distillation, squashed entanglement is also an upper bound on the rate at which secret key can be distilled in one-sided-device-independent and device-independent protocols. However, the upper bound on squashed entanglement of an isotropic state that we obtain after choosing the extension as given in [GEW16] is greater than the bound obtained on intrinsic steerability of the assemblage considered above. Therefore, we do not plot the squashed-entanglement bounds in Figures 3 or 4. For the same reason given above, the relative entropy of entanglement is also an upper bound on the rate at which secret key can be distilled in one-sided-device-independent and device-independent protocols [HHHO09]. The relative entropy of entanglement of qubit-qubit isotropic states has been calculated in [VPRK97], and we plot it in the above figures. This bound performs better than intrinsic non-locality and intrinsic steerability in certain regimes. This suggests that it might be worthwhile to explore whether the relative entropy of steering [GA15,KW17] and the relative entropy of non-locality [vDGG05] would be useful as upper bounds for one-sided-device-independent and device-independent quantum key distribution, respectively.
The bounds that we obtain do not closely match the lower bounds obtained from prior literature. One reason for this discrepancy can be traced back to the following question: is a violation of a Bell inequality or a steering inequality sufficient for security in DI-QKD and SDI-QKD, respectively? Since our measure is faithful, it is equal to zero if and only if there is no violation of steering inequality or Bell inequality. However, the lower bounds hit zero at a lower value of p than expected from the faithfulness condition. Another possible reason for the discrepancy has been discussed in Section 7.1, pertaining to two-way error correction that is allowed in the protocols considered above.

Conclusion and outlook
In the present work, we have introduced information-theoretic measures of non-locality called intrinsic non-locality and quantum intrinsic non-locality. They are inspired by the intrinsic information [MW99] and have a form similar to squashed entanglement [CW04] and intrinsic steerability [KWW17]. We have proven that intrinsic non-locality and quantum intrinsic non-locality are upper bounds on secret-key rates in device-independent secret-key-agreement protocols. Similarly, we have proven that restricted intrinsic steerability is an upper bound on secret-key rates in one-sided device-independent secret-key-agreement protocols. To our knowledge, this is the first time that monotones of Bell non-locality and steering have been used to obtain upper bounds on device-independent and one-sided-device-independent secret-key rates, respectively. The faithfulness properties for intrinsic steerability and intrinsic non-locality that we have proven here are of independent interest.
We now give an overview of the remaining open problems not addressed by the present work. It is not known whether intrinsic non-locality or intrinsic steerability is asymptotically continuous. A naive approach for establishing these properties is to follow the proof for asymptotic continuity of squashed entanglement [AF04]; however, this approach does not straightforwardly apply due to the no-signaling constraints on the extension system. From a foundational perspective, it would be interesting to provide an example of a probability distribution for which the intrinsic non-locality with a classical no-signaling extension is different from the intrinsic non-locality with a quantum no-signaling extension.
We also suspect that the squashed entanglement of a bipartite state ρ AB is greater than or equal to the restricted intrinsic steerability of an assemblage that results from measuring ρ AB . The approach in Proposition 14 does not apply because it does not account for the factor of 1/2 present in the definition of squashed entanglement.
Another promising direction to pursue is to improve the upper bounds on secret-key rates for device-independent and one-sided-device-independent protocols. Several works in the classical information theory literature have introduced modifications of classical intrinsic information [RW03,GA10] in order to obtain better bounds on secret-key rates than intrinsic information. In [RW03], a modified measure of intrinsic information, called reduced intrinsic information, was introduced and proved to be a better upper bound on secret-key rate than intrinsic information [MW99]. This bound was also subsequently improved further in [GA10]. It would be interesting to check if these techniques lead to improvements on the upper bounds presented by intrinsic non-locality and intrinsic steerability.
One of the most important open questions is to determine if the relative entropy of steering [GA15,KW17] and relative entropy of non-locality [vDGG05] would be useful as upper bounds for one-sided-device-independent and device-independent secret-key-agreement protocols, respectively. It is possible that this might be the case; if true, it could lead to tighter upper bounds for certain device-independent and one-sided-device-independent protocols.