Valid and efficient entanglement verification with finite copies of a quantum state

Cieśliński, Paweł; Dziewior, Jan; Knips, Lukas; Kłobus, Waldemar; Meinecke, Jasmin; Paterek, Tomasz; Weinfurter, Harald; Laskowski, Wiesław

doi:10.1038/s41534-024-00810-3

Download PDF

Article
Open access
Published: 24 January 2024

Valid and efficient entanglement verification with finite copies of a quantum state

npj Quantum Information volume 10, Article number: 14 (2024) Cite this article

703 Accesses
1 Altmetric
Metrics details

Subjects

Abstract

Detecting entanglement in multipartite quantum states is an inherently probabilistic process, typically with a few measured samples. The level of confidence in entanglement detection quantifies the scheme’s validity via the probability that the signal comes from a separable state, offering a meaningful figure of merit for big datasets. Yet, with limited samples, avoiding experimental data misinterpretations requires considering not only the probabilities concerning separable states but also the probability that the signal came from an entangled state, i.e. the detection scheme’s efficiency. We demonstrate this explicitly and apply a general method to optimize both the validity and the efficiency in small data sets providing examples using at most 20 state copies. The method is based on an analytical model of finite statistics effects on correlation functions which takes into account both a Frequentist as well as a Bayesian approach and is applicable to arbitrary entanglement witnesses.

Mirrored entanglement witnesses

Article Open access 03 February 2020

Experimental few-copy multipartite entanglement detection

Article 24 June 2019

Quantifying entanglement in a 68-billion-dimensional quantum state space

Article Open access 25 June 2019

Introduction

Quantum entanglement has long been recognized as an important prerequisite of modern quantum technologies. Its detection is accordingly a well-studied topic with a plethora of different methods available. The field has evolved towards strategies directly applicable to experimental data which inevitably is limited to a finite number of detection events. If this number is large, various forms of entanglement witnesses^1,2,3 provide reliable entanglement verification⁴. Interestingly, also the analysis of smaller data sets allows to detect entanglement⁵, most recently with quantum state verification methods^6,7,8 and tailored game-like protocols in which particles are measured one by one^9,10,11,12. These methods are of high practical relevance in all cases where only a limited amount or only partial data is accessible, e.g. when an inefficient source has to be characterized quickly or one is interested in entanglement detection in large-scale quantum systems.

The finiteness of data sets used to derive a conclusion about whether the state is entangled or not inevitably leads to a probabilistic nature of this conclusion. Common certification schemes aim to ensure a small error if the outcome of the method speaks in favour of entanglement, i.e. a small error of a false positive. Yet, in the case of low statistics, this might come at the price of a large probability of rejecting an actually entangled state, i.e. a false negative. This tradeoff is illustrated in Fig. 1, which shows that for smaller statistics one is required to go to more and more strict criteria to reliably distinguish between results compatible with entangled states and results compatible with separable states. In consequence, due to the increased strictness, it becomes also more and more improbable that an entangled state will pass the test. It thus becomes crucial to properly apply the apparatus of statistical hypothesis testing^13,14,15 to the scenario of entanglement certification in order to avoid significant misinterpretations of conclusions based on small sample sizes and to properly optimize the usage of these scarce resources.

**Fig. 1: Illustration of finite statistics effects on entanglement verification.**

Here, we analyse the consequences of limited datasets for the application of entanglement witnesses based on correlation functions. As mentioned above, there are competing interests in choosing a certification strategy. The first is achieving a certain level of validity, i.e. trust in a positive result of the entanglement certification. The second interest is to ensure that the test can correctly assess as many input states as possible, which we call its efficiency. In general, different certification strategies will exhibit different efficiencies even if they achieve the same validity and vice versa. Thus, both quantifiers should be taken into account when choosing an optimal test strategy. We develop an algorithm for the optimization of experimental entanglement tests which simultaneously ensures high validity with optimal efficiency while being independent of the system’s size.

Although the general framework of statistical analysis which is employed here is well-known to statisticians, it has not yet been comprehensively applied in the field of entanglement certification, where it becomes especially relevant in the context of limited datasets. The present work contributes to closing this interdisciplinary gap. The results are illustrated with various examples where different entanglement witnesses are applied to general classes of quantum states. While our examples all consider systems of qubits (the most commonly encountered systems in quantum information), our general analysis is by no means limited to two-dimensional systems and can be applied to any witness which is a function of generalized correlation tensor elements.

Results

Entanglement certification as hypothesis testing

There are two main approaches to statistical hypothesis testing, the Frequentist and the Bayesian one (see e.g. ref. ¹³ for a Frequentist and refs. ^14,15 for a Bayesian perspective). These approaches employ different quantifiers to capture different figures of merit. In general, both are useful to statistically evaluate an entanglement certification scheme, but dependent on the context, such as e.g. the availability of prior information or the practical goal of certification, one of these frameworks might be more suited than the other. In the following, we will first express entanglement detection as a hypothesis test and subsequently show how it is statistically analysed from the Frequentist or Bayesian viewpoints.

Entanglement certification can be reduced to a so-called simple hypothesis test, where the parameter space contains just two elements corresponding to the two disjoint hypotheses ‘the state is separable’ and ‘the state is entangled’. We denote the two parameters as ‘sep’ and ‘ent’, respectively. The outcome space ${{{\mathcal{O}}}}$ is given by the set of all possible outcomes generated by a particular entanglement detection function under consideration. Depending on the chosen approach to statistical inference, the theory of hypothesis testing now allows to construct the optimal test to choose between the two hypotheses for any given outcome $Q\in {{{\mathcal{O}}}}$.

In non-randomized tests the outcome space is divided into two disjoint sets ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ (acceptance set) and ${{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}}={{{\mathcal{O}}}}\setminus {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ (rejection set) such that entanglement is certified whenever $Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$. Note, that sometimes it can be optimal to perform a (partially) randomized test, where there exists a subset of outcomes for which the decision between the hypotheses is made randomly. For simplicity, we present our analysis in terms of non-randomized tests and note that randomized test can be treated equivalently. Furthermore, the usefulness of a randomized test is illustrated in the generalized scenario part of the Optimizing Entanglement Certification Strategies section.

In typical entanglement certification schemes the outcomes Q approximate theoretical quantities Q^∞ which are functions of the states that would be observed when the number of measurements tends to infinity. From this point on we use the superscript ∞ to indicate such ideal values. The inference of entanglement is usually justified by the fact that values in the acceptance set ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ cannot be attained by any separable state, i.e.

$${{{\rm{P}}}}({Q}^{\infty }\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}})=0.$$

(1)

If the experiment is performed with sufficient statistical data the experimental result Q approximates the theoretical Q^∞ very well and thus a result in ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ entanglement can be certified essentially with certainty. Crucially, however, if the amount of empirical data is sufficiently small, the statistical fluctuations of the outcomes lead to a manifestly statistical nature of the conclusion and a proper analysis of the validity of the statistical inference becomes necessary.

Frequentist approach

In the Frequentist approach, the validity of the test is given by the level of confidence q_F = 1−α, that one or the other hypothesis is correct, with α being the level of significance. In our simple certification scenario, the significance α is the upper bound on the probability q_F with which the experiment will wrongly certify entanglement if the state is separable. As this is a statement about the average performance of the test, it does not allow any probabilistic assertions about the correctness of a particular single run of the procedure. For example, if we perform certification of entanglement with the values typically used for hypothesis testing, i.e. q_F = 95%(α = 5%), but are given separable states only, we would certify the state to be entangled on average in roughly 5% of the runs.

In a certification scenario, the level of significance only depends on the likelihood of type I errors (false positives). Thus, to achieve a certain validity q_F, the acceptance set has to be chosen such that the likelihood of yielding a value in the acceptance set given a separable state is at most 1−q_F, i.e.,

$$\mathop{\sum}\limits_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}}}{{{\rm{P}}}}(Q| {{{\rm{sep}}}})\ge {q}_{{{{\rm{F}}}}}.$$

(2)

The likelihood function for the opposite hypothesis P(Q∣ent) is irrelevant as far as the level of significance is concerned.

Although the distinction between types of errors in Frequentist inference is well known, particularly in the field of entanglement certification with small data sets, often the focus lies exclusively on the level of significance^{5,9,10,11,16,17}. We want to argue, that in most practical applications, one is not only interested in a valid method but also in its ability to positively certify at least some entangled states. This tradeoff becomes particularly crucial in the case of very low statistics. Thus, hypothesis tests should also be optimized based on a second quantifier, the power of the test, which characterizes exactly this ability to provide a positive certification corresponding to minimizing e₂, the error of type II (probability of false negatives). The power r is given by

$$r=1-{e}_{2}=\mathop{\sum}\limits_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{\rm{P}}}}(Q| {{{\rm{ent}}}}).$$

(3)

The concept of statistical power is essential for finding optimal tests and properly assessing the usefulness of statistical conclusions. See Fig. 2 for an illustration. Thus, in our terminology, the maximization of power corresponds to a maximization of the efficiency of the test.

**Fig. 2: Validity and power in the Frequentist approach applied to the nonlinear witness S.**

Bayesian approach

In a Bayesian approach to certification, probabilities for both hypotheses can be calculated given a certain outcome, Q, i.e. P(ent∣Q) and P(sep∣Q) = 1−P(ent∣Q). Validity in this context is readily identified with the Bayesian level of acceptance q_B, which is the minimal value of the posterior probability for which the hypothesis ent is accepted. Entanglement is thus certified whenever P(ent∣Q) ≥ q_B. This means that the acceptance set ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ has to be chosen such that

$${{{\rm{P}}}}({{{\rm{ent}}}}| Q)\ge {q}_{{{{\rm{B}}}}},\quad {{{\rm{for}}}}\,{{{\rm{all}}}}\quad Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}},$$

(4)

i.e., whenever a value of $Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ is observed, the certification of entanglement is credible at least with degree q_B. Note, that as given by Bayes theorem, the posterior P(ent∣Q) is a function of both P(Q∣sep) and P(Q∣ent), as well as of the quotient of priors P(ent)/P(sep).

To arrive at a formulation of Bayesian hypothesis testing in terms of validity and efficiency, one has to consider the connection between the level of acceptance and so-called loss functions. For a given outcome Q and weights β and (1−β) for false positive and false negative respectively, the (posterior) loss function L_Q is given by

$$\begin{array}{l}{L}_{Q}=\left\{\begin{array}{ll}\beta \ {{{\rm{P}}}}({{{\rm{sep}}}}| Q)\quad &\,{{\mbox{if}}}\,Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}\\ (1-\beta )\ {{{\rm{P}}}}({{{\rm{ent}}}}| Q)\quad &\,{{\mbox{if}}}\,Q\in {{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}},\end{array}\right.\end{array}$$

(5)

i.e. the respective probability of error multiplied by the corresponding weight. It can be shown that for each Q this loss is optimized if the rejection and acceptance sets are chosen simply based on an acceptance level of q_B = β¹⁴. Thus for optimal tests, we can simply identify the acceptance level q_B with the (relative) weight β given to the error of false positive.

A priori, i.e., without the knowledge of the outcome Q, an expected loss L can be defined as

$$\begin{array}{ll}L=\mathop{\sum}\limits_{Q}{L}_{Q}{{{\rm{P}}}}(Q)={q}_{{{{\rm{B}}}}}{{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}}){{{\rm{P}}}}({{{\rm{sep}}}})\\\qquad+\,(1-{q}_{{{{\rm{B}}}}}){{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}}| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}}),\end{array}$$

(6)

which corresponds to a weighted sum of the probabilities for the errors of both types. This is also equivalent to the average loss if the procedure is applied multiple times. In analogy to the power for Frequentist tests, the average loss L allows to compare different Bayesian test strategies with equal acceptance levels. From a Bayesian perspective maximal efficiency of a test thus corresponds to minimal loss L.

Finally, let us also briefly mention why the relevant priors are not discussed in the usual detection of quantum entanglement, via witnesses, in the limit of many measurement results⁴. In such a case, when the number of state copies is large: n ≫ 1, it is possible to choose acceptance sets ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ for which it is practically impossible for a separable state to exceed the bound of the witness, i.e., ${{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}})\approx 0$, as discussed above. Therefore, Bayes rule simplifies and the posterior becomes P(ent∣Q) ≈ 1 independently of the priors. In consequence, for any outcome in the acceptance set entanglement is certified with validities q_F ≈ q_B ≈ 1 both from the Frequentist and the Bayesian perspective and the distinction between the two approaches becomes superfluous.

Effects of finite statistics

The standard experimental scheme for entanglement certification in qubit systems is based on the measurement of correlation functions or simply correlations ${T}_{{\mu }_{1}\ldots {\mu }_{N}}$ defined by

$${T}_{{\mu }_{1}\ldots {\mu }_{N}}=\langle {R}_{{\mu }_{1}}\ldots {R}_{{\mu }_{N}}\rangle ={{{\rm{Tr}}}}(\rho \,{\sigma }_{{\mu }_{1}}\otimes \cdots \otimes {\sigma }_{{\mu }_{N}}),$$

(7)

where ${\sigma }_{{\mu }_{k}}$ is the μ_kth local Pauli matrix of the kth party with μ_k ∈ {0, x, y, z} and σ₀ being the identity. The correlation is thus the average of the product of local measurement outcomes ${R}_{{\mu }_{k}}=\pm 1$ when suitable local Pauli measurements are executed. Note that definition (7) does not only include the full correlations, for which all parties perform measurements but also marginal correlations, i.e. values of T to which only a subset of parties contributes, which formally correspond to setting the remaining indices μ_k = 0. To simplify the notation in the following we write the correlations as T_j, where the index $j\in \left[1,{4}^{N}\right]$ indicates the set of chosen measurement settings over all parties.

Consider a broad class of procedures for entanglement certification, in which a quantity Q^∞, e.g., an entanglement witness, is calculated as a function of the correlations T_j. Note that by choosing Q^∞ tailored to witness various types of entanglement, e.g. genuine multipartite entanglement, one can employ the further discussed method for testing the statements about, e.g., genuine multipartite entanglement or biseparability. For the usual application of the conditions imposed by Q^∞, the actual correlation functions are estimated with a finite number of experimental trials. We denote the actually measured correlations resulting from finite statistics by τ_j.

Thus, instead of the ideal quantum mechanically predicted value Q^∞ based on T_j obtained in the limit of infinite statistics, experimentally we have only access to the modified value Q, calculated from τ_j. We therefore begin with modelling τ_j.

Let {n_j} denote the number of state copies measured for each setting j, which can vary from setting to setting. For the jth measurement setting, we have

$${\tau }_{j}=\frac{{n}_{j}^{+}-{n}_{j}^{-}}{{n}_{j}}$$

(8)

where ${n}_{j}^{\pm }$ are the numbers of trials in which the product of local results equals ${R}_{{\mu }_{1}}\cdots {R}_{{\mu }_{N}}=\pm 1$, with ${n}_{j}^{+}+{n}_{j}^{-}={n}_{j}$.

For large n_j we obtain τ_j ≈ T_j, but for small n_j the measured τ_j take potentially different values which are furthermore restricted to the discrete set ${\tau }_{j}\in \{-1,-1+\frac{2}{{n}_{j}},\ldots ,1-\frac{2}{{n}_{j}},1\}$. Given an N-qubit state, expressed by its correlations T_j, it is possible to explicitly calculate the ideal probability distribution P(τ_j) over these discrete values by employing the binomial distribution as demonstrated in Methods (Witnesses in finite statistics). The mean and variance of this distribution are given as

$$\langle {\tau }_{j}\rangle ={T}_{j},\qquad {{{\rm{Var}}}}({\tau }_{j})={\sigma }_{j}^{2}/{n}_{j},$$

(9)

where ${\sigma }_{j}^{2}$ is a function of the ideal quantum prediction equal to ${\sigma }_{j}^{2}=1-{T}_{j}^{2}$, see also¹⁸. Note that the variance depends on the correlation value itself, e.g. a perfect correlation has no variance. This already hints that the method is most useful for states with high values of ${T}_{j}^{2}$.

Considering a particular entanglement certification procedure with the outcome $Q=f({\tau }_{{j}_{1}},\ldots ,{\tau }_{{j}_{M}})$ (M being the number of correlation measurements), it becomes possible to calculate explicitly the probability distribution over the outcomes

$${{\mathrm{P}}}(Q)=\sum\limits_{{\tau_{j_1},\ldots,\tau_{j_M} \atop f(\tau_{j_1},\ldots,\tau_{j_M})=Q}} \, \prod\limits_{k=1}^M {{\mathrm{P}}}(\tau_{j_k}).$$

(10)

In the “Methods”, subsection “Witnesses in finite statistics” section, we illustrate this procedure for both linear^1,2,3 and non-linear entanglement witnesses^{17,19,20,21,22,23,24}. The linear witness W = ∑_jα_jσ_j, with expectation value ${{{\rm{Tr}}}}({{{\rm {{W}}}}}\rho )={\sum }_{j}{\alpha }_{j}{T}_{j}$ denoted as ${E}_{{{{\rm{W}}}}}^{\infty }$, extension to finite statistics has the following properties:

$$\langle {E}_{{{{\rm{W}}}}}\rangle ={E}_{{{{\rm{W}}}}}^{\infty },\quad {{{\rm{and}}}}\quad {{{\rm{Var}}}}({E}_{{{{\rm{W}}}}})=\mathop{\sum}\limits_{j}{\alpha }_{j}^{2}{{{\rm{Var}}}}({\tau }_{j}).$$

(11)

For nonlinear witnesses, further work is required as one needs to calculate the effect of finite statistics on nonlinear functions of τ_j. We describe a general procedure in the “Methods” section and here briefly illustrate the case when the witness depends on the square of correlations. Its mean and variance read

$$\langle {\tau }_{j}^{2}\rangle ={T}_{j}^{2}+{{{\rm{Var}}}}({\tau }_{j}),\qquad {{{\rm{Var}}}}({\tau }_{j}^{2})=\frac{2({n}_{j}-1)(1-{T}_{j}^{2})[(2{n}_{j}-3){T}_{j}^{2}+1]}{{n}_{j}^{3}}.$$

(12)

Note that the mean value of the square has a consistent shift by exactly the variance of the estimation from the finite data. This is a consequence of taking the square—the negative values in the distribution are transformed into positive values leading to a systematic deviation. This fact is important in the experimental analysis of finite data sets and has already been addressed in ref. ²⁵. As a final illustration assume the quadratic witness ${S}^{\infty }=\mathop{\sum }\nolimits_{j}^{M}{T}_{j}^{2}$ that sums up M squared correlations (see the subsection “Certyfing entanglement” in the “Methods” section), each measured with the same number of copies n, i.e. n_j = n. We obtain

$$\langle S\rangle =\frac{(n-1){S}_{M}^{\infty }+M}{n},\quad {{{\rm{with}}}}\quad {{{\rm{Var}}}}(S)=\mathop{\sum }\limits_{i=1}^{M}{{{\rm{Var}}}}({\tau }_{i}^{2}),$$

(13)

where now apart from the statistical uncertainty also a shift of the mean value is present.

Limited significance of entanglement certification

The modelling of probability distributions for the outcomes of single correlation measurements now allows us to calculate explicitly the probability distribution over outcomes of the entanglement witnesses Q. This calculation is always based on particular state spaces corresponding to the hypotheses ent and sep. Initially, for simplicity, we consider a simple scenario, where the test is supposed to distinguish between having a particular entangled state of the form

$$\rho =\frac{3}{4}\left\vert {{{\rm{GHZ}}}}\right\rangle \left\langle {{{\rm{GHZ}}}}\right\vert +\frac{1}{4}\,{\mathbb{1}}/{2}^{N},$$

(14)

or an arbitrary separable state. Here, $\left\vert {{{\rm{GHZ}}}}\right\rangle$ is an N-qubit GHZ state, for any N, and ${\mathbb{1}}/{2}^{N}$ corresponds to the so-called white noise with no correlations at all. Note that in realistic scenarios the expected distribution of input states might be more complicated as we also demonstrate below. The probability distributions corresponding to the sep hypothesis are modelled based on a worst-case estimation of correlations compatible with separability as discussed in detail in the “Methods” section (see the subsections “Estimating probability distributions” and “Correlations compatible with separability”).

Now consider the linear witness W = σ_y ⊗ σ_y ⊗ σ_x ⊗ ⋯ ⊗ σ_x−σ_x ⊗ ⋯ ⊗ σ_x + 1 with only n = 10 clicks per single correlation measurement, i.e., 20 state copies in total. Due to this very small number of data points, separable states can now give rise to outcomes in the acceptance set, i.e., P(E_W < 0∣sep) > 0, where we follow the convention that separable states admit ${E}_{{{{\rm{W}}}}}^{\infty }\ge 0$ in the case of infinite statistics. In fact, in this example, the probability that the witness is negative on a suitable separable state is as large as 41.5%. The straightforward solution seems to be to make the boundary more strict, i.e. require sufficiently negative values to certify entanglement. However, even with the strictest nontrivial bound of −4/5 we find that the violation by a separable state is still equal to 2.5%, i.e. P(E_W ≤ − 4/5∣sep) = 2.5%. At the same time, the state ρ gives rise to values E_W < −4/5 with probability of 26.7%. While it is clearly more probable for the entangled state to violate the bound than for any separable state, still, given the finite values of both probabilities, the question remains what the appropriate probabilistic conclusion about the quality of such certification should be.

The nonlinear criterion $S={\langle {\sigma }_{y}\otimes {\sigma }_{y}\otimes {\sigma }_{x}\otimes \cdots \otimes {\sigma }_{x}\rangle }^{2}+{\langle {\sigma }_{x}\otimes \cdots \otimes {\sigma }_{x}\rangle }^{2}$ has been illustrated in Fig. 1. Again we consider the sum of only two squared correlations, each estimated in 10 experimental trials. Our analytical tools return the plotted P(S∣ent) based on the above-mentioned entangled state ρ. Figure 1 clearly illustrates that finite statistics leads to the violation of the bound of 1 by a separable state. Again the immediate solution seems to be making the bound more strict. We find that P(S = 2∣sep) = 4.2%, but at the same time P(S = 2∣ρ) = 6.9%. With this bound, the procedure would wrongly certify entanglement (false positive) in at most only 4.2% of the cases, but at the same time, it would fail to correctly certify it (false negative) in roughly 93% of the cases where the state was entangled. Furthermore, for any single experiment yielding value S = 2, we are clearly not justified to conclude that the state is entangled, as a separable state might have given rise to the same value of S with a comparable probability.

This type of explicit calculation can be applied to any witness based on correlation values to evaluate whether it can be applied efficiently in the case of finite statistics. In particular, it becomes necessary to put ${{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}})$ in relation to the probability ${{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{ent}}}})$ via a proper statistical analysis, either Frequentist or Bayesian.

Resource optimization

Based on the possibility of quantitatively evaluating any entanglement certification procedure in terms of validity and efficiency, we now tackle the question of how to best use a small number η of available copies of a multi-qubit quantum state. The experimental choice to be made is about the amount of settings M and the distribution of the copies among those {n_j}, where ∑_jn_j = η. Furthermore, a suitable outcome space ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ has to be chosen. By explicitly calculating the distributions P(Q∣sep) and P(Q∣ent), and assuming a certain prior the validity and efficiency of the procedure can be characterized, no matter if a Frequentist or Bayesian approach is chosen. In practice, a reasonable scenario could be to require a certain minimal validity and then choose a combination of experimental parameters with a suitable acceptance space, such as to maximize efficiency. This yields the following optimization problem:

Given a certain minimal validity ${q}_{\min }$ and a total number η of state copies, what set of M observables, each estimated on {n_j} copies of the state, and what acceptance set ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ are to be chosen such that an observation of $Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$ certifies entanglement with a validity $q\ge {q}_{\min }$, while keeping the efficiency as high as possible.

Note that when Q is a function of marginal correlations corresponding to commuting operators it can be possible to measure them simultaneously from a single global measurement setting. This is the approach chosen in ref. ⁹, where a single state copy is used to obtain values for several marginal correlations providing the outcome for a linear witness. Thus, in general, given a function Q also of marginal correlations it is of high interest to group these into as few measurement settings as possible. Since our example witnesses S and W are solely based on full correlations this particular layer of optimization is not relevant for them.

Optimizing entanglement certification strategies

Scenario: First of all, let us emphasise that the proper statistical analysis is especially crucial in the case of a very limited data set composed of at most a few tens of data points. In the examples below we assume that no more than 20 copies of multi-particle state from a possibly entangled source have been successfully measured. The small amount of data makes it practically impossible to distinguish entangled states which are close to separable ones. Thus, the best candidates are states which are strongly non-classically correlated. Generalizing Eq. (14), here, we therefore consider mixed states of the form

$$\rho =p\left\vert \psi \right\rangle \left\langle \psi \right\vert +(1-p){\rho }_{{{{\rm{n}}}}},$$

(15)

where a dominant admixture comes from a pure state $\left\vert \psi \right\rangle$, assumed to admit perfect correlations, and the remaining part, ρ_N, describes experimental noise. Since our theoretical analysis just depends on the number of correlations (measurement settings) it is thus independent of the number of involved particles. See ref. ²⁶ for fidelity estimation with the same feature. Accordingly, we will not specify the state $\left\vert \psi \right\rangle$ exactly as the only parameters that enter the witnesses are correlations. We would like to emphasise this point, the methods work for any GHZ states, graph states, cluster states, absolutely maximally entangled states etc. Although in practice the perfect correlations of $\left\vert \psi \right\rangle$ will be reduced by various types of noise, which we consider via the addition of ρ_n, we show in the “Methods” subsection “Effects of different types of noise” that the probability to observe a certain measurement outcome depends almost entirely on the parameter p, and not a type of noise as long as it is uniform and unbiased. Clearly, it is possible to introduce a specific noise that results in different success probability characteristics. In such a case one can still apply our methods, but here we focus on random noises as they are the usual models of experimental imperfections. Thus, we restrict our examples to the case of white noise, i.e. ${\rho }_{{{{\rm{n}}}}}={\mathbb{1}}/{2}^{N}$. In effect, the model (15) encompasses a plethora of experimental situations that are of interest in quantum information.

Consider a certification scenario where we expect that the tested state is either from the family (15) or a separable state modelled according to the worst-case correlations compatible with separability (see Estimating Probability Distributions and Correlations Compatible with Separability in the Methods), with P(sep) = P(ent) = 1/2. We assume 20 available state copies and require a target validity of q_F = q_B = 97.5%. This will help us to compare and highlight the differences in reasoning that occur from Frequentist and Bayesian perspectives on statistical inference. According to our optimization scheme, we want to determine the set of measurements, i.e., their amount M and in general also the measurement directions, together with an optimal distribution of clicks per setting {n_j}. For simplicity here, we restrict the presentation to cases where we assume that all n_j’s are the same and equal to n, and hence we constrain the product Mn = 20. Let us first set p = 3/4 and specify the Bayesian loss function as Eq. (31).

Linear witness

To illustrate how to optimally distribute the 20 copies for the linear witness W, as an example we choose a procedure where the acceptance set is chosen via an upper bound ${{{\mathcal{C}}}}$. We extend the linear witness to M different settings as ${E}_{{{{\rm{W}}}}}^{\infty }={T}_{1}-{T}_{2}-\cdots -{T}_{M}+1$. It turns out that the worst case separable state, i.e. the state which maximizes the probabilities in the acceptance set, for a sufficiently low bound ${{{\mathcal{C}}}}$ has the correlations T₁ = −1/M and T_i = 1/M for i = 2, …, M. In Fig. 3a we plot ${{{\rm{P}}}}({E}_{{{{\rm{W}}}}}\le {{{\mathcal{C}}}}| {{{\rm{sep}}}})$ and ${{{\rm{P}}}}({E}_{{{{\rm{W}}}}} > {{{\mathcal{C}}}}| {{{\rm{ent}}}})$, which are directly related to validity and efficiency. As the simultaneous optimization of the two quantifiers is based on keeping both probabilities as small as possible, clearly the optimal procedure is realized for M = 5 and n = 4.

From the Frequentist perspective to achieve the minimally required validity of 97.5%, a bound of ${{{{\mathcal{C}}}}}_{{{{\rm{F}}}}}=-2.5$ has to be chosen, which leads to an efficiency of r ≈ 76.5%. The highest validity achievable with 20 state copies is 99.996% with a bound of ${{{{\mathcal{C}}}}}_{{{{\rm{F}}}}}=-4$. However, it comes at the cost of efficiency of just r ≈ 6.9%.

The Bayesian framework requires P(ent∣E_W) ≥ 97.5% for all values in the acceptance set. This is satisfied if the bound ${{{{\mathcal{C}}}}}_{{{{\rm{B}}}}}$ is chosen as −3. The associated efficiency, in this case, is given by the loss function of L ≈ 0.007.

From the above results, one can see a clear difference in the choice of the acceptance sets for each reasoning. Namely, the Frequentist bound is less strict than the Bayesian counterpart, implying instances in which experimenters would disagree on the conclusions about the entanglement certification. For comparison, from the Frequentist perspective, the adjusted bound of ${{{{\mathcal{C}}}}}_{{{{\rm{B}}}}}=-3$ that is optimal for the Bayesian approach results in the level of confidence given by 99.64% and the corresponding power of r ≈ 53.5%. As expected, for the lower bound confidence is increased and power reduced, however, we emphasise that in general the Bayesian approach does not always yield stricter acceptance sets.

Quadratic witness

Just as for the case of the linear witness, we introduce a bound ${{{\mathcal{B}}}}$ to define the acceptance set also in the example of the quadratic witness. Figure 3b presents the two relevant quantities ${{{\rm{P}}}}(S\ge {{{\mathcal{B}}}}| {{{\rm{sep}}}})$ and ${{{\rm{P}}}}(S < {{{\mathcal{B}}}}| {{{\rm{ent}}}})$. Again, the best use of copies is given by n = 4 and M = 5.

Based on the minimally required validity of 97.5%, we calculate the bound ${{{\mathcal{B}}}}$ for the Frequentist approach to be ${{{{\mathcal{B}}}}}_{{{{\rm{F}}}}}=4$ and in the Bayesian approach we get ${{{{\mathcal{B}}}}}_{{{{\rm{B}}}}}=5$. This leads to an efficiency of r ≈ 31.4% and L ≈ 0.013 respectively. Recall that the chosen priors for the Bayesian inference are P(ent) = P(sep) = 0.5. In general, the examined state is entangled iff p > 1/(2^N−1 + 1)²⁷. Thus, given our state model, a more realistic scenario would take into account the cardinality of the separable states within the family (15) in the prior probabilities. For example, with a uniform distribution over p and N = 4, the natural prior is P(ent) = 8/9 and P(sep) = 1/9. This leads to the change of the acceptance set with ${{{{\mathcal{B}}}}}_{{{{\rm{B}}}}}=4$ and efficiency L ≈ 0.018. With increasing N the efficiency in the Bayesian approach is changing. As the last example, for N = 5 the prior is P(ent) = 16/17 and P(sep) = 1/17, the Bayesian bound is given by ${{{{\mathcal{B}}}}}_{{{{\rm{B}}}}}=3.5$ and the corresponding efficiency reads L ≈ 0.015.

Again, for equal priors Frequentist reasoning yields a greater acceptance set. However, here one can explicitly see that for more physically motivated priors this is no longer true and ${{{{\mathcal{C}}}}}_{{{{\rm{B}}}}}$ can be smaller than ${{{{\mathcal{C}}}}}_{{{{\rm{F}}}}}$. For all of the discussed cases, confidence and power are $({{{\mathcal{B}}}}=5,{q}_{{{{\rm{F}}}}}=99.8 \% ,r=6.9 \% )$, $({{{\mathcal{B}}}}=4,{q}_{{{{\rm{F}}}}}=97.6 \% ,r=31.4 \% )$ and $({{{\mathcal{B}}}}=3.5,{q}_{{{{\rm{F}}}}}=92.5 \% ,r=54.9 \% )$ respectively. Note that the last level of confidence does not meet the required validity condition.

Generalized scenario: Finally, in a more general example, we relax the condition of an equal amount of copies per setting and apply our optimization algorithm to determine the optimal amount of settings and distribution of state copies over these settings, such as to achieve optimal efficiency for certification with the quadratic witness S. It turns out that different certification strategies are found to be optimal, depending on whether a Frequentist or Bayesian approach is chosen. We calculate the distribution P(ρ_j∣ent) based on the model of a family of noisy three-qubit GHZ states (14), with a mean $\bar{p}=0.8$ and a standard deviation Δp = 0.1, such that for the entangled states

$$\begin{array}{r}{{{\rm{P}}}}(p)=\left\{\begin{array}{ll}{{{\mathcal{N}}}}{{{{\rm{e}}}}}^{-\frac{{(p-\bar{p})}^{2}}{2{(\Delta p)}^{2}}}\quad &p\in [{p}_{\min },1]\\ 0\quad &\,{{\mbox{else}}}\,,\end{array}\right.\end{array}$$

(16)

with proper normalization ${{{\mathcal{N}}}}$ and the lower bound ${p}_{\min }=1/5$ given by the requirement that the states have to be entangled, see the quadratic witness example in the previous subsection. We set the maximal number of settings to M = 3, allow at most 13 state copies and require a minimal validity of 70%. Furthermore, we again choose not to model the separable state space and instead estimate P(S∣sep) via the worst-case optimization used above. In this example for the prior, we assume that it is twice as likely to encounter an entangled state from our family than any separable state.

If the optimization is based on a Frequentist understanding of validity, i.e., achieving at least 70% confidence, it is optimal to use 12 of the 13 possible measurement runs and distribute them equally among the three settings as n₁ = 4, n₂ = 4 and n₃ = 4 (denoted as [444] for brevity). Figure 4a illustrates the resulting probability distribution P(S∣ent) and the worst case distribution P(S∣sep) obtained via the discussed optimization, where the best acceptance set turns out to be ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}=\{0,1,2.25,3\}$. The probability ${{{\rm{P}}}}(S\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}})$ is only 29.8% and thus the confidence remains above 70% while at the same time the power $r={{{\rm{P}}}}(S\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{ent}}}})$ is maximized to 66.4%. Estimating also the Bayesian loss for this acceptance interval yields a value of 0.137. For comparison, Fig. 4b illustrates the estimated posterior distribution. From the Bayesian perspective, the acceptance set has to be chosen as {2.25, 3} which leads to a minimal level of acceptance of 76.8% and a loss of 0.149. The corresponding power would be 65.6%.

**Fig. 4: Optimal choices of distributions of states for Bayesian and Frequentist figures of merit.**

This example illustrates two seemingly peculiar consequences of the need to use a worst-case estimation. The first concerns the benefit of using a randomized test. To maximize the power of the test, the outcomes 0 and 1 are included in the acceptance set. As seen in Fig. 4 these are not valuable points since they have a significantly higher likelihood given a separable state than an entangled state. In principle, it would be much more beneficial to add a point with a better likelihood ratio, but as the optimization shows in this case the confidence would drop below 70%. This problem is resolved by the Neymark–Pearson lemma¹³, which proves that the optimal strategy consists of simply choosing the points with the best likelihood ratios (after corresponding optimization) and make a probabilistic conclusion whenever results with the lowest ratio are obtained. In this manner, the confidence level of 70% could be reached exactly while maximizing the power of the test even further. Second, the loss of the interval {0, 1, 2.25, 3} is estimated to be smaller than the loss of the interval {2.25, 3.0}, while still the latter is chosen as the Bayesian optimum. This discrepancy is a consequence of the fact that different worst-case estimates are chosen to find the acceptance sets and to calculate the loss, as explained further in the “Methods” subsection “Estimating the probability distributions”. These loss values are only used to compare the Bayesian performance given different distributions of measurement settings.

Figure 4d shows the posterior distribution for the optimal case according to the Bayesian approach, i.e. achieving at least 70% level of acceptance with as little expected loss as possible. It turns out that in this case, it is better to use only 11 of the 13 possible measurements and distribute them as [533] among the three settings. The requirement of q_B ≥ 70% leads to the choice of acceptance set ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}=\{2.36,3\}$ leading to a loss of 0.146, which is smaller than the value of 0.149 obtained with the distribution [444]. The corresponding power would be 50.4%. Conversely, as illustrated in Fig. 4c, with [444] only a power of 53.5% with a level of confidence of 71.1% (and a loss of 0.1603) can be reached from the point of view of the Frequentist approach making that choice suboptimal as compared with the best choice presented above. Note, that also here a randomized test would yield better results.

It might seem strange that in both cases a smaller than maximal number of copies is deemed optimal, but this is mainly a consequence of the strong discreteness of the outcome space. Indeed the maximal number of 13 measurements might provide maximal information about the state. However, the ability to harness this information within the given bounds on the levels of confidence and acceptance is restricted by the need to choose the acceptance and rejection sets from a small number of outcomes. It can thus happen that for a smaller number of measurements, the probabilities distribute more favourably such that the relevant quantifiers can reach better values.

Finally, it is important to keep in mind that although we subsume the level of confidence and level of acceptance by the term validity, they quantify different figures of merit which correspond to different practical interests. While the Frequentist is only interested in the average correctness of the conclusion of the certification method, the Bayesian wants to quantify the correctness of the conclusion on a case-by-case basis (and becomes dependent on the prior). Note, for example, that with the distribution [444] although it is much more probable for a separable state to yield a value 1 than for an entangled state (see Fig. 4a, given that result, the Frequentist would still certify entanglement as this decision increases the overall power without violating the significance level. On the other hand, if, as done here, a worst-case optimization of correlations compatible with separability is used, the Frequentist approach requires prior assumptions only to estimate the power (by modelling the space of entangled states) and can provide an assertion of the confidence which is completely independent of all priors.

Discussion

In entanglement certification, it is usually of interest to achieve a sufficiently high validity, i.e. be sufficiently convinced that the certification does not falsely conclude entanglement. Here we showed explicitly that for small data sets it is necessary to consider not only P(Q∣sep), i.e. the probability of a separable state to yield particular values of the witness Q, but also P(Q∣ent) of entangled states. Such proper design of an experiment is necessary if we are to make robust conclusions in the presence of finite statistics and allows us to define and evaluate the validity and efficiency of the method. Only with these two quantifiers, it becomes possible to optimize the usage of the available experimental resources.

We have introduced a universal procedure that allows the extension of any entanglement witness to the domain of finite statistics. It is based on an analytical calculation of probability distributions over possible outcomes when estimating correlation functions experimentally with finite resources. Based on these probability distributions we recalled the measures of validity and efficiency, applicable to any entanglement detection scheme, from the points of view of both Frequentist and Bayesian interpretation. Illustrative examples were provided based on linear as well as nonlinear witnesses and broad families of states.

The methods introduced here are directly applicable to raw data and should be especially helpful for a resource-efficient estimation of the performance of a known apparatus subject to variations of external parameters or in multipartite experiments with rare detection events, e.g., multi-photon setups based on coincidence clicks. For example, our scheme could be employed to quickly certify the quality of a large quantum processor before a time-consuming computation task.

Methods

Certifying entanglement

Linear witnesses W admit the following expansion in terms of the Pauli basis

$${{{\rm{W}}}}=\mathop{\sum}\limits_{j}{\alpha }_{j}{\sigma }_{j}.$$

(17)

The expectation value E_W ≡ 〈W〉 given a particular state ρ, is then a linear function of the correlations T_j given by

$${E}_{{{{\rm{W}}}}}={{{\rm{Tr}}}}({{{\rm{W}}}}\rho )=\mathop{\sum}\limits_{j}{\alpha }_{j}{T}_{j}.$$

(18)

Measuring each correlation and computing the expectation value of the witness function provides information about the separability of the state. Constructions of such functions are tailored to a specific entangled state for which E_W < 0, whereas the expectation value on separable states is never negative. For example, a linear witness detecting entanglement of a three-qubit GHZ state could be chosen as

$${{{\rm{W}}}}={\sigma }_{y}\otimes {\sigma }_{y}\otimes {\sigma }_{x}-{\sigma }_{x}\otimes {\sigma }_{x}\otimes {\sigma }_{x}+1,$$

(19)

where the constant ensures that the witness is non-negative on separable states and negative on the GHZ state $(\left\vert 000\right\rangle +\left\vert 111\right\rangle )/\sqrt{2}$, written in the standard σ_z basis.

Entanglement in state ρ can also be revealed via nonlinear witnesses. As an example we consider the scheme introduced in ref. ¹⁹. Consider the so-called correlation length S defined by

$$S\equiv \mathop{\sum }\limits_{j=1}^{{3}^{N}}{T}_{j}^{2},$$

(20)

which is the sum of all full correlations squared. Accordingly, the sum goes only up to 3^N (the marginal correlations are not included). It has been proven^{17,20,21,22,23,24} that a value of S > 1 certifies entanglement. Since S sums up only non-negative terms, entanglement is verified as soon as a subset of M correlations violates the bound, i.e., S_M > 1. We omit the subscript whenever the number of measured settings is clear.

Witnesses in finite statistics

Since the product of qubit measurement results is either equal to +1 or −1, the probability that in n trials we obtain n₊ products equal to +1 is given by the binomial distribution:

$${{{{\rm{p}}}}}_{i}({n}_{+})=\left(\begin{array}{c}n\\ {n}_{+}\end{array}\right){\left(\frac{1+{T}_{i}}{2}\right)}^{{n}_{+}}{\left(\frac{1-{T}_{i}}{2}\right)}^{{n}_{-}},$$

(21)

where (1 ± T_i)/2 is the probability to obtain the product equal to ±1 in a single trial, estimated from the ideal quantum mechanical prediction. All random variables are assumed to be independent and identically distributed. Using Eq. (8) the correlation τ_i is also binomially distributed as

$${{{\rm{P}}}}({\tau }_{i})={{{{\rm{p}}}}}_{i}\left(\frac{n}{2}(1+{T}_{i})\right).$$

(22)

Note that due to the finiteness of n also τ_i can take on a finite set of values $\{-1,-1+\frac{2}{n},\ldots ,1-\frac{2}{n},1\}$.

In the case of nonlinear witnesses, these correlations have to be further processed. We illustrate this process using the quadratic witness S since it can be easily generalized to any power. In order to write the probability distribution of different values of S we first note that

$${{{\rm{P}}}}({\tau }_{i}^{2})={{{\rm{P}}}}({\tau }_{i})+{{{\rm{P}}}}(-{\tau }_{i})\quad {{{\rm{for}}}}\quad {\tau }_{i}\,\ne \,0,$$

(23)

because both values ± τ_i give the same square. In the special case of τ_i = 0, the probability ${{{\rm{P}}}}({\tau }_{i}^{2})$ is just given by P(τ_i = 0). Since S is defined as the sum of squares, a particular value of S can be realized in many ways, e.g. the value S = 1 can be achieved if any one ${\tau }_{i}^{2}=1$ and all the other squared correlations are zero, or when all ${\tau }_{i}^{2}=1/M$, etc. The same reasoning applies to the linear case. Assuming that all the squared correlations ${\tau }_{i}^{2}$ are independently distributed, the values of S satisfy:

$${{{\rm{P}}}}(S)=\mathop{\sum}\limits_{{\tau }_{1}^{2}+\cdots +{\tau }_{M}^{2}=S}{{{\rm{P}}}}({\tau }_{1}^{2})\cdots {{{\rm{P}}}}({\tau }_{M}^{2}).$$

(24)

These distributions were used in the “Results” section to derive mean values and variances of the introduced quantities.

Estimating the probability distributions

For a simple hypothesis test, determining validity and efficiency requires the calculation of the two likelihoods P(Q∣sep) and P(Q∣ent). In practice, however, the parameter space is more complicated and one has to consider the full subspaces ${{{{\mathcal{H}}}}}_{{{{\rm{sep}}}}}$ and ${{{{\mathcal{H}}}}}_{{{{\rm{ent}}}}}$ of separable and entangled states. These spaces would be modelled via distributions P(ρ_j∣sep) and P(ρ_j∣ent) of separable and entangled states ρ_j and each of those states would give rise to certain likelihood of outcomes P(Q∣ρ_j). Although one option would be to expand the discussion towards so-called composite hypothesis tests, the simple scenario can be recovered by reducing these state spaces to single parameters (‘sep’ and ‘ent’). This is done by considering the average distributions of outcomes given a separable or entangled state with

$${{{\rm{P}}}}(Q| {{{\rm{sep}}}})=\mathop{\sum}\limits_{j}{{{\rm{P}}}}(Q| {\rho }_{j}){{{\rm{P}}}}({\rho }_{j}| {{{\rm{sep}}}}),$$

(25)

$${{{\rm{P}}}}(Q| {{{\rm{ent}}}})=\mathop{\sum}\limits_{j}{{{\rm{P}}}}(Q| {\rho }_{j}){{{\rm{P}}}}({\rho }_{j}| {{{\rm{ent}}}}).$$

(26)

Unfortunately, evaluating such expressions is notoriously daunting²⁸.

In some cases, it is possible to avoid the modelling of the state space and instead consider a worst-case distribution of values Q. For P(Q∣sep) this distribution can either be calculated point-wise for each Q, denoted as G_Q, or for the whole acceptance interval ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$, denoted as G_acc. These distributions have to satisfy

$${\forall }_{\rho \in {{{{\mathcal{H}}}}}_{{{{\rm{sep}}}}}}\,{\forall }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}\quad {{{{\rm{G}}}}}_{Q}(Q)\ge {{{\rm{P}}}}(Q| \rho )\quad \,{{\mbox{and}}}\,$$

(27)

$${\forall }_{\rho \in {{{{\mathcal{H}}}}}_{{{{\rm{sep}}}}}}\,\quad {{{{\rm{G}}}}}_{{{{\rm{acc}}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}})\ge {{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| \rho ),$$

(28)

where G_Q is stronger in the sense that it satisfies the condition for G_acc as well, i.e. ${{{{\rm{G}}}}}_{{{{\rm{acc}}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}})\le {\sum }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{{\rm{G}}}}}_{Q}(Q)$. The strategy to find the upper bounds is based on finding a set of M correlations T_j corresponding to $\tilde{\rho }$ such that ${{{\rm{P}}}}(Q| \tilde{\rho })$ satisfies the criterion in question. Note that $\tilde{\rho }$ might not even be a proper density matrix which is why we refer to such correlations as compatible with separability. Furthermore, in the case of G_Q the value at each Q can correspond to a different $\tilde{\rho }$.

G_acc can be used straightforwardly to choose Frequentist acceptance sets by modifying condition (2) to

$$\mathop{\sum}\limits_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{\rm{P}}}}(Q| {{{\rm{sep}}}})\le \mathop{\sum}\limits_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{{\rm{G}}}}}_{{{{\rm{acc}}}}}(Q| {{{\rm{sep}}}})\le 1-{q}_{{{{\rm{F}}}}}.$$

(29)

If the criterion is satisfied for G_acc it will also be satisfied for the actual distribution P(Q∣sep). Thus, the estimation can only lead to a stricter choice of the acceptance set and ensures that the minimal level of significance is attained.

In the Bayesian case for each outcome Q the minimal validity q_B is assured as in Eq. (4), by providing a lower bound on P(ent∣Q) using G_Q as

$${{{\rm{P}}}}({{{\rm{ent}}}}| Q)=\frac{{{{\rm{P}}}}(Q| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}})}{{{{\rm{P}}}}(Q| {{{\rm{sep}}}}){{{\rm{P}}}}({{{\rm{sep}}}})+{{{\rm{P}}}}(Q| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}})}\ge \frac{{{{\rm{P}}}}(Q| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}})}{{{{{\rm{G}}}}}_{Q}(Q){{{\rm{P}}}}({{{\rm{sep}}}})+{{{\rm{P}}}}(Q| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}})}\ge {q}_{{{{\rm{B}}}}}.$$

(30)

To estimate the expected loss, however, it is sufficient to employ the weaker worst-case estimate G_acc since the scenario is considered a priori, i.e. before a particular outcome Q is known. It is thus sufficient to only maximize the probability in the whole interval ${{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}$, independently of how G_acc(Q) is related to P(Q∣sep) point-wise. The loss (6) is thus estimated as

$$\begin{array}{ll}L&={q}_{{{{\rm{B}}}}}{{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}| {{{\rm{sep}}}}){{{\rm{P}}}}({{{\rm{sep}}}})+(1-{q}_{{{{\rm{B}}}}}){{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}}| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}})\\ &\le {q}_{{{{\rm{B}}}}}{{{{\rm{G}}}}}_{{{{\rm{acc}}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}){{{\rm{P}}}}({{{\rm{sep}}}})+(1-{q}_{{{{\rm{B}}}}}){{{\rm{P}}}}(Q\in {{{{\mathcal{O}}}}}_{{{{\rm{rej}}}}}| {{{\rm{ent}}}}){{{\rm{P}}}}({{{\rm{ent}}}}).\end{array}$$

(31)

Since different worst-case estimates are employed, i.e. G_Q to find the acceptance interval and G_acc to estimate the loss, it is no longer the case that the optimal choice of acceptance intervals minimizes the loss. However, the worst-case estimate of the loss with G_acc is still useful to compare scenarios with different numbers of copies and distributions of copies over settings, for each of which G_Q has been used as a rule to find the acceptance sets.

In analogy to P(Q∣sep) in principle, both the Frequentist and the Bayesian approach could proceed given a lower bound on P(Q∣ent). However, this is completely unfeasible for several reasons. For example, for the non-linear witness S: (i) entangled states can easily yield very small values of S due to statistical fluctuations, (ii) some entangled states have small (or even vanishing) S^∞^29,30,31,32, (iii) there are bad choices of measurement settings, e.g. ones that give S^∞ close to 1. Thus, P(S∣ent) is obtained by modelling a certain state distribution P(ρ_j∣ent), which fits the currently employed apparatus for state generation, as illustrated in Results (Optimizing Entanglement Certification Strategies). Similar problems appear for linear witnesses. Note, that the probability distributions P(ρ_j∣sep) and P(ρ_j∣ent) do not describe a statistical uncertainty of the form encountered in mixed states. Instead, they capture prior expectations as to what separable or entangled states are to be submitted to the certification procedure. In other words, due to nonlinearity of the function P(Q∣ρ_j) it does not hold that ${{{\rm{P}}}}(Q| {{{\rm{sep}}}})={{{\rm{P}}}}(Q| {\bar{\rho }}_{{{{\rm{sep}}}}})$ with a mean separable state ${\bar{\rho }}_{{{{\rm{sep}}}}}={\sum }_{j}{\rho }_{j}{{{\rm{P}}}}({\rho }_{j}| {{{\rm{sep}}}})$ and similarly for P(Q∣ent).

Correlations compatible with separability

We have written an optimization algorithm in order to find the set of correlation functions that maximize the probabilities in the acceptance sets while being compatible with separability, i.e. the correlations do not violate the witness in the limit of infinite sample size. It verifies that for equal distribution of copies over the settings, i.e. each n_i = n for each i = 1, …, M, and sufficiently high bound on the linear witness ${{{\mathcal{C}}}}$ (${{{\mathcal{B}}}}$ for the non-linear witness) all of the correlations can be taken as T_i = 1/M (${T}_{i}^{2}=1/M$). The algorithm can be also applied to a model probability distribution over Q for separable states and arbitrary distributions of copies as well as more complicated disjoint acceptance sets (see the “Results” subsection “Optimizing entanglement certification strategies”).

The basic algorithm works as follows. We compute P(Q) and the cumulative probability function. They are parameterized with to-be-determined correlations {T₁, T₂, . . . , T_M} as well as the bound defining the acceptance set. The heart of the computation is the numerical maximization of ${\sum }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{\rm{P}}}}(Q)({{{\rm{P}}}}(Q){\forall }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}})$, for the fixed value of bound, with respect to {T₁, T₂, . . . , T_M}. We constrain the set of possible correlations to those that yield W^∞ ≥ 0 for the linear and S^∞ ≤ 1 for the non-linear witness, with each T_i ∈ [ − 1, 1]. Two widely available out-of-the-box methods were used for finding the maximum: simulated annealing and Nelder-Mead algorithm^33,34. Both gave similar results with the Nelder–Mead method being more precise in the considered problem. The output of the algorithm is a list of {T₁, T₂, . . . , T_M} that maximizes ${\sum }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}}{{{\rm{P}}}}(Q)({{{\rm{P}}}}(Q){\forall }_{Q\in {{{{\mathcal{O}}}}}_{{{{\rm{acc}}}}}})$ for the given bound, or any other acceptance set, under the stated constraints.

Effects of different types of noise

We analyse here the effects due to different forms of noise in Eq. (15) on the efficiency of the entanglement certification. In particular, we consider the following unbiased noises

$${\rho }_{{{{\rm{wn}}}}}={\mathbb{1}}/{2}^{N},\quad \,{{\mbox{white noise}}}\,$$

(32)

$${\rho }_{{{{\rm{rp}}}}}=\left\vert {\psi }_{{{{\rm{r}}}}}\right\rangle \left\langle {\psi }_{{{{\rm{r}}}}}\right\vert ,\quad \,{{\mbox{Haar random pure state noise}}}\,$$

(33)

$${\rho }_{{{{\rm{wn}}}}}={\rho }_{{{{\rm{r}}}}},\quad \,{{\mbox{random mixed states}}}\,,$$

(34)

where Haar randomness refers to the sampling of states according to the rotationally invariant measure. Data for the random states was computed numerically by scanning p in the range [0.5, 1] with steps of 0.01, sampling 1000 random four-qubit states for each p, choosing the five biggest correlations, computing ${S}_{5}^{\infty }$ and finally the probability of S = 5 in finite statistics. In this way, practically all the values of ${S}_{5}^{\infty }$ were scanned. These calculations have shown that the probability of success does not depend on the kind of noise and it is described solely by the value of p. The probability of success P(S = 5∣ρ) for a continuous choice of p is presented in Fig. 5.

**Fig. 5: The efficiency of entanglement detection for different noises ρ_n and their admixtures p in Eq. (15).**

Data availability

The data supporting figures are available at https://doi.org/10.48550/arXiv.2208.01983.

References

Horodecki, M., Horodecki, P. & Horodecki, R. Separability of mixed states: necessary and sufficient conditions. Phys. Lett. A 223, 1 (1996).
Article MathSciNet ADS Google Scholar
Horodecki, R., Horodecki, P., Horodecki, M. & Horodecki, K. Quantum entanglement. Rev. Mod. Phys. 81, 865 (2009).
Article MathSciNet ADS Google Scholar
Gühne, O. & Tóth, G. Entanglement detection. Phys. Rep. 474, 1–75 (2009).
Article MathSciNet ADS Google Scholar
Arrazola, J. M. et al. Reliable entanglement verification. Phys. Rev. A 87, 062331 (2013).
Article ADS Google Scholar
Blume-Kohout, R., Yin, J. O. S. & van Enk, S. J. Entanglement verification with finite data. Phys. Rev. Lett. 105, 170501 (2010).
Article ADS Google Scholar
Zhu, H. & Hayashi, M. Efficient verification of pure quantum states in the adversarial scenario. Phys. Rev. Lett. 123, 260504 (2019).
Article ADS Google Scholar
Wang, K. & Hayashi, M. Optimal verification of two-qubit pure states. Phys. Rev. A 100, 032315 (2019).
Article ADS Google Scholar
Zhang, W.-H. et al. Experimental optimal verification of entangled states using local measurements. Phys. Rev. Lett. 125, 030506 (2020).
Article ADS Google Scholar
Dimić, A. & Dakić, B. Single-copy entanglement detection. npj Quantum Inf. 4, 11 (2018).
Article ADS Google Scholar
Saggio, V. et al. Experimental few-copy multipartite entanglement detection. Nat. Phys. 15, 935 (2019).
Article Google Scholar
Morris, J., Saggio, V., Gočanin, A. & Dakić, B. Quantum verification and estimation with few copies. Adv. Quant. Tech. 5, 2100118 (2022).
Article Google Scholar
Saggio, V. & Walther, P. Few-copy entanglement detection in the presence of noise. Ann. Phys. 534, 2100597 (2022).
Article MathSciNet Google Scholar
Lehmann, E. L. & Romano, J. P. Testing Statistical Hypotheses (Springer, Cham, 2005).
Robert, C. P. The Bayesian Choice (Springer, New York, NY, 2013).
Berger, J. O. Statistical Decision Theory and Bayesian Analysis (Springer, New York, NY, 2010).
Jungnitsch, B. et al. Increasing the statistical significance of entanglement detection in experiments. Phys. Rev. Lett. 104, 210401 (2010).
Article ADS Google Scholar
Tran, M. C., Dakić, B., Arnault, F., Laskowski, W. & Paterek, T. Quantum entanglement from random measurements. Phys. Rev. A 92, 050301R (2015).
Article MathSciNet ADS Google Scholar
Knips, L. et al. How long does it take to obtain a physical density matrix? arXiv:1512.06866 (2015).
Badziag, P., Brukner, C., Laskowski, W., Paterek, T. & Żukowski, M. Experimentally friendly geometrical criteria for entanglement. Phys. Rev. Lett. 100, 140403 (2008).
Article ADS Google Scholar
Hassan, A. S. M. & Joag, P. S. Separability criterion for multipartite quantum states based on the Bloch representation of density matrices. Quantum Inf. Comp. 8, 773 (2008).
MathSciNet Google Scholar
Hassan, A. S. M. & Joag, P. S. Experimentally accessible geometric measure for entanglement in n-qubit pure states. Phys. Rev. A 77, 062334 (2008).
Article ADS Google Scholar
Hassan, A. S. M. & Joag, P. S. Geometric measure for entanglement in n-qudit pure states. Phys. Rev. A 80, 042302 (2009).
Article MathSciNet ADS Google Scholar
Tran, M. C., Dakić, B., Laskowski, W. & Paterek, T. Correlations between outcomes of random measurements. Phys. Rev. A 94, 042302 (2016).
Article ADS Google Scholar
Knips, L., Schwemmer, C., Klein, N., Wieśniak, M. & Weinfurter, H. Multipartite entanglement detection with minimal effort. Phy. Rev. Lett. 117, 210504 (2016).
Article ADS Google Scholar
Knips, L. et al. Multipartite entanglement analysis from random correlations. npj Quantum Inf. 6, 51 (2020).
Article ADS Google Scholar
Flammia, S. T. & Liu, Y.-K. Direct fidelity estimation from few Pauli measurements. Phys. Rev. Lett. 106, 230501 (2011).
Article ADS Google Scholar
Pittenger, A. O. & Rubin, M. H. Geometry of entanglement witnesses and local detection of entanglement. Phys. Rev. A 67, 012327 (2003).
Article ADS Google Scholar
Li, W., Han, R., Shang, J., Ng, H. K. & Englert, B.-G. Sequentially constrained Monte Carlo sampler for quantum states. Preprint at https://arxiv.org/abs/2109.14215 (2021).
Kaszlikowski, D., De, A. S., Sen, U., Vedral, V. & Winter, A. Quantum correlations without classical correlations. Phys. Rev. Lett. 101, 070502 (2008).
Article ADS Google Scholar
Laskowski, W., Markiewicz, M., Paterek, T. & Wieśniak, M. Incompatible local hidden-variable models of quantum correlations. Phys. Rev. A 86, 032105 (2012).
Article ADS Google Scholar
Schwemmer, C. et al. Genuine multipartite entanglement without multipartite correlations. Phys. Rev. Lett. 114, 180501 (2015).
Article ADS Google Scholar
Tran, M. C. et al. Genuine n-partite entanglement without n-partite correlation functions. Phys. Rev. A 95, 062331 (2017).
Article ADS Google Scholar
Inc., W. R. Mathematica, Version 13.1 (W.R. Inc., Champaign, IL, 2022).
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
Article Google Scholar

Download references

Acknowledgements

We thank Borivoje Dakić for useful discussions. This research was supported by the DFG (Germany) and NCN (Poland) within the joint funding initiative ‘Beethoven2’ (2016/23/G/ST2/04273, 381445721), by the DFG under Germany’s Excellence Strategy EXC-2111-390814868 and by the BMBF project QuKuK (16KIS1621). P.C. is supported by the National Science Centre (NCN, Poland) within the Preludium Bis project (Grant No. 2021/43/O/ST2/02679). W.L. acknowledges partial support from the Foundation for Polish Science (IRAP project ICTQT, Contract No. 2018/MAB/5, co-financed by EU via the Smart Growth Operational Programme). T.P. is supported by the Polish National Agency for Academic Exchange NAWA Project No. PPN/PPO/2018/1/00007/U/00001. J.D. acknowledges support from the Ph.D. programme IMPRS-QST.

Author information

Authors and Affiliations

Institute of Theoretical Physics and Astrophysics, Faculty of Mathematics, Physics, and Informatics, University of Gdańsk, 80-308, Gdańsk, Poland
Paweł Cieśliński, Waldemar Kłobus, Tomasz Paterek, Harald Weinfurter & Wiesław Laskowski
Max Planck Institute for Quantum Optics, 85748, Garching, Germany
Jan Dziewior, Lukas Knips, Jasmin Meinecke & Harald Weinfurter
Faculty of Physics, Ludwig Maximilian University, 80799, Munich, Germany
Jan Dziewior, Lukas Knips, Jasmin Meinecke & Harald Weinfurter
Munich Center for Quantum Science and Technology, 80799, Munich, Germany
Jan Dziewior, Lukas Knips, Jasmin Meinecke & Harald Weinfurter
School of Mathematics and Physics, Xiamen University Malaysia, 43900, Sepang, Malaysia
Tomasz Paterek
International Centre for Theory of Quantum Technologies, University of Gdańsk, 80-308, Gdańsk, Poland
Wiesław Laskowski

Authors

Paweł Cieśliński
View author publications
You can also search for this author in PubMed Google Scholar
Jan Dziewior
View author publications
You can also search for this author in PubMed Google Scholar
Lukas Knips
View author publications
You can also search for this author in PubMed Google Scholar
Waldemar Kłobus
View author publications
You can also search for this author in PubMed Google Scholar
Jasmin Meinecke
View author publications
You can also search for this author in PubMed Google Scholar
Tomasz Paterek
View author publications
You can also search for this author in PubMed Google Scholar
Harald Weinfurter
View author publications
You can also search for this author in PubMed Google Scholar
Wiesław Laskowski
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

P.C., W.L. and L.K. started the project. P.C. and L.K. formulated the finite statistics model for entanglement witnesses. J.D. analysed and performed a careful statistical analysis, further developed by J.D., P.C., L.K. and W.K. P.C., J.D., L.K. and J.M. prepared detailed examples and perfected the resource optimization strategy. W.L., T.P. and H.W. supervised the project. All authors contributed to analysing the results, as well as writing the manuscript.

Corresponding authors

Correspondence to Paweł Cieśliński or Wiesław Laskowski.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Cieśliński, P., Dziewior, J., Knips, L. et al. Valid and efficient entanglement verification with finite copies of a quantum state. npj Quantum Inf 10, 14 (2024). https://doi.org/10.1038/s41534-024-00810-3

Download citation

Received: 20 July 2023
Accepted: 11 January 2024
Published: 24 January 2024
DOI: https://doi.org/10.1038/s41534-024-00810-3

Valid and efficient entanglement verification with finite copies of a quantum state

Subjects

Abstract

Similar content being viewed by others

Mirrored entanglement witnesses

Experimental few-copy multipartite entanglement detection

Quantifying entanglement in a 68-billion-dimensional quantum state space

Introduction

Results

Entanglement certification as hypothesis testing

Frequentist approach

Bayesian approach

Effects of finite statistics

Limited significance of entanglement certification

Resource optimization

Optimizing entanglement certification strategies

Linear witness

Quadratic witness

Discussion

Methods

Certifying entanglement

Witnesses in finite statistics

Estimating the probability distributions

Correlations compatible with separability

Effects of different types of noise

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Rights and permissions

About this article

Cite this article

Search

Quick links

Subjects

Abstract

Similar content being viewed by others

Mirrored entanglement witnesses

Experimental few-copy multipartite entanglement detection

Quantifying entanglement in a 68-billion-dimensional quantum state space

Introduction

Results

Entanglement certification as hypothesis testing

Frequentist approach

Bayesian approach

Effects of finite statistics

Limited significance of entanglement certification

Resource optimization

Optimizing entanglement certification strategies

Linear witness

Quadratic witness

Discussion

Methods

Certifying entanglement

Witnesses in finite statistics

Estimating the probability distributions

Correlations compatible with separability

Effects of different types of noise

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links