Multi-party quantum fingerprinting with weak coherent pulses: circuit design and protocol analysis

Quantum communication has been leading the way of many remarkable theoretical results and experimental tests in physics. In this context, quantum communication complexity (QCC) has recently drawn earnest research attention as a tool to optimize the amounts of transmitted qubits and energy that are required to implement distributed computational tasks. On this matter, we introduce a novel multi-user quantum fingerprinting (QF) protocol that is ready to be implemented with existing technology. Particularly, we extend to the multi-user framework a well-known two-user coherent-state fingerprinting scheme. This generalization is highly non-trivial for a twofold reason, as it requires not only to extend the set of protocol rules but also to specify a procedure for designing the optical devices intended for the generalized protocol. Much of the importance of our work arises from the fact that the obtained QCC figures of merit allow direct comparison with the best-known classical multi-user fingerprinting protocol, of significance in the field of computer technologies and networking. Furthermore, as one of the main contributions of the manuscript, we deduce innovative analytical upper bounds on the amount of transmitted quantum information that are even valid in the two-user protocol as a particular case. These original analytical bounds are of interest for estimating the realistic protocol performance prior to experimental realizations. Ultimately, comparative results are provided to contrast different protocol implementation strategies and, importantly, to show that, under realistic circumstances, the multi-user protocol can achieve tasks that are impossible by using classical communication alone. Our work provides relevant contributions towards understanding the nature and the limitations of QF and, on a broader scope, also the limitations and possibilities of quantum-communication networks embracing a node that is accessed by multiple users at the same time.

Obviously, the two users Alice and Bob together with the referee can always trivially achieve the goal of computing (1) by communicating to the referee the entire N -bit inputs x and y. However, if they choose instead to send fingerprints F (x) and F (y) of the original inputs x and y, they can always succeed with a sought reduced communication cost when an arbitrarily small probability of error is tolerated. We remark that, according to our notation, F (x) (and analogously F (y)) may refer to both classical or quantum cases. In a classical protocol, fingerprint F (x) consists of a bit string shorter than x that may be computed as a hash function. Conversely, in the quantum case, F (x) represents a bit string longer than x that is then encoded and transmitted as qubits in the form of quantum states, which are ultimately quantum-processed by the referee. In particular, optimal classical fingerprinting protocols are known to require fingerprints of at least Ω( √ N ) bits [26,27], which is a fundamental lower bound. By sending quantum states, in comparison, Alice and Bob may, under certain conditions, require fingerprints of just O(log 2 N ) qubits to solve the same problem subject to an identical probability of error as in the classical protocol, which represents an exponential reduction [12,15]. The direct comparison between bits and qubits is fully justified by virtue of Holevo's theorem [30], which establishes that M -bit classical messages cannot be encoded into, and then decoded from, quantum messages comprising less than M qubits.
Besides the achievable quantum supremacy in the field of communication complexity, research in QF was also sparked by some other relevant attainments that quantum fingerprints can bestow but are beyond the scope of this document. In particular, QF was also applied to construct a theoretic quantum automaton with an exponential improvement in size when compared to a classical randomized automaton [31,32]. Another application consists of utilizing QF as a proposed cryptographic hash function [33]. Finally, as an eventual application already noted in [15], QF may also play a pivotal role improving certain extant schemes for quantum digital signatures in [34,35].
The first successful experimental demonstrations of QF protocols were reported in [21][22][23] and they consisted of distributed implementations of the equality problem (1). A downside of all these initial 1 if x y f x y x y   =  =   ≠   Fig. 1: Schematic illustration of a general two-user fingerprinting protocol. The represented scheme is applicable to both classical and quantum protocols. Alice and Bob receive, or already possess in their custody, raw classical binary inputs x and y, respectively, comprising messages of N bits. They apply a certain mathematical function F (·) to their respective bit strings x and y. The binary outputs of this function represent M -bit fingerprints that both users encode as quantum or classical information. The referee receives the incoming signals from Alice and Bob and, after applying a certain classical or quantum procedure, concludes whether the original N -bit sequences coincide or differ. Fig. 1: Schematic illustration of a general two-user fingerprinting protocol. The represented scheme is applicable to both classical and quantum protocols. Alice and Bob receive, or already possess in their custody, raw classical binary inputs x and y, respectively, comprising messages of N bits. They apply a certain mathematical function F (·) to their respective bit strings x and y. The binary outputs of this function represent M -bit fingerprints that both users encode as quantum or classical information. The referee receives the incoming signals from Alice and Bob and, after applying a certain classical or quantum procedure, concludes whether the original N -bit sequences coincide or differ.
experimental efforts lies in the fact that their fingerprint states must be extremely entangled, even when the input size N is small. These experimental demands greatly surpass those that are achievable with current technology, except when restricting the transmitted information to a few qubits per user. Therefore, their current practical interest is very limited. A different approach for implementing quantum fingerprinting was proposed in [24,25]. However, these other theoretical approaches demand the preparation of quantum states of fixed photon number, which is still a challenging task from the experimental point of view [36]. Recently, another innovative theoretical proposal for a QF protocol that is suitable to be implemented with present-day technology without requiring entanglement was published in [15]. In this avant-garde protocol, Alice and Bob send coherent states of low amplitude that the referee interferes in a balanced beamsplitter. On the basis of this protocol, a pioneering proof-of-principle implementation that needs to send less information than the best-known classical protocol [26] was reported in [16]. However, this experiment in [16] employs an improved referee strategy tacitly accompanied by numerical techniques for its analysis, instead of the original analytical method in [15]. Finally, a recent further enhanced version of the experiment in [16] was detailed in [17]. This enhanced experimental setup makes use of ultralow-noise SNSPDs (superconducting nanowire single-photon detectors) and it beats not only the best-known classical protocol but also the classical theoretical limit discussed in [26,27]. As our main contribution, the present work is focused on extending the two-user coherent-state QF protocol in [15] to multiple users (K ≥ 2), including the essence of all the aforementioned enhancements advocated in [16,17]. In particular, our K-party proposal retains the analytical character of the two-party methods in [15] while also preserving the benefits of the improved referee's rules in [16,17]. Just for the sake of clarity, in the general K-party framework for the equality problem, each user receives a binary input sequence x k , with 1 ≤ k ≤ K. They then send to the referee node their respective fingerprints F (x k ) encoded as either classical or quantum information. The referee's task is to determine if all the original K inputs x k are the same or not, as sketched in figure 2. Our proposed extension of coherent-state QF for more than two users has many noteworthy implications of intrinsic importance in the study of quantum networks. First, the communication cost analysis of our extended QF protocol can be directly compared to results that exist for an analogous classical K-user protocol [37], which are of pragmatic interest in the field of distributed computational algorithms involving multiple users. Either in the classical or in the quantum case, the main goal ultimately consists of minimizing the required energy expenditures, which are related to the amount of transmitted information. Furthermore, our work represents a contribution towards the comprehension of multi-user quantum networks, in similar fashion as other unrelated recent proposals such as [38,39], which are committed to introduce multipartite QKD (quantum key distribution) protocols that entail a central network node. Finally, since it is accepted that QCC is connected to some foundational aspects of quantum mechanics [14,15], our work may also have an impact in helping to expand the knowledge related to some underlying physical phenomena present in the quantum world. This fundamental knowledge includes, as an example, the per-user information-carrying capacity of a quantum channel, and the relationship between multipartite entanglement and nonorthogonality.  Fig. 2: Schematic of a generalized fingerprinting protocol consisting of K users, or parties. Every user receives, or already owns, a raw N -bit sequence x k , with 1 ≤ k ≤ K. Afterwards, each party transmits signals that encode M -bit fingerprints F (x k ), as either classical or quantum information. By processing the incoming signals, the referee establishes whether the original binary strings x k are coincident to each other, or whether at least one of them is different from the rest of strings.
Before concluding this introduction, we present the structure of the document's remainder, and, at the same time, we also introduce some other accompanying prominent contributions of this work. The rest of the document starts briefly describing in section 2 the groundwork basics of coherent-state quantum fingerprinting. Most of these preliminaries are essential to become acquainted with notation and concepts used later in the subsequent development of our QF extension. Section 3 provides various linear-optics innovative generalizations of the ordinary 50:50 beamsplitter concept that thus far was used for the twouser protocols in previous works [15][16][17]. Section 4 is devoted to the analysis of our multi-user quantum fingerprinting protocol, presenting various suitable referee strategies and making use of the new generalized circuit designs exposed in the preceding section. In this section 4, we also introduce original analytical upper bounds on the amount of transmitted quantum information. Resembling the two-user bounds in [15], but unlike the numerical methods employed in [16,17], our novel bounding method is entirely analytical in nature. This fact allows us to compute with a low computational cost upper bounds on the amounts, per user, of both transmitted qubits and energy. Remarkably, these new analytical bounds may be applied, as a particular case, to the conventional two-user setup in [16,17]. Further, another relevant feature is that they can be easily used in an experimental setting, just by taking a few preliminary measurements in the classical optical regime. Next, section 5 compares our multi-user protocol with the best-known analogous classical protocol described in [37], and with the classical limit deduced in this manuscript's appendices. Importantly, we assess in this section the protocol resilience against experimental errors, and prove that the quantum protocol can beat, under certain circumstances, the best-known classical protocol and the classical limit. In closing the regular part of the manuscript, the last section is committed to present main conclusions and future work perspectives. In addition, the paper contains three appendices as well. A detailed list of symbols used in the paper is exposed in appendix A. Regarding appendix B, it includes exhaustive mathematical derivations of all the upper bounds in section 4. Finally, the multi-party classical limit used in section 5 is derived in appendix C.

Fundamentals of coherent-state quantum fingerprinting
Besides introducing relevant notation, this section contains an abridged description, including some novel explanatory contributions, of the two-user coherent-state protocol proposed in [15], which was later adapted with improvements for the experimental deployments in [16,17]. Hereof, as a requirement inherited from all the other preceding QF protocols in [12,[21][22][23][24][25], this protocol demands the users to apply an error-correcting code (ECC) for the sole purpose of amplifying the Hamming distance between the input bit strings of Alice and Bob 1 . An ECC can be mathematically modelled as a function E : {0, 1} N → {0, 1} M such that 1 For example, if the original inputs x and y perfectly match except for one single bit, then the ECC will output binary strings that differ in a much larger number of bits. Unlike traditional applications of ECC in digital communication systems, only the Fig. 2: Schematic of a generalized fingerprinting protocol consisting of K users, or parties. Every user receives, or already owns, a raw N -bit sequence x k , with 1 ≤ k ≤ K. Afterwards, each party transmits signals that encode M -bit fingerprints F (x k ), as either classical or quantum information. By processing the incoming signals, the referee establishes whether the original binary strings x k are coincident to each other, or whether at least one of them is different from the rest of strings.
Before concluding this introduction, we present the structure of the document's remainder, and, at the same time, we also introduce some other accompanying prominent contributions of this work. The rest of the document starts briefly describing in section 2 the groundwork basics of coherent-state quantum fingerprinting. Most of these preliminaries are essential to become acquainted with notation and concepts used later in the subsequent development of our QF extension. Section 3 provides various linear-optics innovative generalizations of the ordinary 50:50 beamsplitter concept that thus far was used for the twouser protocols in previous works [15][16][17]. Section 4 is devoted to the analysis of our multi-user quantum fingerprinting protocol, presenting various suitable referee strategies and making use of the new generalized circuit designs exposed in the preceding section. In this section 4, we also introduce original analytical upper bounds on the amount of transmitted quantum information. Resembling the two-user bounds in [15], but unlike the numerical methods employed in [16,17], our novel bounding method is entirely analytical in nature. This fact allows us to compute with a low computational cost upper bounds on the amounts, per user, of both transmitted qubits and energy. Remarkably, these new analytical bounds may be applied, as a particular case, to the conventional two-user setup in [16,17]. Further, another relevant feature is that they can be easily used in an experimental setting, just by taking a few preliminary measurements in the classical optical regime. Next, section 5 compares our multi-user protocol with the best-known analogous classical protocol described in [37], and with the classical limit deduced in this manuscript's appendices. Importantly, we assess in this section the protocol resilience against experimental errors, and prove that the quantum protocol can beat, under certain circumstances, the best-known classical protocol and the classical limit. In closing the regular part of the manuscript, the last section is committed to present main conclusions and future work perspectives. In addition, the paper contains three appendices as well. A detailed list of symbols used in the paper is exposed in appendix A. Regarding appendix B, it includes exhaustive mathematical derivations of all the upper bounds in section 4. Finally, the multi-party classical limit used in section 5 is derived in appendix C.

Fundamentals of coherent-state quantum fingerprinting
Besides introducing relevant notation, this section contains an abridged description, including some novel explanatory contributions, of the two-user coherent-state protocol proposed in [15], which was later adapted with improvements for the experimental deployments in [16,17]. Hereof, as a requirement inherited from all the other preceding QF protocols in [12,[21][22][23][24][25], this protocol demands the users to apply an error-correcting code (ECC) for the sole purpose of amplifying the Hamming distance between the input bit strings of Alice and Bob 1 . An ECC can be mathematically modelled as a function E : is the so-called codeword associated with Alice's input x. Ultimately, Alice encodes E(x) as quantum information that she transmits through her channel; the description on Bob's side is analogous. The ratio between the lengths of E(x) and x is called the rate of the ECC and it is defined in this work as c = M N > 1. Another important parameter of the ECC is the minimum Hamming distance between any two different codewords. Related to this ECC distance, we define an ECC parameter δ that designates the maximum fraction of bits in which any two codewords E(x) and E(y), satisfying the requirement E(x) = E(y), have the same bit values. As a consequence, the minimum distance of the ECC can be simply computed using δ as (1 − δ)M . Without any loss of generality, we assume, just as in all the previous works on the subject, that an entire N -bit input x can always be mapped by the ECC into an M -bit codeword E(x). If this were not the case, the same statistical behaviour studied in this paper could be reproduced by slicing the input bit strings into smaller blocks.
The coherent-state QF protocol, as noted above, overcomes all the implementation issues present in earlier QF proposals and it makes quantum fingerprinting practical with current technology. In lieu of requiring either entangled states or a fixed number of photons, Alice and Bob each send a so-named "coherent state in the fingerprint mode" [15]. This coherent fingerprint state can be rigorously defined for Alice as The total mean photon number corresponding to the entire train of pulses is µ = |α| 2 , whereas the mean photon number per each individual pulse in the sequence is µ pulse = µ M . It is worth noting that all the coherent states that form together the fingerprint state (2) have the same amplitude, but their individual phases that encode the information are determined by the specific binary codeword E(x), which itself depends on the particular raw input string x.
In the two-user coherent-state QF protocol, the referee must rely on a quantum measurement to verify if the phases of pairs of arriving pulses are equal or different. A simple practical way of implementing such a measurement involves a standard balanced beamsplitter wherein the incoming individual pulses interfere as depicted on figure 3. In the ideal case, whenever a click is recorded on the output detectors, the referee unambiguously knows whether the phases in a pair of incoming pulses are the same or not. It must not escape our notice that, in order to produce a correct interference at the referee's beamsplitter, Alice and Bob need a certain method for establishing a common phase reference. This phase reference may be established before starting the protocol itself or, alternatively, the referee can incorporate phase-locking techniques into her setup. In fact, this latter alternative may be implemented by exploiting different practical methods already developed in [40][41][42] within the mature field of quantum key distribution (QKD).
For the sake of an easy generalization to multi-user instances in this document, we label the two-user protocol detectors as "1" and "2". By convention, a click in detector "1" reveals lack of relative parity in the two phases, whereas a click in detector "2" indicates that the two phases are coincident. In this manner, we can now summarize the basic coherent-state QF protocol steps in an ideal implementation: (i) Alice and Bob agree to use a common ECC and a common value of α.
(ii) They prepare coherent fingerprint states |α x and |α y using their respective input sequences x and y according to the quantum state in (2).
(iii) Both parties send these pulse trains to the referee through their respective quantum channels.
(iv) The referee interferes the individual pulses using a standard 50:50 beamsplitter and she announces that the original inputs x and y are different, i.e. f (x, y) = 1, if and only if at least one click is observed encoding part of the ECC implementation is used in quantum fingerprinting protocols, not the decoding part. Fig. 3: Illustration of a two-user coherent-state quantum fingerprinting protocol. Alice and Bob each send a sequence of M weak coherent pulses whose phases (0 or π) are modulated according to codewords of an error correcting code (ECC). These codewords are determined depending on the raw binary inputs Alice and Bob want to fingerprint. The incoming individual pulses interfere in a standard 50:50 beamsplitter (BS) located at the referee's circuit. By observing at least one of the two detectors and counting its clicks, the referee infers if the two complete trains of coherent states from Alice and Bob are either the same or different. In an ideal implementation, one of the detectors may only fire if two incoming individual pulses are equal, and the other detector may only fire if two incoming individual pulses are different.
in detector "1". As is apparent from this last statement, it suffices for the referee to observe just one detector when the analysis is constrained to an ideal (defectless) implementation 2 .
In the absence of experimental imperfections, such as a flawed beamsplitter or dark counts in the detectors, the referee always announces the correct outcome f (x, y) = 0 with certainty, whenever the original inputs of Alice and Bob are equal, i.e. x = y. This errorless referee behaviour for the case x = y is due to the fact that the only possible detector responses are either clicks in detector "2" only or no clicks at all in both detectors. For the other case x ̸ = y, error probability p error is the same as the probability of obtaining no clicks in detector "1". In particular, after the individual pulses interfere in the ideal referee beamsplitter, independently of their relative phases, there will always be a coherent state ± √ 2 α √ M ⟩ going into one detector and the vacuum entering the other detector. The click probability is calculated from the Poissonian statistics of the coherent states as p click = 1 − exp . Accordingly, the worst-case error probability is simply p error = (1 − p click ) (1−δ)M , because the minimum amount of pulses that may potentially produce clicks in detector "1" is (1 − δ)M , as dictated by the distance of the ECC. Introducing into this last equality the above expression of p click and then solving for |α| 2 , we obtain This equation in (3) provides the minimum mean photon number of each entire train of pulses that is needed to get a desired error level in the referee outcomes, in the ideal case. Notice that this minimum value of the total mean photon number |α| 2 depends exclusively on the error probability and on the chosen ECC.
Under the ideal premises considered so far, for fixed p error and δ, mean photon number |α| 2 remains constant regardless of the raw message length N of Alice's and Bob's binary inputs. Also, we bring into attention that no mathematical approximations were invoked in this work in order to obtain (3).
Up to this point, we have considered an ideal scenario only; however, any practical QF implementation will inevitably be affected by experimental imperfections. These imperfections render unusable, or at least highly impractical, the decision rule presented above. This is so because detector "1" may fire even if the individual input pulses at the beamsplitter are equal. Nonetheless, in case of small imperfections, we may expect the total number of clicks registered in "1" when x ̸ = y to be much larger than when x = y. Similarly, we presume the opposite behaviour for detector "2", regarding the number of clicks for the respective cases x ̸ = y and x = y. Exploiting these statistical behaviours in the detectors, different decision rules can be contrived to make the QF protocol robust to experimental errors. Particularly, the rule proposed in [15] is based on calculating a fraction of clicks f 2 = D2 D1+D2 , where D k is the total number of clicks in detector Fig. 3: Illustration of a two-user coherent-state quantum fingerprinting protocol. Alice and Bob each send a sequence of M weak coherent pulses whose phases (0 or π) are modulated according to codewords of an error correcting code (ECC). These codewords are determined depending on the raw binary inputs Alice and Bob want to fingerprint. The incoming individual pulses interfere in a standard 50:50 beamsplitter (BS) located at the referee's circuit. By observing at least one of the two detectors and counting its clicks, the referee infers if the two complete trains of coherent states from Alice and Bob are either the same or different. In an ideal implementation, one of the detectors may only fire if two incoming individual pulses are equal, and the other detector may only fire if two incoming individual pulses are different.
in detector "1". As is apparent from this last statement, it suffices for the referee to observe just one detector when the analysis is constrained to an ideal (defectless) implementation 2 .
In the absence of experimental imperfections, such as a flawed beamsplitter or dark counts in the detectors, the referee always announces the correct outcome f (x, y) = 0 with certainty, whenever the original inputs of Alice and Bob are equal, i.e. x = y. This errorless referee behaviour for the case x = y is due to the fact that the only possible detector responses are either clicks in detector "2" only or no clicks at all in both detectors. For the other case x = y, error probability p error is the same as the probability of obtaining no clicks in detector "1". In particular, after the individual pulses interfere in the ideal referee beamsplitter, independently of their relative phases, there will always be a coherent state ± √ 2 α √ M going into one detector and the vacuum entering the other detector. The click probability is calculated from the Poissonian statistics of the coherent states as p click = 1 − exp −2 |α| 2 M . Accordingly, the worst-case error probability is simply p error = (1 − p click ) (1−δ)M , because the minimum amount of pulses that may potentially produce clicks in detector "1" is (1 − δ)M , as dictated by the distance of the ECC. Introducing into this last equality the above expression of p click and then solving for |α| 2 , we obtain This equation in (3) provides the minimum mean photon number of each entire train of pulses that is needed to get a desired error level in the referee outcomes, in the ideal case. Notice that this minimum value of the total mean photon number |α| 2 depends exclusively on the error probability and on the chosen ECC.
Under the ideal premises considered so far, for fixed p error and δ, mean photon number |α| 2 remains constant regardless of the raw message length N of Alice's and Bob's binary inputs. Also, we bring into attention that no mathematical approximations were invoked in this work in order to obtain (3).
Up to this point, we have considered an ideal scenario only; however, any practical QF implementation will inevitably be affected by experimental imperfections. These imperfections render unusable, or at least highly impractical, the decision rule presented above. This is so because detector "1" may fire even if the individual input pulses at the beamsplitter are equal. Nonetheless, in case of small imperfections, we may expect the total number of clicks registered in "1" when x = y to be much larger than when x = y. Similarly, we presume the opposite behaviour for detector "2", regarding the number of clicks for the respective cases x = y and x = y. Exploiting these statistical behaviours in the detectors, different decision rules can be contrived to make the QF protocol robust to experimental errors. Particularly, the rule proposed in [15] is based on calculating a fraction of clicks f 2 = D2 D1+D2 , where D k is the total number of clicks in detector k = 1, 2 for the entire quantum pulses from Alice and Bob. Applying Hoeffding's inequality [43] under this rule, an analytical expression analogous to (3), but including the effects of imperfections, was deduced in [15]. Though valid, this referee strategy in [15] was found in [16] to be extremely sensitive to small variations in the parameters that quantify the errors caused by imperfections, which hinders its experimental applicability. The improved referee strategy proposed in [16] for coping with experimental errors takes into consideration the amount D 1 only; no fraction of clicks is needed, unlike the other rule mentioned above. This referee strategy consists on using a threshold value r such that outcome f (x, y) = 1 is announced if and only if D 1 > r is observed. The value of r is defined as ensuring the equality p error|x=y = p error|x =y , with p error | x=y = Pr D E 1 > r and p error | x =y = Pr D D 1 ≤ r . In these two probabilities, D E 1 (D D 1 ) represents the random variable that models the number of clicks at output "1" when the entire input sequences are equal (different). The value of r may be numerically computed by approximating the numbers of clicks D E 1 and D D 1 by binomial distributions, and then computationally looking up in inverse distribution tables. In particular, the binomial distributions that include imperfection effects may be defined as D E 1 ∼ Bin M, p E click,1 and where p E click,1 and p D click,1 are the probabilities of detector "1" firing for the cases of equal and different individual input pulses, respectively. These probabilities are given by In these two equations above, p dark is the dark count probability and v is the visibility, which quantifies the actual contrast of the interferometer. We emphasize that the traditional definition of interferometric visibility (also known as fringe contrast; see, for example, page 12 of [44]) commonly used in optics and quantum photonics is not the same that applies in the present case for v. Also, even though visibility v is extensively utilized in [15][16][17], a concise definition is lacking in these references. Here, we provide such a definition for the two-user case, which later in the manuscript is extended to K users with an arbitrary K ≥ 2: This last equation contains the equal-input gain g E 1 = M µ E 1 |α| 2 , which can be expressed as the ratio of the mean photon number µ E 1 at output "1" and the nominal photon number |α| 2 M at each input. The other gain |α| 2 is analogously defined, but for the case of different inputs, i.e. when the phases of two incoming pulses are dissimilar. The fundamental reason why v is defined here in terms of photonic gains is to promote an easy experimental estimation through measurements in the classical optical regime, actually even before starting the protocol itself. In general, for the two-user case only, the relationships between v and the gains are clearly g D 1 = 2v and g E 1 = 2(1 − v). Using all the above notation in this section, we sketch next a computational iterative algorithm for numerically calculating both r and |α| 2 , assuming the same statistical model and protocol rules introduced in [16]. This numerical algorithm, or another equivalent algorithm that produces the same results in [16,17], is not explicitly detailed in these references, but we include it in this document in order to facilitate the comparison with our analytical method that requires neither numerical iterations nor solving nonlinear equations. In the algorithm stated below, F −1 Bin denotes the binomial inverse cumulative distribution function, which can be calculated, for instance, using the BinoInv function available in Matlab R . Algorithm 2.1. (Computational method for calculating r and |α| 2 in a realistic two-user coherent-state QF protocol) (1) Fix ECC parameter δ, ECC rate c = M/N , dark count probability p dark , visibility v, input size N and target value of p error . Also, fix, as initial value, |α| 2 = 0.
(4) If r D = r E , STOP, take threshold r = r D and keep as final result the current updated value of |α| 2 . If r D < r E , increase |α| 2 by a small amount ∆ α and repeat (2) to (4). For input sizes N ≥ 10 −5 , such as those considered in previous literature and in this work, an increment ∆ α = 1, or even greater, is adequate as typically |α| 2 > 100.
Applying this algorithm, it can be shown that, in general, contrary to what happens for the ideal-case solution in (3), the minimum required |α| 2 is now no longer independent of M for a fixed p error . This observation is apparent from the results in [16,17]. However, for input sizes below a certain value of M that strongly depends on p dark , it is also observed that the protocol is still able to countenance a constant |α| 2 and concomitantly maintain the desired target error probability, as happens in the ideal scenario.
Prior to ending this explanation of the realistic two-user QF protocol, we note that, in any real implementation, we must also take into consideration the effect of losses. To do so, we define a parameter η that combines the effect of the overall losses present in the whole experiment, such as detector efficiencies and losses in the quantum channel. The effect of η is equivalent to transforming state |α x in (2) into another state √ ηα x , which can always be compensated just by increasing the total transmitted mean photon number as |α| 2 → |α| 2 η . Thus, the protocol exhibits robustness to losses, in the sense that the scaling properties of |α| 2 with respect to the rest of protocol parameters remain unchanged if losses rise. As a concluding outline, we next formally summarize how to quantify the amount of transmitted quantum information Q, measured in qubits. In order to do so, once |α| 2 has been computed, the information quantification draws upon the following upper bound. This bound is valid for any coherent-state QF protocol, either ideal or realistic, and it can be plainly inferred from the demonstration of Theorem 1 contained in [15]: [A good and simple approximation confirmed in many of our simulation results plotted at log-log scale is: Q ≈ |α| 2 log 2 M . ] In this upper bound, parameter ∆ must be minimized from the following non-linear inequality using a numerical method, such as fzero in Matlab R : The value of in (7) is fixed and it may be understood as an indicator of the accuracy of (6). In our simulations, following the same convention as in previous works, we take = 10 −6 . This parameter should not be confused by any means with the error probability p error of the protocol, inasmuch as refers solely to the probability of getting a certain inaccurate result in a Q prediction given by (6) for any complete realization of the QF protocol.
In concluding this section, we provide a brief discussion on the communication cost of the family of QF protocols introduced here. On this subject, because M and N are linearly related by the ECC rate, the result in (6) stipulates that the scaling properties of Q may be described as Q ∼ O(log 2 N ) or, more precisely, as Q ∼ O(|α| 2 log 2 N ), except for an arbitrarily small . For the discussion in progress, we must also consider, as explained in this section, that as long as p error stays set to a specific value, |α| 2 remains constant and independent of N in the ideal QF protocol. This statement is also true in the nonideal case for values of N below a certain threshold that depends on imperfections. Consequently, taking for granted a constant mean photon number, it is often found in the literature the recurring assertion that a coherent-state QF protocol provides exponential savings in the transmitted information when compared to classical protocols, which in turn require fingerprints of no less than Ω( √ N ) bits for a fixed error probability [26,27]. We remark that, even though this affirmation is factually true for a constant |α| 2 , it is still possible to achieve huge savings even if, in order to maintain the desired target p error , imperfections force a variation of |α| 2 as a function of N . With regard to this assertion, we present in section 5 results that show improvements of several orders of magnitude when compared to the best-known classical protocol, even under circumstances that make the strict exponential savings unattainable.

Multi-party referee circuit designs
This section addresses the non-trivial task of extending the beamsplitter device concept to multiple users, in such ways that our extended device circuits can then be used in the implementation of a QF protocol analogous to the pioneering K-user classical protocol in [37]. The generalized optical circuits, for K ≥ 2 users, must drive the output photonic detectors in a manner that the clicks registered in these detectors provide enough information for the referee. The referee's task is then to conclude if at least one of the binary sequences x k , with 1 ≤ k ≤ K, differs in at least a single bit when compared to the rest of the sequences.
Before introducing the beamsplitter generalization itself, we describe in essence the signal states that interfere at the referee's circuit. In this regard, we adopt the same phase-encoding scheme prescribed in earlier two-user QF protocols that was previously detailed in section 2, which is based on transmitting coherent states whose phase modulation is furnished by an error correcting code (ECC) [15][16][17]. Therefore, each of the K users sends through the respective quantum channel a train of M weak coherent pulses characterized by an adapted version of (2): where, in this case, label k, satisfying 1 ≤ k ≤ K, is assigned to identify each user and also each user's sequence x k . Again, E(x k ) m tags the mth bit of an ECC codeword corresponding, on this occasion, to a binary string x k . Given that the generalized protocol involves K users transmitting via K separate channels, it seems mandatory for the referee to employ an optical multiport device in which the number of inputs is K as well. Moreover, we take into consideration the fact that circuits built with linear-optics elements, such as phase shifters and beamsplitters, can always be described by means of a unitary matrix, and vice-versa, if the number of outputs in the multiport is also K [45]. Specifically, restricting ourselves for now to an ideal and lossless scenario, the family of multiport circuits in our proposal can be effectively represented by a general unitary matrix in which one row contains the same element value repeated K times. Further, the rest of this matrix's rows have the trait of adding up to zero, as shown in (9). For descriptive purposes, we have chosen in (9) an arbitrary row k as the only one that sums up to √ K instead of zero: u k−1,1 u k−1,2 · · · · · · · · · · · · u k−1,K u k+1,1 u k+1,2 · · · · · · · · · · · · u k+1,K .
u K,1 u K,2 · · · · · · · · · · · · u K,K If we assume monochromatic light with the same polarization in every input beam, the generic unitary scattering matrix U K describes a classical-optics transformation that is performed on electric fields as E out = U K E in . Analogously, in quantum optics, matrix U K performs a transformation that linearly relates the creation operators of the input modes to the corresponding operators of the output modes. Let us assume, for now, that all the K detectors connected to the circuit output ports are, as the device circuit itself, ideal. The only row adding up to √ K in the unitary matrix above instinctively corresponds to a multiport output that, in general, is entitled to yield clicks in any case whatsoever, i.e. we do not impose any particular conditions on the individual K input phases. Conversely, the remaining K − 1 rows that sum up to zero correspond to circuit outputs that cannot lead to clicks when the K incoming phases are the same. Accordingly, our general multiport proposal consists of K − 1 output detectors that may produce clicks only for the case of different inputs x k , and 1 output detector without restrictions. This latter detector is the single one that may click if all the K individual input pulses that arrive at the referee multiport have the same phase. It may also fire, however, when these phases differ. Thus, in order to summarize the decision rule, the referee ideally announces that the bit sequences are different, i.e. f (x 1 , x 2 , . . . x K ) = 1 (at least one x k is dissimilar), if and only if she observes at least one click in the K − 1 detectors associated with the zero-sum rows. The transition matrix in (9) may be understood as corresponding to a generalization of a standard 50:50 beamsplitter, for a system described in a K-dimensional Hilbert space. Notice that a standard 50:50 beamsplitter matrix is just a particular case in two dimensions. The schematic black-box representation of a generic referee's multiport is depicted in figure 4.
Multi-party quantum fingerprinting with weak coherent pulses 10 If we assume monochromatic light with the same polarization in every input beam, the generic unitary scattering matrix U K describes a classical-optics transformation that is performed on electric fields as E out = U K E in . Analogously, in quantum optics, matrix U K performs a transformation that linearly relates the creation operators of the input modes to the corresponding operators of the output modes. Let us assume, for now, that all the K detectors connected to the circuit output ports are, as the device circuit itself, ideal. The only row adding up to √ K in the unitary matrix above instinctively corresponds to a multiport output that, in general, is entitled to yield clicks in any case whatsoever, i.e. we do not impose any particular conditions on the individual K input phases. Conversely, the remaining K − 1 rows that sum up to zero correspond to circuit outputs that cannot lead to clicks when the K incoming phases are the same. Accordingly, our general multiport proposal consists of K − 1 output detectors that may produce clicks only for the case of different inputs x k , and 1 output detector without restrictions. This latter detector is the single one that may click if all the K individual input pulses that arrive at the referee multiport have the same phase. It may also fire, however, when these phases differ. Thus, in order to summarize the decision rule, the referee ideally announces that the bit sequences are different, i.e. f (x 1 , x 2 , . . . x K ) = 1 (at least one x k is dissimilar), if and only if she observes at least one click in the K − 1 detectors associated with the zero-sum rows. The transition matrix in (9) may be understood as corresponding to a generalization of a standard 50:50 beamsplitter, for a system described in a K-dimensional Hilbert space. Notice that a standard 50:50 beamsplitter matrix is just a particular case in two dimensions. The schematic black-box representation of a generic referee's multiport is depicted in figure 4. Fig. 4: Operation of the referee's optical multiport circuit, portrayed here as a "black box," in a proposed general K-party quantum fingerprinting protocol implemented with weak coherent pulses. Each party transmits a sequence of coherent states ⟩ m whose phases are modulated following an identical procedure as in the two-user coherent-state protocol. Under ideal assumptions, if all the K input pulses that arrive at the referee from the users at a given moment have the same phase, just one of the K detectors may fire. On the contrary, any detector may fire if at least one of these K input pulses is different.
Before concluding this section introduction, let us be clear about two details: (i) for a fixed K and keeping into consideration not breaking the unitary condition, the matrix elements in the zero-sum rows of (9) may be chosen in many distinct ways that lead to different multiport circuit designs; (ii) even if all the elements in the unitary matrix are already fixed, diverse design rules can be applied that also produce different multiport circuit implementations, all of them represented by the same matrix [45,46]. Relatedly, the rest of the chapter describes various circuit designs aimed at being used by the referee in the multi-party QF protocol. Suboptimal designs (from the point of view of dealing with experimental imperfections) are also presented in the chapter's remainder, for a twofold reason: (i) for comparison purposes, and (ii) because some of these designs, though not optimal, have some other interesting experimental benefits that we shall discuss in brief. that arrive at the referee from the users at a given moment have the same phase, just one of the K detectors may fire. On the contrary, any detector may fire if at least one of these K input pulses is different.
Before concluding this section introduction, let us be clear about two details: (i) for a fixed K and keeping into consideration not breaking the unitary condition, the matrix elements in the zero-sum rows of (9) may be chosen in many distinct ways that lead to different multiport circuit designs; (ii) even if all the elements in the unitary matrix are already fixed, diverse design rules can be applied that also produce different multiport circuit implementations, all of them represented by the same matrix [45,46]. Relatedly, the rest of the chapter describes various circuit designs aimed at being used by the referee in the multi-party QF protocol. Suboptimal designs (from the point of view of dealing with experimental imperfections) are also presented in the chapter's remainder, for a twofold reason: (i) for comparison purposes, and (ii) because some of these designs, though not optimal, have some other interesting experimental benefits that we shall discuss in brief.

Generalized beamsplitter designs
Generalized beamsplitters, also called multiport beamsplitters or multiport interferometers, were first formally addressed by Zeilinger et al in [47]. After this description, Reck et al released in [45,46] the first known systematic procedure for designing the corresponding device. This method takes a unitary matrix characterization as starting point and then provides an optical network of two-input beamsplitters and phase shifters that implements the unitary transformation. Interestingly, this design proposal was later used in [48] in order to construct real experiments for testing diverse EPR correlations. In general, a generalized beamsplitter with K input ports (and an equal number of output ports) is characterized by a K × K matrix U K built exclusively by taking powers of the Kth root of unity γ K = exp (i 2π K ). Explicitly, the matrix elements of U K are given by It is immediate to check that U K is in fact a unitary matrix that, furthermore, satisfies all the requirements in (9) concerning the family of multiport circuits in our proposal. This demonstration can be done by taking into account the following manifest property of the roots of unity: and δ ij is a Kronecker delta. Therefore, we can use the optical realizations of this kind of matrices as a possible multi-party referee circuit.
Up to this date, just two systematic methods are known for the design of devices implementing the transformation given by the characteristic matrix in (10) for any value of K. The method by Reck et al in [45,46] was devised in 1994, whereas the second known method was made available by Clements et al in [49], nearly two decades later. Both methods are based on decomposing any unitary matrix as a product of simpler matrices, each of which corresponds to a two-input unbalanced (generic) beamsplitter. Then, each method's decomposition leads to a different multiport design consisting of a regular mesh of beamsplitters and phase shifters. We emphasize the fact that these design methods are universal in the sense that they can provide optical circuit realizations not only for the specific generalized beamsplitter matrices considered here but also for any unitary matrix.
The layout in figure 5 exemplifies a generalized beamsplitter for K = 4 inputs, obtained according to the Reck design procedure. In general, for an arbitrary K, this design always leads to an optical circuit consisting of a triangular layout, as in the particular one depicted here. In contrast, the corresponding Clements design portrayed in figure 6 exhibits a different architecture for the photonic circuit, in which every input channel meets its closest neighbour at the very first possible intersection.
Both designs chiefly require an identical number of K(K−1) 2 beamsplitters in order to construct the multiport device. However, Clements layout achieves a smaller optical depth, which, as per the exhaustive comparative analysis in [50], is a key parameter highly correlated with errors caused by fabricative imperfections and by optical losses inside the multiport. We define the optical depth parameter as the maximum number of beamsplitters, i.e. counted by traversing the longest path across the multiport, considering all paths from any input port to any output port. In particular, Clements design dispenses an optical depth of K beamsplitters, whereas the depth intrinsic to Reck design is 2K − 3. As K grows, the latter requires roughly twice the depth of the former design.
Even though Clements layout presents, in general, a superior error tolerance in any realistic operational conditions, which is indeed extremely important for experimental implementations, Reck design can be advantageous when using programmable circuits for fulfilling the unitary transformation. In particular, configuring methods exist for the Reck design that can be applied to program integrated photonic chips, without requiring a full characterization of the internal circuit components [51,52].   shifters marked inside the diagonal dashed enclosure may be transferred to the output ports, by modifying the beamsplitter parameters following the procedure described in [49]. However, this transfer was not implemented on the depicted diagram for simplicity and because it did not impact qualitative comparisons later presented in this work.

Extendable design
As opposed to the generalized beamsplitter designs, which were already-known and we merely found a new application for them, we start here presenting original multiport layouts specifically devised for our Multi-party quantum fingerprinting with weak coherent pulses 12   shifters marked inside the diagonal dashed enclosure may be transferred to the output ports, by modifying the beamsplitter parameters following the procedure described in [49]. However, this transfer was not implemented on the depicted diagram for simplicity and because it did not impact qualitative comparisons later presented in this work.

Extendable design
As opposed to the generalized beamsplitter designs, which were already-known and we merely found a new application for them, we start here presenting original multiport layouts specifically devised for our Fig. 6: Example of generalized beamsplitter, implemented according to Clements design procedure for K = 4. The phase shifters marked inside the diagonal dashed enclosure may be transferred to the output ports, by modifying the beamsplitter parameters following the procedure described in [49]. However, this transfer was not implemented on the depicted diagram for simplicity and because it did not impact qualitative comparisons later presented in this work.

Extendable design
As opposed to the generalized beamsplitter designs, which were already-known and we merely found a new application for them, we start here presenting original multiport layouts specifically devised for our QF protocol proposal. For each of these novel designs, we first present our optical circuit realization and then introduce the associated unitary matrix. We remark that the previously discussed Reck and Clements procedures can also be applied to obtain valid designs corresponding to our newly introduced unitary matrices. These designs, however, turn out to be far from optimal when compared with our own specific circuits in terms of both number of beamsplitters and optical depths. Thus, we shall discard these other designs in the upcoming discussions regarding our proposals. As a first innovative design approach for producing novel referee circuit architectures, we present what we call the extendable design. This proposed layout consists of a chain of K − 1 concatenated unbalanced beamsplitters, epitomized by the sequence in figure 7. As per convention, we assign a label k, with 2 ≤ k ≤ K, to each of them. A beamsplitter k contains the input of the quantum channel from user k, except for beamsplitter k = 2, which also contains the input for the user with label k = 1. Moreover, each unbalanced beamsplitter k is characterized by an individual power transmittance t k = k−1 k .
Multi-party quantum fingerprinting with weak coherent pulses 13 QF protocol proposal. For each of these novel designs, we first present our optical circuit realization and then introduce the associated unitary matrix. We remark that the previously discussed Reck and Clements procedures can also be applied to obtain valid designs corresponding to our newly introduced unitary matrices. These designs, however, turn out to be far from optimal when compared with our own specific circuits in terms of both number of beamsplitters and optical depths. Thus, we shall discard these other designs in the upcoming discussions regarding our proposals. As a first innovative design approach for producing novel referee circuit architectures, we present what we call the extendable design. This proposed layout consists of a chain of K − 1 concatenated unbalanced beamsplitters, epitomized by the sequence in figure 7. As per convention, we assign a label k, with 2 ≤ k ≤ K, to each of them. A beamsplitter k contains the input of the quantum channel from user k, except for beamsplitter k = 2, which also contains the input for the user with label k = 1. Moreover, each unbalanced beamsplitter k is characterized by an individual power transmittance t k = k−1 k . An obvious benefit of the extendable design lies in the fact that fingerprinting users can be added and removed without requiring much effort to change the physical layout. For example, new users can be simply added without affecting the extant part of the circuit already in use for the previous users. In return, a serious drawback of this design arises from the asymptotic behaviour of the power transmittances. As the number of users K increases, the required power transmittances tend to be close to 1 and very close to each other. Consequently, in a realistic scenario, an important contribution of errors originating from mismatched transmittances is to be expected. This effect can be also apparent from observing the generic matrix coefficients in (11), which represent amplitude transmittances. The column and row indices for these coefficients correspond respectively to the input and output channel labels on the schematic view in figure 7:

Optimal design
In the following, we present an optical circuit topology that minimizes both the number of required beamsplitters and the optical depth. We call this novel topology optimal design, albeit this designation is, strictly speaking, a surmise based on the evidence provided by comparing with other known topologies. This optimal design certainly has the minimum depth of all the circuit architectures devised in this work for the multi-user coherent-state QF problem. We may conjecture that no other topology exists that achieves the shortest possible optical depth and the smallest number of standard (two-input) unbalanced (generic) beamsplitters for the problem under consideration. An obvious benefit of the extendable design lies in the fact that fingerprinting users can be added and removed without requiring much effort to change the physical layout. For example, new users can be simply added without affecting the extant part of the circuit already in use for the previous users. In return, a serious drawback of this design arises from the asymptotic behaviour of the power transmittances. As the number of users K increases, the required power transmittances tend to be close to 1 and very close to each other. Consequently, in a realistic scenario, an important contribution of errors originating from mismatched transmittances is to be expected. This effect can be also apparent from observing the generic matrix coefficients in (11), which represent amplitude transmittances. The column and row indices for these coefficients correspond respectively to the input and output channel labels on the schematic view in figure 7:

Optimal design
In the following, we present an optical circuit topology that minimizes both the number of required beamsplitters and the optical depth. We call this novel topology optimal design, albeit this designation is, strictly speaking, a surmise based on the evidence provided by comparing with other known topologies. This optimal design certainly has the minimum depth of all the circuit architectures devised in this work for the multi-user coherent-state QF problem. We may conjecture that no other topology exists that achieves the shortest possible optical depth and the smallest number of standard (two-input) unbalanced (generic) beamsplitters for the problem under consideration.
The objective of finding the optimal layout is motivated in order to reduce fabrication resources. In other words, working with compact circuit designs that need just a few beamsplitters is a key factor for the manufacture of planar waveguide photonic circuits. Additionally, internal propagation losses are reduced when the optical depth is small, and the cumulative effect of errors caused by fabrication imperfections, namely errors when setting the values of transmittances and phase shifts, is also expected to be lower for a shorter optical depth [50].
For a fixed number of users K, the construction of the optimized topology starts by assigning an integer label k = 1 . . . K to each user and then recursively allocating these labels into two groups. At each recursivedivision level, a beamsplitter corresponding to these two groups is then placed over the layout, beginning with the highest level of the group division diagram, and forming a characteristic tree arrangement. A simple self-explanatory example for K = 4 parties that only requires a certain type of 50:50 beamsplitters is illustrated in figure 8. A more intricate but self-evident example, for the case K = 7, is contained in figure 9. When the amount of labels at some division level is an odd number, as in this example for K = 7, we employ the ceiling division in order to split the labels into two groups. Replacing the ceiling division with integer division in the diagram is also acceptable; it would produce a different equivalent circuit with another similar tree topology. Finally, for each level contained in the division diagram presented on the figures' left, the power transmittances corresponding to the unbalanced beamsplitters must be calculated in the form of fractions obtained as the amount of labels in the first group over the total number of labels in the two groups. The distinctive tree design generated using this method can be described by a unitary matrix satisfying all the requirements in (9). In particular, the single matrix row in (9) that does not sum up to zero corresponds here to the output with label 1 in the layouts on figures 8 and 9.
The objective of finding the optimal layout is motivated in order to reduce fabrication resources. In other words, working with compact circuit designs that need just a few beamsplitters is a key factor for the manufacture of planar waveguide photonic circuits. Additionally, internal propagation losses are reduced when the optical depth is small, and the cumulative effect of errors caused by fabrication imperfections, namely errors when setting the values of transmittances and phase shifts, is also expected to be lower for a shorter optical depth [50].
For a fixed number of users K, the construction of the optimized topology starts by assigning an integer label k = 1 . . . K to each user and then recursively allocating these labels into two groups. At each recursivedivision level, a beamsplitter corresponding to these two groups is then placed over the layout, beginning with the highest level of the group division diagram, and forming a characteristic tree arrangement. A simple self-explanatory example for K = 4 parties that only requires a certain type of 50:50 beamsplitters is illustrated in figure 8. A more intricate but self-evident example, for the case K = 7, is contained in figure 9. When the amount of labels at some division level is an odd number, as in this example for K = 7, we employ the ceiling division in order to split the labels into two groups. Replacing the ceiling division with integer division in the diagram is also acceptable; it would produce a different equivalent circuit with another similar tree topology. Finally, for each level contained in the division diagram presented on the figures' left, the power transmittances corresponding to the unbalanced beamsplitters must be calculated in the form of fractions obtained as the amount of labels in the first group over the total number of labels in the two groups. The distinctive tree design generated using this method can be described by a unitary matrix satisfying all the requirements in (9). In particular, the single matrix row in (9) that does not sum up to zero corresponds here to the output with label 1 in the layouts on figures 8 and 9.  8: Example of our optimal fingerprinting circuit design for K = 4 parties. In general, the circuit design follows a tree structure obtained by recursively dividing into two groups the integer labels assigned to identify the players, as exemplified on the accompanying diagram. r denotes power reflectance and t denotes power transmittance.
Taking into consideration the previously-explained optimal design procedure, we present comparative results in table 1, where ⌈ log 2 K ⌉ is the ceiling function of log 2 K. This logarithmic function mathematically arises from the circuit's tree structure. The analytical comparisons with the rest of topologies clearly show that our optimal design inherently provides enormous exponential savings in terms of optical depth. Moreover, the reduction in the amount of required beamsplitters exhibits noteworthy quadratic savings when compared to the generalized beamsplitter layouts generated according to the Reck and Clements procedures.  8: Example of our optimal fingerprinting circuit design for K = 4 parties. In general, the circuit design follows a tree structure obtained by recursively dividing into two groups the integer labels assigned to identify the players, as exemplified on the accompanying diagram. r denotes power reflectance and t denotes power transmittance.
Taking into consideration the previously-explained optimal design procedure, we present comparative results in table 1, where log 2 K is the ceiling function of log 2 K. This logarithmic function mathematically arises from the circuit's tree structure. The analytical comparisons with the rest of topologies clearly show that our optimal design inherently provides enormous exponential savings in terms of optical depth. Moreover, the reduction in the amount of required beamsplitters exhibits noteworthy quadratic savings when compared to the generalized beamsplitter layouts generated according to the Reck and Clements procedures. 1,5 Fig. 9: Example of our optimal fingerprinting circuit design for K = 7 parties. In general, the design consists of a tree structure merely constructed by recursively dividing into two groups the integer labels allocated to the players. r denotes power reflectance and t denotes power transmittance. The tree structure is not symmetrical if K is odd, as it corresponds to the herein depicted explanatory example. Ceiling division (used in the represented example) or integer division is required to divide the users into blocks. Beamsplitter power transmittances and reflectances are easily calculated as fractions of total users that are included in each block, as recursively typified on the accompanying diagram.

Design Number of beamsplitters Optical depth
Our optimal design Generalized BS with Clements design In the absence of experimental data, we need a realistic theoretical imperfection model in order to study how the referee multiport circuit behaves, depending on experimental imperfections, i.e. the losses and the fabricative imperfections (fabrication noise) present in the circuit components. In the following, we develop such a model for the optimal design that concerns us here; nonetheless, we note that this model can straightforwardly be generalized to any multiport optical circuit. We first present a matrix decomposition, in which the action of every generic beamsplitter upon the quantum states in the ideal circuit is described by a matrix. Finally, based on this ideal matrix decomposition, we hand over the definitive model that contains the parameters that allow for imperfections. Fig. 9: Example of our optimal fingerprinting circuit design for K = 7 parties. In general, the design consists of a tree structure merely constructed by recursively dividing into two groups the integer labels allocated to the players. r denotes power reflectance and t denotes power transmittance. The tree structure is not symmetrical if K is odd, as it corresponds to the herein depicted explanatory example. Ceiling division (used in the represented example) or integer division is required to divide the users into blocks. Beamsplitter power transmittances and reflectances are easily calculated as fractions of total users that are included in each block, as recursively typified on the accompanying diagram.

Design Number of beamsplitters Optical depth
Our optimal design K − 1 Generalized BS with Clements design In the absence of experimental data, we need a realistic theoretical imperfection model in order to study how the referee multiport circuit behaves, depending on experimental imperfections, i.e. the losses and the fabricative imperfections (fabrication noise) present in the circuit components. In the following, we develop such a model for the optimal design that concerns us here; nonetheless, we note that this model can straightforwardly be generalized to any multiport optical circuit. We first present a matrix decomposition, in which the action of every generic beamsplitter upon the quantum states in the ideal circuit is described by a matrix. Finally, based on this ideal matrix decomposition, we hand over the definitive model that contains the parameters that allow for imperfections.
The effect of each beamsplitter in the optimal design can be described by a unitary matrix expressed as follows: where matrix size is K × K and the indices of the off-diagonal elements correspond to the labels of the input and output channels of each beamsplitter in the design. Also, t tags the power transmittance of the considered beamsplitter. For example, for the layout in figure 8, the matrix decomposition can be expressed as , and the corresponding unitary matrix that describes the entire optimal circuit in figure 8, for K = 4, is Similarly, for the case K = 7 depicted in figure 9, we can write the decomposition as , and the resulting matrix for the complete circuit is In order to finally include the effects of imperfections in the matrix decomposition, we first need to closely analyze how the generic beamsplitters are implemented in a real photonic circuit. The conventional way for achieving a general optical realization, totally equivalent to a generic beamsplitter, consists of a basic building block, comprising a Mach-Zehnder interferometer built with two cascaded symmetric 50:50 beamsplitters and two phase shifters [45,46,49,50]. For the particular case of our optimal design, this basic building block can be implemented as illustrated in figure 10. The value of phase ω in the block is related to the unbalanced power transmittance as t = sin 2 (ω), with ω ∈ [0, π/2].
Multi-party quantum fingerprinting with weak coherent pulses 17 and the resulting matrix for the complete circuit is In order to finally include the effects of imperfections in the matrix decomposition, we first need to closely analyze how the generic beamsplitters are implemented in a real photonic circuit. The conventional way for achieving a general optical realization, totally equivalent to a generic beamsplitter, consists of a basic building block, comprising a Mach-Zehnder interferometer built with two cascaded symmetric 50:50 beamsplitters and two phase shifters [45,46,49,50]. For the particular case of our optimal design, this basic building block can be implemented as illustrated in figure 10. The value of phase ω in the block is related to the unbalanced power transmittance as t = sin 2 (ω), with ω ∈ [0, π/2]. In other different circuit designs, the unbalanced BS's may also be implemented in a similar manner. Phase ω is related to the unbalanced BS power transmittance as ω = arcsin ( √ t). According to our design notation convention, phase π, depicted on the equivalent unbalanced BS (on the right part of this figure), always goes, upon reflection, from the smallest integer label ℓmin to the greatest integer label ℓmax in any general circuit tree structure, as it is clearly exemplified in figures 8 and 9.
Exploiting the model in figure 10, the ideal matrix associated to each generic beamsplitter in our design can be further decomposed as Fig. 10: Implementation of each unbalanced beamsplitter (BS) in our optimal circuit design (see, for instance, figures 8 and 9), for any pair of port labels in the design min, max ∈ [1, K] , min < max. As depicted within this figure, two symmetric 50:50 BS's and two phase shifters are used to mimic the behaviour of any unbalanced BS. In other different circuit designs, the unbalanced BS's may also be implemented in a similar manner. Phase ω is related to the unbalanced BS power transmittance as ω = arcsin ( √ t). According to our design notation convention, phase π, depicted on the equivalent unbalanced BS (on the right part of this figure), always goes, upon reflection, from the smallest integer label min to the greatest integer label max in any general circuit tree structure, as it is clearly exemplified in figures 8 and 9.
Exploiting the model in figure 10, the ideal matrix associated to each generic beamsplitter in our design can be further decomposed as where, for simplicity, we only represent the matrix elements whose indices belong to the non-zero off-diagonal elements in (12). Flip matrix 0 1 1 0 in (13) is simply for accommodating the output channel labels to those in our design, as typified in figures 8 and 9. Both matrices correspond to standard symmetric beamsplitters, commonly used in quantum optics. Finally, matrix e i(ω+π) 0 0 e −iω in (13) models the shifters, characterized by a phase value ω in figure 10.
In order to accurately simulate fabricative imperfections and losses, we follow a Monte Carlo method similar to that in [50], in which the ideal matrix decomposition in (13) is replaced with an analogous expression that includes various parameters that quantify imperfection levels: The losses associated to each symmetric beamsplitter are denoted by η BS ; we assume in (14) that both beamsplitters in figure 10 undergo identical losses, even though this restriction may be straightforwardly worked around. The ideal amplitude transmittance 1 √ 2 is now substituted in (14) with its realistic equivalent τ 1 , τ 2 = 2 −1/2 (1 + σ T · randn), where each of the subscripts 1, 2 corresponds to a different 50:50 beamsplitter. Parameter σ T accounts for the fabrication noise level affecting the transmittances τ 1,2 , modelled as the standard deviation of a zero-mean normal random variable [50]. Similarly, σ P is the fabrication noise level affecting the phase shifters present in the basic building block, also modelled as Gaussian noise [50]. We remark that the four different random variables in (14), dubbed as randn, actually represent four different realizations in a Monte Carlo simulation. To approach our analysis towards realistic values for the above discussed imperfection parameters, we consider femtosecond laser writing as the reference technology used for constructing the optical circuits. Additionally, we assume thermo-optic active control for the phase shifters. Under this fabrication assumptions, it is currently possible to achieve tolerances of 0.01 for the amplitude transmittances and 0.01 rad for the phase shifters [53,54], i.e. present-day fabricative technology permits to describe all the imperfections by a single value as low as σ T = σ P = σ = 0.01. In consequence, for simplicity, we shall use the same parameter σ to denote both phase and amplitude noise levels.
Finally, before concluding this section, we emphasize that the statistical imperfection model developed here is fully independent from, but it can be combined with, the analytical method for upper-bounding the transmitted information, which is introduced in section 4 for a realistic multi-party QF scheme. For instance, the upper-bounding method explained in the next section may be applied in a real experiment wherein, once the circuit topology is chosen and implemented, the imperfection model is irrelevant because the performance results rely on measurements only.

Quantum fingerprinting protocol analysis
In this section, we provide a description of various decision rules that govern the referee announcements in the multi-party QF protocol. Based on these rules, we develop the mathematical formalism for upperbounding both the required amount of abstract information and, importantly towards estimating resource expenditures, also the energy consumption of the protocol. First, we analyze the multi-party ideal case, which allows us to gain an initial insight into the protocol capabilities, in a similar fashion as the ideal analysis carried out in section 2 for the well-known two-user case.
The second part of the section addresses the more interesting realistic scenario, in which we present a method for upper-bounding the transmitted information that takes into account any kinds of circuit imperfections, as well as the detector dark count rates. The application of this method relies on determining certain gains that ultimately include the effects of all imperfections and losses present in the circuit. We present a comprehensive analysis of the method, emphasizing the effect of different parameters on the protocol performance, but leaving for appendix B the mathematical details of the model, which is based on applying a certain form of the Chernoff bounds to the click probabilities at the detectors.
Following the already-detailed general matrix characterization in (9) and the generic portrayal in figure 4, which is applicable to every circuit design in section 3, we consider through this entire section that an integer label k, satisfying 1 ≤ k ≤ K, is assigned so as to identify each output detector. However, in order to make our analysis valid for any circuit without loss of generality, we introduce, from now on, the restriction that label k = K always corresponds to the only detector that losses photons, in comparison with all-equal inputs, when at least one of the input pulses differs from the rest. In other words, label k = K is allocated to the only detector that may click in the ideal case when all the input states are the same. This specific photonic detector corresponds to the row that does not add up to zero in (9). The restriction introduced here does not pose any limitations on the circuit design, as it implies a relabeling of the exit ports, which, if applicable, is equivalent to a mere permutation of the rows of the unitary matrix associated to the circuit.

Ideal scenario
Under ideal premises, the referee declares that at least one bit sequence x k is different, i.e. f (x 1 , x 2 , . . . x K ) = 1, if and only if at least one click happens in the K − 1 detectors related to the zero-sum rows in matrix (9). Therefore, an error never occurs if the K original binary inputs x k are the same. Otherwise, the worst possible situation clearly always corresponds to just one user sending a coherent-state sequence M m=1 ± α √ M m that differs from the rest K − 1 users' sequences. This situation is the most similar to K all-equal sequences of coherent states and, hence, the most difficult to distinguish by the referee. Besides this, for the purpose of carrying out the analytical calculations, the worst case takes place when the number of different coherent states in the sequence that differs is at its minimum. This theoretical minimum is exactly (1 − δ)M as imposed by the minimum distance of the used error correcting code, as detailed in section 2. Thus, we obtain that error probability p error can be calculated as with p D click,k being the click probability at output detector k when just one individual input state differs from the rest, and µ D k being the corresponding mean photon number of the coherent state impinging this aforesaid detector k. For the last equality, we have taken into account which can be proven multiplying matrix (9) by a vector filled with the same repeated value, except for a single entry with an opposite phase. Finally, solving for |α| 2 in (15), Fixing a specific desired error probability, this latest calculation gives an upper-bound on the mean photon number, and hence also on the energy required per user. The related amount of quantum information per user, suitable to confront a classical protocol, can be directly calculated introducing (17) into (6). Again, as in the analysis for deriving (3) in section 2, here we did not draw upon mathematical approximations concerning the intensity levels of the individual pulses. Also, we note that the known result given in (3) can be seen as an exact particular case for K = 2, predicted by the novel generalization in (17). A central observation can be made on (17) by noticing that, as the number of users K increases, the effect of K tends to be less influential on the predicted per-user statistics. Further, we also note that the particular circuit design implemented at the referee node is irrelevant under the ideal assumptions. Finally, we bring attention to the fact that, as in the particular case for K = 2 anticipated in section 2, the raw message length N (and, consequently, also the number of transmitted pulses M ) has zero impact on |α| 2 .

Realistic scenario
In this subsection's analysis, we account for any kinds of experimental errors, by means of our analytical method for upper-bounding |α| 2 . Before entering the analysis, we present the referee decision rules upon which the bounding method depends. The ideal-protocol decision rule in 4.1 is not applicable anymore if subject to realistic constraints, owing to the reasons extrapolated from section 2. As alternatives, we propose here two different referee strategies based on observing different ensembles of detectors attached to the circuit's exit ports. We shall show that either of these two separate rules leads, in general, to different figures of merit when analyzing the QF protocol.
In one of the proposed strategies, the referee counts the number of clicks in the first K − 1 detectors, labelled 1 ≤ k ≤ K − 1, which are the detectors that gain impinging photons when some of the input states differ from the rest. Denoting as D k the total number of clicks observed in every detector k during the entire protocol execution, the referee infers equal sequences if and only if K−1 k=1 D k ≤ r. Parameter r is a certain threshold, below which the outcome "equal inputs" is announced; we shall provide the details required to calculate r using a closed-form expression. Above the value of r, i.e. K−1 k=1 D k > r, the referee concludes that the input sequences are different.
The other proposed strategy consists of observing just one detector, labelled k = K. This is the single detector that losses photons in the different-sequence situation when compared to the equal-sequence case, under the normal circumstances that we shall mark off. Subject to this decision rule, the referee infers that the input sequences are different if and only if D K ≤ r. Complementarily, she announces equal input sequences if and only if D K > r.
In order to deduce the expressions for threshold r and the bounding limit for |α| 2 , we apply certain types of Chernoff bounds. All the detailed calculations are included in appendix B, but we sketch next the underlying statistical model. In particular, let X E k , with 1 ≤ k ≤ K, be a random variable with Bernoulli distribution that accounts for the number of individual clicks (0 or 1 click) at detector k when K coherent states arrive at the referee at the same time containing the same phase. In a similar fashion, X D k is an analogous random variable for the case when some of the K input states are different (they contain phase differences). An additional group of random variablesX D k,m is introduced in order to model the effect of the differences present in the K complete sequences of M pulses M m=1 ± √ µ in m , as follows: The baseline statistical model and the strategies introduced above imply that the referee always provides an erroneous announcement in the following situations. If the referee uses the strategy that consists of observing K − 1 detectors, an announcement error happens either whenever the input sequences are actually different and k,m ≤ r, or whenever the input sequences are actually equal to each other and In the same way, now under the referee's rule of taking into account just the clicks in detector k = K, an error occurs in the following two situations: whenever the input sequences are really equal to each other and D E K = M m=1 X E K ≤ r, or whenever they are different and K,m > r. We denote by p E error the probability of an error happening when the sequences sent by the users are actually equal to each other. Similarly, p D error is the analogous error probability for different sequences. We remark that, in this work, probabilities p E error and p D error are not the same as the desired target error probability, which we call p error . In particular, our specific manner of applying the Chernoff bounds guarantees that p E error , p D error ≤ p error . This is a key difference when comparing with all the methods for calculating |α| 2 in previous works [15][16][17], which always secure the equality p E error = p D error = p error (see, for example, Algorithm 2.1 in this work). As we shall observe in detail, our more relaxed constraint may produce upper-bounds that are not as tight as in the previously published methods. This strict lack of tightness is the price that one has to pay in exchange for a closed-form expression for both r and |α| 2 that is instructive and easy to implement for computational purposes. Nonetheless, we shall observe in next section that, for certain ranges of M , both approaches essentially yield the same predictions at the logarithmic scale.
Focusing now on the strategy in which the referee observes detectors from 1 to K − 1, and based on the above statistical description, we get closed-form equations in appendix B.1 that depend on the following two gains: where µ E k represents the mean photon number at an output detector k when the K individual input pulses have the same phase. Similarly, µ D k,P accounts for the photon number when at least one of the K phases of the individual input pulses is different from the rest. Gain g D [1,K−1],P depends onP , which is a vector whose elements are phase labels. As an explanatory instance, let us suppose that the referee receives pulses from K = 4 users and that the individual input states are α Then, in this particular example, we haveP = (1, −1, −1, 1). In general, asP corresponds to different input states, at least 1 component inP must have a different phase than the rest of components, i.e.P = − 1, 1. We denote by L the integer number that indicates the minimum amount of phase labels inP that differ from the rest K − L labels. In the previous example, we have L = 2. Restriction L ≤ K 2 is imposed, because it is not difficult to realize that values L > K 2 introduce zero additional different cases, from the point of view of photon statistics.
Assuming Kµ in = K|α| 2 M 1, which we strictly verify later in the manuscript for the cases of interest, the mathematical development in appendix B.1 gives the following analytical upper bound: Parameter η in (20a) is the combined efficiency that includes the losses of the quantum channel and the detector efficiencies. It does not include, however, the effects of the insertion losses for the beamsplitters, because these are unbalanced losses corresponding to different paths across the multiport circuit. The effect of beamsplitter losses is fully incorporated within the gains in (19). On another note, we shall clearly show later in this section that the minimization required for (20a) and (20b) can be accomplished just by simulating (19b) for the K vectorsP that correspond to L = 1 and then taking the smallest of these K simulated values of g D [1,K−1],P . This process is identical when the gains are measured in a real experiment. We observe in (20a) that, unlike the ideal case, this bound depends on the number of input pulses M . Further, we can also notice that the dark count rate µ dark foists a strong influence that, moreover, is aggravated when both M and the number of users K grow. This worsening consists of an increase in the predicted mean photon number provided by (20a) and, hence, also in an increment of the energy consumption per user. On a separate note, the condition in (20c) emerges from the core of the Chernoff bounds themselves (see appendix B.1). This restriction is completely congruent with the desired behaviour of the detectors attached to the exit ports, provided that the experimental error level in the circuit is low enough. We may refer, therefore, to the restriction in (20c) as the "normal circumstances of operation". To conclude our commentaries about (20), we notice that, once |α| 2 has been calculated according to (20a), it is straightforward to compute the transmitted information, measured in qubits per user, just by applying the result in (6). Most of the comments provided in this paragraph for the strategy involving K − 1 detectors are also relevant for the other referee strategy considered in this work, with a few exceptions that we shall note soon.
The referee threshold that corresponds to the rule analyzed so far is In the following, we move on to presenting the final results deduced in appendix B.2 for the other rule, in which the referee takes into account detector k = K only. The governing gains for just one detector are Under the same assumptions as in the other decision rule, the bound is now An evident statement can be made by observing (23a) and comparing it with (20a). It is clear that, unlike the rule with K − 1 detectors, the effect of dark counts provided by µ dark in (23a) is not directly worsened as the number of users rises. However, as in the other rule, this effect of µ dark is made worse by the action of M albeit now not aggravated by K. Again, the condition imposed by the Chernoff bounds that enables fingerprinting feasibility, summarized here in (23c), is in full agreement with our expected behaviour of detectors, as long as a reasonable experimental error level is kept in the referee circuit. As occurs in the minimization for (20), the maximization required for (23) is practicably achievable with near zero computational cost. Finally, the corresponding threshold for the referee strategy is given now by We have hitherto presented a method that allows us to compute upper bounds on |α| 2 , subject to a maximum desired error level p error in any realistic case. Of course, this computation also depends on the protocol parameters, such as K, M and δ, and on the physical characteristics and imperfections of the photonic components (beamsplitters, detectors, etc). As an interesting remark, table 2 shows how previous approaches compare to our innovative method, regarding various aspects: numbers of users considered in the protocol, referee strategies, and the procedures for calculating the mean photon number.
In the rest of the manuscript, we pay a particular attention, amongst other aspects, to the figures of merit (transmitted quantum information, amount of energy, etc.) computed when the raw input size N , and consequently also M , is arbitrarily large. In particular, this regime corresponds to the situation in which the term with M and µ dark inside the square roots of (20a) and (23a) is the leading addend in the sum. This assumption is specially relevant because it is very well known, from all the two-user coherent-state protocols in [15][16][17], that the dark count rate is a dominant limiting factor as M grows, and identifying and mitigating its effects is still a pressing issue. In our particular model, these premises make (20a) and (23a) more dependent on the subtractions of gains min(g D [1,K−1],P ) − g E [1,K−1] and g E K − max(g D K,P ) , respectively. These gain differences thus become more relevant rather than the absolute levels of the gains. We point out that this special devotion for the case of arbitrarily large M , however, does not imply, by any means, that we are restricting our study to the limit of an infinite input size where the protocol operates in the asymptotic regime.
For the referee strategy involving K − 1 detectors, a dominant term in M and µ dark happens when and, for the strategy involving a single detector k = K, a dominant term happens when Inasmuch as the quantities min(g D [1,K−1],P ) − g E [1,K−1] and g E K − max(g D K,P ) become relevant, we may easily provide K-party generalizations of the two-user visibility v in (5) that depend on these above-stated subtractions. In this way, these visibility generalizations may be seen as additional figures of merit not only for the quantum protocol by itself, but also with regards to choosing the best suitable design for the referee circuit. A different visibility generalization must be provided for each of the referee strategies analyzed in this work: These visibilities v [1,K−1] and v K extend (5) in such a way that they are calculated by taking the ratios between the realistic gain differences and the ideal ones. Thus, using the fact that the ideal gain values are we may eventually write We provide in figure 11 graphical representations that show how the generalized visibilities v [1,K−1] and v K vary with respect to the number of parties K, for the two groups of exit ports (k ∈ [1, K − 1] and k = K) considered in the referee's decision rules. Two representative values σ = 0.01 and σ = 0.1 were selected for the fabricative error-level parameter σ, introduced in (14) and in the subsequent explanation there. Either of these two values can be accomplished with present-day optical circuit fabrication technology, as discussed at the end of section 3. Losses per beamsplitter were chosen to be η BS = −0.2 dB/BS, which is a standard value reported in contemporary experiments and practical implementations; see for example [49,55]. In the context of this work, this value of η BS is applicable to the symmetric 50:50 beamsplitters in figure 10, which are used as the building blocks necessary to implement the generic unbalanced beamsplitters in the circuit designs. The results plotted in figure 11 highlight the clear superiority of our optimal design, introduced in subsection 3.3, when compared to the rest of circuit designs in section 3. Actually, this is not surprising at all, judging by the exponential savings in optical depth displayed in table 1 for our optimal layout. A small depth is extremely advantageous because the smaller the number of beamsplitters crossed by different internal paths through the circuit, the less error level is carried into the photonic gains of (19) and (22).
In concluding this section, figure 12 contains various visibility plots calculated with different values of the gains g D [1,K−1],P and g D K,P obtained changing the number of input phases that differ. That is, calculated with different sets of labels in vectorP . This variation in vectorP impacts the maximization and minimization procedures required for (20) and (23), and also for the visibilities in (29). In particular, different values of the relevant parameter L were chosen for the plots in figure 12. We recall that L ≤ K 2 is the minimum amount of phase labels inP that differ from the rest K − L labels. It is easy to perceive in the plots that, for realistic fabrication noise levels characterized by parameter σ, the worst case unequivocally corresponds to L = 1. This observation is to be expected, as all these situations, where a single input phase (L = 1) of the K individual pulses is different, are the most similar to the case of K identical phases. Accordingly, in all these situations corresponding to L = 1, gain g D [1,K−1],P reaches the closest value to g E [1,K−1] , and gain g D K,P reaches the most similar value to g E K . As a common sense conclusion, for computing quantities min(g D [1,K−1],P ) and max(g D K,P ), it suffices to calculate the values of the gains using the K different vectors P that have L = 1. Then, the minimum or maximum of these K calculated values should be taken, as corresponds to each gain. This procedure entails an insignificant computational time. A final observation can be made on figure 12 concerning the fact that some of the plotted visibilities are greater than 1. This is so because, in order to provide a fair level comparison amongst the distinct values of L in the realistic cases, we employed the ideal-case gains for L = 1 in (27) to (29), even for the nonideal cases where L > 1.

Comparative results
A plethora of plot results is presented and discussed in brief here. These results can be reproduced by applying the methodology in the prior section endowed with the statistical outcomes, in the form of simulated photonic gains, from the imperfection model in subsection 3.3. When computing the plot results here, we include as sources of experimental errors: fabricative imperfections (phase shifter and transmittance mismatches), beamsplitter insertion losses, channel losses, detector efficiencies, and the dark count rates present in the detectors. The first two error sources mentioned above are modelled through the statistical imperfection model. The rest of the sources are directly handled by the equations of the analytical methods for upperbounding the mean photon number. In a realistic scenario, however, additional sources of errors, such as polarization and phase mismatches, can also be directly considered within the statistical model.
The Monte Carlo method underlying the imperfection model was applied averaging the results simulated with 500 unitary matrices corresponding to our optimal design. This is the same number of stochastic realizations used for the main results in [49,50]. In our simulations, each unitary matrix is randomly modified according to the statistical imperfection model. Given the relatively low order of magnitude of the standard deviations compared to the absolute mean magnitudes at the logarithmic scale (see caption in figure 11), they are not represented on the plots.
In the following, some common values used for the simulations presented in this section are discussed. Except otherwise stated, the combined efficiency, which excludes beamsplitter losses, was set to η = 0.5. This value might seem quite unrealistic; nevertheless, we note that all the results can be straightforwardly scaled because of the fact that |α| 2 η = 1 2η |α| 2 , where |α| 2 η is the result with an arbitrary η and |α| 2 η= 1 2 is the specific result used for our calculations. With regard to the foregoing, the main goal of plotting the results is not to provide precise quantitative estimations, but rather to provide a qualitative overview of how

Comparative results
A plethora of plot results is presented and discussed in brief here. These results can be reproduced by applying the methodology in the prior section endowed with the statistical outcomes, in the form of simulated photonic gains, from the imperfection model in subsection 3.3. When computing the plot results here, we include as sources of experimental errors: fabricative imperfections (phase shifter and transmittance mismatches), beamsplitter insertion losses, channel losses, detector efficiencies, and the dark count rates present in the detectors. The first two error sources mentioned above are modelled through the statistical imperfection model. The rest of the sources are directly handled by the equations of the analytical methods for upperbounding the mean photon number. In a realistic scenario, however, additional sources of errors, such as polarization and phase mismatches, can also be directly considered within the statistical model.
The Monte Carlo method underlying the imperfection model was applied averaging the results simulated with 500 unitary matrices corresponding to our optimal design. This is the same number of stochastic realizations used for the main results in [49,50]. In our simulations, each unitary matrix is randomly modified according to the statistical imperfection model. Given the relatively low order of magnitude of the standard deviations compared to the absolute mean magnitudes at the logarithmic scale (see caption in figure 11), they are not represented on the plots.
In the following, some common values used for the simulations presented in this section are discussed. Except otherwise stated, the combined efficiency, which excludes beamsplitter losses, was set to η = 0.5. This value might seem quite unrealistic; nevertheless, we note that all the results can be straightforwardly scaled because of the fact that |α| 2 η = 1 2η |α| 2 is the specific result used for our calculations. With regard to the foregoing, the main goal of plotting the results is not to provide precise quantitative estimations, but rather to provide a qualitative overview of how the different involved variables affect the protocol performance and to prove that quantum supremacy is, in principle, already experimentally achievable. With regard to the dark count probabilities, we use the two discreet values p dark = 10 −9 and p dark = 10 −11 , except when analyzing the effects of a continuous distribution of p dark , in which case values as high as p dark = 10 −7 were used in the computations. Again, these values, specially 10 −11 , may seem difficult to achieve in practice with today's technologies. It is this author's belief, however, that in the near future such devices will be commercially available at large scale. Actually, SNSPDs with p dark = 10 −11 were already reported, for example, in [56,57]. Just to put all these dark count rates in perspective, the dark count probability for the QF experiment in [17] is about 4.4 × 10 −9 .
For the ECC, we chose the same optimized random linear code (RLC) in [16], whose generator matrix is a Toeplitz matrix. In particular, for this type of ECCs, the relationship between the rate c > 1 and the minimum-distance parameter δ is determined as The particular ECC values selected for this work are δ = 0.78 and c = M N = 4.17. In our plots, for the purpose of adequately confronting the represented QF communication cost, we need on hand the expressions for the analogous cost of a classical protocol. The best classical fingerprinting protocol known to date, valid for K = 2 only, is detailed in [26] and its communication cost can be expressed in closed-form as We remark that there are other works [28,29], independent from [26], that lead to the result in (31) as well.
The protocols explained in [28,29], though, are not the same as the simple protocol in [26]. The best classical fingerprinting protocol known to date that is valid for any K ≥ 2 was recently reported in [37]. It is based on a generalization of a 2-user protocol in [26]. This 2-user protocol, however, is not the same as the protocol that gives the result in (31). In the generalization, each user sends 4 randomly chosen blocks of size 3N K bits. For each pair of blocks from the users, the node applies a 4-time repeated version of the so-called "2-user symmetric protocol" described in [26]. Additionally, in order to identify the blocks, every user also sends labels comprising log 2 (3N/ 3N/K ) bits. The resulting K-user communication cost can be stated as As a classical "no-go" result, there is a classical limit on the communication cost, below which it is known that no classical protocol may operate, even protocols that could be unknown to date. For K = 2, this limit was found in incomplete form (some multiplicative factors are missing) in [26,27]. The complete closed-form version was first provided in the supplementary material of [17]. A possible generalization to K ≥ 2 users is derived in appendix C of the present work and it is given by Together with the cost of the best-known classical protocol, we shall also plot the limit in (33) for the sake of completeness when comparing quantum communication costs. For our quantum upper bounds and for the classical quantities in (31) to (33), we set the target error probability to a common value p error = 10 −5 .

Transmitted information comparison with previously published two-user referee strategy
As our first set of protocol simulation results, and serving the purpose of strengthening the correctness verification of our analytical method in section 4, we particularize the referee strategy to the case K = 2 and then compare the results to those dispensed by other methods in previous publications. In this regard, our referee decision strategy for the case with K − 1 detectors, when particularized to K = 2, is mostly the same as in the paper by Xu et al [16]. The only difference lies in the referee threshold. In particular, for a target error probability p error , Xu's referee strategy uses a threshold r whose value satisfies the equalities p error = Pr(D E 1 > r) = Pr(D D 1 ≤ r). In contrast, we recall that, in our strategy, we can only guarantee p error ≥ Pr(D E 1 > r), Pr(D D 1 ≤ r). This less strict condition, in turn, allows for a compact mathematical analysis extended to the multi-user case.
The specific method for ultimately calculating r is not explicitly detailed in [16]. To this regard, we employed in our simulations the algorithm in section 2 of this paper. This algorithm assumes the same conditions to calculate r as in [16], so it should always produce the same results as the method actually used in this aforesaid reference, even if different. For comparison purposes, we simulated our two multi-user strategies for the separate detector ensembles k ∈ [1, K − 1] and k = K. When particularized to K = 2, the case k ∈ [1, K − 1] implies observing detector "1", just as in Xu's strategy according to our notation, and the case k = K implies observing detector "2". Figure 13 shows the simulation results, where the transmitted information is represented as a function of N at log-log scale. The fabrication noise level σ = 0.01 and the beamsplitter losses η BS = −0.2 dB/BS utilized here provide a visibility v = 0.98 for all the realistic protocols. The best-protocol classical information was calculated with (31), and the classical limit corresponds to (33). The ideal QF protocol bound can be either from (3) or from (17) and it assumes zero losses, zero imperfections and no dark counts.
In view of the results in figure 13, it is clear that Xu's upper bound is tighter than those provided by our two separate strategies. This behaviour occurs when our strategies are really ensuring an actual error probability below the target p error , and the users need to send more information than required by p error . This has to do with our more relaxed condition on the calculation of r. However, after the "elbow" of the curves, where the slopes become more vertical, all the three plotted functions are basically undistinguishable. This situation corresponds in our analytical method to a dominant term of M = c N and p dark , which occurs when conditions (25) and (26) are satisfied. Another relevant comment on the results has to do with the strong influence of p dark on the required transmitted information. In particular, a more favorable dark count rate pushes the curve's "elbow" towards a point where the input size N is larger. Finally, we perceive that, in these particular simulations, all the QF protocol strategies beat the classical limit for most of the range of N , while the best-known classical protocol is beaten nearly for all the represented range of N .

Assessment of the impact of dark counts and visibility on the transmitted information
Focusing now only on protocol realizations with more than two users, we assess in this subsection 5.2 the impact of the fabricative imperfections and the dark count rates on the amounts of transmitted information.
To this intent, we provide four plots in figure 14, resulting from taking K = 7, while the other four plots in figure 15 were produced by taking K = 50. Each plot shows the evolution of the transmitted quantum information for our two referee strategies, as a function of N = M c at log-log scale. The amount of information required by the best classical protocol for K ≥ 2 is also represented using (32), whereas the classical limit comes from (33). The fabricative noise σ = 0.01 provides visibilities v [1,K−1] = 0.96 and v K = 0.93, and σ = 0.1 provides v [1,K−1] = 0.87 and v K = 0.85. All these values correspond to K = 7. For the case K = 50, the visibilities diminish as shown on the graphs in figure 15.
As a first evident assertion, observing the four plots of each figure and comparing the two figures to each other, the effect of increasing imperfections (and hence reducing the visibility) has a very small impact on the communication cost for the smallest number of users. It becomes much more noticeable for the largest number of users, specially in the region of the curves before the "elbow", where the term of M and p dark is not the dominant one in (20a) and (23a). Additionally, this effect of increasing σ is more prominent when the referee uses K − 1 detectors than when she uses just one.
As a second observation, in all the simulated cases, the strategy involving K − 1 detectors is clearly superior than the other one before the "elbow". However, as the term of M and p dark becomes dominant, the strategy with just detector k = K provides the smallest communication cost. This is to be expected, by comparing (20a) and (23a). These differences, between the two referee strategies in our work, are much more noticeable as the number of users rises. Further, it can be also observed that the strategy with K − 1 detectors reaches the elbow point for a smallest raw message length N .
Also, we note that overcoming the classical limit is much more difficult than beating the best multi-party classical protocol known to date. When attempting to beat the classical limit, dark counts are a key limiting factor, much more dominant than σ. In particular, for our simulation with K = 50, we can only achieve less information than the classical limit if we use p dark = 10 −11 .

Relationship between transmitted information and transmitted energy
We study here the relationship between the transmitted information and the required energy. To this end, figures 16 and 17 include the following four plots as functions of N , each for the same simulation case: (a) information per user at log-log scale; (b) total mean photon number |α| 2 at natural scale compared to the amount of photonic bits required at the classical limit; (c) |α| 2 at log-log scale compared to an ideal quantum protocol; (d) amount K |α| 2 M at log-log scale. Figure 16 corresponds to K = 7, while figure 17 corresponds to K = 15. The photonic classical fingerprinting protocol refers to a classical protocol in which a bit is assigned to a photon, hence the term "photonic bit," as introduced in [15,16]. The amount K |α| 2 M is represented in order to confirm the strict validity of the assumption Kµ in = K|α| 2 M 1 upon which the mathematical results in (20) and (23) rely. All these simulations were carried out taking a reasonable dark count rate p dark = 10 −9 .
We observe in figures 16 and 17, for the region of non-dominant term of M and p dark before the "elbow" in (a), that the total mean photon number |α| 2 required per user remains constant in (b) and (c). This is the same behaviour exhibited by the ideal protocol in (c), although the |α| 2 level of this latter protocol is much lower. After the "elbow", the increment in |α| 2 becomes exponential in the realistic protocols. This happens because the combined effect of M and p dark becomes dominant and the users need to send more energy to keep the error probabilities below the target value p error . When the term of M and p dark governs the required value of |α| 2 , the clicks at the detectors become dominated by p dark and the gains in (19) and (22) become close to each other.
Interestingly enough, there is a region in plots (b), of both figures under consideration, in which |α| 2 remains practically constant for the QF protocol but the amount of energy of the photonic-bit classical protocol grows exponentially, even at the classical limit. Thus, before the "elbow", the QF protocol requires an exponential reduction in terms of energy consumption, which is indeed remarkable. Finally, we also observe in plots (d) that the premise Kµ in = K|α| 2 M 1 is comfortably met. The greater the raw message length N , the strongest the validity of the assumption on which our analytical model is constructed. 5 10 15 20 Raw message length log 10 (N) Best classical protocol valid for any K

Classical limit (lower bound)
Realistic quantum protocol, detectors [1, K-1], v [ Raw message length log 10 (N) Best classical protocol valid for any K

Classical limit (lower bound)
Realistic quantum protocol, detectors [1, K-1], v [1, K-1] =0.87 Realistic quantum protocol, detector K only, v K =0.85 Ideal quantum fingerprinting protocol 5 10 15 20 Raw message length log 10 (N) Best classical protocol valid for any K Raw message length log 10 (N) Best classical protocol valid for any K

Classical limit (lower bound)
Realistic quantum protocol, detectors [1, K-1], v [1, K-1] =0.75 Realistic quantum protocol, detector K only, v K =0.7 Ideal quantum fingerprinting protocol Fig. 15: Variation of the transmitted information per user as a function of the raw message length N for K = 50 users. For comparison purposes, we provide the classical lower bound and also the amount of bits per user required by the best-known classical protocol valid for any K ≥ 2. The realistic multi-party quantum fingerprinting protocols were analyzed for the following common parameters: p error = 10 −5 , combined efficiency excluding beamsplitter (BS) losses η = 0.5, η BS = −0.2 dB/BS. In order to evaluate the influence of manufacturing imperfections and dark count rates, the following four cases were computed: (a) σ = 0.01, p dark = 10 −9 ; (b) σ = 0.1, p dark = 10 −9 ; (c) σ = 0.01, p dark = 10 −11 ; (d) σ = 0.1, p dark = 10 −11 .  Raw message length log 10 (N) Transmitted Information per user

(a)
Best classical protocol valid for any K

Classical limit (lower bound)
Realistic quantum protocol, detectors [1, K-1], v [ Raw message length log 10 (N)  Raw message length log 10 (N) Transmitted Information per user

(a)
Best classical protocol valid for any K

Quantum advantages in terms of transmitted information
Starting from this subsection, we focus exclusively on the strategy involving just one detector. This choice is made because such strategy delivers the best performance in terms of energy consumption and transmitted information when the raw input size N is arbitrarily large. To the purpose of intuitively represent on a 2D plane, as a color plot, how the protocol behaves, we define the quantum advantages in terms of information as C limit /Q and C best /Q. Here, Q is the quantum information defined in (6), C best is the number of bits per user in the best-known K-user classical protocol given in (32), and C limit is the classical limit in (33). Figure 18a exhibits a representation on a 2D plane of the maximum quantum advantages as a function of K and p dark when σ = 0.01. The white dashed curve represents a lower bound below which quantum supremacy may be achievable. The black dashed curve is analogous to the white one, but for an ideal circuit with η BS = 0 and σ = 0. Note that this ideal case is not the same as in previous figures, because here we solely consider an ideal circuit and the detector dark counts are still on. These two dashed curves call again attention to the fact that, with today's technology, the dark counts are a much more limiting factor for the QF protocol than the fabrication defects in the circuit. We can check in the graph of figure 18b how the fabrication noise level degrades the visibility as the number of users is increased, from v K = 0.98, for K = 2, down to a still relatively high value v K = 0.85 for K = 100.
Multi-party quantum fingerprinting with weak coherent pulses 34

Quantum advantages in terms of transmitted information
Starting from this subsection, we focus exclusively on the strategy involving just one detector. This choice is made because such strategy delivers the best performance in terms of energy consumption and transmitted information when the raw input size N is arbitrarily large. To the purpose of intuitively represent on a 2D plane, as a color plot, how the protocol behaves, we define the quantum advantages in terms of information as C limit /Q and C best /Q. Here, Q is the quantum information defined in (6), C best is the number of bits per user in the best-known K-user classical protocol given in (32), and C limit is the classical limit in (33). Figure 18a exhibits a representation on a 2D plane of the maximum quantum advantages as a function of K and p dark when σ = 0.01. The white dashed curve represents a lower bound below which quantum supremacy may be achievable. The black dashed curve is analogous to the white one, but for an ideal circuit with η BS = 0 and σ = 0. Note that this ideal case is not the same as in previous figures, because here we solely consider an ideal circuit and the detector dark counts are still on. These two dashed curves call again attention to the fact that, with today's technology, the dark counts are a much more limiting factor for the QF protocol than the fabrication defects in the circuit. We can check in the graph of figure 18b how the fabrication noise level degrades the visibility as the number of users is increased, from v K = 0.98, for K = 2, down to a still relatively high value v K = 0.85 for K = 100.  Fig. 18: (a) Maximum quantum advantages regarding transmitted information as a function of the number of parties K and dark count rate. Two ratios are presented: between the classical limit information C limit and the quantum information Q, and between the best-known classical protocol information C best and Q. The following parameters were used together with the optimal referee strategy: perror = 10 −5 , combined efficiency η = 0.5 (Q may be easily scaled for any η as Qη = 1 2η Q η= 1 2 ), ηBS = −0.2 dB/BS, σ = 0.01. The white dashed curve indicates a lower bound below which a positive quantum advantage is attainable. For comparison purposes, the black dashed curve corresponds to an ideal referee circuit with ηBS = 0, σ = 0. (b) Visibility as a function of K corresponding to the realistic optical circuit implementation used in (a).

Quantum advantages in terms of transmitted energy
Quantum advantages in terms of transmitted energy, analogous to those defined above for the information, are analyzed here. Figure 19a shows that, indeed, quantum supremacy for energies is commonplace even for ordinary photonic detectors. This reality represents huge energy savings of several orders of magnitude when compared to any classical protocol implemented using photonic bits.
In the following, we deduce an approximate expression for calculating the maximum number of users K for which quantum supremacy is achievable in terms of classical limit energy, as a function of µ dark , p error , c, δ and visibility v K . We assume that the condition in (26), for arbitrarily high M = c N , holds. Then, we and dark count rate. Two ratios are presented: between the classical limit information C limit and the quantum information Q, and between the best-known classical protocol information C best and Q. The following parameters were used together with the optimal referee strategy: perror = 10 −5 , combined efficiency η = 0.5 (Q may be easily scaled for any η as Qη = 1 2η Q η= 1 2 ), ηBS = −0.2 dB/BS, σ = 0.01. The white dashed curve indicates a lower bound below which a positive quantum advantage is attainable. For comparison purposes, the black dashed curve corresponds to an ideal referee circuit with ηBS = 0, σ = 0. (b) Visibility as a function of K corresponding to the realistic optical circuit implementation used in (a).

Quantum advantages in terms of transmitted energy
Quantum advantages in terms of transmitted energy, analogous to those defined above for the information, are analyzed here. Figure 19a shows that, indeed, quantum supremacy for energies is commonplace even for ordinary photonic detectors. This reality represents huge energy savings of several orders of magnitude when compared to any classical protocol implemented using photonic bits.
In the following, we deduce an approximate expression for calculating the maximum number of users K for which quantum supremacy is achievable in terms of classical limit energy, as a function of µ dark , p error , c, δ and visibility v K . We assume that the condition in (26), for arbitrarily high M = c N , holds. Then, we rewrite |α| 2 in (23a) as a function of visibility v K in (29) instead of as a function of the gains in (22): .
Now, assuming K 1, the version with photonic bits of the classical limit in (33) can be well approximated, leaving out the term in 1/K, as Finally, equating these two previous expressions and solving for K, we arrive at This latest equation takes into account the supposition that v K is independent from K. In practice, this is not the case. However, we may fix an expected worst-case experimental value of the visibility and then obtain a lower bound on the maximum number of users for which the quantum protocol requires less energy than a hypothetical classical protocol matching the classical limit energy. Figure 19b shows an example of the result in (36) at work. This result may be also of interest for determining the required dark count probability as a function of the desired number of users K in an experimental realization.
Multi-party quantum fingerprinting with weak coherent pulses 35 rewrite |α| 2 in (23a) as a function of visibility v K in (29) instead of as a function of the gains in (22): Now, assuming K ≫ 1, the version with photonic bits of the classical limit in (33) can be well approximated, leaving out the term in 1/K, as Finally, equating these two previous expressions and solving for K, we arrive at This latest equation takes into account the supposition that v K is independent from K. In practice, this is not the case. However, we may fix an expected worst-case experimental value of the visibility and then obtain a lower bound on the maximum number of users for which the quantum protocol requires less energy than a hypothetical classical protocol matching the classical limit energy. Figure 19b shows an example of the result in (36) at work. This result may be also of interest for determining the required dark count probability as a function of the desired number of users K in an experimental realization.  Fig. 19: (a) Quantum advantages concerning transmitted power, or energy, as a function of the number of parties K and dark count rate. Two ratios are presented: between the classical limit protocol energy E C,limit and the quantum protocol energy EQ, and between the best-known classical protocol energy E C,best and EQ. The following parameters were used together with the optimal referee strategy: perror = 10 −5 , combined efficiency η = 0.5 (the results here may be easily scaled for any value of η), ηBS = −0.2 dB/BS, σ = 0.01. (b) Lower bounds on the maximum number of users that allows a positive quantum advantage in terms of energy as a function of the dark count rate, for four assumed worst-case visibility values. Fig. 19: (a) Quantum advantages concerning transmitted power, or energy, as a function of the number of parties K and dark count rate. Two ratios are presented: between the classical limit protocol energy E C,limit and the quantum protocol energy EQ, and between the best-known classical protocol energy E C,best and EQ. The following parameters were used together with the optimal referee strategy: perror = 10 −5 , combined efficiency η = 0.5 (the results here may be easily scaled for any value of η), ηBS = −0.2 dB/BS, σ = 0.01. (b) Lower bounds on the maximum number of users that allows a positive quantum advantage in terms of energy as a function of the dark count rate, for four assumed worst-case visibility values.

Conclusions and future perspectives
In this paper, we have proposed and investigated a K-user QF protocol based on coherent states. One of the main incentives is on the fact that an analogous classical protocol is known and can be used for comparison purposes to attest quantum supremacy. Our work constitutes a step towards a deeper understanding of quantum networks embracing a central processing node. As one of the main contributions of this work, we have provided innovative optical circuit designs required for the protocol, and have discussed the benefits and issues of each design. Then, we have proposed and detailed two separate referee strategies for the central node. Also, we have introduced a fully-analytical method to perform the estimations of the amounts of qubits and energy required for each user. These analytical expressions are, indeed, very convenient to understand how the different involved quantities affect the protocol execution, even at the experimental level. Further, simulations are presented that certify quantum supremacy under certain circumstances. This advantage of the quantum protocol over the classical one is especially noticeable when comparing energy consumptions, which paves the way for the deployment of quantum networks implementing data-processing "green" protocols. In doing the simulations, we also determined under which conditions one of the proposed strategies is more efficient than the other.
An instinctive approach to continue our research would exploit the fact that the protocol sends pulses with very low amplitude, mostly empty coherent states. This means that the expected time between clicks at each individual detector is large, and detector dead times are usually not a problem. Besides this benefit, the referee node can be adjusted to process many signals in parallel. This idea was first proposed in [58] for the two-user protocol and, very recently, it was experimentally demonstrated with improvements in [59].
Another natural step for the continuation of this work would investigate different ways of defining the trains of pulses sent by the users. In the present work, we stuck to the same scaling properties and to the same number of pulses per user as in the standard two-user protocol. Perhaps, some improvements in the communication cost can be attained by redefining the coherent states in such a way that exploits more efficiently the peculiarities of the multi-party scenario. Q Amount of transmitted information measured in qubits/user that is required in a quantum fingerprinting protocol.
C Amount of transmitted information measured in bits/user that is required in a classical fingerprinting protocol.
E Q Transmitted energy that is required for each user in a quantum fingerprinting protocol.
E C Transmitted energy that is required for each user in a classical fingerprinting protocol implemented with photonic bits.
K Total number of users.
k Integer label 1 ≤ k ≤ K assigned to each user, and also label allocated to each referee input port and output port.  Power transmittance of each unbalanced beamsplitter (BS) in the referee optical circuit. The corresponding power reflectance of the BS is r = 1 − t.
σ P Fabrication noise level affecting phase shifters inside the referee circuit. If φ ideal represents an ideal phase, then σ P models a phase deviation as φ realistic = φ ideal + σ P · randn.
σ General fabrication noise level of the referee circuit. Present-day technology allows achieving minimum values σ T = σ P = 0.01. Accordingly, for simplicity, we always consider that both tolerances have the same value, and we simply define σ = σ T = σ P .
Mean photon number at referee circuit output port k, with 1 ≤ k ≤ K, when all the individual input states (one state from each user) have the same phase.

Symbol
Description η Combined efficiency that includes losses of the quantum channel and detector efficiencies. It does not include beamsplitter (BS) losses, as these BS losses affect differently each path from any circuit input to any circuit output. The effect of beamsplitter losses is fully included (either by simulation or by measurement) in gains g E [1,K−1] , g D [1,K−1],P , g E K , g D K,P and in visibilities v [1,K−1] , v K .

η BS
Losses of each of the 50:50 beamsplitters that are used for implementing all the unbalanced beamsplitters in the referee circuit. The effect of these losses η BS is fully included (either by simulation or by measurement) in all the gains and in the visibilities defined above.
Random variables with Bernoulli distribution for each output k, with 1 ≤ k ≤ K. Variables X E k model the click or no click at output detector k when K input states equal (E) to each other arrive at the referee.
Analogous to the previous definition, but pertaining to the situation when at least one input state is different (D) from the rest.

Appendix B. Upper bounds on the total mean photon number per user
This appendix, mostly self-contained in nature, covers the detailed mathematical steps required to obtain closed-form analytical upper bounds on the total mean photon number |α| 2 , per user, required for a successful implementation of a multi-party quantum fingerprinting protocol. Two different upper bounds are derived in Appendix B.1 and Appendix B.2 that are applicable to separate referee strategies (decision rules). We assume a realistic optical circuit at the referee involving imperfections of any kinds. In our circuit model, the combined effect of such general imperfections is fully included in certain gains that establish relationships between the mean photon number at any circuit input and the mean photon number of certain sets of outputs.
In order to accomplish the aforesaid goal of upper-bounding |α| 2 , we employ a particular version of the Chernoff bounds [60] as described in detail in [61,62]. We first describe Chernoff bounds as applied to generic random variables. Afterwards, we define the concise random variables that are required in our physical model. Theorem B.1 (Chernoff Bounds) Let X = n i=1 X i be a random variable obtained as the sum of X i , 1 ≤ i ≤ n, independent Bernoulli random variables. Let µ = E(X) be the mean, or expected value, of X. Then In our particular application, we shall need to use an identical threshold value r for the two tails stated in Theorem B.1 above, in order to calculate both the upper tail as Pr[X ≥ r] and the lower tail as Pr[X ≤ r]. Furthermore, we shall apply each tail to a different random variable; hence we write below X upper and X lower , using the subscript labels to emphasize the fact that both still-generic random variables X are different. To sum up, we may rewrite Theorem B.1 in a more convenient and clear way for our specific purposes, as follows: Pr[X lower ≤ r] ≤ exp − (r − µ lower ) 2 2µ lower ∀ 0 < r < µ lower . (B.1) Hereinafter, we describe in brief the circuit at the referee node and define the physical random variables that are required to judiciously apply the above-explained Chernoff bounds. The referee's circuit comprises K optical input ports and K optical output ports, with K being also the total number of parties involved in the protocol. An integer label k with 1 ≤ k ≤ K is assigned to each input and output port. Without any loss of generality, we assume in this appendix that the last label k = K always corresponds to the only output that losses photons when not all K input states ± √ µ in have the same phase (see section 3 for a detailed description of the referee's circuit). Now, let X E k , with 1 ≤ k ≤ K, be a random variable with Bernoulli distribution for the number of clicks (0 or 1 click) at output detector k when K coherent states with identical phases arrive at the referee at the same time from K users. Similarly, X D k is an analogous random variable for the case when some of the K input states are different, i.e. have phases that differ from the rest. The K different random variables X E k are independent from each other, for any fixed coherent states that are inputted to the referee at the same arrival time, because the average photon number at each output is also fixed. The same argument is also applicable to the other set of variables X D k . Random variables X D k depend on a vectorP that contains the phases of K simultaneous input pulses ± √ µ in , as detailed in Appendix A. This dependency is not explicitly included in the notation of X D k just for the sake of simplicity. We additionally introduce an integer L to specify the number of phases inP that are different from the rest K − L phases. Throughout the mathematical development in this appendix, we do not anticipate an analytical worst-case value for L. However, numerical evaluation for determining the worst-case L on the grounds of analytical visibility models is carried out in subsection 4.2. The results there clearly show that, considering present-day technology parameters in any realistic referee circuit design, the worst-case scenario consistently corresponds to L = 1. As a consequence, we may assert that it suffices to take into account in our upper-bound analysis the K instances of vectorP that contain just one phase difference.
Let us remark the fact that variables X D k are used for modelling the effect of differences in K individual pulses arriving at the same time at the referee, that is, K k=1 ± √ µ in k . As this set X D k is not enough for our purposes, an additional ensemble of random variablesX D k,m needs to be introduced. These latter variables are for modelling the effect of differences in the K complete sequences of M pulses M m=1 ± √ µ in m sent by the users, and not just in K individual simultaneous input pulses entering the circuit: X D k,m = X D k for any (1 − δ) · M indices m, X E k for any δ · M indices m.

(B.2)
Parameter δ in the definitions above represents the "distance parameter" of the error correcting code (ECC). The ECC is used in the quantum protocol for amplifying differences in the transmitted coherent-pulse sequences. δ is the maximum fraction of bits in which two ECC codewords have the same bit values. The minimum distance of the ECC can be simply expressed as (1 − δ)M . Equation (B.2) corresponds to any worst-case scenarios in which the number of instances of K simultaneous dissimilar states arriving at the referee is the same as the minimum ECC distance. In other words, this worst-case different-input scenario intuitively corresponds to the case where the differing sequences are the most similar to all equal sequences. Thus, this described situation is the most difficult to distinguish by the referee. Before fully entering into mathematical elaboration, we present in detail the two separate referee strategies that we consider in the analytical developments in Appendix B.1 and Appendix B.  M m=1 E(X D k,m ). The meaning of this inequality in the physical world establishes that K −1 circuit outputs must gain photons when we switch from equal input states to different input states. This is the desired behaviour of the circuit under normal realistic circumstances. Moreover, the aforesaid inequality enables an easy comparison between the denominators inside the exponentials in (B.6) and (B.7). The worst-case bound clearly corresponds always to p D error in (B.7), as it provides the greatest upper bound for the error probability. Consequently, the remainder of this Appendix B.1 is aimed at obtaining an upper bound for |α| 2 based on (B.7), and we dismiss (B.6).
Henceforth, we assume that condition Kµ in = K|α| 2 M 1 holds, where µ in is the photon number of each individual input pulse. This approximation is always correct if K M , which corresponds to the cases of interest addressed in this manuscript, and was checked to be valid for all the realistic scenarios analyzed in Section 5. Under the considered assumption, we can approximate click probabilities at output detectors as p E click,k µ E k , p D click,k,P µ D k,P . A subscriptP is used to emphasize the fact that the photon number at each output k depends on vectorP that contains the information of the K input pulse phases.
For simplicity, detector efficiencies and dark count rates are not yet specified in the definitions of the click probabilities; below, we introduce a combined efficiency quantity that includes detector efficiencies as well as channel losses.
By defining now the gains of the first K − 1 circuit outputs as Gain g E [1,K−1] in (B.12) is theoretically a constant magnitude, since we assume that all the input pulses have the same or approximately the same amplitude. Conversely, gain g D [1,K−1],P depends on the particular phases of the K input states. In order to infer a worst-case value for gain g D [1,K−1],P , we next perform an optimization assuming that g D [1,K−1],P is a continuous variable denoted as g D . This is just a "mathematical license" taken to analyze the behaviour of the varying gain. A function f (g D ) is introduced in the phase argument of (B.12) as p error ≤ exp − 1 8 f (g D ) : f (g D ) = (1 − δ) 2 (g D − g E ) 2 |α| 4 δ · g E + (1 − δ) · g D |α| 2 + (K − 1)M µ dark . (B.13) By equating the derivative to zero, ∂f (g D ) ∂g D = 0, it is easy to find two critical points. One of these points is g D 1 = g E and the other critical point g D 2 verifies This inequality in (B.14) clearly poses a contradiction on condition (B.11) and, as a result, only critical point g D 1 = g E stands in our analysis. Moreover, it is easy to prove that function f (g D ) decreases as g D shrinks closer to the other gain g E . As a consequence of this analysis, the worst-case value of g D Combined efficiency η takes into account detector efficiencies and channel losses. Detector efficiencies are assumed to be the same for all detectors. In practice, this may be a rather good realistic approximation; nevertheless, different quantum efficiencies may also be easily considered just by transferring their effects from η to the gains g E [1,K−1] and g D [1,K−1],P . A closed-form expression for the referee threshold is obtained by taking (B.9) and (B.10) into (B.5) and by including the minimum value of g D [1,K−1],P (the value that maximizes error probability, as proven above): In this second subsection, we address the referee decision rule consisting of counting clicks in just the last detector. By convention, this last detector has a label k = K assigned. The notation employed throughout the present mathematical elaboration is identical to that in Appendix B.1. We apply the upper tail case in (B.1) to random variableX D K,m (note the subscript k = K corresponding to the last variable in the ensembleX D k,m , 1 ≤ k ≤ K, defined in (B.2)) in order to upper bound error