Quantum detector tomography of a 2x2 multi-pixel array of superconducting nanowire single photon detectors

We demonstrate quantum detector tomography of a commercial 2x2 array of superconducting nanowire single photon detectors. We show that detector-specific figures of merit including efficiency, dark-count and cross-talk probabilities can be directly extracted, without recourse to the underlying detector physics. These figures of merit are directly identified from just four elements of the reconstructed positive operator valued measure (POVM) of the device. We show that the values for efficiency and dark-count probability extracted by detector tomography show excellent agreement with independent measurements of these quantities, and we provide an intuitive operational definition for cross-talk probability. Finally, we show that parameters required for the reconstruction must be carefully chosen to avoid oversmoothing the data.

In order to use such detectors effectively, particularly in the field of quantum optical technologies, a quantum mechanical description of these devices is necessary. Typically, a "bottom-up" approach is employed, based on modelling the underlying physics governing the working principles of such detectors. Detector parameters such as efficiency and noise are put into these models a priori. On the other hand, "top-down" techniques such as quantum detector tomography can provide an operational description of the device, including many of the figures of merit such as efficiency, dark-count and cross-talk probabilities, without recourse to an underlying model of the detector's working principle, geometry or readout scheme.
Quantum detector tomography concerns finding the elements of the so-called positive operator valued measure (POVM), which fully characterises the detection operation. In the context of photon counting, these elements represent probabilities of different outcomes of the detector, given a particular number of incident photons. This treats the detector itself as a black box, which is characterised by known inputs and measured outputs. Quantum detector tomography has been applied to several different single-photon detectors, including avalanche photodiodes [34][35][36] and SNSPDs [32], in both single-channel and time-multiplexed geometries. It has also been applied to transition-edge sensors, which resolve energy at the single-photon level [37], as well as coherent detection schemes [38,39]. Spatial arrays of single-photon detectors have been studied in a "bottom-up" approach, based on modelling their operating principle [40,41]. However, "top-down" detector tomography has not, to the best of our knowledge, been carried out on spatial arrays. Such arrays are important since they exhibit additional noise sources such as cross-talk, which is not only essential to characterise for accurate measurements, but also challenging to measure as the size of the array increases [40][41][42][43][44]. Furthermore, from a fundamental perspective, they represent quantum objects spanning an extremely large Hilbert space, and are among the largest objects to be described in a fully quantum-mechanical manner. It is therefore necessary to demonstrate that quantum detector tomography works in principle for such arrays, and that the techniques can be scaled with their size.
In this paper, we address the first of these issues, by presenting a tomographic reconstruction of the response of a 2×2 array of SNSPD pixels. Using this technique, we can directly quantify the effects of cross-talk, as well as distinguish this from dark noise and determine the detector efficiency. Whilst the salient physics is present in the 2×2-array, this method can be readily generalised to much larger arrays, where characterising pixel-by-pixel becomes highly challenging (and characterising the interpixel correlations prohibitive) with increasing array size.

II. DETECTOR TOMOGRAPHY
As introduced by Lundeen et al. [34], the aim of quantum detector tomography is to reconstruct the set of positive operator valued measure (POVM) elements $\{\pi_n\}$. These determine the probability $p_{\rho,n}$ of obtaining outcome $n$ given an input state $\rho$ through

$$p_{\rho,n} = \mathrm{Tr}\left(\rho\,\pi_n\right). \tag{1}$$

If the input states $\{\rho\}$ and outcome statistics $p_{\rho,n}$ are known, then this equation can be inverted to find the POVM set $\{\pi_n\}$, under the constraints $\pi_n \geq 0$ and $\sum_n \pi_n = \mathbb{1}$, which ensure that the result represents physically meaningful probabilities. The choice of input states is in principle arbitrary, with the sole requirement that they span the same Hilbert space as the detector. Coherent states are ideal candidates since they are both overcomplete and straightforward to produce in the laboratory. Coherent states are fully characterised by their mean photon number, which must be precisely determined prior to characterising the detector under test. While a single coherent state is in principle sufficient to span the Hilbert space of the detector, it is also important to obtain enough outcome statistics across the whole Hilbert space in a reasonable measurement time. A set of coherent states with different mean photon numbers is therefore chosen.
Under the assumption that the detector is insensitive to the phase of the incoming light field, its POVM elements are diagonal in the photon-number basis, i.e.

$$\pi_n = \sum_{i=0}^{\infty} \theta_i^{(n)}\, |i\rangle\langle i|, \tag{2}$$

where $\theta_i^{(n)}$ represents the probability of outcome $n$ given $i$ incident photons. Using this notation, Eq. (1) can be recast as the matrix equation

$$P = F\,\Pi,$$

where $P$ is a $D \times N$ matrix containing all the measurement statistics, arising from $D$ input states and yielding $N$ outcomes. The $D \times M$ matrix $F$ contains the photon number distributions of all $D$ probe states, truncated at a maximum photon number of $M-1$. The $M \times N$ matrix $\Pi$ is constructed such that its columns correspond to the diagonal elements of the POVM elements $\pi_n$ given in Eq. (2), with the sum truncated at $M-1$. Thus, following [34], detector tomography can be cast as a matrix inversion problem, where the task is to determine the unknown $\Pi$ from known $P$ and $F$.

A. Input state preparation and characterisation
The experiment is conducted using the setup shown in Fig. 1. The set of input states is created from a 1556 nm laser emitting 9 ps pulses at a repetition rate of 500 kHz. The pulse energy is set using two variable optical attenuators. Assuming Poissonian statistics of the laser pulses (verified separately by an autocorrelation measurement yielding $g^{(2)}(0) = 1.00006(17)$ across the power range), the mean photon number per pulse is determined by

$$\bar{n} = -\frac{\ln\left[1 - p(\mathrm{click})\right]}{\eta_\mathrm{cal}},$$

where $\eta_\mathrm{cal} = 83 \pm 5\,\%$ is the detection efficiency and $p(\mathrm{click})$ is the click probability of the calibration detector (a single-element SNSPD), with the dark-count probability per pulse (of $2 \times 10^{-7}$) neglected. A total of $D = 19$ coherent states were used, the photon number distributions of which span a range of 0 to 332 photons per pulse. To express these coherent states in a finite matrix $F$, their dimension was truncated at $M = 443$. The elements of $F$ are given by the Poisson distributions

$$F_{d,i} = \frac{\bar{n}_d^{\,i}\, e^{-\bar{n}_d}}{i!},$$

where $\bar{n}_d$ is the mean photon number of the $d$-th coherent state.

Figure 1. A 1556 nm pulsed laser produces coherent states at a repetition rate of 500 kHz, which are then attenuated using computer-controlled variable optical attenuators and detected by a 2×2 array of SNSPDs. A time-tagger is used to measure the electronic response from the detector. For further details see text (Sec. II B).
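The calibration step and the construction of one row of $F$ can be sketched as follows. This is a minimal illustration, assuming the click probability of the calibration detector obeys $p(\mathrm{click}) = 1 - e^{-\eta_\mathrm{cal}\bar{n}}$ for Poissonian light with dark counts neglected. The function names and the value of `p_click` are hypothetical and are not taken from the authors' code.

```python
import math

def mean_photon_number(p_click, eta_cal):
    """Invert p_click = 1 - exp(-eta_cal * nbar), neglecting dark counts."""
    if not 0.0 <= p_click < 1.0:
        raise ValueError("p_click must lie in [0, 1)")
    return -math.log1p(-p_click) / eta_cal

def poisson_row(nbar, M):
    """Photon-number distribution of a coherent state for i = 0..M-1."""
    if nbar == 0.0:
        return [1.0] + [0.0] * (M - 1)
    log_nbar = math.log(nbar)
    # Work in log space so that nbar**i and i! never overflow for large i.
    return [math.exp(i * log_nbar - nbar - math.lgamma(i + 1)) for i in range(M)]

eta_cal = 0.83   # calibration-detector efficiency quoted in the text
p_click = 0.5    # hypothetical measured click probability
nbar = mean_photon_number(p_click, eta_cal)
row = poisson_row(nbar, M=443)   # truncation dimension quoted in the text

# Consistency check with an idealised click detector (an assumption made only
# for this check): its no-click element is theta_i^(0) = (1 - eta)^i, so
# contracting the row with it must return the no-click probability.
p_no_click = sum(f * (1.0 - eta_cal) ** i for i, f in enumerate(row))
```

Working with logarithms via `math.lgamma` is a design choice that keeps the Poisson weights finite up to the truncation at $M = 443$, where a direct evaluation of $i!$ would overflow a floating-point number.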

B. Defining outcomes
The detector under test is a commercial device (Photon Spot) comprising a 2×2 array of SNSPDs electrically connected in series. The output of this device is a voltage signal proportional to the number of pixels which fire. We read out this device using a time-tagger, which counts the number of times a particular voltage threshold is exceeded, with different threshold settings corresponding to the different number of pixels that fire. Further details on the detector itself can be found in Ref. [10].
We obtain data in an ensemble measurement: we sequentially cycle through the threshold settings, thereby obtaining count rates $c'_n$, corresponding to at least $n$ pixels firing. For each coherent state amplitude and threshold setting, click statistics were obtained for $5 \times 10^{6}$ pulses, measured in a coincidence window of 15 ns synchronised to the pulse train from the laser. In this analysis $c'_0$ (at least 0 clicks) corresponds to the repetition rate of the experiment. Note that these outcomes are not orthogonal, since events contributing to $c'_n$ are also contained within $c'_j$ for all $j < n$. Orthogonal outcomes $c_n$ are obtained using the transformation $c_n = c'_n - c'_{n+1}$ (with $c'_5 = 0$ for the four-pixel device), such that $c_n$ gives the rate at which exactly $n$ detectors click. The probabilities $P_{d,n}$ of each outcome, evaluated given an input state $d$, are thus given by $P_{d,n} = \left. c_n / c'_0 \right|_d$. The elements $P_{d,n}$ make up the outcome matrix $P$.
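The conversion from threshold counts to outcome probabilities can be illustrated with a short sketch. It assumes the counts for "at least $n$ clicks" are supplied for $n = 0$ to $4$, with the first entry equal to the number of pulses. The function name and the example counts are hypothetical.

```python
def exclusive_probabilities(cumulative_counts):
    """Convert 'at least n clicks' counts into P(exactly n clicks)."""
    pulses = cumulative_counts[0]            # 'at least 0 clicks' = every pulse
    padded = list(cumulative_counts) + [0]   # no outcome beyond all pixels firing
    exact = [padded[n] - padded[n + 1] for n in range(len(cumulative_counts))]
    if any(c < 0 for c in exact):
        raise ValueError("cumulative counts must be non-increasing")
    return [c / pulses for c in exact]

# Hypothetical counts for one coherent state, with 5e6 pulses as in the text.
counts = [5_000_000, 2_000_000, 400_000, 30_000, 1_000]
probs = exclusive_probabilities(counts)      # one row of the outcome matrix P
```

Because each threshold setting is measured in a separate run, statistical fluctuations could in principle make a difference of counts negative at very low rates. The sketch raises an error in that case rather than silently clipping the value.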

C. Matrix inversion and smoothing
Given the matrices of input states $F$ and outcomes $P$, the matrix $\Pi$ corresponding to the POVM set $\{\pi_n\}$ can be found by inversion. In order to maintain physical POVMs, this inversion can be recast as the optimisation [34]:

$$\min_{\Pi} \left\{ \| P - F\Pi \|_2 + \gamma\, g(\Pi) \right\}, \quad \text{subject to } \pi_n \geq 0,\ \sum_n \pi_n = \mathbb{1},$$

where $\|\cdot\|_2$ indicates the Frobenius norm [45] and the function $g(\Pi)$, scaled by a factor $\gamma$, ensures that the result is smooth [35]. We choose a smoothing parameter of $\gamma = 0.1$; the effects of choosing a different smoothing parameter on the POVM elements are discussed in more detail in Section III B. The code to perform this inversion was written using the CVXPY module in Python [46,47] and is available online [48].
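To make the structure of this constrained inversion concrete, the sketch below solves a closely related problem with plain NumPy. It is not the authors' CVXPY code: it swaps in projected gradient descent, uses the squared Frobenius norm so that the objective is differentiable, and assumes the smoothing function is the sum of squared differences between neighbouring photon-number elements. For diagonal POVMs the constraints reduce to every row of $\Pi$ being a probability vector, which is enforced by projecting each row onto the unit simplex. The toy click detector and all function names are hypothetical.

```python
from math import lgamma
import numpy as np

def project_rows_to_simplex(X):
    """Euclidean projection of every row of X onto {x >= 0, sum(x) = 1}."""
    n = X.shape[1]
    U = np.sort(X, axis=1)[:, ::-1]            # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    rho = (U - css / np.arange(1, n + 1) > 0).sum(axis=1)
    tau = css[np.arange(X.shape[0]), rho - 1] / rho
    return np.maximum(X - tau[:, None], 0.0)

def objective(Pi, F, P, gamma):
    """Squared residual plus assumed quadratic smoothing penalty."""
    return np.sum((P - F @ Pi) ** 2) + gamma * np.sum(np.diff(Pi, axis=0) ** 2)

def reconstruct_povm(F, P, gamma=0.1, iterations=2000):
    M, N = F.shape[1], P.shape[1]
    Pi = np.full((M, N), 1.0 / N)              # feasible starting point
    # Step 1/L, where L bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * (np.linalg.norm(F, 2) ** 2 + 4.0 * gamma))
    for _ in range(iterations):
        d = np.diff(Pi, axis=0)                # d[i] = Pi[i+1] - Pi[i]
        grad = 2.0 * F.T @ (F @ Pi - P)
        grad[:-1] -= 2.0 * gamma * d
        grad[1:] += 2.0 * gamma * d
        Pi = project_rows_to_simplex(Pi - step * grad)
    return Pi

# Toy problem: hypothetical single-pixel click detector with efficiency 0.6.
M = 30
nbars = np.linspace(0.2, 6.0, 10)              # hypothetical mean photon numbers
i = np.arange(M)
log_fact = np.array([lgamma(k + 1) for k in i])
F = np.exp(i * np.log(nbars[:, None]) - nbars[:, None] - log_fact)
no_click = 0.4 ** i                            # theta_i^(0) = (1 - eta)^i
Pi_true = np.column_stack([no_click, 1.0 - no_click])
P = F @ Pi_true
Pi_hat = reconstruct_povm(F, P, gamma=0.1)
```

Every iterate satisfies the physical constraints exactly, and with the chosen step size the objective cannot increase from one iteration to the next. A dedicated convex solver, as used in the paper, will generally converge far faster on the full $443 \times 5$ problem.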

III. RESULTS
The reconstructed POVM elements for all five outcomes are shown in Fig. 2(a). The inset shows the same data for zero, one and two incident photons, on a logarithmic scale. Assuming phase-insensitive POVMs, Wigner functions corresponding to the five outcomes (0-4 clicks) may also be reconstructed, as shown in Figs. 2(b)-2(f). Error bars are calculated assuming a 5% uncertainty in the amplitudes of the coherent states. This reflects the uncertainty in the calibration procedure; uncertainty due to finite counting statistics is negligible in comparison.
Since the elements $\theta_i^{(n)}$ represent the conditional probabilities $p(n|i)$ of outcome $n$ given $i$ incident photons, the reconstruction can yield bounds on detector parameters such as dark counts, cross-talk and overall efficiency. Crucially, this can be achieved without any underlying assumptions about the detector geometry or circuitry.

A. Efficiency
Efficiency is intuitively defined as the probability that the detector clicks given that a single photon was incident, i.e.

$$\eta = \sum_{n=1}^{N-1} p(n|1) = 1 - p(0|1),$$

where $N$ is the total number of outcomes. For our detector, $p(0|1) = 0.37 \pm 0.04$, which results in an efficiency of $\eta = 63\% \pm 4\%$. The error arises from the uncertainty in determining the mean photon number used for each of the input states. This agrees well with the independently measured efficiency of $\eta = 65\% \pm 4\%$, based on direct comparison with a calibrated single-pixel detector for fixed incident power. The uncertainty in this measurement stems from the calibration procedure.

B. Dark counts
As shown in the inset in Fig. 2(a), the dark-count probability is defined as the single-click probability (red) when zero photons are incident. For the device as a whole, this corresponds to the probability

$$p_\mathrm{dark} = p(1|0) = \theta_0^{(1)}.$$

For the detector under test, the element $p(1|0) = (5.9 \pm 1.6) \times 10^{-6}$. This agrees well with the dark-count probability measured independently, $p_\mathrm{dark} = (6.34 \pm 0.15) \times 10^{-6}$. The close agreement shows that the intuitive definition of how dark counts manifest in the POVM elements is reasonable.

The importance of choosing an appropriate smoothing parameter is manifest in the dark-count estimation. The smoothing factor can take values between zero and one. Previous work [35] has shown that the value itself is relatively unimportant, since the error associated with the reconstruction is largely independent of the smoothing factor. However, in cases where neighbouring POVM elements are expected to vary by several orders of magnitude, choosing a smoothing factor that is too large may significantly overestimate the smaller of the two elements. SNSPDs are a pertinent example of a detector whose tomographic reconstruction may be susceptible to an inopportune choice of the smoothing factor.
To illustrate this, in Fig. 3 we plot the element $\theta_0^{(1)}$ as a function of the smoothing factor $\gamma$. Below a threshold of $\gamma = 0.17$, the POVM element is independent of $\gamma$; however, above this point, the smoothing causes an overestimate. We arbitrarily choose a smoothing factor of $\gamma = 0.1$, which lies below this threshold.

C. Cross-talk
In contrast to dark counts, cross-talk is regarded as conditional noise: additional counts arising due to a pixel firing. As can be seen in the inset to Fig. 2(a), cross-talk is most clearly manifest in the much larger probability of two clicks (green) given one photon (compared with one click given zero photons). In principle, this effect contributes to all POVM elements $p(j|i)$ for $j > \max(i, 1)$. For example, one POVM element containing this effect is $p(2|1)$, i.e. where one incident photon causes two clicks. In the absence of cross-talk, one expects the $p(2|1)$ term to comprise only a click given that a photon is incident, $p(1|1)$, together with a single dark count arising from the three remaining detectors. Any additional counts in this scenario we attribute to cross-talk. As such, we define the single-pixel cross-talk probability as

$$p_\mathrm{xtalk} = p(2|1) - \tfrac{3}{4}\, p(1|1)\, p(1|0).$$

In our case, $p(2|1) = 0.14 \pm 0.01$, $p(1|1) = 0.49 \pm 0.03$ and $p(1|0) = (5.9 \pm 1.6) \times 10^{-6}$ from above. We therefore estimate a single-pixel cross-talk probability of $14 \pm 1\%$, which is dominated by the leading term since the dark-count probability is significantly smaller. In general, an independent measurement and/or modelling of the cross-talk probability is very challenging, since it depends on a range of factors, including the number of pixels which have fired, the number and location of remaining pixels, and correlations between particular pixels [40][41][42][43][44][49][50][51]. Nevertheless, our method can provide an estimate of this probability in a relatively straightforward manner.
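The arithmetic behind the three figures of merit can be checked with a few lines of Python, using the four POVM elements quoted in the text. The sketch assumes that the accidental two-click contribution is $p(1|1)$ multiplied by the dark-count probability of the three remaining pixels, taken here as three quarters of the whole-device value $p(1|0)$. The function name is hypothetical.

```python
def figures_of_merit(p0_given_1, p1_given_0, p1_given_1, p2_given_1, pixels=4):
    """Efficiency, dark-count and cross-talk probabilities from four POVM elements."""
    efficiency = 1.0 - p0_given_1          # probability of any click given one photon
    dark_count = p1_given_0                # one click given zero photons
    # Assumed two-click probability without cross-talk: the photon is detected
    # and one of the remaining pixels contributes a dark count.
    remaining_fraction = (pixels - 1) / pixels
    accidental = p1_given_1 * remaining_fraction * p1_given_0
    cross_talk = p2_given_1 - accidental
    return efficiency, dark_count, cross_talk

# Central values quoted in the text.
eta, p_dark, p_xtalk = figures_of_merit(
    p0_given_1=0.37, p1_given_0=5.9e-6, p1_given_1=0.49, p2_given_1=0.14
)
```

The accidental term is of order $10^{-6}$, so the cross-talk estimate is indistinguishable from $p(2|1)$ at the quoted precision, consistent with the statement that the leading term dominates.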

IV. CONCLUSION
Quantum detector tomography is a powerful tool to characterise a measurement process without recourse to the underlying physics of the detector. Nevertheless, certain physical properties of a detector can be inferred, such as efficiency, dark counts, and cross-talk, where the latter is exclusive to array detectors. We applied this technique to characterise a commercial four-pixel array of superconducting nanowire single photon detectors, and were able to identify these figures of merit directly using just four POVM elements. This is particularly useful for identifying the cross-talk probability, which may otherwise be difficult to determine from underlying models of the detector electronics. Furthermore, as arrays become increasingly large, the need to characterise the device as a whole, rather than on a per-pixel basis, will become increasingly important.