Accurate and precise characterization of linear optical interferometers

We combine single- and two-photon interference procedures for characterizing any multi-port linear optical interferometer accurately and precisely. Accuracy is achieved by estimating and correcting systematic errors that arise due to spatiotemporal and polarization mode mismatch. Enhanced accuracy and precision are attained by fitting experimental coincidence data to curve simulated using measured source spectra. We employ bootstrapping statistics to quantify the resultant degree of precision. A scattershot approach is devised to effect a reduction in the experimental time required to characterize the interferometer. The efficacy of our characterization procedure is verified by numerical simulations.


Introduction
Linear optics is important in quantum computation and communication. The simulation of a linear optical interferometer is computationally hard classically subject to reasonable conjectures [1]. Single-photon detectors and linear optical interferometers allow for efficient universal quantum computation via linear optical quantum computing (LOQC) [2]. Linear optics can simulate the quantum quincunx [3] and quantum random walks [4]. Linear optics coupled with laser-manipulated atomic ensembles enables longdistance quantum communication [5]. A wide class of communication protocols can be realized with coherent states and linear optics [6].
The accurate and precise characterization of linear optics is important in quantum information processing tasks such as BosonSampling, LOQC and quantum walks. BosonSampling involves sampling from the output photon coincidence distribution of an interferometer on single photon inputs to each mode. Sampling from this distribution is computationally hard classically but is easy with a linear-optical interferometer. The classical hardness of the BosonSampling problem crucially depends on the error in the linear optical interferometer [24]. Similarly, the practical applications of BosonSampling, in quantum metrology and in the computation of molecular vibronic spectra, rely on the accurate implementation and characterization of linear optics [25,26].
Accurate and precise characterization is important in LOQC because a high success probability of the employed non-deterministic linear-optical gates relies on implementing the desired gates with high fidelity [27]. Furthermore, linear interferometers used in photonic quantum walks, which display strong non-classical correlations, require accurate characterization especially if quantum walks are employed for solving classically hard problems [28][29][30]. In other words, that accurate and precise characterization of interferometers enables a verifiable quantum speedup of linear-optical protocols over classical computers.
Classical-light procedures [31,32] for linear optics characterization are unsuitable for Fock-state based experiments because the interferometer parameters change during the coupling and decoupling of classical light sources and of homodyne detectors at the interferometer ports. This change could result from drift of interferometer parameters in the time required to couple sources and detectors or as the result of mechanical process of coupling itself. Characterization procedures that rely on Fock-state (rather than classicallight) inputs are thus more desirable in BosonSampling and LOQC implementations; such procedures would enable interferometer characterization without altering the experimental setup and would thus be accurate.
The Laing-O'Brien procedure [33] uses one-and two-photons for characterizing linear optical interferometers and is stable to the length scale of a photon packet. This procedure assumes perfect matching in source field and large-number statistics on the detected photons. Hence, implementations of this procedure are inaccurate due to spatiotemporal and polarization mode mismatch in the source field and imprecise due to shot noise. We aim to devise an accurate and precise procedure that uses one-and two-photons for the characterization of linear optical interferometers and to devise a rigorous method to estimate the standard deviation in the linear optical interferometer parameters [34]. Furthermore, we aim to provide a correct alternative to the χ 2 -test, which has been used to estimate the confidence in the characterized interferometer parameters in current BosonSampling implementations [11,35,36] ‡.
Here we devise a procedure to characterize a linear optical interferometer accurately and precisely using one-and two-photon interference. Four strengths of our approach over the Laing-O'Brien procedure [33] are that our procedure (i) accounts for and corrects systematic error from spatial and polarization source-field mode mismatch via a calibration procedure (Section 3); (ii) increases accuracy and precision by fitting experimental coincidence data to curve simulated using measured source spectra (Section 3); (iii) accurately estimates the error bars on the characterized interferometer parameters via a bootstrapping procedure (Section 4); and (iv) reduces the experimental time required to characterize interferometers using a scattershot procedure (Section 5).

Background
This section provides the background for our one-and two-photon characterization procedure. The action of a multi-port linear optical interferometer on single photons entering one or two input ports and vacuum entering the other ports is detailed. Specifically, we calculate the probability of detecting a photon at a given output port when a single photon is incident at a given input port. The section concludes with expressions for the probability of detecting a coincidence measurement when two controllably delayed photons are incident on the interferometer.

Action of a linear optical interferometer
In this subsection, we define linear optical interferometers by their action on single photons. We parameterize the unitary transformation effected by an interferometer and ‡ The χ 2 -test [37][38][39] is used to quantify the goodness of fit between probability distribution functions of two categorical variables, which can take a fixed number of values. Coincidence-count curves and visibilities are not probability distribution functions of categorical variables, but rather are collections of many categorical variables (variables that can take on one of a fixed finite number of possible values), one variable corresponding to each time-delay value chosen in the experiment. Hence, quantifying the goodness of fit between two coincidence curves using the χ 2 -test is incorrect. This incorrectness undermines the claim that the data are consistent with quantum predictions and disagree with classical theory [11,36] and leaves the choice of unitary matrices [35] unjustified. present our treatment of losses and dephasing at the interferometer ports.
Consider a single photon entering the i-th mode of an m-mode interferometer. The monochromatic photonic creation and annihilation operators acting on the i-th and the j-th ports obey the canonical commutation relation § for positive real frequencies ω 1 , ω 2 . The state of a single photon entering the i-th mode is where f i (ω) is the normalized square integrable spectral function, |0 is the m-mode vacuum state. The state of two photons entering modes i and j = i of the interferometer is with exchange symmetry holding if f i (ω) = f j (ω). One-and two-photon states are transformed into superpositions of one-and of two-photon states respectively under the action of the linear interferometer. We treat linear interferometers as unitary quantum channels acting on the state of the incoming light. The interferometer transforms the photonic creation and annihilation operators according to and its complex conjugate, where V (ω) is the transformation matrix of the interferometer. Photon-number conservation imposes unitarity of the transformation matrix V (ω) for all real ω. In general, the elements {V ij (ω)} of the transformation matrix depend on the frequency of transmitted light. We assume that the spectral functions of the incoming light are narrow compared to frequencies over which the entries {V ij } change noticeably and thus treat V to be frequency-independent. If only Fock states are incident at the interferometer and only photon-numbercounting detection is performed on the outgoing light, then the measurement outcomes are invariant under phase shifts at each input and output port. That is, interferometer V = D 1 V D † 2 produces the same measurement outcome as V for any diagonal unitary matrices D 1 and D 2 . Mathematically, if D 1 , D 2 are diagonal unitary matrices, then is an equivalence relation. Members of the same equivalence class defined by this equivalence relation produce the same number-counting measurement outcomes on receiving Fock-state inputs. § Two monochromatic photons are distinguishable based on the ports that they occupy and on their respective frequencies ω 1 and ω 2 . U U lossy Figure 1: Schematic diagram of the interferometer. U effects a unitary transformation on a multimode state of light. The dotted lines represent the couplings of the interferometer with light sources and detectors. The beam splitters at the input and output modes model the linear losses because of imperfect coupling and detector inefficiency. The vacuum input to these beam splitters is not shown. One of the beam splitter outputs enters the interferometer while the other one is lost. The triangles represent the random dephasing at the input and output ports. The dashed box labelled U lossy represents the combined effect of the dephasing, the losses and the unitary interferometer.
The constraints θ i1 ≡ 0, θ 1i ≡ 0 ∀i ∈ {1, 2, . . . , m} on the input and output phases of the transformation matrix are obeyed in the following parameterization of U Thus, the values {λ i }, {α ij }, {θ ij }, {µ j } completely parametrize the class representative matrix U . The input and output ports of the interferometer are amenable to time-dependent linear loss and dephasing. We model losses using parameters ν j and κ i , which are the respective probabilities of transmission at the input mode j and output mode i. Dephasing is modelled using parameters ξ j and φ i , which are the arbitrary multiplicative phases at the input and output ports. Hence, the actual transformation effected by the interferometer is given by the matrix U lossy , which has matrix elements Figure 1 depicts the relation between the representative matrix U and the actual transformation U lossy ij that is effected by the interferometer. This completes our parameterization of the linear optical interferometer. Our characterization procedure employs one-and two-photon inputs to estimate the values of parameters {λ i }, {α ij }, {θ ij }, {µ j } of (9). In the next subsection, we recall the expectation values of measurements performed on interferometer outputs when one-and two-photon states are incident at the input ports. Figure 2: Schematic diagram of single-photon counting at the output of an interferometer when single photons are incident at one input port. The star symbol represents a source of single-photon pairs. Single photons are incident at one of the input ports while vacuum state is input to the remaining input ports (not shown in figure). The semicircles at the output ports represent single-photon detectors and the circles with the included # represent the photon-counting logic connected to the detectors.

One-and two-photon inputs to linear optical interferometer
Our characterization procedure employs single-photon counting to estimate the amplitudes {α ij } of the representative matrix U entries. The arguments {θ ij } of U are estimated using two-photon coincidence counts. In this subsection, we give expressions for one-and two-photon transmission probabilities, which are employed in our characterization procedure (Section 3).
We first consider the case of single-photon transmission. The interferometer transforms the single-photon input state (2) to the state at the output ports according to (9). A photon is detected at the i-th output port with a probability when a single-photon is incident on the j-th input port.
Whereas the values of {α ij } are estimated using single photon counting, {θ ij } values are estimated using two-photon coincidence measurement. We now present probabilities of detecting two-photon coincidence at the interferometer outputs when controllably delayed pairs of photon are incident at the input ports.
If a controllably delayed photon pair is incident at input ports j and j , then the probability C ii jj (τ ) of coincidence measurement at detectors placed at output ports i and i is On substituting according to (7), we obtain [40] where τ is the time delay between the two photons, f j (ω), f j (ω) describe the spectrum of light just before it enters the detectors and γ is the mode-matching parameter, which we described in the remainder of this section. Two-photon coincidence probabilities (12) depend on the mode matching in the source field. Spatial and polarization mode mismatch is quantified by the mode-matching parameter γ [40]. Perfectly indistinguishable light sources, such as light from a singlemode fibre, have relative mode matching γ = 1 whereas γ = 0 indicates that the sources are completely distinguishable. Figure 4 depicts how imperfect mode matching, i.e., γ < 1, alters the observed two-photon coincidence counts. Our calibration procedure estimates and accounts for imperfect mode matching, which is assumed to be constant over the runtime of the characterization experiment.
The calculation of the expected coincidence probabilities as a function of the time delay between the photons is detailed in Algorithm 1. The next section describes how single-photon transmission probabilities (10) and two-photon coincidence probabilities (12) are used for characterizing the linear optical interferometer. U Figure 3: Schematic diagram for coincidence measurement the interferometer output when single-photon pairs are incident on two different input ports of an interferometer. The star symbol represents a source of single-photon pairs and the semicircles at the output ports represent single-photon detectors. The coincidence logic, which is depicted by ⊗, counts two-photon coincidence events at the detectors.

Characterization of linear optical interferometer
In this section, we describe our procedure to characterize linear optical interferometers. The outline of this section is as follows. Subsection 3.1 describes the experimental data required by our characterization procedure. This experimental data are processed by various algorithms to determine the transformation matrix (8). The algorithm to determine the amplitudes {α ij } of the transformation-matrix elements is presented in Subsection 3.2. In Subsection 3.3, we describe the calibration of the source field by determining the mode-matching parameter γ. The estimation of {θ ij } using two-photon interference is detailed in Subsection 3.4. Maximum-likelihood estimation is employed to find the unitary matrix U that best fits the calculated {α ij }, {θ ij } values and serves as the representative matrix (8). We discuss the calculation of the best-fit unitary representative matrix in Subsection 3.5.

Experimental procedure and inputs to algorithms
Our characterization procedure relies on measuring (i) the spectral function f j of the source light, (ii) single-photon detection counts, (iii) two-photon coincidence counts from a beam splitter and (iv) two-photon coincidence counts from the interferometer. The measurement data constitute the inputs to our algorithms, which then yield the representative matrix. Before presenting the algorithms, we detail the experimental procedure and the inputs received by the algorithm in this subsection.
We characterize the spectral function f (ω i ) of the incoming light for a discrete set Ω = {ω 1 , ω 2 , . . . , ω k } of frequencies. The integer k of frequencies at which the spectral function is characterized is commonly equal to the ratio of the bandwidth to the frequency step of the characterization device. The characterized spectral function f (ω i ) is used to calculate the coincidence probabilities as detailed in Algorithm 1.
Algorithm 1 Coincidence: Calculates the expected coincidence rate for twophoton interference for a given 2 × 2 submatrix of an arbitrary SU(m) transformation. Input: Frequencies at which f 1 , f 2 are given.
Time delay values.
Mode-matching parameter of photon source. Output: • C : T → R + Two-photon coincidence probabilities correct up to multiplicative factor. for τ in T do 3: Numerically integrate RHS of (12) over ω i , ω j with κ i = κ i = ν j = ν j = 1. return C 6: end procedure The amplitudes {α ij } are determined by impinging single photons at the interferometer and counting single-photon detections at the outputs. Single-photon counting is repeated multiple (B ∈ Z + ) times in order to estimate the precision of the obtained {α ij } values. Specifically, the number of single-photon detection events are counted at all m output ports {i} for single photons impinged at the j-th input ports in the b j -th repetition. The counting is then performed for each of the input ports j ∈ {1, . . . , m} of the interferometer. Algorithm 2 uses N ijb j , b j ∈ {1, . . . , B} values to estimate α ij and the standard deviation of the estimate. The experimental setup for {α ij } measurement is depicted in Figure 2.
Arguments {θ ij } are calculated by fitting curves of measured coincidence counts to curves calculated using measured spectra according to (12). Appendix B elucidates the inputs and outputs of the curve-fitting procedure, such as the Levenberg-Marquardt algorithm [41,42], employed by our algorithms. Before calculating {θ ij }, we calibrate the source field for imperfect mode matching by measuring coincidence counts on a beam splitter of known reflectivity. Controllably delayed single-photon pairs are incident at the two input ports of the beam splitter and coincidence counting is performed on the light exiting from its two output ports. Algorithm 3 details the estimation of γ using coincidence counts C cal (τ ) for time delay τ between the incoming photons.
The absolute values and the signs of the arguments {θ ij ∈ (−π, π]} are calculated separately. To estimate the absolute values {|θ ij |} of the arguments, pairs of single photons are incident at two input ports 1 and j ∈ {2, . . . , m} and coincidence measurement is performed at two output ports 1 and i ∈ {2, . . . , m}. The choice of the input and output ports labelled by index 1 is arbitrary. The signs of the arguments are estimated using an additional (m − 1) 2 coincidence measurements. Algorithm 6 details the choice of input and output ports for estimating {sgn θ ij }. A schematic diagram of the experimental setup for {θ ij } estimation is presented in Figure 3.

Single-photon transmission counts to estimate {α ij } (Algorithm 2)
Now we present our procedure to estimate {α ij } values using single-photon counting.
Single-photon transmission probabilities are connected to the amplitudes {α ij } according to the relation The amplitudes {α ij } are determined by estimating transmission probabilities. The probabilities P 11 , P i1 , P 1j , P ij of single-photon detection at output ports 1, i when single photons are incident at input ports 1, j are expresses in terms of the α ij values according to The probabilities P 11 , P i1 , P 1j , P ij are estimated by counting transmitted photons. The definition (8) of α ij implies that α 11 = α i1 = α 1j = 1. Hence, the values of α ij are connected to the single-photon transmission probabilities according to which is independent of the losses at the input and the output ports. The transmission probabilities P ij are estimated by counting transmitted photons as follows. The estimated values of {α ij } are random variables that are amenable to random error from under-sampling and experimental imperfections. Thus, data collection is repeated multiple times. For accurate estimation of α ij and its standard deviation δα ij , the number B of repetitions is chosen such that the standard deviation of Algorithm 2 AmplitudeEstimation: Uses single-photon detection counts to calculate the amplitudes of the complex entries of the transformation matrix.• represents our estimate of •. Input: • m ∈ Z + , Number of modes of interferometer.
Number of times single-photon counting is repeated . Output: The probabilities P ij are estimated by counting single-photon detection events. Suppose N ijb j photons are transmitted from input port j to the detector at output port i when N b j photons are incident and b j ∈ {1, . . . , B}. For large enough B, the transmission probability converges according to Likewise, the amplitudes {α ij } are estimated by averaging the single-photon detection counts according to The estimate of α ij relies on single-photon counts measured by impinging photons at the first input port repeatedly (repetition index b 1 ∈ {1, . . . , B}) and independently at the j-th input port (with repetitions labelled by a different index b j ∈ {1, . . . , B}). Henceforth, we represent our estimate of any parameter • by•. The estimatẽ α ij calculated using (18) is independent of N b j and thus resistant to variations in the incident-photon number N b j over different input modes j and different repetitions b j . Thus, our estimates {α ij } are accurate in the realistic case of fluctuating light-source strength and coupling efficiencies.
Finally, the standard deviations σ(α ij ) of our estimates are calculated according to which converges for a large enough B. In line with standard nomenclature, we refer to these standard deviations as error bars. Algorithm 2 details the estimation of {α ij } and error bars on the obtained estimates.

Calibration to estimate mode-matching parameter γ (Algorithm 3)
In this subsection, we describe the procedure to calibrate our light sources for imperfect mode matching. The mode-matching parameter γ is estimated using one-and two-photon interference on an arbitrary beam splitter. First, the reflectivity of the beam splitter is determined using single-photon counting [33]. Next, controllably delayed photon pairs are incident at the beam splitter inputs and coincidence counting is performed on the beam splitter output . We introduce a curve-fitting procedure to estimate the value of γ such that (12) best fits the measured coincidence counts. The beam-splitter reflectivity, which is denoted by cos ϑ, is estimated as follows. A beam splitter of reflectivity cos ϑ effects the 2 × 2 transformation U bs = cos ϑ i sin ϑ i sin ϑ cos ϑ.
which is in the form of (8) with α 22 def = cot 2 ϑ. The value of α 22 is estimated using singlephoton counting as described in Algorithm 2. The estimated beam-splitter reflectivity is The error bar on cosθ is estimated by repeating the photon counting along the lines of Algorithm 2.

Algorithm 3 Calibration
Calculates the mode-matching parameter γ of sourcefield using a beam splitter of known reflectivity. Input: Frequencies at which f 1 , f 2 are given.

2:
A ← {cos ϑ, sin ϑ, sin ϑ, cos ϑ} Beamsplitter of reflectivity R (20) 3: Beamsplitter of reflectivity R (20) 4: Least-squares curve fitting to obtain the value of γ that minimizes . The argument 1/C cal (τ ) is the weight function [44] that accounts for experimental noise, which is assumed to be proportional to C(τ ). Ignore values of τ at which C(τ ) = 1. Appendix B details the choice of initial guesses to the algorithm. 6: end procedure Next we estimate γ using two-photon coincidence counting. Controllably delayed pairs of photons are incident at the two input ports of the beam splitter. Coincidence measurement is performed at the output ports for different values of time delay between the two photons. A curve-fitting algorithm is employed to find the best-fit value of γ, i.e., the valueγ that minimizes the squared sum of residues between the measured counts and the coincidence counts expected from (12) for the beam splitter matrix (20). Algorithm 3 details the calculations ofγ, which is used to estimate {θ ij } values accurately.
3.4. Two-photon interference to estimate {θ ij } (Algorithms 4-6) In this subsection, we describe our procedure to estimate the arguments {θ ij } of the representative matrix U (8). Our procedure requires the measurement of coincidence counts for 2(m − 1) 2 different choices of input and output ports. Of these measurements, (m − 1) 2 are used to estimate the absolute values {|θ ij |} of the arguments and the remaining (m − 1) 2 are used to estimate the signs {sgn θ ij }.
The absolute values {|θ ij |} are estimated as follows. Single-photon pairs are incident at input ports 1 and j and coincidence measurements are performed at output ports 1 and i for i, j ∈ {2, . . . , m}. The state (3) of a photon pair is transformed under the action of the 2 × 2 submatrix of U labelled by the rows 1 and i and columns 1 and j. The probability of detecting a coincidence at the output ports 1, i is which is obtained by setting i = j = 1 in (12). The measured coincidence counts are used to estimate the value of |θ ij | as follows. The shape of the coincidence-versus-τ curve (23) depends on the values of α ij and θ ij . The shape does not depend on the parameters κ 1 , κ i , λ 1 , λ i , µ 1 , µ j , ν 1 , ν j , which lead to a constant multiplicative factor to the coincidence expression. Furthermore, the shape is unchanged under the transformation θ ij → −θ ij for θ ij ∈ (−π, π] if the spectral functions are identical. Hence, |θ ij | can be estimated using the shape of the coincidence function (23) and the values {α ij } estimated using Algorithm 2. A curve-fitting algorithm estimates the value |θ ij | ∈ [0, π] that best fits the measured coincidence counts. The calculation of {|θ ij |} is detailed in Algorithm 4.
Our procedure computes the signs by using an additional (m − 1) 2 coincidence measurements. First we arbitrarily set θ 22 as positive sgn θ 22 = 1 (24) because of the invariance of one-and two-photon statistics under complex conjugation U → U * [33]. The signs of the remaining arguments {θ ij } are set using the coincidence Expectation values of Fock-state projection measurement with Fock-state inputs are unchanged under U → U * if the spectral functions are equal f 1 (ω) = f 2 (ω). Otherwise, the sign of −α 22 can be ascertained using the difference in the τ > 0 and τ < 0 coincidence counts in C 2,2,1,1 (τ ). Algorithm 4 Argument2Port: Calculates the unknown complex argument in the entries of a 2 × 2 transformation using a two-photon coincidence curve. Input: Three complex arguments of submatrix.

• γ
Mode-matching parameter of photon source. Output: • |θ ij | Estimated magnitude of the unknown complex argument.
Set of three known phases and one unknown phase. 3: . 5: end procedure counts between output ports {i, i } when photon pairs are incident at input ports {j, j } for a suitable choice of {i , j } as we describe below. The coincidence probability at the output ports i, i is where Curve fitting is employed to estimate the value of β ii jj that best fits the measured coincidence counts. The estimated value of β ii jj is employed by Algorithm 5 to ascertain the sign of θ ij . Algorithm 5 relies on the identity and on known values of to ascertain the sign of θ ij . If the sign of θ ij is positive, then β ii jj = β + ii jj and (27) returns a positive sgn θ ij . Otherwise, β ii jj = β − ii jj , in which case (27) gives a negative sign.
In summary, sgn θ ij is determined using the values of β ii jj , which are estimated by curve fitting, and of β ± ii jj , which are computed using the signs and amplitudes of θ ij , θ i j , θ i j . Algorithms 4-6 detail the step-by-step procedure to determine the absolute values and the signs of {θ ij }.
For certain interferometers U , the ordering of indices ii jj depicted in Figure 5 can lead to instability in the characterization procedure. Appendix A elucidates on this instability and presents strategies to counter the instability. This completes our procedure to characterize the matrix A for representative matrix U = LAM . In the next subsection, we present a procedure to estimate the matrix that is most likely for the characterized matrix A.
The experimentally determinedÃ is different from the actual A because of random and systematic error in the experiment, where we denote the experimentally determined   (33) forÃ (rather than A) differ from the actual L and M respectively. The estimatedŨ =LÃM is thus a non-unitary matrix and is not equal to U in general. Furthermore,Ũ is a random matrix, which depends on the random errors in the one-and two-photon experimental data.
We employ maximum-likelihood estimation to calculate the unitary matrix W that best fits the collected data. First, bootstrapping techniques are used to estimate the probability-density function (pdf) of the entries of the random matrixŨ [46,47]. Next, standard methods in maximum-likelihood estimation [48] are employed to find the unitary matrix W . Maximum-likelihood estimation simplifies under the assumption that the error onŨ is a Gaussian random matrix ensemble, i.e, that the matrix entries Ũ ij are complex independent and identically distributed (iid) Gaussian random variables centred at the correct matrix entries. In this case, the most likely unitary matrix W is the one that minimizes the Frobenius distance ¶ fromŨ [49]. The unitary matrix minimizes the Frobenius-norm distance fromŨ [50]. Thus, if the random errors {U ij −Ũ ij } in the matrix elements are iid Gaussian random variables with mean zero, then W is the best-fit unitary matrix. Figure 6 is a depiction of the actual, the estimated and the most likely transformation matrices. Algorithm 7 computes W . This completes our procedure to estimate the most-likely unitary matrix W that represents the linear optical interferometer. In the next section, we present a procedure to estimate the error bars on the entries of the estimated representative matrix W accurately.  Figure 6: A depiction of the error in reconstruction of the interferometer matrix U . The matrix U represents the unitary transformation effected by the interferometer.Ũ is the complex-valued transformation matrix returned by the reconstruction procedure. Algorithm 7 returns W , which represents the unitary matrix that is most likely to have generated the data collected in the characterization experiment. ¶ The Frobenius norm of a matrix A m×m is defined as

SU(m)
The Frobenius-norm distance between matrices U and V is defined as and is a symmetric, positive-definite and subadditive distance function on the set of matrices.

Bootstrapping to estimate error bars (Algorithm 8)
In this section, we present a procedure to estimate the error bars on the matrix entries {W ij } of the characterized representative matrix W . The entries {W ij } computed by Algorithms 1-7 are random variables because of random error in experiments. Obtaining accurate error bars on these random variables is important for using characterized linear optical interferometers in quantum computation and communication. Current procedures compute error bars under the assumption that Poissonian shot noise is the only source of error in experiment [21,23]. We choose to employ bootstrapping on the data determine error bars [46,47,[51][52][53]. Monte-Carlo simulation is widely used but this technique is not applicable here because the Poissonian shot noise assumption is not reliable given the presence of other sources of error some of which are not understood. Bootstrapping is preferred because the nature of the error need not be characterized and instead relies on random sampling with replacement from the measured data. Bootstrapping can be employed toyield estimators such as bias, variance and error bars.
Algorithm 8 calculates the error bars σ(W ij ) using estimates of the {W ij } pdf's, which are obtained using bootstrapping as follows. The algorithm simulates N characterization experiments using the one-and two-photon data, i.e., the inputs to Algorithms 1-7. In each of the N rounds, the one-and two-photon data are randomly sampled with replacement (resampled) to generate simulated data. The data thus simulated are given as inputs to Algorithms 1-7, which return the simulated representative matrices The pdf's of the simulated-matrix entries {W b ij : b ∈ {1, . . . , N }} converge to the pdf's of the respective elements {W ij } for large enough N [54,55].
The simulated data are obtained in each round by resampling from the one-and two-photon experimental data as follows. Single-photon detection counts are simulated by resampling from the set {N ijb j : b j ∈ {1, . . . , B}} of experimental detection counts (Line 17 of Algorithm 8). Two-photon coincidence counts are simulated by shuffling residuals obtained on curve-fitting experimental data. Specifically, the algorithm (Line 12) resamples from the set of residuals obtained by fitting experimentally measured coincidence counts to function C ii jj (τ ) (12). The resampled residuals are added to the fitted curve to generate the simulated data (Line 14) + . Algorithms 1-7 are used to obtain the simulated elements + The pdf of the residuals is different for different values of τ . We assume that the pdf's for different τ are of the same functional form, albeit with different widths. The distribution of the residuals for different values of τ are determined using standard methods for non-parametric estimation of residual distribution [56,57]. Algorithm 8 normalizes the residuals before resampling from the residual distribution.
W ij of the representative matrix. Finally, the error bars on the {W ij } are estimated by the standard deviation of the pdf of the elements. This completes the characterization of representative matrix W and the error bars on its elements. The next section details a procedure for the scattershot characterization of the interferometer to reduce the experimental time required for characterizing a given interferometer.

Scattershot characterization for reduction in experimental time
In this section, we present a scattershot-based characterization approach to effect a reduction in the characterization time [58,59]. Our scattershot approach reduces the time required to characterize an m-mode interferometer from O (m 4 ) to O (m 2 ) with constant error in the interferometer-matrix entries.
The straightforward approach of characterization involves coupling and decoupling light sources successively for each one-and two-photon measurement. In contrast, the scattershot characterization relies on coupling heralded nondeterministic single-photon sources to each of the input ports of the interferometer and detectors to each of the output ports. Controllable time delays are introduced at two input ports, which are labelled as the first and second ports. All sources and detectors are switched on and the controllable time-delay values are changed first for the first port and then for the second port.
Single-photon data are collected by selecting the events in which exactly one of the heralding detectors and exactly one of the output detectors register a photon simultaneously. Two-photon coincidence events at the outputs are counted when two heralding detectors register photons. The controllable time delays introduced at the first and second input ports ensure that each of the 2(m − 1) 2 coincidence measurements is performed. Note that our characterization procedure (Algorithms 1-8) yields accurate estimates of interferometer parameters even when photon sources with different spectral functions are used. In summary, the required characterization data are collected by selectively recording one-and two-photon events. The setup for the scattershot characterization of an interferometer is depicted in Figure 7. Now we quantify the experimental time required in the characterization of a linear optical interferometer. Our characterization procedure requires Bm 2 single-photon counting measurements and 2(m−1) 2 coincidence-counting measurements to characterize an m-mode interferometer. We estimate the time required for each of these measurements such that random errors in the A ← {cos ϑ, sin ϑ, sin ϑ, cos ϑ}, Φ ← {0, π/2, π/2, 0} 3: Assumption: Residuals cal (τ ) pdf width ∝ C fit (τ ).

5:
10: end for 11: for n = 1 to N do

18:
ShuffledNormalResiduals ii jj (τ ) ← |T | entries in NormalResiduals ii jj (τ ) 19: respectively. More photons need to be incident at the interferometer input ports to offset this decrease in transmission probabilities. Therefore, maintaining a constant standard deviation in the {α ij } and {θ ij } measurements requires O (m) and O (m 2 ) scaling respectively in the number of incident photons, which amounts to an overall O (m 4 ) scaling in the experimental time requirement. Scattershot characterization allows (m − 1) 2 different sets of the one-and two-photon data to be collected in parallel thereby reducing the time required to characterize the interferometer by a factor of (m − 1) 2 . The overall time required for the characterization decreases from O (m 4 ) to O (m 2 ) if the scattershot approach is employed.
Our analysis of scattershot characterization assumes that the coupling losses are small and that weak single-photon sources are used, i.e., that the probability of multiphoton emissions from the heralded sources is small as compared to single-photon emission probabilities. These assumptions are expected to hold for on-chip implementations of linear optics that have integrated single-photon sources and detectors.
Light sources used at each input port in our scattershot-based characterization procedure differ spectrally in generally. Our characterization procedure is accurate despite this difference because we measure source-field spectra and using these data in the curve-fitting procedure.
We have developed the scattershot approach which has advantages and disadvantages but on balance is a superior experimental approach to consecutive measurement. The advantage is that the time requirement for characterization if reduced by a factor that scales as O (m 2 ). The disadvantage is the overhead of requiring one source at each input port and one detector at each output port. The disadvantage is not daunting because these requirements are commensurate with other active investigations of QIP such as LOQC and scattershot BosonSampling. In fact, state of the art implementations [59] meet our increased requirements for scattershot characterization.

Summary of procedure and discussions
In this section, we summarize our characterization procedure for a less formally-oriented audience. We describe the processing of the collected experimental data by the various algorithms presented in Section 3. We compare our procedure with the existing procedure for the characterization of linear optics using one-and two-photons [33]. We provide numerical evidence that our characterization procedure promises enhanced accuracy and precision even in the presence of shot noise and mode mismatch.
The experimental data required by our procedure to characterize an m-mode interferometer includes the following one-and two-photon measurements. The number N ijb j (13) of single-photon detection events is counted at the j-th output port when single photons are incident at the i-th input port. This single-photon counting is repeated B times for each of the input ports and output ports, where B is chosen such that the cumulants of the set {N ijb j : b j ∈ {1, . . . , B}} converge. The single-photon counts {N ijb j } are received by Algorithm 2, which returns the {α ij } (8) estimates using Eq. (18).
The spectral function f j (ω) (2) of the light incident at each input port j is measured. This function is used by Algorithm 1 to calculate the expected two-photon coincidence curves using Eq. 12. Fitting experimental data to these coincidence curves yields an accurate estimate of the mode-matching parameter during calibration and the arguments {θ ij } in the argument-estimation procedure. Thus, the spectral function f j (ω) serves as an input to the algorithms for the estimation of the mode-matching parameter and of the arguments {θ ij } (Algorithms 3-6).
The mode-matching parameter γ is estimated by performing coincidence measurement on a beam splitter that is separate from the interferometer but is constructed using the same material. First, we use single-photon data to estimate the reflectivity cos ϑ of the beam splitter according to Eq. (21). Imperfect mode-matching changes the shape of the coincidence curve, and we find γ by comparing the shapes of (i) the curve expected for reflectivity cos ϑ and (ii) the curve obtained experimentally. The estimated beam splitter reflectivity, the measured spectra and the coincidence counts are received as inputs by Algorithm 3, which returns an estimate of γ.
Algorithm 6 uses two-photon coincidence counts to estimate the arguments {θ ij }. Coincidence counts are measured for the input ports j, j and output ports i, i for the 2(m − 1) 2 sets Bootstrapping is employed to test the goodness of fit between the experimental curve and expected curves [61]. Experiments [11,36] can employ bootstrapping instead of the incorrect χ 2 -confidence measure to test if the data are consistent with quantum predictions or with the classical theory.
Finally, we recommend a scattershot approach for reducing the experimental time required to characterize interferometers. The approach involves coupling heralded nondeterministic single-photon sources at each of the input ports and single-photon detectors at each of the output ports. All the sources and the detectors are switched on in parallel. Single-photon counts are recorded selectively as two-photon coincidences between the heralding detectors and the output detectors, while two-photon events are recorded when two heralding detectors and two output detectors record photons. Controllable time delays are introduced at the first and second input ports so coincidences between each of the 2(m − 1) 2 choices (39) of input and output ports are recorded. The scattershot approach reduces the experimental time required to characterize an m-mode interferometer from O (m 4 ) to O (m 2 ). Now we compare and contrast our procedure with the Laing-O'Brien procedure [33]. Our procedure is inspired by the Laing-O'Brien procedure in that it employs (i) a 'real-bordered' parameterization (8) of the representative matrix and modelling of linear losses at the interferometer ports, (ii) a ratio of single-photon data to estimate the complex amplitudes of the matrix elements and (iii) an iterative procedure that uses two-photon data to estimate the amplitudes of the complex arguments and to estimate the signs of the complex arguments.
Our procedure differs from the Laing-O'Brien [33] procedure in that we use averaged value (18) Figure 8: The fitting of coincidence data to curves obtained from spectral functions using (12) and to Gaussian functions. Coincidence counts are simulated using experimentally measured spectra.
Another advance in our method is the curve-fitting procedure for estimating complex arguments of interferometer matrices. The Laing-O'Brien procedure requires coincidencecurve visibilities to estimate complex arguments α ij . Whereas the Laing-O'Brien procedure recommends coincidence probabilities be measured at zero time delay and also at time delays large as compared to the temporal spread of the wave-packet, in practice, current implementations determine the visibilities by fitting experimental data to Gaussian curves [35,[63][64][65][66][67]. These implementations are flawed because source spectra differ from Gaussian in general. Our procedure is accurate because the data are fit to curves computed from spectral functions, rather than fitting to Gaussians. Figure 8 illustrates the distinction between fitting experimental coincidence counts to the coincidence function (12) simulated using spectra and fitting to Gaussian functions. Figure 9 demonstrates the increase in accuracy and precision of characterization by using the correct curve-fitting function.
We introduce the calibration subroutine, which relies on the estimation of the mode mismatch in the source field. Spatial and polarization mode mismatch is not an issue of major concern in waveguide-based interferometers, which typically operate in the single-photon regime. In these interferometers, the calibration step of our procedure can be neglected without decreasing accuracy. The mode-mismatch parameter γ, which is an input of the curve-fitting procedure, is set to unity.
In the context of bulk-optics, our calibration step ensures accuracy and precision if (i) γ is identified as the maximum-possible source overlap in the spatial and polarization degrees of freedom and (ii) the experimentalist adjusts the setup to maximize coincidence One-and two-photon interference data was simulated for a five-channel interferometer using experimentally measured spectra and simulated Poissonian shot-noise. Characterization was performed by fitting coincidence curves to Gaussians (red curve) and to correct curves according to our procedure (blue curve).
matlab code for the simulations depicted in this figure is available on GitHub [62] visibility for the calibrating beam splitter and for each choice of interferometer inputs ports. Such an adjustment will ensure that the source overlap acquires its maximumpossible value γ in each of the coincidence-curve measurements. This maximum value is a property of the sources used and is independent of source alignment and focus so is expected to remain unchanged between different confidence measurements. Figure 10 demonstrates the increase in accuracy and precision of characterization by using the calibration procedure Other advances made in our characterization procedure over existing procedures include (i) a maximum likelihood estimation approach to determine the unitary matrix that best fits the data (ii) a bootstrapping based procedure to obtain meaningful estimates of precision and (iii) a scattershot-based procedure to improve the experimental requirements of characterization.

Conclusion
In conclusion, we devise a one-and two-photon interference procedure to characterize any linear optical interferometer accurately and precisely. Our procedure provides an algorithmic method for recording experimental data and computing the representative transformation matrix with known error. The procedure accounts for systematic errors due to spatiotemporal mode mismatch in the source field by means of a calibration step and corrects these errors using an estimate of the mode-matching parameter. We measure the spectral function of the incoming light to achieve good fitting between the expected and measured coincidence counts, thereby achieving high precision in characterized matrix elements. We introduce a scattershot approach to effect a reduction in the experimental requirement for the characterization of interferometer. The error bars on the characterized parameters are estimated using bootstrapping statistics.
Bootstrapping computes accurate error bars even when the form of experimental error is unknown and is, thus, advantageous over the Monte Carlo method. Hence, our bootstrapping-based procedure for estimating error bars can replace the Monte Carlo method used in existing linear-optics characterization procedures. We open the possibility of applying bootstrapping statistics for the accurate estimation of error bars in photonic state and process tomography.

Appendix A. Removal of instability in characterization procedure
In this section, we describe an instability in our characterization procedure, which can yield large error in the {W ij } output for small error in the experimental data C exp ii jj (τ ) in case of certain interferometers W . We present a strategy to circumvent this instability by means of collecting and processing additional experimental data.
The instability in the characterization procedure arises because of an instability in estimation of {sgn θ ij } (Algorithm 5). Small error in the measured coincidence counts can lead to the wrong inference of sgn θ ij , which can lead to a large error W − U in the characterized matrix W . Recall that Algorithm 5 uses the identity ii jj (27) to determine the sign of the arguments, where β ± ii jj def = |θ i j −θ ij −θ i j ±|θ ij || and the values of β ii jj , θ i j , θ i j , θ ij , |θ ij | are estimated by curve fitting.
Random and systematic error in measured coincidence counts can lead to estimate of variables β ii jj , θ i j , θ i j , θ ij , |θ ij | differing from their actual values. The estimation of sgn θ ij is unstable if the θ i j − θ i j − θ ij term (27) is close to 0 or π because, in this case, a small error in the β ii jj estimate can lead to an incorrect sgn θ ij estimate. In other words, the sign estimates are unstable if the values of are small compared to the error in our β ii jj , θ i j , θ i j , θ ij , |θ ij | estimates. We mitigate the sign-inference instability by making two modifications to our characterization procedure; the first modification removes instability from the signinference of the second row and second column elements whereas the second modification prevents incorrect inference of the remaining signs. The inference of {sgn θ i2 }, {sgn θ 2j } (Lines 14-17, Figures 5b, 5c) is unstable if is small as compared to the error in the β i2j2 , θ 22 , θ 2j , θ i2 , |θ ij | estimates. Hence, we relabel the interferometer ports such that θ 22 is as far away from 0 and π as possible. Specifically, after the amplitudes of the phases have been estimated (Line 8 of Algorithm 6), we choose i, j for which |θ ij − π/2| is minimum, and we swap the labels of input ports 2, j and output ports 2, i. We measure two-photon coincidence counts based on this new labelling and process it using Algorithm 6. The instability in the procedure for estimation of the {θ i2 }, {θ 2j } signs is removed as a result of the relabelling. The second modification is aimed at removing the instability in the remaining signs. The procedure estimates the remaining signs by using {C exp ii jj (τ )} values for i = j = 2. The estimation of θ ij is unstable if θ ref i2j2 is small as compared to the error in the β i2j2 , θ 22 , θ 2j , θ i2 , |θ ij | estimates. We make a heuristic choice of a threshold angle θ T that accounts for the error in these variables, and we reject any sgn θ ij inferred using θ ref i2j2 ≤ θ T . Additional two-photon coincidence counting is performed and employed to estimate these values of θ ij , as detailed in the following lines that can be added to the algorithm to remove the instability if θ r i2j2 < θ T then 3: Choose i = 1, i and j = 1, j such that |θ i j − θ ij − θ j j | is closest to π/2.

4:
C exp ii jj (τ ) ← Coincidence counts for input ports j, j and output ports i, i .

Appendix B. Curve-fitting subroutine
Our characterization procedure employs curve fitting in Algorithm 3 to estimate the mode-matching parameter γ and in Algorithms 4-6 to estimate {θ ij } values. The curvefitting procedure determines those values of unknown parameters that maximize the fitting between experimental and expected coincidence data. In this section, we describe  Figure B1: Simulated coincidence counts for output ports i, i and input ports j, j of interferometer with α ii = α i j = √ 3/4 and α ii = α ij = 1/4 and for different values of β ii jj . The value of β ii jj in each respective figure is (a) π, (b) 0, (c) π/3 and (d) 2π/3. The coincidence counts corresponding to τ = 0 and τ → ∞ are marked on each plot by C exp (0) and C exp (∞) respectively. the inputs and outputs of the curve-fitting subroutines. We present heuristics to compute good initial guesses of the fitted parameters.
The curve-fitting subroutine receives as input (i) the choice of parameters to be fitted; (ii) the coincidence counts {C exp ii jj (τ )}; (iii) an objective function, which characterizes the least-square error between expected and experimental counts; and (iv) the initial guesses for each of the fitted parameters. The output of the curve-fitting subroutine is the set of parameter values that optimize the objective function.
The first input to the subroutine is the choice of the parameters to be fit. The curve-fitting subroutine fits three parameters. One of these three (namely the modematching parameter γ in Algorithm 3 or the |θ ij | or β ii jj value in Algorithm 6) is related to the shape of the curve, whereas the other two are related to the ordinate scaling and the abscissa shift of the curve respectively. The ordinate scaling factor comprises the unknown losses {κ i , ν j }, transmission factors {λ i , µ j } and the incident photon-pair count. The horizontal shift factor accounts for the unknown zero of the time delay between the incident photons. The algorithm returns the values of the shape parameter, the abscissa shift and the ordinate scaling that best fit the given coincidence curves.
The objective function quantifies the goodness of fit between the experimental data and the parameterized curve. We use a weighted sum τ ∈T w(τ )|C exp (τ ) − C (τ )| 2 (B.1) of squares between the experimental data and the fitted curve as the objective function [44] for weighs w(τ ). We assume that the pdfs of the residues are proportional to C exp (τ ) and we assign the weights to the squared sum of residues. In case the pdf's of the residuals for different values of τ is not known, standard methods for non-parametric estimation of residual distribution can be employed to estimate the pdf's [56,57]. Thus, the curve fitting algorithm returns those values of the fitting parameters that that minimize weighted sum of squared residues between experimental and fitted data. The curve-fitting procedure optimizes the fitness function over the domain of the fitting parameter values. Like other optimization procedures, the convergence of curve fitting is sensitive to the initial guesses of the fitting parameters. The following heuristics give good guesses for the three fitting parameters. We guess the ordinate scaling as the ratio C exp (∞) C ii jj (∞) (B.3) of the experimental coincidence counts to the coincidence probability C ii jj (∞) for large (compared to the temporal length of the photon) time-delay values. The γ value is guessed for Algorithm 3 as the ratio of the visibility of the experimental curve to the expected visibility in the curve. The initial guesses for ϑ ≡ |θ ij | and ϑ ≡ β ii jj are based on the known estimate of γ and the visibility V = 2γ cos 2 ϑ sin 2 ϑ cos 4 ϑ + sin 4 ϑ .
(B.5) of the curve. As there are four kinds of curves (see Figure B1) possible for different values of the shape parameter (γ, |θ ij |, β ii j ), another approach is to perform curve fitting four times, each time with a value from the set π/4, 3π/4, 5π/4, 7π/4 of initial guesses and choose the fitted parameters that optimize the objective function. Finally, the initial value of the abscissa shift parameter is guessed such that the global maxima or minima (whichever is further from the mean of the coincidence-count values over τ ) of the coincidence curve is at zero time delay. In summary, the curve fitting procedure uses the measured coincidence counts, the objective function and the initial guesses to compute the best fit parameters. This completes our description of the curve-fitting procedure and of heuristics that can be employed to computed the initial guesses for the fitted parameters.