On unitary reconstruction of linear optical networks

Linear optical elements are pivotal instruments in the manipulation of classical and quantum states of light. The vast progress in integrated quantum photonic technology enables the implementation of large numbers of such elements on chip while providing interferometric stability. As a trade-off these structures face the intrinsic challenge of characterizing their optical transformation as individual optical elements are not directly accessible. Thus the unitary transformation needs to be reconstructed from a dataset generated with having access to the input and output ports of the device only. Here we present a novel approach to unitary reconstruction that significantly improves upon existing approaches. We compare its performance to several approaches via numerical simulations for networks up to 14 modes. We show that an adapted version of our approach allows to recover all mode-dependent losses and to obtain highest reconstruction fidelities under such conditions.


I. INTRODUCTION
Linear optical quantum computing has attracted major attention since Knill, Laflamme and Milburn have introduced a scheme for efficient quantum computation in 2001 [1]. Notwithstanding tremendous technological progress, the experimental realization of such computers is still challenging with steep requirements for the generation, manipulation and detection of the quantum states of light. In the last decade, integrated quantum photonics [2] has become central to the technological progress by providing the means to miniaturize and mass fabricate vital components, such as quantum light sources [3][4][5][6], quantum storage devices [7,8] and highly efficient photon detection on chip [9][10][11][12]. Of particular importance is the manipulation of the quantum states of light via linear optical networks (LONs). Here, integrated quantum photonics enabled the fabrication of LONs with unprecedented levels of interferometric complexity. The inventory includes optical elements, facilitating either fixed [13][14][15][16][17] or tunable [18][19][20][21] single-qubit transformations but also novel hybride elements [22]. Arranging several of these elements allows fabrication of miniaturized versions of logic gates which are essential in quantum information processing. Those gates resemble small to medium scale interferometers and are either tailored to perform a particular task or can be reconfigured for a variety of tasks [13,15,20,21]. Here, the physical encoding scheme, e.g. a dual-rail polarization encoding, determines how these gates, unitary matrices acting on logical qubits, must be compiled from different optical elements. Hence, the overall transformations of these LONs are required to be unitary, too. This poses a challenge for experimental realizations which are inevitably afflicted by imperfections. Here, the major contributors are on one hand losses, which render these devices non-norm preserving. On the other hand fabrication imperfections stemming from the production itself cause deviations between an initially targeted transformation and the implemented one. Obtaining precise knowledge of an optical transformation at hand therefore is important and serves multiple purposes: It allows theoretical modelling of an experiment and allocation of the error budget to different sources of error. More important, identifying deviations or even defective optical elements is essential for troubleshooting experiments and improving future versions of an optical circuit. The ability to construct high-fidelity gate operations becomes a stringent requirement when circuits are scaled up beyond proof of principle implementations [23]. Since all waveguides are embedded into a bulk material, only the input and output ports and therefore the overall transformation is directly accessible. To acquire knowledge of such an overall transformation one can use different approaches relying on quantum resources [24][25][26][27][28] or classical resources [29][30][31][32]. The majority of these techniques fall in the category of quantum process tomography ('QPT'). While QPT is well established in quantum science, it faces challenges when applied to large networks due to quickly growing resource costs. Alternative techniques, reconstructing a transition matrix [31] or a unitary matrix description [28] of a LON, became prominent in the context of BosonSampling [33][34][35][36][37]. These methods scale more favourably with respect to the required measurements when compared to QPT, and reconstruct matrix descriptions of LONs omitting global phases at the input and output ports. Methods which rely on light that is scattered out of the LON are not further considered as the technique relies on loss that compromises the guiding properties of the whole structure [35]. Here, we present a new approach to characterize the transformation implemented by LONs. We choose to enforce unitarity from the start by parametrizing the LONs as interferometers composed of beam splitters and phase shifters [38]. The reconstructed unitary matrices are then obtained by optimizing the beam splitting ratios and phase shifts to best explain a set of data sensed via probe states injected into the network. We utilize a data set composed of two-photon interference visibilities [39], rendering the procedure insensitive to input and output loss. In this way both afore mentioned purposes of network reconstruction are fulfilled simultaneously; generating a description to model experimental data and gathering knowledge about the transformation of individual optical elements. Our method is a departure from the strategy of related approaches [27,31], which aim to reconstruct transition matrices, or do so in a first step [28,40]. Whereas transition matrices are sufficient to characterize a LON and to model experimental results, they do not allow to identify deviations of individual elements directly. Here a indirect route via a polar decomposition [28,40] and subsequent decomposition into individual elements must be chosen. The identification of faulty elements through a decomposition procedure requires LONs exhibiting a non-redundant layout, e.g. Reck et al. type networks.

II. DIFFERENT APPROACHES TO RECONSTRUCTION
In the following we will compare three approaches to LON characterization. 'Brisbane' [31] and 'Bristol' [28] 1 were chosen for their frequent usage in experiments relying on integrated LONs. Our approach, subsequently labelled 'Vienna', is formalizing ideas developed in [36,41]. Similar to [40,42], an over-complete set of primary data can be used to increase reconstruction fidelities, although we find the effect to be minor (see figure 2). In the following, primary data is referring to data sensed for reconstruction purposes in any of the approaches. The three compared approaches differ in their strategy to characterize a LON. 'Bristol' and 'Brisbane' first reconstruct the individual matrix entries of a scattering description and then require a polar decomposition step to recover the closest unitary matrix. Both utilize a sufficient set of data for this purpose and in the following we refer to this strategy as a passive approach. In contrast 'Vienna' follows an active approach utilizing a larger set of primary data in a global optimization routine which already implements the unitary constrains from the beginning. The approach 'Brisbane' aims to reconstruct a description of a black-box linear optical network from a sufficient set of primary data generated with coherent probe states. This data is then mapped one to one unto a scattering matrix representation M without the need to apply any further algorithms. The scattering matrix M represents a submatrix of a larger unitary matrix U, where the additional modes of U correspond to loss modes. Hence, the task of loss modelling translates to the task of finding a loss matrix that couples the interferometer modes of M to the loss modes of U. This necessarily includes input and output loss terms, as the approach processes transition amplitude data. 'Brisbane' covers the case of mode-dependent input loss in the following way: the loss term for each input mode k is directly given by the ratio between the total power exiting the LON and the power injected into mode k. Subsequently the input loss is modelled by virtual beam splitters, where the square root of the loss terms corresponds to the transmittivity of the virtual beam splitters. Note that such a loss modelling works only in the case when the mode-dependent output loss of the network is zero. In general, loss inside a network cannot be parametrized this way and thus it remains unclear how a more evolved loss modelling can be included in 'Brisbane'. Hence, only a partially loss modelling scattering description M , which is closer to the unitary description U than the matrix M, can be found via this method. Experimental environments exhibiting loss as detailed above cause M to be noticeably non-unitary and necessitate a polar decomposition if the closest unitary descriptionŨ is to be obtained. This polar decomposition can introduce further error dependent on the size of the interferometer under test [43]. Up to the deviations introduced by the polar decomposition the entries ofŨ reconstructed via this procedure are identical to the primary data generated. This self-consistency proves to be challenging when assessing the fidelity and the uncertainty of a reconstructed matrix in the presence of measurement errors. The complete set of single-input data and phase data is already used to fix the real entries and phases of M. To obtain realistic error estimations for the individual entries of M orŨ, the various error sources need to be studied in detail, including calibration uncertainty of the detector efficiencies and uncertainties introduced by fiber mating and coupling to a waveguide. Opposed to more complex algorithmic approaches like 'Bristol' or 'Vienna', the sensed data cannot be used to generate a quantifier for reconstruction success. An additional set of data generated by different means than coherent states is required for this purpose, e.g. a set of two-photon interference visibilities as done in [31]. The approach 'Bristol' aims to reconstruct a unitary description of a black-box linear optical network from a set of primary data, which in this case is generated via quantum probe states. The magnitude of each phase is sensed via a visibility of a non-classical two-photon interference but the sign of the phase needs to be calculated in relation to the other phases such that unitary constraints are obeyed. Note that a phase sensed this way is not a direct phase measurement like in the approach 'Brisbane'. Furthermore the visibility of each non-classical interference is also modified by the four contributing real entries τ jk of the scattering matrix, with j and k labelling the output and input modes, respectively. These transition amplitudes are calculated in a fashion insensitive to mode-dependent input and output loss: the loss terms drop out by relating all input and output single-photon count rates to each other. Therefore each entry of the reconstructed matrix, M jk = τ jk e iθ jk , becomes dependent on the whole set of primary data. All τ jk and θ jk are recovered by solving a linear system of non-linear equations. Again the closest unitary matrix,Ũ, can be found by applying a polar decomposition. Whereas the method is insensitive to loss at the input and output ports of a LON, it is sensitive to mode-dependent propagation loss. The latter introduces systematic error in the algorithm, already before applying the polar decomposition (Further detail on the influence of mode-dependent propagation loss for all three reconstruction approaches is given in Appendix D). Ultimately, the combination of the non-linear dependencies in the algorithmic approach with just a sufficient set of data leads to a lower performance of 'Bristol' with respect to 'Brisbane' and 'Vienna'.  . Primary dataṼ,τ ,θ from a LON described by the unitary matrix U (p) is measured. In general this data will be error afflicted denoted by the tilde. The layout and initial parameters p are either known from fabrication (dashed arrow) or are obtained by a reconstruction step, e.g. 'Brisbane'. These initial parameters are now subjected to a global optimization using an, in the best case over-complete, set of primary data. Here the output yields both, the reconstructed unitaryŨ (p ) 'Vienna' and the parameters of the individual building blocksp .
Our new approach 'Vienna' aims to reconstruct the unitary matrix descriptions U (p) of a LON via a global optimization of optical element parameters, p, which is visualized in figure 1 as a flowchart. For purpose-built networks, e.g. quantum logic gates, the physical layout of the optical elements and their target parameters are defined by the encoding scheme and type of gate. If they are sufficiently precise, these target parameters can serve as initial guesses for p and therefore as the starting parameters in the optimization routine. Black-box m × m LONs can be represented by an arrangement of n = m 2 beam splitters and n − 1 phase shifters [38] (see the Appendix for a sketch). In our approach it is sufficient to obtain just one representation of the physical network decomposed in such a way. However, without any knowledge of the starting values the minimization of the pdimensional landscapes is prone to converge into some local minimum only. Hence the starting values need to be obtained by different means. Here we utilize one of the passive reconstruction approaches and find that the approach 'Brisbane' is better suited for this purpose than the approach 'Bristol'. Note that both approaches lead to reconstructed unitaries that are equivalent to the unitary decomposed via the Reck et al. scheme modulo diagonal phase matrices. These diagonal phase matrices do not affect the extraction of the starting parameters [44]. To reconstruct U(p) a set of primary data,Ṽ,τ ,θ, is recorded, whereṼ denotes the full set of two-photon interference visibilities andτ andθ denote the full set of normalized transition amplitudes and relative phases sensed via coherent states, respectively 2 . This data will be in general error afflicted indicated by the tilde.τ ,θ are only used in the case of black-box networks to obtain the initial starting parameters for the global optimization. Finally, a global cost function using an overcomplete set of two-photon interference visibilities,Ṽ, is minimized to obtain optimal reconstructed parameters p . These automatically yield the reconstructed unitarỹ U (p ) 'Vienna' . Optimizing a global cost function comes with an additional advantage: the minimum of that function can act as an direct estimator for the success of the reconstruction and is in our case identical to the χ 2 [45], allowing further statistical interpretation. We choose to utilize just two-photon interference visibilities for the reconstruction ofŨ (p ) 'Vienna' . These visibilities are insensitive to input and output loss, thusŨ (p ) 'Vienna' represents a unitary description modulo loss matrices at the input and output. The parameters of these loss matrices can be easily recovered by using loss sensitive data such as transition amplitude data,τ , and solving a system of linear equations utilizing the reconstructed descriptioñ U (p ) 'Vienna' .

III. RESULTS
We compare the reconstruction results for the different approaches 'Brisbane', 'Bristol' and 'Vienna' numerically for fully coupled m × m networks in the presence of per-2 Throughout the numerical evaluation the transition amplitude data is normalized such that m j=1 |τ jk | 2 = 1. turbance on the primary data. To quantify the performance of the different approaches the fidelity between the initially generated Haar-random unitary matrix, H m,j , and the reconstructed unitary matrixŨ m,j,σ,µ is calculated. We consider m × m networks with m = 4, . . . , 14, where µ labels the reconstruction approach and σ denotes the level of perturbance on the primary data (see Appendix A for further details). Losses are kept zero to allow for a fair comparison between loss sensitive and insensitive approaches. For each network size, 120 Haarrandom unitary matrices are generated (labelled by j) 3 to ensure that random properties of a jth unitary, e.g. symmetry, do not lead to biased results. The unitary descriptions are calculated via a Monte Carlo method drawing the data required for each reconstruction approach, µ, randomly from the set of perturbed data, (Ṽ,τ ,θ) m,j,σ . An average unitary description,Ũ m,j,σ,µ , is obtained after 120 iterations with σ(Ũ m,j,σ,µ ) denoting its standard deviation. We use the fidelity measure, which is normalized by the number of modes, m, such that it is insensitive to the network size. Here, . denotes the trace norm. For given m, σ, µ the resulting fidelity histograms are fitted with Weibull distributions centred around the most probable value,F m,σ,µ (see Appendix A). As an error measure, σ 1 e (F m,σ,µ ), the distances between the most probable fidelity and the two fidelities where the maximum probability decreased to 1 e are used. The most probable fidelities and their respective uncertainties for 200 different combinations of network size and perturbance on the primary data are recovered for each of the reconstruction approaches. Figure 2 shows a representative example, once for 12 × 12 networks and variable error on the primary data, σ, and once for an error of σ = 2.5% and variable network size. Clear differences between the approaches can be observed with respect to the overall performance, the scaling and the dispersive behaviour. Such differences must be attributed to specifics of the reconstruction algorithms as all approaches reconstruct the same unitaries H m,j .
The approach 'Brisbane' shows high reconstruction fidelities with low dispersion which scale linearly with the error, σ, on the primary data (figure 2(a)). Here, an upper bound to the deviation of the reconstructed unitaries, U m,j,σ , from the initial one, H m,j , in the Frobenius norm can be even given analytically: Where M and κ(M ) = M −1 · M denote the reconstructed matrix before applying a polar decomposition and its condition number, respectively. Hence the deviation, = M − H , stems from the polar decomposition, which in turn is due to the errors of the primary data that contribute in first order approximation as . The data shown in figure 2(b) indicates, that the performance of the approach 'Brisbane' is only slightly affected by the network size. The reconstruction fidelities yielded with the approach 'Bristol' are dominated by a (sub)exponential decay, both as a function of the error on the primary data and the network size. This also leads to a highly dispersive behaviour which is reflected in the comparably large error bars in figure 2. While the exponential decay is already  observed in the original publication [28], this just represents a phenomenological fit to the data. We conjecture that on one hand this scaling originates from the additional matrix inversion that has to be taken into account when the transition amplitudes are calculated. On the other hand, the primary data undergoes a more complex algebraic transformation, which, dependent on the noise of the primary data, can cause unfavourable amplification of perturbations. Reconstructing the unitary description of unknown m×m networks via the approach 'Vienna' or 'Vienna reduced' works with highest fidelity and minimal uncertainty. Here, 'Vienna reduced' denotes a variant of 'Vienna' utilizing a smaller set of primary data (see Appendix C for further information). This behaviour can be primarily attributed to the natural implementation of the unitary constraints in the algorithm. The statistical advantage of a full over-complete set of primary data as used in 'Vienna' over a smaller set of primary data as used in 'Vienna reduced' is noticeable, albeit being small. For both approaches we find that errors on the starting parameters are of greater impact than errors on the primary data sets of two-photon visibilities. This was tested via a separate numeric evaluation. All starting parameters are extracted using the approach 'Brisbane' and as a consequence the scaling with the error on the primary data, σ, is inherited for large σ. A better scaling is found for small σ, as more precise starting parameters increase the chance that the optimization routine will converge into the global minimum. Both, 'Vienna' and 'Vienna reduced', exhibit negligible dependence on the size of the m × m networks.

IV. DISCUSSION
Precise knowledge about the optical transformation of LONs is a requirement for the validation of experimental results against theoretical predictions in numerous experiments [20,[46][47][48]. For this purpose, the optical transformations can be given either in terms of scattering descriptions or unitary transformations, where the latter allow a decomposition into individual building blocks. Hence the element parameters for each optical element in the LON can be obtained. From a technological point of view this is beneficial as it enables the localization of erroneous elements. Here we present a new approach, 'Vienna', and compare it to two prominently used approaches, 'Brisbane' and 'Bristol'. We investigate all approaches for the regime of zero mode-dependent loss and quantify the differences in reconstruction performance of unitary descriptions via an extensive numerical evaluation: more than 10 5 m × m black-box networks are sampled for distinct m from primary data exhibiting various levels of perturbance. The results substantiate that the direct implementation of the unitary constraints, as done in the approach 'Vienna', are of advantage for highest reconstruction fidelities. Two-photon interference visibilities play a unique role as they are a priori insensitive to input and output loss and allow to obtain over-complete sets of primary data, which are beneficial for highest reconstruction precision. To quantify the performance of the different reconstruction approaches a fidelity between the initially generated Haar-random unitary matrix, H m,j , and the reconstructed unitary matrixŨ m,j,σ,µ is calculated. Here µ denotes the reconstruction approach and σ the level of perturbance on the primary data. For each network size, 120 Haar-random unitary matrices are generated (labelled by j) to ensure that random properties of a jth unitary, e.g. symmetry, do not lead to biased results. For 'Bristol' always j = 1000 matrices are sampled due to the dispersed results. Subsequently the full set of primary data, (V, τ , θ) m,j , is computed from each H m,j , where V, τ , and θ denote the sets of two-photon visibilities, transmission intensities and phases sensed via coherent states, respectively. Under experimental conditions the primary data sets would be afflicted by statistic and systematic noise. We mimic this by perturbing the primary data sets with noise drawn from a normal distribution N (0, σ 2 ) of standard deviation σ centred around zero. The perturbed primary data distributions are given as (Ṽ,τ ) m,j,σ = (1 + N (0, σ 2 )) (V, τ ) m,j and 20 different values of σ are sampled in 0.5% steps from σ = 0.5% to σ = 10%. Note that for the phase data ,θ m,j,σ = θ m,j + N (0, σ 2 ), absolute perturbances were chosen. Eventually the unitary descriptions are calculated via a Monte Carlo method drawing the data required for each reconstruction approach, µ, randomly from (Ṽ,τ ,θ) m,j,σ . An average unitary description,Ũ m,j,σ,µ , is obtained after 120 iterations with σ(Ũ m,j,σ,µ ) denoting its standard deviation. This way errors are estimated via an identical procedure, independent of whether an analytic error propagation method is available or not. Finally the fidelity (see eq. 1) for each of the ≈ 10 5 reconstructed unitaries is computed. A subset of the computed data is shown in Figure 3b), visualized as a two dimensional histogram for m = 4 and µ = 'Brisbane'. Here the data points along one row, i.e for a given perturbance σ, are composed of j = 1000 instead of j = 120 fidelities, for visualization purposes only. The absolute frequencies for a given σ can be associated with a probability distribution of a certain width, where the highest peak represents the most probable fidelity. For small perturbances σ those distributions will be in general sharp but asymmetric, whereas for larger σ dispersed and more symmetric distributions are found. To capture all but the most dispersed results we chose to fit them with a Weibull distribution centred around the most probable value,F m,σ,µ . As an error measure, σ 1 e (F m,σ,µ ), the distances between the most probable fidelity and the two fidelities where the maximum probability decreased to 1 e are used.  Figure 3. a) Flowchart for the numerical method used to evaluate the different reconstruction approaches. b) Frequency histogram of fidelities obtained in the case that the reconstruction approach 'Brisbane' is applied to 4 × 4 networks. The fidelity axis is divided into 50 bins ranging from 0.95 to 1, whereas the perturbance on the primary data, σ, ranges from 0.5% to 10% in 0.5% steps. For illustration purposes the sampling size is increased from 120 to 1000 4 × 4 Haar random unitaries for each σ.

Probability distributions
For the lossless case discussed above, Weibull distributions are used to extract the most probable fidelity, F m,σ,µ , and the 1 e errors of the fidelity, σ 1 e (F m,σ,µ ). The Weibull distribution is given as with λ and k the scale and shape parameter, respectively.

APPENDIX B: THE GENERATION OF PRIMARY DATA
An m-mode linear optical scattering network can imprint new amplitude and phase information on an impinging light field (see figure 4). Whereas a large class of classical and quantum light fields can be used with the integrated optical networks considered here, their main application lies in the manipulation of coherent states or single photon Fock-states. Likewise, both states of light are suited as probe states to sense the transition-amplitudes or phases imprinted by the network. In the case that the light source used for characterization differs from the light source used in an experiment care has to be taken that the physical properties, especially the frequency and frequency bandwidth, are kept identical. When injected into a single input port k of a network, both coherent-and single-photon Fock-states allow to sense the transition amplitude τ j,k of a specific matrix entry U j,k with j denoting an output mode 4 . However, intensities measured in this way will be affected by loss, coupling and detection efficiencies. Rudimentary techniques to directly measure input loss [31] work only in the case of zero output loss. Alternatively input and output loss can be traced out during the reconstruction process [28]. While this procedure is under ideal conditions loss-insensitive, the required algebraic transformations may even amplify error stemming from the primary data. Thus loss still presents a major problem when sensing transition-amplitudes and is best dealt with by careful calibration of e.g. detector efficiencies. This way a complete and reasonably accurate set of m 2 real entries of any m × m unitary can be sensed if modedependent propagation loss plays a secondary role. To sense the phases of a linear optical network coherent states can be distributed among two input modes k and l, |α 1 k |e iϕ α 2 l [31]. It is sufficient to choose l = 1 and subsequently measure the different input combinations k = 2 . . . m. Modulating the phase ϕ of this two-mode state at a frequency ω results in output intensities in all coupled output modes that are subjected to the modulation frequency ω albeit featuring a relative phase shift γ j,k between output modes. These relative phase shifts correspond to the phases γ j,k of the unitary matrix entry U j,k with all γ j,k = 0 for j ∨ k = 1 and an arbitrary sign for γ 2,2 . Omitting the intensities and only recording this relative phase renders the measurement insensitive to mode-dependent input and output loss. Experimentally the modulation ω can be realized through a piezo actuated mirror, a delay line or similar devices. Since coupling to integrated devices is predominantly implemented with fiber arrays the phase ϕ and modulation with frequency ω will be affected by fluctuations caused by temperature or vibrations [49,50]. Care has to be taken that such noise is kept below a threshold which still allows to identify a ω-periodicity in the output signal. In general the error can be largely minimized by utilizing a modulation frequency ω that is well separated from the frequency of the noise in the laboratory. An alternative technique to sense the phase information utilizes the non-classical interference of two photons [39]. It can be shown [28,42] that the extend of this quantum effect, the visibility of the resulting interference curve, is sensitive to the phases γ j,k of the interferometric network. Only the special case of a 2 × 2 device is phase-insensitive, owed to the unitary scattering submatrix. In general phase-sensitive probe states are also transition-amplitude sensitive and are therefore sufficient to generate all primary data needed to reconstruct the unitary description of a linear optical network. Remarkably, the visibilities obtained via two-photon interferences are insensitive to input and output loss. In contrast, sensing transition-amplitudes with two-mode coherent states generates the same problems as measuring transition amplitudes directly. Experimentally, non-classical interference visibility measurements are ideally implemented using a pure, separable bi-photon state, where each photon is injected into its own interferometer mode. Subsequently the distinguishability of the two photons is scanned, e.g. by altering the relative temporal delay ∆τ between the photons. Given that coupling to waveguides and propagation in waveguides is not lossless and that detection efficiency of e.g. avalanche photo diodes is limited, measurement times exceed those of coherent probe-states. Imperfections in the probe-state generation are the major contributes to systematic errors and affect the quality of the measured visibility. Using an involved modelling contributions from spectral mismatch and spectral correlations, background noise and drift in the coupling can be taken into account. Thus the accuracy of the extracted visibilities can be increased and errors minimized to ≈ 10 −2 . Complete sampling of all accessible visibilities, N V is = m 2 2 for a m × m network has several algorithmic and statistic advantages and can be reduced to N V is red = m 2 measurement runs if a sufficient number of detectors is available. In many quantum optical experiments generating the data via the non-classical interference of two photons has the additional benefit that the apparatus required for characterization of the network is a subset of the whole experimental apparatus and the procedure works 'in-situ'.  single-photon Fock-states injected into one mode of the network allow to measure the transmission and reflection intensity I1(τ1,1) and I2(τ1,2) (shown in red). Repeating the measurement via a second input port (shown in blue) allows to derive a loss-insensitive splitting parameter. Probing of non-global phases relies on interferometry which allows to extract such phases from intensity measurements. c) A coherent state is distributed among two input modes and the relative phase ϕ is e.g. linearly modulated with frequency ω. Here the phase γ1,2 of the interferometer manifests as the relative phase of the recorded intensity pattern. d) In the case of 2 × 2 scattering submatrices the Hong-Ou-Mandel Dip can be utilized to sense the phase γ1,2 via the visibility V is(γ1,2) of the two-photon interference curve, as these 2 × 2 scattering submatrices are in general non-unitary. In contrast, monolithic 2 × 2 blocks, i.e. a beam splitter, are unitary and hence the visibility is only affected by the splitting ratio.

APPENDIX C: SIZE OF THE PRIMARY DATA SET AND RESOURCE COST
Through technological progress the number of fully coupled modes supported by integrated circuits is growing and consequently, so is the size of the primary data sets required to characterize their unitary transformations. Thus a manageable size of these data sets is becoming an important criterion for evaluating different reconstruction approaches. Here, fully coupled interferometers represent the most general case. The lower bound of required data points is quantified by the number n min = 2 m 2 of spherical coordinates that parametrize such a m × m network, with the spherical coordinates corresponding to the beam splitting ratios and phase shifts of Reck et. al [38]. Note that both, 'Bristol' and 'Brisbane' aim to reconstruct the 2m 2 matrix entries directly, hence a set of n min data points is insufficient. The approaches 'Bristol' and 'Brisbane' utilize sufficient data sets that are close to this lower bound and consist of m 2 transition amplitudes and (m − 1) 2 data points to recover the non-trivial phases. In the case of 'Bristol' additional (m − 1) 2 − 1 two-photon visibilities are required to determine the sign of the phases. We chose the upper bound of primary data to be the over-complete set of all two-photon interference visibilities, n full = m 2 2 . This set of data can be efficiently recorded given todays bright single photon sources [51] but can be in principle expanded to even higher order correlation functions. Likewise the m 2 transition amplitudes represent a non-redundant set of data to expand n f ull . Since the transition amplitudes are loss afflicted they are not used in the global optimization routine of 'Vienna'. Only in the case that 'Vienna' is applied to black-box networks the m 2 transition amplitudes and (m−1) 2 relative phases are needed to extract the starting parameters for the global optimization. Furthermore this global optimization allows for an adaptive strategy; best reconstruction accuracy is achieved using the full over-complete set of data. Alternatively, a reduced set of data, the set of all possible two-photon interference visibilities that can be generated when one photon is always inserted into input port one and the second photon into input port 2, . . . , m, can be used. This results in a reduced set of n V iennamin = (m − 1) m 2 visibilities which always suffices to reconstruct m × m networks with m ≥ 3 modes. In the following we will refer to the reconstruction approach 'Vienna' utilizing a reduced set of primary data as 'Vienna reduced'. A second number, the required measurement runs to generate the primary data set, can be regarded as the experimentally more relevant parameter. We list this parameter for the reconstruction approaches compared here in the limit that every output mode is coupled to an individual detector in table I. Now, all output events for a given input combination can be recorded in parallel, thus the number of required measurements corresponds to the required input combinations that need to be consecutively aligned in a laboratory. For instance, the m 2 transition amplitude data can be acquired in m measurement runs and the m  Table I. Size of the primary data set and minimal number of measurements for different reconstruction approaches. The three compared unitary reconstruction approaches differ significantly in the minimal and maximal set of primary data available for the reconstruction algorithms. For 'Brisbane' the minimal and maximal set of data are identical. Whereas a larger primary data set is more costly to generate experimentally this expense is justified if the data can be used to increase the accuracy of the reconstructed description. The required number of measurement runs is given in the limit that each output mode is covered by its own detector and can be regarded as the experimentally more relevant quantity. Here 'Vienna' is referring to a reconstruction of a structure with known layout and starting values, while 'Vienna black-box ' refers to a black-box network. The minimal primary data set required for the approach 'Bristol' presents a special case. Here the amount of data is constituted by the m 2 transition amplitudes and the (m − 1) 2 and (m − 1) 2 − 1 two-photon visibilities to recover the absolute values and signs of the phases, respectively. In the case of sufficient detectors the transition amplitudes can be sensed in m measurement runs and the two-photon visibilities to recover the absolute value of the phases in m − 1 measurement runs.
To fix the sign of the phases additional m − 2 measurement runs are necessary.

APPENDIX D: THE INFLUENCE OF LOSS
Loss can be a major factor in experiments using integrated optical circuits. In the case of direct laser-written networks propagation loss of −0.3 dB cm and coupling loss of −2dB are typical values [22,52]. If these losses are mode-independent, however, they just represent a global loss term that commutes with the optical transformation of a LON. In contrast, mode-dependent losses cause deviations in a targeted optical transformation. Characterizing mode-dependent losses thus becomes important when reconstructing the optical transformation of a LON. Here, we distinguish between the case of mode-dependent loss at the input and output ports and the case of mode-dependent propagation loss. In the first case, loss can always be separated from the transformation of a LON and be described by loss matrices containing virtual beam splitters. For a m × m LON this translates to m additional parameters modelling input loss and m additional parameters modelling output loss, the α i and α o of figure 5, respectively. The unitary matrix of the LON expanded by the 2m loss modes now reads as with L denoting the 3m × 3m loss matrices. Still, only the original m input and output modes are experimentally accessible. Note that the data set for relative phases sensed via coherent states, θ, and the the data set composed of two-photon interference visibilities, V, are insensitive to these losses and hence directly reveal information about U m×m . Whereas θ just contains information on the non-trivial phases of the interferometer but not on the transition amplitudes, the set of two-photon interference visibilities contains information on both. This is a unique feature of the latter data set and owed to the quantum nature of the interference. In comparison, a direct measurement of the transition amplitudes τ is always afflicted by mode-dependent input and output loss. As a consequence, the latter set of data only reveals information on a m × m submatrix of U 3m×3m . In our notation this submatrix corresponds to the upper left m × m submatrix of U 3m×3m , that is spanned by the m accessible input and output ports. In general, the size of the transition amplitude data set τ is insufficient to reconstruct all the 2m loss terms in addition to the m 2 real entries directly. Therefore, the strategy used in 'Brisbane' only works if m loss parameters, either all α i or all α o , can be neglected (see also body text). Alternatively, the transition amplitude data can be subjected to a reconstruction algorithm in which the loss terms drop out, as done in 'Bristol', however this necessitates unitary constraints. Although the strategy of 'Bristol' seems to be of advantage, figure 2 gives a indication that the algorithm reacts fragile to measurement error on the primary data. In comparison 'Brisbane' achieves higher reconstruction fidelities. This may change in the presence of mode-dependent input and output loss. It is an open question where the threshold of measurement error on the sensed data opposed to the level of of input and output loss lies, that would favour one over the other algorithm. Due to the processing of just two-photon interference visibilities, 'Vienna' is insensitive to input and output losses. Hence, U m×m is directly reconstructed and the α i and α o can be obtained in a separate step by solving a system of linear equations using loss afflicted data, e.g. the set of transition amplitudes τ . In the case of black-box networks, however, some dependence is carried over if initial starting parameters, p, are obtained via, e.g. 'Brisbane'. In contrast to input and output loss, mode-dependent propagation loss cannot be separated from the unitary description of a LON, as it does not commute with the optical elements that constitute the network's fundamental transformation. Instead, the unitary description needs to be expanded by additional in-circuit loss modes, subsequently labelled l. This is illustrated in figure 5b) with m the number of accessible network-modes, r = 3m + l the total number of modes and L denoting the r × r input and output loss matrices. All data sets sensed for reconstruction purposes, τ , θ , V reveal information about an in general non-unitary submatrix of U r×r . In our notation, this submatrix corresponds to the upper left m × m submatrix of U r×r . Now, the above mentioned strategies to model loss fail. The loss terms cannot be assessed directly and error is introduced by applying unitary constrains to the non-unitary m × m submatrix. We numerically evaluate the exemplary case of m = 4 LONs to investigate how severe the reconstruction performance of the various approaches is offset by mode-dependent in-circuit loss. The general layout of these networks is shown in figure 5b). Here, the input and output losses are kept zero to ensure that they do not influence the results. Thus the networks are just expanded by l = 8 in-circuit loss modes. The approaches 'Brisbane' and 'Bristol' are constructed to obtain 4 × 4 descriptions which prevents the use of the fidelity measure defined in equation 1. Hence we use an alternative measure which is experimentally motivated and constructed as the mean deviation between a point of primary data and its prediction obtained via one of the reconstructed descriptions. Q t quantifies the mean deviation for the normalized transition-amplitude data and Q vis the mean deviation for the two-photon interference visibilities (see definition below). In the limit of Q t = Q vis = 0, perfect reconstruction is achieved, a result only expected if in-circuit loss is either zero or if it can be fully recovered by an approach. One option to achieve the latter is a reconstruction of the complete (4 + 8) × (4 + 8) unitary description, which we investigate for the approach 'Vienna'. The required starting parameters for the λ i and φ i used to initialize to optimization routine are extracted via the approach 'Brisbane'. All β i are initialized at zero. The data presented in figure 6 is sampled for different levels of loss by drawing the transmittances of the loss beam splitters, β 1 , . . . , β 8 , randomly from a uniform distribution [cos( ), 1]. In the worst case of = 0.1, the maximum loss per beam splitter thus is sin 2 (0.1) ≈ 1%. To sample the unitary space representative, j = 500 different 12 × 12 starting matrices are generated for each loss-interval. The general perturbance on the primary data was set to σ = 1%. All reconstructed descriptions are computed in the same way as in section III via a Monte Carlo method with a sampling size of i = 100 (see also Appendix A). The resulting frequency histograms for Q t , Q vis and the fidelity histograms in the case of 'Vienna' were fitted with Burr type XII distributions [53], as these show good overlap with the numerical data. The data points and error bars contained within figure 6 are given as the most probable value of the distributions and the distance between the most probable value and those values to the left and right where the maximum probability decreases to 1 e , respectively. Already for the small levels of mode-dependent propagation loss considered here, all three approaches struggle to reconstruct precise unitary descriptions. This result is to be expected in the case of 'Brisbane' and 'Bristol' as the sets of data used for the reconstruction procedures originate from a non-unitary submatrix. Hence only closest unitary descriptions are reconstructed, which in turn cannot fully explain the original data sets. The results obtained in the case of 'Vienna' need to be interpreted in a different way. Whereas the global optimization of just two-photon interference visibilities converges, as can be seen by the values for Q vis in figure 6b), the values obtained for Q t show a scaling similar to the other two approaches. This indicates that the original optical element parameters for the beam splitters, phase shifters and loss beam splitters cannot be recovered with high accuracy. As a result the fidelities of the reconstructed 12 × 12 unitary matrices decrease rapidly with growing . It is an open question whether a larger set of primary data including relative phase shiftsθ would yield improved reconstruction results. In light of the above results mode-dependent propagation loss still presents a fundamental problem when reconstructing unitary descriptions of LONs.
Quality measure for the case of mode-dependent propagation loss The quality measure Q vis introduced in above is constructed as the difference between the full set of sensed two-photon interference visibilities and the set of predicted visibilities, obtained via a reconstructed unitary matrix. It is normalized by the maximum amount of two-photon interference visibilities, m 2 2 , that can be obtained.
Similarly, the measure Q t quantifies the difference between the set of measured transition-amplitudes and the predicted ones via a reconstructed unitary matrix. It is normalized by the maximum amount of transition-amplitude data, m 2 , that can be obtained.   Figure 6. The influence of mode-dependent propagation loss on the reconstruction performance for 4 × 4 LONs and the different approaches 'Brisbane', 'Bristol', and 'Vienna'. The data for 'Brisbane' and 'Bristol' is offset from the data in the case of 'Vienna' for visualization purposes only. Here the general perturbance on the primary data, σ, was chosen to be 1% and input and output loss to be zero. The transmittances, β1, . . . , β8, of the eight beam splitters modelling the mode-dependent loss were drawn uniformly from the interval [cos( ), 1]. Several intervals were sampled in discrete steps ranging from zero loss to sin 2 ( ) = 1% loss and 500 random matrices were reconstructed for each interval. The histograms were fitted with Burr type XII distributions and the error bars are given as the distance between the most probable value and those values to the left and right where the maximum probability decreases to 1 e . a) Qt quantifies the mean deviation of the normalized transition amplitudes between the initial values and the ones obtained from the reconstructed descriptions. b) Analogously Qvis quantifies the mean deviation of the two-photon interference visibilities. Only data from the experimentally accessible 4 × 4 submatrices is considered. c) Reconstruction fidelities for the U (4+8)×(4+8) unitary matrices reconstructed via 'Vienna'.
In the case of Q t , we normalize the transition-amplitude data for each of the m measured inputs over all outputs to m unit vectors, to allow for comparison between the loss sensitive and insensitive approaches. This normalized transition amplitude data is labelled τ * and defined as