Referenceless characterisation of complex media using physics-informed neural networks

In this work, we present a method to characterise the transmission matrices of complex scattering media using a physics-informed, multi-plane neural network (MPNN) without the requirement of a known optical reference field. We use this method to accurately measure the transmission matrix of a commercial multi-mode fiber without the problems of output-phase ambiguity and dark spots, leading to up to 58% improvement in focusing efficiency compared with phase-stepping holography. We demonstrate that our method is significantly more noise-robust than phase-stepping holography and show how it can be generalised to characterise a cascade of transmission matrices, allowing one to control the propagation of light between independent scattering media. This work presents an essential tool for accurate light control through complex media, with applications ranging from classical optical networks and biomedical imaging to quantum information processing.


I. INTRODUCTION
A scattering matrix provides complete knowledge of the linear optical response of a material. This not only gives us a better understanding of light-matter interactions, but also allows us to harness them in a diverse range of photonic technologies [1-4]. For example, knowledge of the scattering matrix can be used to control light propagation through an opaque scattering medium [5,6], serves as an alternative way to construct optical devices [4], and has applications spanning imaging through scattering tissue [7,8], mode-multiplexing through a multimode fiber [9], optical neural networks [10-13], and the transport and manipulation of quantum states of light [14-18]. However, as the complexity of an optical system of interest grows, the efficient and accurate measurement of its scattering matrix can be challenging.
Over the last few decades, techniques for measuring a scattering matrix, both its transmission (TM) and reflection (RM) components, have been advanced and tailored to particular scenarios with different conditions and prior assumptions [6,19-31]. Conventionally, the measurement of a TM is performed under the assumption that a medium preserves the coherence properties of the input light, i.e., the scattering process (channel) is pure. In this case, the measurement is usually performed by sequentially sending different pure input states in a given basis to probe a medium/system of interest and detecting the corresponding output fields by means of off-axis holography using an external reference [32-35]. This technique requires an interferometer that is insensitive to environmental disturbances, particularly for a long optical path length.
To mitigate the problem of stability, alternative methods such as phase-stepping holography with a co-propagating internal reference field have been developed [6]. Nevertheless, the use of a common-path reference field for an accurate TM measurement poses additional challenges, since the internal reference needs to interfere with every optical mode within the complex medium with sufficient interference visibility for an optimal TM measurement. The internal reference field, however, usually turns into a speckle pattern due to scattering, with a large variance in amplitude and consequently in interference visibility, leading to a drawback known as the "dark-spot problem" [34,36,37]. An alternative way to characterise complex media without using a reference field has been achieved through optimisation algorithms, which search for complex vectors using intensity-only measurements under the assumption of a pure process matrix. Various phase retrieval techniques have been developed, such as ones using Bayesian approaches [38], generalized approximate message passing [39], alternating projection methods such as the Gerchberg-Saxton algorithm [40,41], and convex relaxations [42-44].
A caveat of all these techniques that do not involve an external reference is that they do not completely characterise the transmission matrix. This is because the intensity-only measurements used in these techniques cause phase ambiguity at the output, leaving the relative phases between output modes undefined [45]. While a TM obtained in this manner is sufficient for the majority of imaging applications through complex media [36,46], complete coherent information about the TM is necessary for many applications such as programmable photonic circuits [47] and quantum information processing, where complex media are used to transport [14,15] and manipulate quantum states of light [16-18]. An extension to phase-stepping holography allows us to characterise this output phase ambiguity by interfering different optical modes after the scattering process [48]. Alternatively, phase diversity techniques can be applied to characterise this phase ambiguity by effectively measuring the output field intensity at different propagation distances in free space [29,49,50]. However, these methods are still subject to the dark-spot problem mentioned above or require full reconstruction of the optical field.
In recent years, artificial neural networks such as perceptrons and convolutional neural networks have been applied to tasks such as image reconstruction, classification, and transfer through a scattering medium [51-55], demonstrating their potential for learning the transfer function of a scattering medium. While light scattering through a complex medium is a linear process, its measurement in intensity is non-linear, which makes it a suitable system to model within the framework of artificial neural networks. Incorporating the physical model that describes this scattering process into a neural network architecture is thus a clear contender for solving optimisation problems [56]. Unlike general machine-learning models, physics-informed neural networks are based on physical systems, and thus do not require treating the algorithm as a black box, but rather as a simplification tool. Recent advances in optics-based machine learning have not only led to enhanced control through linear complex media [46,57,58], but also to useful applications in non-linear microscopy [59] and programming quantum gates [60].
In this work, we demonstrate the complete characterization of a coherent transmission matrix without the use of a reference field and the accompanying problem of dark spots by employing a physics-informed neural network, referred to as a multi-plane neural network (MPNN). We do so by performing a set of random measurements before and after the complex medium, while probing only intensity at the output, and subsequently feeding the data into a neural network designed to mimic the experimental apparatus. We demonstrate the improved accuracy of our technique by comparing the focusing efficiency achieved through the medium using the transmission matrices obtained with an MPNN to ones recovered with conventional phase-stepping (PS) holography. Furthermore, we investigate the number of measurements required to characterise the TM and show that while phase-stepping requires fewer measurements, the TM recovered with MPNN is much more accurate. We also show that our technique is significantly more noise-robust than phase-stepping holography, allowing the recovery of a high-fidelity TM in the presence of large amounts of noise. Finally, using a numerical simulation, we demonstrate the general usage of MPNNs for the TM measurement of a cascade of random scattering media.

II. MULTI-PLANE NEURAL NETWORK
We start by describing the model of a multi-layer optical network, where each complex medium is placed between reconfigurable phase-shifter planes, as illustrated in Fig. 1. The $k$-th layer of the optical system is composed of a reconfigurable phase-shifter plane represented by a vector $x_k$ and a complex medium with complex transmission matrix $T_k$. The intensity $y$ observed at the output detectors of an $n$-layered network for a uniform excitation at the input of the network is given by
$$y = \left| F \left( x_{n+1} \odot T_n \left( x_n \odot \cdots\, T_2 \left( x_2 \odot T_1 x_1 \right) \right) \right) \right|^2, \qquad (1)$$
where $\odot$ represents an element-wise product, $T_k$ is the transmission matrix of the $k$-th complex medium in the network, $x_{n+1}$ is the phase plane after the last medium, $F$ is a known complex matrix defined by the detection optics, and $|\cdot|^2$ denotes the element-wise squared modulus. Equation 1 also describes the neural network that models this optical process, where $\{x_i\}_i$ are the input vectors, $\{T_i\}_i$ are fully-connected "hidden" complex layers with a linear activation, and $F$ is a fully-connected known complex layer with a non-linear activation of $|\cdot|^2$ to simulate intensity measurements. Since the input layers are located at different planes in the network, we refer to this model as the multi-plane neural network (MPNN).
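As an illustration of the forward model described above, the following NumPy sketch evaluates the detector intensities for a cascade of media. It is a minimal sketch and not the TensorFlow implementation used in this work: the function name `mpnn_forward`, the random test matrices, and the convention of one phase plane before each medium plus one after the last (which reduces to $x_2 \odot T x_1$ for a single medium) are assumptions made for the example.

```python
import numpy as np


def mpnn_forward(phase_planes, tms, F):
    """Detector intensities for a cascade of complex media (illustrative).

    phase_planes : list of n + 1 complex vectors with unit-modulus entries
    tms          : list of n complex transmission matrices
    F            : known complex matrix of the detection optics
    """
    # Uniform excitation: the field after the first plane is the plane itself.
    field = phase_planes[0]
    for T, x_next in zip(tms, phase_planes[1:]):
        # Propagate through the medium, then apply the next phase plane
        # as an element-wise (Hadamard) product.
        field = x_next * (T @ field)
    # Intensity-only detection: element-wise squared modulus.
    return np.abs(F @ field) ** 2


rng = np.random.default_rng(0)
d = 8  # hypothetical number of modes, chosen small for the example


def random_phase_plane(size):
    """Phase-only vector with phases uniform in [0, 2*pi)."""
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size))


# Hypothetical medium: a random complex Gaussian matrix.
T = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2 * d)
# Hypothetical detection optics: a unitary discrete Fourier transform.
F = np.fft.fft(np.eye(d), norm="ortho")

x1, x2 = random_phase_plane(d), random_phase_plane(d)
y = mpnn_forward([x1, x2], [T], F)  # single-medium case
```

Passing longer lists, for example three matrices and four phase vectors, evaluates a deeper cascade with the same function.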
We train the described model on a measured randomized dataset using TensorFlow and Keras in Python. TensorFlow provides an open-source framework that makes models like these convenient to implement and extend [61]. Building models in this framework allowed us to use the complex-valued neural network layers previously developed in [46] and makes our work open to extensions using different optimisation techniques. Our model is optimised using adaptive moment estimation, also referred to as the Adam optimizer [62], with a mean-squared error (MSE) loss given by
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left| \bar{y}_i - y_i \right|^2, \qquad (2)$$
where $i$ represents the index of a point in the dataset of $N$ points, $\bar{y}$ is the predicted output from the model, and $y$ is the measured output. Once the loss function is decided, the gradients of the MSE with respect to each weight in the layer $T_k$ are calculated using the chain rule, as governed by the backpropagation algorithm [63]. To ensure that the learning is efficient, we set an appropriate learning rate before beginning the optimisation and reduce it during the optimisation if the loss plateaus. After training, retrieving the weights of the layer $T_k$ gives us the required transmission matrix of the $k$-th complex medium.
FIG. 2. (a) Light from a superluminescent diode is filtered by a 3.1 nm spectral filter centered at 810 nm, modulated by a random phase pattern displayed on a spatial light modulator (SLM1) and coupled into a multi-mode fiber (MMF), which is the complex medium of interest. The output speckle field from the MMF is projected onto SLM2, which displays another random phase pattern, and is then detected on a CMOS camera placed in the Fourier plane of SLM2. (b) The phase patterns on the SLM are constructed in a specific macro-pixel basis with varying pixel size based on the incident field distribution. Intensities of the output modes are recorded at a given set of points at the camera, which enclose an area corresponding to the MMF core. (c) The physics-informed neural network consists of two input layers encoding phase patterns on SLM1 and SLM2, and a single output layer encoding the intensity pattern of the output speckle in the given basis. The hidden layer T denotes the complex transformation between SLM1 and SLM2 in the macro-pixel bases, while F is a known layer corresponding to the 2f-lens system between SLM2 and the camera.
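To make the backpropagation step concrete, the sketch below computes the MSE loss of a single-medium model and its gradient with respect to the complex weights of the hidden layer, then takes one update step. It is a hedged illustration in plain NumPy rather than the TensorFlow/Keras code used in this work: the names `predict` and `mse_and_grad`, the small matrix sizes, and the simulated data are hypothetical, the loss is averaged over all samples and detector points, and a plain gradient-descent step stands in for the Adam update.

```python
import numpy as np


def predict(T, X1, X2, F):
    """Batched single-medium model; rows of X1 and X2 are phase patterns."""
    Z = (X2 * (X1 @ T.T)) @ F.T       # complex fields at the detectors
    return Z, np.abs(Z) ** 2          # fields and predicted intensities


def mse_and_grad(T, X1, X2, F, Y):
    """MSE loss and its Wirtinger gradient dL/d(conj T) via the chain rule."""
    Z, Y_pred = predict(T, X1, X2, F)
    diff = Y_pred - Y
    loss = np.mean(diff ** 2)
    gZ = 2.0 * diff * Z / diff.size   # back through the |.|^2 activation
    gV = gZ @ F.conj()                # back through the known layer F
    gU = X2.conj() * gV               # back through the second phase plane
    gT = gU.T @ X1.conj()             # onto the weights of the hidden layer T
    return loss, gT


rng = np.random.default_rng(4)
n_in, n_out, n_det, n_samples = 5, 6, 4, 50   # hypothetical sizes


def random_matrix(rows, cols):
    g = rng.normal(size=(rows, cols)) + 1j * rng.normal(size=(rows, cols))
    return g / np.sqrt(2 * cols)


T_true, T0 = random_matrix(n_out, n_in), random_matrix(n_out, n_in)
F = np.fft.fft(np.eye(n_out), norm="ortho")[:n_det]
X1 = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_samples, n_in)))
X2 = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_samples, n_out)))
Y = predict(T_true, X1, X2, F)[1]             # simulated "measured" intensities

loss0, g = mse_and_grad(T0, X1, X2, F, Y)
T1 = T0 - 1e-2 * g                            # one plain gradient step (not Adam)
loss1, _ = mse_and_grad(T1, X1, X2, F, Y)
```

For a real-valued loss of complex weights, the derivatives with respect to the real and imaginary parts of a weight are twice the real and imaginary parts of `gT`, which allows the analytic gradient to be checked against finite differences.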

III. EXPERIMENTAL METHOD
In this work, we use the formalism of MPNN to measure the coherent transmission matrix of a multi-mode fiber (MMF) using two programmable phase planes. The phase planes are implemented on spatial light modulators (SLMs) placed before and after the MMF to probe the optical fields propagating through the fiber using intensity-only measurements. A schematic of the setup is illustrated in Fig. 2a, where light from a superluminescent diode is filtered by a spectral filter centered at 810 nm (FWHM 3.1 nm) and coupled into a 2 m long graded-index MMF (Thorlabs-M116L02) sandwiched between two SLMs (Hamamatsu LCOS-X10468).
The requirement of two separate SLMs is not strict for these measurements, and the setup can be designed such that a single SLM is employed with a double reflection before and after the MMF. The MMF has a core size of 50 µm and supports approximately 200 modes for each polarization at a wavelength of 810 nm. The telescopes placed between each component in the setup are designed such that the incident beam covers a large area on each SLM.
In this particular experiment, we only control a single polarization channel of the MMF. However, the techniques discussed here can be equivalently applied to characterise both polarization channels simultaneously. We choose to work in the so-called macro-pixel basis, which consists of groups of SLM pixels chosen such that the intensity per macro-pixel is approximately equal (Fig. 2b). SLM1 is used to prepare a set of input modes in the macro-pixel basis, and SLM2 in combination with a CMOS camera (XIMEA-xiC USB3.1) allows us to perform measurements on the output modes of the fiber. $T$ denotes the optical transmission matrix between SLM1 and SLM2 in the macro-pixel basis. The field after SLM2 is given by the element-wise product $x_2 \odot T x_1$, where $x_{1(2)}$ is a vector corresponding to the hologram implemented on SLM1(2). Finally, this field is mapped onto the camera placed in the Fourier plane of SLM2 by the transfer matrix $F$. The resultant measurement intensity $y$ on the camera is given by
$$y = \left| F \left( x_2 \odot T x_1 \right) \right|^2, \qquad (3)$$
which describes a single-layered MPNN model. A set of random measurements is performed on the setup to generate a dataset for this model. A number of holograms are generated with phases randomly varied following a uniform distribution. These holograms are displayed on each SLM and the corresponding intensities measured by the CMOS camera are recorded. We sample the intensity of the field at the camera at positions separated by a distance corresponding to the size of a diffraction-limited spot (as shown in Fig. 2b), calculated using the effective focal length of the lens system (L$_{9-11}$) between SLM2 and the camera.
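The random-measurement procedure described above can be mimicked numerically as follows. This is a small simulation sketch under stated assumptions, not the acquisition code used in the experiment: `make_random_dataset` is a hypothetical helper, the transmission matrix is a random Gaussian stand-in for the fiber, the detection matrix is a truncated discrete Fourier transform, and the dimensions are far smaller than the 800, 832, and 367 used in the experiment.

```python
import numpy as np


def make_random_dataset(T, F, n_samples, rng):
    """Simulate intensity-only measurements for random phase holograms."""
    n_out, n_in = T.shape
    # Phases drawn uniformly from [0, 2*pi) for both phase planes.
    x1 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, (n_samples, n_in)))
    x2 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, (n_samples, n_out)))
    # Row-wise evaluation of F (x2 * (T x1)) for every sample.
    fields = (x2 * (x1 @ T.T)) @ F.T
    return x1, x2, np.abs(fields) ** 2


rng = np.random.default_rng(3)
n_in, n_out, n_det = 8, 10, 5   # hypothetical sizes for the example
T = (rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))) / np.sqrt(2 * n_in)
F = np.fft.fft(np.eye(n_out), norm="ortho")[:n_det]   # keep n_det sampling points
x1, x2, y = make_random_dataset(T, F, n_samples=100, rng=rng)
```

Each row of `x1` and `x2` plays the role of one pair of holograms, and the matching row of `y` is the intensity vector that would be recorded at the sampled camera positions.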
While we expect the TM to have a dimension of approximately 200 × 200 based on fiber specifications, we oversample our measurements at the SLM planes to capture any optical misalignment and aberrations in the experiment [58]. Thus, we use 800 macro-pixels at the input phase plane (SLM1) and 832 macro-pixels at the output phase plane (SLM2) to perform the measurements. In principle, any modal basis can be used to perform these measurements; however, we choose the macro-pixel basis because it provides accurate and low-loss phase-only modulation on the SLM. Since the intensity sampling at the camera plane is limited by the resolution of the optical system, we only use 367 sampling points here. We train our model containing an 800 × 832-dimensional TM. The optimization is carried out in multiple batches with a batch size of 500 samples, and is accelerated using a GPU (NVIDIA GeForce RTX 3060). We observe good convergence in about 20 epochs of training, as seen in the training and test metrics plotted in Fig. 3(a-b). Retrieving the weights from the hidden layer of the model recovers the measured TM in the macro-pixel basis, as visualised in Fig. 3c, and can be reproduced using the supplemental codes and dataset (Ref. [64]). Transforming this TM to the Laguerre-Gauss modal basis with 20 mode groups reveals a block-diagonal structure with crosstalk as shown in Fig. 3d, which is a typical structure expected for a graded-index multi-mode fiber [48,65].

A. Accuracy of the measured transmission matrix
To verify the accuracy of the measured transmission matrix, we perform an optical phase conjugation (OPC) experiment to focus light from a given input mode into a particular output mode after scattering through the fiber, by displaying a phase-only solution on the first plane (SLM1), on the second plane (SLM2), or on both planes simultaneously. The focusing efficiency obtained by controlling light using only the first plane or the second plane allows us to assess the quality of individual rows and columns of the transmission matrix, respectively. On the other hand, manipulating light using both planes allows us to assess the overall quality of the measured transmission matrix. The focusing efficiency is defined as the ratio of the intensity measured in a diffraction-limited spot at the camera to that measured in an output region that is 1.75 times the area corresponding to the output facet of the fiber. This larger region captures any light that is diffracted outside the output facet due to phase modulation at SLM2.
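A numerical analogue of first-plane OPC is sketched below. It assumes a random unitary matrix in place of the fiber, a flat phase on the second plane, and a simplified focusing efficiency defined as the target intensity divided by the total detected intensity (rather than the 1.75-times-facet region used in the experiment); all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64  # hypothetical number of modes

# Hypothetical medium: random unitary from the QR decomposition of a
# complex Gaussian matrix; detection optics: unitary discrete Fourier transform.
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
T, _ = np.linalg.qr(G)
F = np.fft.fft(np.eye(d), norm="ortho")


def focusing_efficiency(x1, x2, target):
    """Fraction of the detected intensity that lands on the target output."""
    y = np.abs(F @ (x2 * (T @ x1))) ** 2
    return y[target] / y.sum()


target = 10
x2_flat = np.ones(d, dtype=complex)          # second plane left flat

# Combined operator from the first plane to the detectors: F diag(x2) T.
A = F @ (x2_flat[:, None] * T)

# Phase-only conjugation: cancel the phase of every contribution to the target.
x1_opc = np.exp(-1j * np.angle(A[target]))
x1_rand = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, d))

eff_opc = focusing_efficiency(x1_opc, x2_flat, target)
eff_rand = focusing_efficiency(x1_rand, x2_flat, target)
```

With the conjugated pattern, all contributions arrive at the target in phase, so the target field equals the sum of the moduli of the corresponding row of `A`, whereas a random pattern spreads the light over all outputs.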
We measure the TM of a multi-mode fiber using the MPNN technique and compare it with one measured using the phase-stepping (PS) technique. It is important to note that the PS technique can be used to measure the full coherent TM without output-phase ambiguity via a two-step process as follows. In the first step, we measure the joint transfer matrix of the fiber and 2f-lens system, which is ambiguous up to the reference field, i.e. $D\,F\,T$, where $D$ is an arbitrary diagonal matrix owing to the unknown reference field. We do so by displaying the superposition of each input mode with the chosen reference mode at SLM1 and varying their relative phase in multiple steps, while measuring the intensity at each output spot. The output fields for a particular input mode are then reconstructed using the Fourier-transform reconstruction algorithm [45,66]. In order to maximize the intensity per mode, we choose the input basis to be a discrete Fourier transform of the macro-pixel basis, with the 50th mode chosen as the reference. The second step involves the reconstruction of the input reference field $D$. To do so, we send the input reference mode using SLM1 and perform the phase-stepping technique using SLM2. We prepare superpositions of each output mode with the chosen output reference mode on SLM2 and vary their relative phase in multiple steps. The measured intensity at the camera thus allows us to reconstruct the matrix $D$ corresponding to the input reference field in a similar manner to the first step above. It should be noted that the knowledge of $D$ is unnecessary for many applications such as imaging through complex media, as well as for the aforementioned OPC experiment using only the first plane. All measured data with PS is processed using the Fourier-transform reconstruction algorithm [67,68].
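The first step of the PS procedure can be illustrated with a short noiseless simulation. The sketch below is a simplified stand-in for the Fourier-transform reconstruction algorithm cited above: it extracts the first Fourier component of the intensity over equally spaced phase steps. The output fields `a_ref` and `a_mode` are hypothetical random vectors representing the reference mode and the probed input mode at the detectors.

```python
import numpy as np

rng = np.random.default_rng(2)
n_out, n_phi = 16, 4   # hypothetical number of output spots and phase steps

# Hypothetical output fields produced by the reference mode and the probed mode.
a_ref = rng.normal(size=n_out) + 1j * rng.normal(size=n_out)
a_mode = rng.normal(size=n_out) + 1j * rng.normal(size=n_out)

# Equally spaced relative phases between the probed mode and the reference.
phases = 2.0 * np.pi * np.arange(n_phi) / n_phi

# Intensity recorded at every output spot for every phase step.
intensity = np.abs(a_ref[None, :] + np.exp(1j * phases)[:, None] * a_mode[None, :]) ** 2

# First Fourier component over the phase steps (valid for n_phi >= 3).
estimate = (np.exp(-1j * phases)[:, None] * intensity).mean(axis=0)
```

The estimate equals `conj(a_ref) * a_mode`, that is, the probed field multiplied element-wise by the conjugate reference field. This is the diagonal ambiguity $D$ discussed above, and outputs where the reference amplitude is close to zero carry almost no signal, which is the origin of the dark-spot problem once noise is present.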
In the case of performing OPC with only the first plane (SLM1), the focusing efficiency achieved at different points across the output facet of the fiber using a TM obtained via the PS and MPNN techniques is shown in Fig. 4a. Using the PS technique, we observe a reduction of the focusing efficiency at several output points due to the dark-spot problem [34]. This is due to the nature of speckle, which results in a high probability of obtaining very low output intensities of the internal reference mode after scattering through the fiber. This leads to low interference visibility in the PS technique for specific output modes, which consequently results in inaccuracy in the reconstructed transmission matrix at these outputs. This problem is solved by measuring the transmission matrix with the MPNN technique, as this does not involve a static reference mode but instead uses many random inputs to probe the scattering process. The more uniform focusing efficiency achieved across the output facet of the fiber with the MPNN technique is clearly illustrated in Fig. 4a. A histogram of focusing efficiencies achieved with these two methods (Fig. 4b) shows that low focusing efficiencies are only observed with the PS technique, while the MPNN technique achieves a significantly higher maximum focusing efficiency.
While improved control with the first phase plane using various optimisation techniques has been well studied in many previous works, one of the chief merits of our approach lies in the simultaneous measurement of relative phases between the rows of the transmission matrix, i.e. the coherence between output modes. We assess the accuracy of the reconstructed relative phase between output modes by performing an OPC experiment to focus light using only the second phase plane (SLM2). Log-scale images of a focused spot at the centre of the output facet of the fiber using the two techniques are compared in Fig. 5a-b. We observe the suppression of unwanted speckle background when using a TM obtained with the MPNN technique as compared to the one acquired using the PS technique. Measuring the focusing efficiencies for different input modes makes the overall enhancement obtained with the MPNN technique as compared to the PS technique evident (Fig. 5c). The average focusing efficiency using the second plane increases from 26.5 ± 2.3% using PS to 40.8 ± 1.7% using MPNN. The underlying reason for this improvement is that learning the TM with the MPNN technique does not require a static internal reference mode, whereas the use of the fixed internal reference mode in the PS technique results in errors at particular outputs due to the dark-spot problem.
As discussed above, a complete characterization of the transmission matrix, including the relative phases between its rows and columns, is essential for coherent control of optical fields propagating through a complex medium. To examine this coherent control, we perform an OPC experiment by focusing light at different output spots while simultaneously using both phase planes. The phase patterns for focusing are determined using an iterative wavefront-matching technique [69,70]. As seen in the log-scale images shown in Fig. 5d-e, light focused using both phase planes with a TM acquired using the MPNN method has significantly less speckle background than light focused with a TM acquired using the phase-stepping technique. Quantitatively, we are able to achieve an average focusing efficiency of 65.5 ± 2.5% (up to a maximum of 73.8%) with both planes using the MPNN technique (Fig. 5f). This is a substantial increase with respect to that achieved with the PS technique, where we observe an average focusing efficiency of 42.4 ± 3.1% (up to a maximum of 46.7%). This result also demonstrates the increase in focusing efficiency achievable with two-plane control as compared to individual phase planes, which makes it particularly suitable for applications such as programmable optical circuits [17,18].
The maximum achievable focusing efficiencies (ideal simulations) shown in Figs. 4 and 5 are numerically calculated by taking into account the presence of polarization coupling in the fiber. The fiber TM is represented by a truncated random unitary matrix, considering that only one polarization channel is measured and controlled. This results in the lowest achievable maximum focusing efficiency when only the first phase plane (SLM1) is used for control. When only the second phase plane (SLM2) is used, the maximum focusing efficiency is increased as compared to the first case. The third case entails both phase planes being used together to focus light through the MMF. As much more phase control is achievable using the two phase planes, one can now focus light with much higher efficiency as compared to the previous two cases.

B. Efficiency of learning with the MPNN technique
In this section, we study the number of measurements required to obtain an optimal TM using the MPNN technique as compared to the PS technique. First, we experimentally evaluate this by performing an OPC experiment with the fiber TM reconstructed with different dataset sizes. For the MPNN method, the size of the dataset ($\alpha$) is quantified by the total number of intensity-only measurements performed on the camera divided by the number of input modes that characterize the TM. For the PS method, this quantity is more nuanced, since we only control the number of phase steps ($n_\phi$) within the interval $[0, 2\pi]$ that are used per mode. As described in section III, the first part of the PS method requires $(n_{\mathrm{in}} - 1)\,n_\phi + 1$ measurements, where $n_{\mathrm{in}}$ is the number of input modes and the additional measurement corresponds to that of the reference itself. The second part requires $(n_{\mathrm{out}} - 1)\,n_\phi$ measurements, where $n_{\mathrm{out}}$ is the number of output modes measured on the camera. The total number of measurements per input mode is then given by $\alpha \approx [(n_{\mathrm{in}} + n_{\mathrm{out}})/n_{\mathrm{in}}]\,n_\phi$. In our experiment, $n_{\mathrm{in}} = 800$ and $n_{\mathrm{out}} = 367$, which gives $\alpha \approx 1.46\,n_\phi$ for the PS technique. In this manner, the parameter $\alpha$ corresponds to the total number of measurements performed per mode in both the MPNN and the PS technique.
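The measurement count for the PS technique can be checked with a few lines of arithmetic. The helper name `ps_measurements_per_mode` is hypothetical; the expressions are the two measurement counts stated above.

```python
def ps_measurements_per_mode(n_in, n_out, n_phi):
    """Exact number of PS measurements per input mode (alpha)."""
    step_one = (n_in - 1) * n_phi + 1   # input modes against the reference, plus the reference itself
    step_two = (n_out - 1) * n_phi      # output modes against the output reference
    return (step_one + step_two) / n_in


# Experimental dimensions: 800 input modes, 367 sampled output points.
alpha_experiment = ps_measurements_per_mode(800, 367, n_phi=4)

# Simulation dimensions used for the noise study: 800 inputs and 800 outputs.
alpha_simulation = ps_measurements_per_mode(800, 800, n_phi=4)
```

For four phase steps, the exact counts are 4661/800 ≈ 5.83 measurements per mode in the experiment, close to the approximation 1.46 × 4 = 5.84, and 6393/800 ≈ 7.99 in the simulation, close to 2 × 4 = 8.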
The focusing efficiency achieved with a TM obtained via the MPNN and PS techniques as a function of the number of measurements per mode ($\alpha$) is plotted in Fig. 6 for all three cases: focusing with the first, the second, and both planes. For the PS technique, the focusing efficiency is seen to converge to its maximum value at $\alpha \sim 6$-$8$, which is close to the minimum required number of phase steps ($n_\phi \sim 3$-$4$) [71]. However, it reaches a plateau after this and cannot be improved further owing to the presence of noise in the experiment [72-74]. In contrast, the focusing efficiency obtained via the MPNN technique is seen to converge at a higher number of measurements per mode ($\alpha = 20$). However, in all three OPC cases, the maximum efficiency achieved with MPNN is higher than that achieved with the PS technique: 46.5%, 43.6%, and 73.8% versus 45.0%, 30.2%, and 46.7% with the first, second, and both planes, respectively. In particular, the maximum efficiency is significantly higher when focusing with only the second or both phase planes, cases where complete coherent information of the TM plays a critical role. Thus, while the MPNN technique takes longer to learn a given TM, the reconstructed TM is more accurate, as quantified by the focusing efficiency achieved through it. One should note that the number of measurements can also be reduced by incorporating the underlying physical model of a multimode optical fiber [57,58].

C. Noise-robustness of the MPNN technique
From the previous sections, it is clear that one of the advantages of the MPNN technique is improved performance over the PS technique in the presence of noise. While our experiment studies one specific case, here we quantify this improvement by simulating the effects of different noise levels on both techniques. An 800 × 800-dimensional random unitary matrix is chosen as our ground-truth TM, and intensity measurements using the PS and MPNN techniques are simulated while varying the number of measurements per mode ($\alpha$). Noise in the measurement is modelled as additive white Gaussian noise on the readout intensity,
$$y \rightarrow y + \mathcal{N}(\mu, \sigma), \qquad (4)$$
where $\mathcal{N}(\mu, \sigma)$ is additive white Gaussian noise with mean $\mu = 0$ and standard deviation $\sigma$. Such a noise model is standard and includes the effects of multiple random processes such as thermal noise [75,76]. The signal-to-noise ratio (SNR) is the ratio of the average norm of the signal intensity to the noise standard deviation, i.e. SNR $= |y|/\sigma$ [77]. It should be noted that each simulated data point is normalised to unity before adding noise, which simply implies an SNR of $1/\sigma$ in our model. The fidelity between the recovered TM ($\tilde{T}$) and the ground-truth TM ($T$) is sensitive to the relative phases between rows and columns of the TM, and thus only reaches its maximum value of 1 when complete coherent information about the TM is present.
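The noise model can be sketched in a few lines. The function name `add_readout_noise` is hypothetical, and the sketch assumes that "normalised to unity" means each intensity vector is scaled to unit Euclidean norm, so that the noise standard deviation is simply 1/SNR.

```python
import numpy as np


def add_readout_noise(y, snr, rng):
    """Normalise each intensity vector and add zero-mean Gaussian noise.

    Assumes unit Euclidean norm per data point, so that sigma = 1 / SNR.
    """
    y = np.asarray(y, dtype=float)
    y_unit = y / np.linalg.norm(y, axis=-1, keepdims=True)
    sigma = 1.0 / snr
    # Noisy intensities may become negative, as for a real camera readout
    # after background subtraction.
    return y_unit + rng.normal(0.0, sigma, size=y.shape)


rng = np.random.default_rng(5)
y_clean = rng.exponential(size=(2000, 50))   # hypothetical speckle intensities
y_unit = y_clean / np.linalg.norm(y_clean, axis=-1, keepdims=True)

y_noisy = add_readout_noise(y_clean, snr=5.0, rng=rng)
y_noiseless = add_readout_noise(y_clean, snr=np.inf, rng=rng)
```

Setting `snr=np.inf` gives a standard deviation of zero and returns the normalised data unchanged, which corresponds to the noiseless case.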
The TM fidelity as a function of the number of measurements ($\alpha$) is plotted in Fig. 7 for different levels of noise. In these simulations, we choose $n_{\mathrm{in}} = n_{\mathrm{out}} = 800$, which implies that $\alpha \approx 2 n_\phi$ for the PS technique. In the noiseless case (SNR = ∞), the PS technique converges to perfect fidelity at $\alpha \approx 6$, while the MPNN technique requires $\alpha \approx 10$ to do the same. Note that in this case, the PS technique is able to reach perfect fidelity regardless of the dark-spot problem, as even the smallest interference signal provides complete information when no noise is present. As the SNR decreases, the maximum fidelity achievable via the PS technique drops rapidly, and perfect fidelity cannot be reached regardless of the number of measurements used. For example, even with a small amount of noise (SNR = 20), the PS technique can only recover a TM with fidelity less than 60%. In contrast, the MPNN technique is able to achieve very high fidelity in the presence of noise. As can be seen in Fig. 7b, we are able to recover a high-quality TM (fidelity of 80.87%) even when the SNR is as low as 0.8, with fidelity gradually increasing with the number of measurements per mode ($\alpha$). This highlights the significantly improved noise-robustness of the MPNN technique in comparison to the PS technique.
Even though we use the most noise-resilient algorithm to reconstruct the TM from PS data [66], the MPNN method significantly outperforms the PS technique. This is because the MPNN method uses a large part of the dataset together to minimize the loss function defined in Eq. 2. The addition of noise to the data merely adds local minima to the loss function across the optimization space, leaving the global minimum largely untouched. In contrast, PS relies on processing each data point individually over the $n_\phi$ phase steps as a sinusoid, where the addition of noise can severely impact the visibility and phase reconstruction.
These results may give the impression that PS is an unreliable technique. However, this is untrue, because here we have pushed the technique to its very limits. The SNR range where we perform our simulations is far below previously tested ranges with PS [74,78], demonstrating the far superior noise resilience of MPNN. Moreover, fidelity is a very unforgiving metric, as it requires both the relative phases and amplitudes of both rows and columns of the TM to be well reconstructed for it to be high. A poor-quality TM recovered using PS with an SNR of 5 and a fidelity of 17% can still control light with the first and second planes with focusing efficiencies of about 17% and 13%, respectively. Nonetheless, such a metric is critical in high-precision applications such as programmable quantum gates [18], where slight errors in knowledge of the TM can drastically affect the performance of the entire experiment.
In practice, our experimental results are still far from the ideal estimated by the numerical simulations (black dotted lines in Figs. 4 and 5). This deviation from the ideal might originate from a variety of imperfections in the experiment that are not explained by the simple model of noise studied here. These include the choice of basis, instability of the light source, phase instability of the SLM, temperature-dependent movement of optomechanics and the optical medium, and the linearity of the camera. Many of these issues can be addressed and improved in the experiment to achieve perfect focusing efficiency [79-81], and such improvements could be combined with the MPNN technique presented here.

D. Learning multiple transmission matrices with an MPNN
In this section, we demonstrate how the MPNN technique can be used to reconstruct a complex optical network characterised by a series of transmission matrices, as described in Eq. (1). Furthermore, we show how knowledge of each individual TM allows us to focus light at any intermediate plane within this network. As shown in Fig. 8a, we simulate a cascade of three 16 × 16-dimensional TMs ($T_i$) separated by programmable phase planes. These are followed by a known mode-mixer ($F$) that performs a 2D discrete Fourier transform. By randomly modulating the phase planes and performing intensity measurements after $F$, the MPNN technique is able to fully characterise this optical network. The recovered TM of each medium as well as the training loss are shown in Fig. 8b-c.
A multi-plane network such as this allows us not only to control light through the whole system, but also to control light at intermediate planes, as conceptualised in Fig. 8a. As an example, we simulate optical phase conjugation using the three recovered TMs to focus light at each intermediate plane in the system. Insets in Fig. 8b show the focused image obtained at each plane in the trained network using the preceding phase planes. The size of the dataset required to characterise this network with dimension 3 × 16 × 16 is $\alpha \approx 10^5$, as also shown in the supplemental codes (Ref. [64]). However, this is not the minimal size required to train this model and strongly depends on the training parameters. Further tuning and investigation can lead to a better understanding of how the minimum dataset size required to train this model varies with the number of planes and dimensions of the TM. Nonetheless, it should be noted that the model for such a neural network is very complex and, to our knowledge, MPNN is the only known method to date that can perform such a task.
A caveat of our technique is that the measured TMs of two consecutive complex media in the series can be ambiguous up to a diagonal matrix on either side. In a cascade of complex media T_i separated by planes P_i, taking a single layer at the k-th plane,

$$T_{k+1} P_k T_k = T_{k+1} D^{-1} D P_k T_k = \left(T_{k+1} D^{-1}\right) P_k \left(D T_k\right) = \tilde{T}_{k+1} P_k \tilde{T}_k,$$

where D is a diagonal matrix. Because the diagonal matrices D, P_k and D^{-1} commute, the MPNN technique measures the TM of an individual complex medium only up to such an ambiguity in the TMs of consecutive media, for example, the presence of equal but opposite diagonal phases. We anticipate that this ambiguity affects the training, since a few elements of D can acquire high amplitudes, leading to over-fitting; however, this problem can be tackled using suitable regularizers. Importantly, this ambiguity between T_k and \tilde{T}_k does not affect the description of the overall cascaded optical network.

Systems consisting of cascaded programmable phase planes separated by optical media are fast gaining popularity, with recent work demonstrating their use in a variety of applications such as spatial mode-sorting [70,82,83], projective measurements [84], unambiguous state discrimination [85], programmable quantum circuits [18,86,87], and optical neural networks [11,88]. In all these implementations, accurate knowledge of the optical system between phase planes is critical to their performance. While free-space propagation is relatively straightforward to model, aberrations arising from the devices and elements used can introduce significant design challenges. By enabling the characterisation of a cascade of independent TMs, the MPNN technique provides a way to more accurately design such multi-plane devices, with uses ranging from classical to quantum optical networks.
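The diagonal ambiguity between consecutive TMs discussed in this section can be checked numerically. The following is a minimal sketch, assuming random complex Gaussian matrices for two consecutive media and an arbitrary invertible diagonal matrix D; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 16


def complex_gaussian(shape, rng):
    """Matrix with i.i.d. complex Gaussian entries (assumed stand-in for a TM)."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


t_k = complex_gaussian((dim, dim), rng)    # medium before the phase plane
t_k1 = complex_gaussian((dim, dim), rng)   # medium after the phase plane

# Arbitrary invertible diagonal matrix D, stored as a vector of its entries.
d = np.exp(rng.normal(size=dim)) * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, dim))

t_k_tilde = d[:, None] * t_k     # D T_k
t_k1_tilde = t_k1 / d[None, :]   # T_{k+1} D^{-1}

# Any diagonal phase plane P_k commutes with D, so the cascade is unchanged.
p = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, dim))
original = t_k1 @ (p[:, None] * t_k)
gauged = t_k1_tilde @ (p[:, None] * t_k_tilde)

cascade_unchanged = np.allclose(original, gauged)
individual_tms_differ = not np.allclose(t_k, t_k_tilde)
```

Both flags evaluate to true: the individual matrices differ, yet the intensity-only data seen by the network are identical, which is why a regularizer is needed to keep the entries of D from growing during training.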

V. CONCLUSION
In this work, we have conceptualised and experimentally demonstrated a method to characterise a complex medium, or a network of complex media, using physics-informed neural networks, referred to as multi-plane neural networks (MPNN). We apply the proposed MPNN technique to measure the full coherent transmission matrix of a multi-mode fiber without the need for an external reference field. The key idea behind the measurement is to randomly modulate the phases of optical fields both before and after the fiber and measure the intensity-only outcomes to form a dataset. The trained model produces a transmission matrix capable of controlling light by manipulating fields not only before the multi-mode fiber, but also after it, which relies on the complete coherent information of the obtained transmission matrix without the problems of dark spots and output-phase ambiguity. We demonstrate accurate control of optical modes through a multi-mode fiber using the MPNN method, with a significant improvement over the phase-stepping technique in terms of focusing efficiency and noise-robustness. Finally, we show the capability of this technique to learn more complex systems, such as a cascade of transmission matrices interspersed with multiple phase planes, and discuss possible applications. Our technique will allow for accurate control of coherent light not only through complex media but also through complex optical networks, with applications ranging from optical communication systems to biomedical imaging.

Figure 1. Schematic of an n-layered optical network: a cascade of n complex media, denoted by their optical transmission matrices T_k, are separated by reconfigurable phase-shifter planes x_k, followed by detection optics F. This optical network can be represented as a multi-plane neural network (MPNN) with n + 1 input layers of x_k, n "hidden layers" for T_k, and a known layer for F with a |·|^2 activation.

Figure 2. (a) Experiment: Light from a superluminescent diode is filtered by a 3.1 nm spectral filter centered at 810 nm, modulated by a random phase pattern displayed on a spatial light modulator (SLM1) and coupled into a multi-mode fiber (MMF), which is the complex medium of interest. The output speckle field from the MMF is projected onto SLM2, which displays another random phase pattern, and is then detected on a CMOS camera placed in the Fourier plane of SLM2. (b) The phase patterns on the SLM are constructed in a specific macro-pixel basis with varying pixel size based on the incident field distribution. Intensities of the output modes are recorded at a given set of points at the camera, which enclose an area corresponding to the MMF core. (c) The physics-informed neural network consists of two input layers encoding phase patterns on SLM1 and SLM2, and a single output layer encoding the intensity pattern of the output speckle in the given basis. The hidden layer T denotes the complex transformation between SLM1 and SLM2 in the macro-pixel bases, while F is a known layer corresponding to the 2f-lens system between SLM2 and the camera.

Figure 3. Learning the coherent transmission matrix of a multi-mode fiber: Test and train (a) loss and (b) mean-absolute percentage error (MAPE) as learning proceeds in epochs. A visual plot of the learnt transmission matrix in the (c) macro-pixel and (d) Laguerre-Gauss (LG) basis, where modes are segregated into respective mode groups. In order to change the basis of the TM from (c) to (d), a transformation matrix $B_{\mathrm{Pix}\to\mathrm{LG}}$ is constructed that maps the set of macro-pixel modes to a set of LG modes. The TM in the LG basis is calculated as $T_{\mathrm{LG}} = B^{\dagger}_{\mathrm{Pix}\to\mathrm{LG}}\, T_{\mathrm{Pix}}\, B_{\mathrm{Pix}\to\mathrm{LG}}$.

Figure 4. A comparison of optical phase conjugation (OPC) performed with only the first phase plane: (a) Focusing efficiencies achieved at different points across the output facet of the multi-mode fiber using the phase-stepping (PS) and multi-plane neural network (MPNN) techniques. Output points with low focusing efficiency due to the inherent dark-spot problem can be seen in the PS method (see text for more details). In contrast, focusing efficiencies obtained with the MPNN method show more uniformity across the fiber facet. Log-scale images of light focused using the first SLM with a TM obtained by the (b) PS and (c) MPNN techniques. (d) A histogram comparing the PS and MPNN methods shows that the PS technique leads to some output points with very low focusing efficiencies, while those obtained with the MPNN method are more uniform and significantly higher. Note: Only the points within the central 80% of the core diameter are included in this histogram.

Figure 5. (a-c) A comparison of optical phase conjugation (OPC) performed with only the second phase plane. Log-scale images of light focused using only SLM2 with a TM obtained by the (a) phase-stepping and (b) multi-plane neural network techniques. (c) A histogram comparing second-plane focusing efficiencies of 50 random input modes achieved with the PS and MPNN techniques shows a marked improvement with the latter. (d-f) A comparison of OPC performed with both phase planes. Log-scale images of light focused using both SLMs with a TM obtained by the (d) PS and (e) MPNN techniques. (f) A histogram comparing focusing efficiencies for all output modes achieved with both planes shows a significant improvement with MPNN over the PS technique. The focusing efficiencies are corrected for SLM2 basis-dependent loss for both methods.

Figure 6. Number of measurements required for optimal transmission matrix measurement: Experimental focusing efficiency achieved with (a) the first phase plane, (b) the second phase plane, and (c) both planes simultaneously, using a TM reconstructed with the phase-stepping technique and the multi-plane neural network (MPNN), plotted as a function of the number of measurements per input mode (α). While phase-stepping converges to a maximum faster, the MPNN technique shows a much higher focusing efficiency.

Figure 7. Noise-robustness of MPNN: A simulated 800 × 800-dimensional TM is recovered using the (a) phase-stepping (PS) and (b) multi-plane neural network (MPNN) techniques for different additive Gaussian noise levels. The fidelity of the recovered TM is plotted as a function of the number of measurements per input mode (α). While the PS technique quickly degrades in the presence of noise, the MPNN technique is able to reach high fidelities with small increases in the number of measurements.

Figure 8. Learning multiple transmission matrices with an MPNN: (a) A cascade of multiple complex media interspersed with programmable phase planes can be characterised through the use of the MPNN technique. The recovered TMs corresponding to these media allow us to control the propagation of light at intermediate phase planes. (b) Numerical simulation showing three 16 × 16-dimensional TMs reconstructed with the MPNN technique, which are then used to focus light at each intermediate plane (insets). (c) Optimisation loss using the training and testing datasets for the three-plane MPNN.

Funding-
This work was made possible by financial support from the QuantERA ERA-NET Co-fund (FWF Project I3773-N36), the UK Engineering and Physical Sciences Research Council (EPSRC) (EP/P024114/1) and the European Research Council (ERC) Starting grant PIQUaNT (950402).