Light scattering control in transmission and reflection with neural networks

Scattering often limits the controlled delivery of light in applications such as biomedical imaging, optogenetics, optical trapping, and fiber-optic communication or imaging. Such scattering can be controlled by appropriately shaping the light wavefront entering the material. Here, we demonstrate a machine-learning approach for light control. Using pairs of binary intensity patterns and intensity measurements we train neural networks (NNs) to provide the wavefront corrections necessary to shape the beam after the scatterer. Additionally, we demonstrate that NNs can be used to find a functional relationship between transmitted and reflected speckle patterns. As a proof of the validity of this relationship, we demonstrate focusing and scanning of light in transmission through opaque media using reflected light. Our approach demonstrates the versatility of NNs for light shaping and for efficiently and flexibly correcting for scattering. In particular, the feasibility of transmission control based on reflected light opens up new opportunities for applications in optical imaging, sensing, and light delivery.

When light propagates through a linear nonhomogeneous and non-isotropic material, its wavefront becomes distorted due to aberrations and scattering, resulting in an apparently random interference pattern of granular speckles [1,2]. Such scattering conditions hamper the controlled delivery of light and the engineering of the point spread function (PSF), which is a basic requirement for many applications [3][4][5][6]. To counteract this effect, methods based on shaping the light wavefront entering the scattering material have been developed. Wavefront shaping is typically achieved by using spatial light modulators (SLMs) [7,8] which, with their millions of degrees of freedom (pixels), allow focusing through diffusers [9][10][11], multimode fibers [3,12,13], and biological tissue [14][15][16][17]. Different techniques have been developed to determine the appropriate wavefront corrections to be displayed on the SLM. The first demonstration of scattering control took advantage of iterative wavefront optimization [9,18,19], which approaches the targeted light distribution, typically a single or multiple focal spots, by updating the wavefront depending on the result after each optimization step [4,9,18,20,21]. These feedback-based algorithms calculate the wavefront correction separately for each focal position or shape and can be optimized to achieve very fast focusing times [11,22]. A second approach, also typically used to control a single focus, is digital optical phase conjugation, which uses interferometry to measure the scattered light field and reverses it with an SLM [16][23][24][25][26]. This technique has the advantage of achieving update rates approaching the millisecond range needed for imaging in dynamic biological tissue [27], but it requires that the scattering signal originates from a coherent source suitable for reversal. A third group of methods aims at describing and controlling the scattering process simultaneously across an entire field of view, which was first achieved with the help of a transmission matrix [10,28,29]. To obtain the transmission matrix, one needs to measure the light phase, which, similar to digital optical phase conjugation, requires technically more demanding interferometric approaches. To simplify such experiments, computational methods for estimating incompletely measured information have been implemented, which for example can infer the light phase from intensity measurements [30,31].
Another set of computational techniques that, thanks to the development of programming frameworks together with the computational power of GPUs, is increasingly being applied in imaging and microscopy relies on machine learning (ML) [32][33][34][35] and in particular on NNs [36][37][38][39]. The usefulness of these techniques [40][41][42][43][44][45][46][47][48][49] has been demonstrated in the context of light scattering for image analysis [37,44,45,47], where the goal lies in the classification of an object across a scattering layer, or image reconstruction based on a predefined data set [48]. In astronomy, NNs have been applied for the correction of weak scattering encountered when imaging through the atmosphere, for example for the control of multi-mirror telescopes [49]. For light control, genetic algorithms, a class of iterative optimization algorithms, have been used for optimizing focusing across scattering materials [40,41,43]. Single- and multi-focus single-shot control (after training) over a 5 × 5 pixel area has been achieved using support vector regression [46], but the reported small field of view, low signal-to-noise ratio, and long training times (97 min) are limiting for high-resolution PSF engineering.
While the methods outlined above allow focusing through scatterers in transmission, an additional set of challenges arises when the focal plane lies hidden behind or inside the scatterer, remote from direct optical access. For applications such as sensing, imaging, or communication this is the more relevant configuration [50]. For example, in biological imaging, fluorescent signals or guidestars can be used to monitor excitation intensity inside or behind a scatterer [2,15,19,21]. Particularly in the presence of strong scattering, however, these signals are often dim and generally have a limited photon budget [51]. Alternatively, back-scattered excitation light can provide feedback about the beam [52][53][54]. These signals need to be additionally filtered to remove out-of-focus light: using various combinations of temporal, frequency, or spatial gating [52][53][54][55][56][57][58], one aims to extract photons that are scattered little and therefore retain image information. Since such weakly scattered photons disappear exponentially with depth, they in turn limit imaging depth.
However, even under strongly scattering conditions, reflected (or backscattered) photons carry information about transmitted light [50,59,60]. Mutual information between speckle patterns generated in these two opposite scattering directions indicates that reflected light might potentially be used to control transmitted light [50,59,60]. This would require that a functionally explicit relationship between these two scattering signals could be found and that the available information would be sufficient for controlling one signal through the other. So far, reflected light has been used to maximize the energy sent into a sample [61,62], but without control over the resulting light distribution inside the sample, or it has required an embedded highly scattering target to achieve a localized light distribution [63]. Other schemes to take advantage of backscattered light have been suggested in theoretical work [64,65], but these concepts have so far not been implemented experimentally.
Here, we discuss how neural networks can be used to image through materials with different scattering characteristics such as glass diffusers, multi-mode fibers, or paper. First, we show that single-layer NNs (SLNNs) and multi-layer convolutional NNs (CNNs) can be trained to control the light distribution behind scattering materials with high accuracy. Second, we show that NNs can be used to find functional relationships between transmitted and reflected light, i.e., they can predict transmitted speckle patterns from reflected speckle patterns with sufficient accuracy for light control through opaque materials. Taking advantage of this relationship, we then show that NNs can be used for focusing in transmission using reflected light.

NN approach for scattering control
We here first outline the underlying approach of using NNs for light control through a scatterer, which is also sketched in Fig. 1(a). In an initial step, we generate a dataset consisting of pairs of binary illumination patterns displayed on the SLM and corresponding speckle patterns recorded with a CCD camera after transmission through the scatterer (64 × 64 macropixels for illumination patterns and 96 × 96 pixels for the CCD camera, see also Supplementary Fig. 1). These pairs of illumination and speckle patterns (typically on the order of 10000, but see below for training with a reduced number of patterns) are used to train the NNs as detailed in Methods, with the goal of inferring the relationship between the resulting scattered light distributions and the illumination patterns. We then feed the desired distribution into the trained NNs to predict the corresponding illumination pattern. This pattern is finally displayed on the SLM and the resulting light pattern is recorded with the camera. Each pattern, C(k), can be considered as a combination of plane waves with different wave vectors k. This distribution of plane waves is modified by the scattering material in a deterministic way through a function F and results in the speckle pattern S, i.e., S = F[C(k)]. Through training, the NN learns an approximation of the function F needed to generate any light distribution after the scatterer.
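The workflow above can be illustrated with a small numerical toy model. The sketch below is not the Keras implementation used in the experiments: it assumes a hypothetical scatterer described by a random complex matrix `T`, uses far fewer pixels than the experiment, and replaces network training with an ordinary least-squares fit on mean-centred data. All names (`speckle`, `pattern_for_focus`, `enhancement`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_train = 256, 64, 2000  # toy sizes, much smaller than 64x64 / 96x96

# Assumption: the scatterer acts as a random complex matrix on the input field.
T = rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))

def speckle(patterns):
    """Intensity behind the simulated scatterer for binary patterns (one per row)."""
    return np.abs(patterns @ T.T) ** 2

# Training set: pseudo-random binary patterns and the resulting speckle intensities.
C = rng.integers(0, 2, size=(n_train, n_in)).astype(float)
S = speckle(C)

# Linear map speckle -> pattern, fitted by least squares on mean-centred data.
W, *_ = np.linalg.lstsq(S - S.mean(axis=0), C - C.mean(axis=0), rcond=None)

def pattern_for_focus(pixel):
    """Binary pattern predicted for a target with a single bright pixel."""
    target = np.zeros(n_out)
    target[pixel] = 1.0
    prediction = target @ W
    # One threshold for the whole pattern: the mean of the prediction.
    return (prediction > prediction.mean()).astype(float)

def enhancement(pixel):
    """Focus intensity divided by the mean intensity of all other pixels."""
    intensity = speckle(pattern_for_focus(pixel)[None, :])[0]
    return intensity[pixel] / np.delete(intensity, pixel).mean()

etas = np.array([enhancement(p) for p in range(8)])
print(f"mean enhancement over 8 target pixels: {etas.mean():.1f}")
```

In this toy model the predicted pattern switches on those input pixels whose contribution increases the intensity at the target pixel, which is why a focus forms there. The numbers it produces are properties of the simulation only and should not be compared with the measured enhancements.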
The experimental setup is schematically shown in Fig. 1(b) and explained in detail in Methods. A laser beam is sent to the SLM, a high-speed digital micromirror device (DMD), which displays binary patterns of high and null intensity values (1s and 0s) with a frame rate of up to 22.7 kHz. We have tested our system both with pseudo-random checkerboard-like patterns and with patterns obtained from Hadamard matrices. These patterns are imaged onto the back aperture of a microscope objective that focuses the light beam through the scatterer. A second identical microscope objective and a pair of lenses are used to collect and image the speckle patterns onto the CCD camera. A second detection path, including a non-polarizing beam splitter and an additional CCD camera, is added for experiments with reflected light (see below and Methods for details).
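As an illustration of the two pattern families mentioned above, the following sketch generates binary Hadamard-derived patterns and pseudo-random checkerboards with NumPy. The Sylvester construction and the mapping of the ±1 Hadamard entries to 1/0 are assumptions made for this example; the text does not specify how the Hadamard patterns were binarized.

```python
import numpy as np

def sylvester_hadamard(order):
    """Hadamard matrix of size order x order; order must be a power of two."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    H = np.array([[1]])
    while H.shape[0] < order:
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H

def hadamard_patterns(side):
    """Binary patterns of shape (side*side, side, side), one per Hadamard row."""
    H = sylvester_hadamard(side * side)
    # Assumed mapping: +1 -> mirror on (1), -1 -> mirror off (0).
    return ((H + 1) // 2).astype(np.uint8).reshape(-1, side, side)

def random_checkerboards(count, side, seed=0):
    """Pseudo-random binary patterns of shape (count, side, side)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(count, side, side), dtype=np.uint8)

# Small demonstration: 64 patterns of 8 x 8 macropixels.
# side=64 would give the 4096 patterns of 64 x 64 macropixels used in the text.
patterns = hadamard_patterns(8)
boards = random_checkerboards(5, 64)
```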

SLNNs for light control through glass diffusers
In Fig. 2(a) we demonstrate the ability of SLNNs to generate diffraction-limited Gaussian foci through a glass diffuser (as used for example in [17]) at different positions in the image plane. Top images schematically illustrate the NN architecture and training process (see Methods for details); below that, the first rows show the intensity distribution captured with the CCD camera, while the second and third rows display horizontal and vertical cross sections through the center of the focus. Insets and red-dashed lines show the position and shape of the target distribution that is fed into the neural network and for which the network then calculates the appropriate SLM mask (the target distribution is displayed normalized to the experimentally recorded intensity).
The images show an excellent agreement between the desired and recorded patterns with a signal-to-noise ratio > 10 and an enhancement η = 32 ± 5 (see Methods, and Supplementary Fig. 1 for scanning across the entire field of view). The time to achieve light control depends on the number of recorded frames and the training time. For the typical datasets of 10000 frames (with a resolution of 64 × 64 macropixels on the DMD and 96 × 96 pixels on the CCD, recorded at 1000 Hz), training on a single GPU required 34 s, and could be reduced to 18 s while keeping an enhancement η > 10 (see Supplementary Fig. 5).

CNNs for light control through glass diffusers
While SLNNs are easy to implement and train, the underlying linearity limits their performance for many tasks [66]. A plethora of other network architectures have therefore been developed with the goal of improving on the performance of SLNNs. The most straightforward generalization of SLNNs combines multiple NN layers with connections between all neurons, resulting in a densely connected network. While such densely connected networks are not limited by linearity, the increased number of parameters also makes them more challenging to train, particularly for large data sets such as stacks of high-resolution images. Network architectures were therefore developed that take into account the structure of the underlying data, and convolutional neural networks (CNNs) have emerged as one of the most successful solutions for image processing [66]. The typical architecture of a CNN consists of multiple convolutional layers that extract features across an entire field of view, interspersed with pooling layers that down-sample the image, and fully connected layers. While a large number of different networks are applied for different tasks, with a few to a few hundred convolutional layers [32,33,66], we here found that a three-layer CNN (see Fig. 2(b) and Methods for details) could be used for scattering control through a glass diffuser. To circumvent the difficulties of training nonlinear networks, we pretrained the network with an autoencoder [66], a network that compresses and then uncompresses the data into a close approximation of the input. The part of the network that was used for compression then served as the initial CNN for scattering control (see Methods for details). In Fig. 2(b) we demonstrate the ability of CNNs to generate diffraction-limited Gaussian foci through a glass diffuser. The images again show an excellent agreement between the target pattern that was fed into the CNN (red dashed lines and inset) and the recorded patterns with a signal-to-noise ratio > 10 and an enhancement η = 3.6 ± 0.9 (measured over an ensemble of 25 different focus positions, see Methods and Supplementary Fig. 2 for scanning across the entire field of view). CNNs in this case reduce the number of network parameters by 80% compared to SLNNs.

SLNNs for point spread function engineering
Since the SLNN is a linear network, we reasoned that after training it should be able to take advantage of the linearity of light scattering in non-absorbing media to generate arbitrary light distributions. To demonstrate the validity of our approach for controlling the light intensity distribution after the scatterer, we generated in Fig. 3 a variety of non-trivial shapes using SLNNs. Again, there is an excellent correspondence between the target distribution that enters the network (insets) and the recorded patterns. We note that, thanks to the high frame rate of the DMD (22.7 kHz), any shape can alternatively be generated with high fidelity by painting it spot by spot, e.g., similar to approaches for trapping ultracold atoms [67] or optogenetics [68], as shown in Supplementary Fig. 2 and in Supplementary Movies 1-6.

SLNNs for light control through optical fibers
Our system is well suited to correct for scattering in materials with slow dynamics (on the order of a few tens of seconds, see Supplementary Fig. 5), such as optical fibers [12,69,70]. In particular, multimode optical fibers are ideal for applications in imaging and optogenetics, but modal dispersion and cross-talk distribute light into an apparently random speckle pattern. We therefore tested the performance of SLNNs for controlled light delivery through multimode fibers. In Fig. 4 a single focus is scanned (η = 10 ± 3) along different paths across the field of view of the fiber (including a circle, a square, and a 5 × 5 array of points), demonstrating that SLNNs are able to precisely control light through optical fibers.

NNs find functional relationships between transmitted and reflected speckle patterns
While many of the methods for focusing light through strongly scattering media rely on measuring transmitted light (as in the experiments described so far), many applications could benefit from using reflected light. Towards that goal, we tested whether neural networks can take advantage of mutual information between transmission and reflection images [50,59,60] for light control. In the following we show that with the help of a functional relationship between reflected and transmitted speckle patterns one can control transmitted light using reflected light.
For this experiment we simultaneously recorded transmitted and reflected light by adding a non-polarizing beam splitter, a pair of imaging lenses, and a CCD camera to the setup, as shown in Fig. 1. To achieve a good signal-to-noise ratio of transmitted as well as reflected speckle patterns, we used paper as scattering material, which was more strongly scattering than the glass diffuser [56] and led to an increased amount of backscattered light. Movies of simultaneously recorded transmitted and reflected speckle patterns (with a size of 128 × 128 pixels) were then generated by illuminating the sample with a series of checkerboard projections (64 × 64). Once the speckle patterns were recorded, we trained an SLNN (SLNN1 in Fig. 6(a)) to find the relationship between transmitted and reflected light. To quantify the performance of this network we used the Pearson correlation coefficient as a similarity measure [71] between transmission speckle patterns predicted by the network and measured transmission speckle patterns. Fig. 5(a) shows the histogram of these correlation coefficients and, for comparison, the correlation coefficient between transmitted and reflected speckle patterns. Figs. 5(b) and (c) show, respectively, an example of measured and predicted speckle patterns with median correlation (ρ = 0.50), while Figs. 5(d) and (e) show an example of a measured reflected speckle pattern when a corresponding focus is generated in transmission.
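For reference, the Pearson correlation coefficient between two speckle images can be computed as in the sketch below. The function name `pearson` and the treatment of each image as a flattened vector are choices made for this example; the arrays used in the demonstration are synthetic, not measured data.

```python
import numpy as np

def pearson(image_a, image_b):
    """Pearson correlation coefficient between two images of equal shape."""
    a = np.asarray(image_a, dtype=float).ravel()
    b = np.asarray(image_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Synthetic stand-ins for a measured and a predicted 128 x 128 speckle pattern.
rng = np.random.default_rng(1)
measured = rng.exponential(size=(128, 128))
predicted = measured + rng.normal(scale=1.0, size=measured.shape)
rho = pearson(measured, predicted)
```

A value of 1 indicates identical patterns up to a positive scale and offset, 0 indicates no linear relationship, and negative values indicate anti-correlation.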

Focusing through opaque media in transmission using reflected light
To take advantage of this mutual relationship between transmitted and reflected light for light control, we trained a second independent network (SLNN2 in Fig. 6(a)) to infer the relation between reflected speckles and illumination patterns, similar to the training of the SLNN in the transmission configuration in the previous sections of the article (that is, with the reflected speckles as input of the SLNN and the illumination patterns as output, see Methods for details). Combining these SLNNs, as shown in Fig. 6(a) (see Methods for details), allowed us to form transmission foci by only taking advantage of reflected light, based on SLNN1 relating reflected to transmitted speckle patterns. In Figs. 6(b)-(d) we show, respectively, that we can scan a circle, a square, and a grid, demonstrating full control of transmitted modes using reflected modes over the entire field of view. This additionally demonstrates that the predicted speckle patterns (Fig. 5) are sufficiently accurate for high-resolution light control. The measured enhancement in this case was η = 12 ± 4 (measured over an ensemble of 25 different focus positions). Note also that even though paper is more strongly scattering [56] than the glass diffuser, this does not hinder the SLNN from light control. Focusing through paper with an SLNN as reported in the previous sections is shown in Supplementary Fig. 3.

DISCUSSION
In summary, we showed that NNs can be used to efficiently shape light through a variety of media with different scattering characteristics (Figs. 1 to 4). Once the NNs are trained, we achieve real-time, single-shot light control through the scattering material with high fidelity, in a fashion similar to transmission matrix approaches [10,29,72]. Specifically, we demonstrated the ability of SLNNs to focus and scan light through glass diffusers, multimode fibers, and paper, and to generate arbitrary light distributions through glass diffusers. We further showed that nonlinear networks, specifically a three-layer CNN, can focus light through a glass diffuser.
In a second set of experiments, we demonstrated that with the help of two networks, one establishing an explicit functional relationship between light that is transmitted through a scatterer and light that is reflected, and one relating reflected light to illumination patterns, we can control transmission using reflection at diffraction-limited resolution. SLNNs therefore prove to be well suited to take advantage of a recently described mutual information between transmitted and backscattered light for light control [50,59,60].
To compare the performance of the NN method for focusing in transmission with other schemes, we quantified the enhancement as in [73] (see Methods). For the SLNN we obtained values in the lower range of what was reported for intensity-only modulation [30,41,73], and for the CNNs we obtained lower values (but still with a sufficient SNR and enhancement for imaging applications). A direct quantitative comparison needs to take into account the specific combination of scatterer, optical setup (we used lower N.A. objectives than many of the reports with higher enhancement), and the number of controlled modes. The maximum number of controlled modes in our experiments was 4096 (64 × 64); while the SLM has 768 × 1024 pixels, using more degrees of freedom was not warranted given our optical configuration. The number of controllable modes is limited by the 12 GB memory of the GPU, which needs to accommodate the NN model and a single batch of training data, in our case typically 150 frames, but the memory could be expanded by using multiple GPUs.
To increase the enhancement and light control one could in addition modulate phase (see p. 42 of [1] for the effect on enhancement and signal-to-noise ratio), and the NN approach could be extended to any combination of stimulus-response pairs, including phase or polarization on either the detection or projection side, or both. An advantage of only using binary intensity modulation and intensity measurements [30] is that it simplifies the setup compared to approaches that also rely on phase information. Additionally, although we used monochromatic light, our approach could also be used with pulsed light [74].
For applications, the time it takes to achieve light control is critical. For the transmission matrix approach as well as the NN approach this time can be broken down into two parts, the time for acquiring the data and the time to compute the wavefront correction. The acquisition time is ultimately limited by the number of required frames. Typical numbers for the transmission matrix approach in recent reports range from 4000 [29] to 12000 [70], which is similar to the number of frames used in our experiments, which was typically 10000 or less, see Fig. 4(b). The time-limiting factor in our experiments was training of the NNs. For the largest data sets the time required for training was less than 35 seconds for the SLNNs and less than 50 seconds for fine-tuning the CNNs. To accelerate the process we tested training with a reduced amount of data (Supplementary Fig. 4), which sped up training at the cost of lower enhancement. The shortest training time on a single GPU with the SLNN that led to a focus with significant enhancement (η > 10) was obtained with 5000 frames in 18 seconds. For comparison, the time required for calculating the transmission matrix varies for different techniques, from a simple Hermitian conjugation operation to computationally more demanding approaches which require 15 seconds of matrix multiplication on a GPU [70]. While some methods that optimize a single mode can be very fast (for example 33.8 ms in [11]), this still results in a comparable correction time for a full field of view (of about 5 minutes for 96 × 96 pixels in this example).
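The timing figures quoted above can be checked with a few lines of arithmetic. The snippet only uses numbers stated in the text (10000 frames, a camera rate of about 1000 fps, 33.8 ms per optimized focus, and a 96 × 96 pixel field of view); the variable names are illustrative.

```python
# Acquisition: 10000 frames recorded at about 1000 frames per second.
frames = 10_000
camera_fps = 1_000
acquisition_s = frames / camera_fps

# Sequential single-focus optimization: 33.8 ms per focus, repeated for
# every pixel of a 96 x 96 field of view.
per_focus_s = 33.8e-3
n_positions = 96 * 96
sequential_s = per_focus_s * n_positions
sequential_min = sequential_s / 60

print(f"acquisition: {acquisition_s:.0f} s")
print(f"sequential focusing: {sequential_s:.1f} s = {sequential_min:.1f} min")
```

The acquisition therefore takes about 10 s, and optimizing every focus position one by one takes about 311 s, or roughly 5 minutes, consistent with the comparison made in the text.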
Further optimization of the NN approach could be achieved by optimizing the network architectures. Here, we compared two basic networks, SLNNs and three-layer CNNs. SLNNs take advantage of the linearity of scattering (as does the transmission matrix approach) and therefore can generalize from speckle patterns to arbitrary light distributions. Multi-layer NNs in contrast need to be specifically designed and trained to generate a desired type of light distribution. At the same time, the fact that multi-layer NNs are independent of assumptions about the underlying physical model (such as linearity of scattering) and can efficiently reduce the dimensionality of the images through convolutional layers, as well as lower the number of parameters required for training (by 80% compared to the SLNNs in our case), will likely prove advantageous for applications.
For many applications of light control through scattering media, such as imaging, sensing, or communication, it will be necessary to develop methods that can work with reflected light [50]. For example, in biological microscopy, fluorescence signals can serve as feedback for scattering correction [2,19], but they require labeling of the sample and are often dim, particularly before wavefront correction. Other schemes for light control in tissue resort to the assistance of acoustic waves [5,75] but do not achieve diffraction-limited optical resolution [19]. The most broadly applicable implementation for wavefront correction takes advantage of backscattered light, as for example in optical coherence tomography or related approaches [52][53][54][55][56][57][58]. However, the availability of weakly scattered photons ultimately limits the imaging depth of these methods, and ways to take advantage of strongly scattered light are therefore needed. Strategies for light control using strongly scattered, reflected light have indeed been developed [61][62][63] for maximizing the energy delivered into the material [61,62] or onto an embedded strongly scattering target [63] without, however, exerting full independent control over the transmitted modes.
We here took advantage of mutual information between transmitted and reflected speckle patterns [50,59,60] and used NNs to show that it is indeed possible to control transmitted light with reflected light with sufficient accuracy for high-resolution focusing and scanning (Fig. 6). We achieved this by establishing an explicit functional relation (with NNs) between transmitted and reflected speckle patterns (Fig. 5) with sufficient accuracy for high-resolution transmission control. That such a relationship can be established (with a linear network) could not necessarily be expected based on the mutual information relationship in [50].
The limitation of the current approach for applications is that it first requires characterizing the transmission and reflection properties of the scatterer for the specific field of view, which still requires unobstructed access to the focal plane behind the scatterer. How could this limitation be overcome? One of the distinctions of neural networks is their ability to generalize. One potential avenue would therefore be to train appropriate NN models on sufficiently broad training sets and to adapt these models to the specific sample or field of view, e.g., using backscattered light. For example, CNNs are the building blocks for many of the more advanced network techniques that analyze novel visual scenes based on previously learned data sets [32,33,66], and such methods might also be harnessed for light scattering.
Independent of this, the simplicity, effectiveness, and flexibility of the method presented here make it suitable for many different applications that can take advantage of scattering control in transmission or now also in reflection (for example super-resolution imaging [76]), as well as for the further analysis of the relationship between transmission and reflection in scattering materials.

Experimental setup
The laser beam (λ = 640 nm, power up to P = 100 mW; iBeamSmart, Toptica) is expanded with a telescope (f1 = 15 mm, f2 = 150 mm) and sent to the SLM. The SLM is a high-speed digital micromirror device (DMD, 768 × 1024 pixels, pixel size = 13.7 × 13.7 µm²; model V-7000 from Vialux) allowing binary amplitude modulation at a maximum frame rate of 22.7 kHz and is used to display binary pseudo-random checkerboard or Hadamard patterns with typically 64 × 64 macropixels extending over the central 768 × 768 pixels of the DMD (12 × 12 micromirrors per macropixel). Two additional lenses (f3 = 200 mm, f4 = 50 mm) combined with a pinhole are used after the DMD to filter the maximum-intensity diffraction order and to demagnify and image the DMD onto the back aperture of the microscope objective (10X, 0.25 NA, Olympus). The objective focuses the light beam through the scatterer (a glass diffuser, Thorlabs DG20-120; a step-index multimode fiber optic patch cable, Thorlabs M38L02; or a piece of white paper) and a second identical microscope objective is used to collect the scattered light. Finally, a pair of lenses (f5 = 100 mm, f6 = 75 mm) in 2f configuration images the back aperture of the second microscope objective onto the CCD camera (acA640-750um, Basler), with a frame rate of 500 fps at full resolution of 480 × 640 pixels (pixel size 4.8 × 4.8 µm²). Both microscope objectives and the scattering material are mounted on XYZ stages (omitted in Fig. 1) for aligning the system and moving the sample to different positions, as well as for displacing the image plane axially. In our experiments typically 10000 checkerboard patterns are uploaded to the internal memory of the DMD. Then, the projection of a pattern on the DMD triggers the frame capture of the CCD camera. The maximum frame rate of the DMD is 22.7 kHz and the maximum frame rate of the CCD camera is about 1000 fps at a resolution of 96 × 96 pixels, which allows us to record the whole sequence in about 10 s. We also note that our approach is valid for larger fields of view than those shown in the main figures (20 × 20 µm², see Supplementary Information and Supplementary Fig. 1). For experiments with reflected light, a non-polarizing beam splitter redirects the backscattered speckles towards a pair of lenses (f7 = 50 mm, f8 = 25 mm) in 2f configuration that image the back aperture of the first microscope objective onto a CCD camera identical to and synchronized with the one used to capture the transmitted speckles.
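The macropixel geometry described above (64 × 64 macropixels of 12 × 12 micromirrors each, occupying the central 768 × 768 mirrors of the 768 × 1024 DMD) can be sketched as follows. The function name `to_dmd_frame` is illustrative, and the choice to leave the mirrors outside the central square switched off is an assumption of this example.

```python
import numpy as np

DMD_ROWS, DMD_COLS = 768, 1024

def to_dmd_frame(pattern, mirrors_per_macropixel=12):
    """Expand a binary macropixel pattern into a full-resolution DMD frame."""
    pattern = np.asarray(pattern, dtype=np.uint8)
    block = np.ones((mirrors_per_macropixel, mirrors_per_macropixel), dtype=np.uint8)
    active = np.kron(pattern, block)  # each macropixel becomes a 12 x 12 block
    if active.shape[0] > DMD_ROWS or active.shape[1] > DMD_COLS:
        raise ValueError("pattern does not fit on the DMD")
    frame = np.zeros((DMD_ROWS, DMD_COLS), dtype=np.uint8)  # assumed: unused mirrors off
    r0 = (DMD_ROWS - active.shape[0]) // 2
    c0 = (DMD_COLS - active.shape[1]) // 2
    frame[r0:r0 + active.shape[0], c0:c0 + active.shape[1]] = active
    return frame

rng = np.random.default_rng(0)
macro = rng.integers(0, 2, size=(64, 64), dtype=np.uint8)
frame = to_dmd_frame(macro)
```

With these numbers the active region spans all 768 rows and columns 128 to 895, so 128 columns on each side of the DMD remain unused.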

Enhancement factor metric
The quality of the generated foci is analyzed with an automated procedure that generates spots at different positions (up to 25) placed in a grid throughout the whole field of view and measures the enhancement, defined as $\eta = I_{\mathrm{focus}} / I_{\mathrm{speckle}}$, where $I_{\mathrm{focus}}$ is the intensity at the generated focus and $I_{\mathrm{speckle}}$ is the mean value of the background speckle [73].
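A minimal sketch of how such an enhancement value could be computed from a camera image is given below. It is not the authors' analysis code: the function name, the use of the single pixel at the focus position as the focus intensity, and the small exclusion disc around the focus when averaging the background are assumptions of this example.

```python
import numpy as np

def enhancement(image, focus_rc, exclude_radius=2):
    """Focus intensity divided by the mean background intensity.

    image          : 2D array of recorded intensities
    focus_rc       : (row, column) of the intended focus
    exclude_radius : pixels around the focus left out of the background mean
    """
    image = np.asarray(image, dtype=float)
    r, c = focus_rc
    rows, cols = np.indices(image.shape)
    near_focus = (rows - r) ** 2 + (cols - c) ** 2 <= exclude_radius ** 2
    return image[r, c] / image[~near_focus].mean()

# Synthetic example: uniform background of 1 with a focus of intensity 30.
demo = np.ones((96, 96))
demo[48, 48] = 30.0
eta = enhancement(demo, (48, 48))
```

Averaging this quantity over a grid of focus positions gives a mean and spread that can be reported in the same form as the η values in the main text.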

Computer specifications
Our computer runs Ubuntu Linux and has an Intel Xeon CPU E5-1620 v4 @ 3.50 GHz, 32 GB of DDR5 RAM, and an Nvidia Titan XP GPU with 3840 CUDA cores running at 1.60 GHz and 12 GB of GDDR5X memory running at over 11 Gbps.

Neural network design and performance
We use the Keras library [77] with the TensorFlow [78] back-end for GPU-accelerated neural network training.The networks are trained to map grayscale speckle images to the corresponding binary illumination patterns with a subset of the total dataset of image pairs (8000 pairs in our case) and tested on the previously not introduced data (the remaining 2000 pairs).Once the network is trained, we input the desired PSF and the output binary map is uploaded to the DMD for light control through the diffuser or fiber.
The SLNN we used is a single-layer perceptron, a network consisting of one fully connected layer followed by a non-linear activation function bounding the output to the 0-1 range. In principle, it can be represented as a matrix dot product with bias addition and a sigmoid function applied element-wise to the resulting vector. We found that, with the activation function applied to each element individually, the model is prone to over-fitting and does not generalize well. As a solution, we replaced the nonlinear activation function with a binarization function whose threshold is common to the whole predicted pattern (the mean value of the prediction), which results in a more robust model with better focus enhancement and faster training. The training time depends on the number of images used (8000 in our case), the batch size (number of images taken for each iteration of the training algorithm, 150 in our case), and the number of epochs (up to 20 for the results presented here; see Supplementary Figs. 3 and 4 for SLNN training performance). With these parameters the single-layer perceptron requires less than 35 seconds for training, while the predicted patterns take about 1 s to be calculated. However, we have verified that shorter training times with acceptable enhancement can be obtained by reducing the number of image pairs used in the training. For an analysis of how the training time and the enhancement of the focus depend on the number of image pairs used in the training, see Supplementary Fig. 5.
To explore functional relations between transmitted and reflected speckle patterns, we concatenated two SLNNs similar to the one described above. The first SLNN (SLNN1) uses the transmitted speckle as input and connects it to the reflected speckle pattern (output) through a single fully connected layer (without binarization). The second SLNN (SLNN2) connects the reflected speckle (input) to the illumination patterns (output), also through a fully connected layer, and includes the mean-threshold binarization. Both sets of speckle patterns can be generated with independent checkerboard illumination patterns. For training these two SLNNs we used the same number of images, batch size, and number of epochs as for the SLNN discussed above, with similar performance. Finally, the desired illumination in transmission is fed into SLNN1, which predicts a reflected speckle pattern that in turn serves as input for SLNN2. The output of SLNN2 is a binary illumination pattern that is sent to the DMD in order to experimentally obtain the desired illumination.
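The data flow described above can be sketched in plain NumPy. This is an illustrative stand-in rather than the Keras implementation used in this work: the class `SLNN`, the helper names, the mean-squared-error loss, the learning rate, and the small array sizes are all assumptions, and the random arrays only exercise shapes and data flow (they carry no physical relation between speckles and patterns).

```python
import numpy as np

def binarize_by_mean(values):
    """Binarize each predicted pattern with one shared threshold: its own mean."""
    threshold = values.mean(axis=-1, keepdims=True)
    return (values > threshold).astype(np.uint8)

class SLNN:
    """One fully connected layer, y = x @ W + b, fitted by mini-batch
    gradient descent on a mean-squared-error loss (assumed loss)."""

    def __init__(self, n_in, n_out, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def linear(self, x):
        return x @ self.W + self.b

    def fit(self, x, y, epochs=20, batch_size=150, lr=0.01):
        n = x.shape[0]
        for _ in range(epochs):
            order = self.rng.permutation(n)          # reshuffle every epoch
            for start in range(0, n, batch_size):
                idx = order[start:start + batch_size]
                err = self.linear(x[idx]) - y[idx]
                self.W -= lr * x[idx].T @ err / len(idx)
                self.b -= lr * err.mean(axis=0)
        return self

def predict_illumination(slnn1, slnn2, target_transmission):
    """Chain the two networks: target in transmission -> reflected speckle -> DMD pattern."""
    reflected = slnn1.linear(target_transmission)      # SLNN1: no binarization
    return binarize_by_mean(slnn2.linear(reflected))   # SLNN2: mean-threshold binarization

# Synthetic stand-in data (hypothetical sizes, far smaller than the experiment).
rng = np.random.default_rng(1)
n_pairs, n_pix, n_mirrors = 300, 64, 16
transmitted = rng.random((n_pairs, n_pix))
reflected = rng.random((n_pairs, n_pix))
patterns = rng.integers(0, 2, size=(n_pairs, n_mirrors)).astype(float)

slnn1 = SLNN(n_pix, n_pix).fit(transmitted, reflected)
slnn2 = SLNN(n_pix, n_mirrors).fit(reflected, patterns)

target = np.zeros(n_pix)
target[n_pix // 2] = 1.0                               # single desired focus
dmd_pattern = predict_illumination(slnn1, slnn2, target)
```

Using `slnn2` alone, with a target speckle as input, corresponds to the single-SLNN case; the only non-linearity in either case is the per-pattern mean threshold.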
In perceptron-like models a single fully connected layer contains a large number of parameters (the product of the input and output vector dimensions), which makes these models more demanding to train as the resolution of the illumination and speckle images, and with it the memory demand, increases. CNNs can efficiently reduce the number of trainable parameters. We used a model with three convolutional layers with 48 (9 × 9), 24 (5 × 5) and 12 (3 × 3) filters, respectively, each followed by a rectified linear unit (ReLU) activation and a (2 × 2) max pooling operation, followed by a fully connected layer with a 0.25 dropout rate. This configuration achieves a performance similar to the SLNN in controlling a single focal spot while having 20% of the SLNN's number of parameters.
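To make the parameter comparison concrete, the following sketch counts trainable parameters for the layer sizes quoted above. Several quantities are assumptions and not taken from the text: a single-channel 96 × 96 input, 'same' padding (so that only the pooling halves the image side), and a hypothetical output pattern of 32 × 32 elements.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a 2D convolution with square kernels."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def cnn_params(side, n_out, layers=((48, 9), (24, 5), (12, 3))):
    """Three conv layers (filters, kernel), each followed by 2 x 2 pooling, then one dense layer."""
    total, channels = 0, 1                 # grayscale input assumed
    for filters, kernel in layers:
        total += conv_params(kernel, channels, filters)
        channels = filters
        side //= 2                         # 2 x 2 max pooling, 'same' padding assumed
    return total + dense_params(side * side * channels, n_out)

side, n_out = 96, 32 * 32                  # n_out is a hypothetical pattern size
slnn = dense_params(side * side, n_out)
cnn = cnn_params(side, n_out)
print(cnn, slnn, round(cnn / slnn, 3))     # prints: 1805860 9438208 0.191
```

Under these assumed sizes the convolutional model has roughly one fifth of the single-layer model's parameters; the exact ratio depends on the input resolution and pattern size. ReLU, pooling, and dropout add no trainable parameters.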
Like any deeper network, the CNN requires a longer training time and a more extensive dataset. A workaround is offered by the fine-tuning technique: the convolutional layers are pretrained separately in an autoencoder on a dataset of 40000 speckle images. An autoencoder is a network trained to map its input to itself; however, it contains a bottleneck, a lower-dimensional middle layer (latent space) where a compressed representation of the data is learned. Our autoencoder has three convolutional layers, as needed for the proposed CNN model, and a symmetrical deconvolutional decoder. The training time varies considerably with dataset size and speckle image resolution, and it is best to provide as much data as possible. Good training results were achieved after 20 minutes of training with 40000 samples of size 256 × 256. The illumination-predicting CNN is then constructed from the pretrained encoder and an untrained dense output layer. During training through backpropagation, predominantly the output layer is adjusted, while the convolutional layers are already initialized with adequate values to compress the speckle images and extract features. This reduces the CNN training time to under 50 seconds while using the same 8000 image-pair dataset, comparable to the SLNN.

FIG. 1. (a) Approach for light control through scattering media with NNs. A NN is trained with pairs of illumination and speckle patterns (illustration; see Supplementary Fig. 1 for examples of actual illumination and speckle patterns), using the speckle patterns as input of the network and the illumination as output. Once the NN is trained, it is used to predict the illumination necessary to generate a target pattern after the scattering material. The predicted illumination is subsequently sent through the material, resulting in the desired light pattern. (b) Experimental setup. An expanded laser beam at 640 nm illuminates the SLM (in our case a DMD), allowing for binary modulation of the light beam. After filtering the maximum-intensity diffraction order with a pair of lenses and a pinhole (PH), the DMD pattern is sent to the scattering material (SM) with a 10× microscope objective (MO, NA = 0.25). An additional identical objective is used to collect the transmitted light, which is imaged onto a CCD camera with an additional pair of lenses. Both objectives and the sample are mounted on XYZ translation stages (omitted in the figure). A non-polarizing beam splitter (NPBS), a pair of lenses and a CCD camera are used to retrieve speckle patterns reflected by the sample for experiments with combined transmission and reflection.