Fast reconstruction of single-shot wide-angle diffraction images through deep learning

Single-shot X-ray imaging of short-lived nanostructures such as clusters and nanoparticles near a phase transition or non-crystallizing objects such as large proteins and viruses is currently the most elegant method for characterizing their structure. Using hard X-ray radiation provides scattering images that encode two-dimensional projections [1], which can be combined to identify the full three-dimensional object structure from multiple identical samples [2,3].
Wide-angle scattering using XUV or soft X-rays, despite yielding lower resolution, provides three-dimensional structural information in a single shot [4,5] and has opened routes towards the characterization of non-reproducible objects in the gas phase. The retrieval of the structural information contained in wide-angle scattering images is highly non-trivial, and currently no efficient algorithm is known. Here we show that deep learning networks, trained with simulated scattering data, allow for fast and accurate reconstruction of shape and orientation of nanoparticles from experimental images. The gain in speed compared to conventional retrieval techniques opens the route for automated structure reconstruction algorithms capable of real-time discrimination and pre-identification of nanostructures in scattering experiments with high repetition rate, thus representing the enabling technology for fast femtosecond nanocrystallography.
Sources of soft and hard X-rays with large photon flux such as free electron lasers [6] have enabled the high-resolution imaging of unsupported nanosystems such as viruses [2], helium droplets [7][8][9], rare-gas clusters [10], or metallic nanoparticles [5]. For reproducible samples, a set of scattering images for different orientations in the small-angle scattering limit, each delivering a two-dimensional projection of the object's density, can be used to retrieve its three-dimensional structure using conventional reconstruction algorithms [11]. Short-lived and non-reproducible objects, however, elude the repeated acquisition of several images required for the tomographic reconstruction from small-angle scattering. The partial three-dimensional information contained in wide-angle scattering makes it possible to overcome this main deficiency, at the price of an even more complicated inversion problem [4,5,8]. Finding a fast reconstruction method thus remains the major obstacle for exploiting the potential of wide-angle scattering for genuine single-shot structure characterization.
Two aspects distinguish wide-angle from small-angle scattering. First, the projection approximation is no longer valid due to substantial contributions of the longitudinal component of the wavevector, such that the curvature of the Ewald sphere plays an important role. Second, for the wavelength range for which wide-angle scattering is realised, the refractive index of most materials deviates substantially from unity, and hence multiple scattering, absorption, backpropagating waves, and refraction all have to be accounted for. Currently, all these constraints can only be met by solving the full three-dimensional scattering problem by, e.g., finite-difference time-domain (FDTD) methods, gridless discrete-dipole approximation (DDA) techniques, or appropriate approximate solutions based on multislice Fourier transform (MSFT) techniques [9,12].
These methods make it possible to compute the wide-angle scattering pattern for an assumed geometry model of the nanoparticle. However, determining the geometry from such patterns is highly nontrivial, as no rigorous inversion method exists. Consequently, existing applications of wide-angle scattering had to be based on a parametrized geometry model whose parameters are determined by an iterative forward fit, e.g. by an ensemble of optimization trajectories in phase space as employed in the simplex Monte Carlo approach in Ref. [9]. Because at least one forward simulation has to be performed for every iteration step, this method is only applicable to a small data set and a sufficiently simple geometry model [9]. Hence, there is an urgent need for efficient reconstruction methods that can be used in real time for a large data set. Here we present a proof-of-principle study that shows, by considering icosahedra, that a neural network trained with simulated scattering images provides a high-quality reconstruction of particle size and orientation with unprecedented speed.
Machine learning using neural networks, and deep learning in particular, is ideally suited for the extraction of structural parameters from scattering images, as this is equivalent to the retrieval of a small number of parameters or classes from high-dimensional spaces [13,14]. Originally conceived for analyzing big data, deep learning has already had significant impact in the natural sciences, ranging from analyzing phase transitions and properties of matter [15][16][17][18] and simulations of many-body quantum systems [19] to quantum state reconstruction [20,21]. In contrast to data science applications where the neural networks are both trained and validated on real-world data, we take the decisive step of training a neural network on augmented theoretical data and applying it to experimental scattering data.
The choice of icosahedra as test objects was motivated by their ubiquity in nature, ranging from viruses [2,3,22] to rare-gas [23] and metal clusters [5]. Focussing on the last example, which already constitutes a wide-angle scenario (see Fig. 1a), we compute scattering images of icosahedral silver clusters with a range of sizes and spatial orientations using an MSFT algorithm [9]; these images serve as the training data. The employed generalized multislice Fourier transform (MSFT) algorithm includes an effective treatment of absorption [5].
We numerically generated 20000 individual scattering images for clusters with a uniform size distribution (30 nm ≤ R ≤ 160 nm) and random orientations in the fundamental domain of the icosahedron, which represent perfect theoretical data. The fundamental domain for the representation by unit quaternions (see Methods for details) is bounded by a dodecahedron inside the quaternion sphere [24], and any rotation in the axis-angle representation may be projected into this domain by determining the distance to the closest quaternion associated with one of the symmetry rotations.
The ultimate goal is to analyze realistic scattering data that are obtained from experiments with various imperfections. Therefore, the neural network should not be trained solely using the ideal theoretical data, but also with appropriately augmented data [25]. In that way, the network will be trained to focus on physically relevant features. Here, we augment our data by adding noise, blur, spatial offsets, a central hole, as well as blind spots and cropping of the images [26][27][28]. These augmentation features address common experimental imperfections associated with photon noise, limited detector resolution, source-point and beam-pointing jitter, transmission of the high-intensity primary beam, and detector segmentation and finite size (see Fig. 2). These augmentations (see Methods for details) increase the training set 11-fold.
We use a state-of-the-art custom 34-layer ResNet [29] containing approximately 20 million trainable parameters (see Methods for details). The validation of the network has been performed on a separate set of augmented theoretical data (unknown to the network). The network was trained with respect to the mean-squared deviation of target and prediction vectors. The performance of the network is benchmarked by the relative prediction error (relative $\ell_1$ norm $\|x\|_1$, blue bins in Fig. 3a) normalized to the possible parameter range. In addition, we specify the relative maximal prediction error over all parameters (maximum norm $\|x\|_\infty$, red bins in Fig. 3a). The reconstruction of the relevant physical parameters is highly accurate, with prediction errors well below 1%. This compares very favourably with established forward-fitting procedures, and demonstrates the reliability of the deep learning approach. Once trained, the neural network vastly outperforms any forward-fitting reconstruction method (see Fig. 3b), since such methods are limited in their temporal performance by the perpetual necessity to generate new numerical scattering data. The entire reconstruction process by the neural network requires only a fraction of the numerical effort needed to generate one scattering image. For our example, a speed-up of more than three orders of magnitude was achieved.
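As an illustration of the two error measures, the following minimal sketch computes a range-normalized $\ell_1$ error and a maximum-norm error for a batch of predictions. It rests on assumptions not stated in the text: the parameter vector is taken to be one radius plus four quaternion components, the quaternion components are normalized to the range [-1, 1], and the $\ell_1$ measure is averaged over the parameters. The function name `relative_errors` is hypothetical.

```python
import numpy as np

def relative_errors(pred, target, lower, upper):
    """Range-normalized prediction errors for arrays of shape (n_samples, n_params).

    Returns (l1, linf): the mean and the maximum over the parameters of
    |pred - target| / (upper - lower). Averaging for the l1 measure is an
    assumption; the text only names the norms.
    """
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    span = np.asarray(upper, float) - np.asarray(lower, float)
    rel = np.abs(pred - target) / span          # per-parameter relative error
    return rel.mean(axis=-1), rel.max(axis=-1)

# Assumed parameter layout: [R in nm, q0, q1, q2, q3]
lower = np.array([30.0, -1.0, -1.0, -1.0, -1.0])
upper = np.array([160.0, 1.0, 1.0, 1.0, 1.0])

pred = np.array([[100.0, 0.9, 0.1, 0.0, 0.1]])
target = np.array([[98.7, 0.9, 0.1, 0.0, 0.0]])
l1, linf = relative_errors(pred, target, lower, upper)
```

Normalizing by the parameter range makes the size error (in nm) and the dimensionless orientation errors comparable within a single histogram.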
We demonstrate the network's ability to recognize structures in imperfect experimental images by applying it to data taken from Ref. [5], where two icosahedral clusters have been identified among the images (left column in Fig. 4). The reconstructed size and spatial orientation (central column in Fig. 4) are validated to reproduce the experimental scattering images (right column in Fig. 4) with very high accuracy. Our results match the reconstructed data published in Ref. [5], with the remaining small deviations attributable to the approximations used in the forward scattering approach rather than to the neural network.
We have shown that, using a deep-learning technique based on augmented theoretical scattering data, neural networks enable the accurate and fast reconstruction of wide-angle scattering images of individual icosahedral nanostructures. Our results demonstrate that a network which has only been trained on theoretical data can be employed for the analysis of experimental scattering data, with image processing times on the millisecond time scale.
Motivated by the performance of this method, we anticipate that a generalization to a wide range of particle morphologies will be feasible. Combined with pre-selection algorithms as utilized in Ref. [9], this may evolve into a classification tool for Archimedean bodies. The envisaged combination of identification of arbitrary three-dimensional shapes with short processing times represents the enabling technology for a fully automated analysis of scattering data and real-time reconstruction of ultrafast nanoscale dynamics probed at the next generation of X-ray light sources with high repetition rate, with major implications for a broad range of physical, chemical and biological applications.

T.S., T.F. and S.S. reviewed and edited the paper.

A. Icosahedral Symmetry
The icosahedron is one of the five Platonic solids and is spanned by 20 equilateral triangular faces, intersecting in 30 edges and twelve corners. It possesses three-fold rotation symmetry axes $C_3$ through the center-of-mass of each triangle, two-fold axes $C_2$ through the center of each edge and five-fold axes $C_5$ through each corner, which together form the icosahedral rotation group $I$. The 60 symmetry rotations imply that any rotation of a body with icosahedral symmetry is 60-fold degenerate. Hence, the mapping of three-dimensional rotation representations, such as Euler-angle or axis-angle representations, to icosahedral orientations is not unique, and the parameter range has to be constrained. The fundamental domain of rotations has an exceptionally simple form in the quaternion representation of rotations, where it forms a dodecahedron in imaginary space [24].
Quaternions Q are the four-dimensional extension of the complex numbers with three imaginary units $i$, $j$ and $k$ fulfilling the relations $i^2 = j^2 = k^2 = ijk = -1$ and $ij = -ji$. With real coefficients $q_i$, any quaternion may be written as $q = q_0 + i q_1 + j q_2 + k q_3$. Imaginary quaternions ($q_0 \equiv 0$) are isomorphic to the space $\mathbb{R}^3$, implying that all vectors $\mathbf{a} = (a_1, a_2, a_3)$ can be represented by quaternions as $q_a = i a_1 + j a_2 + k a_3$. The sum of two vectors then translates into the sum of two quaternions, whereas the quaternion product contains both the scalar product of two Cartesian vectors (in its real part) and their cross product (in the imaginary part). The rotation by an angle $\alpha$ of any vector $\mathbf{a}$ about a unit vector $\mathbf{n}$ can thus be expressed by the product of the quaternion $q_a$ with the unit quaternion $q_\mathrm{rot} = \cos(\alpha/2) + \sin(\alpha/2)\,(n_x i + n_y j + n_z k)$ and its conjugate, i.e. $q_{a'} = q_\mathrm{rot}\, q_a\, \bar{q}_\mathrm{rot}$. Hence, any rotation can be projected into the fundamental domain by applying all inverse symmetry rotations and selecting the one yielding the smallest rotation angle. For the training of a neural network, the quaternion representation has the additional advantage of providing a useful metric for the distance between rotations.
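The projection into the fundamental domain can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions, not the code used for the paper: quaternions are stored as arrays `[q0, q1, q2, q3]`, the icosahedron is assumed to have its corners at the cyclic permutations of (0, ±1, ±φ) with φ the golden ratio, and the 60 symmetry rotations are obtained by closing two generators (a five-fold and a three-fold rotation) under multiplication. All function names are hypothetical.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions stored as [q0, q1, q2, q3]."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    ])

def axis_angle(axis, angle):
    """Unit quaternion cos(angle/2) + sin(angle/2) (n_x i + n_y j + n_z k)."""
    n = np.asarray(axis, float)
    n = n / np.linalg.norm(n)
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * n))

def canonical(q):
    """q and -q describe the same rotation; make the first nonzero entry positive."""
    for c in q:
        if abs(c) > 1e-9:
            return q if c > 0 else -q
    return q

def icosahedral_group():
    """All 60 rotations of the icosahedral group I as unit quaternions."""
    phi = (1 + np.sqrt(5)) / 2
    generators = [
        axis_angle([0, 1, phi], 2 * np.pi / 5),  # five-fold axis through a corner
        axis_angle([1, 1, 1], 2 * np.pi / 3),    # three-fold axis through a face
    ]
    group = {}
    frontier = [np.array([1.0, 0.0, 0.0, 0.0])]
    while frontier:
        q = canonical(frontier.pop())
        key = tuple(np.round(q, 6))
        if key in group:
            continue
        group[key] = q
        frontier.extend(qmul(q, g) for g in generators)
    return np.array(list(group.values()))

def to_fundamental_domain(q, group):
    """Among all symmetry-equivalent rotations q*s, pick the smallest rotation angle."""
    candidates = np.array([qmul(q, s) for s in group])
    best = candidates[np.argmax(np.abs(candidates[:, 0]))]
    return best if best[0] >= 0 else -best

def rotation_angle_deg(q):
    """Rotation angle of a unit quaternion in degrees."""
    return np.degrees(2 * np.arccos(np.clip(abs(q[0]), 0.0, 1.0)))
```

Because the group is closed under inversion, multiplying by every group element is equivalent to applying every inverse symmetry rotation. The largest real part corresponds to the smallest rotation angle, so the selected quaternion is the representative closest to the identity.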

B. Dataset Generation
The scattering patterns used for training are created using the MSFT algorithm described in Ref. [5]. In accordance with the experiment described therein, we simulate the scattering of ultra-short XUV pulses with wavelength λ = 13.5 nm and femtosecond duration on nano-sized silver clusters. The material parameters are assumed to be equal to those of bulk silver, with absorption length $a_\mathrm{abs}$ = 12.5 nm. For the calculations, the electron density of the cluster is discretized on a cuboid grid, chosen to contain a depth of 192 pixels. The outgoing scattered field is determined by the phase-coherent summation of the scattered field of each slice, which can be obtained by Fourier transformation. Before transformation, each slice is zero-padded to a width of 512 pixels, thereby increasing the resolution of the scattering pattern. The computed scattered field is then reduced to a logarithmic intensity profile of 128 × 128 pixels with random background noise, which is stored as a grayscale image. The rotation quaternions are sampled uniformly from the fundamental domain, while the cluster sizes range from 30 to 160 nm. With this procedure, a dataset of 25361 images has been generated, one fifth of which has been reserved for validation.
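The slice-wise summation described above can be illustrated with a strongly simplified sketch. It is not the generalized MSFT algorithm of Ref. [5]: it assumes first-order (Born) scattering in each slice, models absorption only through an attenuation factor exp(-d / (2 $a_\mathrm{abs}$)) for the material thickness d traversed before a voxel, ignores refraction, maps the pattern on transverse momentum rather than detector angle, and omits the random background noise. The function name `msft_pattern` and its parameters are hypothetical.

```python
import numpy as np

def msft_pattern(density, dx, wavelength=13.5, a_abs=12.5, pad=512, out=128):
    """Simplified multislice Fourier transform of a voxelized particle.

    density: array (nz, ny, nx), 1 inside the particle and 0 outside,
             with the beam propagating along the first axis.
    dx:      voxel size in nm (wavelength and a_abs are in nm as well).
    Returns a log10 intensity image of shape (out, out), normalized to its peak.
    """
    assert pad % out == 0, "pad must be an integer multiple of out"
    k = 2 * np.pi / wavelength
    q1d = 2 * np.pi * np.fft.fftfreq(pad, d=dx)
    qx, qy = np.meshgrid(q1d, q1d, indexing="xy")
    q_perp2 = qx**2 + qy**2
    propagating = q_perp2 <= k**2            # discard evanescent components
    # Longitudinal momentum transfer on the Ewald sphere (elastic scattering)
    qz = np.sqrt(np.where(propagating, k**2 - q_perp2, 0.0)) - k

    # Material thickness in front of each voxel, used for the field attenuation
    depth = (np.cumsum(density, axis=0) - density) * dx
    attenuated = density * np.exp(-depth / (2 * a_abs))

    field = np.zeros((pad, pad), dtype=complex)
    for j in range(density.shape[0]):
        # fft2 zero-pads each slice to pad x pad before transforming
        slice_ft = np.fft.fft2(attenuated[j], s=(pad, pad))
        field += slice_ft * np.exp(-1j * qz * (j * dx))   # phase-coherent sum

    intensity = np.fft.fftshift(np.where(propagating, np.abs(field)**2, 0.0))
    f = pad // out
    binned = intensity.reshape(out, f, out, f).mean(axis=(1, 3))
    return np.log10(binned / binned.max() + 1e-6)

# Small usage example with a sphere instead of an icosahedron (for brevity)
n, dx = 24, 4.0
c = (np.arange(n) - (n - 1) / 2) * dx
z, y, x = np.meshgrid(c, c, c, indexing="ij")
sphere = (x**2 + y**2 + z**2 <= 40.0**2).astype(float)
image = msft_pattern(sphere, dx, pad=64, out=32)
```

The phase factor that depends on the slice position is what distinguishes this sum from a single two-dimensional Fourier transform of the projected density: it carries the curvature of the Ewald sphere and hence the partial three-dimensional information of wide-angle scattering.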

C. Image Augmentation
Prior to training the neural network, image augmentation is applied to the dataset. The augmentation is performed by applying eleven different filters to each ideal scattering pattern, and randomly adding the newly generated images to the training set. These filters can be divided into five groups: trivial, noise, blur, cropping and successive application. The trivial filter is the identity mapping, leaving the image unchanged. Noise is applied in two ways: uniformly distributed noise with a randomly chosen intensity of up to half the maximum signal, which changes every pixel by a random margin, and salt-and-pepper noise, in which random pixels are set to either the minimal or the maximal signal. Blurring is performed by convolving with a Gaussian kernel with a randomly chosen radius of up to five pixels, and by jitter distortion.
Cropping filters delete different parts of the image, mainly to account for the characteristics of real detectors. Images are center-cropped to mimic limited detector size, a central hole of random radius is deleted to simulate the shadow of a beam dump, images are shifted, or parts of the image are attenuated to simulate uneven detector sensitivity. Finally, we both randomly combine all image effects and apply them in a well-defined order so as to generate images that closely resemble experimental results (see Fig. 2).
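Several of the filters described above can be sketched with plain NumPy. The snippet is illustrative only: the filter parameters (noise fraction, blur widths, shift range, hole radii, crop sizes) and the order used in `augment` are assumptions, since the text does not specify them, and the jitter distortion and sensitivity attenuation filters are left out. Images are assumed to be two-dimensional float arrays with non-negative values.

```python
import numpy as np

def uniform_noise(img, rng, max_fraction=0.5):
    """Add uniform noise with a random amplitude of up to half the maximum signal."""
    amp = rng.uniform(0.0, max_fraction) * img.max()
    return np.clip(img + rng.uniform(-amp, amp, img.shape), 0.0, None)

def salt_and_pepper(img, rng, fraction=0.02):
    """Set a random fraction of pixels to the minimal or maximal signal."""
    out = img.copy()
    mask = rng.random(img.shape) < fraction
    out[mask] = rng.choice([img.min(), img.max()], size=mask.sum())
    return out

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (sigma in pixels, sigma > 0)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()

    def blur1d(v):
        return np.convolve(np.pad(v, radius, mode="edge"), kernel, mode="valid")

    return np.apply_along_axis(blur1d, 1, np.apply_along_axis(blur1d, 0, img))

def central_hole(img, radius):
    """Zero a disc in the image center, mimicking the shadow of a beam dump."""
    ny, nx = img.shape
    y, x = np.ogrid[:ny, :nx]
    out = img.copy()
    out[(y - (ny - 1) / 2) ** 2 + (x - (nx - 1) / 2) ** 2 <= radius**2] = 0.0
    return out

def center_crop(img, keep):
    """Keep a central keep x keep window and zero the rest (finite detector size)."""
    ny, nx = img.shape
    y0, x0 = (ny - keep) // 2, (nx - keep) // 2
    out = np.zeros_like(img)
    out[y0:y0 + keep, x0:x0 + keep] = img[y0:y0 + keep, x0:x0 + keep]
    return out

def shift(img, dy, dx):
    """Translate the image by (dy, dx) pixels, filling vacated pixels with zeros."""
    ny, nx = img.shape
    out = np.zeros_like(img)
    src = img[max(0, -dy):ny - max(0, dy), max(0, -dx):nx - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def augment(img, rng):
    """Apply the filters successively in one fixed (assumed) order."""
    n = min(img.shape)
    out = gaussian_blur(img, sigma=rng.uniform(0.5, 5.0))
    dy, dx = rng.integers(-8, 9, size=2)
    out = shift(out, int(dy), int(dx))
    out = central_hole(out, radius=rng.uniform(2.0, 12.0))
    out = center_crop(out, keep=int(rng.integers(3 * n // 4, n + 1)))
    out = uniform_noise(out, rng)
    return salt_and_pepper(out, rng)
```

Passing an explicit `numpy.random.Generator` to every filter keeps the augmentation reproducible, which helps when the same ideal pattern has to be regenerated with a known combination of distortions.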

D. Network Design and Training
For the regression task of assigning a parameter vector to an image, we utilize the ResNet architecture of a convolutional neural network [29]. We used the 34-layer deep design with tanh activation functions. Upon training, the network parameters were optimized to minimize the mean-squared deviation of the predicted parameters compared to their target values. The training was performed on an Nvidia GTX 1060 consumer graphics card with the Wolfram Language neural network framework and was completed within approximately …

Fig. 4 (caption fragment): … in the laser propagation direction is shown in the middle column. The reconstructed radii are very close to those given in Ref. [5]. The theoretical scattering patterns associated with these reconstructions reproduce the experimental images very well, including low-intensity features (right column).