Scanning electron diffraction tomography of strain

Strain engineering is used to obtain desirable materials properties in a range of modern technologies. Direct nanoscale measurement of the three-dimensional strain tensor field within these materials has however been limited by a lack of suitable experimental techniques and data analysis tools. Scanning electron diffraction has emerged as a powerful tool for obtaining two-dimensional maps of strain components perpendicular to the incident electron beam direction. Extension of this method to recover the full three-dimensional strain tensor field has been restricted though by the absence of a formal framework for tensor tomography using such data. Here, we show that it is possible to reconstruct the full non-symmetric strain tensor field as the solution to an ill-posed tensor tomography inverse problem. We then demonstrate the properties of this tomography problem both analytically and computationally, highlighting why incorporating precession to perform scanning precession electron diffraction may be important. We establish a general framework for non-symmetric tensor tomography and demonstrate computationally its applicability for achieving strain tomography with scanning precession electron diffraction data.


Introduction
In this paper, we examine whether it is theoretically possible to recover the threedimensional strain tensor field within a material using scanning electron diffraction (SED) data and tensor tomography methods.
Nanoscale strain is widely used to engineer desirable materials properties, for example, improving field effect transistor (FET) performance [1], opening a bulk bandgap in topological insulator systems [2] and enhancing ferroelectric properties [3]. Strain also arises around crystal defects, which further affect materials properties. The strain tensor field is a rank-2 symmetric tensor field in three-dimensional space and is therefore fully described by 6 components at every 3D coordinate. 3D strain reconstruction of one or more strain components has been achieved using X-ray diffraction techniques, including: coherent Bragg diffractive imaging [4][5][6], micro-Laue diffraction using a differential aperture [7], and diffraction from polycrystalline specimens combined with back-projection methods [8,9]. The spatial resolution of these X-ray techniques is however limited to ca. 20-100 nm and sub-10 nm resolution strain mapping is therefore dominated by (scanning) transmission electron microscopy ((S)TEM) techniques [10]. These techniques include: imaging at atomic resolution [11,12], electron holography [13] and SED [14,15]. Amongst these, 3D strain has been assessed by atomic resolution tomography [16] and in a proof-of-principle reconstruction of a single strain component using SED [17]. In 2D strain mapping, SED has emerged as a particularly versatile and precise approach to strain mapping with few nanometre resolution [18,19].
SED is a 4D-STEM technique [20] based on the acquisition of a 2D transmission electron diffraction pattern at every probe position as a focused electron probe is scanned across the specimen in a 2D scan, as shown in Figure 1. Each 2D electron diffraction pattern comprises intense Bragg disks (see Figure 1) at scattering angles, θ B , related to the spacing of atomic planes, d hkl , by Bragg's law, where, λ, is the radiation wavelength. For high-energy (ca. 60 -300 keV ) incident electrons, the corresponding de Broglie wavelength is ca. 0.0487 -0.0197Å .
The largest d hkl values are typically <3 -5Å for most metals and ceramics and therefore the smallest scattering angles are typically ca. 10 mrad and the corresponding Bragg disks are associated with atomic planes near parallel to the incident electron beam direction. Strain alters the spacing of atomic planes and consequently causes the position of the Bragg disks to change and to be blurred if a varying strain field is sampled along the beam path. Components of strain perpendicular to the incident beam direction may therefore be mapped by tracking the position of Bragg disks as a function of probe position [14].
Strain maps of three path averaged components of the strain tensor in 2D have been reported from a wide range of materials via SED [19,[21][22][23]. Determining the position  of the Bragg disks in each diffraction pattern is a critical step, which has been explored in recent literature with cross-correlation based disk finding approaches achieving the best accuracy and precision [24][25][26][27]. The incorporation of double-conical electron beam rocking [28], to record scanning precession electron diffraction (SPED) data, has further been demonstrated to improve precision both numerically [29] and experimentally [18]. However, progress towards 3D strain mapping using S(P)ED data has so far been limited to a proof-of-principle reconstruction of one strain component in 3D [17] by the lack of a framework for three-dimensional strain tensor field reconstruction using S(P)ED data. In this work, we establish such a framework for three-dimensional strain tensor field reconstruction from S(P)ED data via consideration of an analytical forward model (see . We show that a linearised approximation coincides with a non-symmetric tensor tomography problem and use this to demonstrate recovery of the full strain field. In Section 5 we show that the linearised model for diffraction data coincides with the transverse ray transform on tensor fields with non-symmetric tensors. Tensor tomography is a well-established area of mathematics with many physical applications [30], including (symmetric) strain tomography for polycrystalline materials [31]. Results for tensor fields of non-symmetric tensors are well explored in dimensions greater than three [32,33] and also in three dimensions for a general Riemannian geometry [34]. We combine and simplify these results into a single cohesive argument to clarify properties of the inverse problem in the physical scenario. Finally, we validate our reasoning computationally in Section 6 with a range of complex forward models for scanning electron diffraction. Our analytical and numerical results establish a robust framework for three-dimensional strain tensor field reconstruction using S(P)ED data.

Notation
We define that: • Ψ p : R 2 → C is the probe function representing the incident electron wave with wavelength λ > 0. This is extended to 3D by Ψ p (x, y, z) = Ψ p (x, y), assuming the in free space the wave would not change significantly over the specimen thickness.
• u : R 3 → R and any subscripted derivation denotes the electrostatic potential of the specimen under investigation • x = (x, y, z) gives the standard coordinates on R 3 (direct space) • K = (k, k z ) = (k x , k y , k z ) gives the standard coordinates on R 3 , in Fourier space (reciprocal space), which we treat as R 3 = R 2+1 interchangeably, where k is the 2D coordinate on the detector • D : R 2 → R and subscripted derivatives denote a two-dimensional diffraction pattern recorded on a flat detector 0 else denotes the indicator on the set A • δ p is the Dirac delta centred at p. Formally, this is considered as a tempered distribution on the space of Schwartz functions. Informally, we write it as a function, δ p (x), where it aids clarity

General electron diffraction
Forward models of electron diffraction typically assume coherent elastic scattering of fast incident electrons by the static Coulomb (electrostatic) potential, u, of a specimen in the far-field (Fraunhofer) diffraction limit [35]. This neglects inelastic and mixed elastic/inelastic scattering contributions [36]. Various further assumptions lead to a range of analytical and numerical forward models [37,38]. A key distinction is between kinematical models in which a single electron undergoes at most one scattering event and dynamical models in which multiple scattering events can occur. While the latter is more physically acurate the former captures many important aspects. We proceed here with a kinematical model for analysis and validate our results computationally using both kinematical and dynamical models in Section 6. The kinematical diffraction pattern, D, may be written as where k is the 2D coordinate on the detector, Ψ p u is a pointwise multiplication between the probe function and specimen potential and the incident electron beam is chosen to be travelling parallel to the z-axis. The Fourier transform is only observed on the so called Ewald sphere, on which Bragg's law is satisfied geometrically.

Electron diffraction from a nanocrystal
A crystal is defined as a material that has an essentially sharp diffraction pattern [39]. This means that most of the diffracted intensity is concentrated in Bragg peaks at particular positions. Diffraction patterns recorded from crystals are therefore highly structured and essentially sparse, though in practice diffuse scattering is always observed in addition to the Bragg peaks. This leads to the definition of an ideal defect-free crystal inferred from (1) to be a material with a sparse Fourier transform.
Definition 2.1. The electrostatic potential of an ideal crystal u 0 : for some weightings w i ∈ C and Bragg peaks p i ∈ R 3 .
For conventional crystals the set of Bragg peaks are structured such that there exist some reciprocal lattice vectors a * , b * and c * ∈ R 3 such that The reciprocal lattice vectors link the sparsity of F[u 0 ] and the periodicity of u 0 by the following Lemma.
Lemma 2.2. If u 0 has reciprocal lattice vectors a * , b * and c * then, with V := det(( a * b * c * )), u 0 is periodic in the vectors In other words, there exists a repeating unit v : [0, max(|a|, |b|, |c|)] 3 → R, referred to as the unit cell such that This is a standard result in crystallography and a short proof is provided in Appendix A. The key point is that both direct space and reciprocal space representations of the idealised u 0 are equivalent however have distinct advantages. The direct space parametrisation shows that ideal conventional crystals are built of repeating unit cells which will be the basis of a discretisation in Section 3, however we generally adopt the reciprocal space (Fourier) parametrisation, which is natural for analysing diffraction patterns.
The diffraction pattern from a crystal truncated to a cubic volume can be computed directly using (1) as, and ρ > 0 is the width/depth of the crystal u then This is also a special case of Lemma 3.4 for which a proof is provided in Appendix D. Single crystal diffraction patterns are therefore essentially a sum of spikes centred at γ p i = (p i,x , p i,y ) and with a diffracted intensity proportional to |w i | 2 , which is therefore related to the structure factor in crystallography. The shape of spikes is dictated by Ψ p , encoded in f .

Precession electron diffraction
Precession electron diffraction (PED) patterns are recorded using a dynamic "double conical beam-rocking geometry" [28], as illustrated in Figure 2. The incident electron beam is tilted away from the optic axis (z-axis) of the microscope by a precession angle, α, and rotated about the z-axis above the specimen, with a counter rotation performed below the specimen to integrate diffracted intensities on the detector. Typical values for the precession angle in SPED experiments are ca. α = 0.5-2 • .
Algebraically, PED can be equivalently described by switching to the frame of reference such that the electron beam is stationary and the sample is rotating around the z-axis. We can then write an expression for a PED pattern, D α (k), as, Experimentally, precession is a 'summing' of diffraction patterns although here we write it equivalently as an average or expectation. There is no intrinsic randomness but considering t ∼ Uniform[0, 2π) ensures that the expectation is the desired value. Heuristically, standard diffraction patterns sample the Fourier transform point-wise, which is very sensitive to variations in intensity both due to the intersection of the Ewald sphere with the Fourier space and dynamical scattering, as can be seen in Figure 3a. PED patterns integrate over these intensity variations (Figure 3.b) leading to improved agreement with a simpler and 'more kinematical' model ( Figure 3 parts c and d). This idea will form a key approximation described in Section 4. Overall, Figure 3 shows that the kinematical model has a much faster decay due to the Ewald sphere, however, it is also much less sensitive to oscillations in intensity. After precession, the two models agree well.
standard beam precessing beam diffracted beams diffracted beams pre-specimen deflection coils post-specimen deflection coils 2α Figure 2: Schematic of double conical beam-rocking geometry used to record precession electron diffraction (PED) patterns. The electron beam is rocked around the optic axis above the specimen and counter-rocked below the specimen to record electron diffraction patterns containing Bragg disks integrated over the rocking condition.

Technical assumptions for diffraction imaging
We require five technical assumptions on the experimental setup which are used to justify the images seen in Figure 3. non-overlapping Bragg disks, i.e. ∃r > 0 such that for all i 1 = i 2 : These assumptions can all be met readily in typical SPED experiments by configuring the electron optics while considering the crystal lattice parameters. Assumption 1 and Assumption 2 justify that Figure 3.d is a sum of spots of shape F[Ψ p ] centered at (p i,x , p i,y ) with p i,z = 0. Assumption 3 and Assumption 4 state that the spots are symmetrical and well separated. For the kinematical model, Assumption 4 is equivalent to the symmetry of Ψ p and precession helps to mitigate dynamical effects breaking this condition. Finally, Assumption 5 dictates the final spatial resolution of the strain mapping. The width of the support of Ψ p acts like a point-spread function on the model. Without treating this explicitly, we need to assume that the beam is contained within a single column of voxels in the final strain mapped volume.

Electron diffraction from a strained crystal
There are two standard models of deformed crystals [40], the continuum (deformable ion) model, which we use here for our analysis, and the discrete or atomic (rigid ion) model, which we use in Section 6 for computation. In the continuum model, a deformed crystal is defined as; Definition 3.1. Let u : R 3 → R define an electrostatic potential and R : where R the displacement map and F = ∇ R is the displacement gradient tensor field.
The strain tensor field, ε is the symmetric part of the displacement gradient tensor, which we obtain as ε = 1 2 ( F + F ), while the skew part describes the rotation field. Since the diffraction pattern is sensitive to both strain and rotation we consider general displacement gradient tensor fields, which can then be decomposed into strain and rotation parts following tomographic reconstruction.

Strain in Fourier space
Understanding the Fourier transform of a deformed crystal is the key to understanding the diffraction pattern produced by that deformed crystal within the kinematical model defined in (1) and (3). The following Theorem considers the case of a crystal subject to uniform affine deformation.
where A is invertible, then we can express its Fourier transform as: This is a standard result in Fourier analysis and a proof is given in Appendix B. The appearance of the determinant emphasises that larger potentials diffract more strongly. The only changes to the Fourier transform are a corresponding linear deformation and a change of phase depending on the translation. To translate this result to the ideal crystal u 0 , where the Fourier transform is a distribution rather than a smooth function, we need a further technical lemma.
Due to its importance of this standard result in this work we include a proof in Appendix B. This Lemma demonstrates a key feature for diffraction from deformed single crystals. Under linear deformation, the location of spikes in Fourier space is equivalently linearly deformed and therefore the locations of diffracted peaks are also linearly deformed.

Strained diffraction patterns
To model the diffraction patterns produced by crystals subject to more general deformation we consider a crystal subject to piece-wise affine deformation. This is an implicit assumption that deformed crystals are also piece-wise smooth and so can be well approximated in this framework. We thus assume the material, u, may be expressed as: where: j ∈ [N ] = {1, . . . , N } indexes over all, finitely many, voxels represents the shift to align u 0 with the u in voxel j.
In this work, each voxel is a volume element that is sufficiently large that the discrete atoms blur into a continuum. We note that the results which follow can be extended to more generic tensor fields by limiting the voxel size to zero and replacing the finite sum with a Riemann integral if additional smoothness assumptions are made on u 0 , b j , and A j to guarantee any necessary exchanges of limits.
We can now express the full diffraction pattern of a deformed crystal.
Lemma 3.4. If the probe is narrow (Assumption 5) then The proof of this is in Appendix D. A key point here is that Assumption 5 is used to guarantee that the beam is completely contained within a single column of voxels. Without loss of generality, this column is centred at x = y = 0 which reduces the sum to indices j such that γ β j = 0.
In the limit ρ → ∞, λ → 0 (large voxels, high energy incident electrons) Lemma 3.4 simplifies to lim λ→0, ρ→∞ . This helps to highlight two key properties of diffraction imaging under deformation: • 'in-plane' deformation moves the centre of the spot linearly from • 'out-of-plane' deformation changes the intensity of each spot depending on (A j p i ) z . In the high-energy limit, this is absolute however, in practice the intensity of spot i is dampened by a factor of sinc(ρ(k z (k) − A j p i ) z ) ≤ 1. In either case, dependence of the intensities on the out-of-plane strain A j is highly non-linear.
Lemma 3.4 provides a very explicit model for computing diffraction patterns from deformed crystals yet it is still too complex to directly derive a linear correspondence for the strain mapping inverse problem. To do this we will use precession and also one final technical assumption, that we are in a 'small strain' scenario. Formally, we state this as: Informally, we need to ensure that diffraction patterns of strained crystals still look like blurred diffraction patterns of single crystals. That is to say that the diffraction patterns should still be essentially sharp with isolated, if blurred, Bragg disks.

Linearised model of electron diffraction from deformed crystals
A linear tomography model is developed from the kinematical diffraction model in this section following a physically motivated argument based on precession electron diffraction, which is supported by a parallel mathematically rigorous argument in Appendix D. It is not clear how the necessary assumptions may be justified physically and we therefore provide computational results in Section 6 which demonstrate that our model can be quantitatively accurate.

Strained diffraction patterns with precession
In Section 2.4, precession was motivated as a technique for simplifying the computation of the average deformation. Now, we make this statement more precise by using tools from probability theory. Approximation 1. If the beam energy is large and the strain is small (Assumption 1 and Assumption 6) then for some new weights w i ∈ C.
D α is exactly the simple model one would hope for in strain mapping. Ignoring the squared norm, such a diffraction pattern is the average of each idealised diffraction pattern with an average spot-shape and weighted with an average structure-factor w i independent of the strain parameter A j . The cost for assuming that the raw data D α obeys such a simple model is determined by the error term. With too little precession D α will look close to Figure 3.a with inhomogeneous spot intensities and so the error will be large, however, too much precession introduces a new blurring of the data which also contributes to the error. The important point is that the error should not bias the computations that we go on to make in Approximation 2, in particular it should not bias the centres of spots away from D α .
Algebraically, it is hard to quantify the precise sweet-spot but the proof motivates a rule-of-thumb for the choice of precession angle. It should be just sufficiently large to ensure that for all j and i such that p i,z = 0. This is sufficient to guarantee that for any spot visible in Figure 3 and all small strains (|A j − id | < σ), the precession angle α is large enough such that R t A j p i lies on the Ewald sphere for some t. After a geometrical argument detailed in Appendix E, to analyse the deformation of spots p i with |p i | < P , we suggest the relation In words, the precession angle should be larger than the maximum rotation due to the deformation plus the distance between the flat hyperplane and the Ewald sphere. One final observation from this approximation is that the coordinate (A j p i ) z has disappeared completely, other than in the choice of α above, and the remaining expression only depends on γ A j p i . This indicates that diffraction imaging is insensitive to deformations parallel to the beam direction, which will dictate our choice of tensor tomography model in Section 5.

Linearised diffraction model
The final approximation is to linearise the forward model. D α is now a sufficiently simple, but still non-linear, model to go from a deformed crystal to a diffraction image, however, what we want is a simple linear model to map from a deformed crystal to its average deformation tensor. To do this we propose a simple pre-processing procedure to apply to the raw data D α which corresponds to a linear forward model with respect to the deformation parameters. In particular, we propose computing centres of mass for each of the observed diffraction spots.
where r > 0 is the separation of spots given by Assumption 3.
This theorem directly indicates that centres of mass are a good linear model for deformed diffraction patterns and this is confirmed by the numerical results in Section 6. More generally, this provides a motivation that the centre of deformed spots are equal to average deformation tensors. Other centre detection methods are common in the literature and we also give numerical comparison of the accuracy and robustness of each method.

Non-symmetric Tensor Tomography
Up to this point we have considered how to compute a single average deformation tensor from a single diffraction pattern. The role of this section is to identify this process with an inverse problem capable of reconstructing a 3D strain map from many average deformation tensors and then highlight the relevant properties of this inverse problem.
The fact that strain maps are rank-2 tensor fields immediately puts us in the domain of tensor tomography and the physical characteristics of diffraction imaging seen in Approximation 1 highlights the transverse ray transform (TRT) as the natural parallel.
In particular, Approximation 1 shows that precessed diffraction patterns are insensitive to out-of-plane strain. This aligns with equivalent reasoning in [31] for polycrystalline materials where it was shown that this corresponds to the TRT. The only difference for polycrystalline materials is that the forward model is insensitive to the skew component of deformations and so we will extend the analysis of the TRT to account for general tensor fields of non-symmetric tensors.

Notation and definitions
We define that: We follow the definitions of tensor tomography operators given in [30], defining the Longitudinal Ray Transform (LRT), I :

Electron Diffraction and the Transverse Ray Transform
The first step is to generalise the notation from Section 2 to consider diffraction patterns where the electron beam is not parallel to the z-axis or through the point x = y = 0. This is mainly an algebraic exercise, first to upgrade from average spot centres to average strain tensors and then to realise the generalisation.
(i) Tensors from vectors: E γ β j =0 γ A j γ can be computed from the values of E γ β j =0 γ A j p i for any two non-colinear points, say p 1 , p 2 , such that p i,z = 0.
(ii) Generalising notation: Proof. Let square brackets temporarily denote the horizontal concatenation of vectors into matrices. Note that p i are non-co-linear and so this final matrix is invertible. Thus, we have which verifies the first part. The final part is a simple algebraic argument: From this we see This Lemma is where Assumption 2 becomes relevant. If the crystal is of finite thickness then all p i are always present in the diffraction pattern (not just when p i,z = 0) although the intensity may be small. In practice, all visible spots will lie on this hyperplane which is justified by Assumption 2.
The final step to aligning with the TRT is to replace discrete sums with line integrals. We compute This is now in the classical TRT format as in (10).

Invertibility of tensor ray transforms
The results that we show in this section are necessary for the analysis of the strain tomography inverse problem but are not novel in the field of tensor tomography. Instead, the intention is to bring together all of the relevant results to clearly explain the relatively special case of three dimensions. In dimension strictly greater than three the TRT is invertible and its properties are already well explored [32,33]. In a general three dimensional Riemannian manifold the analysis is given by [34] however this is much more complicated than needed in the Euclidean case. Our argument closely follows that of [32,Section 4], however we also retain the symmetric component of the tensor field. We note that we have formulated these ray transforms in a Schwartz space setting in order to make use of existing results regarding invertibility of both the LRT and the TRT in the next section. Recent work has extended the stability results of the LRT to Sobolev spaces on compact domains [41] and the same techniques appear valid for the TRT but this has not yet been made rigorous.
The invertibility of the LRT and the TRT for symmetric rank 2 tensor fields is established in [30,Chapter 2] and [42, Theorem 1] respectively. We state the key results: In particular, In other words, vector fields f and g can never be completely distinguished experimentally whereas any two symmetric tensor fields, F and G, can be distinguished from experimental data from three well chosen tilt-series.
We can now state the invertibility of the TRT for non-symmetric rank 2 tensor fields.
The proof of Theorem 5.3 relies on the following decomposition lemma.
Lemma 5.4. The TRT of a general tensor field can be decomposed as: is simply the pointwise extension of this equality.
Confirming f ∈ S is also clear as f Hence, to prove part (ii), it suffices to show: The proof of part (iii) is also by direct evaluation, fix a ∈ R 3 . We claim Π ξ [a] × Π ξ = a · ξ [ξ] × . First note the trivial case, when ξ = e 3 = ( 0 0 1 ) To generalise this, recall the property of cross products (Ra) × (Rb) = R(a × b) for all rotation matrices R and vectors b. Because of this, we know If we choose R such that ξ = Re 3 , it follows as required. Finally, we can extend this equality pointwise to f Proof of Theorem 5.3. As J is a linear map, it suffices to prove the theorem in the case G = 0. Applying the decomposition established in Lemma 5.4: Hence, by Theorem 5.2 we know that f , and thus the skew component of F , is never uniquely determined but the symmetric component can be recovered through the equality of Lemma 5.4(ii).
This decomposition, Lemma 5.4, is very powerful for understanding the tensor tomography problem from an analytical stand-point. Theorem 5.3 lifts invertibility results from [30,42] to demonstrates when (or how much) the non-symmetric TRT is invertible. On the other hand, results from [42] also compute the (pseudo) inverses of the LRT and symmetric TRT which could be lifted to a filtered-back projection-like pseudo-inverse of the non-symmetric TRT. This in turn allows us to characterise the sensitivity to noise, the inverse problem of non-symmetric tomography is only mildly ill-posed as the singular values decay at only a polynomial rate.

Physical setting of the transverse ray transform
There are two key differences before applying the standard theory. Firstly, the scaling constant switching between average to integral was ignored in (12), however, practically this represents a necessary re-scaling by the thickness of the specimen to go from the raw data (average deformation) to the linear model (integral of deformation). This also appears in the analysis as a violation of the small strain assumption, Assumption 6, because the vacuum outside of the crystal has a deformation of order one. This scaling allows us to account for the violation by incorporating outside knowledge of the specimen. Experimentally, a high-angle annular dark-field (HAADF) STEM image can be recorded at each specimen orientation to record the object thickness.
Secondly, we draw attention to the different acquisition geometries expected from the theoretical and physical settings. In Theorem 5.3 the TRT requires three (well chosen) continuous tilt-series to be suitably invertible, however in Lemma 5.1 we can only compute the signal if two non-colinear spots are visible in the diffraction pattern. This places a strict constraint on the choice of ξ and, in particular, the set of physically feasible beam orientations is a discrete set. Because of this, datasets in this application will always be in a limited data scenario.

Computational Validation
We perform a computational analysis to validate the approximation of the forward model (9) used to relate electron diffraction to the TRT and to confirm that the TRT inverse problem, which is under-determined and has a non-trivial null-space, can be solved accurately using a realistic amount of data. These tasks are separated for reasons of efficiency by first rigorously testing the forward model to provide a worse case estimate for the error level and then simulate error at and above this level for numerical tomographic reconstructions.

Forward model validation
Physical validation requires quantifying the level of error in (9) knowing the exact deformation and comparing against the computed deformation determined based on measuring the centres of the Bragg diffraction disks. This will be assessed in three ways. First we use the MULTEM [43] package as a dynamical simulator, in particular modelling multiple scattering effects. We suggest this provides a worst-case analysis as the simulation model is more accurate than the analytical model used, and also we use thick crystals where the extra complexity should be most apparent. Second we compare against a kinematical simulation, the analytical model of this study, and observe that the accuracy is equivalent to that of the dynamical simulation. The previous two comparisons use crystals formed as in (4) which ignores strain on the boundaries of each block. The final comparison uses crystals with dislocations where the deformation is continuously defined to verify robustness to the specific definition of strain.
6.1.1. Piece-wise affine deformation phantom Phantom crystals are defined with piecewise linear deformations as in (4) but using the rigid-ion model of deformation, i.e. where the array of atoms is deformed but atoms remain spherical and the same size. To do this, we create a three parameter collection of phantoms. Definition 6.1. Given an initial crystal structure, randomly sampled phantoms are built up atomistically using three parameters: (i) L ∈ N is the number of layers. Phantoms consist of layers stacked orthogonal to the z-axis. Each layer is under constant affine transformation and is (approximately) the same thickness.
(ii) d ∈ {1, 2, 3} is the rank of the displacement gradient tensor. d = 1 corresponds to a simple isotropic scaling of the initial crystal structure. d = 2 also allows for rotation and shearing within the layer. d = 3 allows any generic 3 × 3 tensor.
(iii) σ ≥ 0 is the average magnitude of atomic displacement. In particular, over each randomly generated displacement gradient tensor, the average of the spectral norm of the perturbation from identity will be equal to σ. Note that this average is not enforced in each simulated phantom.

Continuous deformation phantom
When defined atomistically, the deformation gradient tensor requires a choice of interpolation to be related our discussion in Section 3. The piece-wise affine deformation phantoms test this in a discrete sense where deformations were defined to be piece-wise constant with discontinuities at the interfaces between layers. To test accuracy under continuous deformation we define use the continuum deformation associated with a dislocation with Burger's vector b ∝ (1, 1, 1) and line vector u ∝ (1, −1, 0) [44]. In the notation of Definition 3.1 where (r, φ, Z) are the cylindrical coordinates of x on the basis (u × b, u × b × u, u). A dislocation can be modeled naively by removing all atoms along the half-plane defined by φ = π from a crystal and displacing the atoms according to Equation 17. This is visualised in Figure 4 and 100 diffraction patterns were simulated from this phantom with beam parallel to the z-axis and α = 2.

Electron diffraction simulations
Electron diffraction patters were simulated using a wavelength λ = 0.02Å (energy of ca. 300 keV) and an average deformation magnitude of σ = 0.01(= 1%) assuming an incident electron probe function of the form, where J 1 is a Bessel function of the first kind. This probe function corresponds to the probe produced using a circular disk aperture in ideal probe forming optics and the choice of r corresponds to an aperture of 2 mrad.  Kinematical simulations are computed using (3) while dynamical simulations were performed using the multislice approach as implemented in the MULTEM package [43]. The number of points chosen to simulate precession was chosen sufficiently large such that the errors reported in Table 1 had converged, details are given in Section S.2. Precession angles in the range 0-2 • were used.
6.1.4. Results: Linearised model accuracy Computations presented in this section were performed to determine the accuracy with which Bragg disk centres can be measured and related to the average deformation tensor. We also considered how this accuracy may be affected by both the choice of precession angle and the choice of disk centre detection algorithm. The theory suggested in (9), that the centre of mass is an accurate measure of average deformation and we compare this with disk-detection by cross-correlation, which is common in strain mapping literature, to evaluate which method is best at linearising the forward model.
Precession angle is an experimental parameter which (8) suggests should be chosen It is more common in practice to use α < 1 • , however in this case we wish to quantify any expected advantage of using larger values. The following simulations were performed: • 72 MULTEM simulated diffraction patterns with one random phantom for each combination of α ∈ {0 • , 0.  Figure 4). The full phantom objects were 1000Å thick.
• 1620 kinematical simulated diffraction patterns with 30 random phantoms for each combination of α ∈ {0. • 100 kinematical simulated diffraction patterns of the dislocation phantom with α = 2 • at different beam locations. The full crystal was 250Å thick.
In particular, we used a disk-detection method involving patch-wise (least squares) registration between each Bragg disk in the strained diffraction pattern and the corresponding Bragg disk in an unstrained diffraction pattern, the exact form of this is in (18). A sketch of this pipeline is provided in Figure 5.
For each precessed diffraction pattern we compute centres for each spot in the inner-most ring on the pattern. Predicted and computed centres are only ever compared like-for-like relative to centres computed with the same algorithm using an un-deformed sample as reference.
In the notation of (9), we define the true centres of each spot as in the case of discrete piece-wise affine deformation, for the continuous deformation map R computes an average deformation where T is the known thickness of the phantom.
where γ p i is computed from an un-deformed diffraction pattern, D 0 α . We fit a zeromean Gaussian to the TRT modelling errors and so the important error measure is the Euclidean distance between detected and expected centres, i.e. the error variance. We report the values of error = 100 · |c true − c| |c true | (19) which is scaled to percentage error. This is convenient because with σ = 1%, a naive detection algorithm of c = γ p i (i.e. zero strain) corresponds to an average error of 1.   below this are super-resolved.
• Increasing the precession angle in this range reduces the errors for all models and centre detection methods.
• Comparing centre detection algorithms, both have comparable maximum accuracy yet the registration method appears much more robust to changes in the simulation mode and phantom. It also converges much faster with respect to precession angle with little gain between 1 • and 2 • .
• Errors for the continuously deformed phantom are noticeably worse than with the piece-wise constant phantoms. An example disc is shown in Figure S.1 and we see it is qualitatively very different from those shown previously in Figure 4. The more smoothly varying deformation causes an elliptical blurring of the disc which may make it harder to consistently detect the centre.
• The high-energy model achieved optimal accuracy without precession. This suggests the dominant benefit of precession in these examples is smoothing the non-linearities of the model rather than accounting for rotation out-of-plane.
In a worst case, with 2 • of precession, we observe errors between peak-finding and the TRT model approximation of 0.12%, corresponding to a signal-to-noise ratio (SNR) over 8. On the other hand, the registration peak finding algorithm consistently achieves an average accuracy of 0.04% corresponding to SNR= 25.

Tomographic Reconstruction Validation
Section 5 considers the analytical properties of the continuous inverse problem we wish to solve but in practice we have corrupted and limited data. Here, we perform a reconstruction from a realistic dataset and analyse the accuracy both quantitatively and qualitatively. For this purpose we generate a phantom, F † : [−1, 1] 3 → R 3×3 and simulate data using the model where η is a 0-mean isotropic white noise tensor field.
The choice of noise level and phantom were chosen to be slightly more challenging than suggested by the results of Section 6.1.4. In particular, a phantom is chosen with L = 3 layers, d = 3 dimensional deformation, and an average deformation magnitude of σ = 2%. We perform reconstructions with two levels of noise. Firstly at 0.1%, in line with Table 1, and then at 1% (SNR = 2) to account for any noise and physical modelling errors not considered previously. The final experimental choice is to specify a scan geometry which respects practical time and hardware constraints. 42 tilt directions, shown in Figure 6, were selected and for each tilt we scan over a 50 × 50 grid of beam positions. Directions are chosen down zone axes to guarantee at least two noncolinear Bragg discs in the diffraction patterns, as required by Lemma 5.1. For any rotation R ∈ R 3×3 of the sample, the corresponding direction is the third column of R, x y z . This direction is then plotted with a stereographic projection at point The plot in Figure 6.b assumes that the crystal is initially perfectly aligned with the z-axis. If this is not true then an offset must be computed and passed on the computation of Euler angles for the specimen holder.
For a reconstruction method we choose to perform a standard Total Variation reconstruction [45][46][47]: where β is a manual tuning parameter. With perfect data there is little need for regularisation (β = 0) however, in this example the Total Variation functional is compensating for: • Measurement noise, representing modelling errors at either at 0.1% or 1% magnitude • Limited angular range, within 70 • of the initial orientation × ) for all φ ∈ C 1 0 ) as discussed in Section 5. While each of these factors has a different physical or analytical origin, numerically they are all incorporated into a choice of β > 0. In both reconstructions the parameter was coarsely tuned to β = 10 −4 2 . Many other choices of variational methods exist, for instance those compared in [46], which each account for noise and 'fill in' missing data in their own characteristic fashion. Total Variation is commonly chosen because it promotes sparse jumps in the reconstruction [46,48]. This specifically reflects the structure of the phantom in this Section but is also often accurate for other physical samples. Figure 7 visually compares reconstructions from low/high noise data against the original phantom. We see that the general symmetry of the phantom is preserved,   reconstructions are uniform in the x-and y-axes and partition into clear slices along the z-axis. The errors in the reconstruction from low noise are imperceptible but at higher noise levels we see the layer interfaces have a slight blur. Figure 8 allows us to quantify these errors more precisely. The cross section at low noise shows that the structure of the deformation is well recovered, however, in the flat regions a small consistent error is made in the deformation tensor. Errors at high noise have the same structure but larger magnitude, approximately a factor of 6.5 larger error for a factor of 10 larger Low noise High noise noise. The interfaces are still well identified however the jump is more visibly blurred. In all cases, approximately 3 pixels away from a discontinuity errors improve rapidly, by up to a factor of 10. More detailed error comparisons are given in Figure S.3 show that the 99 th percentile is a representative reference for the general structure of errors.

Interpretation of errors
The three main sources of error are from noise (or modelling error), acquisition geometry, and the null space in the skew component. Figure 8 allows us to compare the impact of each of these factors. Errors from the noise should distribute uniformly over the reconstruction. This error should be constant in uniformly strained regions (near z = 0 and z = ±50) and is visibly much lower than the error at interfaces. Comparing between low and high noise, we also see that errors scale approximately linearly with the noise level.
Tomography with a limited angular range, called limited angle tomography, is common in electron microscopy [46,49,50]. From this literature, we know that all changes in the deformation in directions near-orthogonal (at angles greater than 70 • ) to the z-axis are all missing from the observed data. Uncertainty over where jumps occur lead to blurred edges which are seen as the spreading of errors in the z direction. The jump from crystal to vacuum provides a worst-case scenario and, as previously commented, this blurring is approximately 3-4 pixels in radius. This radius is consistent between different noise levels and the full/symmetric strain components.
Comparing the second and third columns of Figure 8, it is clear that the error on the symmetric components is no smaller than the full error. This indicates that the contribution of error from the null space of the skew component is negligible.
Returning to the question of symmetric strain reconstruction, there are two conventions for the definition of strain, either In words, symmetric strain is either the arithmetic mean or the geometric mean. In both cases they are easily computed from the full displacement gradient tensor F recon . The analysis of Section 5 nicely aligns with the first definition, which is the default in this work, but the error analysis above shows that this does not bias the quality of the reconstruction. Both definitions of symmetric strain are reconstructed with equivalent levels of accuracy in this example. The total accuracy is dictated much more by the acquisition geometry than noise (representing modelling error) or missing data in the skew component.

Discussion and future steps
In this work we have proposed a tomographic model for the 3D strain mapping problem, analysed and extended the known analytical properties of the resulting inverse problem, and provided numerical results for both modelling and inversion steps. Table 1 shows that our forward model is accurate up to an SNR of 20 with respect to the TRT model. Beam precession is key to this accuracy and disk registration based methods are more stable to this uncertainty relative to centres of mass. We note that the 0.08% found here agrees exactly with a comparable quantity of [29] despite very different deformations considered in each study. In Figure 8 we have shown that reconstructions can be performed with realistic experimental parameters and achieve accurate results, even with errors much larger than predicted. Because of this, it is our belief that there is no benefit to quantifying errors beyond this SNR of 20 within the scope of this work, namely the validity of the dynamical model tested in Section 6. Realistic reconstructions can be recovered well at this level of error, although this does not take into account many factors such as detector performance, electron optical aberrations, and inelastic scattering. These factors could potentially be the dominant causes of reconstruction error in practice and should be minimised experimentally and assessed further.
Our proposed framework requires diffraction patterns to be recorded near zone axes where diffraction patterns have straight lines of spots. It would be much faster experimentally to acquire many diffraction patterns with single arced lines of spots, called 'off-axis' diffraction patterns, however this would take us away from the TRT model. On the theoretical side, there have been recent advances in histogram tomography which could assist the well-posedness of strain tomography [51]. In particular, in this study we compute the centre of mass (first moment) of each spot in a diffraction pattern and the resulting inverse problem is the TRT which always has a large null-space. If we also extracted second order moments from each spot then the model is no longer the TRT and the extra data may remove the issue of nonunique solutions. Finally, there is an interesting conflict in the desired scan geometry due to the physical model and the TRT. As commented in Section 5.4, theoretical results of the TRT rely on three orthogonal tilt series whereas we only have access to data at a discrete set of orientations which, dependent on the crystal structure, arise (approximately) uniformly over the sphere. It would be interesting to unify these two pressures and analyse characteristics of the tomography problem in such a constrained geometry.
In other words, there exists a repeating unit v : [0, max(|a|, |b|, |c|)] 3 → R such that Proof. By direct computation, Therefore u 0 (x + a) = u 0 (x) for all x. By symmetry, the same holds for b and c. Define Note that a, b and c span R 3 therefore for any x there exist x j ∈ R such that x = x 1 a + x 2 b + x 3 c. We can conclude as required.
Appendix B. Theorem 3.2 and Lemma 3.3 Theorem (Theorem 3.2). If where A is invertible then we can express its Fourier transform as: Proof.

Proof. From Theorem 3.2 we know
Thus, to complete the lemma, it suffices to show This is verified by an arbitrary test function,

Appendix C. Probability background
We recap some concepts and technical results from probability theory which are needed in the proofs in Appendix D.
• X : t → X t ∈ C can be called a random variable where random (complex) values of X can be sampled by sampling indices t ∈ [t 0 , t 1 ] uniformly at random, i.e. t ∼ Uniform[t 0 , t 1 ).
• For a random variable, X, we define its expectation to be If t is a discrete index then integral can be replaced with summation and, for t 0 , t 1 ∈ Z, this becomes In quantum physics this would also be called the expectation value, E X = X .
• For a random variable X and value x ∈ C we say denote the probability that X t = x to be with the appropriate +1 in the denominator if t is discrete.
• Let X and Y be random variables. For x, y ∈ C, we say the events X t = x and Y t = y are independent if probability(X t = x and Y t = y) = probability(X t = x) · probability(Y t = y) • We say that two random variables, X and Y , are independent if events X t = x and Y t = y are independent for all x, y Lemma Appendix C.2. A standard result from probability theory: If X and Y are two independent random variables then Also, for any continuous function, ϕ, the random variable ϕ(X) i,j,t = ϕ(X i,j,t ) is also independent of Y .
Lemma Appendix C.3. Let ϕ : R n → R be a twice differentiable function and let η : [0, 1] → R n a random variable then Proof. Because the length of the path (size of domain of η) is one, we can replace expectation with integration and then replace ϕ with its standard Taylor expansion: (Taylor expansion about E η)

Appendix D. Algebraic justification of Section 4
In Section 4 we give a physical motivation for why centre of mass calculations should accurately predict average strain values. In this section we provide a formal mathematical link from the precessed Ewald sphere model to this same goal. Some of the statistical assumptions made here are highly technical and it would be hard to justify them from a purely physical perspective. However, as in the main text, the core justification which we rely upon is the simulation study of the whole pipeline.
In the remainder of this section, we first sketch the proof at a high level listing a sequence of results and then the details of the longer proofs appear at the end of the section.
To recap, the starting point of this argument is the physical model which we assume to be exact. In particular, The first step is to spatially localise the strain which generates the signal in a diffraction pattern. A key part of tensor tomography is that signal only depends on the strain contained in blocks (or voxels) which directly intersect the electron beam. This is formalised in the following result which also expands the notation to take advantage of the particular structure of u.
As can be seen, this expression is still highly non-linear in terms of the strain parameter A j and indeed too non-linear for us to develop a linear model directly. To make this possible we introduce the beam precession technique. The heuristic is that if we modify our data at acquisition then we can analyse it as if it were generated by a simpler model. There is of course a trade-off here, a small amount precession will usefully smooth the problem but too much will begin to blur out the desired structures. The following two lemmas make this statement precise and their combination is exactly the statement of Approximation 1.
Proof. This is just the definition of variance, This claim is the most physically ambiguous. It suggests that coherent precession is a good approximation of incoherent precession. The justification of this is that it only needs to be true when looking at the whole pipeline. In particular, the variance term may be large in magnitude but so long as it does not strongly bias the locations of the centres of the diffracted peaks then it will not affect the overall approximation.
, and b j are independent random variables over (j, t) and the high-energy limit is valid (Assumption 1) then for some new weights w i ∈ C.
While the assumptions of this lemma appear highly technical they are also not unreasonable from a physical standpoint. b j and A j represent translations and strains respectively. These are physically distinct quantities and so b j should also be statistically independent from the first two terms. Looking closer at these terms, A j R t γ is the first two columns of A j R t and A j R t 0 0 1 is the third. If α is small then this is just a statement that the out-of-plane strain is independent to the in-plane strain.
Finally, we need to justify that the centre of mass is an accurate predictor of average strain. The difficulty here again is the squared modulus, if the exit waves were imaged directly then this step would be direct but the square has the potential to bias towards points of maximum intensity. The assumptions necessary at this stage are some form of symmetry on the strain field, the technical statement is as follows.
Lemma Appendix D.3. Suppose the conditions of Approximation 1 hold and the diffracted spots are symmetric and non-overlapping (Assumption 3 and Assumption 4). If the random variables A j 1 + A j 2 and A j 1 − A j 2 are independent over random pairs of indices {(j 1 , j 2 ) s.t. γ β j 1 = γ β j 2 = 0} then for each i: whenever the denominator is not 0 and where r > 0 is the separation of spots given by Assumption 3.
Theorem 2 is the combination of the previous three lemmas where any violations of the exact conditions becomes absorbed into the error term. We now provide the proofs of these lemmas.
Proof of Theorem 3.4. By Theorem 3.2 we have which can be split into its (x, y) and z components. Combining this again with Theorem 3.2 we can expand To simplify this, observe that for all smooth functions, φ, ψ: Thus we derive Finally, if the beam is smaller than the width of a single block (from Assumption 5, 2r < ρ) then only one column of blocks directly on the beam path contribute to the diffraction signal. Without loss of generality, this is the set of blocks j such that β j,x = β j,y = 0, equivalently γ β j = 0. With this simplification, we can expand Substituting this above gives as required.
Proof. This is a purely algebraic proof: The theorem is concluded by gathering the new constants into w i . Proof. The proof is direct:

Lemma
Proof of Lemma Appendix D.3. To compute centres of mass, it shall be convenient to define some new functions and abbreviations. We define the functions: kf (k − 1 / 2 c)f (k + 1 / 2 c)dk and the points q i = γ p i , q i,j = γ A j p i to remove excessive use of the γ projection. Also note by Lemma Appendix D.5 that F 1 (c) = 0 for all c. We now compute the relevant integrals starting with the formula of (7): Using Assumption 3, we know that the only i for which the summand is non-zero on this integral domain is i = i. Also using Assumption 3, we know that f (k − q i,j ) = 0 outside of the restricted domain and so we drop this constraint to simplify notation.
Next, we expand the brackets noting that f is a real valued function: The coordinate translation k → k − 1 2 (q i,j + q i,J ) simplifies this to a special case of F 0 : f (k − 1 / 2 (q i,j − q i,J ))f (k + 1 / 2 (q i,j − q i,J ))dk Similarly, Finally, as A j + A J is independent of A j − A J we can translate this to q i,j /q i,J and again apply Lemma Appendix C.2: Ej,J F 0 (q i,j − q i,J ) Figure E1 sketches the choice of precession angle for a deformed sample. There are two triangles of interest, one in the positive z direction which accounts for the curvature of the sphere and another in the negative direction to account for strain. The upper triangle is a right angled triangle with one corner at the origin and the other at a point where the sphere of radius P meets the Ewald sphere, say at point (k, k z (k)). This gives the relation ship

Appendix E. Precession angle estimation
i.e. k z (k) = λP 2 4π . On the other hand, the strain moves the point (P, 0) a maximal distance of σP from its starting point and away from the Ewald sphere. Assuming the worst strain is a rotation, we get an isosceles triangle whose angle can be computed with the cosine rule: Combining, the maximal angle is α = cos −1 1 − σ 2 2 + sin −1 λP 4π .
Ewald sphere P curvature strain P λP 2 4π P σP Examples of strained discs for the layered phantoms are given in Figure 5 however the discs for the continuously deformed phantom looked qualitatively different. An example is given in Figure

S.2. Precession discretisation
For the dynamical simulations, multiple simulations were run with different numbers of points discretising precession and the corresponding values in Table 1 are plotted in Figure S.2. Dynamical results stated in Table 1 are with 2 8 precession points. Because of the number of kinematical simulations, the full plots were not computed. Instead, as α = 2 • was the slowest to converge for dynamical, the number of precession points was increased until the values stated in Table 1 for kinematical simulation at α = 2 • had converged within the necessary two decimal place rounding error. Kinematical results stated in Table 1 are with 2 5 precession points.

S.3. Reconstruction error plots
In Section 6.2 a reconstructed gradient deformation tensor is computed and the 99 th percentile errors are reported. Figure S.3 demonstrates the spread of errors in more detail. The first column represents the middle slice (a best-case analysis) which shows that errors can be up to a factor of four smaller than in the worst case (maximum shown in third column). The 99 th percentile gives an average worst-case. Qualitatively, this agrees much more with the middle slice indicating that the largest errors are achieved on the interface voxels between crystal and vacuum and on the smooth interior errors are much lower, up to a factor of three.  Table 1 for dynamical simulation for different numbers of precession points. The solid and dashed lines shows the convergence of centre of mass accuracy and registration method respectively.