Image Difference Metrics for High-Resolution Electron Microscopy

Digital image comparison and matching offer many advantages over traditional, subjective human comparison, including speed and reproducibility. Despite the abundance of existing image difference metrics, most are not suited for high-resolution transmission electron microscopy (HRTEM) images. In this work we adopt two image difference metrics not widely used for TEM images. We compare their behaviour under image noise pollution to subjective evaluation and to the mean squared error. Finally, the methods are applied to, and tested on, the task of determining precipitate sizes in a model material.

disadvantages [12]. In HRTEM, several image comparison methods are already in use, mainly for iterative digital image matching. These include various image agreement factors calculated with the cross-correlation factor [13], the fractional mean absolute difference and discrimination factor [14], or the χ² goodness-of-fit test [15]. Usually, when simulated images are compared to experiment with current methods, background noise is removed by applying Bragg filters to the Fourier-transformed images [16]. The noise removal, however, in turn introduces new artifacts [17]. Thus, one of our main criteria for judging the usefulness of an image difference metric for HRTEM images is its robustness against noise. Robustness alone, however, is useless if the difference metric also ignores fundamental changes in the image. An ideal difference metric would ignore changes caused by noise while retaining its sensitivity to small changes in the underlying structure of the image. We acknowledge that this criterion is to a great extent subjective: the concept of a defining structure of an image is ambiguous and largely dependent on the specific case, which is the reason for the multitude of existing image difference metrics.
For HRTEM images, however, the image structure often coincides with the atomic structure of the material. Thus, among other reasons, we have chosen our image metrics based on their ability to recognize position and orientation of periodic shapes.
The first part of this work evaluates three promising image difference metrics by their response and robustness to different types of image noise. We put a strong focus on shot noise, as it is ever present in experimentally acquired images and is even added artificially to simulated images when the goal is direct comparison, while other sources of noise can be reduced by cooling the detector or by using direct detection. In the second part of this work we apply the investigated difference metrics to the model task of automatically detecting precipitate sizes in simulated high-resolution images of Nb3Sn. While mainly serving as a further demonstration of the versatility of the chosen difference metrics, the method can easily be extended to experimental applications where the automatic quantification of precipitates or similar structures is of interest.
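As an illustration of the shot-noise model discussed above, the following is a minimal numpy sketch of imposing Poisson (shot) noise on a noise-free simulated image at a given electron dose; the dose value, the synthetic test image, and the function name are illustrative assumptions, not the exact procedure of this work:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def add_shot_noise(image, dose_per_pixel):
    """Impose Poisson (shot) noise on a normalised, noise-free image.

    The image is interpreted as a relative intensity map; scaling by the
    mean electron dose per pixel gives the expected electron count, from
    which actual counts are Poisson-sampled.
    """
    expected_counts = image * dose_per_pixel
    counts = rng.poisson(expected_counts)
    # Rescale back to the original intensity range for comparison.
    return counts / dose_per_pixel

# Illustrative example: a synthetic periodic "lattice" image in [0, 1].
x = np.linspace(0, 8 * np.pi, 128)
lattice = 0.5 + 0.25 * (np.sin(x)[None, :] + np.sin(x)[:, None])
noisy = add_shot_noise(lattice, dose_per_pixel=100.0)
```

Lower `dose_per_pixel` values produce stronger relative noise, mimicking low-dose acquisition conditions.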

Methods
We use three different methods to gauge the difference of an image compared to a reference image: the structural similarity index measure (SSIM) [18], the scale-invariant feature transform (SIFT) algorithm [19] and the mean squared error (MSE). The SSIM of two image patches x and y is

SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},

where \mu_x is the mean and \sigma_x^2 the variance of a circular image patch around x, \sigma_{xy} is the covariance between the patches, and C_1, C_2 are small constants. For the complex wavelet variant (cw-SSIM), c_x = \{c_{x,i} \,|\, i = 1, ..., N\} and c_y = \{c_{y,i} \,|\, i = 1, ..., N\} are sets of coefficients of the wavelet transform extracted at the same spatial location in the same wavelet subbands from images A and B, respectively, and K is a small, real, positive constant. Finally, we define the mean squared error (MSE) as MSE = \frac{1}{N} \sum_{i=1}^{N} (A_i - B_i)^2 for images with N pixels (Fig. 2).
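As a minimal numpy sketch of two of these metrics (not the windowed implementation used in this work, which averages SSIM over circular local patches; here the SSIM is evaluated over one global window, and the constant values are illustrative):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally shaped images."""
    return float(np.mean((a - b) ** 2))

def ssim_global(a, b, c1=1e-4, c2=9e-4):
    """SSIM evaluated over a single global window; the full metric
    instead averages this quantity over local circular patches."""
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return float((2 * mu_a * mu_b + c1) * (2 * cov + c2)
                 / ((mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)))

rng = np.random.default_rng(1)
reference = rng.random((64, 64))
distorted = reference + 0.1 * rng.standard_normal((64, 64))
```

For identical images the SSIM is 1 and the MSE is 0; any distortion lowers the SSIM and raises the MSE.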
The fragile nature of these materials makes them [...] In order to demonstrate the feasibility of the approach, we take the pristine Nb3Sn crystal from [...]ference background due to noise. In this case the detection algorithm is no longer valid and no reliable particle diameter can be determined (Fig. 4(g)).

Detectability of Precipitates
Fig. 4(g) indicates that, above an electron-dose-dependent particle size, the detected diameter increases linearly with the true diameter. Thus, the detected precipitate diameter is underestimated by an approximately constant value for a given electron dose.
The counting error at lower doses is not primarily a result of an inability of the SIFT image difference metric to distinguish between the different crystal structures, as it would be for the cw-SSIM metric (see Fig. S3). It is instead a consequence of the chosen threshold and, in turn, of the algorithm responsible for counting the unit cells, which we wanted to keep as general as possible. For completeness, we also present the detected precipitate diameters obtained with an electron-dose-dependent threshold function optimised for the specific problem (Fig. 5).
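A hedged sketch of the kind of threshold-based counting step described above (the threshold value, the unit-cell grid, and the diameter estimate are illustrative assumptions, not the exact algorithm of this work): unit cells whose difference value against the pristine reference exceeds a fixed threshold are counted as belonging to the precipitate, and an effective diameter is estimated from their count assuming a circular cross-section.

```python
import numpy as np

def detect_diameter(diff_map, threshold, cell_size):
    """Estimate an effective precipitate diameter from a per-unit-cell
    difference map.

    diff_map  : 2D array, one difference value per unit cell
    threshold : cells with a difference above this count as precipitate
    cell_size : edge length of one unit cell (e.g. in nm)

    Assuming a roughly circular precipitate cross-section, the diameter
    follows from the counted area n * cell_size**2 = pi * (d / 2)**2.
    """
    n = int(np.count_nonzero(diff_map > threshold))
    area = n * cell_size ** 2
    return 2.0 * float(np.sqrt(area / np.pi))

# Illustrative example: a circular region of high difference values
# (radius 10 cells) embedded in a 40 x 40 unit-cell map.
yy, xx = np.mgrid[0:40, 0:40]
diff_map = (((xx - 20) ** 2 + (yy - 20) ** 2) < 100).astype(float)
d = detect_diameter(diff_map, threshold=0.5, cell_size=1.0)
```

For the synthetic map above, the recovered diameter is close to the true 20 cells; with real noise, the choice of threshold dominates the error, as discussed in the text.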

Conclusion
In this work we have examined the feasibility of applying difference metrics to tasks involving HRTEM images. We find that the three chosen [...] In the future we plan to use the investigated image difference metrics on energy-filtered maps with the intent of parameter optimisation for orbital mapping [46], where an automatic image analysis method can replace the need to manually investigate thousands of images.
We acknowledge financial support by the Aus-

SSIM

The luminance comparison function

l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}

is built from the local image signal mean \mu_x = \frac{1}{N} \sum_{i=1}^{N} x_i, where x_i are the points of the patch around x and C_1 is a small constant to avoid instability when \mu_x^2 + \mu_y^2 is close to zero. In the next step the local mean is removed from the signal and subsequently the contrast is compared. The contrast comparison function is

c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} (3)

with the local standard deviation \sigma_x and a small constant C_2. In the next step of the workflow the image signal is normalised by its own standard deviation. Lastly, the structure comparison function is defined as

s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}

with the local correlation coefficient \sigma_{xy} between x and y. In order to achieve more robustness against image noise, the similarity index is extended to the complex wavelet transform domain [2]. The continuous complex wavelet transform of a real signal f(x) is given by

F(s, p) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(x)\, \psi^{*}\!\left(\frac{x - p}{s}\right) dx, (8)

where \psi(x) is a continuous, complex function. We use the complex Morlet wavelet [3] (or Gabor wavelet), defined up to normalisation by \psi(x) = e^{i\omega_0 x}\, e^{-x^2/2}, where \omega_0 is the center frequency of the wavelet.
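A hedged numpy sketch of the three comparison functions above, working directly on precomputed patch statistics (the constant values are illustrative; the standard combination l·c·s with C_3 = C_2/2 collapses to the usual single SSIM formula):

```python
def luminance(mu_x, mu_y, c1):
    """Luminance comparison from the local means."""
    return (2 * mu_x * mu_y + c1) / (mu_x**2 + mu_y**2 + c1)

def contrast(sig_x, sig_y, c2):
    """Contrast comparison from the local standard deviations."""
    return (2 * sig_x * sig_y + c2) / (sig_x**2 + sig_y**2 + c2)

def structure(sig_xy, sig_x, sig_y, c3):
    """Structure comparison from the local covariance."""
    return (sig_xy + c3) / (sig_x * sig_y + c3)

def ssim_from_stats(mu_x, mu_y, sig_x, sig_y, sig_xy, c1, c2):
    """Product l * c * s with c3 = c2 / 2, which algebraically
    collapses to the usual two-factor SSIM formula."""
    return (luminance(mu_x, mu_y, c1)
            * contrast(sig_x, sig_y, c2)
            * structure(sig_xy, sig_x, sig_y, c2 / 2))
```

Evaluating the product per patch and averaging over all patch positions yields the image-wide SSIM.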
where N = 30 is the number of coefficients and K is a small, positive constant. Subsequent averaging over all points leads to a total image similarity measure, exactly as in the real space case.
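A minimal numpy sketch of the coefficient-wise similarity, assuming the standard complex-wavelet form from the cw-SSIM literature (N = 30 coefficients as above; the value of K here is illustrative):

```python
import numpy as np

def cw_ssim_coeffs(cx, cy, K=0.01):
    """Similarity between two sets of complex wavelet coefficients
    extracted at one spatial location (standard cw-SSIM form)."""
    cross = np.abs(np.sum(cx * np.conj(cy)))
    norm = np.sum(np.abs(cx) ** 2) + np.sum(np.abs(cy) ** 2)
    return float((2 * cross + K) / (norm + K))

rng = np.random.default_rng(2)
cx = rng.normal(size=30) + 1j * rng.normal(size=30)
# A global phase shift leaves the magnitude of the cross term
# unchanged, so the similarity stays at 1: this phase insensitivity
# is what makes the cw-SSIM robust against small translations.
cy = cx * np.exp(1j * 0.7)
cz = rng.normal(size=30) + 1j * rng.normal(size=30)
```

Uncorrelated coefficient sets (such as `cz` above) give values well below 1.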

SIFT
The image difference based on the scale-invariant feature transform (SIFT) [4] is calculated according to the workflow diagram in Fig. S1.
The scale space of an input image I(x, y) is built by convolution with Gaussians of increasing width, L(x, y, σ) = G(x, y, σ) * I(x, y), where * is the convolution operation in x and y, and

G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/(2\sigma^2)}

is the Gaussian function. After an octave, i.e. the doubling of σ, the Gaussian image is resampled by a factor of 2. In each octave the difference between images of scales separated by a constant multiplicative factor k is calculated, resulting in the difference-of-Gaussian (DoG) function D(x, y, σ) = L(x, y, kσ) − L(x, y, σ).
Blurring an image with a Gaussian kernel suppresses only high-frequency spatial information.
Thus, the DoG acts like a band-pass filter, attenuating spatial frequencies outside the range between σ and kσ. The pyramid of Gaussians and DoGs is shown schematically in Fig. S2. In this work we used 3 layers of DoG per octave, which is also the value used in [6].
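The construction of one octave can be sketched with numpy only; σ₀ = 1.6 and k = 2^(1/3) below are conventional SIFT choices and illustrative assumptions, not necessarily the parameters used in this work:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur using only numpy ('reflect' padding
    keeps the output the same size as the input)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode='reflect')
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode='valid')
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode='valid')

def dog_octave(img, n_dog=3, sigma0=1.6):
    """One octave of the DoG pyramid: n_dog difference images from
    n_dog + 1 Gaussian scales separated by k = 2**(1 / n_dog), so
    that sigma doubles across the octave."""
    k = 2.0 ** (1.0 / n_dog)
    blurred = [gaussian_blur(img, sigma0 * k**i) for i in range(n_dog + 1)]
    dogs = [hi - lo for lo, hi in zip(blurred[:-1], blurred[1:])]
    # The next octave starts from the doubled-sigma image, resampled by 2.
    next_base = blurred[-1][::2, ::2]
    return dogs, next_base

img = np.random.default_rng(3).random((32, 32))
dogs, next_base = dog_octave(img)
```

Iterating `dog_octave` on each returned `next_base` builds the full pyramid shown in Fig. S2.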