HTRSD: Hybrid Taylor Rayleigh-Sommerfeld diffraction

: Computing wave propagation is of the utmost importance in computational optics, especially three-dimensional optical imaging and computer-generated hologram. The angular spectrum method, based on fast Fourier transforms, is one of the efficient approaches; however, it induces sampling issues. We report a Hybrid Taylor Rayleigh-Sommerfeld diffraction (HTRSD) that achieves more accurate and faster wave propagation than the widely used angular spectrum method.


Introduction
As Joseph W. Goodman says, "The phenomenon known as diffraction plays a role of utmost importance in the branches of physics and engineering that deal with wave propagation [1]." The wave equation describes the propagation of light through most media, and leads us to regard optical imaging as a specific mapping of object light into image light distributions. Advanced computational techniques along with the developed digital devices have promoted the contemporary computational imaging [2], which demands precise and high-speed numerical wave propagation that can work over a wide range of, distances, especially in situations that require numerous wave propagations, or where the target fields consist of a large amount of data, such as three-dimensional (3D) imaging, which requires volume diffraction [3,4] or multiple-slice beam propagation methods [5][6][7][8], computer-generated holograms (CGH) [9][10][11] for 3D [12] AR/VR [13] display and imaging with inverse problem solving [4,7,14].
Fast-Fourier-transform (FFT) convolution approach is widely used to implement the diffraction formulas for its speed. However, it induces sampling issues that affect the numerical accuracy of computing diffraction [15,16]. A lot of efforts have been made to achieve accurate propagation in a wide range of distances [17], such as intermediate plane techniques [18], band-limit angular spectrum method (ASM) to avoid aliasing [19], padding to expand the band-width [17,20,21], adaptive-sampling to utilize the space-bandwidth product while reducing the computational complexity [22], non-uniform FFT to achieve fast calculating [22,23]. Most methods focus on sampling issues [24][25][26].
This paper improves the Rayleigh-Sommerfeld diffraction computation without concentrating on the sampling issue. Instead, by Taylor expanding the exponential term in the diffraction convolution kernel, the angular spectrum diffraction formula can be approximated such that a similar numerical accuracy is attainable with fewer bits formats, e.g., single-precision floatingpoint data type. This cuts off the required computational memory by half compared to the usual double-precision floating-point data types, and significantly reducing run-time consequentially.

Theory
We consider free-space propagation of a fully coherent monochromatic scalar wave field, in the spatial domain. Let λ be the wavelength of the scalar field, and k = 2π/λ be the wavenumber. In a 3D Cartesian coordinate system (x, y, z), we consider wave propagation along the z-axis, and the transverse 2D plane r ⊥ = (x, y) is of our interest. | · | denotes vector lengths in ℓ 2 . We are interested in how an unbounded initial field u(r ⊥ , 0) evolves into a new field u(r ⊥ , z) [1], after a propagation distance of z: where ⊗ denotes convolution, and h(r ⊥ , z) is the point-spread-function (PSF) for distance z. The convolution theorem implies computing Eq. (1) using Fourier transforms [15]: where F /F −1 are forward/inverse Fourier transforms, is the Fourier dual of r ⊥ , and H (ρ ⊥ , z) is the coherent transfer function (CTF). The former equation is named as FFT-based direct integration (FFT-DI) and the latter is ASM. In this paper, we consider the widely used free-space Rayleigh-Sommerfeld diffraction (RSD), since in contrast to Fresnel and Fraunhofer diffraction, the RSD is valid in non-paraxial conditions. According to the Rayleigh-Sommerfeld diffraction formula [1,27], the point spread function h(r ⊥ , z) writes as where r = |r| and r = (x, y, z). In Fourier space, the transfer function [24] based on the Weyl expansion is [28]: The performance of the two approaches is different due to the sampling of the FFT, while the band-limited ASM [19] is widely used for its relative accuracy, which will be used as a benchmark baseline in this paper. Except for the sampling issues in the numerical diffraction implementation, we found that accuracy depends on floating-point precision. Because the visible light's wavelength is hundreds of nano-meter and the sensor pixel pitch is at the micro-meter scale, the terms √︁ 1 − λ 2 |ρ ⊥ | 2 in Eq. (4) and exp(jkr) in Eq. (3) cannot be accurately represented by a single-precision floating point number, i.e., the conventional numerical diffraction requires double-precision representations. This limitation is demonstrated in Fig. 1. This limitation presents an obstacle in particular for implementations on graphics processing units (GPUs), which have both limited memory resources, and are significantly faster at processing single vs. double precision numbers. Single precision arithmetic is commonly used in deep learning frameworks [29] and physics-based neural network training [3], but naïvely using single precision for diffraction simulations leads to inaccurate results due to numerical errors in the phase terms.
In the following, we show the proposed Hybrid Taylor Rayleigh-Sommerfeld diffraction (HTRSD) method. With a proper Taylor expansion and hybrid FFT-DI and ASM, we can achieve precise numerical wave propagation with a single-precision float number. We also show the time and memory efficiency. For convenience, we name ASM-single and ASM-double as the conventional band-limited ASM [19] method work in single-and double-precision float format, respectively.
For the PSF in Eq. (3), direct sampling may lead to severe numerical rounding errors caused by exp(jkr), in a general setting when k ≫ 1/z, we can approximate r using Taylor series, that is   ) and Eq. (4) using different computational approaches. The Naïve way computes the kernels by brute force computing the square root, resulting in large numerical errors. In contrast, Taylor expanding the square root leads to smaller numerical errors, and further converging to the ground truth with more Taylor terms, here demonstrated by the first two terms. valid when |r ⊥ |<z: Equation (3) thus can be rewritten as: Assuming |r ⊥ | ≪ z, Eq. (6) reduces to the Fresnel diffraction formula. Similarly, for the CTF in Eq. (4), √︁ 1 − λ 2 |ρ ⊥ | 2 causes errors in the single-precision float number case. This term can be approximated by the Taylor series, assuming λ|ρ ⊥ | ≪ 1, which is a plausible setting in most situations: The CTF for λ|ρ ⊥ |<1 is thus re-written as: Figure 1 shows the improvements when using the proposed Taylor approximations for computing the PSF and CTF. As mentioned in previous paragraph, the double-precision float number requirement leads to significant time and memory consumption, thus limiting the numerical diffraction's applications.
Here we show that with the Taylor expansion, we can eliminate the double-precision float number requirement in the Rayleigh-Sommerfeld diffraction. Figure 1 demonstrates the limitations when computing the PSF and CTF naïvely and with the Taylor expansion in single-precision float number. Figure 1(a) shows the real parts of exp(j(kr − kz)) with respect to |r ⊥ |/z, and Fig. 1(b) shows the real parts of exp(jkz( √︂ 1 − λ 2 |ρ ⊥ | 2 − 1)) with respect to λ|ρ ⊥ |. The difference (red lines) between the naïve and the Taylor expansion show that numerical precision could decrease when using single floating points for computing the PSF and CTF via Eq. (3) and Eq. (4). A naïve approach suffers from severe numerical rounding errors (first rows). In contrast, using Taylor series and by approximating the phase terms, converges to the ground truths computed using double precision accuracy (second rows). Approximation accuracy further increases when adding more terms (third rows). As for the IEEE floating point standard, for single precision the smallest spacing is 1.1921 × 10 −7 and for double precision it is up to 2.2204 × 10 −16 [30]. The accuracy could not beyond this.
Since the ASM is not suitable for long distance propagation [26,31], combine the ASM and FFT-DI with the conditions of the Taylor expansion, the finial diffraction formula is: where z c is the critical diffraction distance for transfer function and impulse response methods, defined as z c = N × ∆x 2 /λ [26], N is the sampling number of the complex field in the object plane, ∆x is the sampling pixel pitch. It is worth noting that compared to ASM that requires two FFTs, three FFTs do not increase significant additional time cost if F (h e (r ⊥ , z)) or F (h (r ⊥ , z)) are pre-calculated in advance.

Numerical verification
Following the band-limited ASM, in the implementation, we pad the original field with zeros for two times and crop the field to the original size after propagation, as shown in Fig. 2. We implemented the code with PyTorch 1.10 [29]. For generalization, we set the input field as a volume of size N x × N y × N z . The CTF or PSF are the same size as the input field. This can be easily applied to both 2D and 3D wave propagation by setting N z = 1 or N z >1. We set the Taylor expansion order as n = 10. The following numerical experiments were conducted on a workstation with two Intel Xeon E5-2690 2.60GHz CPUs.  The proposed method is verified on circular and square apertures with analytical expressions. For a circular aperture of radius a, at different propagation distance z, under normal uniform plane-wave illumination [1], the on-axis scalar fields on the optical axis can be expressed exactly [32,33]: Here we set a = 10λ. Figure 3 shows the magnitude and phase profile with respect to the diffraction distance z. The plot is over a range of 10 −1 ≤ z/λ ≤ 10 4 , which corresponds to a Fresnel Number ranges from 0.01 to 1000, make sure that the diffraction calculation range covers the full wave equation, Fresnel diffraction (near field), and Fraunhofer diffraction (far field). From the pink and olive zoom-in parts, we can observe that both ASM-single and ASM-double deviate from the analytical result in the far-field region. In contrast, the proposed method is closer to the analytical one. This outperforms the conventional one. The on-axis field can not reflect the performance of the whole field propagation since we usually detect the 2D fields perpendicular to the optical axis in practice. Here we perform the comparison of circular and rectangle apertures. In the following, we set the pixel size of the original field as N x = N y = 255, the width and length are L x = L y = 0.5 m, a = 0.125, and λ = 500 nm (same parameters as in Chapter 5 of Ref. [31]). We measured the diffracted fields at z = {1000, 2000, 4000, 20000}m respectively, which correspond to Fresnel number N f = a 2 λz = {31.25, 15.625, 7.813, 1.562}. This setting ensures that the measured diffraction images are within the Fresnel region. Figure 4 shows the comparison of circular aperture diffraction. From both the 2D images in Fig. 4(a) and the plots of the crossed center line in Fig. 4(b), we can observe that ASM-single method doesn't work at all, and on the contrary, ASM-double and the HTRSD-single work well. The diffraction images of a rectangle aperture in Fig. 5 shows the similar results as in the circular aperture diffraction case. In this case, we also provide the plot of the RS calculation with numerical integral (RSI) method in Fig. 5(b). The proposed HTRSD method shows a comparable results to both the RSI and ASM with double precision. From both the tests on circular and rectangle apertures, we can say that the proposed HTRSD-single can achieve comparable results as the conventional ASM. The previous paragraphs show that the proposed HTRSD can achieve even better accuracy in a single-precision float number format than the conventional band-limited ASM method with a double-precision float number. Here we show the time and memory cost comparison. To compare the time cost, we propagate a 2D complex field of pixel size 1024 × 1024 for various times with the ASM in double float precision and the proposed method in single float precision. cost is linearly proportional to the propagation number in both cases, while the conventional method costs 1.5 times over the proposed method. This is not surprising since both GPU and CPU can compute higher number operations if numbers have less precision. Notably, calculation of the Fourier transform of the PSF in HTRSD-single was included in the comparison. Even so, the speed of HTRSD-single is still much faster than the conventional ASM in double precision. For the memory costs, the target object is a volume with the same lateral size as in the speed test but with varying depth numbers. Similar to the time cost, the memory cost is also linearly proportional to the target volume size, while the conventional method costs double memory to the proposed method. This is because the double float is a 64-bit point number, and the single float is 32-bit, so a single float number uses half of the memory as the same size double float number uses. It should be noted that the HTRSD might perform more accurately if double float precision is used, however this may come at the expense of speed and memory usage.

Conclusion
We present a numerical diffraction technique, HTRSD, that can achieve more accurate Rayleigh-Sommerfeld diffraction at a much higher speed but consume dramatically less computational resources as well as memory. This could promote the development of contemporary computational optics, especially computational imaging that includes optimization-based inverse imaging and extensive volume diffraction calculation in both optical imaging and displays. This work could also help computational imaging techniques that employ GPUs due to the power of GPU computing on single-precision floating points [34].
Funding. King Abdullah University of Science and Technology.
Disclosures. The authors declare no conflicts of interest.
Data availability. Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.