Turbulence-immune computational ghost imaging based on a multi-scale generative adversarial network

There is a consensus that turbulence-free images cannot be obtained by conventional computational ghost imaging (CGI) because CGI is only a classical simulation, which does not satisfy the conditions of turbulence-free imaging. In this article, we report the first turbulence-immune CGI method based on a multi-scale generative adversarial network (MsGAN). The conventional CGI framework is unchanged; instead, the conventional coincidence measurement algorithm of CGI is optimized by an MsGAN. A satisfactory turbulence-free ghost image can then be reconstructed by training the network, and the visual quality is significantly improved.

Ghost imaging (GI) can be traced to the pioneering work of Shih et al. [1], who exploited biphotons generated via spontaneous parametric down-conversion to realize the first entanglement-based ghost image, following the original proposals by Klyshko [2]. The framework of a quantum entangled light source requires the initial GI scheme to use two light paths [1]. One beam, the reference beam, never illuminates the object and is measured directly by a detector with spatial resolution. The other beam, the object beam, illuminates the object and is then measured by a bucket detector with no spatial resolution. By correlating the photocurrents from the two detectors, one retrieves the "ghost" image. During the debate over the physical essence of GI [3], pseudothermal ghost imaging [4,5] and true thermal ghost imaging [6] were successively demonstrated, marking great progress for GI. However, conventional GI is not well suited to practical applications because the two-path structure limits the flexibility of the optical system. Fortunately, Shapiro proposed a single-path GI scheme: computational ghost imaging (CGI) [7,8]. Because it has only one optical path, its imaging framework is closer to classical optical imaging.
Atmospheric turbulence is a serious problem for satellite remote sensing and for classical aircraft-to-ground imaging. Surprisingly, Meyers et al. found in 2011 that a turbulence-free image could be obtained by conventional dual-path GI [19-21]. This unique and practical property is an important milestone for optical imaging, because refractive-index fluctuations introduced anywhere in the optical path do not affect the image quality. Shih et al. revealed that the turbulence-free effect is due to two-photon interference [21,22]. Li et al. summarized the necessary conditions for turbulence-free imaging [23]. However, CGI is a classical simulation of GI, so it does not satisfy these conditions [22]. At present, this conclusion is generally accepted, and realizing turbulence-free CGI remains a challenging task. Fortunately, generative adversarial networks provide a promising solution [24,25].
In this article, we demonstrate that turbulence-immune CGI can be realized by a multi-scale generative adversarial network (MsGAN). In this scheme, the basic framework of conventional CGI is not changed; instead, we optimize the coincidence measurement algorithm of CGI with an MsGAN. In an atmospheric turbulence environment, satisfactory ghost images can be reconstructed by training the network. In the following, we illustrate this method theoretically and experimentally.
We depict the scheme in Fig. 1.
A quasimonochromatic laser illuminates an object T(ρ); the reflected light carrying the object's information is received and modulated by a spatial light modulator (SLM). A photomultiplier tube (PMT) collects the light intensity E_di(ρ, t) and outputs the bucket signal S_i. Correspondingly, the calculated light intensity I_i(ρ) is obtained in software, and the ghost image is retrieved by the coincidence measurement

G(ρ) = ⟨S_i I_i(ρ)⟩ − ⟨S_i⟩⟨I_i(ρ)⟩,

where ⟨·⟩ is an ensemble average. The subscript i = 1, 2, ..., n denotes the ith measurement, and n is the total number of measurements.

The flow chart of the MsGAN is shown in Fig. The multi-scale attention feature extraction units are composed of multi-branch convolution layers and an attention layer (Fig. 4). The multi-branch convolution layers consist of juxtaposed dilated convolutions of different sizes, corresponding to receptive fields of different sizes, to extract various features. Three branches with receptive fields of 3 × 3, 5 × 5, and 7 × 7 simultaneously extract features of the input images. After feature graphs of different scales are obtained, the cascaded feature graphs are readjusted to the input size through a convolution operation. During feature extraction, an attention mechanism is introduced that assigns different attention to each channel feature, distinguishing the low-frequency parts (smooth or flat areas) from the high-frequency parts (such as lines, edges, and textures) of images so that the network attends to and learns the key content of the image. First, the global context of each channel is used to compress its spatial information by global average pooling:

Z_c = (1/(H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} X_c(i, j),

where X_c is the aggregated convolution feature graph of size H × W × C, and Z_c is the compressed global pooling layer of size 1 × 1 × C.
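As a minimal numerical sketch of the coincidence measurement described above (not the authors' code), the object, pattern count, and pixel grid below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, w = 4000, 16, 16        # number of measurements and pattern size (illustrative)

# Binary object T(rho): a bright square on a dark background.
T = np.zeros((h, w))
T[5:11, 5:11] = 1.0

# Calculated reference intensities I_i(rho) known in software,
# and the corresponding bucket signals S_i from the PMT.
I = rng.random((n, h, w))
S = (I * T).sum(axis=(1, 2))  # bucket detector: total reflected intensity

# Coincidence measurement: G(rho) = <S_i I_i(rho)> - <S_i><I_i(rho)>
G = (S[:, None, None] * I).mean(axis=0) - S.mean() * I.mean(axis=0)

# The fluctuation correlation should reproduce the object.
corr = np.corrcoef(G.ravel(), T.ravel())[0, 1]
```

With a few thousand random patterns the correlation between G and T is already high; increasing n suppresses the statistical noise of the ensemble average.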
ReLU and sigmoid activation functions implement the gating principle to learn the nonlinear synergy and mutual exclusion between channels, and the attention mechanism can be expressed as

r_c = σ(W_2 δ(W_1 Z_c)),  X̃_c = r_c · X_c,

where δ and σ are the ReLU and sigmoid activation functions, respectively, r_c is the excitation weight, and X̃_c is the feature graph after adjustment by the attention mechanism. The global pooling layer Z_c passes through a down-sampling convolutional layer and the ReLU activation function, and the channel number is recovered through an up-sampling convolutional layer. Finally, the result is activated by the sigmoid function to obtain the channel excitation weight r_c. Each channel of the aggregated convolution layer X_c is multiplied by its weight to obtain the output X̃_c of the adaptively adjusted channel attention.
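The squeeze-and-excitation style gating described above can be sketched in a few lines of NumPy; the feature-map sizes and the channel reduction ratio are our assumptions, and the random matrices W1, W2 stand in for the learned down- and up-sampling layers:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(X, W1, W2):
    """Channel attention on an H x W x C feature map X.
    W1 (C x C/r) squeezes the channels, W2 (C/r x C) restores them."""
    z = X.mean(axis=(0, 1))              # global average pooling: 1 x 1 x C squeeze
    r = sigmoid(relu(z @ W1) @ W2)       # excitation weights r_c in (0, 1)
    return X * r                         # reweight each channel by its attention

rng = np.random.default_rng(0)
H, W, C, reduction = 8, 8, 16, 4         # illustrative sizes
X = rng.standard_normal((H, W, C))
W1 = rng.standard_normal((C, C // reduction)) * 0.1
W2 = rng.standard_normal((C // reduction, C)) * 0.1
out = channel_attention(X, W1, W2)
```

Because each weight r_c lies in (0, 1), the gate can only attenuate channels, which is how informative (high-frequency) channels are emphasized relative to the rest.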
The up-sampling part of the U-Net architecture includes up-sampling layers and a multi-level feature dynamic fusion unit. In the up-sampling part of the generator network, the feature graphs at different levels contain different instance information. A multi-level feature fusion unit is proposed to enhance information transfer among feature maps at different levels. In addition, we propose a dynamic fusion network structure to resolve feature conflicts between levels. The method assigns different weights to the spatial positions of the feature maps, screening valid features and filtering out contradictory information through learning. First, the feature maps of different scales are adjusted to the same size by up-sampling, and spatial weights are set for the feature maps of different levels during fusion to find the optimal fusion strategy. This can be expressed as

F* = Σ_i ω_i · F_{i↑},

where ω_i is the weight corresponding to the feature graph of the ith level, F_{i↑} is the ith feature graph adjusted to a uniform size after up-sampling, and F* is the final feature graph output by dynamic fusion of all levels through adaptive weight allocation. Finally, generator G introduces a skip connection directly from the input to the output, which forces the model to focus on learning residuals.
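A minimal sketch of this dynamic fusion, assuming nearest-neighbour up-sampling and softmax-normalized per-position weights (the paper does not specify either choice; the level sizes are illustrative):

```python
import numpy as np

def softmax(w):
    """Normalize logits across levels so the weights sum to 1 per position."""
    e = np.exp(w - w.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def upsample_nearest(F, size):
    """Nearest-neighbour up-sampling of a square feature map to size x size."""
    rep = size // F.shape[0]
    return np.repeat(np.repeat(F, rep, axis=0), rep, axis=1)

def dynamic_fusion(features, logits, size):
    """F* = sum_i w_i * F_i_up, with a learned weight map per level."""
    stacked = np.stack([upsample_nearest(F, size) for F in features])
    w = softmax(logits)                  # L x size x size adaptive weights
    return (w * stacked).sum(axis=0)

rng = np.random.default_rng(0)
feats = [rng.standard_normal((s, s)) for s in (8, 16, 32)]  # three levels
logits = rng.standard_normal((3, 32, 32))                   # one weight map per level
fused = dynamic_fusion(feats, logits, 32)
```

In the actual network the logits would themselves be produced by a small convolutional branch and trained end to end; here they are random for illustration.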
We input the feature images generated by generator G and the real images into discriminator D for discrimination. Discriminator D adopts the PatchGAN structure and consists of four convolution layers with 4 × 4 convolution kernels.
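A sketch of such a four-layer PatchGAN discriminator in PyTorch; the channel widths, strides, and input size are our assumptions, since the text only specifies four convolution layers with 4 × 4 kernels:

```python
import torch
import torch.nn as nn

# PatchGAN: the final feature map scores overlapping image patches
# rather than emitting one real/fake scalar for the whole image.
patch_d = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=4, stride=2, padding=1),    # 64x64 -> 32x32
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
    nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 16x16 -> 8x8
    nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, kernel_size=4, stride=1, padding=1),   # 8x8 -> 7x7 patch scores
)

scores = patch_d(torch.randn(1, 1, 64, 64))  # one logit per image patch
```

Scoring patches instead of whole images pushes the generator to get local texture right, which complements the pixel-wise content loss.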
The adversarial loss is

L_adv = E[log D(I_gt)] + E[log(1 − D(I_gen))],

where I_gt denotes the real images and I_gen denotes the generated images. For the content loss of image reconstruction, the mean square error between the generated and target images is used to obtain a higher peak signal-to-noise ratio (PSNR):

L_mse = (1/(W H)) Σ_{x=1}^{W} Σ_{y=1}^{H} (I_gt(x, y) − I_gen(x, y))².

Simultaneously, to eliminate artifacts and restore the high-frequency details of images so that the reconstructed images have high visual fidelity, a visual loss is introduced:

L_vis = ||φ(I_gt) − φ(I_gen)||²_2,

where φ(·) denotes the feature maps of a pre-trained network. Thus, the total loss function is defined as

L = α L_adv + β L_mse + γ L_vis.

The experimental setup is schematically shown in Fig. 1. A monochromatic laser (30 mW, Changchun New Industries Optoelectronics Technology Co., Ltd., MGL-III-532) with wavelength λ = 532 nm illuminates a 50 : 50 beam splitter. The reflected light illuminates an object. The light reflected by the object passes through the beam splitter, and the transmitted light is modulated by a two-dimensional amplitude-only ferroelectric liquid crystal spatial light modulator (Meadowlark Optics A512-450-850) with 512 × 512 addressable 15 µm × 15 µm pixels. Finally, a photomultiplier tube (PMT) collects the modulated light and outputs photocurrent signals.
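For concreteness, the three loss terms can be sketched in NumPy. The feature extractor phi below (a simple horizontal-gradient map) is a stand-in for the pre-trained perceptual network, and the assignment of α, β, γ to the three terms follows the order in which they appear in the text; both are our assumptions:

```python
import numpy as np

def mse_loss(I_gt, I_gen):
    """Pixel-wise content loss."""
    return np.mean((I_gt - I_gen) ** 2)

def adversarial_loss(d_real, d_fake):
    """Binary cross-entropy GAN loss; d_real, d_fake are scores in (0, 1)."""
    eps = 1e-12
    return -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))

def visual_loss(I_gt, I_gen, phi):
    """Distance between feature maps of a fixed feature extractor phi."""
    return np.mean((phi(I_gt) - phi(I_gen)) ** 2)

# Stand-in "feature extractor": horizontal gradients emphasise edges.
phi = lambda img: np.diff(img, axis=1)

def total_loss(I_gt, I_gen, d_real, d_fake, alpha=0.5, beta=0.01, gamma=0.01):
    return (alpha * adversarial_loss(d_real, d_fake)
            + beta * mse_loss(I_gt, I_gen)
            + gamma * visual_loss(I_gt, I_gen, phi))

rng = np.random.default_rng(0)
I_gt = rng.random((32, 32))
loss_same = total_loss(I_gt, I_gt, 0.9, 0.1)             # perfect reconstruction
loss_diff = total_loss(I_gt, rng.random((32, 32)), 0.9, 0.1)
```

With identical images the content and visual terms vanish and only the adversarial term remains, so the total loss is strictly smaller than for a mismatched reconstruction.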
Correspondingly, the reference signal is obtained through software. Atmospheric turbulence is introduced by heating elements from an electric furnace placed in the optical path. Turbulence of different intensities is obtained by adjusting the temperature.
In data processing, the hyperparameters of the loss function are set to α = 0.5, β = 0.01, γ = 0.01. Adam is used for parameter optimization during training, and the batch size is set to 1. For the data set, 173 and 35 real object images are collected as the training set and test set, respectively. The effect of turbulence on a classical image is easily observed in Figs. 5A(b-d), where three classical images were taken by a conventional CCD camera. The experimental results show that classical images are significantly blurred by turbulence; moreover, with increasing turbulence intensity, the image distortion becomes more obvious. Figs. 5B(f-h) show the experimental results of our method. To the human eye, the reconstructed computational ghost images have better image quality than the turbulence-blurred images and are similar to the images without turbulence. To quantitatively evaluate the effect of this method, we use the structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) to measure image quality. Figs. 5C and 5D show that the SSIM and PSNR of the reconstructed images are significantly better than those of the blurred classical images. Admittedly, the SSIM and PSNR show that there is still a certain gap between the images obtained by this method and completely turbulence-free images. However, in terms of human visual perception, the difference is acceptable. Consequently, this method is called turbulence-immune CGI. To further verify the effectiveness of this method, we chose other Rubik's cubes as objects. The experimental results (Fig. 6) are almost identical to those above.
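The two metrics used above can be computed directly from their definitions; the sketch below uses a single-window SSIM over the whole image rather than the usual sliding-window variant, and the test images are synthetic:

```python
import numpy as np

def psnr(I_ref, I_test, max_val=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((I_ref - I_test) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=1.0):
    """SSIM evaluated over the whole image as one window."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
noisy = np.clip(img + 0.05 * rng.standard_normal((64, 64)), 0, 1)       # mild "turbulence"
very_noisy = np.clip(img + 0.3 * rng.standard_normal((64, 64)), 0, 1)   # strong "turbulence"
```

Both metrics decrease monotonically as the distortion grows, which is the behaviour reported for increasing turbulence intensity.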
In summary, the first turbulence-immune computational ghost imaging experiment was demonstrated in this article. Although the CGI framework does not satisfy the conditions of turbulence-free imaging, satisfactory ghost images can be reconstructed using the MsGAN method in an atmospheric turbulence environment. We hope that this method provides a promising solution for overcoming atmospheric turbulence in applications of CGI and single-pixel imaging.

FIG. 3. Network structure of the generator G. 1 represents the pre-trained convolution module; 2 represents a multi-scale attention feature extraction unit; 3 represents an up-sampling layer and 4 represents a multi-level feature dynamic fusion unit.

FIG. 5. (A) The blurred classical images caused by atmospheric turbulence; (B) computational ghost images reconstructed by the MsGAN method; (C) and (D) the SSIM and PSNR values of the images under different intensities of atmospheric turbulence. (a) Classical image without atmospheric turbulence; (b), (c), and (d) blurred images caused by different intensities of atmospheric turbulence.