Instant Ghost Imaging: Algorithm and On-chip Implementation

Abstract: Ghost imaging (GI) is an imaging technique that uses the correlation between two light beams to reconstruct the image of an object. Conventional GI algorithms require large memory space to store the measured data and perform complicated offline calculations, limiting practical applications of GI. Here we develop an instant ghost imaging (IGI) technique with a differential algorithm and implement a high-speed on-chip IGI hardware system. This algorithm uses the differential signal between consecutive temporal measurements to reduce the memory requirements without degradation of image quality compared with conventional GI algorithms. The on-chip IGI system can reconstruct the image immediately once the measurement finishes; there is no need to rely on post-processing or offline reconstruction. This system can be developed into a realtime imaging system. These features make IGI a faster, cheaper, and more compact alternative to a conventional GI system and make it viable for practical applications of GI.


Introduction
Ghost imaging (GI) is an imaging technology that reconstructs the image of an object by calculating the correlation between two beams (test and reference). The test beam interacts with the object and is collected by a bucket detector without spatial resolution, whereas the reference beam is detected by a spatially resolving detector without passing through the object. It has been demonstrated that correlations of both quantum-entangled [1] and thermal light sources [2][3][4][5] can be used to achieve GI. The image can be formed without a lens (lensless ghost imaging) [6][7][8] or by using only a single-pixel detector (computational ghost imaging, CGI) [9][10][11]. Owing to its underlying physics and potential applications in many fields, including lidar [12], tomography [13], and medical imaging [14][15][16], GI has attracted much attention in recent years [17][18][19][20][21][22][23]. It has also been extended to different domains with certain degrees of freedom of correlation, including the atomic domain [24,25], the time domain [26][27][28], and spiral imaging [29,30].
A significant obstacle to practical applications of GI is that reconstructing an image requires a very large number of temporal measurements, which demands a large memory and entails high space complexity. This limitation stems from conventional GI algorithms. For example, the background subtraction algorithm requires calculating the second-order correlation function [3,4,10,16]
$$G(x) = \langle S\,I(x)\rangle - \langle S\rangle\langle I(x)\rangle, \tag{1}$$
where $\langle\cdot\rangle = \frac{1}{N}\sum_{n=1}^{N}(\cdot)$ denotes the ensemble average over $N$ measured signals, $I(x)$ is the intensity at position $x$ of the reference beam, and $S$ is the bucket signal of the test beam. This calculation is time-consuming and is performed offline in post-processing, which is incompatible with online or realtime computation. There have been many attempts to improve the imaging quality of GI, such as differential ghost imaging (DGI) [31,32], iterative ghost imaging [33,34], and higher-order ghost imaging [35,36]. However, few works have attempted to reduce the memory required or the space complexity in order to implement online GI. Compressive sensing [37,38], a convex optimization procedure, reduces the required number of acquisitions for GI while increasing the computational resources needed [39][40][41]. To date, on-chip GI has not been perfected because of its high space complexity.
To address this issue, we propose a new algorithm, instant ghost imaging (IGI). Its novelty is that it uses the differential signals between two consecutive temporal measurements, the $(n+1)$th and the $n$th, in the test and reference beams, $S_{n+1} - S_n$ and $I_{n+1}(x) - I_n(x)$, to reconstruct the image of the object [42][43][44]. This differential signal is distinct from that used in DGI [31]. To demonstrate the validity and hardware feasibility of the IGI algorithm, we developed a prototype on-chip hardware system using a single field-programmable gate array (FPGA), without any external memory, that can process 500 measurements per second online. We also created two variants of the IGI algorithm, which use the differential signal of either the test beam or the reference beam. IGI offers the following advantages:
• IGI can drastically reduce memory requirements and space complexity without increasing computation.
• IGI does not reduce image quality compared to the background subtraction algorithm.
• IGI is a generalized GI algorithm that can be used for lensless ghost imaging and CGI.
• The on-chip IGI hardware system measures the signal and reconstructs the image online: the image is formed immediately once the temporal measurement finishes, without relying on post-processing or offline reconstruction.
• The structure of the on-chip IGI system is compact and much smaller than the computers needed to calculate the correlation function in conventional GI procedures.
Moreover, the IGI hardware system could be developed into a realtime imaging system with a frame rate of more than 24 frames per second. These features make IGI a faster, cheaper, and more compact alternative to a conventional GI system and make it viable for practical applications of GI.

The IGI algorithm
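Experimentally, equation (1) of the background subtraction algorithm is calculated from $N$ measurements as
$$G(x) = \frac{1}{N}\sum_{n=1}^{N} S_n I_n(x) - \frac{1}{N}\sum_{n=1}^{N} S_n \cdot \frac{1}{N}\sum_{n=1}^{N} I_n(x), \tag{2}$$
where the bucket signal $S$ of the test beam is given by $S = \int I(x_t)\,T(x_t)\,\mathrm{d}x_t$, $I(x_t)$ is the intensity of the test beam, and $T(x_t)$ is the transmissivity function of the object.
The IGI algorithm we propose differs from equation (2) in using $(N+1)$ measurements:
$$G^{\mathrm{IGI}}(x) = \frac{1}{2N}\sum_{n=1}^{N} (S_{n+1} - S_n)\,[I_{n+1}(x) - I_n(x)], \tag{3}$$
where $S_{n+1} - S_n$ and $I_{n+1}(x) - I_n(x)$ are the temporal differential signals between two consecutive measurements of the bucket detector and the reference detector.
We can demonstrate that equation (3) of the IGI algorithm is equivalent to equation (2) of the background subtraction algorithm when $N$ is sufficiently large. Expanding the product in equation (3) gives four terms:
$$G^{\mathrm{IGI}}(x) = \frac{1}{2}\left[\frac{1}{N}\sum_{n=1}^{N} S_{n+1} I_{n+1}(x) + \frac{1}{N}\sum_{n=1}^{N} S_n I_n(x) - \frac{1}{N}\sum_{n=1}^{N} S_{n+1} I_n(x) - \frac{1}{N}\sum_{n=1}^{N} S_n I_{n+1}(x)\right]. \tag{4}$$
When $N$ is large, it can be assumed that
$$\frac{1}{N}\sum_{n=1}^{N} S_{n+1} I_{n+1}(x) \approx \frac{1}{N}\sum_{n=1}^{N} S_n I_n(x) = \langle S\,I(x)\rangle. \tag{5}$$
Note that two successive thermal light measurements are independent of each other. Using the statistical law that $\langle A \cdot B\rangle = \langle A\rangle \cdot \langle B\rangle$ when $A$ and $B$ are independent random variables, the last two terms of equation (4) take the form
$$\frac{1}{N}\sum_{n=1}^{N} S_{n+1} I_n(x) \approx \frac{1}{N}\sum_{n=1}^{N} S_n I_{n+1}(x) \approx \langle S\rangle\langle I(x)\rangle. \tag{6}$$
According to equations (5) and (6), equation (4), i.e. equation (3), is equal to equation (2) when $N$ is large:
$$G^{\mathrm{IGI}}(x) \approx \langle S\,I(x)\rangle - \langle S\rangle\langle I(x)\rangle = G(x). \tag{7}$$
This requirement on $N$ is easy to satisfy because the number of measurements in GI is usually of the order of tens of thousands.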
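As a numerical sanity check of this equivalence, the following minimal sketch compares the two estimators on synthetic data. It is not the authors' code: the speckle model (independent, exponentially distributed pixel intensities that are identical in both beams), the 8-pixel object `transmission`, and all function names are assumptions made for illustration. Under this model both estimates should approach $T(x)$ times the intensity variance, which is 1 here.

```python
import random

def simulate(num_frames, transmission, seed=0):
    """Return (buckets, frames) for synthetic pseudo-thermal light.

    Assumption: each reference pixel is an independent exponential
    (speckle-like) intensity, and the test beam sees the same pattern.
    """
    rng = random.Random(seed)
    buckets, frames = [], []
    for _ in range(num_frames):
        frame = [rng.expovariate(1.0) for _ in transmission]
        frames.append(frame)
        # Bucket signal: S = sum over x of I(x) * T(x)
        buckets.append(sum(i * t for i, t in zip(frame, transmission)))
    return buckets, frames

def background_subtraction(buckets, frames):
    """<S I(x)> - <S><I(x)>, computed from all stored frames."""
    n = len(buckets)
    mean_s = sum(buckets) / n
    image = []
    for x in range(len(frames[0])):
        mean_i = sum(frame[x] for frame in frames) / n
        mean_si = sum(s * frame[x] for s, frame in zip(buckets, frames)) / n
        image.append(mean_si - mean_s * mean_i)
    return image

def igi(buckets, frames):
    """(1/2N) * sum of products of consecutive differences."""
    n = len(buckets) - 1  # N + 1 measurements yield N differences
    acc = [0.0] * len(frames[0])
    for k in range(n):
        ds = buckets[k + 1] - buckets[k]
        for x in range(len(acc)):
            acc[x] += ds * (frames[k + 1][x] - frames[k][x])
    return [g / (2 * n) for g in acc]

transmission = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-pixel binary object
buckets, frames = simulate(20001, transmission, seed=1)
g_bs = background_subtraction(buckets, frames)
g_igi = igi(buckets, frames)
for t, a, b in zip(transmission, g_bs, g_igi):
    print(t, round(a, 2), round(b, 2))
```

With about 20,000 frames the two images agree to within statistical noise, although `igi` only ever touches two consecutive frames at a time.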

Experimental Setup
The schematic of the experimental setup is shown in Fig. 1a. Light from a 532 nm laser passes through a slowly rotating ground glass disk to produce pseudo-thermal light. A beam splitter (BS) divides the light into two beams, the test beam and the reference beam. A binary mask object of the letters TH is placed in the test beam 300 mm downstream of the disk. The mask is close to a complementary metal-oxide-semiconductor (CMOS) sensor, CMOS1 (PYTHON300), which is used to simulate the bucket detector; the bucket signal $S$ is calculated by summing all the light intensities detected by CMOS1. Another detector, CMOS2, is in the reference beam at a distance of 300 mm from the ground glass disk. Each CMOS can carry out 500 measurements per second. The hardware specifications of the experimental setup can be found in the Methods section.
The entire calculation required for image reconstruction is performed in the IGI hardware, which consists of two CMOSs, an FPGA (Xilinx Kintex-7 XC7K325T), and a monitor. The FPGA computes the temporal differential signals $S_{n+1} - S_n$ and $I_{n+1}(x) - I_n(x)$ and their product $(S_{n+1} - S_n)[I_{n+1}(x) - I_n(x)]$; it can process all 500 measurements per second made by each CMOS. The monitor shows the intermediate results of IGI at a fixed interval, typically four times per second. The IGI hardware system is completely on-chip because the two CMOSs, the FPGA, and the monitor are integrated on a printed circuit board (PCB). This results in a much smaller and more compact configuration than conventional GI systems. We also emphasize that the system contains only a single FPGA without any external memory.
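The per-measurement arithmetic performed by the FPGA (difference, product, accumulate) can be mimicked in software. The sketch below is a hypothetical Python model, not the FPGA design: the class and attribute names are invented, and the three attributes merely stand in for the register banks $R_S$, $R_I$, and $R_G$ of Fig. 1b. It keeps one bucket value, one reference frame, and one accumulator frame, however many measurements arrive.

```python
class IGIAccumulator:
    """Software stand-in for the three register banks R_S, R_I and R_G."""

    def __init__(self, num_pixels):
        self.r_s = None                # previous bucket value S_n
        self.r_i = None                # previous reference frame I_n(x)
        self.r_g = [0] * num_pixels    # running sum G_n(x)
        self.count = 0                 # number of differences accumulated

    def update(self, s_new, frame_new):
        """Process one measurement pair, then overwrite the stored one."""
        if self.r_s is not None:
            ds = s_new - self.r_s
            for x, (new, old) in enumerate(zip(frame_new, self.r_i)):
                self.r_g[x] += ds * (new - old)
            self.count += 1
        self.r_s = s_new
        self.r_i = list(frame_new)

    def image(self):
        """Normalised result G_n(x) / (2N) for the N differences so far."""
        if self.count == 0:
            raise ValueError("need at least two measurements")
        return [g / (2 * self.count) for g in self.r_g]

acc = IGIAccumulator(num_pixels=2)
measurements = [(1, [1, 2]), (3, [2, 2]), (2, [0, 5])]  # made-up (S, I(x)) pairs
for s, frame in measurements:
    acc.update(s, frame)
print(acc.r_g, acc.image())
```

For these three made-up measurements the accumulator ends at `[4, -3]`, and dividing by $2N = 4$ gives `[1.0, -0.75]`. The storage used is independent of the number of measurements, which is the property that allows the design to run without external memory.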
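We now introduce the framework and workflow of the IGI hardware system, as shown in Fig. 1b. After the $n$th measurement has been processed, $S_n$, $I_n(x)$, and $G_{n-1}(x)$, which is defined as
$$G_{n-1}(x) = \sum_{i=1}^{n-1} (S_{i+1} - S_i)\,[I_{i+1}(x) - I_i(x)],$$
are stored in the corresponding registers, $R_S$, $R_I$, and $R_G$. When the $(n+1)$th signal is detected by the two CMOSs, giving $S_{n+1}$ and $I_{n+1}(x)$, the FPGA computes the differential signals $S_{n+1} - S_n$ and $I_{n+1}(x) - I_n(x)$, using $S_n$ and $I_n(x)$ from $R_S$ and $R_I$. $S_{n+1}$ and $I_{n+1}(x)$ then overwrite $S_n$ and $I_n(x)$ in $R_S$ and $R_I$. The product $(S_{n+1} - S_n)[I_{n+1}(x) - I_n(x)]$ is then calculated and added to $G_{n-1}(x)$ to give $G_n(x)$, which overwrites $G_{n-1}(x)$ in $R_G$. This illustration of the IGI workflow for one measurement shows that the on-chip IGI system can make a pair of measurements and process them immediately, before the next measurement is made. At every 125th measurement (i.e. four times per second), the monitor shows the intermediate result $G_n(x)/(2N)$. When the number of measurements $n$ reaches the preset $N$, the reconstructed image of the object is immediately available without any post-processing (hence the "Instant" in IGI).

The Hanbury Brown and Twiss effect
GI is based on the second-order point-to-point correlation between the test beam and the reference beam. We demonstrate this correlation by conducting the Hanbury Brown and Twiss (HBT) experiment, in which the correlation takes the form
$$G_{\mathrm{HBT}}(x_t, x_r) = \big\langle [I(x_t) - \langle I(x_t)\rangle]\,[I(x_r) - \langle I(x_r)\rangle] \big\rangle,$$
where $I(x_t)$ and $I(x_r)$ are the light intensities detected by CMOS1 and CMOS2. We propose a new algorithm, based on the IGI algorithm,
$$G^{\mathrm{IGI}}_{\mathrm{HBT}}(x_t, x_r) = \frac{1}{2N}\sum_{n=1}^{N} [I_{n+1}(x_t) - I_n(x_t)]\,[I_{n+1}(x_r) - I_n(x_r)].$$
It uses the differential signals of the two beams to obtain the HBT effect. This equation is theoretically equivalent to the HBT algorithm when $N$ is sufficiently large. A detailed proof is given in the Appendix.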
The HBT experiment is conducted on the setup shown in Fig. 1a to verify the accuracy of the $G^{\mathrm{IGI}}_{\mathrm{HBT}}$ algorithm. The mask object is removed, and one pixel of the test beam is fixed, $x_t = x_{t0}$. The experiment is conducted using both offline and online methods. For the offline experiment, we take 15,000 measurements at a rate of 25 measurements per second, store them in a computer, and process these data offline with both the $G_{\mathrm{HBT}}$ algorithm and the $G^{\mathrm{IGI}}_{\mathrm{HBT}}$ algorithm. The results, with an image resolution of 400×280, are shown in Fig. 2a and 2b. The two algorithms produce almost identical results.
For the online experiment, we use the on-chip IGI system to process the measured data online at a rate of 500 measurements per second. The results, labeled with the elapsed time and the number of measurements, are shown in Fig. 2c-h; the HBT effect becomes clearer as time increases. Note that at 30 s all 15,000 measurements have been made and the final result is immediately available. The movie shown on the monitor of the IGI hardware system can be found in Visualization 1.
The experimental results show that the $G^{\mathrm{IGI}}_{\mathrm{HBT}}$ algorithm accurately calculates the second-order correlation of the two beams, thus providing a solid foundation for IGI to reconstruct the image of the object.

The Instant Ghost Imaging
In a similar process, we use both the offline and online methods to reconstruct the image of the object (the letters TH), which is located in the test beam extremely close to CMOS1 (Fig. 3a). The image captured directly by CMOS1 is shown in Fig. 3b. For the offline experiment, 30,000 measurements are made at a rate of 25 measurements per second and stored in a computer. The background subtraction algorithm and the IGI algorithm are used to process these data offline. The results, given in Fig. 3c and 3d, show that both algorithms reconstruct a clear image of the object at a resolution of 400×280.
For the online experiment, we use the on-chip IGI system to measure and process the data online at a rate of 500 measurements per second. Fig. 3e-n show intermediate images produced by the IGI system; the ghost image becomes clearer as time increases. The image appears within 5 s, after 2,500 measurements have been processed by the IGI system (Fig. 3i); it becomes much more finely resolved at 60 s, after 30,000 measurements (Fig. 3n). The movie shown on the monitor of the IGI system can be found in Visualization 2.
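The quoted times follow directly from the measurement rates. As a quick arithmetic check (the constant and function names are illustrative only):

```python
ONLINE_RATE = 500    # measurements per second, on-chip system
OFFLINE_RATE = 25    # measurements per second, offline acquisition
DISPLAY_EVERY = 125  # measurements between monitor updates

def acquisition_seconds(num_measurements, rate):
    """Time needed to record num_measurements at the given rate."""
    return num_measurements / rate

print(acquisition_seconds(2_500, ONLINE_RATE))    # 5.0 s: image first appears
print(acquisition_seconds(30_000, ONLINE_RATE))   # 60.0 s: final online image
print(acquisition_seconds(30_000, OFFLINE_RATE))  # 1200.0 s: offline acquisition only
print(ONLINE_RATE / DISPLAY_EVERY)                # 4.0 monitor updates per second
```

The offline figure covers acquisition alone; the subsequent offline processing time is not stated in the text and is therefore not included.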

Two variants of the IGI
We further propose two variants of the IGI algorithm, denoted $G^{\mathrm{IGI}}_{S}(x)$ and $G^{\mathrm{IGI}}_{I}(x)$ in Fig. 4. These two variants use only the temporal differential signal of a single beam, either the test beam or the reference beam. This feature makes the algorithms easier to implement on the hardware system because fewer registers are required to store signals from only one beam. It can be demonstrated that these two variants are equivalent to the original background subtraction algorithm; the proofs are very similar to that of the original IGI algorithm. Both variants have been implemented on the hardware system. The results of the HBT experiment and the GI experiment for these two variants are shown in Fig. 4. Furthermore, the two algorithms can also take the form of
The values of $\sum_{i=1}^{n} S_i I_i(x)$ and $G_n(x)$ were tracked as the number of measurements increased (Fig. 5). Note that the value in each case is the average of all the pixels in one image. $\sum_{i=1}^{n} S_i I_i(x)$ increases much more quickly than $G_n(x)$. A conventional GI algorithm needs to store the values of $\sum_{i=1}^{n} S_i I_i(x)$, $\sum_{i=1}^{n} S_i$, and $\sum_{i=1}^{n} I_i(x)$. However, IGI needs only to store the values of $G_n(x)$, $S_n$, and $I_n(x)$. Thus, the memory requirement of GI increases rapidly as the number of measurements increases, while the memory requirement of IGI increases much more slowly, indicating that IGI needs much less memory overall.
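To make the single-beam idea concrete, the sketch below tests one plausible pair of estimators on synthetic data. The two forms used, $\frac{1}{N}\sum_n (S_{n+1} - S_n)\,I_{n+1}(x)$ and $\frac{1}{N}\sum_n S_{n+1}\,[I_{n+1}(x) - I_n(x)]$, are an assumption made here for illustration and may differ from the authors' definitions; they are chosen because independence of consecutive frames gives both the expectation $\langle S\,I(x)\rangle - \langle S\rangle\langle I(x)\rangle$. The speckle model, the 4-pixel object, and all names are likewise hypothetical.

```python
import random

rng = random.Random(2)
transmission = [1, 0, 1, 0]  # hypothetical 4-pixel binary object
num_diffs = 20000            # N differences from N + 1 frames

def measure():
    """One synthetic measurement: (bucket signal, reference frame)."""
    frame = [rng.expovariate(1.0) for _ in transmission]
    return sum(i * t for i, t in zip(frame, transmission)), frame

g_s = [0.0] * len(transmission)     # variant using the test-beam difference
g_i = [0.0] * len(transmission)     # variant using the reference-beam difference
sum_si = [0.0] * len(transmission)  # accumulators for background subtraction
sum_i = [0.0] * len(transmission)
sum_s = 0.0

s_old, frame_old = measure()
for _ in range(num_diffs):
    s_new, frame_new = measure()
    for x, (new, old) in enumerate(zip(frame_new, frame_old)):
        g_s[x] += (s_new - s_old) * new  # ASSUMED form: (S_{n+1} - S_n) I_{n+1}(x)
        g_i[x] += s_new * (new - old)    # ASSUMED form: S_{n+1} [I_{n+1}(x) - I_n(x)]
        sum_si[x] += s_new * new
        sum_i[x] += new
    sum_s += s_new
    s_old, frame_old = s_new, frame_new

n = num_diffs
g_s = [g / n for g in g_s]
g_i = [g / n for g in g_i]
g_bs = [si / n - (sum_s / n) * (i / n) for si, i in zip(sum_si, sum_i)]
for row in zip(transmission, g_bs, g_s, g_i):
    print([round(v, 2) for v in row])
```

Each variant needs the previous value of only one beam (`s_old` for the first, `frame_old` for the second), which matches the remark that fewer registers are required.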

Discussion and conclusion
In this study, we conducted offline and online experiments to investigate both the HBT effect and lensless ghost imaging. The offline experiments validated the IGI algorithm, showing that this algorithm provides the same image quality as the background subtraction algorithm. The online experiments demonstrated the feasibility of implementing the IGI algorithm in hardware.
The results show the capability of the on-chip IGI system and its two variants. The on-chip IGI system can process 500 measurements per second, and the image is reconstructed immediately after the measurement without any post-processing.
The reasons why IGI can drastically reduce the memory requirement and space complexity of GI are as follows. First, the use of temporal differential signals removes the need for the space-hungry background term in the data acquisition step. Second, IGI requires only one frame of data from each of the test and reference beams; hence, it needs less memory to store the differential signals than conventional GI algorithms, such as the background subtraction algorithm or its normalized version. Third, as an empirical rule of digital circuits, the fewer bits that have to be processed, the fewer hardware computational resources are required to process a signal.
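The point about bit counts can be illustrated with integer data. The following sketch is an assumption-laden toy, not a model of the actual sensor or FPGA word lengths: it draws uniformly distributed integer pixel values in 0..255 (a hypothetical 8-bit sensor), uses a 16-pixel object with half the pixels transmitting, and compares the pixel-averaged magnitude of the conventional accumulator $\sum_i S_i I_i(x)$ with that of the IGI accumulator $G_n(x)$.

```python
import random

rng = random.Random(3)
NUM_PIXELS = 16
transmission = [1] * 8 + [0] * 8  # hypothetical object: half the pixels open
NUM_DIFFS = 2000

def measure():
    """One synthetic measurement with integer pixel counts in 0..255 (assumed)."""
    frame = [rng.randrange(256) for _ in range(NUM_PIXELS)]
    return sum(i * t for i, t in zip(frame, transmission)), frame

sum_si = [0] * NUM_PIXELS  # conventional accumulator: sum of S_i I_i(x)
g = [0] * NUM_PIXELS       # IGI accumulator G_n(x)

s_old, frame_old = measure()
for _ in range(NUM_DIFFS):
    s_new, frame_new = measure()
    for x in range(NUM_PIXELS):
        sum_si[x] += s_new * frame_new[x]
        g[x] += (s_new - s_old) * (frame_new[x] - frame_old[x])
    s_old, frame_old = s_new, frame_new

# Average over all pixels, as done for Fig. 5, then count the bits needed.
mean_conventional = sum(sum_si) // NUM_PIXELS
mean_igi = abs(sum(g)) // NUM_PIXELS
bits_conventional = mean_conventional.bit_length()
bits_igi = mean_igi.bit_length()
print(bits_conventional, bits_igi)
```

In this toy the conventional accumulator grows with the product of the mean signals, whereas the IGI accumulator grows only with their covariance, so it needs fewer bits per pixel. The exact gap depends entirely on the assumed data and object, and a signed IGI accumulator needs one additional sign bit.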
In summary, we have developed an IGI algorithm that significantly reduces the memory requirement of conventional GI, without requiring more computational resources or degrading image quality, by using the differential signals between two consecutive temporal measurements of each beam. Although we used a lensless thermal light ghost imaging system to illustrate the capability of IGI, IGI can be directly incorporated in a CGI system. This means that IGI is applicable to GI in general. We also conclude that the development of on-chip IGI is feasible and that all the main components, such as the FPGA, the CMOSs, and the monitor, can be integrated onto a PCB. The on-chip implementation of IGI is significantly cheaper, smaller, and more compact than a conventional GI system, which requires computers and other digital components. These advantages pave the way for practical applications of GI. Our next step is to develop this proof-of-principle setup into a realtime imaging system that operates at more than 24 frames per second.

Appendix: The HBT algorithm
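The conventional HBT algorithm is
$$G_{\mathrm{HBT}}(x_t, x_r) = \big\langle [I(x_t) - \langle I(x_t)\rangle]\,[I(x_r) - \langle I(x_r)\rangle] \big\rangle.$$
Experimentally, the HBT effect is calculated from $N$ measurements by
$$G_{\mathrm{HBT}}(x_t, x_r) = \frac{1}{N}\sum_{n=1}^{N} I_n(x_t)\,I_n(x_r) - \frac{1}{N}\sum_{n=1}^{N} I_n(x_t) \cdot \frac{1}{N}\sum_{n=1}^{N} I_n(x_r).$$
We propose a new algorithm, based on the IGI algorithm, that uses $(N+1)$ measurements,
$$G^{\mathrm{IGI}}_{\mathrm{HBT}}(x_t, x_r) = \frac{1}{2N}\sum_{n=1}^{N} [I_{n+1}(x_t) - I_n(x_t)]\,[I_{n+1}(x_r) - I_n(x_r)],$$
where $I_{n+1}(x_t) - I_n(x_t)$ and $I_{n+1}(x_r) - I_n(x_r)$ are the temporal differential signals between two consecutive measurements of the test detector and the reference detector.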
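The equivalence of the two HBT expressions can be sketched along the same lines as equations (4) to (7). The derivation below is a reconstruction, not necessarily the authors' exact argument; it assumes that consecutive measurements are statistically independent and stationary and that $N$ is large, and it introduces the shorthand $A_n = I_n(x_t)$ and $B_n = I_n(x_r)$, which does not appear elsewhere in the text.

```latex
\begin{aligned}
G^{\mathrm{IGI}}_{\mathrm{HBT}}
  &= \frac{1}{2N}\sum_{n=1}^{N} (A_{n+1} - A_n)(B_{n+1} - B_n) \\
  &= \frac{1}{2}\left[
       \frac{1}{N}\sum_{n=1}^{N} A_{n+1}B_{n+1}
     + \frac{1}{N}\sum_{n=1}^{N} A_n B_n
     - \frac{1}{N}\sum_{n=1}^{N} A_{n+1}B_n
     - \frac{1}{N}\sum_{n=1}^{N} A_n B_{n+1}
     \right] \\
  &\approx \frac{1}{2}\Big[\, 2\langle AB\rangle
     - 2\langle A\rangle\langle B\rangle \Big] \\
  &= \langle AB\rangle - \langle A\rangle\langle B\rangle
   = \big\langle (A - \langle A\rangle)(B - \langle B\rangle) \big\rangle
   = G_{\mathrm{HBT}} .
\end{aligned}
```

The third line uses the fact that the two same-index sums both tend to $\langle AB\rangle$, while the two cross sums pair intensities from different, independent measurements and therefore factorize into $\langle A\rangle\langle B\rangle$.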

Fig. 1. a, Schematic of the experimental setup. CMOS1, CMOS2: complementary metal-oxide-semiconductor sensors; FPGA: field-programmable gate array. The pseudo-thermal light is produced by passing a 532 nm laser through a rotating ground glass disk, and the light beam is split into two: one beam illuminates the object and is collected by CMOS1; the other is directly recorded by CMOS2. The FPGA-based on-chip IGI system reconstructs the image of the object using the IGI algorithm, and the intermediate results are shown on the monitor. The distances from the rotating ground glass disk to the object and to CMOS2 are both 300 mm. b, The workflow of the IGI system. Green dashed lines: the FPGA extracts data from the corresponding register; blue dashed lines: the FPGA stores new data in the register, overwriting old data. An orange ball represents a register unit; $S_n$ and $I_n(x)$ are stored in the registers $R_S$ and $R_I$, and the intermediate result $G_n(x) = G_{n-1}(x) + (S_{n+1} - S_n)[I_{n+1}(x) - I_n(x)]$ is stored in the register $R_G$.

Fig. 2. The Hanbury Brown and Twiss effect. The offline HBT effect from 15,000 measurements obtained a, by the conventional $G_{\mathrm{HBT}}$ algorithm and b, by the $G^{\mathrm{IGI}}_{\mathrm{HBT}}$ algorithm. c-g, The intermediate results and h, the final result of the online IGI hardware system using the $G^{\mathrm{IGI}}_{\mathrm{HBT}}$ algorithm.

Fig. 3. The images acquired by offline and online experiments. a, The object and b, the object imaged directly on CMOS1; the image of the object reconstructed offline from 30,000 measurements by c, the background subtraction algorithm and d, the IGI algorithm. e-m, The intermediate results and n, the final result of the online IGI hardware system, with the number of measurements and the time at the top of each image.

Fig. 4. The experimental results of the two variants. a, The HBT effect and b, the GI result of the variant $G^{\mathrm{IGI}}_{S}(x)$; c, the HBT effect and d, the GI result of the variant $G^{\mathrm{IGI}}_{I}(x)$.

Fig. 5. Analysis of the conventional GI algorithm and the IGI algorithm. Increases in the total values of $\sum_{i=1}^{n} S_i I_i(x)$ and $G_n(x) = \sum_{i=1}^{n} (S_{i+1} - S_i)[I_{i+1}(x) - I_i(x)]$ with the number of measurements $n$.