Single-pixel 3D reconstruction via a high-speed LED array

Three-dimensional reconstruction can be performed in many ways, among which photometric stereo is an established and intensively investigated method. In photometric stereo, geometric alignment or pixel matching between two-dimensional images under different illuminations is crucial to the accuracy of the three-dimensional reconstruction, and the dynamics of the scene make this task difficult. In this work, we propose a single-pixel three-dimensional reconstruction system utilizing structured illumination, implemented via a high-speed LED array. By performing 500 kHz structured illumination and capturing the reflected light intensity with detectors at different spatial locations, two-dimensional images with different shading at 64 × 64 pixel resolution are reconstructed at 122 frames per second. Three-dimensional profiles of the scene are further reconstructed using the surface gradients derived by the photometric stereo algorithm, achieving an accuracy of up to 0.50 mm. Chromatic three-dimensional imaging via an RGB LED array is also performed at 40 frames per second. The demonstrated system significantly improves the dynamic performance of single-pixel three-dimensional reconstruction, and offers potential solutions to many applications, such as fast three-dimensional inspection.


Introduction
Three-dimensional (3D) reconstruction is an intensively explored technique applied in areas such as public security, robotics, medical sciences, and military defense [1]. Owing to their high integration, charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) sensors are usually used in 3D reconstruction systems [2,3]. However, light sensors with a pixelated structure are unable to satisfy all application requirements, particularly in low-light environments and in spectral ranges outside the visible region [4][5][6][7][8][9]. Alternatively, images can be reconstructed by a single-pixel detector with computational algorithms [10,11]. Therefore, single-pixel imaging has attracted much attention in 3D imaging in recent years [12,13]. A variety of approaches have been proposed for different applications, among which time-of-flight [12] and stereo vision [13,14] are commonly used.
Time-of-flight measurement determines the distance to a scene by illuminating it with pulsed light and comparing the arrival time of the back-scattered light with the emission time of the laser pulse. A major advantage of time-of-flight-based 3D imaging is that its depth resolution is mainly determined by the pulse width of the light and is not dramatically affected by increases in the distance between the system and the object, making it a good candidate for long-distance 3D measurement [15]. A narrower pulse width means a smaller uncertainty in the time-of-flight measurement and less overlap between back-scattered signals from objects at different depths, which in turn improves the depth resolution of the system. A single-pixel-imaging-based time-of-flight system uses a pulsed laser for structured illumination and a time-resolving detector for detection [16]. A series of two-dimensional (2D) images can be obtained at different depths, forming a 3D image cube. However, the method is computationally expensive, because images at all depths need to be reconstructed [17].
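The distance and depth-resolution relations above can be sketched numerically; the pulse width and timing values below are illustrative assumptions, not figures from any cited system.

```python
# Time-of-flight relations: distance from the round-trip time of a pulse,
# and depth resolution set by the pulse width. Numbers are illustrative.
C = 3.0e8  # approximate speed of light in air, m/s

def tof_depth(round_trip_time_s):
    """Distance to the scene: the light travels there and back."""
    return C * round_trip_time_s / 2.0

def depth_resolution(pulse_width_s):
    """Smallest resolvable depth difference, limited by the pulse width."""
    return C * pulse_width_s / 2.0

# e.g. a 1 ns pulse limits the depth resolution to about 0.15 m.
```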
Stereo vision uses two or more images obtained simultaneously from different viewpoints to reconstruct a 3D image, but the geometric registration between the images can be problematic. Among stereo vision methods, photometric stereo captures images from a fixed viewpoint under different illuminations [18]. The pixel correspondence is easier to establish than in traditional stereo vision, but the images under different illuminations have to be taken sequentially; if they could be captured at the same time, the imaging rate of photometric stereo would be much higher. In structured-illumination-based single-pixel 3D imaging, several detectors measure the back-scattered light intensities from different locations simultaneously, removing the need for sequential shooting [13,19]. The 2D images reconstructed by the different detectors appear to be illuminated from different directions. Thus, this architecture simplifies the 3D imaging system to one spatial light modulator, one lens, and several single-pixel detectors without compromising the quality of the reconstruction.
In single-pixel imaging, the number of mask patterns required for one image reconstruction is proportional to the pixel resolution of the image, which limits the frame rate of the system. One approach is to decrease the required number of patterns via compressive sensing algorithms of varying sophistication [20][21][22][23][24][25][26]. However, these algorithms either carry a computational overhead or cannot reach a significantly low compression ratio. An alternative and more fundamental approach is to increase the rate of structured illumination. Previous works have suffered from the limited rate of structured illumination, reaching 64 × 64 pixel resolution at 8 frames per second (fps) with a 22 kHz digital micromirror device [19], or 32 × 32 pixel resolution at 10 fps with a 10 kHz LED array [27]. A 32 × 32 resolution LED array with a 500 kHz illumination rate was proposed in our previous work [28], achieving 1000 fps single-pixel imaging at a 25% compression ratio. However, the poor spatial resolution of the LED array in previous works limited its potential in various applications.
In this paper, we demonstrate a single-pixel 3D reconstruction system with a high-speed LED array, which has a pixel resolution of 64 × 64 and a maximum illumination rate of 2.5 MHz. 3D profiling with 0.50 mm accuracy at 122 fps is performed, and chromatic 3D reconstruction results at 40 fps are presented as well.

3D reconstruction principle
The architecture of 3D single-pixel imaging in our experiment is shown in figure 1. The mask pattern $P_i$ displayed by an LED array is projected onto the object through a projection lens, and a single-pixel detector then measures the total intensity of the reflected light as the inner product of $P_i$ and the object reflectivity $T$,

$$S_i = \sum_{x,y} P_i(x,y)\, T(x,y), \tag{1}$$

where $T(x,y)$ denotes the reflectivity of the object at the corresponding spatial location. In the differential measurement scheme, each pattern $P_i$ and its inverse $\bar P_i$ are displayed in turn, yielding the differential signal

$$\Delta S_i = S_i - \bar S_i, \tag{2}$$

from which the 2D image is reconstructed as

$$\tilde T(x,y) = \sum_i \Delta S_i\, P_i(x,y). \tag{3}$$

The 3D profile of the object can be reconstructed from the 2D images obtained by single-pixel detectors at different spatial locations, i.e. detectors 1, 2, 3, and 4 in figure 1. The intensity at pixel location $(i, j)$ in the image obtained by the $k$th detector can be denoted as

$$I_k(i,j) = E_R\, \hat d_k \cdot \hat n_{ij}, \tag{4}$$

where $E_R$ is the reflected intensity of the light source, $\hat d_k$ is the unit detector vector from the object to the $k$th detector, and $\hat n_{ij}$ is the unit surface normal vector of the object at pixel location $(i, j)$. In the case that four detectors at different spatial locations are used to yield images from different perspectives, equation (4) can be rewritten as

$$\mathbf I_{ij} = E_R\, D\, \hat n_{ij}, \tag{5}$$

where $D = [\hat d_1, \hat d_2, \hat d_3, \hat d_4]^T$ is the unit detector vector matrix and $\mathbf I_{ij} = [I_1(i,j), I_2(i,j), I_3(i,j), I_4(i,j)]^T$ is the pixel intensity array acquired by the four detectors. The unit surface normal vector of the corresponding pixel is then determined as

$$\hat n_{ij} = \frac{D^+ \mathbf I_{ij}}{\left\| D^+ \mathbf I_{ij} \right\|}, \tag{6}$$

where $D^+$ denotes the pseudo-inverse of $D$. Denote $z_{i,j}$ as the height of the surface at pixel location $(i, j)$ and $\{z\}$ as the set of heights for all pixel locations.
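As a minimal sketch of the normal-recovery step, the surface normal at each pixel can be estimated from the four detector images with the pseudo-inverse of the unit detector vector matrix; the detector directions below are illustrative assumptions, not the exact experimental geometry.

```python
import numpy as np

def surface_normals(images, D):
    """images: (4, H, W) pixel intensities from the four detectors.
    D: (4, 3) matrix whose rows are unit detector vectors d_k.
    Returns (H, W, 3) unit surface normals."""
    _, H, W = images.shape
    I = images.reshape(4, -1)                     # stack pixels as columns
    m = np.linalg.pinv(D) @ I                     # proportional to E_R * n_ij
    m = m / np.linalg.norm(m, axis=0, keepdims=True)  # normalize to unit length
    return m.T.reshape(H, W, 3)

# Illustrative detector geometry: four directions tilted toward the object
# (up, down, left, right), each a unit vector.
D = np.array([[0.0,  0.6, 0.8],
              [0.0, -0.6, 0.8],
              [-0.6, 0.0, 0.8],
              [0.6,  0.0, 0.8]])
```

Because four detectors overdetermine the three normal components, the pseudo-inverse gives the least-squares solution per pixel.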
The relationship between a pixel $(i, j)$ and its four adjacent pixels can be expressed by the Taylor series expansion as

$$z_{i\pm1,j} = z_{i,j} \pm \frac{\partial z}{\partial x} + \frac{1}{2}\frac{\partial^2 z}{\partial x^2} \pm \frac{1}{6}\frac{\partial^3 z}{\partial x^3} + \cdots, \qquad z_{i,j\pm1} = z_{i,j} \pm \frac{\partial z}{\partial y} + \frac{1}{2}\frac{\partial^2 z}{\partial y^2} \pm \frac{1}{6}\frac{\partial^3 z}{\partial y^3} + \cdots. \tag{7}$$

Adding the equations and ignoring any terms beyond the third order, equation (7) can be rearranged as

$$z_{i+1,j} + z_{i-1,j} + z_{i,j+1} + z_{i,j-1} - 4 z_{i,j} = \nabla^2 z_{i,j}. \tag{8}$$

In order to estimate the surface height accurately, a common reconstruction approach [29] is to minimize the least-squares error function

$$E = \iint \left[ \left( \frac{\partial z}{\partial x} - p \right)^2 + \left( \frac{\partial z}{\partial y} - q \right)^2 \right] \mathrm{d}x\,\mathrm{d}y, \tag{9}$$

where $p$ and $q$ are the surface gradients obtained from the surface normals, and the minimum is yielded by the Euler-Lagrange equation

$$\nabla^2 z = \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y}. \tag{10}$$

Consequently, equation (8) becomes

$$z_{i+1,j} + z_{i-1,j} + z_{i,j+1} + z_{i,j-1} - 4 z_{i,j} = \frac{p_{i+1,j} - p_{i-1,j}}{2} + \frac{q_{i,j+1} - q_{i,j-1}}{2}, \tag{11}$$

and by solving the series of equations (11) determined by all pixels in the obtained images, with Robin boundary conditions [29] that take the heights of adjacent pixels inside and outside the boundary to be equal, each $z_{i,j}$ in the surface height $\{z\}$ can be calculated.
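The height recovery described above can be sketched as a simple iterative solver for the discrete Poisson equation; this Jacobi-style implementation with replicated boundary heights is an illustrative substitute for the solver of [29], not the exact method used.

```python
import numpy as np

def integrate_heights(p, q, iters=2000):
    """p, q: (H, W) surface gradients dz/dx, dz/dy (unit pixel spacing).
    Returns heights z, recovered up to an additive constant."""
    H, W = p.shape
    z = np.zeros((H, W))
    # Right-hand side of the Poisson equation: dp/dx + dq/dy.
    f = np.gradient(p, axis=1) + np.gradient(q, axis=0)
    for _ in range(iters):
        zp = np.pad(z, 1, mode='edge')  # replicate heights across the boundary
        # Jacobi update of the 5-point stencil: 4z = sum(neighbors) - f.
        z = (zp[:-2, 1:-1] + zp[2:, 1:-1] +
             zp[1:-1, :-2] + zp[1:-1, 2:] - f) / 4.0
    return z - z.mean()  # fix the free additive constant
```

Replicating edge heights mimics the boundary condition that adjacent pixels inside and outside the boundary are equal; a direct sparse solve would converge faster but the fixed point is the same.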

Experiment results
In the experiment, the LED array has 64 × 64 pixel resolution with a size of 88 mm × 88 mm. The array can display binary patterns in red, green and blue at a maximum rate of 2.5 MHz, a significant improvement over the device in our previous work [28]. The improvement is achieved by choosing a chromatic LED chip with a small size (SMD18-038BT, chip size 1.3 mm). The Hadamard matrix is used for the mask patterns, which has a small computational overhead [26]. To display Hadamard patterns efficiently, the same two-consecutive-display strategy as in [28] is employed here, with 128 I/O ports. A lens (f = 150 mm) located 515 mm from the LED array projects the patterns onto the object, which is 210 mm behind the lens. Photomultiplier tubes (PMTs, Thorlabs PMT2102) are placed above, below, to the left, and to the right of the object, each 100 mm away. A high-dynamic-range digitizer (PicoScope 6404D) acquires the intensity data and transfers them to the computer. Differential measurement is performed, i.e. 8192 patterns are displayed, comprising 4096 Hadamard patterns and their inverses. We used 50% compressive sensing, which has been shown to acquire images without significant signal-to-noise ratio (SNR) degradation [5,16]. A ladder-shaped object with a size of 22 mm × 22 mm × 8 mm was used for 3D reconstruction, as shown in figure 2(a); on top of its surface there was a small square with a size of 6 mm × 6 mm. It is worth mentioning that the PMTs used here have a much larger bandwidth (80 MHz) than the 2.5 MHz display rate of the LED array, which is mainly determined by the overall response time of the device, i.e. the switching time of the LED chips combined with the response time of the electronics on the driving circuit. PMT noise also plays an important role in the experiments.
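The differential Hadamard measurement scheme can be sketched in simulation; the 8 × 8 scene below is an illustrative stand-in for the 64 × 64 experiment.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

N = 8                          # image is N x N, so N*N patterns in the basis
Hm = hadamard(N * N)           # entries are +1 / -1
scene = np.random.rand(N, N)   # stand-in for the object reflectivity
o = scene.ravel()

pos = (Hm + 1) / 2             # binary pattern: LEDs on where Hm = +1
neg = (1 - Hm) / 2             # inverse pattern, displayed next
s = pos @ o - neg @ o          # differential signal, equal to Hm @ o
recon = (Hm @ s) / (N * N)     # Hm is symmetric and orthogonal up to N*N
```

Subtracting the readings of each pattern/inverse pair converts the binary 0/1 illumination into an effective ±1 Hadamard measurement, which the final line inverts exactly.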
That is, when collecting the reflected light intensity, because of the overall response time mentioned above, along with the rise/fall time of the PMTs, only a fraction of the acquired samples is valid for image reconstruction. The higher the display rate of the LED array, the fewer valid samples are acquired for each measurement, and PMT noise will dominate the measurement and degrade the image quality. Here the display rate of Hadamard patterns is set to 500 kHz, which provides an optimal balance between system frame rate and reconstruction quality. The 2D images obtained from the four directions are shown in figure 2(b). With 50% compressive sensing, one acquisition takes 8.2 ms, corresponding to 122 fps for 2D imaging. Figures 2(c) and (d) show the 3D image reconstructed from the 2D images, and the error map between the depth of the 3D reconstruction and the ground truth. The accuracy of the 3D reconstruction, defined as the root mean squared error (RMSE) of the error map, is 0.88 mm. The error map shows that the inaccuracy of the reconstruction mainly occurs near the edges of the object, because the photometric stereo algorithm used in the experiment is based on the finite-difference principle, so the edges are over-smoothed in the reconstruction.
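The reported timing can be checked with a back-of-the-envelope calculation, assuming that 50% compression halves the 4096-pattern Hadamard basis and that each displayed pattern is paired with its inverse.

```python
# Frame-time arithmetic for the 64 x 64, 500 kHz configuration.
full_basis = 64 * 64                      # 4096 patterns for 64 x 64 pixels
displays = 2 * int(0.5 * full_basis)      # 50% compression, pattern + inverse
rate_hz = 500e3                           # 500 kHz display rate
t_frame = displays / rate_hz              # seconds per 2D frame
fps_2d = 1 / t_frame
fps_chromatic = fps_2d / 3                # three color frames per chromatic image
```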
To verify the dynamic performance of the proposed system, another experiment imaging objects in motion is performed. Trapezoid-, semicircle- and triangle-shaped prisms are evenly placed on a rotating disk at intervals of 60°, as shown in figure 3(a). The disk rotates at 4 revolutions per second, and the objects, placed 33 mm from the center of the disk, accordingly move at 0.83 m s⁻¹. The parameter configuration of the system is exactly the same as in the previous experiment, yielding 122 fps 3D reconstruction. Figure 3(b) shows the reconstructed 3D profiles of the trapezoid, semicircle and triangle objects. Figure 3(c) shows the error maps between the reconstructed 3D profiles and their corresponding ground truths; the RMSEs are 0.50 mm, 0.51 mm and 0.77 mm, respectively. The dynamic results demonstrate a profiling accuracy similar to that of the static experiment, and even slightly better owing to the simpler structures of the objects.

The LED chips on the array are chromatic, that is, each chip contains one red, one green and one blue light-emitting diode. Therefore, by displaying the set of Hadamard mask patterns in red, green and blue sequentially, the proposed system is capable of chromatic imaging. An experiment is performed using a colored toy, shown in figure 4(a), as the object, with the same acquisition configuration as in the previous experiments. Figures 4(b) and (c) show the reconstructed 3D image of the toy and the 2D images used for the reconstruction. It is worth mentioning that the 3D profile of the toy is reconstructed at a frame rate of 122 fps, but the chromatic 2D images in figure 4(c) are obtained at 40 fps because they require three consecutive images, one red, one green and one blue, to recover the chromatic information.

Conclusion
In this work, a 3D single-pixel imaging system is presented with a high-speed LED array with a maximum refresh rate of 2.5 MHz. Dynamic experiments demonstrate that the proposed system is capable of 3D reconstruction at a frame rate of 122 fps with an accuracy of 0.50 mm, a significant improvement over existing single-pixel-imaging-based works. The use of RGB LED chips on the array also enables chromatic 3D imaging at 40 fps. The accuracy of the system could be further improved by an optimized photometric stereo algorithm, an LED array with more pixels or a higher pixel density, and higher-performance detectors. The demonstrated system significantly improves the dynamic performance of single-pixel 3D reconstruction, and offers potential solutions to many applications, such as fast 3D inspection.