3D Reconstruction for Motion Blurred Images Using Deep Learning-based Intelligent Systems

3D reconstruction using deep learning-based intelligent systems can greatly help in measuring an individual's height and shape quickly and accurately from 2D motion-blurred images. During real-time image acquisition, motion blur caused by camera shaking or human motion commonly appears. Deep learning-based intelligent control applied to vision can help solve this problem. To this end, we propose a 3D reconstruction method for motion-blurred images using deep learning. First, we develop a BF-WGAN algorithm that combines the bilateral filtering (BF) denoising theory with a Wasserstein generative adversarial network (WGAN) to remove motion blur. The bilateral filter denoising algorithm is used to remove the noise and to retain the details of the blurred image. Then, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss functions. Next, we use the deblurred images generated by the BF-WGAN algorithm for 3D reconstruction. We propose a threshold optimization random sample consensus (TO-RANSAC) algorithm that can remove the wrong relationships between two views in the 3D reconstructed model relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm can adjust the threshold adaptively, which improves the accuracy of the 3D reconstruction results. The experimental results show that our BF-WGAN algorithm has a better deblurring effect and higher efficiency than other representative algorithms. In addition, the TO-RANSAC algorithm yields a calculation accuracy considerably higher than that of the traditional RANSAC algorithm.


Introduction
Real-time images are easily blurred by factors such as camera shaking and human motion. For a good visual effect, it is very important to remove the blur and obtain a sharp image [1]. "Intelligent" solutions, which use effective critical thinking procedures to restore the sharp image, are essential for solving the blurring problem. Most of the existing image deblurring methods are based on an image prior probability model. Krishnan et al. [2] assumed that the image gradient obeys the Laplace distribution, and Zoran et al. [3] simulated the distribution of the image gradient with a Gaussian mixture model. However, the image priors used by these methods overlap with noise in the frequency domain or transform domain, so texture structures are excessively smoothed, which greatly reduces the visual effect. In recent years, many scholars have applied deep learning to image deblurring. Xu et al. [4] proposed an image deblurring method based on a convolutional neural network (CNN) to overcome the ringing effect in saturated regions of images. Chakrabarti [5] predicted complex Fourier coefficients of motion kernels to perform non-blind deblurring in the Fourier space. Gong et al. [6] used a fully convolutional network for motion flow estimation. Nah et al. [7] adopted a kernel-free end-to-end approach that uses a multiscale CNN to directly deblur the image. However, CNN-based methods consider the prior features of the image only indirectly and are easily affected by noise.
To solve the problems of the existing deep learning algorithms, we propose a BF-WGAN algorithm, which combines the bilateral filtering (BF) [8] denoising theory with the Wasserstein generative adversarial network (WGAN) [9], to remove motion blur from images. The BF-WGAN algorithm contains two parts. First, the bilateral filter denoising algorithm is used to remove the noise and retain the details of the blurred image. The advantage of the bilateral filter is that it considers not only the spatial distance between pixels but also the degree of similarity between pixels, which ensures that the pixel values near edges are preserved. Second, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss [10] functions, which allows the finer texture-related details to be restored and the high-precision contours of the image to be revealed. Further, the BF-WGAN has fewer parameters than the multiscale CNN, which greatly speeds up inference. Therefore, the BF-WGAN obtains state-of-the-art results in motion deblurring while being faster than its closest competitor, the multiscale CNN.
3D reconstruction of the human body is very useful for the rapid and accurate measurement of an individual's height and body shape [11]. With the use of 2D real-time images of the human body taken from different angles, 3D reconstruction technology can quickly and accurately provide information on the growth of children. At present, it is estimated that there are approximately 149 million children under the age of 6 with physical dysplasia worldwide. A child's height and shape can directly reflect his or her magnitude of growth [12]. Because there are many children that need to be evaluated, the traditional manual measurement methods for height and body shape require considerable manpower and time.
For the 3D reconstruction of motion-blurred images, we use the deblurred images generated with the BF-WGAN algorithm to perform the 3D reconstruction. The most important part of 3D reconstruction is the calculation of the camera parameters, which mainly include the global rotation matrix and global translation vector for multiview 3D reconstruction [13]. The global rotation matrix was used to remove the wrong relationship between two views in the 3D reconstructed model. A commonly used method to calculate the global rotation matrix is the RANSAC algorithm [14]. However, the traditional RANSAC algorithm uses a fixed threshold, which can affect the accuracy of the global rotation matrix. This paper proposes a threshold optimization random sample consensus (TO-RANSAC) algorithm that can adjust the threshold adaptively to improve the accuracy of the 3D reconstruction results.
The contributions of this paper are as follows:
a) We use deep learning-based intelligent systems to remove the motion blur in images. The BF-WGAN algorithm is proposed, which combines the BF denoising theory with a WGAN. The BF denoising algorithm is used to remove the noise and retain the details of the blurred image, and the blurred image and the corresponding sharp image are then input into the WGAN. The BF-WGAN algorithm has a better deblurring effect and higher efficiency than other representative algorithms.
b) We adopt the deblurred images generated by the BF-WGAN algorithm to perform the 3D reconstruction. The TO-RANSAC algorithm is proposed, which can remove the wrong relationships between two views in the 3D reconstructed model relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm can adjust the threshold adaptively, which improves the accuracy of the 3D reconstruction results.
The remainder of this paper is organized as follows: Section 2 consists of two parts. Part 2.1 presents the deep learning-based intelligent systems to remove the motion blur of images through the BF-WGAN algorithm, and Part 2.2 explains the TO-RANSAC algorithm that we used to perform the 3D reconstruction. In Section 3, we designed and evaluated an experiment to test the performance of the BF-WGAN algorithm and the TO-RANSAC algorithm. In Section 4, we conclude our study and suggest directions for future work.

BF-WGAN Algorithm
Image processing normally depends on image quality, and a poor-quality capture can lead to errors. Intelligent systems that use intelligent decision-making algorithms and techniques can help solve the image blurring problem. Mathematically, image blurring can be described as a convolution: the original sharp image $x$ is convolved with the blurring kernel $k$, and noise $n$ is added. The blurred image, denoted here by $y$, is then obtained as [15]:
$$y = k \otimes x + n \tag{1}$$
where $\otimes$ is the convolution operator.
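As an illustration of this blur model, the following minimal Python sketch blurs a small synthetic image with a kernel and adds Gaussian noise. It is not the authors' MATLAB code: the function names `convolve2d_same` and `blur_with_noise`, the zero padding at the image borders, and the 1x3 horizontal kernel are assumptions chosen for the example, and plain lists are used only to keep the sketch self-contained.

```python
import random


def convolve2d_same(image, kernel):
    """Convolve a 2D list with a 2D kernel; output has the input size (zero padding)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # True convolution: the kernel is flipped relative to correlation.
                    rr, cc = r + ch - i, c + cw - j
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def blur_with_noise(sharp, kernel, noise_std, seed=0):
    """Return kernel (*) sharp + n, with n drawn from a zero-mean Gaussian (assumed noise model)."""
    rng = random.Random(seed)
    blurred = convolve2d_same(sharp, kernel)
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in blurred]


# A single bright pixel spread by a horizontal 3-pixel motion kernel.
sharp = [[0.0] * 5 for _ in range(5)]
sharp[2][2] = 1.0
kernel = [[1 / 3, 1 / 3, 1 / 3]]
noise_free = blur_with_noise(sharp, kernel, noise_std=0.0)
```

With `noise_std=0.0` the bright pixel is spread evenly over three horizontal neighbours, which is the smearing that the deblurring stage has to invert.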

Bilateral Filter Denoising Algorithm
A bilateral filter is a nonlinear denoising algorithm that eliminates noise while preserving image details [16]. The general Gaussian filter mainly considers the spatial distance between pixels when sampling but does not consider the degree of similarity between pixels [17]. Compared with the Gaussian filter, the bilateral filter considers both the spatial distance and degree of similarity, thereby suppressing the irrelevant details and enhancing the sharp edges of the image.
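Step 1: Compute the Gaussian weight region filter based on the spatial distances. The output $h(x)$ is a normalized sum of the input values $f(n)$ weighted by $c(n, x)$, where $f(n)$ and $h(x)$ represent the input image and output image, respectively, $n$ is a point in the neighborhood centered on $x$, and $c(n, x)$ is the Gaussian weight that measures the spatial distance between the center $x$ and the point $n$.
Step 2: Obtain the edge filter based on the degree of similarity. The output is formed in the same way, but each $f(n)$ is weighted by $s(f(n), f(x))$, the weight based on the degree of similarity between pixels.
Step 3: Create the bilateral filter by combining the Gaussian weight region filter with the edge filter:
$$h(x) = k^{-1}(x) \int f(n)\, c(n, x)\, s(f(n), f(x))\, dn$$
where $k(x)$ is the normalization factor:
$$k(x) = \int c(n, x)\, s(f(n), f(x))\, dn$$
After the local subregion is defined, the discretized form of the formula (8) can be expressed as follows:
$$h(x) = \frac{\sum_{n \in \Omega_x} f(n)\, c(n, x)\, s(f(n), f(x))}{\sum_{n \in \Omega_x} c(n, x)\, s(f(n), f(x))}$$
where $\Omega_x$ denotes the local subregion centered on $x$.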
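The three steps can be sketched in Python as a discretized bilateral filter. This is a minimal illustration under stated assumptions, not the authors' implementation: both weights are taken to be Gaussians, and the parameter names `radius`, `sigma_d` (spatial scale) and `sigma_r` (intensity scale) are hypothetical since the source does not give them.

```python
import math


def bilateral_filter(image, radius=2, sigma_d=2.0, sigma_r=0.1):
    """Discretized bilateral filter on a 2D list of intensities (assumed Gaussian weights)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            center = image[r][c]
            weighted_sum = 0.0
            norm = 0.0  # plays the role of the normalization factor k(x)
            # Local subregion: a square window clipped at the image border.
            for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                    dist2 = (rr - r) ** 2 + (cc - c) ** 2
                    diff2 = (image[rr][cc] - center) ** 2
                    spatial_w = math.exp(-dist2 / (2 * sigma_d ** 2))     # c(n, x)
                    similarity_w = math.exp(-diff2 / (2 * sigma_r ** 2))  # s(f(n), f(x))
                    weighted_sum += image[rr][cc] * spatial_w * similarity_w
                    norm += spatial_w * similarity_w
            out[r][c] = weighted_sum / norm
    return out


# A vertical step edge: the similarity weight keeps the two sides from mixing.
step = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
smoothed = bilateral_filter(step)
```

On the step edge the similarity weight across the edge is about exp(-50), so the edge is left essentially unchanged, whereas a purely spatial Gaussian filter would smear it.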

WGAN Deblurring Algorithm
This paper proposes a WGAN deblurring algorithm that adopts both the WGAN loss and perceptual loss functions [18]. The WGAN loss function ensures that the generated samples are diverse, thereby allowing the fine texture-related details to be restored. The input and output of the WGAN deblurring algorithm are shown in Fig. 1. The input is the motion-blurred image, and the output is the deblurred image [19].
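The WGAN objective between the generator $G$ and the discriminator $D$ is the minimax value obtained with the Kantorovich-Rubinstein duality [20]:
$$\min_{G} \max_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})]$$
where $x$ represents the original sharp image and $\mathbb{E}$ represents the expectation. $\mathcal{D}$ is the set of 1-Lipschitz functions. $P_r$ is the data distribution, and $P_g$ is the model distribution, defined by $\tilde{x} = G(z)$, where the input $z$ represents the blurred image. $D(x)$ represents the discriminator's score of how likely $x$ is to be a real image.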
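To make the objective concrete, the sketch below evaluates the empirical version of the two expectations on small batches. It is only an illustration: the function names are hypothetical, samples are reduced to scalars, and `toy_critic` is a stand-in 1-Lipschitz function rather than a trained network.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wasserstein_estimate(critic, real_samples, generated_samples):
    """Empirical E[D(x)] over real samples minus E[D(x~)] over generated samples."""
    return mean(critic(x) for x in real_samples) - mean(critic(x) for x in generated_samples)


def critic_loss(critic, real_samples, generated_samples):
    # The critic maximizes the estimate, so it minimizes its negative.
    return -wasserstein_estimate(critic, real_samples, generated_samples)


def generator_loss(critic, generated_samples):
    # The generator minimizes -E[D(G(z))]; the real-sample term does not depend on G.
    return -mean(critic(x) for x in generated_samples)


def toy_critic(value):
    """Identity on scalars: a trivially 1-Lipschitz stand-in for D."""
    return value


real_batch = [1.0, 3.0]
generated_batch = [0.0, 1.0]
gap = wasserstein_estimate(toy_critic, real_batch, generated_batch)
```

In practice the 1-Lipschitz constraint on the critic has to be enforced during training, for example by weight clipping or a gradient penalty; the source does not state which variant is used.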

① WGAN framework
As shown in Fig. 2, the framework of the WGAN deblurring algorithm consists of a generator and a discriminator [21].

② Loss Function
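The loss function of this paper consists of the WGAN loss and perceptual loss functions. The total loss function $L$ is a weighted combination of the two terms, in which the weighting coefficient is set to 0.01 according to the experience value.
WGAN loss. The WGAN loss function is defined as follows:
$$L_{WGAN} = \sum_{n=1}^{N} -D_{\theta_D}\left(G_{\theta_G}(I^B)\right)$$
where $I^B$ represents the blurred image. Deblurring is performed by the trained generator $G_{\theta_G}$ and discriminator $D_{\theta_D}$. $N$ represents the size of the training data [22].
Perceptual loss. The perceptual loss function is defined as follows:
$$L_{perceptual} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^S)_{x,y} - \phi_{i,j}\left(G_{\theta_G}(I^B)\right)_{x,y} \right)^2$$
where $W_{i,j}$ and $H_{i,j}$ are the dimensions of the feature maps. $\phi_{i,j}$ is the feature map obtained by the $j$-th convolution before the $i$-th maxpooling layer within the VGG19 network [23]. $I^S$ represents the sharp image, and $I^B$ represents the blurred image.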
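The perceptual loss can be sketched as a squared difference between two feature maps, normalized by the feature-map size. In this minimal example the feature maps are plain 2D lists supplied by the caller; in the method described above they would come from a VGG19 layer applied to the sharp image and to the generator output, which is not reproduced here. The function name is hypothetical.

```python
def perceptual_loss(features_sharp, features_deblurred):
    """Sum of squared feature differences divided by W * H of the feature map."""
    height = len(features_sharp)
    width = len(features_sharp[0])
    if height != len(features_deblurred) or width != len(features_deblurred[0]):
        raise ValueError("feature maps must have the same dimensions")
    total = 0.0
    for row_s, row_d in zip(features_sharp, features_deblurred):
        for a, b in zip(row_s, row_d):
            total += (a - b) ** 2
    return total / (width * height)


# Two 2x2 feature maps that differ in a single entry by 2.
phi_sharp = [[1.0, 2.0], [3.0, 4.0]]
phi_deblurred = [[1.0, 2.0], [3.0, 2.0]]
loss_value = perceptual_loss(phi_sharp, phi_deblurred)
```

Comparing deep features rather than raw pixels penalizes differences in texture and structure, which is why this term is paired with the adversarial loss.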

Multi-view 3D Reconstruction Based on the TO-RANSAC Algorithm
Multiview 3D reconstruction is mainly composed of four parts: (1) Feature extraction and matching; (2) Camera parameter calculation; (3) 3D point cloud calculation; and (4) Bundle adjustment. The camera parameter calculation mainly involves the global rotation matrix and global translation vector for multiview 3D reconstruction [24]. The global rotation matrix is used to remove the wrong relationship between two views in the 3D reconstructed model.
The most commonly used method to calculate the global rotation matrix is the RANSAC algorithm [25]. However, the traditional RANSAC algorithm adopts a fixed threshold, which can affect the accuracy of the global rotation matrix. To improve the calculation accuracy, a threshold optimization random sample consensus algorithm (TO-RANSAC) is proposed. The TO-RANSAC algorithm can adjust the threshold adaptively, which prevents errors caused by different thresholds in the 3D reconstruction results.
The global rotation matrices are calculated from the relative rotation matrices through a least-squares optimization algorithm, as shown in formula (15), where $R_{ij}$ is a known relative rotation matrix and $R_i$ and $R_j$ are the two global rotation matrices to be calculated. Before the global rotation matrices are calculated, the wrong relationships between two views need to be removed. Then, the global rotation matrices can be calculated with formula (15).
This paper proposes a TO-RANSAC algorithm to remove the wrong relationships between two views in the 3D reconstructed model. TO-RANSAC is a combination of the RANSAC algorithm and the threshold optimization concept. The use of different threshold parameters in the traditional RANSAC affects the algorithm results. To avoid this problem, the TO-RANSAC algorithm determines whether the model is reliable on the basis of the NFA (number of false alarms) value [26]. Generally, the smaller the value of NFA, the more reliable the model is. The NFA is calculated with formula (16), where $M$ is the calculated model parameter, $k$ is the number of assumed correct samples, $n_0$ is the number of possible models, $n$ is the total number of samples, $n_s$ is the minimum number of samples used to generate the model $M$, $l_k(M)$ is the $k$-th smallest error for the model $M$, and $\alpha_0$ is the probability that the random error is 1.
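The way an NFA criterion replaces a fixed threshold can be sketched as follows. The source formula is not available here, so this example assumes the standard a-contrario form NFA(M, k) = n_0 (n - n_s) C(n, k) C(k, n_s) (l_k(M) alpha_0)^(k - n_s), which is consistent with the symbols defined above but is not confirmed by the source. The function names, the toy error values, and the choice alpha_0 = 1/180 (errors measured in degrees) are likewise assumptions.

```python
import math


def log10_binomial(n, k):
    """log10 of the binomial coefficient C(n, k), computed with log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)) / math.log(10)


def best_inlier_count(errors, n_s, n_0, alpha_0):
    """Return (k, log10 NFA, threshold) for the k that minimizes the assumed NFA."""
    sorted_errors = sorted(errors)
    n = len(sorted_errors)
    best = None
    for k in range(n_s + 1, n + 1):
        err_k = max(sorted_errors[k - 1], 1e-12)  # k-th smallest error, guarded against log(0)
        log_nfa = (
            math.log10(n_0)
            + math.log10(n - n_s)
            + log10_binomial(n, k)
            + log10_binomial(k, n_s)
            + (k - n_s) * math.log10(err_k * alpha_0)
        )
        if best is None or log_nfa < best[1]:
            best = (k, log_nfa, err_k)
    return best


# Toy angular errors in degrees: eight consistent edges and four gross outliers.
edge_errors = [0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.3, 0.4, 20.0, 35.0, 50.0, 60.0]
k_best, log_nfa_best, threshold = best_inlier_count(edge_errors, n_s=3, n_0=1, alpha_0=1 / 180)
```

For this toy input the minimum is reached at k = 8, so the threshold settles at 0.4 degrees, the largest error among the consistent edges, without being specified in advance.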
The flow chart of the TO-RANSAC algorithm is shown in Fig. 3. The TO-RANSAC algorithm consists of five steps, which are expressed below:
Step 1: Determine the sampling times $N$ with formula (17):
$$N = \frac{\log(1 - p)}{\log(1 - e^{q})} \tag{17}$$
where $p$ is the confidence value, which was set to $p = 0.99$; $q$ is the minimum number of samples required to calculate the model, which was set to $q = 3$; and $e$ is the interior point (inlier) rate, which was set to $e = 0.95$.
Step 2: Calculate the initial global rotation matrix. Formula (16) is used, where $M$ represents the initial global rotation matrix, which is calculated by the random spanning tree; $n$ is the number of all two-view relationships, and $n_s$ is the number of edges on the random spanning tree.
Step 3: Calculate the errors for the remaining edges and sort the edges by the magnitude of the error. The error was calculated as the angle difference between the relative rotation matrix and the global rotation matrix with formula (18), in which $D(a, b)$ is the angle between the vectors $a$ and $b$.
Step 4: Calculate the value of $\mathrm{NFA}(M, k)$ and update its minimum value. If $N > 0$, the sampling times $N$ are reduced by 1 and the algorithm returns to Step 2; otherwise, the algorithm proceeds to Step 5.
Step 5: Select the edge set that minimizes the value of $\mathrm{NFA}(M, k)$ as the set of correct two-view relationships.
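Two ingredients of these steps can be sketched in Python: the number of sampling rounds in Step 1 and a per-edge angular error for Step 3. Both parts rest on assumptions. The sampling count uses the standard RANSAC expression N = log(1 - p) / log(1 - e^q), rounded up, and the edge error assumes the convention R_ij = R_j R_i^T and measures the rotation angle of the residual; the source's own formula (18) may be defined differently. All function names are hypothetical.

```python
import math


def sampling_times(p, e, q):
    """Standard RANSAC round count for confidence p, inlier rate e and sample size q."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - e ** q))


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rotation_angle_deg(r):
    """Rotation angle of a 3x3 rotation matrix, from its trace."""
    cos_theta = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_theta))


def edge_error_deg(r_ij, r_i, r_j):
    """Angle between the measured R_ij and the rotation predicted by R_j R_i^T (assumed convention)."""
    predicted = matmul(r_j, transpose(r_i))
    residual = matmul(transpose(r_ij), predicted)
    return rotation_angle_deg(residual)


def rot_z(angle_deg):
    """Rotation about the z axis, used only to build a toy example."""
    t = math.radians(angle_deg)
    return [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]


n_rounds = sampling_times(p=0.99, e=0.95, q=3)
# Global rotations of 10 and 40 degrees imply a 30 degree relative rotation;
# a measured relative rotation of 31 degrees is therefore off by 1 degree.
error = edge_error_deg(rot_z(31.0), rot_z(10.0), rot_z(40.0))
```

With the settings quoted in Step 1 (p = 0.99, e = 0.95, q = 3) this expression gives 3 rounds after rounding up.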

Experiments
To evaluate the performance of our approach, we collected 3000 real-time images of children from a kindergarten. There were 100 children aged 2-6 years, including 50 female students and 50 male students, and 30 real-time images in JPG format were collected for each student. To evaluate the effect of the 3D reconstruction method for motion-blurred images, we carried out the simulation in three parts. First, simulated noise images and blurred images were generated on a ThinkPad S3-490 computer [27] with MATLAB 2018b. Second, the BF-WGAN algorithm was implemented in Python and run on a GeForce RTX 2080Ti GPU. Third, the TO-RANSAC algorithm was run on the deblurred images on the ThinkPad S3-490 computer with MATLAB 2018b.
The children were aged from 2-6 years, and one student of each age was selected as an example. Fig. 4 shows the original sharp images of five students from five different angles. The first child was a boy who was 2 years old, and his height was 80.3 cm. The second child was a girl who was 3 years old, and her height was 92.4 cm. The third child was a girl who was 4 years old, and her height was 101.7 cm. The fourth child was a girl who was 5 years old, and her height was 112.3 cm. The fifth child was a boy who was 6 years old, and his height was 123.1 cm. The size of the original sharp images was 512 × 512 pixels.

Generation of Simulated Noise and Blurred Images
We chose the images of a 2-year-old boy and a 4-year-old girl to simulate the experiment. For the generation of simulated noise and blurred images, we mainly considered two aspects: the image noise parameters and motion blur parameters.

Image Noise Parameters
Gaussian noise is a common type of noise that occurs with camera shaking [28]. MATLAB provides the imnoise function, which adds noise to an image. We used the imnoise function to add Gaussian noise to the image. Fig. 5 shows the Gaussian noise images with variances $V = 0.01$, $V = 0.008$ and $V = 0.04$.

Motion Blur Parameters
We used the MATLAB fspecial function to blur the image and mainly considered two aspects: the blur angle and the blur amplitude. For the blur angle, the blur amplitude was set to 15 pixels, and the blur angles studied were 30°, 45°, and 60°. Fig. 6 shows the generated images of the two students with different blur angles and blur amplitudes. Fig. 7 shows the image restoration results for a noisy, blurred image with Gaussian noise variance $V = 0.01$, a blur amplitude of 15 pixels and a blur angle of 45°. Fig. 7 shows that the BF-WGAN algorithm effectively removes the noise and restores the fine texture-related details.
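A motion-blur kernel parameterized by blur amplitude and blur angle can be approximated in Python as shown below. This is a rough sketch and not a reimplementation of MATLAB's kernel: it rasterizes a line segment with nearest-pixel rounding instead of anti-aliased weights, and the function name `motion_kernel` and the oversampling factor are assumptions.

```python
import math


def motion_kernel(length, angle_deg):
    """Normalized line-shaped kernel of the given length (pixels) and counterclockwise angle."""
    size = length if length % 2 == 1 else length + 1  # odd size so the kernel has a center pixel
    center = size // 2
    kernel = [[0.0] * size for _ in range(size)]
    theta = math.radians(angle_deg)
    samples = 4 * length  # oversample the segment so that no pixel along it is skipped
    for s in range(samples + 1):
        t = (s / samples - 0.5) * (length - 1)
        col = int(round(center + t * math.cos(theta)))
        row = int(round(center - t * math.sin(theta)))  # image rows grow downward
        kernel[row][col] += 1.0
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


# Kernels for a 15-pixel blur amplitude at two of the angles studied above.
horizontal = motion_kernel(15, 0.0)
diagonal = motion_kernel(15, 45.0)
```

Convolving a sharp image with such a kernel produces the directional smear examined in this subsection; the kernel sums to one, so overall brightness is preserved.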

Quantitative Evaluation

① Time Contrast Experiment
For the time contrast experiment of image deblurring, the images of a 2-year-old boy and a 4-year-old girl were selected. The experiment was repeated 3 times for each group, and then, the average value of three measurements was used for analysis.

② Accuracy Contrast Experiment
We adopt the peak signal-to-noise ratio (PSNR) [29,30] to measure the accuracy of image deblurring. For the blurred image of the 2-year-old boy, the images had Gaussian noise variance $V = 0.008$, a blur amplitude of 20 pixels, and a blur angle of 60°. Figs. 10 and 11 show the PSNR results of the blurred image for the five algorithms. The PSNR value of our BF-WGAN is higher than those of the other four representative algorithms, and it yields a better restoration effect.
Fig. 12 shows the 3D reconstruction results for the 2-year-old boy. According to the 3D reconstruction results, the height, shoulder width and head width of the 2-year-old boy were 79.1 cm, 25.3 cm and 14.2 cm, respectively. Compared with the actual measured data of the 2-year-old boy, the differences in the height, shoulder width and head width were 1.2 cm, 0.7 cm and 0.5 cm, respectively. Therefore, the TO-RANSAC 3D reconstruction algorithm presents a reasonable reconstruction effect.
Fig. 13 shows the 3D reconstruction results for the 4-year-old girl. According to the 3D reconstruction results, the height, shoulder width and head width of the 4-year-old girl were 102.5 cm, 28.7 cm and 17.2 cm, respectively. Compared with the actual measured data of the 4-year-old girl, the differences in the height, shoulder width and head width were 0.8 cm, 0.6 cm and 0.4 cm, respectively. Therefore, the TO-RANSAC 3D reconstruction algorithm also presents a reasonable reconstruction effect.
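For reference, the PSNR used in the accuracy contrast experiment can be computed as in the following minimal sketch. It uses the usual definition, 10 log10(MAX^2 / MSE), and assumes 8-bit intensities (`max_value=255`); the function name and the toy pixel values are illustrative only and are not taken from the experiments.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D lists."""
    squared_errors = [
        (a - b) ** 2
        for row_ref, row_res in zip(reference, restored)
        for a, b in zip(row_ref, row_res)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images: two pixels are off by 10 grey levels, so the MSE is 50.
sharp_patch = [[100.0, 100.0], [100.0, 100.0]]
restored_patch = [[110.0, 90.0], [100.0, 100.0]]
score = psnr(sharp_patch, restored_patch)
```

A higher PSNR means the restored image is closer to the sharp reference, which is the sense in which the algorithms are ranked above.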

Performance of 3D Reconstruction
The TO-RANSAC and RANSAC algorithms were used to remove the wrong two-view relationships. For the 2-year-old boy and 4-year-old girl, Tabs. 2 and 3 show the comparison of the wrong edges removed with the TO-RANSAC and RANSAC algorithms. The threshold parameter of the RANSAC was set to 1°, and the TO-RANSAC algorithm used the adaptive threshold parameters. The second column in the table shows the number of wrong edges removed after using the TO-RANSAC and RANSAC algorithms. The third column shows the percentage of wrong edges removed to the total number of edges. Compared with the RANSAC algorithm, the TO-RANSAC algorithm preserves more relationships between two views in the 3D reconstructed model.
For the 2-year-old boy and 4-year-old girl, Tabs. 4 and 5 show the comparison of the 3D reconstruction results determined with the TO-RANSAC and RANSAC algorithms. The 3D reconstruction results of the TO-RANSAC algorithm were better than those of the RANSAC algorithm. The RANSAC algorithm used a fixed threshold, which can affect the accuracy of the global rotation matrix, whereas the TO-RANSAC algorithm obtained more 3D points and exhibited higher accuracy. The two algorithms required almost the same amount of time to run, which indicates that the TO-RANSAC algorithm is stable.

Conclusion
"Intelligent" solutions, which use effective critical thinking procedures to restore the sharp image, are essential for solving the blurring problem. First, we propose a BF-WGAN algorithm to remove motion blur from images, which combines the BF denoising theory with a WGAN. In this algorithm, the bilateral filter denoising algorithm is used to remove the noise and retain the details of the blurred image. Then, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss functions, which allows the fine texture-related details to be restored and the high-precision contours of the images to be revealed. Second, we use the deblurred images generated by the BF-WGAN algorithm to perform 3D reconstruction. The TO-RANSAC algorithm is proposed, which can remove the wrong relationships between two views in the 3D reconstructed models relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm can adjust the threshold adaptively, which improves the accuracy of the 3D reconstruction results. The experimental results show that our BF-WGAN has a better deblurring effect and higher efficiency than other representative algorithms. In addition, the TO-RANSAC 3D reconstruction algorithm yields a calculation accuracy considerably higher than that of the traditional RANSAC algorithm.
In summary, deep learning is significant for successfully executing image deblurring tasks. Effective deep learning algorithms can help yield more accurate 3D data, which can be used to measure individuals' height and shape quickly and accurately. The wide use of these intelligent systems is due to their intelligent decision-making algorithms and techniques. However, deep learning in intelligent systems can slow down the entire computing process, and there may be significant performance pressure on the processing and evaluation of images. To overcome these limitations in accuracy and computational time, future work needs to combine an effective deep learning image processing algorithm with an efficient data processing architecture.