Optimization of 3D Point Clouds of Oilseed Rape Plants Based on Time-of-Flight Cameras

Three-dimensional (3D) structure is an important morphological trait of plants for describing their growth and biotic/abiotic stress responses. Various methods have been developed for obtaining 3D plant data, but data quality and equipment costs are the main factors limiting their development. Here, we propose a method to improve the quality of 3D plant data using the time-of-flight (TOF) camera Kinect V2. A k-dimensional (k-d) tree was applied to spatial topological relationships for searching points. Background noise points were then removed with a minimum oriented bounding box (MOBB) combined with a pass-through filter, while outliers and flying pixel points were removed based on viewpoints and surface normals. After being smoothed with a bilateral filter, the 3D plant data were registered and meshed. We adjusted the mesh patches to eliminate layered points. The results showed that neighboring patches became closer after optimization: the average distance between the patches was 1.88 × 10⁻³ m, and the average angle was 17.64°, which were 54.97% and 48.33% of those values before optimization. The proposed method performed better in reducing noise and the local layered-points phenomenon, and it could help to more accurately determine 3D structure parameters from point clouds and mesh models.


Introduction
With the increasing demand for accelerating plant breeding and improving crop management efficiency, it is necessary to measure various phenotypic traits of plants in a high-throughput and accurate manner [1]. The fast development of advanced sensors and automation and computation tools further promotes the capability and throughput of plant-phenotyping techniques, which allows the nondestructive measurement of complex plant parameters or traits [2]. Plant three-dimensional (3D) morphological structure is an important descriptive trait of plant growth and development, as well as biotic/abiotic stress responses [3]. 3D plant phenotyping has great potential for multiscale analyses of the 3D morphological structures of plant organs, individuals and canopies; for building functional-structural plant models (FSPM) [4]; for evaluating the performance of different genotypes in adaptation to the environment; for predicting yield potential [5]; and for providing key technical support for the accurate management of breeding and crop production.
Different 3D sensors and imaging techniques have been developed to quantify plants' 3D morphological structural parameters at different scales. These sensors can be classified into passive and active sensors [6]. Generally, passive sensors build a 3D model from images of different views. Some systems have been developed for obtaining a 3D model, such as an RGB camera combined with a structure-from-motion (SFM) algorithm and a multiview stereo vision system [7,8]. Rose et al. [9] found that the SFM-based photogrammetric method can yield high correlations to the measurements. In a depth-image-based study, segmentation of shaded leaves was achieved using the adjacent-pixel gradient vector field of the depth image; this approach can be effectively applied to automatic fruit harvesting and other agricultural automation work. However, that work only focused on a single-frame point cloud, which led to incomplete data for the plant. Meanwhile, complete plant point clouds were more complex, with more noise and a layered-points phenomenon, which that algorithm could not handle. Andújar et al. [28] proposed reconstructing maize in the field with the Kinect Fusion algorithm. They performed segmentation of maize, weeds and soil using height and RGB information and studied the correlation between volume and biomass. The correlation coefficient between maize biomass and volume was 0.77, while that between weed volume and biomass was 0.83. These correlation coefficients were not as high as those in Paulus's study [19], because the complex field environment and the complexity of the plants produced rough, low-quality point clouds, and no point-cloud optimization was performed. Wang et al. [29] measured the height of sorghum in the field using five different sensors and established digital elevation models. All the correlation coefficients between the values generated by the models and those measured manually were above 0.9.
They proposed that the Kinect could provide color and morphology information about plants for identification and counting. However, the data acquired by Kinect were, again, rough and noisy, and they were not suitable for the extraction of other parameters.
According to the above studies, multiple complex parameters were effectively extracted using laser scans because of the high-quality point clouds. The Kinect, by contrast, performed well in height estimation and object segmentation because these two tasks do not require high-quality data. To extract more parameters efficiently on a low-cost platform, it was necessary to obtain complete and high-quality plant 3D data using a TOF camera. However, a layered-points phenomenon appeared in plant point clouds built from multiple frames [30] because of errors from the TOF camera and the registration algorithm; this was a common problem.
To improve the quality of the plant point cloud, we proposed an optimization method to reduce the impact of noise and layered-points. A simple and low-cost platform based on Kinect was used for data acquisition, which makes the proposed method more widely applicable. In this study, we optimized the quality of single-frame point clouds by removing all types of noise while preserving the integrity of the plant data. We also eliminated the local layered-points phenomenon to improve the quality of plant point clouds registered from multiple frames.

Experimental Setup and Data Acquisition
The data used in this study were collected for one oilseed rape cultivar (Brassica napus L. cv. Zhe Da 619) in a closed indoor imaging platform, mainly comprising a Kinect V2 sensor, a turntable and a computer. The Kinect V2 (Windows version, Microsoft, Redmond, WA, USA) consisted of an RGB camera (1920 × 1080) and a near-infrared camera (512 × 424) with a near-infrared light source for acquiring color and depth data, respectively. The acquisition platform and point cloud acquisition are shown in Figure 1. As shown in Figure 1a, the Kinect V2 was about 0.75 m away from the main stem (vertical axis) of the plant, and the shooting angle was 30°. The rotary speed of the turntable was 14.4°/s for changing the plant pose, and the measured plant was placed at the center of the turntable. The computer controlled the Kinect V2 and acquired and processed the raw data. It had an Intel Core i5-4590 processor, a Windows 10 64-bit operating system and 8 GB of ECC RAM. Data processing was performed with the Point Cloud Library (PCL) and the Open3D library in Visual Studio 2013 (Professional version, Microsoft, Redmond, WA, USA). Before acquisition, the Kinect V2 camera was calibrated by Zhang's method [31], and the transformation matrix between the RGB and depth cameras was adjusted to optimize the mapping relationship between the two types of images, ensuring the consistency of the color and depth of each point (Figure 1b).
A point-cloud-processing pipeline was developed to optimize the quality of the entire plant point cloud. As shown in Figure 2, the workflow mainly comprised three steps: (1) point-cloud noise removal; (2) point-cloud smoothing; (3) registration optimization based on neighboring meshes.
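The three-step workflow can be sketched as a composition of stage functions. This is a minimal sketch with hypothetical function names; each stage is reduced to a simple stand-in for the full method described in the following sections (the stage-1 thresholds reuse the pass-through ranges reported in the Results, converted from cm to m).

```python
import numpy as np

def remove_noise(points):
    """Stage 1 stand-in: keep only points inside the working volume
    (the full method adds MOBB rotation and outlier/FPN removal)."""
    lo = np.array([-0.09, -0.25, 0.35])  # pass-through thresholds in metres
    hi = np.array([0.40, 0.50, 0.70])
    mask = np.all((points >= lo) & (points <= hi), axis=1)
    return points[mask]

def smooth_cloud(points):
    """Stage 2 stand-in for the bilateral filter (here: identity copy)."""
    return points.copy()

def register(frames):
    """Stage 3 stand-in for FPFH/ICP registration; here the frames are
    assumed to already share one coordinate system."""
    return np.vstack(frames)

def pipeline(frames):
    """Noise removal -> smoothing per frame, then registration."""
    return register([smooth_cloud(remove_noise(f)) for f in frames])
```

Each stand-in can be replaced by the corresponding method from Sections 2.1-2.3 without changing the overall structure.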

Point-Cloud Noise Removal
The point cloud acquired by Kinect V2 was generally disordered, with many noise points that would have a significant effect on the reconstruction accuracy and computation speed. The viewpoint feature and normal feature of the point cloud were used to remove the noise based on the spatial topological relationship established by the k-dimensional (k-d) tree. The spatial topological relationship was used for searching neighboring points.
There were three types of noise in the point cloud: the background noise (BN), which consisted of nontarget points away from the targets; the outlier noise (ON), which consisted of scattered points, mostly around the targets, caused by the sensor; and the flying pixel noise (FPN) from the boundaries of two objects [32]. Traditionally, the BN has mainly been eliminated with a pass-through filter, and the ON removed based on the neighboring points. The pass-through filter limited the ranges of the X, Y and Z axes and removed the points outside those ranges. FPN points arose where a pixel covered objects at different depths, so the resulting points lay far from both surfaces. The vector from an FPN point to the viewpoint was almost perpendicular to that point's normal vector. The FPN points could be removed based on these two features.

Because the central axis of the plant is not strictly perpendicular to the camera-projection direction during data acquisition, it is very difficult to eliminate the BN points while preserving the integrity of the plant by using the traditional pass-through filter alone. In this study, a combination of a pass-through filter and a minimum oriented bounding box (MOBB) was proposed. The MOBB was a cuboid that contained the object as tightly as possible, with the smallest volume in the defined coordinate system. In 2D space, assuming the camera tilt angle was θ, this coordinate system was aligned parallel to the camera coordinate system. If the data of the object (red box) had a rectangular distribution under the ideal condition, as shown in Figure 3a, the MOBB (black box) was equivalent to this rectangle. In this case, after the MOBB was rotated by β counterclockwise around Point A, the object was aligned parallel to the camera coordinate system, and θ = β. Normally, the distribution of the object was irregular (red box), as shown in Figure 3b. The relationship between the angles was calculated as below: where θ is the camera tilt angle; α, α₂ and β are the angles between the MOBB and the camera coordinate system; α₃ is the angle between the object box and the camera coordinate system; and β₂ and β₃ are the angles between the MOBB and the object box and were equivalent in value.
In this case, α and β can be obtained from the MOBB orthogonal coordinate system and the camera coordinate system. After the object was rotated by β₂ counterclockwise around Point A′, and the MOBB was then rotated by β counterclockwise around Point A, the object was aligned parallel to the camera coordinate system. The same method was applied in 3D space, in which the rotation of the object was achieved with Euler's formula. The MOBB orthogonal coordinate system was established with the center of the point cloud data as the coordinate origin and the length, width and height of the MOBB as its axes, which improved the performance of the pass-through filter.
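The box-based rotation can be illustrated in 2D. This is a sketch only: for simplicity the box orientation is estimated from the principal axes of the points, which approximates, but is not identical to, the true minimum oriented bounding box.

```python
import numpy as np

def align_with_pca_box(points_2d):
    """Estimate the oriented-box angle from the principal axes of a 2-D
    cloud (an approximation of the MOBB) and rotate the cloud so the box
    is axis-aligned, letting a plain pass-through filter work well."""
    centered = points_2d - points_2d.mean(axis=0)
    cov = np.cov(centered.T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Angle of the dominant axis relative to the x-axis (beta in Figure 3a).
    major = eigvecs[:, np.argmax(eigvals)]
    beta = np.arctan2(major[1], major[0])
    c, s = np.cos(-beta), np.sin(-beta)
    R = np.array([[c, -s], [s, c]])   # rotate by -beta to undo the tilt
    return centered @ R.T, beta
```

In 3D the same idea applies with a 3 × 3 rotation built from the Euler angles of the box axes.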
After the removal of the BN points, there were still many ON and FPN points that needed to be removed. A radius-density-based outlier filter was implemented to remove the ON points [33]. For each point p_i of the data, the filter considers both the number K and the average distance d̄(p_i) of the neighboring points within a certain radius r of the selected point. A selected point was judged to be ON if the following condition was met:

K < k or d̄(p_i) > µ + nσ

where p_j is a neighboring point of the selected point p_i, µ is the average distance between neighboring points over the cloud, σ is the standard deviation of that distance, n is the multiple of σ, and k is the defined point-number threshold.
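Under the reading above (a point is an outlier if it has too few neighbors within r, or its mean neighbor distance exceeds µ + nσ; this combination is our reconstruction of the condition, which is garbled in this copy), the filter can be sketched as follows. The default r = 2 mm, k = 30 and n = 2 follow the values reported in the Results.

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_density_outlier_filter(points, r=0.002, k=30, n=2.0):
    """Keep a point only if it has at least k neighbours within radius r
    AND its mean neighbour distance is below mu + n*sigma, where mu and
    sigma are global statistics of the per-point mean distances."""
    tree = cKDTree(points)
    neigh = tree.query_ball_point(points, r)
    counts = np.array([len(idx) - 1 for idx in neigh])  # exclude the point itself
    mean_d = np.full(len(points), np.inf)
    for i, idx in enumerate(neigh):
        others = [j for j in idx if j != i]
        if others:
            mean_d[i] = np.linalg.norm(points[others] - points[i], axis=1).mean()
    finite = mean_d[np.isfinite(mean_d)]
    mu, sigma = finite.mean(), finite.std()
    keep = (counts >= k) & (mean_d <= mu + n * sigma)
    return points[keep]
```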
As for the FPN points, they can be removed based on the angle θ between a point's normal vector and the vector from the point to the viewpoint. The overall noise-removal procedure was as follows:
(1) Establishing the spatial topological relationship of the source data using the k-d tree.
(2) Obtaining the maximum and minimum values of the three coordinate axes in the point cloud and searching for six boundary points with x_min, y_min, z_min, x_max, y_max and z_max, respectively. The radius-density-based outlier filter is used on these six points. If any of them are outliers, delete them and repeat this step; otherwise, proceed to the next step.
(3) Building up the MOBB, rotating it with Euler's formula and removing BN points using the pass-through filter.
(4) Removing ON points using the radius-density-based outlier filter for all points.
In order to evaluate the effect of noise removal, a benchmark point cloud was segmented manually in Geomagic Studio [34], and the valid-point percent (VPP) was proposed. The closer the VPP is to 100%, the fewer the non-target points.
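The VPP can be computed by matching each filtered point against the manual benchmark. This is a sketch; the nearest-neighbor distance tolerance is our assumption, since the exact matching rule is not given here.

```python
import numpy as np
from scipy.spatial import cKDTree

def valid_point_percent(filtered, benchmark, tol=1e-3):
    """Share (in percent) of points in the filtered cloud that lie within
    `tol` of the manually segmented benchmark cloud; 100% means no
    non-target points survived the filtering."""
    tree = cKDTree(benchmark)
    d, _ = tree.query(filtered)   # distance to the nearest benchmark point
    return 100.0 * np.mean(d <= tol)
```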

Point-Cloud Smoothing
The bilateral filter is a nonlinear filter used for edge-preserving smoothing [35]. Due to the wiggling error of the Kinect sensor, the fitted surface of the data acquired by the Kinect was not smooth [22]; this filter can mitigate that problem. Several 3D bilateral filters are based on the mesh model [36,37]. However, the mesh or fitted surface is easily affected by noise. Based on the neighboring points, a point-based (disordered) bilateral filter was therefore used to smooth the point cloud while preserving its edge features [33].
where p_i is the selected point and p_i^r are the neighboring points within the radius r. W_c is related to the smoothness, and σ_c is the distance factor; W_s is related to the ability to preserve features, and σ_s is the hue factor.
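One common point-cloud bilateral formulation consistent with this description displaces each point along its normal by a spatially weighted, feature-weighted average of the neighbors' offsets. This is a sketch, not necessarily the exact variant of [33]; the Gaussian weights and the normal-offset feature term are standard choices.

```python
import numpy as np
from scipy.spatial import cKDTree

def bilateral_smooth(points, normals, r=0.005, sigma_c=10.0, sigma_s=0.1):
    """Move each point along its normal by a weighted mean of the
    neighbours' offsets: W_c penalises spatial distance (sigma_c),
    W_s penalises large offsets along the normal (sigma_s), which
    preserves edges while smoothing flat regions."""
    tree = cKDTree(points)
    out = points.copy()
    for i, p in enumerate(points):
        idx = [j for j in tree.query_ball_point(p, r) if j != i]
        if not idx:
            continue
        diff = points[idx] - p
        dist = np.linalg.norm(diff, axis=1)
        h = diff @ normals[i]                      # offset along the normal
        wc = np.exp(-dist ** 2 / (2 * sigma_c ** 2))   # spatial weight W_c
        ws = np.exp(-h ** 2 / (2 * sigma_s ** 2))      # feature weight W_s
        w = wc * ws
        out[i] = p + normals[i] * (np.sum(w * h) / np.sum(w))
    return out
```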

Registration Optimization Based on Neighboring Meshes
The purpose of registration was to unify point clouds from different coordinate systems into the same coordinate system [38]. Multiple neighboring point clouds were registered into a single point cloud using fast-point-feature histograms (FPFH) for rough registration and an iterative-closest-point (ICP) algorithm for fine alignment [39]. However, local layered points could be observed after registration due to the complex refraction and reflection conditions in the interiors or on the surfaces of the leaves [30]. The accuracy of the algorithm also affected the layered-points phenomenon. These stratified leaf layers were close together, and the layered-points phenomenon could be mitigated by adjusting the positions of the related points. In the point-cloud model, there was no geometrical relationship between the points, and the topological relation supported by the k-d tree was only applicable to searching neighboring points. A mesh model based on triangular patches was more suitable for solving the issue of layered points. Three definitions were proposed to explain the triangular-patch relationships in Figure 4; the symbol △ stands for a triangular patch. Based on these three definitions, the two frames of the point cloud used for registration were meshed using the greedy-projection-triangulation algorithm. Supposing that △abc was one patch of the first-frame point cloud, △mnq was the neighboring patch of △abc in the second frame and p_mid was the median plane of these two patches, the angle α_tri and distance d_tri between them were then calculated. If sin α_tri was less than 10⁻⁶, the two patches were considered parallel; otherwise, the relation of the two patches was computed using Möller's theory [40].
In the intersecting or plane-intersecting relationships, if α_tri was larger than α_tri−min, each vertex of △mnq was projected onto p_mid, forming a new patch △m′n′q′. However, in the plane-intersecting relationship, the distance d_pro between the projected point and the original point was also considered. If d_pro was larger than the point-moving threshold d_pro−max, which meant that the two patches were not close enough, the projection operation was cancelled. Meanwhile, for the parallel condition, both △abc and △mnq were projected onto p_mid, forming two new patches, △a′b′c′ and △m′n′q′. After projection, the distance d_cen between the geometric centers of the two new patches was the basis for determining whether the projection operation was effective. If d_cen was larger than the threshold d_cen−max, which meant that the two patches were not close enough, the projection operation was cancelled. However, the retrieval of neighboring patches was based on the k-d tree and the patches' geometric centers, so in practice d_pro was always less than d_pro−max and d_cen was always less than d_cen−max. Iteration produced the best result for two frames, and incremental registration optimization merged all the frames into one. The detailed optimization algorithm is presented in Algorithm 2.
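The parallel-patch projection step can be sketched as follows (hypothetical helper names; the median plane is taken here as the plane through the midpoint of the two patch centroids with the averaged normal, which is one reasonable reading of p_mid).

```python
import numpy as np

def tri_normal(tri):
    """Unit normal of a triangle given as a (3, 3) array of vertices."""
    n = np.cross(tri[1] - tri[0], tri[2] - tri[0])
    return n / np.linalg.norm(n)

def project_onto_median_plane(tri_a, tri_b):
    """Build the median plane between two neighbouring patches and project
    both triangles' vertices onto it, pulling layered patches together."""
    na, nb = tri_normal(tri_a), tri_normal(tri_b)
    if np.dot(na, nb) < 0:           # orient the normals consistently
        nb = -nb
    n_mid = na + nb
    n_mid /= np.linalg.norm(n_mid)
    c_mid = (tri_a.mean(axis=0) + tri_b.mean(axis=0)) / 2.0
    def proj(tri):
        # Remove each vertex's offset along the median-plane normal.
        return tri - np.outer((tri - c_mid) @ n_mid, n_mid)
    return proj(tri_a), proj(tri_b)
```

For the intersecting cases, only the second-frame patch is projected, and the α_tri/d_pro thresholds described above gate the operation.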

Algorithm 2: Registration optimization based on neighboring meshes
Input: point clouds of different frames of the plant after denoising and smoothing
Output: one frame of the complete plant point cloud
Setting: global transformation matrix M_glo
(1) Registration: At the beginning, the first two frames are selected for processing. Fast global registration [41], which is more efficient than FPFH, is applied for rough registration, and ICP is applied for fine alignment, producing the temporary matrix M_temp.

M_glo = M_glo · M_temp
(2) Meshing: Greedy projection triangulation is used to form triangular patches for these two frames.
(3) Searching neighboring patches: Calculate the patches' geometric centers, obtaining two center point clouds p_c1 and p_c2. For each point in p_c1, search for the neighboring points of the selected point in p_c2 based on the k-d tree. Each center point corresponds to a patch, so the neighboring patches of patch △abc of p_c1 form a set T = {△mnq_i | i = 1, 2, 3, …}.
(4) Calculating the relationship between the patches: For each patch in set T, calculate the relationship between this patch and patch △abc. After projection, the new patch takes the place of the old one.
(5) Iteration and repetition: If α_tri is less than the minimum angle threshold α_min, or d_pro is less than the minimum distance threshold d_min, the optimization of this pair of patches is complete. Repeat Steps 3-5 for all patches in the first frame.
(6) Down-sampling: After optimization, the two frames are combined into one frame, which is set as the new first frame. Because of repeated points, down-sampling is applied to reduce the point-cloud density.
(7) Applying to all frames: Take the next frame from memory as the new second frame and repeat the above operations until all frames are used.
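The outer loop of Algorithm 2, accumulating M_glo = M_glo · M_temp over successive frames, can be sketched as follows. The `pairwise_register` callback stands in for fast global registration plus ICP (it must return a 4 × 4 transform mapping the new frame into the previous frame's coordinates), and the meshing/patch-adjustment/down-sampling steps are omitted.

```python
import numpy as np

def make_transform(R, t):
    """Assemble a 4x4 homogeneous transform from rotation R (3x3) and
    translation t (3,)."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = t
    return M

def incremental_registration(frames, pairwise_register):
    """Skeleton of Algorithm 2's outer loop: each frame is chained into the
    global coordinate system via M_glo = M_glo @ M_temp and appended to
    the growing model."""
    model = [frames[0]]
    M_glo = np.eye(4)
    prev = frames[0]
    for frame in frames[1:]:
        M_temp = pairwise_register(prev, frame)   # frame -> prev coords
        M_glo = M_glo @ M_temp                    # frame -> global coords
        homog = np.hstack([frame, np.ones((len(frame), 1))])
        model.append((M_glo @ homog.T).T[:, :3])
        # (meshing, patch adjustment and down-sampling would go here)
        prev = frame
    return np.vstack(model)
```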

Results and Discussion
The experiments were carried out on raw data obtained from 10 pots of oilseed rape. For each pot, 10 frames of point cloud data from different views covering 360° were collected, and these data were processed by the proposed method to show its performance and robustness.

Point-Cloud Noise Removal
At the beginning, there were approximately 210,000 points of raw data in each frame. Most of them were noise points, as shown in Figures 5a and 6a. According to the definition in Section 2.1.1 and function (8), the red points in Figure 5b were valid points, and the other points were noise points. The performance in removing BN points was evaluated by the VPP. Because the perpendicular requirement between the central axis of the plant and the camera-projection direction, mentioned in Section 2.1.1, was not strictly satisfied (Figure 6a), the data still retained many BN points after directly applying the pass-through filter (Figure 6c). As shown in Figure 6b, the point cloud data were rotated by the MOBB to satisfy the perpendicular requirement, and the BN points were then removed more effectively by the pass-through filter (Figure 6d). The comparison between the above two methods for removing BN points in 10 frames of the point cloud of one plant (plant 1) is presented in Table 1. In this experiment, the thresholds of the pass-through filter were (−9, 40), (−25, 50) and (35, 70) cm in the X, Y and Z directions, respectively. These thresholds preserved a more complete plant point cloud. The average VPP of the pass-through filter was 75.64%, while the average VPP of the pass-through filter based on the MOBB was 92.05%. The few valid points removed by the MOBB-based method were mostly FPN points and accounted for a small number, so the method did not degrade the quality of the point cloud. Table 2 shows the results for the 10 pots of plants with the above methods, using the average VPP (AVPP), i.e., the average of the 10 frames' VPP values for each plant. The AVPP remained at a high level, with an average value of 92.28% and a standard deviation (SD) of 2.27.
According to the VPP and AVPP, the performance and robustness of the proposed method were demonstrated by the higher average values and smaller SDs, which indicated that the proposed method performed well both on different frames of one plant's point cloud and on different plants.
After removing the BN points, there were still many ON and FPN points (Figure 7a,e). According to groups of experiments [33,42], the point cloud had good quality when r = 2 mm, K = 30, n = 2 and θ = 85°. As shown in the front-view images in Figure 7a-d, all methods performed well in removing ON points. However, in the side view, the results in Figure 7e-h indicated significant differences between the methods. In Figure 7f, the data filtered by the radius-based outlier filter still contained many ON points around the leaves. As shown in Figure 7g,h, both the radius-density-based outlier filter and the proposed method generated relatively clean data. As mentioned in Section 2.1.1, FPN points existed at the edges of leaves but differed from ON points, so the radius-density-based outlier filter could not deal with FPN points well. The proposed method achieved a better result by removing more FPN points on the boundary of the pot of the plant and more ON points outside the leaves. As presented in Table 3, the radius-density-based outlier filter removed more ON points and had a higher average noise-reduction ratio (NRR) than the radius-based outlier filter. Further, considering that the proposed method removed more FPN and ON points than the other two methods, it was reasonable that it reached an average noise-reduction ratio of 14.06%. It was noteworthy that, at locations close to the boundaries of leaves, the proposed method mistook a few boundary points for FPN points and removed them from the point cloud, which produced a large SD of the noise-reduction ratio in Table 3. For the whole plant, the proposed method showed strong performance in removing ON and FPN points with a high noise-reduction ratio.
Note: Method A is the pass-through filter method, and method B is the pass-through filter based on the MOBB. MOBB represents the minimum oriented bounding box.
VPP represents the valid-point percent. SD represents the standard deviation.
Note: Method C is the radius-based outlier filter method, method D is the radius-density-based outlier filter method and method E is the proposed method. SD represents the standard deviation.
Above all, the proposed method performed well both in different frames of point cloud data of one plant and data of different plants. The small SDs from Tables 1-4 indicated that the method had strong robustness.

Point-Cloud Smoothing
The smoothing effect of the bilateral filter mainly depends on σ_c and σ_s. The larger σ_c, the smoother the point cloud after processing; at the same time, the larger σ_s, the more point-cloud features are preserved after processing. The optimal σ_c and σ_s were determined based on the different datasets acquired in this study. As shown in Figure 8, when σ_c = 10 and σ_s = 0.1, the distribution of the normals of the points was neat, which meant that the point cloud was smooth.


Optimization of Registration Based on Neighboring Meshes
The method proposed in this study was based on neighboring meshes, so the triangulation algorithm had a certain influence on the processing results. The number of neighboring meshes processed also affected the results: if there are too many meshes, overlapping may occur. According to several sets of experiments, the optimization effect was best when α_min = 20°, d_min = 2 × the distance between the patches' geometric centers, the maximum number of iterations was 100 and the number of neighboring patches was ≤ 3. Under these conditions, 10 groups from different views covering 360° were tested, and each group had two adjacent frames of point cloud data. As shown in Table 5, the average Euclidean distance (AveEd) between parallel patches after optimization was 2.65 × 10⁻³ m, and the average angle (AveAn) between intersecting and plane-intersecting patches was 17.30°, which were 64.79% and 42.07% of those values before optimization, respectively. The smaller distance and angle indicated that the optimization method made neighboring patches from different frames of point cloud data more appressed. The SDs of the AveEd and AveAn after registration with optimization were low, which indicates that the optimization method had strong robustness. According to Table 6, the AveEd and AveAn were close to half of their values before optimization. The optimization method performed stably on different plants, with small SDs (Table 6).
Note: A is the AveEd (10⁻³ m) before registration; B is the AveAn (°) before registration; C is the AveEd (10⁻³ m) after registration without optimization; D is the AveAn (°) after registration without optimization; E is the AveEd (10⁻³ m) after registration with optimization; F is the AveAn (°) after registration with optimization. SD represents standard deviation.
From the above results, the proposed methods, including the point-cloud noise-removal method and the optimization method, proved to have good performance and strong robustness, not only on different frames of point cloud data of one plant but also on different plants. Thus, we used 80 frames of data of one plant, covering 360°, to obtain a complete plant; 80 frames ensured a small angle between adjacent frames. Comparing Figure 9a with Figure 9b, the local layered-points phenomenon improved: the leaf had three layers (in the red box of Figure 9a) before optimization, while it had only one layer (in the red box of Figure 9b) after optimization.

Efficiency
In order to obtain the point cloud data of a complete plant, we used 80 frames of point cloud data. In the tests of 10 pots of different plants, the average total time taken for the acquisition of the point cloud data of a complete plant was about 93.8 s, and the number of output plant points was about one hundred thousand. Figure 10a illustrates the time consumed by each step of the proposed method. The longest step was the registration optimization based on neighboring meshes, which accounted for 64% of the total time (Figure 10b). The calculation would be much faster if multi-thread processing were applied on a higher-specification computer.



Conclusions
The plant 3D point-cloud optimization method proposed in this paper proved to be reliable for improving the quality of the plant point cloud. The point cloud was rotated into a better pose based on the MOBB, and the background noise points were removed with a pass-through filter, which preserved more valid points. For different plants, the method kept the valid-point percent up to 92.28%, compared with 82.24% when using only the pass-through filter. Owing to the MOBB, the method was applicable to plant-point-cloud data without planar reference objects. The viewpoints and surface normals were effective in removing the outlier noise points and flying pixel noise points. In addition, we proposed applying neighboring-mesh-patch optimization during registration. After optimization, the average distance between the patches was 1.88 × 10⁻³ m, and the average angle was 17.64°, which were 54.97% and 48.33% of those values before optimization, respectively. The impact of the layered-points phenomenon was effectively reduced, and the quality of the plant data was improved. The proposed method offers the potential to obtain complete and accurate plant data and may help to promote the popularization of plant-phenotyping research with low-cost sensors.