Phenotyping of Silique Morphology in Oilseed Rape Using Skeletonization with Hierarchical Segmentation

Silique morphology is an important trait that determines the yield output of oilseed rape (Brassica napus L.). Segmenting siliques and quantifying traits are challenging because of the complicated structure of an oilseed rape plant at the reproductive stage. This study aims to develop an accurate method in which a skeletonization algorithm was combined with the hierarchical segmentation (SHS) algorithm to separate siliques from the whole plant using 3-dimensional (3D) point clouds. We combined the L1-median skeleton with the random sample consensus for iteratively extracting skeleton points and optimized the skeleton based on information such as distance, angle, and direction from neighborhood points. Density-based spatial clustering of applications with noise and weighted unidirectional graph were used to achieve hierarchical segmentation of siliques. Using the SHS, we quantified the silique number (SN), silique length (SL), and silique volume (SV) automatically based on the geometric rules. The proposed method was tested with the oilseed rape plants at the mature stage grown in a greenhouse and field. We found that our method showed good performance in silique segmentation and phenotypic extraction with R2 values of 0.922 and 0.934 for SN and total SL, respectively. Additionally, SN, total SL, and total SV had the statistical significance of correlations with the yield of a plant, with R values of 0.935, 0.916, and 0.897, respectively. Overall, the SHS algorithm is accurate, efficient, and robust for the segmentation of siliques and extraction of silique morphological parameters, which is promising for high-throughput silique phenotyping in oilseed rape breeding.


Introduction
Oilseed rape (Brassica napus L.) as an important oil source of food [1,2] is one of the most important oil industrial crops worldwide. The yield and quality of oilseed rape are largely determined by the development of siliques, i.e., the number, length, and volume of siliques at the mature stage, because the silique is one of the important organs of an oilseed rape plant, playing a role not only in its bearing of oilseeds but also in photosynthesis and delivering developmental signals to maturing seeds [3][4][5]. Therefore, breeding a genotype with a large number of siliques and an ideal silique appearance is a prioritized goal for high-yield breeding. The number of siliques varies significantly among different genotypes and under different cultivation practices; therefore, the accurate quantification of the parameters regarding silique development is critical for yield prediction and breeding high-yielding oilseed rape varieties [6,7].
Currently, the traditional measurement of silique development parameters such as silique length (SL) and silique number (SN) depends largely on manual work, which is in vasive, timeconsuming, and inaccurate [8]. With the development of computer vision technologies, effective and image-based approaches have emerged to cope with the above problems. To date, methods have been developed to determine silique parameters based on 2-dimensional (2D) or 3-dimensional (3D) data. Of these, 2D images have been the most widely used data thus far to phenotype a plant. Liu et al. [8] cut off the siliques from the plants and placed them on the conveyor belt; then, an RGB camera was used to obtain 2D images from the topside. The silique was detected by the concave point detection segmentation method and measured by the pose estimation method. Wang et al. [9] cut and took photos of stems with and without siliques. The morphological appearances were then marked, and the angle between the silique and main stem was measured using AutoCAD software. However, because they are restricted by illumination problems and the one-sided perspective inherent to 2D imaging [10], methods based on 2D image data are less robust and difficult to use to obtain the complete spatial information of plants. The above methods are limited to obtaining simple morphological parameters and cannot be widely applied. In contrast, 3D data acquired using different 3D sensors, such as time-of-flight cameras and laser scanners [11], can represent a more detailed spatial information of the targets, ensuring more high-throughput plant phenotyping.
Clustering-based algorithms are the most widely used methods for 3D plant phenotyping. Xu et al. [12] used an RGB-D camera to obtain reconstructed point clouds of oilseed rapes. They used morphological operations (dilation and erosion) to segment stems and siliques and applied the clustering algorithm to extract the siliques. Lin et al. [13] obtained oilseed rape point clouds of high quality using a laser scanner. The region growing-based clustering algorithm with normal information was applied to segment siliques and stems. However, clustering-based methods are generally applied to plants with population scales or simple structures [14][15][16], and the performance is highly influenced by the initial value setting and tedious parameter tuning. Moreover, clustering-based algorithms have a limited understanding of the contextual information of the data, making it less reliable to analyze the topological relation between plant skeletons and plant organs.
The skeleton extraction algorithm has been proposed to tackle the above challenge in 3D plant phenotyping analysis due to its ability to depict the plant skeleton and the details of a plant structure [17]. Many researchers have applied skeleton data to analyze morphological parameters. Wu et al. [18] used Laplacian contraction to extract the skeleton of maize data by multiple-view stereo reconstruction and the nearest neighbor clustering method to separate the leaves and calculate the height, leaf length, and leaf tilt angle. Gaillard et al. [19] obtained the skeleton of sorghum by a thinning algorithm based on voxel data. On the basis of the skeleton model, they compared the classification performance between support vector machine, support vector machine with radial-based function, and multilayer perceptron. Because maize and sorghum have simple structures, these methods ignore the influence of occlusion caused by complex structures with multiple branches on skeleton extraction. For plants with complex structures such as trees, skeleton algorithms have also been widely applied. Chaudhury and Godin [20] used B-spline curve lines with an expectationmaximization algorithm to extract and optimize the skeletons of trees. Preuksakarn et al. [21] extracted the skeleton of apple trees by the space colonization algorithm. Li et al. [22] used the k-means method and clustering method to separate the branches and main trunk and consequently calculated the skeleton points by breadth-first search in the undirected graph of Toona trees and peach trees. They constructed a skeleton model fitting trees well. However, the current methods for skeleton extraction vary because of the experimental conditions and the complexity of the plant structure, and some of them require prior knowledge and labor-intensive manual optimization. In addition, there is less occlusion between tree branches, which is significantly different from oilseed rape. The current skeleton methods cannot be generalized to analyze the oilseed rape directly. Therefore, developing an accurate method for skeleton extraction and silique segmentation of oilseed rape point clouds is necessary for phenotyping studies.
In this study, we proposed a novel algorithm by skeletonization combined with hierarchical segmentation (SHS) on point clouds of a complete oilseed rape at the mature stage, which has not been reported to our knowledge. The goals were to (a) develop the skeleton model of oilseed rape, (b) achieve accurate segmentation of siliques, and (c) obtain the morphological parameters of siliques for yield prediction. Eventually, we hope to introduce a noninvasive phenotyping method with morphological parameters that could be applied to real production.

Plant materials
Eight oilseed rape (Brassica napus L.) cultivars were selected for the study, with 3 biological replicates in this study. Three cultivars (c1 to c3) were grown in pots in the greenhouse at the Zijingang Campus of Zhejiang University, Zhejiang Province, China (30°18′N, 120°6′E). Five cultivars (c4 to c8) were grown in the field at the Changxing Agricultural Experiment Station of Zhejiang University, Zhejiang Province, China (30°53′N, 119°38′E). Basically, these oilseed rapes were divided into 3 types based on the structure of the branches: the plants of few-branch broom shape (FBBS), which only had few branches, and the canopy of the plant was spread out and looked like a broom; the plants of multibranch broom shape (MBBS), which had more branches than FBBS; and the plants of multibranch cylinder shape (MBCS), which had multiple branches, and the canopy of the plant looked like a cylinder. 3D point clouds of the whole plants at the silique mature stage were acquired using a handheld 3D laser scanner. The SN, SL, and seed weight (SW) were measured. The SN was counted manually, and the SL was measured with a tape measure. The seeds of each silique were naturally dried before weighing using an electronic balance (BSA224S, Sartorius, Germany). The details of the cultivars and measured parameters are summarized in Table 1.

Point cloud data acquisition
A handheld laser scanner (PRINCE775 Inc., SCAN Tech., Hangzhou, China) with 15 red laser beams was used to harvest 3D point cloud data (Fig. 1), and the spatial resolution was 0.05 mm. The plants in pots were scanned in the laboratory, while the field-grown plants were transported to the laboratory for scanning to avoid interference from wind and other plants.
The acquisition of raw data involved continuous manual scans from different perspectives. Two blackboards with marker dots provided a coordinate reference for the scanner (Fig. 1B).
Because of the limited field of view, the acquired data covered the horizontal range of 240° (Fig. 1C).

Data processing
As we only focused on plant points for subsequent analysis, the data preprocessing was first implemented by removing the noise and pot points within the raw data (Fig. 3B). The skeleton point extraction, connection, and skeleton optimization were then performed using SHS, followed by the segmentation of siliques. Finally, silique morphological parameters such as SN, SL, and SW were extracted based on the segmentation results.
The overall workflow is shown in Fig. 2.

Preprocessing
Preprocessing of point clouds was performed to remove outlier noise points using statistical filtering [23,24]. Because the distribution of the pot points approximated an inverted frustum of a cone, the least-squares method (LSM) was employed to fit the circle [25]. On the basis of the fitting circle with the maximum radius found by LSM, we removed the points from up to the bottom inside the fitting circles. In Fig. 3B and C, O max and O 1 are the center points of the top and bottom fitting circles of the pot, respectively, while r max and r 1 are the corresponding radii. h is the distance between the top and bottom as described in Eq. 1: The fitting processing began from the bottom point P 1 (x 1 , y 1 , z min ), which had the smallest z value. The points inside the range of [z min , z min + ∆d] (∆d was the step distance) were projected into the plane xP 1 y, and the point O 1 was calculated by LSM. This procedure was then repeated in the range of [z min + (n − 1)*∆d, z min + n*∆d] to obtain the fitting circle points O n (n = 1,2,3,…) until the largest fitting circle center point O max and radius r max were found. The pot data in the range of [z min , z min + h] were then removed.

Skeleton point extraction
The L1-median point was defined as any point that minimized the sum of Euclidean distances to all points in the dataset. It had a good antinoise performance, and the L1-median point cloud could be set as the only global center of a given point set [26]. In this study, we took the L1-median point as a local skeleton point. For a given local point set J = {p i } with n points, j = 1,2,3,…,n, J ⊂ R 3 , the weighted center point of these points was computed iteratively by Eq. 2 to obtain the L1-median point p L1 .
where w j is the weight factor. As the L1-median points could present a sparse distribution where the local center points were excessively clustered together [27,28], the conditional regularization was used to shift each point into its respective geometric center position to avoid sparse distribution [18,27,28]. A repulsive force was applied between the L1-median points to keep these points at an appropriate distance from each other. The position of each skeleton point was then calculated by attractive item A(J) and repulsive item R(I) (Eq. 3). The Gaussian kernel function θ(•) and the distance function ‖•‖ were used to give the weights of neighborhood points (Eqs. 4 and 5) so that the impact of random and uneven distribution of points could be reduced. where p k+1 L1 is the selected L1-median point, and k is the number of iterations. J is the source neighborhood points of p k+1 L1 , while I is the neighborhood L1-median points of p k+1 L1 . j is the attractive factor, while k i is the repulsion factor. τ is a factor for balancing attraction and repulsion. k i is the directionality degree of p k L1 , which is used to give the repulsion direction. In each iteration, the radius of the neighborhood was increased to avoid the position deviation of the L1-median point (red dot). However, the increased radius (blue circle) could introduce interferential points (black points) from other objects, which were not in the same object as the green points and thus caused its position deflection ( Fig. 4A to C). To avoid interference of the points from different objects, the calculating range was limited by the fitting plane ∏ R (Fig. 4D). Because of the flat and long shape of a silique, we applied the random sample consensus algorithm (RANSAC) for detecting the fitting plane ∏ R [29]. Only the points whose Euclidean distance from plane ∏ R was less than the limited distance threshold T d were used to calculate the skeleton point's position. The neighborhood points were updated by Eqs. 6 and 7: where J R is the point set constrained by ∏ R from source J. I R is the point set constrained by ∏ R from I. ‖p j , ∏ R ‖ and ‖p i , ∏ R ‖ are the distances between the points and ∏ R .

Skeleton point connection
Because there were no topological relationships between discrete skeleton points, it was necessary to connect the skeleton points for silique segmentation. The local partition and growing algorithm were used to connect the skeleton points. The density-based spatial clustering of applications with noise (DBSCAN) performed well in finding various shapes [30], and it was first employed to roughly cluster skeleton points into several classes. A graph theoretical algorithm was proposed to connect the points in the same class. Here, we used the weighted unidirectional graph (WUG) to connect the skeleton points of the same class. WUG was used to solve the shortest-path problem, and it built connections between the skeleton points after giving an initial searching direction with the Euclidean distance as the weight factor. In general, the tangent vector of a skeleton point can represent the extension or searching direction. Because the skeleton changes smoothly and the position of the initial point p i within the class is random, the tangent vector function can be simplified as Eq. 8. If p i (x i , y i , z i ) is the end point, then p i and the nearest point are used to get a tangent vector. Otherwise, the nearest 2 points We set the initial direction of the initial point to be positive. For skeleton points in the same class, the WUG stores the weight factor between 2 points in the form of an adjacency list. Without correcting the searching direction, all the weights are positive, with d ij = d ji (Fig. 5A), indicating that the connection of skeleton points depends on the absolute value of distance, which may cause the wrong connection lines of skeleton points. For example, if weight d A 1 D 1 is smaller than d A 1 B 1 in sub-skeleton 1 (Fig. 5C), then A 1 would be wrongly connected with D 1 . Therefore, the searching directions of each point are corrected by previous connecting points, and some weights in the adjacency list changed to be negative (Fig. 5B). We defined the out-degree (OD) and in-degree (ID) to limit the searching direction. Given a point p, OD is the number of lines coming out of p, while ID is the number of lines coming into p. End points only have one edge, indicating OD = 1 or ID = 1. For nonend points, we limited the points to have only 2 edges, indicating OD = 1 and ID = 1. As shown in Fig. 5C, sub-skeleton 1 with yellow points and sub-skeleton 2 with green points are 2 classes divided by DBSCAN. They are adjacent in the 3D space, and the WUG searches only the points within the same class. In sub-skeleton 1, the initial point A 1 is a non-end point and only connects the nearest 2 points, B 1 and C 1 . If the initial point is an end point such as A 2 in sub-skeleton 2, then it would only connect to the nearest point B 2 , and then C 2 would be connected by B 2 .

Skeleton optimization
Some silique skeletons were still incorrectly connected because of the limited clustering performance of DBSCAN. These incorrect cases would reduce the accuracy of silique segmentation. As shown in Fig. 6A, one silique or stem is clustered into 2 classes (yellow and green). Because the skeleton changes smoothly, the distance (d e ) between end points and the angles ( 1 e , 2 e ) between end points' tangent vectors and end points' connection lines (red dotted) are small. Thus, we calculated the average distance (d sk ) between connected skeleton points and connected the 2 adjacent classes that met the following 2 requirements: (a) Both 1 e and 2 e were smaller than 15°; (b) d e was smaller than 5 × d sk . In addition, different siliques could also be clustered into the same class. Thus, we set the angles ( 1 sk , 2 sk ) between each point's 2 edges larger than 165° to divide nearby siliques into different classes; only if 1 sk and 2 sk were both larger than 165°, then the sub-skeletons were considered as the same silique class (Fig. 6B).

Silique segmentation and morphological parameter extraction
After skeletonization and optimization, each silique or stem had its own unique corresponding sub-skeleton, and the skeleton points were ordered under the given direction in the sub-skeletons. On the basis of the assumption of the normal distribution or near-normal distribution of SL [6,31,32], we used siliques with SL within the 95% confidence interval as the effective siliques (SL ef ) and counted the SN. Additionally, silique abortion was a normal phenomenon for oilseed rape [33], and the stunted siliques were tenuous or small in shape and without seeds. Thus, siliques shorter than 15 mm were removed. Here, the sub-skeleton length l {bj} was calculated as the sum of the distances of all adjacent skeleton points in this sub-skeleton {b j } (Eq. 9). Because of the influence of extreme values such as long stems, only 60% of the middle length data was used to calculate the mean value μ of the normal distribution of SL (Eq. 10). After removing aborted siliques, SL ef was defined as Eq. 11. The total SL was the sum of SL ef (Eq. 12).
(9) where m is the number of skeleton points in the sub-skeleton, and p i is a skeleton point. ‖p i − p i+1 ‖ represents the Euclidean distance between p i and p i+1 . σ is the standard deviation (SD). m ′ is the number of effective siliques. Furthermore, the classified silique points were obtained by searching the neighborhood source points of the RANSAC constraint plane, and the silique volume (SV) was calculated based on the silique points. As shown in Fig. 7, the green points are the source points of the siliques, and the red points are the skeleton points. In Fig. 7A and B, � ⃗ v is the direction of a skeleton point A, and plane α is the normal plane of A. Points B and C are the adjacent skeleton points of A (Fig. 7B and D). d 1 and d 2 are the corresponding distances. The source points inside the range of [A − d 1 /2, A + d 2 /2] were projected onto plane α (Fig.  7C). Assuming that the distribution of the projected points was rectangular, W and L were the width and length of the rectangle, and the partial volume of skeleton point A was calculated (Eq. 13). All the partial volumes were added to obtain a single silique (Eq. 14). Similarly, we also calculated the total SV by Eq. 15: where n is the skeleton point number and m ′ is the number of effective siliques.

Validation and statistical analysis
The architecture of oilseed rape was related to the branch. We denoted 3 types of branches as follows: the first branch (B 1st ), also called the main stem; the second branch (B 2nd ), grown on the first branch; and the third branch (B 3rd ), grown on the second branch [34].
To validate the performance of the SHS algorithm, the recall (Re EL ) was computed by Eq. 16, representing the proportion of estimated siliques (SN E ) to siliques counted by the laser point Additionally, to test the correlation between SHS estimations and manual measurements, the coefficient of determination (16)  (R 2 ) and the root mean squared errors (RMSE) were calculated. Because mature siliques were easily broken during the transportation of oilseed rape, measuring the length of these siliques was difficult. Therefore, we used the total SL of a single plant grown in the greenhouse to verify the accuracy of SL estimation. According to linear regression analyses, we used the correlation coefficient R to explore the relationship between SL, SN, and SV and the yield of a plant (YOP), where YOP was denoted as the total SW in siliques. The proposed algorithm was developed by Point Cloud Library [35] and Open3D Library [36] in the Visual Studio 2015 and PyCharm 2011 development platforms. The algorithm was implemented on a desktop workstation with the configuration of an Intel Core i7 processor and 16 GB of memory.

Results
The visualization of plant architecture

Siliques segmentation
We counted the lengths of sub-skeletons of the 3 plant architectures, and the sub-skeletons included stems/branches and siliques. Through manual verification, 95% of the sub-skeletons over 200 mm were long branches, which was significantly different from that of siliques (Fig. 9A). After removing the branch  sub-skeleton, the length distribution of all sub-skeletons of a single plant followed the assumption of a normal distribution or near-normal distribution (Fig. 9B). In addition, the aborted siliques without seeds were removed. Finally, we obtained the effective siliques in a 95% confidence interval of length distribution, which also followed the above assumption (Fig. 9C). At the same time, these results proved that our materials were conventional, and the results were universal. The effective siliques were rendered by different colors (Fig. 10).
On the basis of the differences in branches, we obtained 13, 6, and 5 plants of FBBS, MBBS, and MBCS, respectively. There were significant differences between these 3 plant architectures ( Because of the SHS algorithm being carried on the laser point cloud, the estimation of the SN was affected by the data quality of the point cloud and the performance of the algorithm. Therefore, the results were evaluated from these 2 perspectives. For all cultivars, the average recall (Re LM ) between SN L and SN M was at a high level of 92.23% with an SD of 8.82, indicating that most of the plants were well scanned, and the laser point cloud of siliques was complete. The average recall (Re EL ) between SN E and SN L was 90.90% with a low SD of 6.07, indicating the accurate and stable performance of the SHS algorithm. On the basis of the data of good quality and algorithm performance, the SN E was well correlated with SN M , and the average recall (Re EM ) between SN E and SN M was 84.23% with an R 2 of 0.922 (Fig. 11A). Moreover, the structure of plants affected the data quality and performance of the algorithm. The plants of FBBS and MBBS architectures were well scanned with the point cloud of high completeness, and the siliques were well-segmented, so the average Re LM s were of high value (FBBS was 94.47%; MBBS was 93.45%), as well as average Re EL s (FBBS was 93.91%; MBBS was 90.27%). In contrast, the plants of MBCS architecture had significant differences, with more second branches, third branches, and siliques. Many silique data were missing in the scanning process, causing low Re LM s, which affected the data quality and the performance of the algorithm, decreasing the average Re EL to 83.81%. Additionally, the total SL was accurately estimated with an R 2 of 0.934 (Fig. 11B)

The relationship between silique morphological parameters and the yield of a plant
The differences in plant architectures affected YOP (Fig. 12A to C). The structures of plants of FBBS and MBBS were similar, and the main difference was the branch number, so the YOP of MBBS was higher than that of FBBS. The structure of MBBS was significantly different and its yield was higher. The SN, total SL, and total SV estimated by our method were significantly positively correlated with YOP, and the R values were 0.935, 0.916, and 0.897, respectively.

Acquisition of high-quality point cloud data
Currently, the phenotypic identification related to oilseed rape yield is still dependent on manual methods, which are timeconsuming and labor-intensive, and the results are greatly influenced by subjective factors. In contrast, methods based on 3D imaging have the advantages of high automation and more objective results. Among them, the laser point cloud is accurate and reliable and has been widely applied for agricultural research, such as plant reconstruction and parameter extraction [21,22]. However, the laser point cloud scanning process is tedious. Because the plants are nonrigid objects that are irregularly  deformed by the wind, the scanning process requires a windless environment, which greatly affects the acquisition efficiency. In addition, silique breakage and falling occurring during plant transfer and silique abortion also affect the results of morphological analysis, resulting in incorrect or missing data. Therefore, the optimization of point cloud data acquisition is essential. RGB-D cameras and stereo cameras have shown potential for field-based plant phenotyping [17,37,38], but these sensors have shortcomings in accuracy and stability. It is necessary to improve the resolution and accuracy of these sensors and develop corresponding 3D imaging algorithms for the acquisition performance. To achieve accurate, efficient, and invasive yield estimation in the field, a field phenotyping platform with integrated multiple sensors needs to be further investigated.
Occlusion has an important effect on the quality of the point clouds. It is one of the main limitations of acquiring plant parameters by computer vision technology and is always accompanied by missing data. The probability of occlusion is different for plants with different structures, and complex structures are more likely to introduce the occlusion problem. There is less occlusion and more space between organs in plants with spreadout structures. In our study, MBBS and FBBS both incorporate spread-out structures. Therefore, although the plants of MBBS have more branches, the data quality was similar to that of FBBS plants. For the MBCS plants, the situation was more complicated. If the branches were clustered (Fig. 13A), then there would be serious occlusion between different organs, and the quality of the laser point cloud was poor. In contrast, if the branches were spread out (Fig. 13B), then the occlusion was not serious, and we could still obtain complete data. The Re LM of c7-1 was only 60.01%, while that of c8-1 was 94.29%. More parameters need to be applied for more detailed morphological quantitative analysis, such as the dispersion of different levels of branches. Similar problems have also been found in other studies, in which occlusion makes it challenging to segment leaves and other organs [39,40], thereby affecting the inversion of the vegetation index [41,42] and the estimation of yield [43]. The occlusion effect is an emergent problem that needs to be approached for the future application of 3D phenotyping for plant populations or field scales.

Applicability of the SHS algorithm in 3D phenotyping
The skeleton has been proven to be a piece of effective structural information for plant organ segmentation [44,45], and it has been applied to monocot plants such as sorghum and maize [44,45]. However, these methods demand the manual inclusion of prior knowledge, such as height information and initial growth points [46,47]. In addition, the plant material used in such earlier studies has been simple in structure, generally with only one main stem and no multiple branches. In contrast, oilseed rape is a dicotyledonous plant, and its structure is complex and varied. The SHS algorithm proposed in this study achieves skeleton extraction and optimization by combining the ideas of the local clustering segmentation and the growing algorithms for skeleton points and realizes the segmentation of siliques on this basis. Our method uses DBSCAN to cluster the skeleton into multiple classes and then applies the similarity of local geometric information of the same skeleton for optimization, so it can reduce the effect of other organs on its skeleton and has good performance on plants with different structures (see the Supplementary Materials). Moreover, our method requires no manual prior knowledge and is suitable for oilseed rapes of various postures. In addition, for the oilseed rapes grown in the field, there are more factors such as the interference between different plants and organs, which will be the limitation of the application of our method. It is necessary to establish the relationship model between the individual scale and the group scale to better apply our method.
In recent years, deep learning has been applied for the segmentation of plant organ point clouds with good results [48][49][50]. Compared with deep learning algorithms, SHS is driven by geometric rules rather than data, so it has no requirement of training a large amount of data. However, the methods based on geometric rules require researchers to have a comprehensive understanding of object morphology. For the structural diversity of oilseed rape, many special cases have been easy to ignore. In our study, the method is based on a common assumption that siliques are flat and long. Because of genetic and environmental effects, some siliques become short and wide, which will be treated as aborted siliques by our method [51]. For other plants, we need to adjust the method according to the morphological characteristics of their organs. For example, the rice tillers are slim and long, which are similar to oilseed rape branches, and our method can be applied directly for extracting tillers. As for the cotton boll, we need to design strategies to figure out the globular organs. Deep learning is driven by data, which makes it easier to understand geometric rules and extract features of morphological structures. Researchers' tasks are to build big datasets and design robust frameworks. It is necessary to collect data on various morphological architectures and cultivars of oilseed rape plants for further research. Nevertheless, on the basis of the performance of the SHS algorithm, it can also be applied to the task of data labeling in deep learning research.

Structural parameter relationships with oil seed yield
Seed yield is directly determined by the SN, seed number in siliques and thousand SW, and is also indirectly affected by silique traits [52][53][54][55]. For example, SL is positively correlated with seed number but poorly correlated with seed yield [52,53]. Plants differ greatly in their structures of branches and the numbers of branches, and the SN of lateral branches of a single plant has a high correlation with yield [56,57]. Moreover, many studies have shown that silique structural parameters are influenced by quantitative trait loci and the environment [6,32,51,[58][59][60][61]. In our study, we reached a similar conclusion that the environment had a great influence on the plant structure, and the structural differences further affected the yield of each plant. The greenhouse environment was controllable, but the cultivation and fertilization treatments were not optimal. Therefore, the greenhouse cultivars (c1 to c3) were not only small in SN and yield but also had few differences in structure. In contrast, there were many factors in the field environment, resulting in large differences in plant structure. There were 3 architectures of FBBS, MBBS, and MBCS in field cultivation. However, the reasonable control of fertilization and cultivation in the field resulted in a higher YOP. Moreover, the number of siliques, and branches and the levels of branches were different in different architectures. Among them, the plants of MBCS architecture had the most multiple branches, which resulted in more siliques and higher yield. The architecture of MBBS plants was similar to that of FBBS plants, but the former had more multiple branches and, correspondingly, more siliques and higher yields. Despite the influence of the environment, the traits of siliques (SN, SL, and SV) showed a high correlation with yield. Moreover, affected by environmental factors, such as high temperature, seed abortion can occur easily in effective siliques [62,63]. It is difficult to provide information on seed abortion inside siliques by noninvasive 3D methods at the plant scale, which affects the application of silique parameters (especially length and volume) for yield estimation. In addition, the breakage of mature siliques resulting in seed loss also affects yield estimation.
It is worth mentioning that, because of the limitations of traditional artificial or 2D image methods, the current research on oilseed rape mainly focuses on the relationship between SN, SL, and yield, while the correlation analysis of other phenotypic traits, such as SV, is limited. In this study, 3D imaging technology was used for the digitalization of plants, which could show the structural characteristics of plants more intuitively and provide for the possibility of quantitative analysis of plant structural phenotypes. In addition to SN, SL, and SV, we can further explore the phenotype information of oilseed rape to quantify traits such as the thickness of the silique canopy (see the Supplementary Materials), which are also parameters of particular interest to breeders [64]. Furthermore, 3D imaging of the total plant can retain the complete physical structure information of the plant, which provides possibilities for traceability and future exploration of potential traits.

Conclusions
Conventional skeleton extraction requires prior knowledge or ignores the occlusion caused by complex structure limits to simple plants, which is not suitable for obtaining silique traits of oilseed rape with multiple branches. To overcome these problems, we proposed an accurate SHS method for morphological parameter extraction of oilseed rape siliques. After removing noise and non-plant data, the point cloud was skeletonized and hierarchically segmented to determine the siliques followed by morphological trait extraction. The performance of SHS has good accuracy, adaptability, and robustness for the cultivars of oilseed rape plants with 3 different structures: FBBS, MBBS, and MBCS. The average recall (Re EL ) between SN E and SN L was 93.91%, 90.27%, and 83.81%, respectively. For all cultivars, the R 2 of SN E and ground truth was 0.922. In addition, the traits SN, total SL, and total SV of a single plant extracted by SHS were significantly correlated with YOP, and the R values were 0.935, 0.916, and 0.897, respectively. Therefore, this noninvasive method could be applied for yield estimation and phenotyping analysis and provide technical support for oilseed rape breeding. Further work should focus on enriching the 3D data of oilseed rape with a larger number of different structures and combining deep learning for research.

Data Availability
The datasets during and/or analyzed during the current study are available from the corresponding author upon reasonable request.