EasyIDP: A Python package for intermediate data processing in 3D-based plant phenotyping

Background: The use of 3D-based high-throughput phenotyping improves the efficiency of crop management and monitoring practices. The structure-from-motion and multi-view stereo photogrammetry (SfM-MVS) technique, applicable to common RGB digital cameras, has been widely used for this purpose and can be implemented by many commercial and open-source tools. Such tools generate several outputs, including the digital orthophoto map (DOM), digital surface model (DSM), and point cloud data (PCD). However, there is a gap between these outputs and the final 3D plant phenotyping. For example, calculating plant height and canopy ground cover requires the segmentation of each plot from the whole DOM, DSM, or original image. These intermediate processes are time-consuming, and to the best of our knowledge, no easy-to-use alternatives are currently available. Results: In this study, a software package called EasyIDP (easy intermediate data processor) was developed to link the products of SfM-MVS techniques with 3D-based plant phenotyping. A lotus (Nelumbo nucifera) breeding field was used to demonstrate the following points: 1) clipping (segmenting) SfM-MVS products according to a given plot boundary or region of interest (ROI); 2) transforming the ROI on the SfM-MVS products to high-quality raw images to assist in object detection; and 3) evaluating the accuracy of that transformation using manual annotation. Conclusions: The proposed intermediate data processing tool showed acceptable accuracy and potential for processing the products of SfM-MVS techniques. EasyIDP conveniently provides a bridge between SfM-MVS products and plant phenotyping.

Background

Compared with traditional manual field mensuration, which is time-consuming, labor-intensive, and subjective, recently developed 3D reconstruction techniques provide a high-throughput solution for plant phenotyping. The structure-from-motion and multi-view stereo photogrammetry (SfM-MVS) technology, which requires only a common digital camera, has been widely used in both indoor and outdoor applications. For indoor applications, good-quality point cloud data (PCD) for individual crops is generated using SfM-MVS [1,2], and several studies have focused on developing algorithms to classify or segment point clouds to calculate geometric traits [3,4]. For outdoor experimental field applications, although PCD is also an important data source for crop modeling and trait extraction [5,6,7], the digital orthophoto map (DOM) and digital surface model (DSM) are often required for plot management, to simplify the difficulties encountered when analyzing crops [8,9,10]. However, two practical problems remain. First, non-crop objects (soil, grass, etc.) in the field make data analysis and plot extraction more complicated when combined with plants. Hence, one typical data processing step is clipping those outputs according to plot sectors or regions of interest (ROIs), which simplifies plot management and also benefits time-series analyses. Second, practical field applications experience variable and complex environmental conditions, such as light and wind, making it challenging to obtain raw-image quality in SfM-MVS products (e.g., the DOM). For example, Duan et al. [13] reported that 79% of plots showed overestimation of ground cover in the DOM compared to raw images. Hence, referring directly to raw images to compensate for the loss of DOM or PCD quality would improve both accuracy and time cost.
The objective of this study was to develop an easy-to-use software package to: 1) clip SfM-MVS outputs into small parts or sectors by a given plot boundary or ROI; 2) transform the ROI on SfM-MVS products to the corresponding position on original-quality raw images; and 3) test the accuracy and performance of the previous transformation using a case study.

Image collection and 3D reconstruction

The quality of the raw images is fundamental for the SfM-MVS process as well as for the EasyIDP package. It is important to capture images with balanced brightness and exposure and minimal motion blur. Overlap of over 50% between adjacent images is also recommended. Although ground control points (GCPs), manual tie points (MTPs), or even real-time kinematic (RTK) positioning are optional for SfM-MVS software, it is strongly recommended to set up several objects, tags, or scale bars to calibrate and define geographic positions. One option is Chilitags (https://github.com/chili-epfl/chilitags), as their automatic detection workflow decreases the workload in current SfM-MVS software.

After obtaining image data of acceptable quality, the SfM-MVS workflow is applied.

Several external Python site packages are used to read these files, all of which are loaded as numpy.ndarray-based data structures for faster matrix-algebra calculations.

The *.txt pure-text files can be read by the numpy.loadtxt module. For *.jpeg images, … the projection is not the same as the geotiff projection. For 3D polygon *.dxf files, the ezdxf package will be used to convert them to numpy.ndarray polygons in the future.
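As a minimal sketch of the plain-text loading step described above (the file layout shown here, one "x y" corner per line, is an assumption for illustration, not the exact EasyIDP format):

```python
import io

import numpy as np

# Hypothetical plot-boundary file: one "x y" corner per line (units: m).
roi_txt = io.StringIO(
    "368021.5 3955478.2\n"
    "368025.1 3955478.3\n"
    "368025.0 3955482.9\n"
    "368021.4 3955482.8\n"
)

# numpy.loadtxt parses the whitespace-separated values straight into
# an (n_points, 2) float ndarray, ready for matrix-algebra operations.
roi = np.loadtxt(roi_txt)
print(roi.shape)  # (4, 2)
```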

121
After importing all these materials into the EasyIDP package, a clip function is provided to clip both point cloud data and geotiff GIS maps into small sectors. … is used to clip pixels out of the raw DOM. After that, the left-corner offset of the clipped boundary is calculated according to the DOM pixel resolution to obtain the geo-header of the clipped sector, which is finally saved to a geotiff file using the tifffile package.
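The clipping and offset computation described above can be sketched as follows. This is an illustrative sketch, not the EasyIDP API: the function name, argument layout, and the assumption that row 0 of the array is the northern edge are all hypothetical.

```python
import numpy as np

def clip_dom(dom, geo_origin, pixel_size, roi_geo):
    """Clip a DOM array by a geographic ROI (illustrative sketch).

    dom        : (H, W, bands) ndarray; row 0 is assumed to be the
                 northern edge of the map
    geo_origin : (x_min, y_max) of the DOM's top-left corner (m)
    pixel_size : ground sampling distance (m / pixel)
    roi_geo    : (n, 2) ndarray of ROI corners in geographic coords

    Returns the clipped sector and the geographic coordinate of its
    top-left corner (the "geo-header" offset for the new geotiff).
    """
    x0, y0 = geo_origin
    # Convert geographic ROI corners to pixel (col, row) coordinates.
    cols = (roi_geo[:, 0] - x0) / pixel_size
    rows = (y0 - roi_geo[:, 1]) / pixel_size
    c0, c1 = int(np.floor(cols.min())), int(np.ceil(cols.max()))
    r0, r1 = int(np.floor(rows.min())), int(np.ceil(rows.max()))
    sector = dom[r0:r1, c0:c1]
    # Left-corner offset converted back to geographic units; this is
    # what the new geotiff header needs before saving with tifffile.
    new_origin = (x0 + c0 * pixel_size, y0 - r0 * pixel_size)
    return sector, new_origin
```

For example, clipping a 100 x 100 px DOM with a 1 m GSD by a 20 m x 30 m ROI yields a 30-row by 20-column sector together with its shifted origin.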

Geometry from real world to image pixel

In the geometry between the image and the real world, there are three coordinate systems. The first is the real-world geographic coordinate (xyz_geo, unit: m); the second is the offset camera coordinate (xyz_cam), which puts the camera position at the origin (0, 0, 0) and uses the camera medial axis as the z-axis; and the last is the raw-image pixel coordinate (xy_pix, unit: pixel).
When using the camera location P_i as the coordinate origin (Figure 3.b), a real-world point is first translated and rotated into the camera coordinate (xyz_cam) and then projected onto the image plane (Figure 3.c):

    x_pix = c_x + f * x_cam / z_cam
    y_pix = c_y + f * y_cam / z_cam

where c_x and c_y are the image center in pixels (Figure 3.b) along the width (horizontal) and length (vertical) directions, and f is the focal length in pixels.
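The two-step geometry above, world-to-camera transform followed by pinhole projection, can be sketched in a few lines. The function name and the dict-free argument layout are illustrative assumptions; R stands for the world-to-camera rotation matrix, which the text implies but does not name.

```python
import numpy as np

def world_to_pixel(p_geo, cam_pos, R, f, cx, cy):
    """Project a real-world point to raw-image pixel coordinates
    (sketch of the pinhole model described in the text).

    p_geo   : real-world point (xyz_geo, m)
    cam_pos : camera position P_i in world coordinates (m)
    R       : 3x3 world-to-camera rotation matrix
    f       : focal length in pixels
    cx, cy  : image center in pixels
    """
    # 1) real world -> camera coordinate: translate so the camera
    #    position becomes the origin, then rotate so the camera's
    #    medial axis becomes the z-axis.
    p_cam = R @ (np.asarray(p_geo, dtype=float) - np.asarray(cam_pos, dtype=float))
    x, y, z = p_cam
    # 2) camera coordinate -> pixel coordinate (perspective division).
    return cx + f * x / z, cy + f * y / z
```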

153
Camera distortion correction

The image shown in Figure 3.c is an ideal undistorted image. However, real lenses introduce distortion that must be corrected.

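A Brown-model distortion sketch in normalized image coordinates is shown below. The coefficient layout, three radial terms (r1, r2, r3) and two tangential terms (t1, t2), follows the Pix4D-style parameter names used in this section; the function name is hypothetical, and note that this is the forward (undistorted to distorted) mapping, whose inverse is usually obtained by fixed-point iteration.

```python
def brown_distort(xu, yu, r_coef, t_coef):
    """Apply Brown-model lens distortion to a normalized image
    point (xu, yu) (illustrative sketch, not the EasyIDP API).

    r_coef : radial coefficients (r1, r2, r3)
    t_coef : tangential coefficients (t1, t2)
    """
    r1, r2, r3 = r_coef
    t1, t2 = t_coef
    rho2 = xu * xu + yu * yu  # squared radial distance from center
    # Radial term: polynomial in rho^2.
    radial = 1.0 + r1 * rho2 + r2 * rho2**2 + r3 * rho2**3
    # Tangential (decentering) terms.
    xd = xu * radial + 2.0 * t1 * xu * yu + t2 * (rho2 + 2.0 * xu * xu)
    yd = yu * radial + t1 * (rho2 + 2.0 * yu * yu) + 2.0 * t2 * xu * yu
    return xd, yd
```

With all coefficients set to zero the mapping reduces to the identity, which is a convenient sanity check.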
The transform from the original distorted image pixel coordinates (x_d, y_d) (known) to the corrected undistorted pixel coordinates (x_u, y_u) is described in [15]. The camera model (the camera distortion coefficients) is loaded from the SfM-MVS software outputs; for Pix4D, the radial distortion coefficients (r_1, r_2, r_3) and tangential distortion coefficients (t_1, t_2) are given in the calibrated_camera_parameters.txt file. The (x_hd, y_hd) is the corrected result of … (Equation 2).

Transforming the ROI from the raw image to another raw image

The geometry of transforming an ROI from one image to another is shown in Figure 3.a. Transforming a polygon can be simplified to a repeatable transformation of the polygon corner points. Assume a point K (value unknown) in the real-world coordinate (xyz_geo) is recorded as k_1 and k_2 on images raw_1 and raw_2, respectively, where raw_1 is the image on which the ROI is marked, raw_2 is the target image to which we want the ROI transformed, and k_1 (value known) is one ROI polygon corner. Because the projection of a 3D real-world point to a 2D pixel point can only be inverted up to an equation with Z'_1 as a parameter, without specifying the exact value of Z'_1 the previous step produces a line, rather than a specific point, in the real world as well as on the other raw image. The ideal and most accurate method is to mark the same ROI on two images (so that k_1 and k_2 are both known) and then calculate their intersection to obtain the exact Z value of K_r, followed by the K_r coordinate on the third image. In this study, the height value of the camera was used as Z'_1 directly; however, our case-study pre-experiment results (Figure S3, Additional file 3) showed that this assumption still needs some improvement to make the accuracy acceptable.
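The corner-by-corner transfer described above, back-projecting k_1 along its viewing ray, fixing the free parameter Z'_1 with an assumed elevation, and re-projecting into raw_2, can be sketched as follows. The function name and the dict layout of the camera parameters are hypothetical; the geometry (pixel to ray, ray-plane intersection, pinhole re-projection) is the construction the text describes.

```python
import numpy as np

def transfer_roi_point(k1_pix, cam1, cam2, z_guess):
    """Transfer one ROI corner from raw_1 to raw_2 (illustrative sketch).

    cam1 / cam2 : hypothetical dicts holding each image's pose and
                  intrinsics: position "pos" (world, m), world-to-camera
                  rotation "R", focal length "f" (pixels), and principal
                  point ("cx", "cy").
    z_guess     : value chosen for the free parameter Z'_1 (the text
                  uses the camera height directly).
    """
    u, v = k1_pix
    # 1) pixel -> viewing ray in camera-1 coordinates.
    ray_cam = np.array([(u - cam1["cx"]) / cam1["f"],
                        (v - cam1["cy"]) / cam1["f"],
                        1.0])
    # 2) rotate the ray into world coordinates.
    ray_geo = cam1["R"].T @ ray_cam
    # 3) intersect the ray with the plane z = z_guess to pin down K;
    #    without z_guess this would stay a line, not a point.
    s = (z_guess - cam1["pos"][2]) / ray_geo[2]
    K = cam1["pos"] + s * ray_geo
    # 4) project K into raw_2 with the pinhole model.
    p = cam2["R"] @ (K - cam2["pos"])
    return (cam2["cx"] + cam2["f"] * p[0] / p[2],
            cam2["cy"] + cam2["f"] * p[1] / p[2])
```

For two nadir-looking cameras at the same height, a point at the image center of raw_1 shifts in raw_2 by an amount proportional to the camera baseline, which matches the parallax intuition behind the method.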

Future prospects
The EasyIDP package is currently only a pre-release as it is still under construction.