Interlaboratory comparison of femur surface reconstruction from CT data compared to reference optical 3D scan

The present study contrasts the accuracy of different reconstructed models with distinctive segmentation methods performed by various experts. Seven research groups reconstructed nine 3D models of one human femur based on an acquired CT image using their own computational methods. As a reference model for accuracy assessment, a 3D surface scan of the human femur was created using an optical measuring system. Prior to comparison, the femur was divided into four areas; “neck and greater trochanter”, “proximal metaphysis”, “diaphysis”, and “distal metaphysis”. The deviation analysis was carried out in GEOMAGIC studio v.2013 software. The results revealed that the highest deviation errors occurred in “neck and greater trochanter” area and “proximal metaphysis” area with RMSE of 0.84 and 0.83 mm respectively. In conclusion, this study shows that the average deviation of reconstructed models prepared by experts with various methods, skills and software from the surface 3D scan is lower than 0.79 mm, which is not a significant discrepancy.

Digital imaging and communication in medicine (DICOM) file formats must be converted to a format readable by CAD or finite element software for further 3D analyses, e.g. the stereolithography (STL) format. The first milestone of construction workflow of image-based biomechanical analysis is accurate segmentation (extracting desired bone geometry from medical image data) from source data [8][9][10]. Therefore, FE analysis for the aforementioned purposes requires accurate 3D shape representation of bones. This is fundamental to obtain segmented bone data as authentic as possible to the patient's morphology. Since there are many commercial segmentation software packages and algorithms, The STL models segmented with commercially available software packages may have discrepancies compared to the actual bone, and the accuracy of the segmented bone could then vary based on the segmentation method and operator's skills. Hence, the accuracy assessment of 3D reconstructed bone based on medical images has been recently investigated extensively [8,[11][12][13][14][15][16][17][18][19][20][21][22][23][24][25]. Yet, it is concerned that such segmented models do not represent the accuracy of the original bone in a FE model. Therefore, the purpose of this study is to investigate the effect of using different segmentation methodologies conducted by experts with different experiences and skills through a round robin test including seven biomechanics laboratories. The deviation of segmented bone models from the cadaver bone geometry obtained by optical 3D scanning was investigated and the discrepancies between reconstructed models from CT images and the model obtained from optical 3D scanning was quantified.

Methods
Seven different research groups were invited to participate in this interlaboratory comparison. The results of this research were anonymized and only the principal investigator had access to non-anonymized data. The laboratories were randomly numbered from 1 to 7. In case that more than one model was created by a laboratory, the first and second model of each laboratory was named by additional letters A and B, respectively. Figure 1 presents a flowchart showing the research methodology applied in this study. It briefly shows how the reference STL file and also different laboratories STL file were created.

CT scan acquiring
The CT image of the right femur of a 58 year-old male cadaver was acquired from Trauma Center Murnau with a SOMATOM Definition AS + CT scanner (Siemens AG, Erlangen, Germany). This unique CT image was used by all seven laboratories for segmentation. Soft tissues were removed from the bone prior to the scan. The CT image was saved as DICOM with resolution of 0.29 × 0.29 mm and slice thickness of 0.6 mm for deviation analysis. To reduce partial volume effects, the femur was scanned in a water bath. In order to achieve a calibrated scan, a bone mineral density phantom was scanned in the same setup.

Optical 3D digitization
The outer geometry of the femur was scanned using an optical measuring system at the Fraunhofer Application Centre of Large Structures in Production Engineering (AGP) in Rostock (Fig. 2). The ATOS series of industrial optical 3D scanners provide accurate scans with detailed resolution at high speeds (GOM-Gesellschaft für Optische Messtechnik mbH, Braunschweig, Germany). Instead of measuring single points, ATOS captures an object's full surface geometry and primitives precisely in a dense point cloud or polygon mesh. The 3D scanner consists of two cameras and a projection system in which the projector projects special stripe light patterns on a surface of an object, being recognized by the cameras. Every scan produces a point cloud with up to 4 Billion points. Several scans from different points of view can be registered by special reference points with a defined diameter, which will be automatically detected by the software. Table 1 presents the specifications of the ATOS scanner for scanning the femur. Reconstruction of 3D models DICOM files obtained from the CT scan were used to segment the surface of femurs by four different segmentation software packages: AMIRA ® (FEI Visualization Sciences Group, Oregon, USA), Mimics ® (Materialise N.V., Leuven, Belgium), YaDiv (Welfenlab, Leibniz Universität Hannover, Hannover, Germany) [26], and Fiji Life-Line [27]. The bony regions were labeled in all slices of the CT images based on the certain values of Hounsfield units (HU) for bones. The process of labelling of bony structures is a visual and subjective procedure in which a primary HU value for bones was selected from the literature which is around 200-250 up to 3000 [28][29][30][31]. The automated segmentation of the bone started by thresholds of HU and was followed by manually editing the slices to obtain more accurate surfaces [28][29][30][31]. A triangulated surface of the femurs was created with the segmentation software using a semi-automatic method. Removal of the holes and sharp edges which were formed due to semi-automated segmentation was implemented in the above mentioned software [31][32][33][34]. Software information, methods and duration of the segmentation process for each model are tabulated in Table 2. The reconstruction process of CT image was performed by researchers with 3-5 years of experience in segmentation.

Deviation analysis
Stereolithography (STL) files were collected from all project partners and imported into GEOMAGIC studio v.2013 (Raindrop Geomagic, NC, USA) for deviation analysis. Thereby, the researcher conducting the analysis was blinded towards the participant's identity in  order to avoid bias. Prior to comparison, the femur was divided into 4 areas: "neck and greater trochanter" area, "diaphysis", "proximal metaphysis" and "distal metaphysis". Five different planes were defined in global coordinates to divide all models into above-mentioned areas. Proximal end, upper proximal metaphysis, lower proximal metaphysis, distal metaphysis and distal end were the predefined planes for splitting the models into four aforementioned parts [35]. Figure 3 illustrates the predefined cutting planes of the femur. The Neck area includes "neck and greater trochanter" and the "proximal metaphysis" contains the area of lesser trochanter. The "diaphysis" defined as long bone known as the femur shaft and the last part excludes the epiphysis named "distal metaphysis".

Results
Seven laboratories prepared nine reconstructed models out of one single human femur CT scan. The outer surface of the femur was scanned with high point resolution using an optical 3D scanner. The results of 3D deviation analysis for the four femoral parts were processed in GEOMAGIC studio and compared to the other models. Table 3 presents the average deviation values of the 9 different segmented models for all four predefined areas of the femur. The highest deviation was observed in "neck and greater trochanter" area with RMSE of 0.84. The negative values for the estimated percentage error of the surface areas represent the deviation of the underestimated areas and the positive values show the overestimated areas (see Table 3). Root means square error (RMSE) was used as a standard statistical metric for comparison and evaluation of simulation models performance [36,37]. Figure 4 also illustrates the visual deviation using color-coded map to show the differences of each model compared to the bone optical 3D scan. Figure 5 illustrates the estimated surface areas of the nine segmented models and the bone optical 3D scan. "Diaphysis" and "neck and greater trochanter" areas have the largest percentage errors of outer surface area with 2.92 and 2.57% respectively. This figure indicates that the outer surface areas of the reconstructed models are not exceedingly different from the reference model.

Discussion
Finite element models are commonly used based on specific geometrical characteristics extracted from medical imaging data. This study presents a deviation analysis to evaluate different segmentation methods based on CT scan compared to optical 3D surface scan of the same bone. Thereby, the reconstruction results of seven different biomechanics laboratories were compared to evaluate how human skills, methods of segmentation, and different software packages can cause imprecision in image-based reconstructed models.
This study investigated a variety of conditions, which may have influenced the accuracy of the segmentation process. The segmentations of "neck and greater trochanter" area and "proximal metaphysis" showed the greatest deviations with RMSE of 0.84 and 0.83 mm respectively (Table 3). Thevenot et al. [38] reported the accuracy of a novel method for automatically reconstructed 3D model from 2D hip radiograph and Verim et al. [39] evaluated the reconstructed proximal femur from different images of different devices. They both found that the greatest error happened in the trochanter area which is in good agreement with our results. Vaananen et al. [40] assessed the 3D shape of proximal femur using two different methods; shape template and bone mineral density image. They also found out that the maximum discrepancies are in trochanter area, ranging from 0.7 to 2.6 mm. Our results also showed the similar range of discrepancies. Schumann et al. [41] used clinically relevant morphometric parameters measurement of the proximal femur to examine the accuracy of their reconstructed method. In their study, the highest average deviations were also observed in trochanter area. Rathnayaka et al. [42] conducted a study to compare the accuracy of MRI and CT reconstructed 3D models where they also estimated the highest deviations was observed in the "neck and greater trochanter" region. The highest deviations, usually observed in the "neck and greater trochanter" area, are probably due to geometrical complications exist in this area. In the current study, the highest estimated discrepancy from the reconstructed models is 0.79 mm. The previous studies of Glaude et al. [11,43] on accuracy assessment of reconstructed models based on medical images, suggest that the mean 3D deviation of reconstructed models should be in the range of 1 mm. Since clinical hip fractures commonly occur in the neck area [38], more accuracy in reconstruction of this area is required to have more precise FE analysis results. Furthermore, as observed in Fig. 5, there was no outlier in the accuracy assessment comparison, and all the reconstructed 3D models have similar range of deviations. However, if peak discrepancies are observed, they can be simply disregarded because they are local. The outer surface areas of the reconstructed models providing the surface meshes for FE analysis were also estimated in this study. Highest errors of the outer surface area were observed in "diaphysis" and "neck and greater trochanter" regions. This is also illustrated in Fig. 4 using color-coded map to show that "diaphysis" and "neck and greater trochanter" regions have the highest surface discrepancies compare to the real bone 3D optical scan. Therefore, these two regions are the most critical regions for reconstruction of 3D models based on medical images and should be processed carefully. The results also suggest, that the quality of the image segmentation is rather independent of reconstruction processing software. The differences observed in segmentation times can be associated with either individual investigator speed of segmentation or the usability of the software. For future works, the effect of the negligible discrepancies on the FE analysis results will be examined using a controlled load case in an experimental setup.

Conclusion
This study shows that the average deviation of CT based models, prepared by experts with different skills using various software packages, from a bone surface scan is very low. This reveals that the effect of human expertise and use of different software packages and corresponding methodologies have a negligible effect on the accuracy of the reconstruction procedure from medical images. Therefore, image-based reconstructed models are reliable to use in FE models for clinical applications.