Automated analysis of rabbit knee calcified cartilage morphology using micro‐computed tomography and deep learning

Abstract Structural dynamics of calcified cartilage (CC) are poorly understood. Conventionally, CC structure is analyzed using histological sections. Micro‐computed tomography (µCT) allows for three‐dimensional (3D) imaging of mineralized tissues; however, the segmentation between bone and mineralized cartilage is challenging. Here, we present state‐of‐the‐art deep learning segmentation for µCT images to assess 3D CC morphology. The sample includes 16 knees from 12 New Zealand White rabbits dissected into osteochondral samples from six anatomical regions: lateral and medial femoral condyles, lateral and medial tibial plateaus, femoral groove, and patella (n = 96). The samples were imaged with µCT and processed for conventional histology. Manually segmented CC from the images was used to train segmentation models with different encoder–decoder architectures. The models with the greatest out‐of‐fold evaluation Dice score were selected. CC thickness was compared across 24 regions, co‐registered between the imaging modalities using Pearson correlation and Bland–Altman analyses. Finally, the anatomical CC thickness variation was assessed via a Linear Mixed Model analysis. The best segmentation models yielded average Dice of 0.891 and 0.807 for histology and µCT segmentation, respectively. The correlation between the co‐registered regions was strong (r = 0.897, bias = 21.9 µm, standard deviation = 21.5 µm). Finally, both methods could separate the CC thickness between the patella, femoral, and tibial regions (p < 0.001). As a conclusion, the proposed µCT analysis allows for ex vivo 3D assessment of CC morphology. We demonstrated the biomedical relevance of the method by quantifying CC thickness in different anatomical regions with a varying mean thickness. CC was thickest in the patella and thinnest in the tibial plateau. Our method is relatively straightforward to implement into standard µCT analysis pipelines, allowing the analysis of CC morphology. In future research, µCT imaging might be preferable to histology, especially when analyzing dynamic changes in cartilage mineralization. It could also provide further understanding of 3D morphological changes that may occur in mineralized cartilage, such as thickening of the subchondral plate in osteoarthritis and other joint diseases.


| INTRODUC TI ON
Calcified cartilage (CC) is a mineralized tissue delineated from the noncalcified articular cartilage by the tidemark, and from the subchondral bone by the cement line (Madry et al., 2010). The CC has an important role in anchoring the articular cartilage to the subchondral bone via individual collagen fibrils (Sophia Fox et al., 2009). For healthy conditions, the relative CC thickness (CC.Th) to the total cartilage is nearly constant, but the CC volume relative to the total cartilage volume varies and has been shown to range from 3.23% to 8.8% (Müller-Gerbl et al., 1987). Blood vessels from the subchondral bone extend into the CC layer, providing nutrients to the local chondrocytes (Madry et al., 2010). Furthermore, based on the current literature, CC is a dynamic tissue undergoing changes with mechanical loading, ageing, and joint pathology, e.g. osteoarthritis (Hoemann et al., 2012).
The thickness of articular cartilage (Cohen et al., 1999; and subchondral bone (Milz & Putz, 1994) vary greatly in different areas of the knee joint with a high thickness in heavily loaded areas. It can be hypothesized that similar changes are present in the CC as well. An early study on CC.Th revealed regional differences within the human femoral head (Müller-Gerbl et al., 1987). Furthermore, clear regional differences in equine CC have been reported (Kim et al., 2013;Martinelli et al., 2002). By contrast, in canine knees, only minor regional differences have been found . These differences related to anatomical location could be linked to the local loading environment.
In general, exercise and loading are thought to affect the CC structure. The intensity of exercise on heavily loaded joint regions is associated with thicker CC in equine tarsi ) and carpus, even without changes in the overlying non-CC (Murray et al., 1999). An increase in the canine CC.Th was observed with high-intensity exercise (Oettmeier et al., 1992). By contrast, unloading of knees with immobilization resulted in thinner CC in canine knees . In the human knee joint, similar findings have been reported; both articular and CC are thick in load-bearing areas and thin under the menisci of the knee (Thambyah et al., 2006).
Two competing events occur in ageing CC: calcification of the deep articular cartilage via advancement of the tidemark (Havelka et al., 1984) and endochondral ossification (bone replacing CC at the cement line) (Doube et al., 2007). The latter is likely dominant as ageing accelerates the thinning of CC and increases the number of tidemarks (Doube et al., 2007;Lane & Bullough, 1980). Although CC.Th varies across humans and different animal species (Frisbie et al., 2006), similar changes in ageing CC have been found in animal models. Thinning of CC, increases in vessel invasion (Pan et al., 2012), as well as chondrocyte apoptosis (Adams & Horton, 1998) have been reported in murine CC with ageing. On the other hand, Murray et al. reported an age-related increase in CC.Th in the equine tarsometatarsal joint . Joint pathology can also induce tissue responses in the CC. Remodelling of CC (Doube et al., 2007;Lane & Bullough, 1980) occurs during OA progression, contributing to a decrease in articular cartilage thickness (Goldring & Goldring, 2007). Microfractures in the CC, subchondral bone plate, and the trabeculae lead to the formation of cysts and channels, thereby affecting the cross-talk between articular cartilage and subchondral bone (Madry et al., 2010).
Traditionally, CC imaging has been performed on images obtained from histological sections (Müller-Gerbl et al., 1987) as well as backscattered scanning electron microscopy (SEM) in equine (Doube et al., 2007) and human joints (Ferguson et al., 2003;Gupta et al., 2005). Both histology and SEM require extensive and time-consuming sample processing protocols and allow for twodimensional (2D) imaging only. Nowadays, three-dimensional (3D) volumetric reconstruction of histological (Gerstenfeld et al., 2006) and SEM images (Guo et al., 2014) is possible with serial sectioning and imaging, but the associated processing is laborious and has the potential to introduce errors.
Micro-computed tomography (µCT) has been widely used to characterize 3D morphology at the micron level, including CC (Kerckhofs et al., 2012;Mehadji et al., 2019). In contrast to histology and SEM, only minimal sample processing is required in µCT. We showed previously that µCT images of the human subchondral plate contain both the mineralized CC and the subchondral bone (Finnilä et al., 2017). Indeed, CC cannot be separated from bone with lowresolution µCT imaging but becomes visible only in high-resolution µCT images (Rytky et al., 2020). However, because of the very minor difference in mineralization between the subchondral bone and CC, it is challenging to delineate the interface between CC and subchondral bone also in high-resolution µCT imaging.
The identification of the tidemark and cement line from µCT images is often conducted manually by researchers. This is a subjective and highly time-consuming endeavour, especially for tissues with complex shapes. Deep convolutional neural networks (CNNs) have recently shown great promise for automating various segmentation problems. U-Net (Ronneberger et al., 2015) has been the most popular segmentation architecture for biomedical images in recent years, and it has also been applied to µCT data . analyzing dynamic changes in cartilage mineralization. It could also provide further understanding of 3D morphological changes that may occur in mineralized cartilage, such as thickening of the subchondral plate in osteoarthritis and other joint diseases.

K E Y W O R D S
animal models, bone, histology, osteoarthritis, segmentation However, the newly introduced Feature Pyramid Networks (FPN) allow for capturing both low-resolution global features as well as high-resolution local features at a low computational cost (Lin et al., 2017). Conventional training of CNNs is conducted by initializing the coefficients from a random distribution. An alternative training approach is transfer learning, in which the network is initialized from an existing model, often pre-trained on ImageNet dataset (J. Deng et al., 2009;Ng et al., 2015). Notably, such an approach works efficiently across domains beyond natural images (Shin et al., 2016;. For example, transfer learning from deep residual networks (He et al., 2016) has been used to classify pulmonary nodules from CT images (Nibali et al., 2017), or segment the lungs in chest X-rays (Solovyev et al., 2020).
In this study, we propose an accurate framework for automated µCT-based evaluation of the CC.Th in 3D. This requires introducing state-of-the-art deep learning architectures for CC segmentation. To demonstrate the validity of the method, we perform direct comparisons of CC.Th between µCT and conventional histology.
We utilized osteochondral samples of New Zealand White rabbits, a frequently used animal model for various musculoskeletal diseases.
Furthermore, we hypothesize that the CC.Th varies in different anatomical locations of the knee. We demonstrate the capability of our automatic framework by assessing differences in CC.Th between the different anatomical locations.

| Sample collection
Sixteen knees were collected from twelve healthy, skeletally mature female New Zealand White rabbits (strain 052 CR). Eight knees were collected from four rabbits (age: 14 months) and eight knees from eight rabbits (age: 12.5 months). Each knee was dissected and divided into six anatomical regions: lateral and medial femoral condyle, lateral and medial tibial plateau, femoral groove, and patella (n = 96, Table 1). Details on animal housing, husbandry conditions, and diet are described in a previous study (Mustonen et al., 2019). All experiments were carried out under the guidelines of the Canadian Council on Animal Care and were approved by the committee on Animal Ethics at the University of Calgary (Renewal 3 for ACC Study #AC11 0035).

| Imaging
The dissected osteochondral samples were formalin-fixed. Prior to imaging, samples were wrapped in moist paper, and placed in plastic vials (Cryo.s™) for positional stability. The samples were subsequently imaged using a desktop µCT scanner (Skyscan 1272; Bruker microCT) with a tube voltage of 50 kV, current of 200 µA, and a 0.5 mm aluminum filter. The scanning was conducted in a step of 0.2° over 360° and finally, 1800 projection images with an isotropic pixel size of 3.2 µm were obtained.
The images were reconstructed using the manufacturer's software (NRecon, version 1.7.0.4, beam hardening correction applied). A narrow window with attenuation coefficients 0.085-0.141 mm −1 was used to provide high contrast between the bone and CC. The volumes-of-interest (VOI) of all samples were selected from the central load-bearing area (VOI size = 2 mm × 2 mm × sample height). This selection reduced the µCT image stacks to a reasonable size (from ~12 GB to ~700 MB per sample) for the subsequent analysis. See Figure S1 for examples of the preprocessing steps.
After the µCT imaging, samples were prepared for histological analysis. Samples were decalcified using a standard protocol with ethylenediaminetetraacetic acid solution, paraffin-embedded, and cut into 5-µm-thick sections using a microtome (three sections from each region). The sections were stained with Masson-Goldner's trichrome for identification of the CC layer and imaged using a light microscope (Axioimager 2; Carl Zeiss MicroImaging Gmbh; control software = AxioVision; resolution = 2.56 µm). A total of 281 sections were used in this study.

| Training CC segmentation models
A total of 253 histology images were segmented manually from 87 osteochondral samples. The boundaries of the CC were drawn based on the distinct collagen staining of CC compared to the articular cartilage and subchondral bone plate. At the interface between CC and articular cartilage, the topmost tidemark was followed. The discrimination was also supported by the higher staining intensity of Aniline blue in CC. Subchondral bone has the highest Aniline blue intensity and guides the segmentation at the complex interface between CC and subchondral bone. However, narrow CC cavities (>10 pixels) and small isolated areas that are not connected to the CC layer were excluded (Figure 1a, red arrows). For the µCT segmentation, we used ResNet-18 as our base model, and also an FPN decoder, which had instance normalization as well as the spatial dropout. Briefly, the normalization reduces bias for individual features with large values, whereas dropout reduces model overfitting by zeroing random nodes of the network. This model was also trained in fourfold cross-validation but for 60 epochs due to faster convergence.
We used a combination of binary cross-entropy and soft Jaccard index as the optimization loss function. Binary cross-entropy is one of the most popular segmentation metrics and can result in stable convergence. However, Jaccard index can account for class imbalance, such as an imbalance between the CC and the surrounding tissue. To facilitate a robust segmentation model, we used several image augmentation techniques (Table S1) from the SOLT library (https://github.com/MIPT-Oulu/solt) to diversify the training data.
To assess the final segmentation performance, we calculated the loss and Dice score coefficient as an average from the evaluation folds. The selection of the encoder and decoder was done based on an ablation study (Figure 2; Figure S2).

| Model application on new images (inference)
During inference, CC was predicted for the full histology images, by combining smaller tiles with a sliding window (512 × 1024 -pixel window with 256 × 512 -pixel steps), averaging the overlapping predictions. The tiling was used to avoid memory issues on the graphical processing unit while segmenting larger areas of CC. The tiles were combined, averaging the overlapping areas and predictions from every fold. Subsequently, a threshold was applied to the prediction map by using a probability of 0.8 (a high threshold was used for the exclusion of ambiguous areas from the maps, especially for the µCT images). In the case of the µCT stacks, the inference was conducted slice-by-slice with similar tiling. The predictions were averaged from every fold as well as the coronal and sagittal planes for obtaining the final probability map.
The histology masks were post-processed by removing regions smaller than 500 pixels. This ensured the removal of small artefacts while retaining large CC regions that could be disconnected due to a fold in the histology section ( Figure S3). In the µCT post-processing, masks were subjected to a sweep operation to keep only the largest F I G U R E 1 A histological section from the rabbit femoral condyle segmented manually (a) and automatically with the neural network (b). Co-registered µCT image from the same region with manual (c) and automatic (d)

| Morphological analysis
The full analysis procedure of CC.Th is summarized in Figure 3. The thickness estimation of the CC layer was performed automatically using a Python-based implementation of the local thickness algorithm. In the 2D case, the thickness assessment relies on mask skeletonization, a Euclidean distance transformation, and finally a simple circle-fitting algorithm (Hildebrand & Rüegsegger, 1997 To further investigate the applicability of the automatic segmentation on CC.Th analysis, a 2D analysis was performed between the manual segmentations and the out-of-fold predictions of the selected models. The thickness values were averaged for each sample with multiple histology sections or µCT slices.

| Validation with histology
To compare the CC analysis between histology and µCT in 2D,

| Statistical analysis and performance evaluation
For the co-registration experiment, a two-tailed Pearson correlation and Bland-Altman analyses were conducted to compare CC.Th between the µCT and histology. The deep learning segmentation models were validated against the manual CC segmentations from µCT and histology using the Dice score. The thickness analyses using outof-fold predictions and manual segmentations were compared using

| Deep learning-based segmentation
For both imaging modalities, the quality of the deep learning model predictions against the manual annotations (out-of-fold validation) is summarized in Figure 2 and Figure S2. By comparing the four different model architectures, ResNet-34 with the U-Net decoder yielded the highest mean Dice score for histology (Dice score = 0.891), whereas ResNet-18 with FPN yielded the best performance for µCT segmentation (Dice score = 0.807). The quality of the segmentation on the full dataset was visually confirmed from virtual sections on orthogonal planes ( Figure S4).
In addition, we compared the 2D CC.Th analysis for the manual and predicted CC segmentations for both modalities ( Figure 2 bottom; Figure S5). With the selected model architecture, a high Pearson correlation was achieved between the manual and automatic CC.Th quantification from histology (r = 0.984, p < 0.001).
The correlation between predicted CC.Th and manually segmented CC.Th in µCT images was also strong, although considerably smaller (r = 0.801, p < 0.001). This correlation analysis further supported the choice for model architecture (Figure 2, bottom).

| Validation with histology
Examples of µCT images co-registered with histology are shown in Figure 4. The results of the quantitative comparisons are shown in Figure 5 (predicted CC) and Figure S6 (manual segmentation).
The automated µCT-based measurements of CC.Th had a strong

F I G U R E 3 A flowchart summarizing the present study.
Example from the femoral groove is shown with magnified insets, to highlight the similarities and differences between histology and µCT. After sample preparation, the tissue samples were imaged with µCT. Subsequently, the samples underwent histology processing, sectioning, and imaging with a light microscope. The preprocessing steps for the µCT data are illustrated in Figure S1. During the automated analysis process, the CC layer is predicted using the deep learning models, thickness analysis is conducted, and finally, quantitative parameters are estimated from the estimated thickness maps.
The obtained values were used in the validation of the methods as well as for comparison between the anatomical regions of the knee. µCT, micro-computed tomography; CC, calcified cartilage correlation (r = 0.897, p < 0.001) with a similar analysis on the coregistered histology images. Furthermore, the µCT analysis had a good agreement (bias = 21.9 µm, standard deviation = 21.5 µm) with histology, based on the Bland-Altman analysis. However, the residuals were not normally distributed, due to larger differences in the patellar region. Furthermore, one of the patella samples yielded a larger difference than the 95% limit. Manual segmentation yielded a smaller correlation (r = 0.852, p < 0.001) as well as greater bias (36.9 µm) and standard deviation (30.9 µm) than the comparison using predicted masks. This time, two patella samples resulted in a difference outside the 95% limits of agreement.

| Anatomical locations
An example of a thickness map and VOI inside a lateral plateau sample is shown in the Video S1. The differences in CC.Th based on anatomical variability are illustrated in Figure 6. According to the Linear Mixed Effects Model analysis on the histology and µCT results (Table 2), the mean CC.Th varies greatly between the studied anatomical regions (p < 0.001). The thickest CC was in the patellar region, whereas the thinnest CC was in the tibial regions (lateral and medial plateau). The histology analysis allowed for further separation of the lateral and medial femoral condyles (p = 0.026). Although the absolute differences in CC.Th were larger using histology analysis than with the µCT approach, the µCT results had a smaller variance for individual regions than that observed with histology, allowing for separation of the anatomical locations.

| DISCUSS ION
Morphological analysis of CC may reveal novel understanding of musculoskeletal physiology and pathology. A suitable tool for structural analysis of CC would be µCT; however, the separation between bone and CC is extremely challenging. In this study, we developed a µCT-based framework for 3D analysis of CC morphology. The framework utilizes state-of-the-art deep learning segmentation for automated analysis of CC.Th. Finally, we compared CC morphology on different locations within the healthy rabbit knees. Our results demonstrate that CC.Th can be quantified not only from histology but also from µCT, which is feasible and efficient due to an automatic segmentation approach. The proposed method enables studying the 3D morphology of the mineralized CC without the time-consuming and destructive histological processing and with minimal user-induced bias.
Our results revealed that different CNN architectures were best suited for CC segmentation from histology and µCT images ( Figure 2; Figure S2). The FPN decoder is computationally more efficient, but it introduces an up-sampling layer for the model output. As

F I G U R E 4
Examples from the co-registered histology slices and µCT images. Scalebar for 200 µm is shown in the top left. The CC can be assessed using both imaging modalities, although the thinnest CC areas are not visible in the µCT images. Likely, these areas have a similar level of mineralization as the subchondral bone. The histology preparation could cause swelling of the tissue. This likely causes the largest proportional differences on the patella, which has a thick CC layer. µCT, micro-computed tomography; CC, calcified cartilage a result, U-Net provides more detailed predictions because the CC is predicted without a subsequent interpolation. The results show that the U-Net decoder provided a slight advantage for segmenting the more complex CC structures in histology images. In the µCT images, such details are not visible, and FPN decoder yielded better results than the U-Net one. Encoder-wise, the deeper ResNet-34 might yield even better performance than the ResNet-18 encoder (He et al., 2016). However, the ResNet-18 encoder with fewer layers than ResNet-34 performed better on the µCT data than ResNet-34.
Thus, we suspect that the more complex ResNet-34 may overfit when images become ambiguous, as in the case of the µCT images.
The automated CC segmentation performed particularly well for the histology samples ( Figure S5). A relatively high Dice score coefficient (0.891) and similar CC.Th results compared with the manual annotations (r = 0.984) suggest that the automated and manual methods give virtually identical results. For the µCT data, the performance was weaker than for the histology data (Dice = 0.807, r = 0.801). However, the segmentation of CC from the µCT images is much more difficult than segmentation from histology slides.
Therefore, this result was expected. Based on our experience, there is also a significant variation in manual CC segmentation between human annotators. However, when comparing the estimated 2D CC.Th between histology and µCT for co-registered regions, there was strong agreement (r = 0.897). Although not explicitly shown in this study, we note that the proposed segmentation method generalizes fairly well, and could potentially be used to predict CC structures in diseased samples. This is supported by our initial experiments on osteoarthritic CC, and we aim to characterize both healthy and diseased CC in the future. Furthermore, one could also include manual annotations of diseased structures in the training data to increase segmentation performance.
We have previously shown that the subchondral bone plate imaged with µCT contains also the CC layer (Finnilä et al., 2017).
Consequently, automated labelling of the CC layer could identify the true subchondral bone tissue accurately. The proposed method requires high-resolution for resolving the mineralized cartilage. We believe that this is of high interest for studies that focus on the subtle changes in the bone plate, such as thinning due to increased remodelling. Such thinning of the bone plate has been suggested to occur already in the early stages of OA (Burr & Gallant, 2012).
The CC.Th measured from histology was on average 21.9 µm thicker compared with µCT, with highest differences on thickest regions such as the patella ( Figure 5) with bone has previously been reported using X-ray diffraction (Rey et al., 1991;Zhang et al., 2012). However, multiple research groups have reported a higher mineralization of CC in backscattered electron imaging (Burr, 2004;Ferguson et al., 2003;Gupta et al., 2005) and Raman microscopy (Das Gupta et al., 2020)  Interestingly, CC.Th depends greatly on anatomical location, as identified with both imaging methods. This is also consistent with our hypothesis. In the patellar region, CC.  (Dougherty & Kunzelmann, 2007). Fifth, the segmentation models might require fine-tuning to data acquired from a different microscope or µCT scanner to ensure sufficient performance on new samples. In the future, more detailed CC structure could potentially be extracted by combining the presented approach with contrast-enhancement (Kerckhofs et al., 2018;Nieminen et al., 2015) and/or imaging with devices capable of submicron resolution (Akhter et al., 2017). Finally, the proposed histology segmentation does not account for multiple tidemarks.
Some evidence for tidemark duplication was found in few samples especially on medial femoral condyle, but the duplicated tidemarks were only faintly highlighted. We believe that the lack of duplicate tidemarks might be mainly due to the fact that we studied healthy rabbits but could also be caused by the properties of the chosen histological stain.
In conclusion, we have presented a promising method for the morphological analysis of CC with µCT. To the best of our knowledge, this is the first automated method for quantitative 3D analysis of CC.Th that has been sufficiently validated against the histological gold standard. It is a relatively simple extension to current µCT pipelines that allow 3D analysis of CC morphology. As a proof of concept, we could detect anatomical variation in the rabbit knee; the patellar region has the thickest CC and the tibial plateau region the thinnest.
This structural difference between regions is presumably related to the diverse biomechanical environments, and thus the different requirements of the joint surfaces in different regions of the knee.
Combined with other bone analysis, µCT imaging could provide an efficient alternative to histology when studying dynamic processes of the osteochondral junction, such as the tidemark advancement or bone plate remodelling.

ACK N OWLED G M ENTS
The full source code of the project is openly available on our re- ***p < 0.001.