Optimization of the failure criterion in micro-Finite Element models of the mouse tibia for the non-invasive prediction of its failure load in preclinical applications.

New treatments against osteoporosis require testing in animal models and the mouse tibia is among the most common studied anatomical sites. In vivo micro-Computed Tomography (microCT) based micro-Finite Element (microFE) models can be used for predicting the bone strength non-invasively, after proper validation against experiments. The aim of this study was to evaluate the ability of different microCT-based bone parameters and microFE models to predict tibial structural mechanical properties in compression. Twenty tibiae were scanned at 10.4 μm voxel size and subsequently tested in uniaxial compression at 0.03 mm/s until failure. Stiffness and failure load were measured from the load-displacement curves. Standard morphometric parameters were measured from the microCT images. The spatial distribution of bone mineral content (BMC) was evaluated by dividing the tibia into 40 regions. MicroFE models were generated by converting each microCT image into a voxel-based mesh with homogeneous isotropic material properties. Failure load was estimated by using different failure criteria, and the optimized parameters were selected by minimising the errors with respect to experimental measurements. Experimental and predicted stiffness were moderately correlated (R2 = 0.65, error = 14% ± 8%). Normalized failure load was best predicted by microFE models (R2 = 0.81, error = 9% ± 6%). Failure load was not correlated to the morphometric parameters and weakly correlated with some geometrical parameters (R2 < 0.37). In conclusion, microFE models can improve the current estimation of the mouse tibia structural properties and in this study an optimal failure criterion has been defined. Since it is a non-invasive method, this approach can be applied longitudinally for evaluating temporal changes in the bone strength.


Introduction
Osteoporosis and osteoarthritis are the most common chronic diseases of the musculoskeletal system. Animal models are fundamental for the development and testing of new bone physical or pharmacological interventions before clinical translation, and the mouse is among the most common models for the ability of controlling the environment, the relatively low costs, and the possibility of performing high-resolution assessment of bone and other musculoskeletal tissues (Bouxsein et al., 2010). In terms of clinical translation, bone strength is a relevant endpoint and many studies have focused on the development of tools for the accurate prediction of strength in the human femur and vertebra under relevant loading conditions (Keaveny et al., 2014;Qasim et al., 2016;Zysset et al., 2015). Nevertheless, in preclinical studies the strength of the mouse tibia is commonly measured using three-point bending tests (Jepsen et al., 2015), which presents a number of limitations. First, these tests are invasive and they can only be performed ex vivo in cross-sectional experiments, which are associated to the usage of a large number of animals. Second, three-point bending is not representative of the physiological loading conditions, which would be better replicated by compressive tests (Holguin et al., 2013). Third, this testing approach is affected by experimental artifacts, mainly due to the fact that the aspect ratio of the tibia is low and its cross-section is not constant along the longitudinal direction (Wallace et al., 2014). In order to overcome these limitations, micro-Finite Element (microFE) models can be used for predicting bone strength under compression non-invasively, if properly validated against experiments. In particular, this approach based on in vivo micro-Computed Tomography (microCT) can be utilised in a longitudinal experimental design , to dramatically reduce the usage of mice in bone research, in line with the 3Rs (replacement, reinement and reduction of the usage of animals in research) (Viceconti and Dall'ara, 2019).
MicroCT-based microFE models of the mouse tibia have been applied to study the effect of different bone interventions, including ovariectomy (Roberts et al., 2019), mechanical loading (Patel et al., 2014;Razi et al., 2015) and parathyroid hormone injections . Nevertheless, their validation against experimental data is limited. Some studies have validated the local strains predicted by microFE models against strain gauge measurements (Patel et al., 2014;Razi et al., 2015;Yang et al., 2017), which are limited to a few spatial locations over the tibia. Additionally, the application of the sensor itself may cause a local stiffening of the specimen, as shown for the mouse forearm (Begonia et al., 2017). Digital Image Correlation measurements have also been used to validate microFE strain distributions on the surface of the tibia (Pereira et al., 2015). Lastly, the local displacements of the tibia tested under compression in the elastic regime have been measured with Digital Volume Correlation and used to validate the microFE models outputs (Oliviero et al., 2018). However, the above studies mainly focused on validating the models at the local level and in the elastic range, while a comparison of the structural mechanical properties (stiffness and failure load) predictions versus experimental measurements is missing for the mouse tibia.
The predictions of structural properties from FE models have been extensively validated for different bone types , such as trabecular bone specimens (Schwiedrzik et al., 2016;Wolfram et al., 2010), human vertebral bodies (Crawford et al., 2003;Dall'ara et al., 2012;Gustafson et al., 2017), human femur Pottecher et al., 2016;Schileo et al., 2008) and human distal radius (Macneil and Boyd, 2008;Pistoia et al., 2002;Varga et al., 2011), while for mouse bones only a few validation studies have been reported. Nyman et al. (2015) evaluated the accuracy of microFE models for the prediction of the mouse vertebra strength in compression, as well as the inluence of material properties deinitions. They found good agreement between experiments and FE predictions (R 2 = 0.62-0.89), even though accuracy was dependant on the assigned material properties (Nyman et al., 2015). Varga et al. (2020) reported recently that the microFE approach could accurately predict mouse femur failure load in four-point bending tests, by comparison with experimental tests (R 2 = 0.93) (Varga et al., 2020). In these studies, the failure criterion used for mouse bones was based on outputs from linear models by assuming that bone fails when a certain number of elements are deformed beyond yield strain, adapting a rule originally deined by Pistoia et al. (2002) for the human distal radius. Nevertheless, the best failure criterion to predict the mouse tibia failure load under compression with microCT-based microFE is still unknown.
The aim of this study was to evaluate the ability of morphological and densitometric properties estimated from in vivo microCT and structural properties estimated with microFE models to predict experimentally measured structural mechanical properties of the mouse tibia when loaded in compression.

Materials and methods
An overview of the methods used in this study is presented in Fig. 1. Briely, twenty mouse tibiae were microCT scanned and subsequently tested under compression. The microCT images were used to generate specimen-speciic microFE models for the prediction of structural mechanical properties of the tibia. From the experimental curves, stiffness and failure load of each tibia were obtained. Finally, the best predictor of experimentally measured mechanical properties from morphological, densitometric and estimated mechanical properties with the microFE models was identiied with regression analyses. Fig. 1. Overview of the methods. Each tibia was dissected and the extremities were embedded in resin. A microCT scan was acquired for each tibia. Morphometric and densitometric parameters were measured from microCT images. The microCT scans were also used to generate specimen-speciic microFE models. Subsequently, each tibia was tested in compression. From the experimental curves, stiffness (S) and failure load (Fu) were measured. Mechanical properties were estimated from the models and compared to the experimental measurements. Total bone mineral content (BMC) was used to normalize the mechanical properties. Regression analyses were used to determine the ability of morphometric or densitometric parameters in predicting the experimental mechanical properties.

Sample preparation
Twenty mouse tibiae were dissected from female mice of different strains (C57BL/6J and BALB/c), ages (16 and 24 weeks) and intervention groups, collected from previous studies (Roberts et al., 2019). Details about the specimens are reported in Table 1. Samples from different groups of mice were selected in order to obtain a wide range of mechanical properties and to test the model predictions in different conditions. Groups included wild type mice (WT), ovariectomized mice (OVX, surgery performed at week 14 of age) and mice treated with parathyroid hormone (PTH, daily injections, 5 days/weeks from week 18 to 22 of age). Both left and right tibiae were included in the study.
After carefully removing the soft tissues and the ibula (cut above the tibio-ibular junction) with a scalpel, the specimens were dehydrated in air at room temperature for 1 h. The length was measured with a caliper and the extremities were embedded in resin (Technovit 4071, Kulzer, Germany) up to 10% of the total length ( Fig. 1), which facilitated the positioning in the loading machine and correspondence between loading conditions in the experiments and in the models. The alignment of each tibia was controlled with a custom-made jig that was also used for the mechanical testing. The tibiae were kept frozen at −20 • C until testing.

Scanning procedures and reconstructions
Each tibia was defrosted and rehydrated at room temperature in saline solution for 3 h. The bone was wrapped in cling ilm in order to avoid dehydration during the microCT scan. The scanning procedure applied in this study has been previously deined for in vivo applications (VivaCT 80, Scanco Medical, Bruettisellen, Switzerland; 55 kVp, 145 μA, 10.4 μm voxel size, 100 ms integration time, 32 mm ield of view, 750 projections/180 • , no frame averaging, 0.5 mm Al ilter) as a compromise between nominal radiation dose and accuracy in the measurement of bone properties (Oliviero et al., 2017). This protocol is associated with a nominal radiation dose of 256 mGy, which has minimal effects on bone properties . All images were reconstructed using software provided by the manufacturer (Scanco Medical AG) and applying a beam hardening correction based on a phantom of 1200 mg HA/cc density, which has been shown to improve the local tissue mineralization measurement (Kazakia et al., 2008). MicroCT images were used to generate specimen-speciic microFE models, as well as to measure morphometric and densitometric parameters, in order to analyze their ability to predict the mechanical properties. Interested readers are welcome to contact the corresponding author who will share the data used in this study (https://doi.org/10.15131/shef.data.131761 31).

Standard morphometric analysis
The procedure applied for morphometric analysis has been published previously (Oliviero et al., 2017 and is briely summarized here. Standard morphometric analyses of trabecular and cortical regions of interest were performed in CTAn (Bruker, Belgium). For trabecular analysis, a reference cross-section was selected, identiied as the one where the medial and lateral sides of the growth plate merged. The trabecular VOI started at an offset of 0.2 mm from the reference slice and extended 1 mm distally. Trabecular bone was contoured manually by Table 1 Properties of the tested mouse tibiae. WT = wild type, OVX = ovariectomized, PTH = treated with parathyroid hormone injections. Parameters reported: BMC = bone mineral content, TMD = tissue mineral density, BV/TV = bone volume fraction. selecting 2D regions of interest every 5 slices. A single level threshold was used for segmentation, calculated for each tibia as the average of the grey levels corresponding to the bone and background peaks in the histogram. A despeckling ilter was applied to remove 3D white (bone) regions less than 10 voxels in volume, which were attributed to non-iltered noise. Trabecular bone volume fraction (Tb.BV/TV), thickness (Tb.Th), separation (Tb.Sp) and number (Tb.N) were computed (Bouxsein et al., 2010). For cortical analysis, a 1 mm thick region was centered at the tibial midshaft. After segmentation, pores within the cortex were removed by applying a closing function (2D round kernel, 10 pixels radius). Total cross-sectional area (Tt.Ar), cortical bone area (Ct.Ar), cortical area fraction (Ct.Ar/Tt.Ar) and cortical thickness (Ct.Th) were computed (Bouxsein et al., 2010). Cortical bone analyses were not performed in the proximal region due to the lack of reproducibility in identifying the cortical contour in that porous region.

Spatial distribution of BMC
The method used for analyzing the spatial distribution of BMC (implemented using Matlab and using the raw 16 bit images) has been developed in previous studies by our group (Lu et al., 2016;Oliviero et al., 2017) and is summarized here. The attenuation coeficients acquired in the microCT images were converted into tissue mineral density (TMD) using the calibration law provided by the manufacturer of the scanner. Weekly quality checks were performed on a densitometric phantom with ive insertions (800, 400, 200, 100 and 0 mg HA/cc) in order to monitor the stability of the calibration parameters. BMC in each voxel was calculated as its TMD multiplied by the volume of the voxel. A volume of interest (VOI) was selected by excluding the portions embedded in the resin (Fig. 1). Total BMC, TMD and BV/TV were computed in the VOI. The VOI was divided into ten longitudinal sections ("1" refers to the proximal tibia, "10" refers to the distal tibia).  divided into four quadrants (anterior, posterior, medial and lateral), deined for each cross-section by two perpendicular lines containing its centroid (40 partitions in total), and BMC was calculated for each partition.

Mechanical tests in uniaxial compression
Each tibia was tested using a Bose Electroforce 3200 mechanical testing machine with a 450 N load cell. Bones were carefully positioned to align the embedding blocks to the axis of the loading machine (Fig. 1). Ten preconditioning cycles were applied at 0.042 Hz between 1 N and 4 N to achieve a steady viscoelastic state and to ensure stable boundary conditions during the test (Zhao et al., 2018). Afterwards, each bone was loaded in compression until failure at 0.03 mm/s (Holguin et al., 2013). Stiffness [N/mm] was measured from the load-displacement curve as the slope of the linear portion of the curve (Fig. 1). Failure load [N] was deined as the maximum load from the load-displacement curve (Fig. 1). Normalized failure load and stiffness were calculated by dividing the failure load and the stiffness by the total BMC in order to account for the size and overall mineralization of the specimens.

Micro-Finite Element models
In order to replicate the experimental alignment in the microFE models, each image was rigidly rotated so that the longitudinal axis corresponded to the loading direction of the testing machine. The lower surface of the embedding material was identiied from the microCT image, itted to a plane (afine_it function, Matlab) and aligned to the horizontal direction with a rigid rotation (Amira 6.0.0, FEI Visualization Sciences Group, France). After alignment, images were resampled using Lanczos interpolator (Birkhold et al., 2014). A Gaussian ilter (kernel 3 × 3 × 3, standard deviation 0.65) was applied to reduce the high frequency noise (Bouxsein et al., 2010).
The image was segmented by using a single level threshold, calculated as the average of the grey levels corresponding to the bone and background peaks in the image histogram (Christiansen, 2016). A connectivity ilter was applied in order to remove unconnected voxels (connectivity rule equal to 6 keeping plane connectivity, bwlabeln function in Matlab). A Cartesian mesh was obtained by converting each bone voxel into an 8-noded hexahedral element (Chen et al., 2017;Costa et al., 2017;Patel et al., 2014) with isotropic linear elastic material properties (Young's Modulus = 14.8 GPa, Poisson's ratio = 0.3 (Oliviero et al., 2018),). The value of the Young's modulus chosen in this study for the homogenous microFE models is also in line with the mean elastic modulus measured from nanoindentation tests in the tibia of C57BL/6J and BALB/c female mice in a similar age range (Pepe et al., 2020). Uniaxial compression was simulated by fully constraining the distal end of the tibia and applying a displacement of 0.1 mm on each node of the proximal surface along the longitudinal direction. The apparent stiffness was calculated as the sum of reaction forces at the distal surface, divided by the applied displacement. For the estimation of failure load from linear microFE models, different failure criteria were deined based on the criterion introduced by (Pistoia et al., 2002) for the human distal radius, which assumed that the bone fails when a portion of the nodes reaches a critical strain level. For the distal radius, the parametric analysis determined the optimal parameters as failure volume equal to 2% of the nodes and critical effective strain level of 7000 με (Pistoia et al., 2002). Similarly, in this study a parametric analysis was performed to determine the optimal volume of failed elements and the critical strain value for the mouse tibia, which provided the strongest correlation with the experimental results and the lowest errors. The following failure criteria were tested, based on strain analyses across the whole tibia: • The tibia fails when a portion of the nodes (2%, 4%, 6%, 8% or 10%) reaches an effective strain level of 7000 με (Pistoia et al., 2002).
• The tibia fails when a portion of the nodes (2%, 4%, 6%, 8% or 10%) reaches a critical strain, either in tension (irst principal strain of 6400 με) or in compression (third principal strain of −14420 με), based on the optimal values from the analyses above.
• The tibia fails when the median irst principal strain or third principal strain in one section (tibia divided into ten portions) or in one sector (tibia divided into ten sections and each section divided into anterior and posterior partitions; 20 portions in total) reaches 6400 με or −14420 με (optimal values for the global strain analyses), respectively. Details about the methods and the results for these criteria are reported in the Appendix A.

Statistical analysis
Linear regression analysis was used to compare the experimental and predicted mechanical properties, as well as to assess if any morphometric or densitometric parameters could predict the measured stiffness or failure load. For each parameter, the following regression parameters were reported: slope and intercept of the regression line, coeficient of determination (R 2 ), root mean square error (RMSE), percentage error (mean and standard deviation). Statistical signiicance was deined at p = 0.05. For each regression (between experimental and microFE structural properties), the two-tailed Student's t-distribution (T.DIST.2T function, Excel) was used to determine if slope and intercept of the regression line were signiicantly different from 1 and 0 respectively. Statistical signiicance was deined at p = 0.05.
Multivariate regression analyses (IBM SPSS Statistics 25) were used to investigate sets of parameters that may predict the whole bone stiffness and failure load. Two models were investigated. In the irst model, morphometric measurements were included as independent variables (trabecular parameters, cortical thickness and total crosssectional area). After checking for multicollinearity, trabecular number was excluded from the model, as it was strongly correlated with trabecular bone volume fraction (R = 0.942) and trabecular separation (R = 0.885). In the second model, BMC and second moments of area were added to the above-mentioned morphometric parameters as independent variables. Iyy and Izz were excluded from the model, being strongly correlated with total cross-sectional area (R = 0.923) and Ixx (R = 0.926), respectively.

Results
All details about the correlations between bone mechanical properties and the morphometric or densitometric properties computed from the microCT images or the predicted structural properties estimated from microFE models are reported in Appendix A. The reader is asked to refer to this appendix for p-values in case of signiicant correlations.
All microFE models took 30-40 min to solve (HPC ShARC, University of Shefield; 8 cores, memory = 32GB/core). The tibia stiffness was best predicted by microFE models, while weak or no correlations were found with other parameters. Stiffness was weakly correlated with BV/TV in section 1 (proximal tibia, R 2 = 0.20) and with second moment of area in the medio-lateral direction in section 8 (distal tibia, R 2 = 0.22). No correlation was found between stiffness and any of the other morphometric or densitometric parameters (p > 0.06). Normalized stiffness was weakly or moderately correlated with total TMD (R 2 = 0.38), total bone volume (R 2 = 0.38), BMC in the longitudinal sections (R 2 = 0.38-0.61) and sectors (R 2 = 0.26-0.60), and TMD in the longitudinal sections (R 2 = 0.28-0.48). A few weak correlations were also found for average total area in some longitudinal sections (R 2 = 0.30 for section 1 and 10), bone area (R 2 = 0.28-0.29 for sections 1, 7

Fig. 7.
Strain distributions for three specimens, for which the highest, lowest and average failure load was measured respectively. Section 1 = proximal tibia, section 10 = distal tibia. The black bar in the legend indicates the failure strain level assumed in the optimized failure criterion. and 10), second moment of area in the antero-posterior direction (R 2 = 0.37-0.46 for sections 1, 9 and 10) or in the medio-lateral direction (R 2 = 0.32 for section 8), and polar moment of inertia (R 2 = 0.20-0.36 for section 1, 9 and 10). Lastly, normalized stiffness was weakly correlated with trabecular thickness (R 2 = 0.24).
Multiple regression models did not improve the prediction of stiffness. Stiffness could not be predicted by morphometric parameters (p = 0.280), nor by adding BMC and second moments of area to the model (p = 0.116).
Similar to the results obtained for stiffness, only weak correlations were observed between failure load and a few morphometric or densitometric parameters. Failure load was weakly correlated with average bone area (R 2 = 0.25), second moment of area in the antero-posterior direction (R 2 = 0.37) and polar moment of inertia (R 2 = 0.35) calculated in section 6 (tibial diaphysis). Failure load was also weakly correlated with second moment of area in the medio-lateral direction (R 2 = 0.25) calculated in section 9 weak correlation was found for total BV/TV (R 2 = 0.33) in section 1 (proximal tibia). No correlation was found between the other morphometric or densitometric parameters and failure load (p > 0.05).
Regression analyses between the experimental measurements and microFE predictions of failure load based on different failure criteria are reported in Fig. 3, while regression analyses for normalized failure load are reported in Fig. 4. In Figs. 5 and 6, Bland-Altman plots are reported for failure load and normalized failure load respectively. Failure load calculated using the Pistoia method dramatically underestimated the experimental measurement (slope = 1.74, intercept = 17N, absolute error = 63% ± 4%, Fig. 3). Also, prediction errors were dependent on the failure load magnitude (Figs. 5 and 6). The best correlation and lowest errors with respect to the experimental measurements were obtained by using the failure criterion based on third principal strain, a critical strain level equal to −14420 με and a failure volume of 10% (R 2 = 0.48 for failure load, slope = 0.64, intercept = 16N, absolute error = 9% ± 6%; R 2 = 0.81 for normalized failure load, slope = 1.07 not signiicantly different from 1, p = 0.56, intercept = 0 N/mg not significantly different from 0, p = 0.54, absolute error = 9% ± 6%, Figs. 3 and  4). Increasing the failure strain was associated with a decrease in slope of the regression line (from 1.93 to 0.64 for third principal strain of −6180 με to −14420 με and failed volume of 10%) and average error (from 56% to 9%). Similarly, an increase in failure volume was associated with a decrease in the slope of the regression line (from 0.83 to 0.64 for third principal strain of −14420 με, Fig. 3) and in the average error (from 23% to 9%). The failure criterion based on combined irst and principal strains did not improve the estimations. Since above 96% of nodes failed in compression irst, results were similar to those obtained for third principal strain.
Strain distributions for three specimens (lowest, highest and average measured failure load) are reported in Fig. 7, as well as the average strains in each of the ten longitudinal sections. Spatial distributions of strains were similar among specimens, with peaks of compressive strain located at the postero-lateral apex and on the antero-medial surface towards the distal end of the tibia (Fig. 7).
Multiple regression models showed that failure load could not be predicted by morphometric parameters alone (p = 0.437). By adding BMC and second moments of area, the model was able to predict failure load (R 2 = 0.714, p = 0.014). Details of the regression parameters are reported in the Appendix (Table A9). It should be noticed that the only parameters that contributed signiicantly to the regression model were BMC (p = 0.012) and Ixx (p = 0.002).
Regression analyses between stiffness and failure load showed very good correlation for both experimental tests (R 2 = 0.71) and FE predictions (R 2 = 0.80) (Fig. 8).

Discussion
In this study, the best predictors of experimentally measured structural mechanical properties of the mouse tibia among morphometric and densitometric properties, measured from in vivo microCT, and structural properties predicted from microCT-based microFE models, have been investigated. Moreover, the failure criterion for microFE models of the mouse tibia in compression has been optimized for the irst time.
Structural properties measured under compression were only weakly correlated with geometrical properties. This is in contrast with what was observed for bending tests, where the experimental failure load was strongly correlated with the moment of inertia at the midshaft (R 2 = 0.84) (Varga et al., 2020), highlighting the dependence on the different loading condition (four-point bending test vs apparent compression applied in this study). Nevertheless, it is interesting to notice that the failure load measured in compression was weakly correlated with cross-sectional moments of inertia at the midshaft and below the tibio-ibular junction (R 2 = 0.25-0.37), which correspond to the locations of peak strains (Fig. 7). Structural properties were not correlated with total bone mineral content or with local tissue mineral density in the longitudinal sections, which suggests that the geometry and loading conditions are the main drivers determining the measured structural properties. However, it should be noted that BMC in a few sections (section 1 and 7) or sectors (postero-lateral sectors at the proximal tibia, or medial sector at the distal tibia) of the bone showed very good correlation with normalized failure load (R 2 = 0.77-0.79), which is similar to the predictions from microFE models (R 2 = 0.81). Similarly, failure load could be predicted (R 2 = 0.71) using a combination of densitometric (total BMC) and geometric (second moment of area along the antero-posterior axis) parameters. While this result highlights that BMC in a speciic region of the tibia, or a combination of multiple parameters, can be used to estimate its normalized failure load, it should be noted that this is valid only for this speciic loading scenario, while the microFE models can potentially provide an estimate of the mechanical properties in different loading conditions after proper validation.
The correlations found in this study between experimental and microFE predicted structural properties were generally lower compared to previous validation studies on different mouse bone structures (R 2 = 0.62-0.89 for the mouse vertebra in compression (Nyman et al., 2015), R 2 = 0.93 for the mouse femur in four-point bending (Varga et al., 2020)). This difference may be due to the simpler geometry of the vertebral body compared to the tibia and considering that in four-point bending tests the mechanical properties are mainly driven by the small portion of bone between the loading pins. Due to the higher aspect ratio and the natural curvature of the tibia, when it is nominally loaded in compression along the longitudinal direction its tissue is subjected to a complex combination of compression and bending. We cannot exclude that a small misalignment between the direction of the experimental load and the load applied in the models may have a greater impact on the overall loading scenario, compared to the compression of shorter bones (vertebral body) or the four-point bending test. In order to minimize these differences, we acquired the microCT images after embedding the tibia extremities in resin, so that the embedding material provided the reference surface for identifying the direction of the applied displacement. The model predictions improved for structural mechanical properties normalized by total bone mineral content (BMC). This suggests that the variability in the size and mineral density of the specimens could have an effect on the model accuracy by leading to a larger scatter in the model predictions, which are compensated by the BMC normalization.
The failure criterion developed by Pistoia et al. (2002) for the human distal radius had to be adapted for its application to the mouse tibia. Errors of 9% ± 6% were obtained by assuming that the tibia fails when 10% of the nodes reach a third principal strain of −14420 με. This was in line with the indings of previous phenomenological models, showing that the failure criterion had to be tuned according to the speciic bone structure and analyzed loading scenario. In compression, a criterion based on third principal strain performed better than one based on effective strain, suggesting that the majority of the nodes failed in compression. The larger failure volume (10%) indicates greater resistance of the structure in compression compared to bending. It should be noted that the predictions based on the effective strain led to large errors with regression slopes far from 1 and a proportional bias as shown by the Bland-Altman plots, highlighting that this approach is not ideal for the mouse tibia. In Varga et al. (2020), the optimized parameters for predicting the failure load of the mouse femur in four-point bending were failure volume of 3% and critical strain of 10000 με. For the mouse vertebra, it was shown that the optimized parameters were dependant on the assigned material properties (Nyman et al., 2015). Lastly, in a previous study using time-lapsed compression and digital volume correlation analyses by our group (Oliviero et al., 2018), it was shown that a failure volume of 2% and critical strain of −10300 με led to errors of 9% ± 9% for the prediction of mouse tibia failure load. This difference in the optimal failure criterion parameters was likely due to the difference in the experimental testing modality. Nevertheless, in this study the experimental and predicted normalized failure load were very well correlated (R 2 = 0.81 for failure loaded divided by the total bone mineral content) with low absolute errors (9% ± 6%). This study has some limitations. First, the microFE models used for the predictions of the mechanical properties are relatively simple, with hexahedral meshes and homogeneous material properties. Differences in local TMD and Young's modulus were not modelled and could lead to a different strain distribution and, therefore, to a different value of predicted stiffness and strength. Nevertheless, the good quantitative predictions of structural stiffness suggest that the considered elastic modulus was reasonable. Moreover, homogeneous models based on hexahedral mesh can be very useful for application in longitudinal studies, as they require minimal pre-processing, operator interactions, and reasonable computation time. Furthermore, in a previous study we have shown that these models can accurately predict the local displacements over the tibia volume in compression (Oliviero et al., 2018), conirming the models accuracy for local predictions, important for bone remodeling algorithms (Cheong et al., 2020), and normalized structural properties. Also, boundary conditions were deined by assuming that the displacement applied to the top surface of the embedding material was perfectly transmitted to the top surface of the tibia free length. While this approach should minimize the differences between experiments and computational models, any compliance in the embedding material or in the ixation device of the machine was not accounted for, which could explain lower correction for stiffness. Another limitation of this study was the relatively small range of properties of the tested tibiae. In order to increase the range in mechanical properties, tibiae from different groups of mice were included in the study. Nevertheless, a larger age range may have increased the variability in bone mechanical properties due to differences in micro-structure, nano-porosities and mineral density distribution. Lastly, while the failure criterion has been calibrated for the reported dataset, in order to generalize the indings, the developed approach should be validated in the future using an independent group of specimens, possibly extending the age range and adding other inbred mouse strains.
In conclusion, an optimal failure criterion was identiied for predicting the failure load of the mouse tibia from linear microFE models generated from in vivo microCT images. These models can be applied in longitudinal preclinical studies for the non-invasive prediction of the structural mechanical properties of the mouse tibia.

Declaration of competing interest
The authors declare that they have no known competing inancial interests or personal relationships that could have appeared to inluence the work reported in this paper.

Table A1
Regression parameters for the different failure criteria analyzed. For every regression p-value < 0.001. * indicates that the Slope was not signiicantly different from 1 (p-value>0.05). ** indicates that the Intercept was not signiicantly different from 0 (p-value>0.05). Regression parameters and absolute errors for failure load predicted using different failure criteria are reported in Table A1.

Table A2
Regression parameters for the different failure criteria analyzed. For every regression p-value < 0.001. * indicates that the Slope was not signiicantly different from 1 (p-value>0.05). ** indicates that the Intercept was not signiicantly different from 0 (p-value>0.05).  Regression parameters and absolute errors for normalized failure load predicted using different failure criteria are reported in Table A2.
Two additional failure criteria were tested, based on strains spatially localized in critical regions of the bone: • The tibia was divided into 10 longitudinal sections and it was assumed that the tibia fails when the median irst principal strain in one section reaches 6400 με, or the median third principal strain in one section reaches −14420 με (optimal values for the global strain analyses), or when the irst among the two limits in tension or compression is reached.
• The tibia was divided into 10 longitudinal sections, and in the antero-posterior direction according to the centroid of each cross-section (20 partitions in total). It was assumed that the tibia fails when the median irst principal strain in one region reaches 6400 με, or the median third principal strain in one region reaches −14420 με (optimal values for the global strain analyses), or when the irst among the two limits in tension or compression is reached.

Table A3
Regression parameters for the different failure criteria analyzed. The tibia was divided into either 10 regions of interest (10 longitudinal sections) or 20 regions of interest (10 longitudinal sections, each divided into an anterior and a posterior partition). For every regression p-value < 0.001. These failure criteria based on dividing the tibia into partitions did not improve the estimation of strength (R 2 = 0.05-0.47, slope = 0.10-0.58). Regression parameters and absolute errors are reported in Table A3.

Table A4
Regression parameters for the different failure criteria analyzed. The tibia was divided into either 10 regions of interest (10 longitudinal sections) or 20 regions of interest (10 longitudinal sections, each divided into an anterior and a posterior partition). For every regression p-value < 0.001. * indicates that the Slope was not signiicantly different from 1 (p-value>0.05). ** indicates that the Intercept was not signiicantly different from 0 (p-value>0.05). Regression parameters and errors for normalized failure load predicted by dividing the tibia into partitions are reported in Table A4.

Table A5
Regression parameters for the different predictors of stiffness. BV/TV = bone volume fraction; Iyy = second moment of area in medio-lateral direction. Regression parameters and errors for stiffness predicted using different morphometric and densitometric parameters are reported in Table A5.

Table A6
Regression parameters for the different predictors of failure load. BV/TV = bone volume fraction; Ixx = second moment of area in antero-posterior direction; Iyy = second moment of area in medio-lateral direction; Izz = second moment of area in longitudinal direction. Regression parameters and errors for failure load predicted using different morphometric and densitometric parameters are reported in Table A6.

Table A7
Regression parameters for the different predictors of normalized stiffness. Tb.Th = trabecular thickness; BMC = bone mineral content; TMD = tissue mineral density; BV = bone volume; TV = total volume; BV/TV = bone volume fraction; BMD = bone mineral density; Ixx = second moment of area in antero-posterior direction; Iyy = second moment of area in medio-lateral direction; Izz = second moment of area in longitudinal direction.  Regression parameters and errors for normalized stiffness predicted using different morphometric and densitometric parameters are reported in Table A7.

Table A8
Regression parameters for the different predictors of normalized failure load. Ct.Th = cortical thickness; BMC = bone mineral content; TMD = tissue mineral density; BV = bone volume; TV = total volume; BV/TV = bone volume fraction; BMD = bone mineral density; Ixx = second moment of area in antero-posterior direction; Iyy = second moment of area in medio-lateral direction; Izz = second moment of area in longitudinal direction.   Regression parameters and errors for normalized failure load predicted using different morphometric and densitometric parameters are reported in Table A8. Regression parameters for failure load predicted by multivariate model (R 2 = 0.714, p = 0.014) are reported in Table A9. Seven independent variables were included: trabecular bone volume fraction, thickness and separation, cortical thickness, total BMC, total cross-sectional area, second moment of area along x direction.