Digital volume correlation can be used to estimate local strains in natural and augmented vertebrae: An organ-level study
Journal of Biomechanics

on each specimen should always be performed before any in-situ micro-CT testing campaign. This study clearly shows that, when sufficient care is dedicated to preliminary methodological work, different DVC computation approaches allow measuring the strain with a reduced overall error (approximately 200 microstrain). Therefore, DVC is a viable technique to investigate strain in the elastic regime in natural and augmented bones. © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Digital Volume Correlation (DVC) has been used to explore the full-field displacement and strain distribution inside specimens from 3D images (Bay et al., 1999; Grassi and Isaksson, 2015; Roberts et al., 2014). Since the introduction of DVC, several studies have been performed to evaluate its reliability (measurement error). As no other experimental method allows measuring internal displacements and strains, validation experiments must be designed where the field of displacement and/or strain is known a priori. DVC is extremely powerful in measuring displacements (overall error of 1/50 to 1/10 of the voxel size (Bay et al., 1999; Dall'Ara et al., 2014; Freddi et al., 2015; Palanca et al., 2015; Tozzi et al., 2016a)). Conversely, DVC-computed strains are affected by significant errors. Tests in a zero-strain condition have been performed, from the tissue-level (trabecular or cortical bone (Bay et al., 1999; Dall'Ara et al., 2014; Gillard et al., 2014; Liu and Morgan, 2007; Palanca et al., 2015; Zhu et al., 2015)), to the organ-level (vertebral bodies (Hardisty and Whyne, 2009; Hussein et al., 2012)). Depending on the tissue type under investigation and on the voxel size of the input images, the accuracy of strain measurements can range between 300 and 794 microstrain, and the precision between 69 and 630 microstrain (Roberts et al., 2014). All these studies showed how the performance of DVC depends on the natural texture of the specimen (i.e. histomorphometric parameters in trabecular bone), and how DVC is suitable to examine the pre- and post-yield deformation in bone (Liu and Morgan, 2007; Tozzi et al., 2016b).
The above-mentioned studies provided deep basic knowledge about the reliability and main benefits/limitations of DVC applied to bone, but gave no information about the variability of such errors between specimens. In fact, in those studies the DVC uncertainties were evaluated using only one (Bay et al., 1999; Dall'Ara et al., 2014; Gillard et al., 2014; Palanca et al., 2015; Zauel et al., 2006) or two (Liu and Morgan, 2007) specimens.
Bay et al. (1999) were probably the first to assess the variability of errors between different trabecular bone cores. Later, Liu and Morgan (2007) evaluated more bone types, considering the intrinsic variability in different biological tissues (2 specimens for each type).
Another open issue relates to the reliability of DVC in bones interdigitated with biomaterials as opposed to natural bones. In fact, vertebroplasty has become increasingly popular to treat and/or prevent osteoporotic vertebral fractures (Wilcox, 2004). Vertebroplasty requires the injection of bone cement inside the vertebral body, through a cannula. Due to the potential clinical implications in investigating augmented bone, the reliability of DVC on such composite structures must be investigated.
To the authors' knowledge, a systematic comparison of the output of two different DVC approaches (i.e. local and global), at the organ-level, on specimens including different materials such as an augmented vertebra, and including inter-specimen variability, is currently missing.
The aims of this work were therefore to compare the output of a local and a global DVC approach on a stationary test, and specifically: (i) to quantify the reliability (in terms of systematic and random error) of DVC when applied to natural and augmented bones; (ii) to investigate the spatial distribution of the errors, and the presence of any preferential direction; (iii) to assess the variability between different specimens. In order to achieve these aims, zero-strain tests were performed on porcine natural and augmented vertebrae.

Specimens and images
Ten thoracic vertebrae were collected from six fresh porcine spines, obtained from the alimentary chain. Soft tissues, intervertebral disks and growth plates were removed. A sample of five vertebrae was used for augmentation (hereafter referred to as "augmented"). Acrylic vertebroplasty cement (Mendec Spine, Tecres, Italy) was injected in the vertebral body with its proprietary device, until the cement started leaking (typically approximately 1 ml of cement). The cement contained BaSO4 pellets (average size: 300 μm) to increase radiopacity. To facilitate cement injection and curing, the vertebrae were heated, before and after augmentation, in a circulating bath at 40°C (Ye et al., 2007). Another sample of five vertebrae was left untreated (hereafter referred to as "natural"). Sampling was arranged so that the augmented and natural samples were well distributed within the thoracic spine segment (T1-T4), in order to avoid potential effects related to morphology. The posterior processes were removed for both samples. To allow consistent alignment inside the micro-CT, the extremities of each vertebra were potted in poly-methylmethacrylate (PMMA) with a dedicated positioning device (Danesi et al., 2014).
In order to evaluate the reliability of DVC approaches, each specimen was scanned twice without any repositioning, in a zero-strain condition, similarly to Palanca et al. (2015). Micro-CT (XTH225, Nikon Metrology, UK) scans had an isotropic voxel size of 39 μm, and were performed with the following settings: voltage 88 kV; current 110-115 μA; exposure 2 s; rotation step 0.23°; total rotation 360°. The specimens were placed in the environmental chamber of a loading device (CT5000, Deben Ltd, UK) and immersed in saline solution, in order to closely simulate in situ loading conditions. Two volumes of interest (VOIs, Fig. 1) were cropped from each reconstructed 3D image (MeVisLab, MeVis Medical Solutions AG, http://www.mevislab.de/):
VOI-0 contained the whole vertebral body, including the thin cortical shell and the interface between the bone and the surrounding saline solution. VOI-0 was a parallelepiped circumscribing the contour of the vertebral body in the transversal plane, including 432 slices. This region was analyzed to study how the strain error changes through the vertebra, the vertebral body edge and the surrounding interface.
VOI-1 was inside the vertebral body. VOI-1 was a parallelepiped of 300 × 300 × 432 voxels inscribed inside the vertebra (consistent for all specimens). VOI-1 was analyzed to quantify the error only inside the vertebrae.
In order to allow comparison between the results obtained from other DVC approaches, the image datasets used in the present study will be made available to the scientific community at https://dx.doi.org/10.6084/m9.figshare.4062351.v1 or by contacting the corresponding author.

Local vs. global approach
Two DVC software packages, using either a local or a global approach, were compared in this work, similarly to Palanca et al. (2015). The local approach is implemented in a commercial package (DaVis 8.2.1, LaVision, Germany), later referred to as "DaVis-DC". The global approach is a combination of a home-written elastic registration software, ShIRT (Sheffield Image Registration Toolkit) (Barber and Hose, 2005; Barber et al., 2007; Khodabakhshi et al., 2013), and a Finite Element (FE) software package (Ansys v.14.0, ANSYS, Inc., Canonsburg, PA), later referred to as "ShIRT-FE" (Dall'Ara et al., 2014). The operating principles of the two DVC approaches were described in detail in Palanca et al. (2015). Briefly, DaVis-DC independently correlates sub-volumes from the deformed to the undeformed state as a discrete function of grey-levels. The matching between the sub-volumes is done via direct correlation, which provided better results than FFT for bone (Palanca et al., 2015). A piece-wise linear shape function and a cross-correlation function are employed to quantify the similarity between the reference and deformed image. The displacement field is evaluated at the center of each sub-volume and the strain field is computed via centered finite differences. ShIRT-FE focuses on the recognition of identical features in the whole 3D images by superimposing a grid with selectable nodal spacing (sub-volume) on the images. ShIRT solves the elastic registration equations at the nodes of the grid to evaluate the displacement field. The grid is then converted into a mesh of eight-noded hexahedra and the displacements computed by ShIRT at each node are imposed as boundary conditions. The strain field is obtained using the FE solver to differentiate the displacement field obtained with ShIRT.
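As an illustration of the last step of the local approach, the following Python sketch computes the small-strain tensor at an interior node of a regular grid from nodal displacements, using centered finite differences. It is not the DaVis-DC implementation: the function name `strain_at_node`, the dictionary-based data layout and the example displacement field are assumptions made only for this example.

```python
# Sketch only: small-strain tensor at an interior grid node from a displacement
# field sampled on a regular grid, using centered finite differences.
# Names and data layout are hypothetical, not taken from DaVis or ShIRT.


def strain_at_node(disp, node, spacing):
    """Return the 3x3 small-strain tensor at an interior grid node.

    disp    : dict mapping (i, j, k) grid indices to (ux, uy, uz) displacements
    node    : (i, j, k) index of an interior node (all six neighbours must exist)
    spacing : distance between neighbouring nodes, in the same unit as disp
    """
    # grad[a][b] approximates d u_a / d x_b with a centered difference
    grad = [[0.0] * 3 for _ in range(3)]
    for b in range(3):
        forward = list(node)
        backward = list(node)
        forward[b] += 1
        backward[b] -= 1
        u_forward = disp[tuple(forward)]
        u_backward = disp[tuple(backward)]
        for a in range(3):
            grad[a][b] = (u_forward[a] - u_backward[a]) / (2.0 * spacing)
    # Symmetric part of the displacement gradient (small-strain assumption)
    return [[0.5 * (grad[a][b] + grad[b][a]) for b in range(3)] for a in range(3)]


# Hypothetical example: uniform stretch of 1000 microstrain along x,
# sampled on a 3 x 3 x 3 grid of nodes spaced one sub-volume (48 voxels) apart.
spacing = 48.0
disp = {
    (i, j, k): (0.001 * i * spacing, 0.0, 0.0)
    for i in range(3)
    for j in range(3)
    for k in range(3)
}
strain = strain_at_node(disp, (1, 1, 1), spacing)
```

In a zero-strain test the two scans are nominally identical, so every component returned by such a computation would be interpreted as measurement error.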
In order to compute the measurement errors, eight sub-volume sizes (from 16 to 128 voxels, in steps of 16 voxels) were investigated (Table 1). Moreover, a multipass scheme with a final sub-volume size of 48 voxels (Table 2) was tested to explore the potential of the local approach. The multipass scheme is available only in DaVis-DC and is explained in Palanca et al. (2015). Based on the results reported in that study, 0% overlap was also used in the current study.
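To give a sense of scale, the sketch below lists the eight sub-volume sizes, converts each side length to millimetres using the 39 μm voxel size, and counts how many non-overlapping sub-volumes fit entirely inside VOI-1 (300 × 300 × 432 voxels). The helper names are hypothetical, and counting only the sub-volumes that fit entirely is an assumption: the tiling actually used by either package may differ.

```python
# Sketch only: physical size and count of non-overlapping cubic sub-volumes.
# Helper names are hypothetical; the tiling rule is an assumption.

VOXEL_SIZE_UM = 39.0          # isotropic voxel size reported for the scans
VOI1_SHAPE = (300, 300, 432)  # VOI-1 dimensions in voxels


def side_length_mm(size_voxels, voxel_size_um=VOXEL_SIZE_UM):
    """Side length of a cubic sub-volume, in millimetres."""
    return size_voxels * voxel_size_um / 1000.0


def count_full_sub_volumes(shape, size_voxels):
    """Number of sub-volumes that fit entirely inside a box, with 0% overlap."""
    count = 1
    for dim in shape:
        count *= dim // size_voxels
    return count


# From 16 to 128 voxels, in steps of 16 voxels
sub_volume_sizes = list(range(16, 129, 16))

for size in sub_volume_sizes:
    print(
        f"{size:3d} voxels: side {side_length_mm(size):.3f} mm, "
        f"{count_full_sub_volumes(VOI1_SHAPE, size)} full sub-volumes in VOI-1"
    )
```

Under these assumptions a 48-voxel sub-volume has a side of 1.872 mm, and larger sub-volumes leave far fewer measurement points inside VOI-1, which is the trade-off between spatial resolution and strain uncertainty explored in this study.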
Finally, to avoid misinterpretation of the results due to potential uncorrelated volumes, the percentage of correlated volume for each sub-volume size was computed as the ratio between the number of the correlated voxels and the total number of voxels (Table 1). The correlated volume is an essential indicator for the local approach, as the correlation of each sub-volume is independent of the others. For the global approach, instead, a grid is superimposed on the entire volume, and displacements and strains are computed on the nodes of the grid; so no regions are excluded.
Fig. 1. The vertebra was aligned and potted in a PMMA support and then scanned with a micro-CT. In order to show the differences between VOIs, the slice at midheight is reported for an augmented and a natural specimen. The larger box represents VOI-0: the entire vertebra with part of the surrounding saline solution. The smaller box represents VOI-1: a parallelepiped inscribed inside the vertebra.

Quantification of the errors (error metrics)
Given the zero-strain condition, any strain value different from zero was accounted as an error. The following analyses were carried out: Errors by strain component: For each specimen, the systematic and random errors were quantified as the average and standard deviation, for each component of strain. This analysis was repeated for VOI-0 and VOI-1 for the different sub-volume sizes.
Error distribution: In order to identify the areas with larger errors, a qualitative analysis of the distribution of apparent strain (z-component) was performed on the cross-section of VOI-0, for both DVC approaches and both samples, for a sub-volume size of 48 voxels (this sub-volume size was chosen as it corresponds to an acceptable level of the error, see below).
Inter-specimen variability: The systematic and random errors for each component of strain in VOI-1, for a sub-volume size of 48 voxels, were compared between specimens. In order to investigate a potential relation between the magnitude of the error and the morphology of each specimen, the bone volume fraction (BV/TV: bone volume, divided by total volume) for the natural vertebrae, or the solid volume fraction (SV/TV: sum of volume of cement and of bone, divided by total volume) for the augmented vertebrae, was computed. The images were segmented using a single threshold, chosen in the valley between the first two peaks of the grey-level histogram. The threshold value was adapted by visual comparison of the segmented and grey-scale images, in order to separate bone (or bone and cement) from the background. Both BV/TV and SV/TV were calculated as the ratio between the number of voxels in the solid volume and the total number of voxels, using ImageJ (Rasband, W.S., ImageJ, U.S. National Institutes of Health, Bethesda, Maryland, USA, http://imagej.nih.gov/ij/, 1997-2015) with the BoneJ plugin (Doube et al., 2010).
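The volume-fraction calculation described above can be sketched in a few lines of Python. This is not the ImageJ/BoneJ procedure used in the study: the function name `volume_fraction_percent`, the rule that voxels at or above the threshold count as solid, and the grey values in the example are all assumptions for illustration, and the threshold is taken as given rather than derived from the histogram.

```python
# Sketch only: BV/TV or SV/TV as the percentage of voxels classified as solid
# by a single grey-level threshold. Names and data are hypothetical.


def volume_fraction_percent(grey_values, threshold):
    """Percentage of voxels whose grey level is at or above the threshold.

    grey_values : iterable of grey levels, one per voxel of the region
    threshold   : single segmentation threshold (assumed already chosen)
    """
    values = list(grey_values)
    if not values:
        raise ValueError("the region contains no voxels")
    solid = sum(1 for g in values if g >= threshold)
    return 100.0 * solid / len(values)


# Hypothetical 8-voxel region: three bright (solid) voxels, five background
example_region = [10, 200, 220, 15, 12, 230, 11, 9]
fraction = volume_fraction_percent(example_region, threshold=100)
```

The same function would cover both cases: for natural vertebrae the solid phase is bone only (BV/TV), while for augmented vertebrae it is bone plus cement (SV/TV).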
All the analyses were performed with a script in MATLAB 2014a (MathWorks, US). Data were screened for outliers by applying Peirce's criterion (Ross, 2003).
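The error metrics defined in this section reduce to a mean and a standard deviation per strain component. The following Python sketch shows the idea; it is not the MATLAB script used in the study. The function name, the component labels, the use of the sample (n-1) standard deviation and the strain values are assumptions for illustration.

```python
# Sketch only: systematic error (mean) and random error (standard deviation)
# of each strain component in a zero-strain test. Names and data are
# hypothetical; the sample standard deviation is an assumption.
import statistics

COMPONENTS = ("exx", "eyy", "ezz", "exy", "eyz", "exz")


def errors_by_component(strains):
    """Return {component: (systematic_error, random_error)}.

    strains : dict mapping each component name to the list of strain values
              (in microstrain) measured at all points of the VOI
    """
    result = {}
    for name in COMPONENTS:
        values = strains[name]
        systematic = statistics.mean(values)   # bias with respect to zero strain
        random_err = statistics.stdev(values)  # scatter around that bias
        result[name] = (systematic, random_err)
    return result


# Hypothetical apparent strains (microstrain) at four measurement points
example = {name: [100.0, -100.0, 50.0, -50.0] for name in COMPONENTS}
errors = errors_by_component(example)
```

Because the true strain is zero everywhere, no reference field needs to be subtracted before taking the mean and the standard deviation.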

Errors over VOI-0
The systematic errors fluctuated around zero microstrain, apart from the peak for the smallest sub-volume size (Supplementary materials). For small sub-volume sizes DaVis-DC had errors up to two orders of magnitude larger than ShIRT-FE; only with sub-volumes larger than 96 voxels were the systematic errors comparable (generally within 100 microstrain).
The random errors showed a clear decreasing trend towards larger sub-volume sizes (Supplementary materials). The differences between DaVis-DC and ShIRT-FE were as high as two orders of magnitude, with maximum values of 126,312 and 121,281 microstrain for the augmented and natural samples, respectively, with DaVis-DC, and 2957 and 1124 microstrain for the augmented and natural samples, respectively, with ShIRT-FE. The multipass scheme on DaVis-DC (Table 2) was able to reduce both the systematic and random errors by up to a factor of ten, with respect to those with the equivalent sub-volume (48 voxels). The errors on augmented vertebrae were consistently larger, by up to 50%, than the ones on natural vertebrae.
The distribution of apparent strain within VOI-0 (Fig. 2) showed that the error increased passing from the trabecular tissue, rich in features, to the thin cortical bone, and finally to the surrounding saline solution. High gradients were localized at the interface between bone and saline solution, and in the regions outside the vertebral body. A similar trend was observed with ShIRT-FE, but maximal errors were three orders of magnitude lower than for DaVis-DC.

Errors over VOI-1
The systematic and random errors were of the same order of magnitude for both DVC approaches and showed similar trends (Figs. 3 and 4).
DaVis-DC was affected by slightly larger (tens of microstrain) systematic errors compared to ShIRT-FE. The effect of sub-volume size on the systematic error was negligible (Fig. 3).
As expected, the random error had a decreasing trend towards larger sub-volume sizes, for both DVC approaches (Fig. 4). The highest random errors for DaVis-DC (at 16 voxels) were in the range 960-1517 microstrain for the augmented vertebrae, and 807-1279 microstrain for the natural vertebrae. Random errors with DaVis-DC were generally lower than 200 microstrain with sub-volume sizes equal to or larger than 48 voxels. The multipass scheme produced slightly reduced random errors in both the augmented and the natural samples (from 69 to 103 microstrain for augmented vertebrae and from 43 to 69 microstrain for natural vertebrae) when compared to the same sub-volume size of 48 voxels without multipass (from 142 to 274 microstrain for augmented vertebrae and from 81 to 159 microstrain for natural vertebrae). For ShIRT-FE the highest random errors (at 16 voxels) were in the range 359-606 microstrain for the augmented vertebrae, and 445-1003 microstrain for the natural vertebrae. For larger sub-volumes random errors for ShIRT-FE were in most cases smaller than 200 microstrain. The two DVC approaches provided comparable random errors for sub-volume sizes larger than 48 voxels, and these were consistently lower than 200 microstrain above 64 voxels. While for DaVis-DC the random error steadily decreased over the range of sub-volumes explored, ShIRT-FE reached a plateau after 48 voxels. The random errors for the augmented vertebrae with DaVis-DC were consistently higher, by up to 50%, than those for the natural ones. For ShIRT-FE such differences between augmented and natural samples were smaller. No significant differences were found between the errors for the different components of strain, for any given sub-volume size, for both ShIRT-FE and DaVis-DC.
Table 1. Comparison of the correlated volume for the different approaches for both the augmented and the natural samples, and both VOIs, for each sub-volume size. The sub-volume was cubic in all cases, and its size is described by the side length, in voxels. The values reported for each sample are the median of the five augmented vertebrae and of the five natural vertebrae. DaVis-DC tries to maximize the coverage when sampling the VOI with the requested sub-volume size. In order to do that, part of the boundary sub-volumes can be largely outside of the structure under investigation, which in turn causes lower correlation in those regions that can affect the overall correlated volume. For ShIRT-FE a grid is superimposed on the entire volume, and displacements and strains are computed on the nodes of the grid; so no regions are excluded.
Table 2. Series of steps implemented in the multipass approach, mp(48), without any overlap. This feature is available only in DaVis-DC.
Step    Sub-volume size (voxels)    Number of iterations
1       128                         1
2       112                         2
3       96                          2
4       80                          2
5       64                          2
6       48                          2
Random errors showed large inter-specimen differences (Fig. 5), with maximum differences up to 2882 microstrain for DaVis-DC (augmented, Exz, specimen-1 vs. specimen-2) and up to 429 microstrain for ShIRT-FE (augmented, Exz, specimen-1 vs. specimen-2).
In particular, within the augmented sample, considerably higher errors were found for specimen-1, with both DVC approaches. Similarly, specimen-3 (from a different donor) was associated with the largest error in the natural sample. The reason is not clear, as the error was not associated with the highest/lowest values of solid volume fraction, or bone volume fraction (Table 3). Peirce's criterion identified these two specimens as outliers in terms of error values, but not in terms of volume fraction.

Discussion
The aim of this work was to quantify the measurement uncertainties of different DVC approaches applied to augmented bones at the organ-level. More specifically, we intended to investigate how such uncertainties vary between specimens and if there is any anisotropy-related directionality in the measurement error.
Two DVC approaches were investigated: a local correlation algorithm (DaVis-DC) and a global strategy (ShIRT-FE). As no robust alternative reference method is available for measuring internal strains, repeated scans (zero-strain condition) of vertebrae were shared between our institutions in a sort of round-robin test.
Our results showed that applying a local approach directly on images without masking (bone including the surrounding saline solution, VOI-0) led to large errors due to the lack of features in the saline solution. The analysis of the spatial distribution of the errors (Fig. 2) confirmed this hypothesis: the areas with large noise were mainly the outer boundaries of the bone and the saline solution; the areas where errors were substantially lower were all inside the specimen (which are typically the areas of interest). Therefore, average measurements over a volume including regions lacking features should be used with care if a local algorithm is applied. This effect could be an issue for specimens such as osteoporotic vertebrae, where fewer features are present compared to healthy denser vertebrae. Conversely, the global approach was almost insensitive to the surrounding saline solution. This suggests that a global approach may be more robust for strain measurements at the border of the specimen.
Inside the vertebra (VOI-1), the errors had the same order of magnitude for the local and global approaches. For both approaches, the systematic error (bias) fluctuated generally within 100 microstrain, meaning that the averages of the strain components were close to zero, independently of the selected sub-volume size. Both approaches showed a decreasing trend of the random error towards larger sub-volumes. Results for sub-volumes of 48 voxels and larger were comparable for the two approaches.
The difference between augmented and natural samples was rather consistent, but small. This supports the robustness of both DVC approaches in the presence of biomaterial interdigitation, in agreement with another tissue-level study (Tozzi et al., 2016a). It must be noted that the present results were obtained with cement for vertebroplasty, which includes a radiopacifier (300 μm BaSO4 pellets): this could have provided suitable features to the correlation algorithms. The multipass scheme available in DaVis-DC was able to reduce the random error (both natural and augmented) in both VOI-0 and VOI-1, when compared to the corresponding sub-volume of 48 voxels without multipass. The effect of this scheme was less pronounced in VOI-1, where the errors were already much lower compared to the same sub-volume in VOI-0.
For both approaches and both natural and augmented vertebrae, the systematic and random errors did not show any correlation with the scan direction and/or specimen directionality: similar uncertainty values were found for all directions.
Some differences existed between specimens in absolute terms. To the authors' knowledge, inter-specimen variations and potential outliers have not been considered before at the organ level. In a sample of five specimens it is questionable to perform an outlier analysis (Ross, 2003). However, two specimens (Specimen-1 augmented, and Specimen-3 natural, Fig. 5) were clearly outliers for both DVC approaches.
Fig. 5. Variability of the random error inside the augmented and natural vertebrae, for VOI-1, for a sub-volume size of 48 voxels. Similar trends were found for the systematic error.

Table 3
Solid Volume Fraction (SV/TV) evaluated as the ratio between the sum of the volume of the cement and the bone, and the total volume for the augmented vertebrae, and Bone Volume Fraction (BV/TV) evaluated as the ratio between the bone volume and the total volume for the natural vertebrae.

Outliers were found both among the augmented (T4) and the natural (T2) vertebrae. The outliers did not come from the same animal. Other T4 and T2 vertebrae did not show large errors. All the scan sessions started in the morning, after a standard warm-up (as suggested by Gillard et al., 2014), and followed the same protocol. The outliers were not associated with any remarkable event from the log files and the lab diaries, nor with a specific day of the week.
The grey-scale distribution (over each slice of each vertebra and over the entire vertebra) of the outliers overlapped with those of the "regular" specimens. To understand if some scans contained higher noise, we analyzed the standard deviation of the grey-scale distribution in a parallelepiped (150 × 150 × 400 voxels) containing only saline solution: the standard deviations of all scans and all specimens were comparable (range: 221-946, 16-bit greyscale count).
Despite all these checks, we could not identify a single event or parameter that could explain such outliers.
This inter-specimen variability in the DVC uncertainties can be a warning for future studies, because a sequence of apparently high-quality images can unexpectedly result in large strain errors. Because of this variability, the authors recommend always performing a zero-strain test before loading a specimen (repeated scan in the unloaded or preloaded condition). Unfortunately this kind of methodological analysis is frequently missing (Hardisty and Whyne, 2009). If this approach is impractical for projects with a large sample size, we suggest performing a zero-strain analysis on a reasonable number of specimens (e.g. five or more). A question left open by this work is whether some robust parameters exist that are able to predict such errors.
Fig. 6. Accuracy and precision (with interpolated power laws) for the local (DaVis-DC) and global (ShIRT-FE) DVC approaches, evaluated for VOI-1 in the augmented and natural vertebrae for sub-volume sizes ranging from 16 to 128 voxels. The multipass computation for DaVis-DC (mp(48); 6 passes from 128 to 48 voxels) is also reported. The median over the five augmented and the five natural specimens is plotted. The plots report the MAER and SDER defined as in (Liu and Morgan, 2007), where "ε" is the strain; the subscript "c" identifies the strain components; the subscript "k" identifies the measurement points; N is the number of measurement points.
A similar zero-strain study on human, bovine and rabbit trabecular bone was performed by Liu and Morgan (2007). They analyzed 4.3 mm cubes with a voxel size of 36 μm, and explored computation sub-volumes of 20, 30, 40 and 50 voxels, with three DVC methods (based on home-written algorithms of digital particle image velocimetry and ultrasound elastography). In that paper scalar indicators (which contain no information about the single strain components) were computed: the mean absolute error (MAER), referred to as accuracy, and the standard deviation of the error (SDER), referred to as precision, were quantified as the average and standard deviation of the average of the absolute values of the six components of strain for each sub-volume. For the human vertebrae at 40 voxels sub-volume they found MAER in the order of 500 microstrain, and SDER of 150-200 microstrain. They found slightly lower errors for the bovine distal femur. The smallest total error they found was 345 microstrain. To allow comparisons, we computed the same scalar indicators for the augmented and natural samples for VOI-1 (Fig. 6). In order to compare the results, interpolated power laws were used to estimate the MAER and SDER for the same sub-volume size as in Liu and Morgan (2007).
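The two scalar indicators follow directly from the verbal definition above: for each measurement point the absolute values of the six strain components are averaged, and MAER and SDER are the mean and standard deviation of that quantity over all points. The Python sketch below illustrates this; the function name, the use of the sample (n-1) standard deviation and the strain values are assumptions for illustration, not the authors' code.

```python
# Sketch only: MAER (accuracy) and SDER (precision) from the six strain
# components at each measurement point. Names and data are hypothetical;
# the sample standard deviation is an assumption.
import statistics


def maer_sder(strain_points):
    """Return (MAER, SDER) in the unit of the input strains.

    strain_points : list of 6-tuples, one per measurement point, holding the
                    six strain components measured in a zero-strain test
    """
    # Average of the absolute values of the six components, per point
    per_point = [sum(abs(e) for e in point) / len(point) for point in strain_points]
    maer = statistics.mean(per_point)
    sder = statistics.stdev(per_point)
    return maer, sder


# Hypothetical apparent strains (microstrain) at two measurement points
points = [
    (100.0, -100.0, 100.0, -100.0, 100.0, -100.0),
    (300.0, -300.0, 300.0, -300.0, 300.0, -300.0),
]
maer, sder = maer_sder(points)
```

Because absolute values are taken before averaging, MAER cannot cancel positive and negative errors, which is why it is typically larger than the systematic error computed per component.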
DaVis-DC showed a MAER of 275 and 215 microstrain for the augmented and natural vertebrae, respectively; ShIRT-FE had a MAER of 159 and 139 microstrain respectively. The SDER with DaVis-DC were 116 and 68 microstrain for the augmented and natural vertebrae; ShIRT had a SDER of 68 and 61 microstrain respectively. MAER and SDER of the present study confirmed the trend found in previous studies (Dall'Ara et al., 2014;Liu and Morgan, 2007;Palanca et al., 2015).
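The power-law interpolation used to compare errors at a common sub-volume size can be sketched as follows. The fitting method is an assumption: here the law error = a · size^b is fitted by ordinary least squares on the logarithms, which may differ from what was used for Fig. 6. The function names are hypothetical and the data are synthetic values that follow an exact power law; they are not results from this study.

```python
# Sketch only: fit error = a * size**b by linear least squares in log-log
# space, then evaluate the fitted law at another sub-volume size.
# Function names are hypothetical; the data below are synthetic.
import math


def fit_power_law(sizes, errors):
    """Return (a, b) such that errors is approximately a * sizes**b."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space (exponent)
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space (prefactor)
    return a, b


def evaluate_power_law(a, b, size):
    """Error predicted by the fitted power law at a given sub-volume size."""
    return a * size ** b


# Synthetic example: errors that decay exactly as 8000 / size
sizes = list(range(16, 129, 16))
errors = [8000.0 / s for s in sizes]
a, b = fit_power_law(sizes, errors)
error_at_40 = evaluate_power_law(a, b, 40)
```

Evaluating the fitted law at 40 voxels mirrors the comparison made here with the 40-voxel sub-volume reported by Liu and Morgan (2007), a size that was not directly tested in the present study.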
An estimate of the measurement uncertainty was provided for human vertebrae by Hussein et al. (2012). The voxel size (37 μm) was similar to the present work. They analyzed just a sub-volume of 4.8 mm (approximately 130 voxels). They found larger errors than in the present study: MAER = 740 microstrain, SDER = 630 microstrain. Their analysis was performed as a preliminary check before the actual compression test.
The current study has shown that, when sufficient care is dedicated to a preliminary methodological optimization, the strain measurement uncertainties of DVC may be adequate to investigate not only bone failure (7000-10,000 microstrain (Bayraktar et al., 2004; de Bakker et al., 2009)), but also the strain distribution associated with physiological loads (strain of the order of 1000-2000 microstrain (Aamodt et al., 1997; Cristofolini, 2015)). The present findings suggest that for whole vertebrae DVC methods are sensitive enough for proper validation of the strain predictions from computational models only when sub-volumes equal to or larger than 48 voxels (equivalent to approximately 2 mm in side length) are used. However, in order to validate the strain at spatial resolutions of 10-30 μm, typical of micro-FE (Van Rietbergen et al., 1995), the measurement uncertainties of the current DVC approaches need to be reduced.
A limitation of this work is the use of porcine vertebrae instead of human ones. In this explorative study this decision was driven by an ethical choice. While the present results might not directly translate to human specimens in absolute terms, the trends and the general observations can certainly be applied.
This study demonstrated the suitability of local and global DVC approaches to investigate natural and augmented bones. Systematic and random errors were rather isotropic, with no relation to bone anisotropy or micro-CT scanning planes. While the errors were rather consistent between specimens, some specimens caused unpredictably and inexplicably larger errors: for this reason, it is highly recommended to perform a preliminary zero-strain check on each specimen.
With the measurement uncertainties evaluated for a reasonable sub-volume size (i.e. 100-200 microstrain for sub-volume of 48 voxels), DVC becomes an attractive tool for the measurement of local properties (displacements and strains) in the elastic regime. This could be useful per se, to investigate bone micromechanics, but also to reliably validate computational models at the tissue level for spatial resolutions of approximately 2 mm.

Conflict of interest
None