Journal of the Mechanical Behavior of Biomedical Materials

The inferomedial femoral neck is compromised by age but not disease: Fracture toughness and the multifactorial mechanisms comprising reference point microindentation

Abstract
The influence of ageing on the fracture mechanics of cortical bone tissue is well documented, though little is known about if and how related material properties are further affected in two of the most prominent musculoskeletal diseases, osteoporosis and osteoarthritis (OA). The femoral neck, in close proximity to the most pertinent osteoporotic fracture site and near the hip joint affected by osteoarthritis, is a site of particular interest for investigation. We have recently shown that Reference Point micro-Indentation (RPI) detects differences between cortical bone from the femoral neck of healthy, osteoporotic fractured and osteoarthritic hip replacement patients. RPI is a new technique with potential for in vivo bone quality assessment. However, interpretation of RPI results is limited because the specific changes in bone properties with pathology are not well understood and, further, because it is not conclusive what properties are being assessed by RPI. Here, we investigate whether the differences previously detected between healthy and diseased cortical bone from the femoral neck might reflect changes in fracture toughness. Together with this, we investigate which additional properties are reflected in RPI measures. RPI (using the Biodent device) and fracture toughness tests were conducted on samples from the inferomedial neck of bone resected from donors with: OA (41 samples from 15 donors), osteoporosis (48 samples from 14 donors) and non-age-matched cadaveric controls (37 samples from 10 donors) with no history of bone disease. Further, a subset of indented samples were imaged using micro-computed tomography (3 osteoporotic and 4 control samples each from different donors) as well as fluorescence microscopy in combination with serial sectioning after basic fuchsin staining (7 osteoporotic and 5 control samples from 5 osteoporotic and 5 control donors). In this study, the bulk indentation and fracture resistance properties of the inferomedial femoral neck in osteoporotic fracture, severe OA and control bone were comparable (p > 0.05 for fracture properties and < 10% difference for indentation) but fracture toughness reduced with advancing age (7.0% per decade, r = −0.36, p = 0.029). Further, RPI properties (in particular, the indentation distance increase, IDI) showed partial correlation with fracture toughness (r = −0.40, p = 0.023) or derived elastic modulus (r = −0.40, p = 0.023). Multimodal indent imaging revealed evidence of toughening mechanisms (i.e. crack deflection, bridging and microcracking), elastoplastic response (in terms of the non-conical imprint shape and presence of pile-up) and correlation of RPI with damage extent (up to r = 0.79, p = 0.034) and indent size (up to r = 0.82, p < 0.001). Therefore, crack resistance, deformation resistance and, additionally, micro-structure (porosity: r = 0.93, p = 0.002 as well as pore proximity: r = −0.55, p = 0.027 for correlation with IDI) are all contributory to RPI. Consequently, it becomes clear that RPI measures represent a multitude of properties, various aspects of bone quality, but are not necessarily strongly correlated to a single mechanical property. In addition, osteoporosis or osteoarthritis do not seem to further influence fracture toughness of the inferomedial femoral neck beyond natural ageing. Since bone is highly heterogeneous, whether this finding can be extended to the whole femoral neck or whether it also holds true for other femoral neck quadrants or other material properties remains to be shown.

Introduction
Osteoporosis and osteoarthritis are two of the most prevalent and impactful musculoskeletal disorders. However, the primary means of clinically assessing osteoporosis (Bone Mineral Density, BMD) has poor accuracy: when used as a binary test (based on a T-score of −2.5), BMD fails to identify a high proportion of individuals who go on to fracture (Schuit et al., 2004; Siris et al., 2004). As a result, other differences in bone quality such as structure (e.g. cortical thinning, increased porosity or reduced trabecular connectivity (Poole et al., 2010; Bell et al., 1999; Keaveny and Yeh, 2002)), composition and material properties may contribute to osteoporosis. This rationale has moved the definition of osteoporosis away from BMD alone towards a condition of compromised mechanical integrity and increased fracture risk (NIH, 2000). Osteoarthritis, in contrast, is primarily a condition of joint degeneration, which causes considerable pain and disability. There is increasing evidence that osteoarthritis changes bone and not just cartilage, including stiffening of the trabeculae and subchondral bone, elevated BMD and deformities/altered biomechanics of the femoral head and neck (Baker-LePain and Lane, 2012; Bobinac et al., 2013; Arden and Nevitt, 2006; Sun et al., 2008). Therefore, changes in bone material properties may play a role in both osteoporosis and osteoarthritis. Of particular interest is the femoral neck, which is in close proximity to the site of the most clinically severe osteoporotic fracture and is also close to the affected joint in osteoarthritis. Although there is evidence for deterioration in bone material properties with age (Zioupos and Currey, 1998; Burstein et al., 1976; Nalla et al., 2006; Koester et al., 2011; Jepsen, 2003), a risk factor for both osteoporosis and osteoarthritis, there is surprisingly limited research into whether these properties deteriorate as a function of these two pathologies.
With ageing, bone quality may deteriorate, including an increased susceptibility to microcracks and microdamage. The ability to withstand propagation of existing cracks and, ultimately, the resistance to fracture, is therefore a valuable material property to consider. This property in particular, relating to fracture resistance and toughness, deteriorates with age (by 2.9-18.9% per decade (Nalla et al., 2006; Koester et al., 2011; Granke et al., 2015; Brown et al., 2000)) but it is unclear whether it is further compromised with osteoporosis or osteoarthritis, particularly at the femoral neck, the most clinically relevant fracture site. It may be fairly logical to assume that fracture toughness, the ability to resist fracture, is compromised with osteoporosis. Additionally, the discussed influence of osteoarthritis on bone mechanics also warrants investigation into further material properties including fracture toughness. However, there are surprisingly few studies that directly compare OA or osteoporotic bone to non-diseased controls. A small number of studies have investigated properties including, but not limited to, microhardness (Dall'Ara et al., 2011), energy absorption (Dickenson et al., 1981), ultrasound stiffness (Li and Aspden, 1997), and reference point indentation properties utilising the cyclic indentation technique of this study (the Biodent™) or a sudden impact indent proposed for clinical use (the Osteoprobe™) (Jenkins et al., 2016; Malgo et al., 2015; Diez-Perez et al., 2010; Güerri-Fernández et al., 2013; Milovanovic et al., 2014; Coutts et al., 2016). However, the comparison between either discussed disease and a control is still limited, particularly if considering cortical bone. Therefore, beyond the effects of ageing, the influence of both OA and osteoporosis on the material properties of bone demands further exploration. This is of particular importance in terms of fracture toughness and considering the femoral neck, where the authors are not aware of any published research.
Reference Point micro-Indentation (also referred to in the literature as RPI, microindentation and Reference Point Indentation) is a technique that has been proposed for measuring the material properties of bone in vivo with the aim to supplement BMD (Jenkins et al., 2016; Malgo et al., 2015; Diez-Perez et al., 2010). This aims to overcome limitations of current fracture risk assessment techniques by introducing assessment of mechanical properties. The technique, which uses a reference probe to establish the surface and a test probe to cyclically indent into the bone, has shown some ability to discriminate osteoporotic (Jenkins et al., 2016; Malgo et al., 2015; Diez-Perez et al., 2010; Güerri-Fernández et al., 2013; Milovanovic et al., 2014) and osteoarthritic (Coutts et al., 2016) bone from non-diseased controls. Notably, the technique has also been applied in vivo at the tibia, discriminating individuals who have fractured from non-fractured controls and reporting no complications (Diez-Perez et al., 2010; Güerri-Fernández et al., 2013). Further studies have investigated the Osteoprobe RPI method, also reporting no complications (Randall et al., 2013), yet this uses a different loading regime (one single impact cycle). In vivo, neither technique can be used directly at the femoral neck, the site of interest and the most significant fracture site, so in vitro studies are required to study this important location.
RPI has also been suggested to be distinct from conventional indentation testing (such as nanoindentation), in that the imprints are associated with microdamage (Diez-Perez et al., 2010; Beutel and Kennedy, 2015; Schneider et al., 2013) and it has therefore been purported to assess fracture resistance properties to varying extents (Granke et al., 2015; Diez-Perez et al., 2010; Katsamenis et al., 2015; Carriero et al., 2014). Specifically, RPI properties have shown high correlation with fracture toughness (Diez-Perez et al., 2010) but also a higher degree of independence (Granke et al., 2015; Katsamenis et al., 2015) as well as complete lack of correlation (Carriero et al., 2014). Furthermore, RPI has also shown correlation with elastoplastic resistance to deformation such as strength and toughness (Granke et al., 2015; Gallant et al., 2013) so it is still unclear what property this technique is assessing. Additionally, we have demonstrated that indentation properties vary with location (Coutts et al., 2015) and machining of the femoral neck. Though other indentation techniques are better understood for measuring the localised properties of bone (e.g. the established relationship between elastic modulus derived from nanoindentation), it is the clinical potential of RPI that makes it of particular interest in this study. Further, both osteoporosis and OA likely influence the RPI properties of the bone, including the surface properties of the femoral neck. However, it is still unclear how the indentation properties of the bulk of the femoral neck are influenced by disease or what (material) property or properties is/are being assessed by the technique.
In terms of RPI, or the development and interpretation of any clinical fracture risk assessment technique, it is critical to understand both the bone properties influenced by the disease state and how the technique may assess these deteriorations in properties. Therefore, this study investigates two research questions: 1) what are the differences in selected bone material properties between osteoporosis, OA and controls and 2) what properties are being assessed by reference point microindentation?
Both RPI (having demonstrated in vivo potential at the tibia) and fracture toughness (likely critical to fracture) are useful for answering these two research questions. For the second question, RPI and fracture toughness can be supplemented with imaging techniques such as fluorescence microscopy (Beutel and Kennedy, 2015) and micro computed-tomography. This allows for imaging of the RPI imprints and surrounding damage to provide a more mechanistic understanding of RPI measurement.
With this study, we apply both imaging and mechanical testing within the cortical bone of the inferomedial femoral neck with the perspective to increase understanding of key musculoskeletal disorders and potential techniques for mechanical assessment (i.e. RPI) of these disorders.

Human femoral neck samples
Human femoral neck samples were collected to form three groups: 1) the osteoporotic (OP) fractured group (a total of 16 participants across the mechanical testing and imaging studies), 2) the osteoarthritic total hip replacement (OA) group (14 participants) and 3) the cadaveric control (C) group (10 participants). These samples were a subset of those described in our previous publications (Jenkins et al., 2016; Coutts et al., 2016).
The osteoporotic and osteoarthritic groups were collected from patients undergoing hip arthroplasty at University Hospital Southampton NHS Foundation Trust (UHS). The indication for surgery in the OP group was intracapsular fracture of the femoral neck following a low trauma fall. This makes the assumption that individuals who suffer a fragility fracture of the femoral neck must be 'osteoporotic' (i.e. have fragile bone predisposed to fracture (NIH, 2000)), yet this does not incorporate falls risk or many other factors linked to fragility fractures. Patients in the OA group were undergoing an elective total hip replacement (THR) for osteoarthritis of the femoral head. Cadaveric control samples had no known history of fracture or bone disease and were obtained from Innoved Institute LLC (Bensenville, Illinois). All samples were stored at −80 °C and defrosted overnight (approximately 15 h) in Hanks' Balanced Salt Solution (HBSS) prior to machining.
All samples were obtained under full ethical approval (12/SC/0325 Southampton A REC and 10/H0604/91 Oxford A REC).

Machining of small-scale samples
In the OP fracture and OA (THR) group the femoral head and neck were removed from the femur by the surgeon as part of the arthroplasty. For the control group, a cut was made with a junior hack saw at a distance approximately equal to the femoral head diameter as shown in Fig. 1a. A second cut was made 5-10 mm proximal of the first cut with the junior hack saw to section the femoral neck in the OA and control group. For the OP group, the fracture and surgeon's cut isolated the femoral neck making a hack saw cut unnecessary (Fig. 1a).
The femoral neck sections were split into quadrants using the junior hack saw and plate shaped samples were machined from the inferomedial section (the thickest section, suitable for RPI and fracture toughness measurements (Poole et al., 2010;Coutts et al., 2015)) using a low-speed saw and diamond wafering blade (Buehler, Germany) as well as polishing with 600 grit sandpaper (Fig. 1b). These plates were cut into beam specimens using a low-speed saw for imaging (Fig. 1c) and mechanical testing (Fig. 1d). Hydration was maintained during preparation through constant irrigation via the low-speed saw water bath, through periodic application of HBSS to samples during polishing and through minimising duration out of HBSS for hack saw cutting to approximately 1 min. The flow of samples is shown in Fig. 1, where samples were machined for fracture toughness testing and subsequent RPI. Completely separate samples (though with some overlap with the donors used for mechanical testing) were prepared for indentation and subsequent imaging.

Fracture toughness testing
Fracture toughness testing was performed as previously described by Katsamenis et al. (2013, 2015) with some modification for increased automation (primarily automatic selection of whitening area), allowing increased throughput, as indicated in Supplementary Fig. S1. 126 samples (48 OP fracture, 41 OA hip replacement and 37 control samples) were machined. The mean ± standard deviation dimensions of these samples were: width (w) of 1.23 ± 0.09 mm and thickness (t) of 0.72 ± 0.06 mm as described in Section 2.2 and shown in Fig. 1d (i.e. t/w of 0.59 ± 0.07 compared to 0.5 in the ASTM standard (ASTM, 2001) due to difficulties in achieving this level of accuracy using the 0.5 mm thickness low-speed saw). These samples were from 39 donors (with 1-6 and a median of 4 samples per donor): 14 OP (54-97 years, mean 75.9 ± 11.0 years, 8 female and 6 male), 15 OA (26-84 years, mean 65.1 ± 17.6 years, 5 female and 10 male) and 10 control (58-92 years, mean 65.7 ± 9.6 years, 6 female and 4 male). Machined samples were notched in the longitudinal antiplane direction using the low-speed saw and the notch was sharpened using a scalpel and 1 µm diamond solution as described by Kruzic et al. (2005). The diamond solution was washed away with a water-jet and ultrasonic cleaner (VWR Symphony, Radnor, Pennsylvania), leaving a mean prenotch of 0.45 ± 0.10 mm (a_0, Fig. 1) and hence a mean a_0/w of 0.37 ± 0.09. This ratio is marginally lower than the ASTM E1820 standard of 0.45-0.55 (ASTM, 2001) due to difficulties in reproducibly machining larger notches without damaging the sample and to leave a larger uncracked ligament to more clearly visualise crack propagation/whitening.
Samples were placed in a water bath in HBSS on a 6 mm span three-point bending rig (giving a mean span/w of 4.91 ± 0.40 compared to 4 for the ASTM standard (ASTM, 2001)). Limitations of sample machining and quantity of bone stock necessitated discrepancies from the ASTM standard (ASTM, 2001). This standard is not specific to bone samples or to samples of this size, yet the dimensions were still intended to be as close to the standard as practically possible in these human bone samples. Loading was performed at 1 µm/s (Electroforce 3200, Bose, Eden Prairie, Minnesota) to place the notch in tension and propagate the crack at a quasistatic rate. Fiber optic lights (KL 1500 LCD, Schott, Mainz, Germany and DC950H Fiber-Lite, Dolan Jenner, Boxborough, Massachusetts) illuminated the sample at ±45° from the axis of the camera and 0°/−45° in the field of view of the camera. The resulting whitening, the visualised damage propagation, was recorded using a 2 MPixel camera (Q4003D, Limess, Krefeld, Germany) coupled with a macro-lens (28-105 mm Nikkor zoom lens, Nikon, Tokyo).
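The specimen geometry can be checked against the ASTM targets with simple arithmetic. The sketch below uses only the mean dimensions quoted above; note that it computes ratios of the means, whereas the reported values (0.59, 0.37 and 4.91) are presumably means of per-sample ratios, so small differences are expected. The variable and function names are illustrative, not taken from the original analysis.

```python
# Mean specimen dimensions quoted in the text (mm).
WIDTH_MM = 1.23
THICKNESS_MM = 0.72
NOTCH_MM = 0.45
SPAN_MM = 6.0

# ASTM target ranges quoted in the text, as (low, high).
ASTM_TARGETS = {"t/w": (0.5, 0.5), "a0/w": (0.45, 0.55), "span/w": (4.0, 4.0)}


def geometry_ratios(width, thickness, notch, span):
    """Return the three dimensionless ratios used to compare with ASTM."""
    return {
        "t/w": thickness / width,
        "a0/w": notch / width,
        "span/w": span / width,
    }


ratios = geometry_ratios(WIDTH_MM, THICKNESS_MM, NOTCH_MM, SPAN_MM)

for name, value in ratios.items():
    low, high = ASTM_TARGETS[name]
    inside = low <= value <= high
    print(f"{name}: {value:.2f} (ASTM {low}-{high}, within range: {inside})")
```

None of the three ratios falls inside its target range, which is consistent with the deviations from the standard acknowledged in the text.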
The "Whitening Front Tracking" technique, as described in detail by Katsamenis et al. (2013, 2015), allowed for the generation of crack extension resistance curves (R-curves). In brief, the algorithm was used to select the region of interest (where the whitening develops around the sharp notch) then register the displacement of subsequent frames to the initial unloaded image. The initial image, without whitening, can then be subtracted from the registered frames to show the development of a binary whitening area that, relating to the crack propagation, can be tracked to generate R-curves. Further development of this MATLAB algorithm (Mathworks, Natick, Massachusetts) increased automation to allow for a larger number of samples to be processed. The principal developments were the automatic selection of the whitening area, the linear portion of the force-displacement curve, the region of interest and the initial notch, the last aided by microscopy (S02 USB microscope, HOT technology Co., Shenzhen, China), as summarised by Supplementary Fig. S1. The R-curves were generated in terms of strain energy release rate (J-integral) and derived stress intensity factor (K-effective) with the slope of the curve giving the fracture resistance (K_slope and J_slope) and the maximum value giving the fracture toughness (K_max and J_max) as demonstrated in Fig. 2. Additionally, an elastic modulus (E_mod) was derived from the linear portion of the force-displacement curve where no crack propagation occurred, using a finite element derived correction factor to take into account the stress field surrounding the notch as previously described (Katsamenis et al., 2013). The finite element model allowed for a relationship between the material elastic modulus and the empirical load-displacement curve of a notched specimen. This has been referred to as the 'derived modulus'. Additionally, the whitening area (W_Area) was recorded at the maximum point (i.e. at maximum load).
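To make the R-curve parameters concrete, the following minimal sketch derives a slope and a maximum from a J-based R-curve and converts J to an effective K. It assumes the common plane-stress relation K = sqrt(J·E); whether the original MATLAB algorithm used exactly this form is not stated here. The crack extensions, J values and modulus below are hypothetical numbers chosen for illustration only.

```python
import math


def k_from_j(j_kj_per_m2, e_gpa):
    """Convert J (kJ/m^2) to an effective K (MPa*sqrt(m)).

    Assumes K = sqrt(J * E). With J in kJ/m^2 and E in GPa the
    product is in units of 1e12 N^2/m^3, so the square root is
    directly in MPa*sqrt(m).
    """
    return math.sqrt(j_kj_per_m2 * e_gpa)


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical R-curve: crack extension (mm) and J (kJ/m^2).
crack_extension_mm = [0.0, 0.1, 0.2, 0.3]
j_values = [0.5, 1.0, 1.5, 2.0]
derived_modulus_gpa = 10.0  # hypothetical derived modulus

k_values = [k_from_j(j, derived_modulus_gpa) for j in j_values]

j_slope = least_squares_slope(crack_extension_mm, j_values)  # fracture resistance
k_slope = least_squares_slope(crack_extension_mm, k_values)
j_max = max(j_values)  # fracture toughness
k_max = max(k_values)

print(f"J_slope = {j_slope:.2f} kJ/m^2 per mm, J_max = {j_max:.2f} kJ/m^2")
print(f"K_slope = {k_slope:.2f} MPa*sqrt(m) per mm, K_max = {k_max:.2f} MPa*sqrt(m)")
```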

Reference point microindentation
Following fracture toughness testing, the same bone samples were again frozen at −80 °C, wrapped in HBSS-soaked gauze, until further experiments. Samples were defrosted overnight (approximately 15 h). The Biodent Hfc™ system (Activelife Scientific, Santa Barbara, California) was then employed for RPI using a BP2 shape reference probe and a test probe with a 90° conical tip, 5 µm radius and 350 µm maximum diameter. The sharp reference probe was not strictly necessary to displace the periosteum, considering the samples were machined, yet was still employed for similarity with the clinically applied technique. Prior to indenting bone, indentation was performed on a poly(methyl methacrylate) block to achieve a suitable touchdown distance (50-250 µm) and ensure consistent indentation measurements were reported. Five RPI measurements were taken per sample at least 1 mm away from the fracture site at 10 N, 2 Hz and 10 cycles. These indents were made in the transverse inwards (endosteal) direction, approximately perpendicular to the long axis of the osteons and equivalent to the direction of in vivo measurement. The TID (Total Indentation Distance), IDI (Indentation Distance Increase) and CID (first cycle Creep Indentation Distance) were calculated as described in our previous publications (Coutts et al., 2015; Jenkins et al., 2015) and by other researchers (Diez-Perez et al., 2010; Hansma et al., 2008). These measures were used for their prevalence in the literature and because they were established in our previously published work as highlighting differences between OA (THR), OP fracture and control cohorts (Jenkins et al., 2016; Coutts et al., 2016).
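As an illustration of how the three RPI measures relate to the cyclic indentation data, the sketch below computes them from per-cycle depths. The definitions used are the commonly cited ones and are an assumption here: TID as the depth at the end of the last cycle relative to touchdown, IDI as the increase in that depth between the first and last cycles, and CID as the depth gained during the constant-force hold of the first cycle. The data structure and the numbers are hypothetical and are not output from the Biodent software.

```python
def rpi_metrics(cycles, touchdown_um=0.0):
    """Compute TID, IDI and CID (all in micrometres) from per-cycle depths.

    Each cycle is a dict with the probe depth at the start and at the
    end of the constant-force hold, measured from the bone surface.
    """
    first, last = cycles[0], cycles[-1]
    return {
        # Total distance travelled from touchdown to the final cycle.
        "TID": last["hold_end_um"] - touchdown_um,
        # Additional depth accumulated between first and last cycle.
        "IDI": last["hold_end_um"] - first["hold_end_um"],
        # Creep during the hold of the first cycle only.
        "CID": first["hold_end_um"] - first["hold_start_um"],
    }


# Hypothetical 10-cycle measurement: depth grows by 1 um per cycle.
cycles = [
    {"hold_start_um": 60.0 + i, "hold_end_um": 63.0 + i}
    for i in range(10)
]

metrics = rpi_metrics(cycles)
print(metrics)
```

With five such measurements per sample, the per-sample value would then be the mean of each metric across the repeats.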

Imaging of RPI Imprints
Single indents were made in additional machined osteoporotic and control samples, utilising the described RPI technique (i.e. 10 N, 2 Hz and 10 cycles in the transverse inwards direction to calculate TID, IDI and CID). Though there was an overlap in donors, completely different samples were used from the fracture toughness testing. This subset was formed from donors that had sufficient tissue within the same inferomedial region. Residual indent imprints from the individual RPI measurements that had received no previous mechanical testing were then imaged using micro computed-tomography (µCT, 3 osteoporotic samples and 4 control samples each from different donors) and serial sectioning with fluorescence microscopy (FLM, 7 osteoporotic samples from 5 osteoporotic donors and 5 control samples each from different donors). This gave a total of 16 samples (10 osteoporotic and 6 control) from 14 donors, 8 osteoporotic (54-88 years, mean 72.0 ± 10.3 years, 4 female and 4 male) and 6 control (58-92 years, mean 67.7 ± 11.9 years, 5 female and 1 male). Additionally, atomic force microscopy (AFM) imaging and polarised light microscopy (PLM) were used in a single osteoporotic sample to further illustrate the µCT and FLM findings.

Micro computed-tomography (µCT)
Firstly, three osteoporotic and four control indents were imaged using µCT (Xradia Versa 510, Zeiss X-ray Microscopy Inc, Pleasanton, California). The scans were conducted at a peak voltage of 110 kV and the beam was pre-filtered using a 150 µm SiO2 filter to reduce beam-hardening artefacts. To achieve sufficient flux, the power was set at 10 W (91 µA) and the 2048 × 2048 pixel detector was binned twice, resulting in effective detector dimensions of 1024 × 1024 pixels. The source to detector distance was set at 111.5 mm (SrcZ: −13.7, DetZ: 97.8), which in combination with the 4x lens resulted in a pixel size of 825 nm; i.e. a spatial resolution of approximately 2.4 µm. To ensure optimal sampling, the total number of collected radiographs varied from 2201 to 3201 (angular step of ~0.11°-0.16°) over a 360° rotation, depending on how small the field of view was with regards to the effective diameter of the sample. Following acquisition, the data were reconstructed using commercial reconstruction software (XMreconstructor, Zeiss X-ray Microscopy Inc, Pleasanton, California), which uses a filtered back-projection algorithm, and reconstructed slices were exported as 16-bit tiff files.
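The acquisition figures above follow from simple arithmetic, sketched below. The angular step is the rotation range divided by the number of radiographs, and the field of view is estimated on the assumption that the 825 nm pixel size refers to the binned 1024-pixel detector; the function names are illustrative.

```python
def angular_step_deg(n_projections, rotation_deg=360.0):
    """Angular increment between successive radiographs."""
    return rotation_deg / n_projections


def binned_pixels(native_pixels, binning):
    """Effective detector width after binning along one axis."""
    return native_pixels // binning


# Range of projection counts quoted in the text.
step_fine = angular_step_deg(3201)
step_coarse = angular_step_deg(2201)

effective_pixels = binned_pixels(2048, 2)
pixel_size_um = 0.825  # assumed to be the pixel size after binning
field_of_view_um = effective_pixels * pixel_size_um

print(f"Angular step: {step_fine:.3f} to {step_coarse:.3f} degrees")
print(f"Detector: {effective_pixels} px, field of view ~{field_of_view_um:.0f} um")
```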
The image stacks were realigned by rotation (x-y plane) and translation (x-y plane as a function of z, equivalent to rotation in x-z and y-z) using Fiji (ImageJ) (Schindelin et al., 2012) and segmented in Avizo (Avizo Fire v. 8.0, FEI, Hillsboro, Oregon). The reconstructed slices had sufficient contrast for semi-automatic segmentation of the imprint, associated microdamage and Haversian canals from the surrounding cortical bone. In brief, the datasets were cropped to the region of interest (approximately 500 × 500 × 150 voxels) and, retaining the raw 16-bit volume throughout, then thresholded based on the lower 60% of the histogram intensity range. Morphological opening (6 voxel radius) was used on the binary image to remove thin (e.g. microcracks) or unconnected elements (e.g. lacunae) and segment the larger/connected indent imprint and canals. Small (less than 5000 voxels) unconnected elements, as well as the previously segmented indent and canals, were removed from the initial thresholded image to segment the microcracks, aided by manual selection (i.e. blowout and paintbrush tools). The segmentation method and the parameters were selected to visually discriminate the features of interest based on multiple image stacks and were reviewed for each sample. Fig. 3a shows a µCT slice in the xy plane and Fig. 3c and Fig. 5a show the segmented features from which measurements were made of the indent (volume, depth and diameter), pores (porosity and distance between indent and pore) and mean crack length. The segmentation process is shown schematically in greater detail in Supplementary Fig. S2. Measurements were made using inbuilt Avizo functions: the 'volume' measure of the labelled indent, crack and Haversian canal gave the respective volumes and macro porosity and the 'volume length' measure gave the maximum length of each individually segmented crack, the mean of which defined the 'damage extent'. A custom MATLAB algorithm (Mathworks, Natick, Massachusetts) was used to measure the maximum depth and central slice area. The diagonal distance from the surface centre of the indent to the imprint edges (mean) and pore (minimum) defined the diameter and pore proximity respectively (Fig. 5b).
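The thresholding and size-filtering logic described above can be illustrated on a toy 2D image. This is a simplified stand-in and not the Avizo workflow: it works in 2D, omits the morphological opening step, and uses a tiny size cut-off in place of the 5000-voxel limit. The image values and function names are hypothetical.

```python
from collections import deque


def threshold_lower_fraction(image, fraction=0.6):
    """Mark pixels in the lower `fraction` of the intensity range (voids)."""
    flat = [value for row in image for value in row]
    low, high = min(flat), max(flat)
    cutoff = low + fraction * (high - low)
    return [[value <= cutoff for value in row] for row in image]


def connected_components(mask):
    """Label 4-connected foreground regions; return a list of pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def split_by_size(components, min_size):
    """Separate large features (indent, canals) from small elements."""
    large = [c for c in components if len(c) >= min_size]
    small = [c for c in components if len(c) < min_size]
    return large, small


# Toy image: bright bone (200), a dark 2x2 void (20) and one isolated dim pixel (90).
image = [
    [200, 200, 200, 200, 200, 200],
    [200,  20,  20, 200, 200, 200],
    [200,  20,  20, 200,  90, 200],
    [200, 200, 200, 200, 200, 200],
]

mask = threshold_lower_fraction(image, fraction=0.6)
large, small = split_by_size(connected_components(mask), min_size=3)
print(f"{len(large)} large feature(s), {len(small)} small element(s)")
```

In a real 3D volume the same idea would normally be implemented with a library that provides labelling and morphological operators, with manual review of each sample as described in the text.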

Serial sectioning and fluorescence microscopy
A further seven osteoporotic samples and two control samples were first indented and then stained alongside three of the imaged µCT samples. Based on the previously described staining technique (Burr and Hooser, 1995; Lee et al., 2003), the samples (10 OP, 6 C) were dehydrated in 80% ethanol overnight (approximately 15 h), submerged in 1% basic fuchsin for 3 h and rinsed in 80% ethanol for 5 min. Using only one staining step and no vacuum is a simplification of the previously published technique (Burr and Hooser, 1995; Lee et al., 2003) and perhaps leads to poorer penetration of stain in our samples, but still more than sufficient for the surface and indent. This simpler and shorter protocol, which is not typical en-bloc staining, had previously been established in our laboratory with acceptable results for visualising the indent and the surface of the bone while avoiding overstaining. Microdamage associated with indentation will be localised to the indent site and can therefore be discriminated from any microdamage formed by machining, which would be present throughout the sample.
Surface indent location was identified using FL microscopy (Zeiss Axio Imager. Z1m, Zeiss Microscopy Inc, Oberkochen, Germany) followed by mounting with the indented face perpendicular to a glass slide. An ultramiller (SM250, Leica Microsystems, Solms, Germany) was used to cut approximately transversely through the osteons/indent with a 4 mm/s feed rate, 1000 rpm speed and 10 µm removed per slice (5 µm slices when close to the centre of the indent). Following each ultramiller cut, a "red" (excitation 546/12 nm, emission at 575-640 nm, Rhodamine 575-640 nm to detect the stain/damage) and "green" (excitation 450-490 nm, emission at 515-565 nm, FITC 515-565 nm, collagen auto-fluorescence to detect the bone structure) fluorescence image was taken using the FL microscope with a 10 times objective (Fig. 3b). Higher resolution images using a 20 times objective (e.g. Fig. 7a) were additionally taken for the central slice of 6 of the OP and 2 of the control samples.
The indent, the associated stained or microdamage area and closest pore were segmented using a custom MATLAB algorithm as detailed in Supplementary Fig. S3. The intensity of each pixel of the image captured using the FITC filter cube (where the indent and stain area both appeared dark) was subtracted from the image captured using the Rhodamine filter cube (where the indent appeared dark but the stain area appeared bright) to improve contrast between the stain, indent and surrounding tissue (Fig. 3d and Fig. 4). The central slice of the indent imprint was first manually segmented to give an 'initial' outline. This initial outline was overlaid onto the subsequent slice in the image stack and the largest steps in intensity near to this outline (within 20 pixels) were found to give a 'refined' outline. This process, using the segmentation from the previous slice as a guideline for the subsequent slice, was repeated through the image stack to segment the indent (Fig. 4). The damage was manually outlined for the central slice and segmented by thresholding for the total stack, and the closest pore was semi-automatically segmented for each slice in the image stack (thresholding with a user check). Measurements of the pore proximity and indent size were made using the custom MATLAB algorithm, similarly to the µCT method. For the stain/damage, the area was found for each slice and interpolated by the trapezium rule to a volume, from which a crack "extent" was approximated by assuming a hemispherical volume or a semi-circular area (for the central slice, Fig. 5b).

Atomic force and polarised light microscopy
For one osteoporotic fracture sample from the FLM imaging, the indent and associated microcracks were imaged at the central point of ultramilling using AFM and PLM. AFM was performed in air using the NanoWizard ULTRA Speed A system (JPK Instruments, Berlin) in contact mode with a V-shaped AFM cantilever of 0.32 N/m nominal spring constant (PNP-DB, NanoWorld AG, Neuchâtel, Switzerland). Each image was 10 µm by 10 µm at 512 pixel x 512 pixel resolution (hence 19.5 nm/pixel), acquired with a scanning frequency of 0.8 Hz, and multiple images were combined to cover an area of approximately 50 µm by 50 µm. PLM images were taken with a 20 times and a 50 times objective (Zeiss Axio Imager. Z1m, Zeiss Microscopy Inc, Oberkochen, Germany). Because only a single (osteoporotic fracture) sample was used, these images have very limited generalisability: they are purely illustrative, supporting the µCT and FLM findings (which are based on a larger number of samples), and are not used to draw new conclusions. PLM was performed for clearer visualisation of the cracks alone (without the influence of the stained damage area) and AFM allowed for higher resolution visualisation of the microcracking and diffuse damage.

Statistical analysis
The RPI value for each fracture toughness sample was taken as the mean of its 5 repeat measurements. The mean was then taken across the repeat fracture toughness samples per donor (1-6 samples) to give a single value per donor. The Mann-Whitney U test was used to compare the OP, OA and control groups because these were not normally distributed based on the approximately bell-shaped histograms for fracture mechanics and indentation parameters.
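The per-donor averaging and the group comparison can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names and the numbers in the usage section are hypothetical, and only the U statistic is computed (by direct pair counting, with ties counted as one half), not its p value.

```python
from statistics import mean


def donor_means(samples_by_donor):
    """Collapse repeat samples to a single mean value per donor."""
    return {donor: mean(values) for donor, values in samples_by_donor.items()}


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups.

    U_a counts the pairs (a, b) in which a > b; tied pairs count 0.5.
    U_a + U_b always equals len(group_a) * len(group_b).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


if __name__ == "__main__":
    # Made-up toughness values, several samples per donor.
    op = donor_means({"OP1": [2.1, 2.3], "OP2": [1.8], "OP3": [2.6, 2.4, 2.5]})
    control = donor_means({"C1": [2.9, 3.1], "C2": [2.2], "C3": [2.7]})
    print(mann_whitney_u(list(op.values()), list(control.values())))
```

Reducing each donor to one value before testing keeps the observations independent, which the Mann-Whitney U test assumes.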
Factors including age, sex, height, weight, storage duration and test location have not been adjusted for in the analysis, unlike in our previous publication (Jenkins et al., 2016). This is because, as discussed previously (Jenkins et al., 2016) and below, these factors have only a minimal confounding effect. Furthermore, if these potential confounding factors have any effect, it is expected to amplify rather than reduce the measured differences. For example, the OP group is older than the control group and has a higher proportion of females, both factors that could be postulated to exaggerate the differences between cohorts. Therefore, because the differences between cohorts are generally either not significant or minimal, further adjustment for confounding factors or post-hoc adjustment for multiple comparisons would only reduce this already marginal effect and would not affect the conclusions of this study.
Indent imaging parameters (e.g. indent size, damage extent, pore proximity etc.) and RPI measures (i.e. TID, IDI and CID) were also not normally distributed based on the histogram (e.g. skew and/or relatively small numbers) so Spearman's correlation was used. Due to the relatively small number of samples for µCT (7 samples) and FLM imaging (9 samples with complete image stack and 12 samples with at least the central image), the techniques were also combined to give additional information with a larger sample size (16 samples). Importantly, each technique was also presented on its own.
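For reference, Spearman's correlation is the Pearson correlation of the ranked data, which is why it suits skewed measurements and small samples. A minimal Python sketch is given below; the function names are hypothetical, tied values receive their average rank, no p value is computed, and the data in the usage section are made up.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    indent_depth_um = [38.0, 41.5, 44.0, 47.2, 52.1, 60.3]  # made-up values
    tid_um = [80.0, 85.0, 83.0, 95.0, 101.0, 130.0]         # made-up values
    print(spearman_r(indent_depth_um, tid_um))
```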
A level of significance (α) of 0.05 was used throughout. In a considerable number of cases p values between 0.05 and 0.1 were encountered; these are also discussed, though they should not be considered statistically significant. Further, because three comparisons are being made (OA vs. control, OP vs. control and OP vs. OA), the significance level should be reduced to a third (i.e. α = 0.0167). This would leave very few significant comparisons, highlighting that there are only minimal, if any, differences between these cohorts.
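The effect of dividing the significance level by the number of comparisons (a Bonferroni-type correction) can be sketched in a few lines of Python. The function names are hypothetical and the p values in the usage section are illustrative inputs only.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_correction(p_values, alpha=0.05, n_comparisons=3):
    """Flag which named p values remain below the corrected threshold."""
    threshold = bonferroni_alpha(alpha, n_comparisons)
    return {name: p < threshold for name, p in p_values.items()}


if __name__ == "__main__":
    print(round(bonferroni_alpha(0.05, 3), 4))
    # Illustrative p values for the three pairwise cohort comparisons.
    example = {"OP vs control": 0.001, "OA vs OP": 0.006, "OA vs control": 0.039}
    print(significant_after_correction(example))
```

A comparison with p = 0.039 is significant at α = 0.05 but not at the corrected level, which is the situation described above for most of the marginal differences.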
One osteoarthritic donor who was uncharacteristically young (26-year-old male) and had an anomalously high toughness has been excluded from the analysis throughout. Where this point would have influenced the findings, i.e. where it would falsely increase the differences between the cohorts (Fig. 6b) and the strength of the correlations with RPI (Fig. 10), it has been labelled as 'young OA'.

Age and disease
The bulk material properties of the cortical bone samples in terms of fracture resistance (K slope and J slope ), fracture toughness (K max and J max ), derived elastic modulus (E mod ) and indentation properties (TID, IDI and CID) are generally not significantly different across groups (0.8% to 34.2% difference, p > 0.05, Fig. 6a,b and Table 1). Exceptions to this are a marginally higher derived elastic modulus in the osteoporotic group relative to the control (by 9.5%, p = 0.001) and a higher TID in the osteoarthritic (THR) group compared to the osteoporotic fracture group (by 8.5%, p = 0.006). Therefore, the differences in bulk RPI and fracture toughness properties of the osteoporotic fracture, osteoarthritic (THR) and control groups are either not statistically significant (p > 0.05) or are minimal (< 10%), as shown in Fig. 6a and b and Table 1.
Despite the similarities in fracture toughness between groups, the whitening area (W Area ) is larger in the control compared to either the osteoporotic fracture (by 49%) or osteoarthritic hip replacement (by 33%) groups (p = 0.039, Fig. 6c and Table 1). The size of the whitening area is not a material property, perhaps also relating to geometrical properties, but does correlate with fracture toughness to some extent (r = 0.33, p = 0.049 for K and r = 0.45, p = 0.006 for J max , indicated in Fig. 10c). Additionally, these differences do remain when normalized against the sample width (W Area /w, where close to significant, OP: p = 0.053 and OA: p = 0.083), pre-notch length (W Area /a 0 , OP: p = 0.013 and OA: p = 0.004) or thickness (W Area /t, OP: p = 0.028 and OA: p = 0.016).
The correlations between donor age and the properties measured through fracture toughness testing are significant for fracture toughness (J max , r = −0.36, p = 0.029) and derived elastic modulus (r = 0.40, p = 0.014). The slopes of the best fit lines indicate a 7.0% reduction in fracture toughness per decade (Fig. 10a) and an 18.3% increase in derived elastic modulus per decade. Additionally, the whitening area may reduce with age (by 7.5% per decade, r = −0.47, p = 0.084, not significant) but this is only significant when normalized to geometry (W Area /w: r = −0.48, p = 0.003, W Area /a 0 : r = −0.53, p = 0.001, W Area /t: r = −0.47, p = 0.004). Though significant in places, these correlations are still weak, with a high degree of independence, as shown by the correlation coefficients (e.g. r = −0.36 for fracture toughness corresponds to r² ≈ 0.13, so age explains only about 13% of its variance) and by the sparse cloud of points shown in Fig. 10.
The increased automation of the whitening front tracking technique is in good agreement with the published technique (Katsamenis et al., 2013), having strong correlation (r = 0.79-0.82 for growth and r = 0.99 for toughness, p < 0.05 and the median difference is only 2-7%) across the fracture resistance parameters in 10 samples analysed with both methods. Some discrepancy relates to the inability of the algorithm to accurately identify the whitening front where there are smaller unconnected elements ahead of the binary whitening area.
It should also be noted that, though sex was not adjusted for in these groups, there were no significant (p > 0.05) differences measured between male and female donors in any of the three groups or across all groups (data not presented).

Imaging techniques and relation to RPI parameters
Using µCT and FLM, the indent could clearly be visualised as shown in Fig. 7a-b, and segmented as shown in Fig. 3c-d and Fig. 5a. Both techniques identified the indent and damage area similarly (Fig. 3a and b) but µCT visualised linear microcracks (Fig. 7b) of 36.6-116.9 µm mean length (5-12 cracks per indent) within this damage area whereas with FLM, though some linear microcracks could be observed, these were within an area that appeared as a red cloud of stain with flame-like edges, the so-called 'diffuse microdamage' (Fig. 7a).

Fig. 6. Comparison between donors in the osteoporotic fracture (OP), osteoarthritic THR (OA) and control groups in terms of a) indentation distance (TID), b) fracture toughness (K max ) and c) maximum whitening area (W Area ) with p values displayed where significant (p < 0.05) or close to significance (p < 0.1). The young osteoarthritic donor is considered anomalous and has been excluded from the analysis.
With both imaging techniques, the damage appeared to deflect circumferentially around the osteons and linearly along their length (Fig. 7a and b, labelled D). Crack bridging (Fig. 7b, c and d, labelled B) is another crack extension resistance mechanism observed with both techniques and in multiple samples. AFM and PLM of only one osteoporotic fracture sample showed that the red stain cloud contained both linear microcracks (Fig. 7c) and small scale micro- or diffuse damage (< 10 µm and down to 1-2 µm, Fig. 7d, labelled M). As well as evidence of microdamage and resistance mechanisms, elastic resilience and plastic deformation can be visualised through the non-conical imprint shape (Fig. 7, labelled I) and material pile-up at the indent edge (Fig. 7a, labelled P).
Fig. 8a and Table 2 show that there are significant positive correlations between indent area, depth and diameter (measured with combined µCT and FLM) and TID, IDI and CID. The relationship between indent volume and RPI is only significant in terms of IDI. Considering µCT independently, the correlation is significant between indent depth and TID (r = 0.86, p = 0.013; for IDI r = 0.75, p = 0.052, Table 2) and between indent diameter and TID (r = 0.93, p = 0.0023) or IDI (r = 0.79, p = 0.034). For the 2D FLM measurements there is a positive correlation between indent area and TID (r = 0.70, p = 0.036; for IDI r = 0.64, p = 0.062) and between indent diameter and all three RPI measures (r = 0.70-0.87).
It should also be noted that the residual indent depth is half the measured TID based on the fitted linear slope (Indent depth = 0.49 x TID, Fig. 8a) indicating an elastic response.
Crack length, as measured by µCT, is significantly correlated with TID as shown in Fig. 8c and Table 2. FLM measured 'damage extent' is positively correlated with IDI or, when FLM and µCT are combined, with both IDI and TID.
Finally, the µCT measured porosity (based on the Haversian canals) correlates with indentation depth and the relationship is significant in terms of IDI and CID (for TID, r = 0.68, p = 0.094). Consistent with this, the proximity between the indent and the closest pore negatively correlates with IDI when the µCT and FLM measurements are combined (r = −0.55, p = 0.027).

Comparison of single and repeat RPI measurements
The indentation depth is significantly higher when one measurement per sample (on those used for further imaging) is made in the control compared to the osteoporotic fracture group (by 20.7-37.4% with p = 0.011, 0.007 and 0.042 for TID, IDI and CID respectively - Fig. 9a). This is contrary to when repeat measurements (15-25 measurements on the fracture toughness samples) are made on samples machined from the same donors. As discussed above, and indicated in Fig. 9b for these samples specifically, there are no significant differences between the osteoporotic fracture and control groups (< 2% difference in these samples).
The contrary findings from taking a single RPI measurement (samples used for imaging) or 15 repeat measurements on samples from the same donors (samples following fracture toughness testing) can likely be related to local heterogeneities such as porosity. Using the µCT and FLM techniques combined, the indent measurements are closer to a pore in the control (median 21.6 µm) compared to the osteoporotic fracture samples (median 55.7 µm, p = 0.022). There is additionally an approximately hyperbolic relationship between pore proximity (P prox ) and individual indent depth normalized against the median measurement (RPI s/m ) as shown in Fig. 9c. The hyperbolic relationship (RPI s/m = P 1 / (P prox − P 2 )) incorporates two asymptotes: the horizontal asymptote (single to median ratio, P 1 is 0.87 for TID, 0.70 for IDI and 0.75 for CID) and the vertical asymptote (proximity to pore, P 2 is 8.7 µm for TID, 5.2 µm for IDI and 6.9 µm for CID). Excluding one measure with very close pore proximity (5.8 µm, only 11% of the mean of the 12 other measures), this relationship is significant in terms of TID (p < 0.001), IDI (p < 0.001) and CID (p = 0.025).
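The asymptotic behaviour described above can be illustrated with a minimal Python sketch. As an assumption made purely for illustration, it uses a two-parameter rectangular hyperbola, ratio = P 1 · P prox / (P prox − P 2 ), chosen because it tends to P 1 far from a pore and diverges as P prox approaches P 2 ; this may not be the exact functional form fitted in the study. The function name is hypothetical and the parameter values are those quoted above for TID.

```python
def single_to_median_ratio(p_prox_um, p1, p2_um):
    """Assumed hyperbola with horizontal asymptote p1 and vertical asymptote p2_um.

    p_prox_um: distance from the indent to the closest pore (um).
    Returns the modelled single-to-median indentation depth ratio.
    """
    if p_prox_um <= p2_um:
        raise ValueError("model is only defined for p_prox_um > p2_um")
    return p1 * p_prox_um / (p_prox_um - p2_um)


if __name__ == "__main__":
    P1_TID, P2_TID_UM = 0.87, 8.7  # asymptote values quoted in the text for TID
    # Median pore proximities quoted in the text for control and OP samples,
    # followed by two larger distances to show the approach to P1.
    for distance_um in (21.6, 55.7, 200.0, 1000.0):
        ratio = single_to_median_ratio(distance_um, P1_TID, P2_TID_UM)
        print(f"{distance_um:7.1f} um -> ratio {ratio:.2f}")
```

Under this assumed form an indent at the control median distance is modelled as deeper relative to the sample median than one at the osteoporotic median distance, in line with the direction of the single-measurement difference reported above.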

Correlation between indentation and fracture mechanics properties
On a sample by sample basis for all groups there is no significant correlation between properties measured through fracture toughness testing and those measured by RPI (92 samples consisting of 35 OP, 25 OA and 32 control). Considering the osteoarthritic hip replacement group alone, there is a significant negative correlation between TID and the fracture resistance (K slope , r = −0.40, p= 0.048). There are also marginally significant (p ≤ 0.1) negative correlations between CID (r = −0.38, p = 0.10) or TID (r = −0.38, p = 0.064) and fracture resistance (J slope ) in the osteoarthritic (THR) group and between IDI and derived elastic modulus (r = −0.30, p = 0.10) in the control group. Therefore, in all cases, RPI has a large degree of independence from fracture resistance measures (|r| ≤ 0.4 and generally p > 0.05).
In the control group, there is also a marginal positive correlation between fracture resistance (J slope ) and IDI (r = 0.58, p = 0.10) but in all other cases, p > 0.1 and |r| < 0.5 for all groups. This correlation analysis is summarized in Supplementary Table S1 and it can be concluded that, though there may be some significant correlation between RPI and fracture resistance or elastic properties, these relationships are not all encompassing and there remains a large degree of independence.

Table 1
Comparison between donors in the osteoporotic fracture (OP), osteoarthritic THR (OA) and control groups in terms of crack growth resistance (K slope and J slope ), fracture toughness (K max and J max ), derived elastic modulus (E mod ), maximum whitening area (W Area ) and indentation properties (TID, IDI and CID). Data are presented as median (lower-upper quartile).

Cortical bone at the inferomedial femoral neck suffers a 7.0% reduction in fracture toughness per decade, as demonstrated through our results. This is in agreement with the 2.9-18.9% reported at the femoral, tibial and humeral midshaft (Zioupos and Currey, 1998;Nalla et al., 2006;Koester et al., 2011;Granke et al., 2015;Brown et al., 2000). Brown and co-workers (Brown et al., 2000) conducted, to the best of our knowledge, the only previous study to investigate fracture toughness of the femoral neck, but reported no significant correlation with age. The difference may arise because Brown et al. (2000) tested fewer donors (26 compared to 36 in the present study) and used a longer crack length (4 mm instead of the mean of 0.45 mm), a different sample geometry (compact tension compared to single edge bend samples) and a shorter age range (50-90 years compared to 33-97 years). Additionally, Brown et al. (2000) did not report the correlation coefficient or linear regression gradient, but only that the relationship was non-significant (p > 0.05). This makes it unclear whether the weak (yet significant) negative correlation that we observe here (r = −0.36) may also be present in their study.
The 18.3% increase in derived elastic modulus per decade presented here is contrary to the previously reported 1-2% reduction (Burstein et al., 1976). This may relate to the mode of testing used here (derivation of modulus from notched fracture toughness samples rather than conventional flexural modulus from un-notched samples) or may relate to preferential stiffening at the femoral neck. Regardless, the strength of these correlations was weak (r = 0.4) so age is clearly not the only factor impacting the material properties of the bone at the studied location.

Table 2
Correlation between RPI measured indentation parameters (TID, IDI and CID) and imaging measurements of the indent size, the damage extent and porosity/pore proximity. The 'Technique' column indicates how many of the n sample measures are from micro computed tomography (CT) or fluorescence microscopy of the serial sectioned image stack (SS i.e. three dimensional measurements) or only the central slice (CS i.e. two dimensional measurements). The emboldened cells indicate significant correlations (p < 0.05) and the italicised cells indicate a p value less than 0.1 and hence close to significance.

Influence of osteoporosis and osteoarthritis
Previously, the material properties of bone, including fracture toughness, have been shown to deteriorate with age (Zioupos and Currey, 1998;Nalla et al., 2006;Koester et al., 2011;Granke et al., 2015;Brown et al., 2000). Here, we demonstrate that the fracture toughness properties of the inferomedial femoral neck also deteriorate with age but do not appear to be further compromised by severe osteoarthritis or osteoporosis. The lack of influence of these two musculoskeletal diseases on both fracture toughness and bulk indentation properties of the inferomedial femoral neck may at first glance appear surprising. However, as discussed previously (Dickenson et al., 1981;Li and Aspden, 1997;Jenkins et al., 2016;Milovanovic et al., 2014;Coutts et al., 2016), there is rather limited evidence of differences in material properties with bone pathology, particularly when considering fracture toughness and the femoral neck.

Fig. 8. Correlation between RPI measured indentation parameters (IDI and TID) and indent imaging measurements: a) Indent Depth, b) Pore Proximity and c) Damage/Crack Extent across the imaging modalities of micro computed tomography and fluorescence microscopy of the serial sectioned stack or the central slice of the indent. Each circle indicates one indent (which may have been imaged across multiple imaging modalities), preferentially selecting measurements based on, or most similar to, micro-CT.

Fig. 9. Effect of an individual measurement (a) compared to the median of multiple repeat measurements on the same sample (b). c) A hyperbolic relationship indicating that a single measurement in close proximity to a pore is higher than the median measurement (the circled measurement, in very close proximity to a pore, is excluded from the analysis).

Fig. 10. Correlation of fracture toughness measurements with: a) age, b) RPI indentation parameters (IDI) and c) whitening area. A young osteoarthritic donor (circled) has an anomalously high toughness and has been excluded from the analysis.
These findings may only be indicative of location, i.e. only valid for the inferomedial neck. This hypothesis is supported by Poole et al. (2010) who found that the thickness and BMD were most affected in the superposterior and superoanterior regions and Bell et al. (1999) who found that porosity was only increased in the anterior quadrant with osteoporosis. A rationale behind some material (e.g. fracture toughness and bulk indentation properties as reported here) and structural (i.e. porosity and thickness as reported previously (Poole et al., 2010;Bell et al., 1999;Coutts et al., 2015)) properties in the inferomedial neck being relatively unaffected by disease, may relate to bone biomechanics in stance and fall. In stance, the inferomedial neck is loaded in compression and therefore may be protected from resorption and, similarly, maintain its material properties while the properties of the superolateral neck, in tension, could be compromised over time. During fall, loading is suddenly reversed to place the compromised superolateral neck in compression, initiating fracture in this location (de Bakker et al., 2009;Juszczyk et al., 2013). Therefore, the inferomedial region may be protected from increased fracture risk, maintaining its structure (Poole et al., 2010;Bell et al., 1999;Coutts et al., 2015) and material properties. It may, therefore, be that the material properties of the superolateral region, where structure is compromised with disease (Poole et al., 2010;Bell et al., 1999;Coutts et al., 2015) and being the location of fracture initiation (de Bakker et al., 2009;Juszczyk et al., 2013), are more affected by osteoporosis and OA. This would therefore be a valuable location to investigate, though in this study, the sample thickness, porosity and curvature limited indentation and the machining of fracture toughness samples to the inferomedial quadrant (Poole et al., 2010;Bell et al., 1999;Coutts et al., 2015;Jenkins et al., 2015).
We also present a negative correlation between age and maximum whitening area, which is also lower in both the osteoporotic fracture and osteoarthritic (THR) groups compared to the control. Though this effect is still observed when the whitening area is normalised against sample and pre-notch dimensions, this is not a definitive material property. Whitening may relate to several factors including the geometry of the sample and experimental conditions, e.g. lighting, image resolution, testing environment, etc. However, it is very likely that the whitening area is indicative of the ability of bone experiencing tensile strains to dissipate energy, which in turn may affect its overall fracture risk, even if this does not translate to compromised fracture toughness.
This subset of samples, which does not show a difference between health and disease in the bulk inferomedial properties, can also be considered representative of the larger population from our previous study. The indentation depth is significantly higher in the osteoporotic group when measuring on the surface around the circumference (15.9%, p = 0.059 with TID, 20.9%, p = 0.14 with IDI and 22.8%, p = 0.013 with CID) and, to a lesser extent, on the surface of the inferomedial region (1.8%, p = 0.54 with TID, 9.9%, p = 0.002 with IDI and 10.1%, p = 0.052 with CID), achieving similar results to our previously published work (Jenkins et al., 2016). Given these previous results, we concluded that a higher indentation depth measured around the femoral neck does not necessarily translate to the inferomedial region. We can now go further and suggest that a higher indentation depth measured on the surface of the bone does not necessarily translate to the indentation properties of the bulk cortical bone. Therefore, the surface properties of the bone may be more critical in fracture resistance. This speculation seems logical, as bone (and indeed any material) that breaks in a bending configuration is more susceptible to a crack that forms on or very close to its surface (highest stresses and strains) than to one within its bulk. This means that crack initiation begins towards the outer layer of bone (Nalla et al., 2003), with the properties of these layers therefore being more crucial than the bulk in resisting fracture initiation and propagation.
A limitation of this study worth mentioning is that, unlike in our previously presented studies (Jenkins et al., 2016;Coutts et al., 2016), no statistical adjustment has been made to account for differences in donor selection between the groups. There are differences between the groups; for example, the osteoporotic fracture group is older and the osteoarthritic (THR) group has a lower proportion of females to males. However, as the differences between cohorts are minimal and already generally not significant, and these factors (as well as BMI and height) have been discussed to have minimal correlation with RPI depth (Jenkins et al., 2016;Coutts et al., 2016), adjustment has been deemed superfluous. Regardless of the rigour of the statistical analysis, the message remains clear: though fracture properties may be compromised with age, the differences between cohorts (OP, OA and control) in terms of fracture or bulk indentation properties of the inferomedial femoral neck are minimal.

Interpretation of RPI measurements
As a measure influenced by crack extension resistance
Indentation in the transverse direction (approximately normal to the osteonal direction) is hypothesised to compress the lamellar layers at the point of indentation and cause relative motion between these layers. In this way shear leads to delamination, i.e. failure of the lamellar interfaces or interlamellar areas. The observed cracks experience deflection around osteons and along their long axes, therefore in the direction of the lamellae, providing evidence that is supportive of delamination. Crack deflection, relating to variation between layers (whether compositional or structural), is one mechanism that acts to impede propagation (Launey et al., 2010). Crack bridging by uncracked ligaments that may absorb crack driving energy is also visualised within the indent-associated cracking; this is another crack resistance mechanism that significantly contributes to the total fracture toughness of bone (Ager et al., 2006;Nalla et al., 2004). It would be of future interest to quantify these features relative to RPI measurement and between health and pathology; however, the applied visualisation techniques and relatively low numbers, especially of samples scanned with µCT, meant this was not possible for our study. Finally, it was illustrated using a single osteoporotic fracture sample that the underlying structural features in the fuchsin-stained regions are indeed microcracks and diffuse microdamage. These appear on the order of microns and were uncovered through AFM imaging. This type of damage has been suggested to contribute to fracture toughness by dissipating the crack tip over an area rather than a single point (Vashishth, 2007), in a similar manner to the differences between cohorts in whitening area as discussed above. It would be of interest to investigate, through further AFM imaging, whether the presence of microdamage in this single image is generalisable to other indentations or to non-pathological bone, although, given that all samples were stained in the same manner, this seems logically to be the case. Quantifying the associated microcracks in this damaged area with µCT and FLM reveals a correlation between crack length or damage 'extent' and RPI assessed indentation depth (up to r = 0.79). The presence of crack resistance mechanisms surrounding the indent and the relation to crack length imply that RPI is, to some extent, assessing a crack initiation or growth property.
The extent of this microdamage or length of the microcracks (median 73.4 µm) on top of the indent diameter (median 155.4 µm) gives a 200-400 µm diameter of interaction. This means that our previously recommended spacing of 500 µm is suitable in terms of not affecting subsequent measurements/variability, but at this spacing there may be a risk of cracks coalescing to a critical length (100-300 µm (O'Brien et al., 2000)). This poses an important question of whether RPI measures, particularly in the intended cohort with an impaired remodelling response (though there may also be an intrinsic non-remodelling repair mechanism (Seref-Ferlengez et al., 2014)), may cause critical length cracks and further contribute to fracture risk. This question must certainly be investigated prior to any large-scale clinical use.
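The spacing argument above can be made concrete with a short Python sketch. It assumes, for illustration only, that the damage extends by the quoted extent on both sides of the indent, so that the interaction diameter is the indent diameter plus twice the damage extent; the function names are hypothetical.

```python
def interaction_diameter(indent_diameter_um, damage_extent_um):
    """Indent diameter plus damage extending on both sides (assumed geometry)."""
    return indent_diameter_um + 2.0 * damage_extent_um


def gap_between_zones(spacing_um, diameter_a_um, diameter_b_um):
    """Undamaged gap between two indents at a given centre-to-centre spacing.

    A negative result means the two interaction zones overlap.
    """
    return spacing_um - 0.5 * (diameter_a_um + diameter_b_um)


if __name__ == "__main__":
    # Median indent diameter and damage extent quoted in the text.
    zone_um = interaction_diameter(155.4, 73.4)
    print(f"interaction diameter: {zone_um:.1f} um")
    # Recommended 500 um spacing between two such median indents.
    print(f"gap at 500 um spacing: {gap_between_zones(500.0, zone_um, zone_um):.1f} um")
    # Two indents at the upper end of the quoted 200-400 um range.
    print(f"gap for 400 um zones: {gap_between_zones(500.0, 400.0, 400.0):.1f} um")
```

With the median values the interaction diameter falls within the quoted 200-400 µm range and the zones do not overlap at 500 µm spacing, but the remaining gap is of the same order as the quoted critical crack length, which is why coalescence cannot be ruled out.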
Though crack length contributes to the RPI measures, when related to the tissue-level fracture resistance or toughness there is only a small negative correlation with RPI (|r| ≤ 0.4). This is contrary to the strong negative correlation (r = −0.90) reported by Diez-Perez et al. (2010), likely due to the larger number of donors in the present study (92 samples from 32 donors compared to 8 samples from 5 donors). Other differences could relate to the site of interest (femoral neck compared to the tibial midshaft) or the techniques used (whitening front tracking method, shown to correlate well with the crack front propagation (Katsamenis et al., 2015), as compared to direct crack propagation with environmental scanning electron microscopy (Diez-Perez et al., 2010)). However, our finding is supportive of Katsamenis et al. (2015) (for 20 samples from 4 donors r = −0.35 to −0.50) and Granke et al. (2015) (for 62 samples from 62 donors r = −0.26 to −0.44 for crack initiation, else not significant) when considering the human femoral midshaft. Therefore, it can be concluded that crack initiation and propagation do contribute to indentation measurements, but this does not mean that RPI is a direct measurement of fracture (i.e. crack initiation) resistance or toughness, from which the technique has a large degree of independence.
It is interesting to note that our findings are also contrary to the lack of correlation (r < 0.03) reported by Carriero et al. (2014). This is likely due to the difference in species (human compared to mouse specimens) but also to their specimens being grouped by mutant models of disease. In this instance, different mechanisms for fracture could influence the RPI measured properties in the same way, e.g. one cohort could be more susceptible to local deformation and another to localised cracking, both of which could increase IDI but have an altogether different effect on fracture toughness. This further highlights the need to understand the underlying mechanisms and properties being measured when interpreting the measurement and, similarly, is a consideration between our presented OA, osteoporotic and control groups.

As a measure influenced by elastoplastic deformation resistance
Though some attempts to measure fracture toughness through indentation have previously been made, such as through the Vickers Indentation Fracture technique (Mullins et al., 2007;Lewis and Nyman, 2008;Kruzic et al., 2009), this technique is controversial (Kruzic and Ritchie, 2008;Quinn and Bradt, 2007). Instead, indentation is conventionally related to the assessment of elastoplastic deformation such as elastic modulus or hardness through the Oliver and Pharr method (Oliver and Pharr, 1992;Fischer-Cripps, 2006). In RPI, both elastic (TID is greater than indent depth and some evidence of pile-up) and plastic deformation (residual imprint) are also observed through the shape of the indent. Both elastic and plastic deformation contribute to the residual indent depth and the measured size of this indent correlates strongly with RPI measures (r = 0.56-0.93 where significant). As with crack length and fracture toughness, the correlation with indent depth only translates to a minimal negative correlation with derived elastic modulus (|r| ≤ 0.4). This is supportive of the minimal and not statistically significant correlation between flexural stiffness or modulus and indentation depth reported by Granke et al. (2015) (26 human cadaveric samples) and Gallant et al. (2013) (18 rat femora and 19 dog ribs). The weak relation and partial independence from elastic modulus, and by extension conventional indentation techniques (e.g. Oliver and Pharr), can be related to the high loading rate (up to 60 N/s) and large scale (350 µm diameter) conical tip causing significant plastic damage (Schwiedrzik and Zysset, 2015;Chen and Bull, 2006;Paietta et al., 2011). This plastic damage, visualised both by the surrounding microcracking and also as the residual imprint, goes some way to explaining the previously reported relationship with toughness and strength (Granke et al., 2015;Gallant et al., 2013).

As a measure influenced by micro-structural properties
Based on the literature, RPI indent depth (e.g. TID, IDI or CID) can be considered a measure with a reasonably large contribution from plastic deformation as well as some contribution from elastic deformation and microdamage. This research supports this statement and provides evidence that it holds true in the human femoral neck but also adds another important contributor to the multifactorial measure. The direct impact of porosity on RPI has not previously been considered but this research shows that both Haversian canal level porosity (r = 0.93 for IDI) and proximity to the closest pore (r = −0.55 for IDI) play a significant role. The effect of porosity, and other such local heterogeneities, appears to be highly influential on a single indentation measure. There is a hyperbolic relationship between indent depth (normalized against the median depth for multiple repeat measurements) and proximity to a pore, reminiscent of the relationship with sample thickness that we previously reported (Jenkins et al., 2016). The vertical asymptote ranges from 5.2 µm to 8.7 µm, indicating that the fitted indent depth grows without bound as an indent approaches this distance from a pore. The horizontal asymptote ranges between 0.70 and 0.87, indicating that an individual measurement far away from a pore is lower than the median measure for the sample, likely because the median is inclusive of measures in close proximity to pores, which have higher indentation depths. Furthermore, structural features such as surface properties and circumferential heterogeneity in thickness or porosity likely contribute to the differences when comparing osteoporotic fracture and osteoarthritic (THR) donors to controls, and to the RPI heterogeneity previously reported (Jenkins et al., 2016;Coutts et al., 2016).
Comparing the osteoporotic fracture, osteoarthritic and control samples, differences are observed when indenting around the surface of the neck (Jenkins et al., 2016;Coutts et al., 2016) but not when the surface is machined only from the inferomedial region as displayed here. Furthermore, the variation in indentation depth, both circumferentially and longitudinally, corresponds to variation in thickness and porosity (Poole et al., 2010;Bell et al., 1999) as reported by Coutts et al. (2015). Therefore, for a "true material measure", recommendations could be made to indent a certain distance from a pore (using the hyperbolic relationship as per thickness recommendations) or to machine the sample, but this would likely exclude important micro-structural elements of fracture risk (i.e. porosity and surface properties).
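The hyperbolic relationship between normalized indent depth and pore proximity, and the idea of a minimum indent distance from a pore, can be illustrated with a minimal sketch. The functional form below (a rectangular hyperbola, depth = h + k / (d − v)) is an assumption made for illustration; the exact fitted equation is not stated here. The function names and the scale parameter `scale_um` are hypothetical, and the example values for the asymptotes are simply chosen from within the reported ranges (vertical 5.2 µm to 8.7 µm, horizontal 0.70 to 0.87).

```python
def normalized_depth(distance_um, horizontal_asymptote, vertical_asymptote_um, scale_um):
    """Assumed rectangular hyperbola: depth = h + k / (d - v).

    distance_um           -- distance from the indent to the closest pore (d)
    horizontal_asymptote  -- normalized depth far from any pore (h)
    vertical_asymptote_um -- distance at which the modelled depth diverges (v)
    scale_um              -- hypothetical curvature parameter (k), not from the paper
    """
    if distance_um <= vertical_asymptote_um:
        raise ValueError("model is only defined beyond the vertical asymptote")
    return horizontal_asymptote + scale_um / (distance_um - vertical_asymptote_um)


def min_distance_um(tolerance, vertical_asymptote_um, scale_um):
    """Distance from a pore at which the modelled depth exceeds the
    horizontal asymptote by exactly `tolerance` (in normalized units);
    indents further away are within that tolerance."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    return vertical_asymptote_um + scale_um / tolerance


# Illustrative values only: h and v lie within the reported ranges, k is made up.
h, v, k = 0.80, 7.0, 5.0
near = normalized_depth(8.0, h, v, k)    # close to a pore: depth far above the median
far = normalized_depth(507.0, h, v, k)   # far from a pore: depth approaches h
limit = min_distance_um(0.05, v, k)      # distance giving depth within 0.05 of h
```

Under this assumed form, the modelled depth rises steeply as the distance approaches the vertical asymptote and settles towards the horizontal asymptote at large distances. A distance threshold derived this way would depend entirely on the fitted parameters and the chosen tolerance.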
It was necessary to combine FLM and µCT imaging techniques to increase the sample size but this has associated limitations. In particular, the techniques assess different forms of microdamage; µCT imaging reveals distinct linear microcracks whereas FLM images reveal a cloud of diffuse damage as well as linear microcracks. Confocal microscopy, which could improve the assessment of linear microcracks in FLM, was not available. Despite this inconsistency, both techniques were also assessed individually and similar correlations between imaging and RPI measurements were observed using both techniques. Furthermore, we aimed to minimize any discrepancies between the techniques as follows. Where there were FLM measurements from the full stack as well as the central slice, the measurement in strongest agreement with µCT (for the three samples where there was overlap between techniques) was adopted preferentially. For example, the manual segmentation of the damage from the higher resolution central slice gave better agreement with µCT for damage extent (based on the hemisphere approximation). That is, where a single indent had been imaged with multiple techniques, the µCT measure was used preferentially in the analysis or, where there was no µCT, the FLM measure in best agreement with µCT was used, giving one measure per sample. In this respect, though combining these techniques has limitations, it was justified as it did allow for an increase in the number of indents imaged to support the findings of the individual techniques. Nevertheless, in Table 2 some of the correlations are also reported for individual techniques to further clarify the situation.
The imaging techniques support that porosity, resistance to crack propagation and resistance to elastoplastic deformation all contribute to RPI measurements. However, the large degree of independence of RPI when correlated with material properties indicates that a single property (e.g. fracture toughness) cannot, as yet, be directly inferred from RPI measures. This study makes progress into understanding the measurements made by the Biodent RPI device (which has primarily been used in laboratory studies), yet there are no similar studies investigating the Osteoprobe technique (proposed for clinical fracture risk assessment using an altogether different loading mechanism). Therefore, for the Osteoprobe technique, with arguably more clinical potential, the extent to which fracture toughness (or any other property) relates to the measured indentation depth ('bone material strength') is also not well understood. In terms of effective fracture risk assessment, it could be argued that assessment of a single material or structural property is not essential. However, as presented in this research, we believe it is important to understand what properties are being assessed for any clinical diagnostic technique and how these relate to the disease state being assessed. This is important to allow accurate interpretation of results and also to aid targeted development of such a technique.

Conclusions
Contrary to earlier reports we find that RPI is a multifactorial measure that is influenced by structure (porosity, cortical thickness and the outer layers of the bone) as well as material properties (resistance to cracking and elastoplastic deformation). If the properties represented by RPI parameters can be isolated further, separating the contribution of different material and micro-structural properties, the technique has potential for in vivo assessment of multiple facets of bone quality. To target further development of such a mechanical assessment tool in the clinical setting (regardless of whether that is the discussed Biodent RPI technique or not), it must first be established which parameters are significant to predicting increased fracture risk. This is crucial as, without a mechanistic basis, the interpretation of any such technique is limited, regardless of its efficacy in fracture risk assessment.
Further, while we demonstrate a reduction of fracture toughness with age at the inferomedial femoral neck, the fracture toughness, derived elastic modulus and indentation depth of the bulk inferomedial neck do not appear further compromised in either osteoporosis or severe osteoarthritis. Whether this also extends to other quadrants and locations of the femoral neck remains to be shown.